Signal

WebGL GPU trace

Fingerprints the GPU's execution-unit timing pattern by selectively stalling individual shader EUs and measuring the resulting GPU elapsed time.

Tier 2 engine src/signals/webgl.ts

What it measures

Each GPU model has a characteristic pattern of relative timing across its shader execution units. The signal measures that pattern by running a vertex shader that stalls exactly one EU per draw call — cycling through all 8 EU points across 5 rounds — and recording how long each draw takes. After discarding round 0 as a warmup, the 4 data rows are row-normalised (z-scored) and hashed.

Normalisation removes absolute timing differences caused by GPU clock speed or CPU scheduling jitter, leaving only the shape of the pattern, which is stable for a given GPU model. This approach is based on the DrawnApart research technique.
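The row normalisation described above can be sketched as follows. This is a minimal illustration, not the code in src/signals/webgl.ts: the function name zScoreRow is hypothetical, and the use of the population standard deviation (dividing by n) is an assumption. The zero-stddev guard and the rounding to 6 decimal places follow the behaviour described elsewhere on this page.

```typescript
// Hypothetical helper: z-score one row of per-draw timings.
// Assumption: population standard deviation (divide by n, not n - 1).
function zScoreRow(row: number[], decimals: number = 6): number[] {
  const n = row.length;
  const mean = row.reduce((sum, v) => sum + v, 0) / n;
  const variance =
    row.reduce((sum, v) => sum + (v - mean) * (v - mean), 0) / n;
  let stddev = Math.sqrt(variance);
  // Zero-stddev guard: a flat row would otherwise divide by zero.
  if (stddev === 0) {
    stddev = 1;
  }
  // Round each z-score to a fixed number of decimal places.
  return row.map((v) => Number(((v - mean) / stddev).toFixed(decimals)));
}

// Two rows with the same shape but a different scale and offset,
// e.g. the same GPU model running at a different clock speed.
const slow = zScoreRow([2, 4, 4, 4, 5, 5, 7, 9]);
const fast = zScoreRow([106, 112, 112, 112, 115, 115, 121, 127]);
```

Both calls return [-1.5, -0.5, -0.5, -0.5, 0, 0, 1, 2]. A z-score is unchanged by scaling or shifting the whole row, which is why only the shape of the timing pattern survives normalisation.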

How it is collected

A WebGL2 context is acquired on an OffscreenCanvas of dimensions GPU_EU_POINTS x 1 (8x1), with a fallback to a DOM canvas. The shader program is compiled and linked; if either step fails, the signal returns absent. The EXT_disjoint_timer_query_webgl2 extension is requested for GPU-native timing; when it is unavailable, performance.now() brackets each draw call instead. Each round runs 8 draw calls with uTargetPoint stepped from 0 to 7 and uStallCount fixed at 50000. Each query result is polled asynchronously for up to 100 ms. After all 5 rounds, the warmup row is discarded, each of the 4 remaining rows is z-scored and rounded to 6 decimal places, and the normalised rows are joined and hashed via xxHash64.

glsl
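#version 300 es
precision highp float;
uniform int uStallCount;   // number of stall-loop iterations (50000)
uniform int uTargetPoint;  // index of the point to stall in this draw call
in float aPointIndex;      // per-vertex point index (0 to 7)
void main() {
  gl_Position = vec4(0.0, 0.0, 0.0, 1.0);
  gl_PointSize = 1.0;
  // Only the targeted point runs the stall loop.
  if (int(aPointIndex) == uTargetPoint) {
    float x = 1.0;
    for (int i = 0; i < uStallCount; i++) {
      x = sin(x);
    }
    // Using x in the output keeps the loop from being optimised away.
    gl_Position.x = x * 0.00001;
  }
}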

Vertex shader — stalls the targeted EU with 50000 sin() iterations (src/signals/webgl.ts:412)
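The round and point schedule around this shader can be sketched as below. This is an illustration under stated assumptions rather than the collector itself: timeDraw is a hypothetical callback standing in for "set uTargetPoint, issue the draw, and return its elapsed time", and it is synchronous here for brevity, whereas the real collector polls timer queries asynchronously. The constant names and values are taken from this page.

```typescript
const GPU_EU_POINTS = 8;
const GPU_EU_ROUNDS = 5; // 1 warmup + 4 data

// Hypothetical sketch: run every round, then drop the warmup row.
// `timeDraw` is an assumed stand-in for one timed draw call.
function collectDataRows(
  timeDraw: (targetPoint: number, round: number) => number,
  points: number = GPU_EU_POINTS,
  rounds: number = GPU_EU_ROUNDS,
): number[][] {
  const matrix: number[][] = [];
  for (let round = 0; round < rounds; round++) {
    const row: number[] = [];
    // One draw call per target point, stepping uTargetPoint from 0 upward.
    for (let target = 0; target < points; target++) {
      row.push(timeDraw(target, round));
    }
    matrix.push(row);
  }
  // Round 0 is the warmup and is always discarded.
  return matrix.slice(1);
}

// Fake timer for illustration: encodes round and target in the value.
const rows = collectDataRows((target, round) => round * 100 + target);
```

With the fake timer, rows has 4 rows of 8 values each; the first surviving row starts at 100 because round 0 was dropped.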

Confidence rules

Confidence  Trigger
normal      Data collected and at least one non-zero timing value in the data matrix
degraded    All values in the data matrix are 0: timer resolution too low to distinguish EU stalls
absent      No WebGL2 context available, shader compilation failed, or top-level catch fires
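These rules can be expressed as a small decision function. The sketch below is hypothetical: the name classifyConfidence, the Confidence type, and the use of null to represent "collection failed" are assumptions made for illustration, not the signal's actual API.

```typescript
type Confidence = "normal" | "degraded" | "absent";

// Hypothetical sketch of the confidence rules.
// Assumption: `null` means no data matrix could be collected at all
// (no WebGL2 context, shader failure, or an exception was caught).
function classifyConfidence(dataRows: number[][] | null): Confidence {
  if (dataRows === null) {
    return "absent";
  }
  // Any single non-zero timing is enough for "normal".
  const anyNonZero = dataRows.some((row) => row.some((v) => v !== 0));
  // An all-zero (or empty) matrix means the timer resolved nothing.
  return anyNonZero ? "normal" : "degraded";
}
```

For example, classifyConfidence(null) yields "absent", an all-zero matrix yields "degraded", and a matrix containing any non-zero timing yields "normal".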

Why engine-bound

Raw GPU EU timing values are influenced by browser throttling policies, background tab scheduling, and EXT_disjoint_timer_query_webgl2 availability, which differs per browser. Chrome typically exposes the timer extension; Firefox and Safari may not, forcing the performance.now() fallback that produces lower-resolution timings. The normalisation step partially removes absolute timing differences, but the pattern shape still varies with throttling behaviour. The signal was explicitly reclassified from hardware to engine binding for this reason.

Things worth knowing

  • GPU_EU_POINTS = 8, GPU_EU_ROUNDS = 5 (1 warmup + 4 data), GPU_EU_STALL_COUNT = 50000 — all reduced from original DrawnApart parameters for performance.
  • Round 0 is always discarded as a warmup to avoid cold-start GPU pipeline overhead.
  • When all values in a row are identical, its standard deviation is zero, so stddev is forced to 1 before dividing to avoid NaN (zero-stddev guard).
  • The collector yields after every 2 GPU operations within a round and between rounds to avoid blocking the main thread.
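The yielding behaviour within a round can be sketched as below. All names here are hypothetical, and the sketch assumes that a "GPU operation" corresponds to one timed draw call and that yielding means waiting for a zero-delay timer; the real collector may count operations or yield differently.

```typescript
// Hypothetical helper: resolve on a later task so other work can run.
function yieldToEventLoop(): Promise<void> {
  return new Promise<void>((resolve) => setTimeout(resolve, 0));
}

// Hypothetical sketch: time each point in one round, yielding after
// every `yieldEvery` draws. `timeDraw` is an assumed async stand-in
// for one timed draw call.
async function runRoundWithYields(
  timeDraw: (targetPoint: number) => Promise<number>,
  points: number = 8,
  yieldEvery: number = 2,
  yieldFn: () => Promise<void> = yieldToEventLoop,
): Promise<number[]> {
  const row: number[] = [];
  for (let target = 0; target < points; target++) {
    row.push(await timeDraw(target));
    // Hand control back after every `yieldEvery` operations.
    if ((target + 1) % yieldEvery === 0) {
      await yieldFn();
    }
  }
  return row;
}
```

With 8 points and a yield every 2 operations, one round yields 4 times. Passing yieldFn as a parameter is a design choice for this sketch only, so the yield count can be observed without real timers.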