Tier 3: Telemetry

What it is

A per-query estimate built from direct GPU and host telemetry (vLLM and NVML), sampled at 100 ms resolution and combined with the live grid carbon intensity at calculation time. This is the tightest defensible estimate available.

Inputs

NVML samples: GPU power draw, memory bandwidth, utilisation, temperature
vLLM execution metrics: per-request GPU time, batch composition, prefill/decode split
Host telemetry: CPU utilisation per core, DRAM allocation
Live grid intensity: ENTSO-E + regional ISO/TSO at calculation time
Provisioned-idle allocation: batch composition × idle Wh share
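
As a rough illustration of how these inputs might combine, the sketch below integrates fixed-interval power samples into watt-hours and converts a total to grams of CO2e. The function names, the rectangle-rule integration, the way the PUE factor is applied, and the example numbers are all assumptions made for illustration; they are not taken from methodology/tier3/telemetry.py.

```python
from typing import Sequence


def integrate_power_wh(power_w: Sequence[float], interval_ms: float = 100.0) -> float:
    """Rectangle-rule integral of power samples (watts) into watt-hours.

    NVML reports power in milliwatts, so raw readings would need to be
    divided by 1000 before being passed in here.
    """
    interval_h = interval_ms / 1000.0 / 3600.0  # sample spacing in hours
    return sum(power_w) * interval_h


def co2e_grams(
    gpu_wh: float,
    host_wh: float,
    idle_wh: float,
    pue: float,
    intensity_g_per_kwh: float,
) -> float:
    """Assumed combination: (GPU + host + idle share) x PUE x grid intensity."""
    facility_kwh = (gpu_wh + host_wh + idle_wh) * pue / 1000.0
    return facility_kwh * intensity_g_per_kwh


# Hypothetical numbers: 36 samples at a steady 500 W span 3.6 s, i.e. 0.5 Wh.
gpu_wh = integrate_power_wh([500.0] * 36)
grams = co2e_grams(gpu_wh, host_wh=0.1, idle_wh=0.0, pue=1.2, intensity_g_per_kwh=250.0)
```

A trapezoidal rule would be an equally reasonable choice; at 100 ms spacing the difference is small unless power changes sharply between samples.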

Where tier 3 is available

Tier 3 is available on routes where we have telemetry cooperation:

Calibration fleet: our quarterly calibration runs use a Hetzner H100 instance running vLLM with NVML telemetry. The results are used to calibrate tier 2 conformal intervals across model families and regions.
Selected upstream providers: providers who expose telemetry via the Carbon Aware SDK (Green Software Foundation) or an equivalent. This currently includes Scaleway (selected SKUs) and atNorth (all SKUs). The list expands as we onboard providers.
Customer-dedicated deployments (Enterprise tier): for customers who run a dedicated deployment, estimates are tier 3 by default.

For the default auto tier, tier 3 is returned where available; otherwise the estimate falls back to tier 2. The receipt declares which tier was used.
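
The fallback rule for the auto tier can be sketched in a few lines. The route names, the `resolve_auto_tier` helper, and the shape of the returned record are hypothetical and only illustrate the rule stated above.

```python
from typing import AbstractSet, Dict, Union


def resolve_auto_tier(route: str, tier3_routes: AbstractSet[str]) -> Dict[str, Union[str, int]]:
    """Return tier 3 when the route has telemetry, otherwise fall back to tier 2."""
    tier_used = 3 if route in tier3_routes else 2
    # The tier actually used is recorded so that a receipt can declare it.
    return {"route": route, "tier_used": tier_used}


# Hypothetical route identifiers, for illustration only.
TIER3_ROUTES = frozenset({"calibration-fleet", "enterprise-dedicated"})

with_telemetry = resolve_auto_tier("enterprise-dedicated", TIER3_ROUTES)
without_telemetry = resolve_auto_tier("shared-pool", TIER3_ROUTES)
```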

Output

{
  "tier": "telemetry",
  "tier_id": "03",
  "co2e": {
    "median_g": 0.97,
    "ci90": {"low": 0.83, "high": 1.11},
    "boundary": "comprehensive"
  },
  "telemetry": {
    "gpu_wh": 0.198,
    "host_wh": 0.034,
    "idle_wh": 0.012,
    "pue_factor": 1.18,
    "samples": 47,
    "sample_resolution_ms": 100
  },
  "pedigree": [1, 1, 1, 1, 1]
}
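
A consumer of this receipt might run a few sanity checks before storing it. The sketch below is one way to do that; the `summarise` helper is hypothetical, and it assumes that the sampling window equals samples × sample_resolution_ms and that the three Wh fields are additive, neither of which is stated by the receipt itself.

```python
import json

RECEIPT_JSON = """
{
  "tier": "telemetry",
  "tier_id": "03",
  "co2e": {"median_g": 0.97, "ci90": {"low": 0.83, "high": 1.11}, "boundary": "comprehensive"},
  "telemetry": {"gpu_wh": 0.198, "host_wh": 0.034, "idle_wh": 0.012,
                "pue_factor": 1.18, "samples": 47, "sample_resolution_ms": 100},
  "pedigree": [1, 1, 1, 1, 1]
}
"""


def summarise(receipt: dict) -> dict:
    """Validate basic invariants and derive a few convenience figures."""
    co2e = receipt["co2e"]
    tel = receipt["telemetry"]
    low, high = co2e["ci90"]["low"], co2e["ci90"]["high"]
    if not low <= co2e["median_g"] <= high:
        raise ValueError("median lies outside the 90% interval")
    if len(receipt["pedigree"]) != 5:
        raise ValueError("pedigree must have five scores")
    return {
        # Assumed: contiguous sampling, so window = count x resolution.
        "window_s": tel["samples"] * tel["sample_resolution_ms"] / 1000.0,
        # Assumed: GPU, host, and idle energy are additive shares.
        "metered_wh": tel["gpu_wh"] + tel["host_wh"] + tel["idle_wh"],
        "ci90_width_g": high - low,
    }


summary = summarise(json.loads(RECEIPT_JSON))
```

For the receipt above, this gives a sampling window of about 4.7 s and about 0.244 Wh of metered energy before the PUE factor is applied.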

Pedigree expectations

Tier 3 routinely scores [1, 1, 1, 1, 1], the best possible score, on the five axes of the Weidema pedigree matrix:

Reliability (1): verified data based on measurements
Completeness (1): representative data from a sufficient sample over an adequate period
Temporal (1): less than 3 years old (in fact, real-time)
Geographic (1): data from area under study
Technological (1): data from enterprises, processes, and materials under study

Calibration role

Tier 3 measurements are the calibration set against which tier 2 conformal intervals are fit. Quarterly:

We run a representative workload on the calibration fleet
Tier 3 measures actual energy
Tier 2 produces parametric estimates for the same workload
The residuals between the tier 2 estimates and the tier 3 measurements are computed; their 90th percentile becomes the tier 2 interval width

This is what makes tier 2's 90% interval honest: it is calibrated to cover the realised tier-3 truth at least 90% of the time.
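
The quarterly steps above resemble split conformal calibration, which can be sketched as follows. This is a minimal sketch under stated assumptions, not the production procedure: it assumes absolute residuals, treats the calibrated quantile as a symmetric half-width around the tier 2 estimate, and uses the usual finite-sample rank ceil((n + 1) × coverage). The function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def conformal_half_width(
    tier2_estimates: Sequence[float],
    tier3_measurements: Sequence[float],
    coverage: float = 0.90,
) -> float:
    """Finite-sample conformal quantile of absolute tier 2 vs tier 3 residuals."""
    if len(tier2_estimates) != len(tier3_measurements):
        raise ValueError("estimates and measurements must be paired")
    residuals = sorted(abs(m - e) for e, m in zip(tier2_estimates, tier3_measurements))
    n = len(residuals)
    # Rounding guards against floating-point noise before taking the ceiling.
    rank = math.ceil(round((n + 1) * coverage, 9))
    if n == 0 or rank > n:
        raise ValueError("too few calibration points for the requested coverage")
    return residuals[rank - 1]


def tier2_interval(estimate: float, half_width: float) -> Tuple[float, float]:
    """Symmetric interval around a tier 2 point estimate (assumed form)."""
    return (estimate - half_width, estimate + half_width)


# Hypothetical calibration set: 19 paired runs with residuals 1, 2, ..., 19.
estimates = [0.0] * 19
measurements = [float(k) for k in range(1, 20)]
half_width = conformal_half_width(estimates, measurements)
```

With this construction the coverage guarantee is marginal: it holds on average over queries that are exchangeable with the calibration workload, which is why the representativeness of that workload matters.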

Where this is implemented

methodology/tier3/telemetry.py

Citations

vLLM Project. github.com/vllm-project/vllm.
NVIDIA. NVML — NVIDIA Management Library. developer.nvidia.com/nvidia-management-library-nvml.
Green Software Foundation. Carbon Aware SDK. greensoftware.foundation/projects/carbon-aware-sdk.