Tier 3: Telemetry
What it is
A per-query estimate built from direct GPU and host telemetry (vLLM and NVML), sampled at 100 ms resolution and pinned to the live grid carbon intensity at calculation time. This is the tightest defensible estimate available.
Inputs
- NVML samples: GPU power draw, memory bandwidth, utilisation, temperature
- vLLM execution metrics: per-request GPU time, batch composition, prefill/decode split
- Host telemetry: CPU utilisation per core, DRAM allocation
- Live grid intensity: ENTSO-E + regional ISO/TSO at calculation time
- Provisioned-idle allocation: batch composition × idle Wh share
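As a rough illustration of how the NVML power samples listed above could turn into an energy figure, the sketch below integrates evenly spaced readings into watt-hours. NVML reports power draw in milliwatts (`nvmlDeviceGetPowerUsage`); the helper name `energy_wh`, the rectangle-rule integration, and the sample values are assumptions made for illustration, not the production pipeline. Attributing a share of that energy to a single request within a batch would additionally need the vLLM execution metrics and is not shown.

```python
def energy_wh(power_mw, resolution_ms=100):
    """Integrate evenly spaced power readings (milliwatts) into watt-hours.

    Hypothetical helper: uses a simple rectangle rule, treating each reading
    as the average power over its sampling interval.
    """
    dt_s = resolution_ms / 1000.0                      # sampling interval in seconds
    joules = sum(p / 1000.0 for p in power_mw) * dt_s  # mW -> W, then W * s = J
    return joules / 3600.0                             # 1 Wh = 3600 J


# Made-up readings: 47 samples at a constant 150 W (150,000 mW)
readings_mw = [150_000] * 47
print(f"{energy_wh(readings_mw):.3f} Wh")
```

A trapezoidal rule or timestamp-aware integration would be a reasonable refinement when samples are not perfectly evenly spaced.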
Where tier 3 is available
Tier 3 is available on routes where we have telemetry cooperation:
- Calibration fleet: our quarterly calibration runs use a Hetzner H100 instance running vLLM with NVML telemetry. These runs calibrate the tier 2 conformal intervals across model families and regions.
- Selected upstream providers: providers that expose telemetry via the Carbon Aware SDK (Green Software Foundation) or an equivalent. This currently includes Scaleway (selected SKUs) and atNorth (all SKUs). The list expands as we onboard providers.
- Customer-dedicated deployments (Enterprise tier): for customers who run a dedicated deployment, estimates are tier 3 by default.
For the default auto tier, tier 3 is returned where available; tier 2 fallback applies otherwise. The receipt declares which tier was used.
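The fallback rule for the auto tier can be sketched in a few lines. The function name `resolve_auto_tier`, the route identifiers, and the idea of a set of telemetry-enabled routes are all hypothetical; the only behaviour taken from the text is "tier 3 where available, tier 2 otherwise".

```python
def resolve_auto_tier(route, telemetry_routes):
    """Return the tier number an `auto` request resolves to.

    Hypothetical helper: `telemetry_routes` stands in for whatever registry
    records which routes have telemetry cooperation.
    """
    return 3 if route in telemetry_routes else 2


# Illustrative route names only; real route identifiers are not specified here.
telemetry_routes = {"route-with-telemetry"}
print(resolve_auto_tier("route-with-telemetry", telemetry_routes))
print(resolve_auto_tier("route-without-telemetry", telemetry_routes))
```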
Output
{
  "tier": "telemetry",
  "tier_id": "03",
  "co2e": {
    "median_g": 0.97,
    "ci90": {"low": 0.83, "high": 1.11},
    "boundary": "comprehensive"
  },
  "telemetry": {
    "gpu_wh": 0.198,
    "host_wh": 0.034,
    "idle_wh": 0.012,
    "pue_factor": 1.18,
    "samples": 47,
    "sample_resolution_ms": 100
  },
  "pedigree": [1, 1, 1, 1, 1]
}
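A consumer of this receipt might run a few sanity checks on the telemetry block. The sketch below is one way to do that, under stated assumptions: the helper name `summarise_telemetry` is hypothetical, and summing `gpu_wh`, `host_wh`, and `idle_wh` and then scaling by `pue_factor` is an assumed reading of the fields (PUE being the ratio of facility energy to IT energy), not a formula given in this document. It does not attempt to reproduce `median_g`, which also depends on the grid intensity and on how the "comprehensive" boundary is defined.

```python
import json

RECEIPT = json.loads("""
{
  "tier": "telemetry",
  "tier_id": "03",
  "co2e": {
    "median_g": 0.97,
    "ci90": {"low": 0.83, "high": 1.11},
    "boundary": "comprehensive"
  },
  "telemetry": {
    "gpu_wh": 0.198,
    "host_wh": 0.034,
    "idle_wh": 0.012,
    "pue_factor": 1.18,
    "samples": 47,
    "sample_resolution_ms": 100
  },
  "pedigree": [1, 1, 1, 1, 1]
}
""")


def summarise_telemetry(receipt):
    """Derive simple consistency figures from a tier 3 receipt (hypothetical helper)."""
    t = receipt["telemetry"]
    co2e = receipt["co2e"]
    # Assumption: the three Wh fields are additive IT-side energy components.
    it_wh = t["gpu_wh"] + t["host_wh"] + t["idle_wh"]
    return {
        # Time span covered by the samples, in milliseconds
        "window_ms": t["samples"] * t["sample_resolution_ms"],
        "it_wh": it_wh,
        # Assumption: PUE scales IT energy up to facility energy
        "facility_wh": it_wh * t["pue_factor"],
        # The median should sit inside its own 90% interval
        "ci_ordered": co2e["ci90"]["low"] <= co2e["median_g"] <= co2e["ci90"]["high"],
    }


print(summarise_telemetry(RECEIPT))
```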
Pedigree expectations
Tier 3 routinely scores [1, 1, 1, 1, 1] on the Weidema pedigree axes, the best possible score on each:
- Reliability (1): verified data based on measurements
- Completeness (1): representative data from a sufficient sample over an adequate period
- Temporal (1): less than 3 years old (in fact, real-time)
- Geographic (1): data from area under study
- Technological (1): data from enterprises, processes, and materials under study
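To make the `pedigree` vector easier to read in downstream code, it can be paired with axis names. This is a minimal sketch: the helper name `label_pedigree` is hypothetical, and the axis order is assumed to match the order of the list above. The 1 to 5 range is the standard pedigree-matrix scale, with 1 the best score.

```python
WEIDEMA_AXES = (
    "reliability",
    "completeness",
    "temporal",
    "geographic",
    "technological",
)


def label_pedigree(scores):
    """Map a five-element pedigree vector onto named axes (hypothetical helper)."""
    if len(scores) != len(WEIDEMA_AXES):
        raise ValueError(f"expected {len(WEIDEMA_AXES)} scores, got {len(scores)}")
    if any(not (1 <= s <= 5) for s in scores):
        raise ValueError("pedigree scores must lie between 1 (best) and 5 (worst)")
    return dict(zip(WEIDEMA_AXES, scores))


print(label_pedigree([1, 1, 1, 1, 1]))
```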
Calibration role
Tier 3 measurements are the calibration set against which tier 2 conformal intervals are fit. Quarterly:
- We run a representative workload on the calibration fleet
- Tier 3 measures actual energy
- Tier 2 produces parametric estimates for the same workload
- The residuals between the tier 3 measurements and the tier 2 estimates are computed; their 90th percentile becomes the tier 2 interval width
This is what makes tier 2's 90% interval honest: it is calibrated to cover the realised tier 3 truth at least 90% of the time.
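The calibration steps above resemble split conformal prediction, and a minimal sketch under that assumption follows. The text specifies only residuals and a 90th percentile; the use of absolute residuals, the finite-sample rank `ceil((n + 1) * (1 - alpha))`, the symmetric interval, the function names, and all numbers are assumptions for illustration. In this form the quantile acts as a half-width applied on each side of the tier 2 estimate.

```python
import math


def conformal_half_width(measured, estimated, alpha=0.10):
    """Return the split-conformal half-width for a (1 - alpha) interval.

    measured:  tier 3 values for the calibration workload
    estimated: tier 2 estimates for the same items, in the same order
    """
    scores = sorted(abs(m - e) for m, e in zip(measured, estimated))
    n = len(scores)
    # Finite-sample corrected rank of the score to use as the quantile
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points to guarantee coverage at this level
        return math.inf
    return scores[rank - 1]


def interval(estimate, half_width):
    """Symmetric interval around a tier 2 point estimate."""
    return (estimate - half_width, estimate + half_width)


# Made-up calibration data (Wh): ten paired measurements and estimates
measured = [0.24, 0.31, 0.19, 0.27, 0.22, 0.35, 0.28, 0.21, 0.30, 0.26]
estimated = [0.25, 0.29, 0.21, 0.26, 0.25, 0.33, 0.30, 0.20, 0.27, 0.26]
q = conformal_half_width(measured, estimated)
print(interval(0.25, q))
```

With exchangeable calibration and test points, this construction gives marginal coverage of at least 90%, which is the sense in which the interval is calibrated against the tier 3 measurements.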
Where this is implemented
methodology/tier3/telemetry.py
Citations
- vLLM Project. github.com/vllm-project/vllm.
- NVIDIA. NVML — NVIDIA Management Library. developer.nvidia.com/nvidia-management-library-nvml.
- Green Software Foundation. Carbon Aware SDK. greensoftware.foundation/projects/carbon-aware-sdk.