Pedigree (Weidema/Ciroth)

What it is

A five-axis quality score attached to every emission factor used in the calculation. Originally proposed by Weidema and Wesnæs (1996) and elaborated by Ciroth et al. (2016), the pedigree matrix scores data quality on a 1-to-5 scale across five axes:

| Axis | 1 (best) | 2 | 3 | 4 | 5 (worst) |
| --- | --- | --- | --- | --- | --- |
| Reliability | Verified data, measurement | Verified data, partly assumption | Non-verified data, partly assumption | Qualified estimate | Non-qualified estimate |
| Completeness | Representative, sufficient sample | Representative, smaller set | Representative, > 50% sites | Representative, < 50% sites | Unknown |
| Temporal correlation | < 3 years | < 6 years | < 10 years | < 15 years | Unknown / older |
| Geographic correlation | Area under study | Similar area | Different area | Unknown | Unrelated |
| Technological correlation | Same technology | Related technology | Different technology, same materials | Different processes, same technology | Unrelated |
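
As a rough illustration of the five-axis structure, a score can be held in a small validated record. This is a minimal sketch, not the contents of `pedigree.py`: the class name `PedigreeScore`, its field names, and the `as_list` helper are all hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class PedigreeScore:
    """Hypothetical container: one integer per axis, 1 (best) to 5 (worst)."""
    reliability: int
    completeness: int
    temporal: int
    geographic: int
    technological: int

    def __post_init__(self):
        # Reject anything outside the 1-to-5 scale defined by the matrix.
        for field in fields(self):
            value = getattr(self, field.name)
            if not isinstance(value, int) or not 1 <= value <= 5:
                raise ValueError(f"{field.name} must be an integer in 1..5, got {value!r}")

    def as_list(self):
        """Scores in axis order: reliability, completeness, temporal, geographic, technological."""
        return [getattr(self, field.name) for field in fields(self)]


# Illustrative values only.
example = PedigreeScore(2, 2, 1, 1, 2)
```

Freezing the record keeps a score from being altered after it has been attached to an emission factor, which suits an audit trail.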

Why we use it

The pedigree score is a structured, auditable representation of "how confident are we in this emission factor for this query?" It feeds two downstream operations:

  1. Monte Carlo prior dispersion: pedigree scores are mapped to log-normal dispersion factors (geometric standard deviations, where 1.00 means no added spread) using the lookup table from Ciroth et al. (2016). These factors set the priors for the Monte Carlo variance propagation.
  2. Conformal interval width — wider pedigree priors produce wider conformal intervals at calibration time.
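
The first operation can be sketched as sampling from a log-normal prior. The sketch assumes the dispersion value is a geometric standard deviation (consistent with lookup values that start at 1.00) and that the point estimate is the median. The function names and the point estimate of 100.0 are hypothetical, and the conformal calibration step is not shown.

```python
import math
import random
import statistics


def sample_lognormal_prior(median, gsd, n, rng):
    """Draw n samples from a log-normal with the given median and geometric SD."""
    mu = math.log(median)   # mean of the underlying normal distribution
    sigma = math.log(gsd)   # SD of the underlying normal distribution
    return [rng.lognormvariate(mu, sigma) for _ in range(n)]


def interval_95(samples):
    """Empirical 2.5th and 97.5th percentiles of the samples."""
    cuts = statistics.quantiles(samples, n=40)  # 39 cut points in 2.5% steps
    return cuts[0], cuts[-1]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
narrow = interval_95(sample_lognormal_prior(100.0, 1.05, 10_000, rng))
wide = interval_95(sample_lognormal_prior(100.0, 1.50, 10_000, rng))
```

A larger dispersion factor yields a visibly wider sampled interval around the same median, which is the effect that carries through to the interval width.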

Mapping example

For a Mistral Medium 3 query running on Scaleway PAR-1 with live ENTSO-E grid data:

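| Emission factor | Reliability | Completeness | Temporal | Geographic | Technological |
| --- | --- | --- | --- | --- | --- |
| GPU energy (BoaviztAPI) | 2 | 2 | 1 | 1 | 2 |
| Host CPU+DRAM share | 2 | 3 | 1 | 1 | 2 |
| Datacentre PUE | 2 | 2 | 2 | 1 | 1 |
| Grid intensity (ENTSO-E live) | 1 | 1 | 1 | 1 | 1 |
| Embodied amortisation | 3 | 3 | 2 | 2 | 2 |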

The composite pedigree on the receipt is computed per axis as the weighted median across factors, with each factor weighted by its share of the total impact. For tier 2, the typical composite is [2, 2, 1, 1, 2]: verified data partly based on assumptions, representative but drawn from a smaller set, recent, geographically matched, and from related technology.
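
The impact-weighted median can be sketched as follows. The scores are the five rows of the mapping example; the impact shares are hypothetical placeholders (the document does not state them), and `weighted_median` uses the lower weighted median, which is one of several possible tie-breaking conventions.

```python
def weighted_median(values, weights):
    """Lower weighted median: smallest value whose cumulative weight reaches half the total."""
    half = sum(weights) / 2.0
    cumulative = 0.0
    for value, weight in sorted(zip(values, weights)):
        cumulative += weight
        if cumulative >= half:
            return value
    raise ValueError("values must be non-empty")


# Rows from the mapping example, in axis order:
# reliability, completeness, temporal, geographic, technological.
scores = {
    "GPU energy": [2, 2, 1, 1, 2],
    "Host CPU+DRAM share": [2, 3, 1, 1, 2],
    "Datacentre PUE": [2, 2, 2, 1, 1],
    "Grid intensity": [1, 1, 1, 1, 1],
    "Embodied amortisation": [3, 3, 2, 2, 2],
}

# HYPOTHETICAL impact shares in percent, for illustration only.
shares = {
    "GPU energy": 55,
    "Host CPU+DRAM share": 15,
    "Datacentre PUE": 10,
    "Grid intensity": 10,
    "Embodied amortisation": 10,
}

factors = list(scores)
composite = [
    weighted_median([scores[f][axis] for f in factors], [shares[f] for f in factors])
    for axis in range(5)
]
```

With these placeholder shares the composite comes out as [2, 2, 1, 1, 2], matching the typical tier 2 composite quoted above. Different shares can shift individual axes.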

Pedigree score → log-normal SD lookup

We use Ciroth et al.'s lookup, simplified:

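| Score | Reliability SD | Completeness SD | Temporal SD | Geographic SD | Technological SD |
| --- | --- | --- | --- | --- | --- |
| 1 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| 2 | 1.05 | 1.02 | 1.03 | 1.01 | 1.18 |
| 3 | 1.10 | 1.05 | 1.10 | 1.02 | 1.50 |
| 4 | 1.20 | 1.10 | 1.20 | 1.10 | 2.00 |
| 5 | 1.50 | 1.20 | 1.50 | 1.50 | 3.00 |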

The composite log-normal SD is the geometric combination of the per-axis SDs; this becomes the prior for that emission factor in the Monte Carlo simulation.
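
A possible reading of "geometric combination" is sketched below. The combination rule is an assumption: it treats each lookup value as a geometric standard deviation and adds the squared log-factors, a rule commonly used with pedigree matrices. The text above does not spell out the exact formula, so `pedigree.py` may differ. The names `GSD_LOOKUP` and `composite_gsd` are hypothetical.

```python
import math

# Per-axis factors copied from the lookup table above, indexed by score 1..5.
GSD_LOOKUP = {
    "reliability":   [1.00, 1.05, 1.10, 1.20, 1.50],
    "completeness":  [1.00, 1.02, 1.05, 1.10, 1.20],
    "temporal":      [1.00, 1.03, 1.10, 1.20, 1.50],
    "geographic":    [1.00, 1.01, 1.02, 1.10, 1.50],
    "technological": [1.00, 1.18, 1.50, 2.00, 3.00],
}


def composite_gsd(scores):
    """ASSUMED rule: exp of the root-sum-of-squares of the per-axis log-factors."""
    log_variance = sum(
        math.log(GSD_LOOKUP[axis][score - 1]) ** 2
        for axis, score in scores.items()
    )
    return math.exp(math.sqrt(log_variance))


# GPU energy row from the mapping example.
gpu_energy = {
    "reliability": 2,
    "completeness": 2,
    "temporal": 1,
    "geographic": 1,
    "technological": 2,
}
gpu_gsd = composite_gsd(gpu_energy)
```

Under this assumed rule the GPU energy row gives a composite of about 1.19, dominated by the technological axis. A factor scored 1 on every axis gives exactly 1.00, meaning no added dispersion.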

Auditor expectations

Assurance partners working under ISAE 3000 review the pedigree-score worksheet as evidence of methodological rigour. We provide:

  • The pedigree score for every emission factor used in the period
  • The justification for each score
  • The lookup table version used to convert pedigree to log-normal SD
  • The composite score as it appears on each receipt

Where this is implemented

methodology/uncertainty/pedigree.py

Citations

  • Weidema, B. P., & Wesnæs, M. S. (1996). Data quality management for life cycle inventories — an example of using data quality indicators. J. Cleaner Production 4(3-4).
  • Ciroth, A., Muller, S., Weidema, B. P., & Lesage, P. (2016). Empirically based uncertainty factors for the pedigree matrix in ecoinvent. Int. J. Life Cycle Assessment 21(9).