
LLM CO₂ benchmark (indicative)

Order-of-magnitude figures for product and sustainability teams. The same coefficients power the live API, and the methodology page gives full citations.

How to read this table
Values are grams CO₂e per 1,000 total tokens (prompt + completion), using consolidated model-level factors. Real deployments vary by region, hardware, and load — use this for comparison and planning, then instrument production with the API for auditable totals.

Peer-reviewed and vendor sources underpin the measured and benchmarked rows; remaining models use transparent estimated class factors. See Methodology for URLs and formulas.
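As a rough illustration of how a per-1,000-token factor turns into a per-request figure, the sketch below multiplies a coefficient by the total token count. The function name and signature are hypothetical and are not the API's; the 0.3 value is the gpt-4o coefficient from the table on this page.

```python
def grams_co2e(coefficient_g_per_1k: float, prompt_tokens: int, completion_tokens: int) -> float:
    """Indicative grams CO2e for one request.

    coefficient_g_per_1k is a model-level factor in gCO2e per 1,000 total
    tokens; total tokens means prompt plus completion.
    """
    if prompt_tokens < 0 or completion_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total_tokens = prompt_tokens + completion_tokens
    return coefficient_g_per_1k * total_tokens / 1000


# Hypothetical request: 1,200 prompt tokens and 800 completion tokens on gpt-4o.
estimate = grams_co2e(0.3, prompt_tokens=1200, completion_tokens=800)
print(f"{estimate:.2f} g CO2e")  # prints "0.60 g CO2e"
```

This is a planning-level calculation only; it does not account for the region, hardware, or load effects mentioned above.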

Coefficients by model
Snapshot of the coefficient set used by /estimate and /track.
Model               gCO₂e / 1k tokens   Confidence
gpt-4o              0.3                 Benchmarked
gpt-4o-mini         0.1                 Benchmarked
gpt-4-turbo         0.35                Estimated
gpt-3.5-turbo       0.08                Estimated
claude-3-5-sonnet   0.3                 Benchmarked
claude-3-opus       0.45                Benchmarked
claude-3-haiku      0.1                 Benchmarked
mistral-large-2     2.85                Measured
mistral-small       0.8                 Estimated
mistral-medium      1.2                 Estimated
gemini-1-5-flash    0.075               Measured
gemini-1-5-pro      0.12                Measured
gemini-2-0-flash    0.08                Measured
llama-3-70b         0.25                Benchmarked
llama-3-8b          0.05                Benchmarked
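For comparison and planning, the coefficients above can be applied to an assumed monthly token volume and ranked. The following is a minimal sketch: the dictionary and helper names are hypothetical, the values are copied from the table, and the mistral row is read as model mistral-large-2 with a factor of 2.85. It does not call /estimate or /track.

```python
# gCO2e per 1,000 total tokens, copied from the table on this page.
COEFFICIENTS = {
    "gpt-4o": 0.3, "gpt-4o-mini": 0.1, "gpt-4-turbo": 0.35, "gpt-3.5-turbo": 0.08,
    "claude-3-5-sonnet": 0.3, "claude-3-opus": 0.45, "claude-3-haiku": 0.1,
    "mistral-large-2": 2.85, "mistral-small": 0.8, "mistral-medium": 1.2,
    "gemini-1-5-flash": 0.075, "gemini-1-5-pro": 0.12, "gemini-2-0-flash": 0.08,
    "llama-3-70b": 0.25, "llama-3-8b": 0.05,
}


def monthly_kg(model: str, tokens_per_month: int) -> float:
    """Indicative kg CO2e per month for a given total token volume."""
    grams = COEFFICIENTS[model] * tokens_per_month / 1000
    return grams / 1000  # grams to kilograms


def rank_models(tokens_per_month: int) -> list[tuple[str, float]]:
    """Return (model, kg CO2e) pairs sorted from lowest to highest."""
    pairs = [(m, monthly_kg(m, tokens_per_month)) for m in COEFFICIENTS]
    return sorted(pairs, key=lambda pair: pair[1])


# Assumed workload: one billion total tokens per month.
for model, kg in rank_models(1_000_000_000)[:3]:
    print(f"{model}: {kg:.0f} kg CO2e / month")
```

The ranking reflects only the listed factors, so it inherits their confidence labels: an Estimated row sitting next to a Measured row is a weaker comparison than two Measured rows.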

Need the full source strings and PDF narrative? See the complete table on Methodology.