Frontier & Future AI


Energy and Sustainability Limits

Training a single frontier AI model can consume as much electricity as hundreds of American households use in a year. Running a large language model at scale, answering billions of queries per day, draws power continuously at the level of a small city. These are not hypothetical projections. They are drawn from measurements and published estimates for the data centers that run today's systems. The rapid scaling of AI capability has a physical price paid in kilowatt-hours, liters of water, and tons of carbon dioxide. Understanding this price is not peripheral to understanding AI; it is central to evaluating whether the current development trajectory is sustainable.

Training Cost: The One-Time Expense

Training a frontier model is a one-time but enormous energy event. A widely cited estimate puts the training of GPT-3 (released 2020, 175 billion parameters) at approximately 1,287 megawatt-hours of electricity, comparable to the annual electricity consumption of over 100 American homes. GPT-4 and comparable 2023-vintage models are believed to require substantially more, though most labs do not publicly disclose exact figures.

The carbon footprint of this training energy depends critically on where the data center is located and what powers it. Training on a fossil-heavy grid emits roughly 500 grams of CO2 per kWh, and coal-dominated grids can emit considerably more; training in a region with substantial renewables may emit under 50 grams per kWh. The same training run can therefore have a carbon footprint 10 times larger, or more, depending on geography and energy source, which has made data center location and power sourcing strategically important decisions.

Beyond carbon, training data centers consume large quantities of water for cooling. Cooling systems, which dissipate the heat generated by dense GPU clusters running at full power, often reject that heat by evaporating water. Some estimates for large training runs reach tens of millions of liters of water. In water-stressed regions, this creates competition with agricultural and municipal uses.

The training cost is a one-time investment per model version. What matters for ongoing sustainability is inference cost.
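To make the geography effect concrete, here is a minimal sketch that converts the GPT-3 energy estimate into emissions under the two grid intensities quoted above. The grid labels and the household consumption figure (about 10,500 kWh per year) are illustrative assumptions, not values from the lesson.

```python
# Rough carbon footprint of one training run under two grid mixes.
# Energy and intensity figures are the lesson's rough numbers.

TRAINING_ENERGY_MWH = 1_287          # GPT-3 estimate quoted in the text
KWH_PER_MWH = 1_000
ASSUMED_HOME_KWH_PER_YEAR = 10_500   # assumption: typical US household

# Labels are illustrative; intensities (g CO2 per kWh) come from the text.
GRID_INTENSITY_G_PER_KWH = {
    "high-carbon grid": 500,
    "low-carbon grid": 50,
}


def training_emissions_tonnes(energy_mwh, intensity_g_per_kwh):
    """Return emissions in metric tons of CO2 for a given energy use."""
    grams = energy_mwh * KWH_PER_MWH * intensity_g_per_kwh
    return grams / 1_000_000  # grams -> metric tons


footprints = {
    name: training_emissions_tonnes(TRAINING_ENERGY_MWH, intensity)
    for name, intensity in GRID_INTENSITY_G_PER_KWH.items()
}
home_years = TRAINING_ENERGY_MWH * KWH_PER_MWH / ASSUMED_HOME_KWH_PER_YEAR

for name, tonnes in footprints.items():
    print(f"{name}: about {tonnes:,.1f} t CO2")
print(f"Equivalent to roughly {home_years:.0f} home-years of electricity")
```

Under these assumptions the same run lands at roughly 640 tons of CO2 on the high-carbon grid and roughly 64 tons on the low-carbon one, which is the tenfold gap described above.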

Inference Cost Dominates at Scale

Training cost is large but one-time. Inference, the act of running the model to answer queries, happens continuously at enormous scale. For a widely deployed model, the cumulative inference cost can exceed the training cost within months. Sustainability analysis must account for inference, not just training.
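The "within months" claim depends heavily on traffic. The sketch below estimates how long cumulative inference energy takes to match training energy. It reuses the GPT-3 training estimate and the 0.001 kWh per-query figure from this lesson's exercise; the traffic levels are hypothetical, and the function name `breakeven_days` is introduced here only for illustration.

```python
def breakeven_days(training_mwh, kwh_per_query, queries_per_day):
    """Days until cumulative inference energy equals training energy."""
    daily_inference_kwh = kwh_per_query * queries_per_day
    return training_mwh * 1_000 / daily_inference_kwh


TRAINING_MWH = 1_287     # GPT-3 estimate quoted in this lesson
KWH_PER_QUERY = 0.001    # per-query figure from the lesson's exercise

# Hypothetical traffic levels, chosen only to show the effect of scale.
for queries_per_day in (100_000, 10_000_000, 1_000_000_000):
    days = breakeven_days(TRAINING_MWH, KWH_PER_QUERY, queries_per_day)
    print(f"{queries_per_day:>13,} queries/day -> break-even after {days:,.1f} days")
```

With these inputs, ten million queries per day reaches break-even in about four months, while a service handling only 100,000 queries per day would take decades. Inference dominates specifically for models deployed at large scale.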

Inference Cost: The Ongoing Expense

Inference is the process of running a trained model to produce a response to a user query. Unlike training, which happens once per model version, inference happens millions to billions of times per day for widely deployed systems. As a rough estimate, a single query to a large language model that generates a few hundred tokens uses about 10 times the energy of a search engine query. Queries that require long outputs, multi-turn conversations, or multiple model calls (as in agentic systems) can cost substantially more.

Microsoft's sustainability reporting noted that its emissions increased by nearly 30% between 2020 and 2023, partially attributable to the expansion of AI infrastructure. Google's 2024 environmental report similarly disclosed a significant increase in emissions driven by data center growth for AI. These disclosures, while incomplete, indicate that AI workloads at scale are a major and growing contributor to corporate carbon footprints.

Inference cost creates a tension at the heart of AI deployment: the most capable models tend to be the largest, and the largest models are the most expensive to run per query. Deploying a frontier model for every query is not economically or environmentally sustainable at web scale. This drives the development of model compression techniques (distillation, quantization, pruning) that attempt to achieve near-frontier performance at dramatically lower inference cost.

Smaller, specialized models that run on-device (on a smartphone, a laptop, or an IoT sensor), often called edge models, consume orders of magnitude less energy per query than cloud-based frontier models. This architectural alternative trades some capability for dramatic efficiency gains. The trade-off is not neutral: edge models cannot match frontier models on the most demanding tasks, but for many common tasks they may be adequate while consuming a fraction of the energy.

Renewable energy is frequently cited as the solution to AI's carbon footprint. In principle, powering data centers with wind and solar eliminates operational carbon emissions. In practice, three complications arise.

First, renewable supply is intermittent: the sun does not always shine, and the wind does not always blow. Data centers require 24/7 power. Without massive grid-scale battery storage (which does not yet exist at the needed scale), data centers cannot run solely on renewable supply without fossil backup.

Second, large-scale AI infrastructure expansion is outpacing renewable capacity additions in many regions. Building and powering new data centers faster than the renewable energy grid grows means that marginal electricity consumption draws on fossil sources even if average consumption is offset by renewable certificates.

Third, the physical infrastructure of AI (the GPUs, servers, networking hardware, and building materials of data centers) has a manufacturing footprint that is separate from operational electricity. Producing a high-end GPU involves rare materials, energy-intensive semiconductor fabrication, and global supply chains with their own emissions.

None of this means AI development should stop. It means the sustainability calculus is genuinely complex, and claims of 'carbon-neutral AI' require careful scrutiny of what exactly is being offset and how.
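The gap between offsetting on paper and running on renewables hour by hour can be illustrated with a toy day. The sketch below assumes a data center with flat demand and a solar supply whose daily total exactly equals that demand. All numbers are hypothetical, and storage is deliberately left out.

```python
# Toy 24-hour profile. All values are hypothetical and chosen for clarity.
HOURS = 24

# Assumed flat demand: 10 MWh in every hour.
load_mwh = [10.0] * HOURS

# Assumed solar supply: 20 MWh per hour from 06:00 to 18:00, zero otherwise.
solar_mwh = [20.0 if 6 <= hour < 18 else 0.0 for hour in range(HOURS)]

total_load = sum(load_mwh)
total_solar = sum(solar_mwh)

# Volume matching: only the totals over the period are compared, which is
# how certificate-style accounting can reach "100% renewable".
volume_matched = min(total_solar, total_load) / total_load

# Hourly matching: renewable energy counts only in the hour it is produced.
matched_mwh = sum(min(load, solar) for load, solar in zip(load_mwh, solar_mwh))
hourly_matched = matched_mwh / total_load

# Energy that must come from some other source (no storage assumed).
unmatched_mwh = total_load - matched_mwh

print(f"Matched by total volume: {volume_matched:.0%}")
print(f"Matched hour by hour:    {hourly_matched:.0%}")
print(f"Supplied by other sources: {unmatched_mwh:.0f} MWh")
```

In this toy case the totals balance perfectly, yet half of the demand falls in hours with no solar output and has to be met from whatever else is on the grid.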

Match each sustainability concept to its accurate description in the context of AI infrastructure.

Terms

Training energy cost
Inference energy cost
Water consumption
Model distillation
Edge model

Definitions

The ongoing electricity consumed answering user queries — cumulative cost can exceed training within months at scale
A one-time large electricity expenditure to train a model version, comparable to hundreds of home-years of electricity
Training a smaller model to replicate a larger model's outputs, reducing inference energy per query
A compressed model that runs on local devices, trading some capability for orders-of-magnitude lower energy consumption
Used in evaporative cooling systems to dissipate heat from GPU clusters during training and inference


A technology company claims its AI service is 'carbon neutral' because it purchases renewable energy certificates (RECs) equal to its electricity consumption. Which of the following is the most accurate critique of this claim?

For a large language model deployed at consumer scale (millions of queries per day), which cost typically exceeds the other over a full year of operation?

Estimate and Compare AI Energy Footprints

  1. You will build a rough energy cost comparison between two AI deployment architectures for the same task: answering customer service queries for a retail company receiving 100,000 queries per day.
  2. Architecture A: Cloud-based frontier model. Each query costs approximately 0.001 kWh. The data center's carbon intensity is 400 grams CO2 per kWh (a coal-and-gas-heavy grid).
  3. Architecture B: On-device small model running on customer smartphones. Each query costs approximately 0.000005 kWh. The phone charges from the user's local grid, average carbon intensity 300 grams CO2 per kWh.
  4. Step 1: Calculate the daily and annual energy consumption (kWh) for each architecture at 100,000 queries per day.
  5. Step 2: Calculate the daily and annual CO2 emissions (grams, then convert to kg) for each architecture.
  6. Step 3: Architecture A's model required 1,500 MWh to train. Architecture B's edge model required 50 MWh to train. Assume each model was trained on a grid with the same carbon intensity as its deployment. At what point in time do Architecture A's cumulative operational emissions equal its training emissions? At what point do Architecture B's?
  7. Step 4: List two capability trade-offs you would expect if the company switched from Architecture A to Architecture B.
  8. Step 5: Write a one-paragraph recommendation to the company's sustainability officer and product team. What factors should govern the architecture choice?
  9. Discuss: Is there a scenario where the more capable (and more energy-intensive) model is the more ethical choice? Why or why not?
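One way to check your arithmetic for Steps 1 to 3 is a short script like the sketch below. It uses the figures given in the activity. The class and function names are illustrative, and the break-even calculation assumes each model was trained on a grid with the same carbon intensity as its deployment, so that comparing emissions reduces to comparing energy.

```python
from dataclasses import dataclass

QUERIES_PER_DAY = 100_000
DAYS_PER_YEAR = 365


@dataclass
class Architecture:
    name: str
    kwh_per_query: float
    grid_g_per_kwh: float   # carbon intensity of the grid serving queries
    training_mwh: float


def daily_kwh(arch):
    """Energy used per day to serve all queries."""
    return arch.kwh_per_query * QUERIES_PER_DAY


def daily_co2_kg(arch):
    """Operational emissions per day, converted from grams to kilograms."""
    return daily_kwh(arch) * arch.grid_g_per_kwh / 1_000


def breakeven_years(arch):
    """Years until operational emissions equal training emissions.

    Assumption: training and deployment share the same grid intensity,
    so the intensity cancels and only the energy ratio matters.
    """
    return arch.training_mwh * 1_000 / daily_kwh(arch) / DAYS_PER_YEAR


arch_a = Architecture("A: cloud frontier model", 0.001, 400, 1_500)
arch_b = Architecture("B: on-device small model", 0.000005, 300, 50)

for arch in (arch_a, arch_b):
    print(arch.name)
    print(f"  energy: {daily_kwh(arch):,.2f} kWh/day, "
          f"{daily_kwh(arch) * DAYS_PER_YEAR:,.1f} kWh/year")
    print(f"  CO2:    {daily_co2_kg(arch):,.2f} kg/day, "
          f"{daily_co2_kg(arch) * DAYS_PER_YEAR:,.2f} kg/year")
    print(f"  break-even with training: {breakeven_years(arch):,.1f} years")
```

Compare the break-even times with the "within months" claim earlier in the lesson, and consider what the difference says about the role of query volume.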