Machine Learning & Deep Learning

⏱ About 15 min · 15 XP

The Cost of Deep Learning

When you type a question into an AI chatbot and get an answer in two seconds, that feels effortless. Behind the scenes, however, something expensive happened — and happened millions of times that day. Deep learning's power comes with real costs: financial, environmental, and social. Understanding those costs is part of understanding what deep learning actually is.

The Financial Cost of Training

Training a large deep learning model from scratch is extraordinarily expensive. GPT-3, released by OpenAI in 2020, is estimated to have cost somewhere between four and twelve million dollars just for the computation during its training run. GPT-4 cost an estimated one hundred million dollars or more. These are one-time costs, but they must be paid before the model ever answers a single user question.

Those numbers come from renting thousands of specialized processors for weeks or months at a time. A single high-end AI training chip costs tens of thousands of dollars to purchase, and cloud providers charge by the hour to rent clusters of them. Small universities, independent researchers, and organizations in lower-income countries generally cannot afford to train frontier models from scratch.

Running a trained model (called inference) is cheaper per query but scales with every user. A model that serves one hundred million users per day costs millions of dollars per month in inference hardware, electricity, and cooling. The largest AI labs spend hundreds of millions annually just to keep their models running.
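To see how renting processors by the hour adds up to millions of dollars, here is a minimal sketch of the arithmetic in Python. The function name and every input (1,000 chips, 30 days, $2.50 per chip-hour) are illustrative assumptions chosen for round numbers, not figures for any real model or cloud provider.

```python
def training_run_cost(num_chips: int, days: float, price_per_chip_hour: float) -> float:
    """Estimate the rental cost in dollars of one training run.

    Simplified model: cost = chips x hours x hourly price per chip.
    It ignores storage, networking, staff, and failed runs.
    """
    hours = days * 24
    return num_chips * hours * price_per_chip_hour


# Hypothetical inputs, for illustration only.
cost = training_run_cost(num_chips=1000, days=30, price_per_chip_hour=2.50)
print(f"Estimated rental cost: ${cost:,.0f}")  # prints: Estimated rental cost: $1,800,000
```

Doubling either the number of chips or the number of days doubles the bill, which is why longer runs on larger clusters move so quickly from millions toward hundreds of millions of dollars.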

Training vs. Inference

Training is the one-time process of adjusting a model's weights using a large dataset. Inference is using a trained model to answer a new question. Training is far more expensive per run, but inference costs accumulate continuously at scale. Both matter to understanding who can afford deep learning.
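The contrast between a one-time training cost and a continuously accumulating inference cost can be sketched with a short calculation. The function name and the inputs below (a $10 million training run and $0.003 per query) are hypothetical assumptions for illustration; only the figure of one hundred million daily queries echoes the scale mentioned in this lesson, and it assumes one query per user per day.

```python
import math


def days_to_match_training_cost(training_cost: float,
                                queries_per_day: int,
                                cost_per_query: float) -> int:
    """Return the first day on which cumulative inference spending
    reaches or exceeds the one-time training cost."""
    daily_inference_cost = queries_per_day * cost_per_query
    if daily_inference_cost <= 0:
        raise ValueError("daily inference cost must be positive")
    return math.ceil(training_cost / daily_inference_cost)


# Hypothetical inputs, for illustration only.
days = days_to_match_training_cost(
    training_cost=10_000_000,     # assumed one-time training bill in dollars
    queries_per_day=100_000_000,  # assumed one query per user per day
    cost_per_query=0.003,         # assumed dollars per query
)
print(f"Inference spending matches the training cost after {days} days")  # 34 days
```

Under these assumed numbers, inference spending overtakes the training bill in about a month, which illustrates why both costs matter.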

The Energy and Environmental Cost

Compute costs electricity. A single training run for a large language model can consume as much electricity as several hundred homes use in a year. Data centers also require enormous amounts of water to cool their servers; some large facilities use millions of liters of water per day.

A 2019 study from the University of Massachusetts estimated that training a large NLP model with extensive hyperparameter search produced roughly the lifetime carbon emissions of five average American cars. The field has become significantly more efficient since then, but the models have also become significantly larger, and the net effect on total emissions is debated.

AI companies are increasingly powering data centers with renewable energy, but the grid electricity used during peak demand often still includes fossil fuels. The environmental cost is real, ongoing, and unequally distributed: data centers are often built where land and electricity are cheap, not where AI benefits are concentrated.

The Data Cost

Frontier models require billions or trillions of words and millions or billions of images as training data. Collecting that data at scale raises its own questions. Much of the text and image data used to train large models was scraped from the web without the explicit consent of the people who created it. Artists, writers, and programmers have raised legal and ethical objections to their work being used to train systems that then compete with them commercially.

Data labeling, which means hiring humans to tag training examples, is another hidden cost. Workers in countries with lower wages are often paid very little to perform emotionally difficult work, such as reviewing graphic content to train content-moderation models.

Who Can Afford Deep Learning?

The combination of financial, energy, and data costs means that frontier deep learning is concentrated in a tiny number of organizations. As of the mid-2020s, the organizations training the largest and most capable models are primarily a handful of US and Chinese technology corporations and a small number of well-funded research labs.

This concentration has real consequences. The organizations that control the most powerful models also control the values embedded in those models, the data those models were trained on, and the access policies that determine who can use them. Researchers and policymakers debate whether this level of concentration is healthy for society.

Smaller organizations and individuals can use powerful models through APIs, paying per query, or by fine-tuning open-weight models that larger organizations have released. These options lower the barrier to access without eliminating the underlying concentration of who creates and controls frontier AI.

The Cost Is Not Neutral

Deep learning's costs are not shared equally. The environmental impact falls disproportionately on communities near data centers. The data labor burden falls on workers in lower-wage countries. The financial barrier to training frontier models concentrates power in a few large organizations. Recognizing these dynamics is part of thinking critically about AI.

Fill in each blank with the correct term.

The one-time process of adjusting a model's weights on a large dataset is called ______. Using a trained model to answer new questions is called ______. The concentration of frontier AI in a few organizations raises concerns about the ______ of power in the AI industry.

Why are only a small number of organizations able to train frontier deep learning models from scratch?

What is a 'data labeler' and why do they matter to AI development?

True Cost Audit

  1. Choose one AI product you use or know about (a chatbot, an image generator, a recommendation system).
  2. For each of the four cost categories — financial, energy, environmental, data labor — write one specific question you would want answered before deciding this product is worth its costs.
  3. Research at least one of your questions using a reliable source and write a two-sentence summary of what you found.
  4. Share your findings with the class. Which cost category generated the most concern?