The AI boom has a metrics problem

The critiques of current carbon accounting guidelines are many and varied. But the AI race may be reshaping what's needed.

Published August 13, 2024

Image credit: Lisa Martine Jenkins (Photo credit: Department of Energy)


Nearly two years into the artificial intelligence boom seems like a good time to take stock.

As of July, all three of the major cloud companies — Google, Microsoft, and Amazon — have released their sustainability reports detailing carbon emissions from 2023, which was the year that the AI arms race truly began. (OpenAI first unveiled ChatGPT, which is widely considered the race’s starting gun, in November 2022.)

But despite the cumulative hundreds of pages of reporting, the picture of AI’s energy and emissions footprint remains fuzzy and incomplete. The wide range of reporting approaches allowed for by the Greenhouse Gas Protocol has meant that these reports are virtually impossible to directly compare and contrast. And in most cases, it’s unclear how much of a company’s energy consumption comes specifically from data centers. 

While problems with the Protocol have been widely documented, the sudden and meteoric rise in AI is throwing the lack of standardization into sharp relief. The snail-like process of updating the Protocol — and its lack of specificity for the data center energy problem — may mean that we need entirely new metrics and requirements to take stock of AI’s growth.

Rich Kenny, managing director at hardware-focused environmental consulting firm Interact, said the lack of a useful metric for measuring the impact of AI is a foundational problem for the industry.

“AI has always been about how much work you can do and how fast you can do it,” Kenny said. “We’re missing the third dimension of how good it is at doing work…There has to be a metric that says ‘is this system serving AI efficiently?’ and that’s what we don’t have.”

That’s a problem in the short term for measuring AI impact, Kenny said, but it could also be a longer-term over-building problem. There is already “massive wastage” in data centers that are AI-ready but not being used, he added.

“Training a large language model is like a space race. It’s about firing the rocket first, completing the LLM,” Kenny explained. “But what do you do once you’ve built 500 rockets and you’ve already landed on the moon?”

The current reporting landscape

The reports that major tech companies put out each year to illustrate their climate progress — or lack thereof — are essentially useless for assessing the broader state of AI, Kenny said.

“We don’t have a level playing field that we can score everybody against, and I think that’s intentional,” he said. “We don’t have a clear agreement from the hyperscalers, primarily because it would be a really bad story to tell.”

Microsoft seems to come the closest to an AI-specific metric, reporting "power usage effectiveness," or PUE: the ratio of a data center's total energy use to the energy consumed by its computing equipment alone (as opposed to cooling and other overhead). The company doesn't report the amount of electricity used by its data centers every year, but did report a "carbon intensity" metric, which indicates that the amount of electricity the company consumes per dollar of revenue jumped 20% in the last year.
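For readers unfamiliar with the metric, PUE reduces to a simple ratio; the sketch below uses invented numbers purely for illustration.

```python
# Illustration of power usage effectiveness (PUE).
# PUE = total facility energy / IT equipment energy.
# An ideal facility with zero cooling or distribution
# overhead would score exactly 1.0.

def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Return the PUE ratio for a reporting period."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh

# Made-up figures: a PUE of 1.2 means 20% overhead beyond compute.
print(pue(total_facility_kwh=120_000, it_equipment_kwh=100_000))  # 1.2
```

Note what PUE does not capture: it says nothing about how much useful work the computing equipment performs per unit of energy, which is exactly the gap critics point to.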

Google, which does report a data center-specific electricity metric, said energy consumption by its data centers increased by 17% last year. (Both Microsoft and Google declined to respond to Latitude Media's questions about whether they would consider reporting a work-per-energy metric.)

Amazon, for its part, only reports electricity use on an organization level and declined Latitude Media’s request for AWS- and data center-specific breakdowns. Amazon also doesn’t share location-based metrics, and declined to share what percentage of its scope 2 emissions data was attributable to renewable energy credits. (About a year ago, Amazon was dropped from the UN’s Science Based Targets Initiative, an indication that the company’s reporting methods don’t align with the SBTI requirements. Amazon declined to comment on whether it plans to become recertified.)

Meta — which has not yet released a 2023 report, and has a much smaller data center footprint than the "Big Three" — also reports total emissions from data centers. In 2022, the company said its location-based emissions from data centers were 3.8 million metric tons, a more than 27% increase from 2021.

In short, we’re living in a reporting Wild West.


Karthik Ramanna, a professor of business and public policy at the University of Oxford, compared the current moment in carbon accounting to the late 1920s, before corporate financial reporting was standardized.

“You had all these companies making these bold claims on the stock market in 1929 about what…they’re going to create in terms of future value,” Ramanna explained. “There was no standardized definition of an asset; there was no standardized definition of revenue; there was no standardized definition of liability. And so companies sort of made up their own definitions as they went along.”

As it turned out, much of that data was “highly performative rather than substantive,” which contributed to the market crash that created the conditions for the Great Depression.

Similarly, in the case of today’s AI boom, it’s not likely that the companies themselves will get ahead of the reporting problem before AI’s electricity suck starts to seriously impact the grid, Ramanna said. While standardizing carbon accounting will help the industry to manage AI’s impact, at a certain point “AI is inevitable,” he added — and so is the fact that it’s energy intensive. 

“At the end of the day they’re in the business of making money, and if they have an opportunity to make another dollar they’ll make another dollar. So the momentum for rigorous carbon accounting standards is not going to come from the companies themselves.”

Something’s gotta give

Today, the tech world’s strategies for clean energy procurement can be split into two camps: 24/7 carbon free energy (Google, Microsoft), and emissions-first (Amazon, Meta).

The former approach is focused on hourly matching — making sure clean energy is available on the same grid where a company is consuming electricity — while the latter takes a borderless approach focused on overall emissions impact.
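The difference between the two camps is easiest to see in a toy calculation. The sketch below (with invented numbers) shows how a portfolio can look "100% clean" on an annual-matching basis while covering only half of its load hour by hour.

```python
# Toy contrast between hourly (24/7) matching and annual matching.
# Hourly matching: clean supply must cover consumption in the SAME
# hour; surplus in one hour cannot offset a deficit in another.

def hourly_cfe_score(consumption_mwh, clean_supply_mwh):
    """Fraction of load matched by clean supply hour-by-hour."""
    matched = sum(min(c, s) for c, s in zip(consumption_mwh, clean_supply_mwh))
    return matched / sum(consumption_mwh)

# Invented day: solar-heavy supply covers midday, misses the evening.
load  = [10, 10, 10, 10]
clean = [ 0, 20, 20,  0]

print(hourly_cfe_score(load, clean))  # 0.5 — half the load matched hourly
print(sum(clean) / sum(load))         # 1.0 — "100% clean" on annual totals
```

This is a simplified sketch, not either camp's actual methodology; real accounting also weighs grid location and the emissions intensity of each hour.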

Current accounting rules are fairly liberal in terms of the pathways a company can take to make a 100% clean energy claim, explained Roger Ballentine, president of climate consulting firm Green Strategies. But regardless of which a company pursues, he said, reaching 100% clean energy becomes a lot more difficult in the face of multiplying load.

“It’s a problem that is difficult to solve with any electron, and it becomes even harder to solve with clean electrons,” Ballentine said. “That’s really where this pressure is building up, and something’s gotta give. Either the rules have to give, technology needs to leapfrog, or we’re going to be stuck.”

It’s not that things aren’t moving, though. Updates to the Greenhouse Gas Protocol reporting requirements have been in the works for several years, and are expected to be finalized in 2025. And tech companies themselves have already been very active in the process. Google, for example, submitted comments calling for “more geographically and temporally granular scope 2 accounting,” among other suggestions.

The problem, Ballentine said, is the timing.

“The Protocol is kind of plodding along at one pace, maybe going to be updated significantly, but we’re not sure which direction they’re going to go, and the process could take two or three more years,” he said. “Meanwhile, the AI challenge is today.”

The Protocol itself has serious shortfalls that should be updated regardless, he added. But the AI boom is “accelerating the problem, no question about it.”

A metrics overhaul

Given the slow pace of the Protocol updates and the unique challenges of large AI workloads, it may be time to introduce something entirely new.

The Uptime Institute, an advisory organization focused on critical infrastructure, has been arguing that the industry needs to move toward a work-per-energy metric to better understand AI's footprint.

That type of metric would accomplish two things, said Jay Dietrich, the institute's sustainability research director: "It provides some perspective on what's actually happening inside the walls of the facility, and it gives you a metric by which you can track progress over time."

For generative AI in particular, Kenny, at Interact, suggests a metric like tokens served per watt — tokens being the basic units a language model uses to process text.
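A tokens-per-energy figure along those lines would be straightforward to compute if operators disclosed the inputs; the sketch below uses entirely invented numbers to show the shape of the comparison.

```python
# Hypothetical tokens-served-per-energy calculation, in the spirit of
# the work-per-energy metric described above. All figures are invented.

def tokens_per_kwh(tokens_served: int, energy_kwh: float) -> float:
    """Useful work (tokens generated) delivered per kWh consumed."""
    return tokens_served / energy_kwh

# Two hypothetical serving clusters doing the same work over one day:
cluster_a = tokens_per_kwh(tokens_served=5_000_000_000, energy_kwh=40_000)
cluster_b = tokens_per_kwh(tokens_served=5_000_000_000, energy_kwh=25_000)

print(cluster_b / cluster_a)  # 1.6 — cluster B does 60% more work per kWh
```

The arithmetic is trivial; as the article notes, the hard part is getting companies to disclose both the token counts and the energy figures on a comparable basis.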

EVENT
Transition-AI 2024 | Washington DC | December 3rd

Join industry experts for a one-day conference on the impacts of AI on the power sector across three themes: reliability, customer experience, and load growth.

Register

But Jonathan Koomey, a data center and energy researcher, is skeptical that the companies could reach that type of consensus. “Computing is really complicated, and the AI that Google does is different from what Amazon and Microsoft do,” he said. “There are some common things like tokens, but even if we just agree that’s the right metric, it’s going to be very hard to get them to tell us that data.”

Once models are trained up and out in the world for everyday use, there’s yet another problem: the energy requirements for training a model are vastly different from what’s needed for the model to perform tasks, explained Dietrich.

Uptime predicts that the larger models will ultimately be dispersed onto standard CPU systems, which would make it much more difficult to apply a work-per-watt metric, Dietrich said. At the same time, the energy demand from using a model over time may well be much higher than the energy demand of training it.

Ultimately, Dietrich added, the top priority of a data center should be optimizing efficiency, or delivering as much workload as possible for each megawatt-hour of electricity consumed. But getting to a defined and shared methodology for measuring that efficiency is a “serious problem” that will take several years, he said — and by then it might be too late. 

“What we need to do is have a call to action,” said Kenny. “Can we agree interim measures of work per efficiency, and then we can align the amount of work done to that efficiency measure to say ‘are we getting better at serving AI?’”

That question — of how good we are at the work of AI — is fundamental to what a new metric should be aiming to answer, and answer quickly, Kenny said.

“Until we know how good they are at doing work, we’re just shooting in the dark.”
