As the buildout of data centers accelerates, their strain on the electric grid has grown in turn; forecasts suggest they could consume up to 17% of all US power by 2030. To avoid higher rates and slower AI growth, the industry has embraced a promising solution: data center flexibility.
In this episode, Shayle speaks with Varun Sivaram, the CEO of Emerald AI. Coming on the heels of a $25 million investment round led by Energy Impact Partners, Varun returns to the show to provide an update on the “wickedly complicated” challenge of aligning utilities, cloud providers, and the grid.
Shayle and Varun explore topics like:
- Tapping into the 100-plus gigawatts of unused grid capacity
- Why the “Watt-Bit spread” is shifting to make power flexibility profitable
- The differences between training and inference flexibility, including Google’s new “flex” and “priority” tiers
- The “mini dispatch curve” for data centers created by batteries, gas turbines and fuel cells
- Emerald’s plans to collaborate with NVIDIA and other partners on the world’s first 100-megawatt, truly power-flexible AI factory
Resources
- Catalyst: The mechanics of data center flexibility
- Catalyst: The potential for flexible data centers
- Latitude Media: How the world’s first flexible AI factory will work in tandem with the grid
- Latitude Media: Nvidia and Oracle tapped this startup to flex a Phoenix data center
- Latitude Media: A reality check on flexible data centers
- Latitude Media: Can VPPs unlock grid capacity for data centers?
Credits: Hosted by Shayle Kann. Produced and edited by Max Savage Levenson. Original music and engineering by Sean Marquand. Stephen Lacey is our executive editor.
Catalyst is brought to you by FischTank PR, an award-winning climate and energy tech, renewables, and sustainability-focused PR firm dedicated to elevating the work of both early-stage and established companies. Learn more about their PR approach and how they can support your company’s messaging by visiting fischtankpr.com.
Catalyst is brought to you by EnergyHub. EnergyHub helps utilities build next-generation virtual power plants that unlock reliable flexibility at every level of the grid. See how EnergyHub helps unlock the power of flexibility at scale, and deliver more value through cross-DER dispatch with their leading Edge DERMS platform, by visiting energyhub.com.
Transcript
Shayle Kann: I’m Shayle Kann. I lead the early stage venture strategy at Energy Impact Partners. Welcome to Catalyst. So my friend Varun Sivaram came on this podcast back in August 2025, or roughly a century ago in AI terms. At that time, we talked about his mission at Emerald AI to make data centers flexible, specifically at that time by shifting AI workloads to deliver compute flexibility in response to grid signals. It was then, and remains now, a somewhat controversial concept, largely, I think, because the history of data centers, dating back to the emergence of the cloud industry, always suggested that they are perhaps the most inflexible load on the planet. Not only would they generally pass on participation in demand response programs, but they actually needed N+2 reliability just to ensure that the bits would always continue to flow. But the world has changed in a bunch of ways since then.
The strain that data centers are putting on the grid has become clearer and more pressing than ever. Pressure on electric power rates is high, and so affordability is top of mind across the board. And data center flexibility, either through compute orchestration or through behind-the-meter resources, has started to become a mainstream concept. I actually went on my own journey on this concept and spent a lot of time on it over the last year. And here’s where I came out. From first principles, data centers should be flexible assets. They just should. They have many different types of workloads with different degrees of urgency, and it’s crazy to think that they couldn’t differentiate. And in fact, some of them are now starting to differentiate. But the actual mechanics of getting them to do so, and getting all the players aligned, is really tricky. Anyway, long story short, I invested in Emerald.
We announced just a couple weeks ago that we at EIP led a $25 million round in Varun’s company. So I brought him back on today for an update on the extremely dynamic world of data center flexibility. That’s coming up next.
Varun, welcome back.
Varun Sivaram: Shayle, thank you for having me back.
Shayle Kann: It’s nice to be on the other side and talking as partners in crime in your business. So you were back on this podcast in August of last year, which depending on how you look at it, is either a very long time or a very short time ago. I guess I want to start with just your high level perspective on what has happened in the market. And I guess by the market in this context, I mean data centers in the grid, and then specifically also in your market, which is the provisioning of flexibility for data centers in the grid. So just over that last, whatever that is, nine, 10 months, what do you see as having happened?
Varun Sivaram: I think Shayle, it is a very long time given the dynamics of data centers and the grid. But actually before I answer that question, let me first say how delighted I am to get to work with you as a partner in crime. 10 years ago today was when you and I co-authored a Nature Energy article. We wrote this article about setting a new cost target for solar power, really ambitious one, 25 cents per watt fully installed. And I think the latest prices in China show that the cost is roughly double that. So we’re almost there.
Shayle Kann: More to come on that because I have been doing some things that are still trying to hit that target. As you said, even in China, we have not hit that target yet. I still think it’s possible, but yes, it’s fun to know. I did not realize it was 10 years ago to the day, but obviously it’s been a fun journey. So good to be on the same team here.
Varun Sivaram: Absolutely. So let me get back to that overview. It’s been a long 10 months since I was last on the pod, and here’s what’s changed. Data centers are even more of an energy issue than we thought was the case back in 2025. The latest numbers show that in January of this year, NERC forecast a summer peak increase of 224 gigawatts, almost all of which will come from data centers. Data centers now account for 94% of PJM’s projected peak load growth. And by 2030, EPRI forecasts that data centers could use up to 17% of America’s power. All of these are incredible, insane statistics, and they’re reshaping the landscape of energy as we know it. And you’ve obviously talked about this a lot in your other podcast sessions. But I also think that this particular topic we’re talking about today, it’s important to keep talking about it.
The last time we talked about it was a different context than today. 224 gigawatts of peak load growth is between a quarter and a third of peak demand. And so that’s a massive increase that data centers are going to be driving. And if we try and build our way out of this, as I told you on the last podcast, we risk higher rates and slower AI growth, and that’s good for nobody.
Shayle Kann: I would add, you said higher rates. I mean, yes, it is true that the data center buildout has just continued to accelerate over the past nine or 10 months. Nothing has stopped that upward trajectory, and maybe it has gone parabolic. It’s hard to tell exactly because we’re in the middle of it. The one thing, though, that is much more front of mind today than it was in August of last year is the one you alluded to: affordability. That has come front and center in a way that it wasn’t nine or 10 months ago, when it was just starting to emerge. And now it’s like every conversation is about affordability.
Varun Sivaram: Oh, absolutely. And look, we should be clear that historically, the drivers of rate increases may very well not have been data centers. Data centers may have been conflated in the data. And some data shows that in areas where data centers grew more quickly, rates actually increased more slowly. But I think it’s incontrovertible that going forward, if data centers drive most peak load growth, and peak load drives most rate increases, then absent some mitigations to help with this exorbitant grid buildout, data centers could very well drive affordability difficulties. And that’s why I think there’s a risk that they become a problem, but there’s also a real opportunity for data centers to become the true hero of solving this affordability challenge.
Shayle Kann: Right. And there are multiple ways to do that, and lots of them are starting to emerge. These unique tariffs that data center operators or developers are starting to sign with utilities, I think, are interesting, and there are a bunch of different structures of those. Let’s narrow in, though, on the portion of that that is most relevant to you, which is making data centers flexible assets. So on that front, over the past nine or 10 months, what has changed?
Varun Sivaram: A lot and a little has changed, I’d say. Let me put it this way. There is this deep divide that I observe between the level of flexibility in the service tiers that the compute industry offers and the level of flexibility in the service tiers that the utility and grid operator ecosystem offers in terms of electric service. And that schism is one of the reasons that this is such a hard problem to solve: even though I’m super excited and want to talk to you about the five commercial demonstrations we’ve done at Emerald AI since you and I last talked, it’s still a fundamental challenge to make sure that we can take advantage of all of what I call the stranded power on the grid. So to back up, Shayle, for those listeners who didn’t hear the last episode: data center flexibility, why is it important?
Varun Sivaram: Data center flexibility is important because if these new AI factories, as Jensen Huang at NVIDIA calls them, can be flexible just a little bit of the time, they can utilize this vast amount of unused capacity on the grid. The grid is utilized at roughly 50% or less during most of the year. And therefore, we have 100-plus gigawatts of stranded power capacity that could power AI factories connecting to the grid today, if during those rare peak moments, AI factories could ramp down partially for a limited amount of time. That’s why flexibility is so important. Now, I mentioned the schism because just this month, we’ve seen a lot of movement on the AI and compute side in terms of flexibility tiers. Just earlier this week, Google announced its flex and priority inference tiers. So if you’re a developer and you are buying tokens of artificial intelligence, you’re running models, you’re being served inference, you can choose to have those delivered absolutely immediately, or you can choose to have those delivered after a delay, and you’ll pay different prices.
Varun Sivaram: But more importantly, it’s not just the price component. It is the service component. The service literally changes between those tiers. And it’s not just Google that does this. Anthropic has a peak period for serving tokens, during which you tend to hit your rate limits earlier, and a non-peak period. I know all of this because we have one of these token maxers, Nikil, on our Emerald AI team, and he always tells me every time he hits a rate limit. So given that there are literally different service levels, I believe it’s going to be possible relatively quickly to take advantage of all this flexibility, and the different ways people use compute, to throttle their power consumption. The question, however, is on the other side. On the utility side, it is still a work in progress for a utility to offer a different service level. Today, you typically get one service level.
The service level is: you get power. There isn’t a service level that says: for most of the year you get power, but for some of the year we’re going to ask you to be flexible.
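[Editor’s note: To make the stranded-capacity argument concrete, here is a minimal sketch, with entirely invented numbers, of the underlying headroom calculation: given an hourly load shape, how many hours per year would a new data center need to flex down to fit under a fixed grid capacity limit?]

```python
import numpy as np

# Hypothetical hourly system load for one year, normalized so the annual
# peak is 1.0. A real analysis would use actual utility load data; this
# synthetic shape exists purely to illustrate the calculation.
rng = np.random.default_rng(0)
seasonal = 0.55 + 0.25 * np.sin(np.linspace(0, 2 * np.pi, 8760))
load = np.clip(seasonal + 0.15 * rng.random(8760), 0.0, 1.0)

grid_limit = 1.0   # existing deliverable capacity (normalized to system peak)
new_load = 0.08    # proposed data center, sized at 8% of system peak

# If the data center ran flat out, these are the hours when total load
# would exceed the limit, i.e. the hours it would need to flex down.
overload = load + new_load > grid_limit
print(f"{overload.sum()} hours/year ({100 * overload.mean():.1f}% of the year)")
```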
Shayle Kann: There is, to some extent, which is demand response. That’s the legacy version of what we’re talking about: you enroll in a program (this has been less so for data centers historically, but for large loads), and as part of that enrollment, the utility is going to send you a signal, or call you, as it’s historically been, a couple times a year at peak hours, and ask you to ramp down. This is kind of an extension of demand response, right? What do you think of as being the same and different?
Varun Sivaram: I completely agree that, first of all, there are natural pricing tiers. So if you’re in ERCOT in Texas, you can pay more money for power right now, or you can ramp down and pay less money because power prices are high. Then there are demand response programs, as you mentioned. A utility might say, you can make some money if you agree to curtail during this period. You might own a smart thermostat and be enrolled in a smart thermostat program, and they will pay you if you are willing to reduce your consumption. There are even mandatory programs where a prerequisite of your enrollment is your promise to curtail, with a hefty penalty if you do not curtail. But what’s missing from all of these is the ability to offer the service of curtailment at the large scale that data centers could theoretically provide it, and to get a real benefit out of it.
And that benefit is not just a cheaper cost of power or a flexibility payment. That benefit is a larger power connection or faster access to the power grid, whether you’re an existing data center seeking to increase your capacity or brand new data center seeking to connect, you should be allowed to harvest the existing stranded capacity on the grid that can be served to you if you are willing to curtail every so often. And that product doesn’t exist today. And with good reason, the electric utility industry for over a century has wanted to promise firm service to its customers. And so there really isn’t this non-firm service tier that allows you to skip the line. And there are all kinds of legal considerations that come into play, but that’s absolutely where we have to go.
Shayle Kann: The interesting thing about that: my guess is that if you asked most folks who are in and around this market what the limiter is on data centers becoming flexible grid assets, particularly leveraging flexibility in compute as opposed to leveraging onsite behind-the-meter resources, which we will talk about as well, most people would say the main limiter is actually on the compute side. It’s because in the legacy cloud world, and then AI as it has emerged since then, there was an expectation of extraordinarily high reliability and low latency. And so your expectation of your workload getting curtailed is very low. And so that would have been the limiter. It’s interesting that you’re saying it’s actually on the other side, because on that side, the compute side, it seems to be emerging.
I saw that Gemini announcement as well, and it’s cool to see Google doing that. That makes the new limiter the other side of the equation, which, again, as you said, is that the problem is not offering differential pricing or savings based on curtailment. It is actually saying: look, if you agree to curtail a certain amount for certain times, we will interconnect you faster or we will give you a larger interconnection. That’s the thing that’s missing.
Varun Sivaram: Yeah, exactly. And let me first just preface this by saying it’s not as if faster and larger connections in exchange for flexibility aren’t happening anywhere in the world. In fact, Google has now reached a gigawatt of contracted flexible capacity across, I think, five different utility territories, at least some of whom are willing to provide some of these benefits. So Google’s really been a pioneer. Across the more than 3,000 American utilities, I believe we have a long way to go to bring these differential service tiers onto the market, but still, we’re making good progress. But the point you make, Shayle, is a really important one: everybody sort of discounts data center flexibility because they’re thinking about the compute side, and they say these AI GPUs or accelerators are extremely valuable and produce wildly valuable tokens of artificial intelligence.
Varun Sivaram: If you watched Jensen’s keynote at NVIDIA GTC, you saw some of the ridiculous economics of operating token factories, AI factories, right? It’s a great idea to maximize the tokens you are generating per watt of power. And so the intuitive response is: it is a terrible idea to ever curtail any GPU, because the economics just absolutely don’t make sense. That, by the way, is one of the reasons why I feel fortunate that basically no one else is doing what Emerald does, because that intuition in itself is a blocker to founding a company like this. But I believe it’s actually on the other side, as you said, Shayle. I believe that if electric power utilities and grid operators offered a range of different service tiers, just as the cloud operators offer a range of different service tiers to their end AI customers on the compute side, innovation would solve this problem.
And you would absolutely have data centers taking advantage of the lower service tiers. And they’re not that low, by the way. It’s just, call it, 50 or 100 or 200 hours a year that you’d have to curtail. Data centers would very happily take utilities up on these lower service tiers in order to accelerate their connection or get a larger connection.
Shayle Kann: Yeah. I think there’s another way to put it, which is that it is true economically that it’s probably a dumb idea to curtail, to not maximize token generation, if the benefit of doing so is purely a cost savings on your electricity bill. And so if it is traditional demand response or something like that, and the benefit that you get is just that you save some money on your bill, those numbers don’t pencil, largely for the reason Brian Janous captured when he coined the term “watt-bit spread”: the spread is so big that it’s just not that valuable to save some money on your electricity bill relative to the revenue that you’re going to generate with your chips. So that is true generally. You could disagree with me if you want. But the economics of getting a data center connected larger or faster are orders of magnitude different. And so if that is a benefit that you can get, it actually does flip those economics.
Varun Sivaram: So I completely agree with your second point, and I sort of half agree and half disagree with the first point. On the second point, completely agree. If you’ve got a 200 megawatt data center and you are able to increase its capacity to 230 megawatts just a year ahead of schedule, and you can swap out to liquid cooling and next generation NVIDIA GPUs, you create billions of dollars of value, even netting out the cost of some of the downtime of curtailing during those rare peak load hours. You should absolutely take that deal on a new electric utility service tier. The earlier point you made, though, is: is there ever an economic incentive that makes it worthwhile, on the fly, to reduce your operational expense, your OPEX, by reducing your power cost or getting a flexibility payment to curtail a little bit? Even if the answer isn’t yes in all cases today, and I think there are some cases where it is yes, I think the answer will increasingly become yes in the future, as the cost of inference, the cost of token generation, asymptotically approaches the cost of power, which is the only real operational expense input into the cost of intelligence generation.
And even today, there are lots of efficiencies we can harvest on a temporary basis, meaning I can reduce peak power by a larger amount than the token generation I’m giving up, simply by operating, this is getting a little technical, a little differently on the power-performance curve, on a part of it where I’m not losing as much performance but I am shedding quite a bit of power. There’s a pretty good trade-off to be made on a temporary basis, for example, for some inference workloads. Microsoft has done a great job quantifying this. So there is some low-hanging fruit to harvest here, which means I believe even today it makes sense to be responsive, and in the medium and longer run, it’s going to make a ton of sense to be responsive just in terms of reducing your power bill.
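[Editor’s note: A minimal sketch of the power-performance trade-off described above, using a made-up concave curve rather than measured GPU data: if throughput falls off sublinearly as power is capped, a sizable power reduction costs comparatively little token generation.]

```python
# Illustrative only: assume normalized throughput grows with the square
# root of usable power above an idle floor. Real accelerators have their
# own measured power-performance curves; these numbers are not vendor data.
IDLE = 0.2  # fraction of rated power drawn even when heavily throttled

def throughput(power_frac: float) -> float:
    """Normalized tokens/sec at a given fraction of rated power."""
    usable = max(power_frac - IDLE, 0.0) / (1.0 - IDLE)
    return usable ** 0.5

for cap in (1.0, 0.8, 0.6):
    print(f"power cap {cap:.0%} -> throughput {throughput(cap):.0%}")
# power cap 100% -> throughput 100%
# power cap 80% -> throughput 87%   (20% power saved, 13% tokens lost)
# power cap 60% -> throughput 71%   (40% power saved, 29% tokens lost)
```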
Shayle Kann: It’s a good point. Yeah. It’s a good point particularly about over time. We’re in a moment right now that is not reflective of where we’re going to be in five years or 10 years, or who knows how many years, two years maybe, but inference should and probably will get cheaper and cheaper. The market will be saturated with it. At that point, the cost of energy does matter. So that is a good point. You alluded to one thing I wanted to ask you about. As you think about workload flexibility, talk to me about the different workloads and what is more and less suited. So obviously there’s the training versus inference distinction, and I’m interested in your perspective on that, but even within a category, even within inference, for example, what have you learned having run all these demos now about what types of workloads seem to be the ones where there is enough volume, they are a big enough portion of the overall workloads to matter and where they have the most flexibility?
Varun Sivaram: Yeah, absolutely. There are many different AI workloads. It’s more than just a simple dichotomy of training and inference; each of these categories has lots and lots of different subtypes. I will say a good reference is a March 2026 paper published by Emerald’s chief scientist, Professor Ayse Coskun at Boston University, with two others at BU and at Emerald AI, that shows an 18 to 55% power flexibility opportunity across a range of different representative AI workloads, spanning training, inference, fine-tuning, and all of their subtypes. So there’s a lot of inherent flexibility, is kind of my overarching point, and then I can go into the various kinds, as you suggested. As you mentioned, we’ve now done these five demonstrations at data centers with NVIDIA, with EPRI’s DCFlex initiative, and with many other partners, from Oracle to Nebius to National Grid. And in each of these, we’ve tried to reenact real, production grade, actual workloads.
Varun Sivaram: So for example, in London, we ran workloads from real models, whether OpenAI’s models, Meta’s open source model, even an Alibaba model from China, to showcase that we could achieve performance levels that real customers find acceptable, making sure not to throttle workloads that the customer labels as mission critical or that should not be throttled, while at the same time precisely meeting grid objectives, whether that is responding within seconds to a scenario like a lightning strike or reducing power by 30 or 40% during the halftime of a soccer game, when in England everybody turns on their tea kettles. So we believe that many AI workloads can be throttled in a way that’s acceptable to customers, and many of these will be fine-tuning, post-training, training type workloads. There are other workloads, including some inference workloads like batch inference, that even if they can’t be throttled, might be batched differently.
Varun Sivaram: So again, you’re still basically reducing the power consumption in one data center, or they can be migrated. So with Oracle, we showcased migrating AI workloads from one location, Virginia, to another, Chicago. And this was during the Dominion winter peak period: the inference queries got rerouted in such a way that you were able to precisely meet the Dominion grid’s power constraint while utilizing capacity far away, with the queries moving within milliseconds halfway across the country. So the user experience really wasn’t changed. And so if you’re chatting with a chatbot, which, by the way, is just one of a million different AI use cases, that isn’t an experience that’s going to change very much if this geo-shifting is happening under the surface. So my only point here is there are so many different use cases. Google, when they announced their flex and priority inference tiers yesterday, cited cases like background CRM updates, large scale research simulations, and agentic workflows where a model is browsing or thinking in the background as workloads that are inherently flexible, that you probably don’t need an answer from right this second.
Varun Sivaram: And of course, that helps Google to better optimize its own AI infrastructure. You might be able to use your AI servers for queries that are more urgent, but it also helps us to throttle power use by tapping into the inherent flexibility of computational workloads. And the last thing I’ll close on here is that, historically, data centers have been optimized as a closed system. There are computers, CPUs, GPUs, memory, storage, there are fiber optic networks, there are multiple data centers across the country, and you optimize this closed system. Nobody’s ever considered adding the grid to this closed system. If you weren’t just listening to this podcast and could see my hands, one circle is around the data center system, and a bigger circle is around the data-center-plus-grid system. No one’s ever added the grid to the closed system. And once we add the grid, then we are optimizing not just for where there are servers available or where there are fiber optic congestion constraints, but also for where there are transmission line congestion constraints or where there is inadequate generation.
And that overall optimization problem causes you to harvest computational flexibility in a different way, and often in a way that utilizes this massive electric grid fixed asset and saves everybody money.
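[Editor’s note: A toy illustration of the “bigger circle” optimization Varun describes: routing a migratable job to a site with headroom in both circles, free servers inside the data center system and spare grid capacity outside it. Site names, caps, and prices are invented.]

```python
# Toy grid-aware placement: route a migratable job to the cheapest site
# with both free compute and headroom under its current grid power cap.
# All sites, numbers, and prices are invented for illustration.
sites = {
    "virginia": {"free_mw": 5,  "grid_cap_mw": 0,  "price": 0.09},  # winter peak: no grid headroom
    "chicago":  {"free_mw": 40, "grid_cap_mw": 35, "price": 0.07},
    "phoenix":  {"free_mw": 30, "grid_cap_mw": 25, "price": 0.06},
}

def place(job_mw: float) -> str | None:
    """Pick the lowest-price site whose compute AND grid headroom fit the job."""
    feasible = [
        (s["price"], name) for name, s in sites.items()
        if min(s["free_mw"], s["grid_cap_mw"]) >= job_mw
    ]
    return min(feasible)[1] if feasible else None

print(place(20))  # -> 'phoenix': cheapest site with 20 MW of joint headroom
```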
Shayle Kann: One thing I’ve noticed as I’ve spent more and more time with you, learning about what Emerald is building and more broadly about the concept of compute flexibility with data centers, is that one of the challenges is just that it’s a multi-party situation. There are a bunch of actors in any given situation. It’s not as simple as you want it to be. It’s not like there’s just a data center and the grid. Even within the data center, quote unquote, somebody is operating the data center, somebody else might be the cloud provider to the data center, somebody else might be running the workloads or actually be the customer. So can you walk me through how you think about the stack of who needs to do what? If we’re going to deliver on this promise and take advantage of the hundred gigawatts of latent capacity we’ve got on the grid by making data centers flexible, who needs to sign off on what?
Varun Sivaram: Yeah. It is a wickedly complicated multi-party problem. I’m delighted, Shayle, you got comfortable getting your hands around this and now we can work together. Look, I go back to an earlier question you asked, which is: what’s the most critical thing that has to happen? The most critical thing that has to happen is power utilities and system operators and regulators and governors saying, “If you are willing to be power flexible, data center, we want you in our state. You get to skip the line, you get to connect faster as a fast-tracked flexible load, you get a bigger data center, et cetera.” If that happens, I believe everything else quickly falls in line. Now, you’re right, the data center is not one monolithic entity. It comprises a lot of players. You might have, for example, a data center developer, owner, and operator, I’ll name one, Digital Realty, a terrific one that we partner with, that is operating a data center within which they have a tenant.
That tenant might be one of the many folks we’ve partnered with, like Nebius, for example, or Oracle or Lambda. And within that cloud provider, which, by the way, could be a hyperscaler as well, you might then have a customer. And that customer, by the way, may not be the end customer. You might have together.ai or fireworks.ai, inference-serving services that are then serving tokens and enabling an end customer to run models on them. And there may be N layers here. And ultimately, all of those layers have to work together so that the data center, at the point of common coupling, that interconnection point to the grid, adjusts its net withdrawal from the system consistent with the grid signals. This is complicated. Emerald seeks to be the easy button for data center flexibility. And in order to do that, we basically have to have modules at every layer of this stack.
We have a module for utilities. We have a module for the data center operator to interact with the utility and communicate. We have modules for the cloud operator, for the end user to have Emerald agents to help them to most gracefully throttle the workloads that they may want to throttle. We have agents elsewhere that are working on the onsite energy resources to harness all of those energy resources as well. So it is a complicated stack, but I will say everybody becomes much more willing to work together when there’s a real economic incentive, and it’s the grid that sets that incentive, which is to say you get connected faster, you get a bigger data center if you’re willing to do this, everybody else will work together.
Shayle Kann: Yeah. I agree with that. The prerequisite: if the grid says the right thing, everybody else falls into line. You just mentioned the onsite resources. I want to talk about that for a second too, because I think part of what’s happened as the concept of data center flexibility has gained more prominence is that it has morphed somewhat. Sometimes people, when they say data center flexibility, are talking about workload flex. Other times they’re talking about, from the grid’s perspective, what makes a flexible data center. And in many cases right now, what’s happening is that data center developers and operators are putting a bunch of assets behind the meter. Usually what that means is gas turbines of one kind or another, maybe some BESS, some battery storage, maybe some other generation, solar could be behind the meter as well, but it’s behind-the-meter generation and storage.
And so there’s one version of data center flexibility, which is just: the grid or the utility says, “I need you to curtail this amount now,” and you fire up your behind-the-meter generator and don’t do any workload flex at all. There’s another version where these things all play together. So walk me through how you see the landscape emerging with the relationship between behind-the-meter physical resources and workload flex.
Varun Sivaram: Totally. I’ll make a prefatory point, which is that I believe AI factories belong on the grid. I think it’s best for everybody. This is counterintuitive, by the way. A lot of folks might say, “Well, if the data centers just went off the grid, that would insulate the ratepayers from the peak load increase that the data centers would cause, and the bill increases, et cetera.” But I believe that’s a little bit shortsighted. The farsighted way of thinking about this is: as data centers become, by the end of this decade, up to 17% of America’s load, and in the decade beyond a quarter, a third, half of America’s load, it would be a catastrophe if data centers were entirely decoupled from the electricity system, because the system loses its biggest source of anchor tenant revenue and the most exciting engine of American economic growth. It’s a terrible idea to be completely off grid forever.
But in order to achieve that, there may be a period of time in the near term where data centers say, “I need to get online right this second, and so therefore I’m going to build myself bridge power.” It’s highly rational. And the hope is we will be able to quickly bring those behind-the-meter resources to bear to support the broader electrical system and connect those data centers. With NVIDIA, we made a major announcement at CERAWeek that NVIDIA has a reference architecture. It’s called DSX. It’s how AI factories should be laid out and should operate. One element of it is DSX Flex, the capability to be flexible, and Emerald is a software partner that helps to operationalize that. And we joined six of the largest American power companies to say, even if you are putting on bridge power, we’re going to incorporate that into the DSX reference design.
We’re going to call them hybrid AI factories. And we’re going to make sure that they can work together as a single unit to provide services back to the grid. In some sense, this is a super flexible AI factory facility. And the reason for this is you can coordinate the onsite resources, the gas generators, the batteries, alongside the computational flex, because AI factories, these token factories, are inherently flexible; as I mentioned, there’s that 18 to 55% inherent flex built into AI workloads. You can take all of this together, and when you do get a grid connection, and ideally it comes faster than it otherwise would because you’re flexible, you’re able to provide real services back to the grid. So one of the things we’ve been focused on at Emerald is orchestrating not only the computational resources, slowing down workloads, moving workloads, but doing that in tandem with the onsite energy resources: generators, cooling, batteries, fuel cells.
And the reason that’s important is you might have a microgrid onsite. It may be operated through the software systems of a Siemens or an Eaton or a GE Vernova, all of whom actually just joined this round in Emerald AI that you led, Shayle. And we will play very nicely with all of them. We’ll integrate, and Emerald will help to recruit generation and flexibility from those onsite energy resources and coordinate it with the computational flexibility from the GPUs onsite. And that entire unified amount of flexibility is what the grid sees as a change in the net withdrawal of energy from the grid. So I look at bridge power and behind-the-meter resources as really a way to supercharge flexibility, not a way to remain a permanent island.
Shayle Kann: I’ll tell you the way that I think about it, and you can tell me if this resonates with you. There’s a dispatch curve in electricity in general. And when there’s a certain amount of demand, you start at the bottom of the dispatch curve, which is basically the cheapest resource to generate, and you keep moving up the dispatch curve until you meet the demand. And so in the context of the broader electricity market, the low end of the dispatch curve is stuff with no marginal cost, which is solar and wind mostly, right? And then you get further and further up and other things get dispatched more and more. When there’s a single asset, a single data center that has multiple resources that it can draw upon to meet a need, which in this case is going to be some amount of curtailment from the grid’s perspective, it’s kind of like a little mini dispatch curve, right?
And they may have one thing that can go in that dispatch curve or they may have six. It doesn’t really matter. And if you think about it in that context, then it should be that workload flex is the bottom of that dispatch curve. In other words, the cheapest thing to deploy, assuming, to your point, that you’re still taking advantage of the latent, inherent flexibility. In other words, you’re not sacrificing customer SLAs, you’re not sacrificing performance to customers, things like that. If you take that to be true, then workload flex is the cheapest thing you can do, and you should do as much of it as you can, as long as you don’t sacrifice customer performance. Then if you need more, which you may well, you should then dispatch things that cost more money. And that may be firing up your generator that you have behind the meter.
It may be firing up your battery or dispatching your battery or fuel cells or whatever it is. All those things come at a significantly higher cost to dispatch, but you might be able to get more out of them. You might have a gas generator behind the meter that is rated to the same capacity as the entire data center. And so if you need to flex down to zero, that’s the way to do it. But if you need to flex down 20%, it actually might make more sense in most cases just to do the workload flex. So I think of it as this little mini dispatch curve that some data centers will ultimately have, but really only the ones that do have the behind the meter resources.
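[Editor’s note: A minimal sketch of the “mini dispatch curve” idea from this exchange: stack a site’s curtailment resources in merit order, cheapest first, until the grid’s requested reduction is met. Resource names, capacities, and costs are hypothetical.]

```python
# Merit-order dispatch of curtailment resources at a single data center.
# Costs are rough $/MWh-equivalents of forgone revenue or fuel; all
# values here are invented for illustration.
resources = [
    ("workload_flex", 40, 30),   # (name, capacity in MW, marginal cost)
    ("battery",       50, 90),
    ("gas_turbine",  200, 150),
]

def dispatch(reduction_mw: float) -> list[tuple[str, float]]:
    """Meet a requested load reduction, cheapest resource first."""
    plan, remaining = [], reduction_mw
    for name, cap, _cost in sorted(resources, key=lambda r: r[2]):
        take = min(cap, remaining)
        if take > 0:
            plan.append((name, take))
            remaining -= take
    return plan

print(dispatch(40))   # -> [('workload_flex', 40)]: pure compute flex
print(dispatch(120))  # -> 40 MW workload flex, 50 MW battery, 30 MW gas
```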
Varun Sivaram: I love the dispatch curve analogy. All I’ll say is I don’t think it’s a static dispatch curve. So in electricity markets, it’s always that gas peaker that’s going to set that marginal price when you have sufficient demand. It’s always that skinny pointy one at the right side of the dispatch curve. Whereas in the data center, you’ll have a complicated, dynamic, constantly changing dispatch curve. I agree with you that there’s always going to be a fat, short part on the left-hand side of the dispatch curve that’s going to be some latent workload flexibility that we can just harvest. There will be customers who are willing to tolerate a little bit of flexibility, and there will be workloads that are inherently tolerant to some flexibility. I’ll just note here, by the way, there are so many reasons that AI users are tolerant to workload flexibility because all other kinds of things can happen in a data center that might require them to be flexible.
So power is just the next thing that we ask them to be flexible about. But in addition, there will be workloads that are less inherently flexible, or that are higher up on that dispatch curve. They might sandwich the battery. The battery, by the way, might have an operating constraint: it can provide you a certain amount of power for a certain amount of time, and that sets the width of that bar, so to speak. But you might have some very interesting dispatch algorithms. One day you might even have what I call energy-token arbitrage, or watt-token arbitrage, where you might actually choose to throttle tokens even before the grid requires you to do so, because it’s economically optimal in that particular case to, say, charge your battery. And I believe that as we build in intelligence such as forecasting, which many of our five demonstrations have now done, we’ll be able to forecast on both sides: on the grid side, when we expect an event to arrive, and on the AI side, when we expect a job that is more or less flexible to arrive on the scheduler. All of this means it’s just a more complex dispatch curve, but I love the analogy. And for us grid wonks, it’s a useful organizing principle.
Shayle Kann: Yeah. It’s a good point on the charging-the-battery one. I think people haven’t really thought this one through. We’re going to put a lot of batteries behind the meter at data centers; I’m pretty convinced that that’s going to happen. But let’s say you have a 200 megawatt data center and a 200 megawatt interconnect, and you add a battery. How do you charge that battery? Either you need a bigger interconnect, because your total load is actually 200 megawatts plus the size of the battery, which is going to be big, or you need to figure out how to be flexible in your power consumption from the data center, such that some of the time you can be simultaneously charging the battery and pulling from the grid. And so that’s an inherent workload flex requirement that you’re going to have to solve, unless you’re going to get a much bigger interconnection, which nobody can get.
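[Editor’s note: The interconnect arithmetic in the exchange above, with hypothetical round numbers: at a fixed grid connection, charging a behind-the-meter battery from the grid requires the compute load to flex down by at least the charging power.]

```python
# Fixed interconnect: grid draw = compute load + battery charging power.
# All numbers are hypothetical.
interconnect_mw = 200
battery_charge_mw = 50

# To charge the battery at full rate without exceeding the interconnect,
# the compute load must flex down to at most:
max_compute_while_charging = interconnect_mw - battery_charge_mw
print(max_compute_while_charging)  # 150 MW, i.e. a 25% workload flex
```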
Varun Sivaram: Yeah, completely agree. And Shayle, I guess I just don’t want to lose sight of the overarching story here, which, 38 minutes in, I’m now going to share. What you just described, Shayle, is an important functionality, and I’ll call it the fourth bird that you can kill with this stone. But the first three birds are: first, let’s get AI factories, data centers, connected much more quickly and at larger capacities to grids, thanks to flexibility. Second, let’s keep rates low and stable, thanks to flexibility, by avoiding unnecessary grid buildouts. We’ve still got to build, but nevertheless, if we can harness flexibility, we can build less aggressively for more slowly rising peak demand, while bringing on massive amounts of energy demand, megawatt-hour demand, from data centers that helps to pay for the whole system. And third, let’s keep the system reliable. The third bird here is that AI factories can respond to system needs: that lightning strike, that soccer game tea kettle spike, a heat dome (which we demonstrated in Portland, Oregon with PGE, the utility, and NVIDIA), and many other potential reliability issues.
Well, then, with one solution, we’ll be able to basically get the grid we want and the AI adoption that we want. It’s that really rare holy grail solution. It’s why there’s so much chatter about it, but you’ve correctly laid out the reasons it’s hard. There are a lot of actors that you have to coordinate. There are a lot of ongoing forces, such as the push to just go entirely off grid.
And of course, there’s the lack of those differentiated service tiers from the electric power system. Later this year, as you know, Emerald and NVIDIA and some other partners, Digital Realty, EPRI, Dominion, and PJM, will put together the world’s first 100-megawatt, commercial-scale AI factory that is truly power flexible. It’s custom designed from the ground up to be power flexible, and it’s going to be able to respond precisely to all of these grid needs, but at a commercial scale. My hope is the community sees that, and that in parallel, we get to the point where more electric utilities understand they have to offer these differentiated service tiers and give you accelerated interconnection and larger connection sizes. And that’s when, in late 2026 and 2027, this really takes off and kind of solves all three of those problems.
Shayle Kann: All right, Varun, this was fun as always. Appreciate you coming back.
Varun Sivaram: Really appreciate it, Shayle. Thanks for having me.
Shayle Kann: This show is a production of Latitude Media. You can head over to latitudemedia.com for links to today’s topics. Latitude is supported by Prelude Ventures. This episode was produced by Max Savage Levenson, Anne Bailey, and Sean Marquand. Mixing and theme song by Sean Marquand. Stephen Lacey is our executive editor. I’m Shayle Kann, and this is Catalyst.


