Grid edge

How Avangrid built a data foundation for AI operations

The first step for utilities deploying artificial intelligence? Ditch the spreadsheets.

June 13, 2024

The Avangrid control room. Photo credit: Avangrid

When Mark Waclawiak started running the operational performance team at Avangrid in 2020, he quickly realized the whole utility was hungry for data. As at many other utilities, Microsoft Excel was the “heart of data management,” which limited the potential for analytics and insights.

“Our biggest challenge was these underlying infrastructures and systems that are not necessarily built for analytics,” said Waclawiak on the With Great Power podcast. “We were looking at…Access databases from 1997 and a series of Excel spreadsheets.” 

The first thing the operational performance team did was move data out of primitive data management applications into something more nimble. They decided to ditch Access and Excel, and build SQL databases.

“We unified the data. We started cleaning the data. We started improving the processes,” which immediately created “huge improvements in what we were capable of,” said Waclawiak.

The team started with operations data because “so much of their work has had to rely on the subjective.” Waclawiak wanted to make the operational decision-making process objective by grounding it in data. “Right off the bat, we started developing…significantly more advanced reporting, but reporting tailored to the needs…of our operational people,” he added.

Vegetation management data was one of the first datasets they targeted because it could immediately improve reliability. 

The operational performance team also worked to combine deficiency maintenance notifications, or equipment failure data, with outage notifications. That way, Waclawiak’s team could “make smarter decisions about where [they] address both our maintenance programs and our investment programs,” he said.

Then, they tackled their first major project: building a unified outage database that integrated all of the outage data across Avangrid’s operating companies. “We were looking at a series of Excel spreadsheets and Access databases, and we essentially dismantled all of that and built pipelines from our different systems to just a centralized SQL server,” Waclawiak said. 
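
As a rough illustration of what such a pipeline involves, the sketch below loads outage records from two differently shaped exports into one table. Everything in it is an assumption made for illustration: the column names, the sample rows, and the use of Python's built-in SQLite in place of SQL Server. It is not Avangrid's schema or code.

```python
import sqlite3

# Hypothetical exports from two operating companies. Column names and
# values are invented; real source systems will look different.
COMPANY_A_ROWS = [
    {"OutageID": "A-1", "Start": "2023-01-05", "Cause": "Tree ", "CustOut": "120"},
    {"OutageID": "A-2", "Start": "2023-02-11", "Cause": "equipment", "CustOut": "45"},
]
COMPANY_B_ROWS = [
    {"event": "B-9", "began": "2023-01-05", "reason": "TREE", "customers": "300"},
]


def normalize_a(row):
    """Map a company A record onto the unified column order."""
    return ("A", row["OutageID"], row["Start"],
            row["Cause"].strip().lower(), int(row["CustOut"]))


def normalize_b(row):
    """Map a company B record onto the unified column order."""
    return ("B", row["event"], row["began"],
            row["reason"].strip().lower(), int(row["customers"]))


def build_unified_outages(conn):
    """Create one outage table and load the cleaned rows from both sources."""
    conn.execute(
        """CREATE TABLE outages (
               company       TEXT NOT NULL,
               outage_id     TEXT NOT NULL,
               start_date    TEXT NOT NULL,
               cause         TEXT NOT NULL,
               customers_out INTEGER NOT NULL,
               PRIMARY KEY (company, outage_id))"""
    )
    rows = [normalize_a(r) for r in COMPANY_A_ROWS]
    rows += [normalize_b(r) for r in COMPANY_B_ROWS]
    conn.executemany("INSERT INTO outages VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # SQLite stands in for SQL Server here
build_unified_outages(conn)

# Once causes are spelled consistently, one query spans both companies.
tree_customers = conn.execute(
    "SELECT SUM(customers_out) FROM outages WHERE cause = 'tree'"
).fetchone()[0]
print(tree_customers)
```

The cleaning step is deliberately small: trimming and lower-casing the cause field is what allows a single query to count tree-related outages across both companies.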

From there, they started adding additional data, like regulatory comments, customer data, and field worker observations: “You start getting all of these different capabilities from just having a simple SQL server database,” he said.

They quickly moved on to building a SQL database for millions of asset records, which are the “core of the utility business.”

From there, they added program data. “When you centralize program data into a SQL server database, you've got three relatively simple, straightforward, relational databases that are all now essentially interlinked,” Waclawiak said.
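
To make “interlinked” concrete, here is a minimal sketch of three small tables that share an asset identifier, with one query that draws on all of them. The table layouts, the `asset_id` key, and the sample rows are hypothetical, and SQLite again stands in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical schemas: three simple tables linked by asset_id.
    CREATE TABLE assets (
        asset_id   TEXT PRIMARY KEY,
        asset_type TEXT,
        circuit    TEXT);
    CREATE TABLE outages (
        outage_id     TEXT PRIMARY KEY,
        asset_id      TEXT REFERENCES assets(asset_id),
        cause         TEXT,
        customers_out INTEGER);
    CREATE TABLE programs (
        program_id   TEXT PRIMARY KEY,
        asset_id     TEXT REFERENCES assets(asset_id),
        program_name TEXT,
        status       TEXT);

    INSERT INTO assets VALUES
        ('P-100', 'pole', 'C1'),
        ('P-200', 'pole', 'C1'),
        ('T-300', 'transformer', 'C2');
    INSERT INTO outages VALUES
        ('O-1', 'P-100', 'equipment', 80),
        ('O-2', 'P-100', 'tree', 40),
        ('O-3', 'T-300', 'equipment', 200);
    INSERT INTO programs VALUES
        ('G-1', 'P-100', 'pole replacement', 'planned');
    """
)

# Which assets have caused outages but are not covered by any program?
uncovered = conn.execute(
    """
    SELECT a.asset_id,
           a.asset_type,
           COUNT(o.outage_id)   AS outage_count,
           SUM(o.customers_out) AS total_customers_out
    FROM assets a
    JOIN outages o       ON o.asset_id = a.asset_id
    LEFT JOIN programs p ON p.asset_id = a.asset_id
    WHERE p.program_id IS NULL
    GROUP BY a.asset_id, a.asset_type
    ORDER BY total_customers_out DESC
    """
).fetchall()
print(uncovered)
```

A question like this is awkward to answer across separate spreadsheets, but it becomes a single join once the three datasets share a key.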

With these three interlinked databases, Waclawiak said his team could start developing “really powerful tools.” He likens it to building a house: “You can't really build a house from the roof down,” he said.

Waclawiak says AI hinges on good data in available structures that can be continuously built up. “If you're trying to build an AI model…whether it's machine learning or [generative] AI on a series of…disparate Excel spreadsheets or Access databases, you're doomed to fail,” he said. 

Waclawiak’s team is currently building a new model of their service territory, called GeoMesh, that combines historical data with various conditions to predict grid performance.

“You really give yourself the opportunity to start asking these questions that we previously thought were unanswerable,” he said, “just because we took the approach of centralizing all of our data in very clean, very well structured tables, and just simple relational databases.” 

In this episode of With Great Power, host Brad Langley goes deep with Mark Waclawiak about how Avangrid built the foundation for its work in AI.

With Great Power is a co-production of GridX and Latitude Studios. Listen to all episodes here.


Brad Langley: Connecticut back in the nineties, was a hotbed for changes to the electric utility industry. The state was actually the first in the nation to formally acknowledge that CO₂ emissions drive climate change and pass legislation in 1990 to encourage energy efficiency. It also deregulated electricity markets.

Speaker 2: Pressure is mounting from many fronts, for businesses and governments to get more aggressive about curtailing the emissions that are driving the dangerous warming of our planet.

Speaker 3: With deregulation, customers get a choice of who they can purchase their electricity from.

Brad Langley: At the time, Mark Waclawiak was just a kid, but he was already very plugged into energy issues.

Mark Waclawiak: My father worked in electric utilities and my mother designed power and lighting systems for industrial and commercial clients. So the electrical grid was a topic at almost every night's dinner table.

Brad Langley: Mark took a particular interest in his mom's work. He would spend his afternoons at her office drawing electricity designs for buildings. For a kid who liked solving puzzles, it really couldn't get much better than that.

Mark Waclawiak: Looking at a blueprint of a building and figuring out how are the people in this building going to plug in their appliances, how are they going to have lighting so they can do their work or manufacture parts? That was something that was like an everyday puzzle.

Brad Langley: But the tangible nature of his mom's work is what really left an impression on Mark.

Mark Waclawiak: When we would drive around Connecticut, my mom often pointed out to all of the different buildings and essential services like hospitals that she had designed, and the criticality and necessity of electricity was part of every day.

Brad Langley: Years later, Mark moved to Texas for college and ended up working in publishing, but something just wasn't clicking for him, so he headed back to school for a degree in electrical engineering.

Mark Waclawiak: That kind of pull right back into power engineering, of solving problems and puzzles and developing these systems that are outside and critical necessity, and seeing the world transform around you, got me back into utility work.

Brad Langley: Mark moved back home to Connecticut and got a job interning with the United Illuminating Company, which is where he still works today.

Mark Waclawiak: It was just so incredibly exciting. I actually started in solar and solar interconnections in Connecticut, and being able to go and see how the solar boom on residential was occurring in Shoreline, Connecticut, you realize the world is rapidly, rapidly changing, and I couldn't wait to be a part of that.

Brad Langley: This is With Great Power, a show about the people building the future grid today. I'm Brad Langley. Some people say utilities are slow to change, that they don't innovate fast enough. And while it might not always seem like the most cutting edge industry, there are lots of really smart people working really hard to make the grid cleaner, more reliable and customer-centric. This week I'm speaking with Mark Waclawiak, Senior Manager of Operational Performance at AVANGRID, which is the parent company of United Illuminating Company.

The criticality of the power system is deeply ingrained in Mark. He's always looking for ways to improve reliability.

Mark Waclawiak: Electricity really is a critical part of everyday life and people need electricity 24/7.

Brad Langley: When he joined United Illuminating as an entry-level engineer back in 2017, Mark developed a system to collect and store asset data, which he later used to develop proactive maintenance procedures. His ingenuity led to a promotion to United Illuminating's parent company, AVANGRID, to lead operational performance.

Mark Waclawiak: Our job is to essentially use data to make sure that we keep the lights on.

Brad Langley: Today, Mark and his team are exploring how artificial intelligence can improve reliability, and they've developed a few different programs to test the technology. But before we dug into that, I asked Mark how his team works. Can you describe for us the function of the operational performance team?

Mark Waclawiak: So the operational performance team really is the data science and analytics arm of electric operations. We do reporting for internal and external stakeholders when it comes to reliability data. We have an engineering team that develops capital projects and programs to improve reliability for our customers. And then we've got a core data science analytics team that develops the underlying infrastructure and more complex algorithms and models for electric operations. And really our guiding principle, our North Star in a lot of this work, is reliability and resiliency. When we talk about what are most important to our customers, one of the, if not the always top concern, is are my lights on? And so our job really is to use data science and analytics to best make decisions to ensure our customers have that power when they need it.

Brad Langley: And as part of your efforts, you guys started diving into outage data and you wanted to look at it across all of AVANGRID's utilities, but each of AVANGRID's utilities has a different outage management system, which is not uncommon in the utility space, especially if you're bringing multiple companies together. Describe for me the problem y'all were trying to solve in analyzing this outage data.

Mark Waclawiak: As you say, there's a lot of challenges when you're looking at four different operating companies across three different states, three different regulatory environments, but really down to the core of it, our biggest challenge was these underlying infrastructures and systems that are not necessarily built for analytics. And by that I mean we were looking at Access databases from 1997 and a series of Excel spreadsheets. And I think that's one of the big commonalities you'll see in, I think, electric utilities, is that Excel seems to be at the heart of data management. And when you think about really advanced analytics and developing data science models and doing much more advanced work, you've got to develop an infrastructure beyond, I think, what people are used to and comfortable with, which is very much Excel.

And so first we really just said, "Okay, we're going to do a big transformation of this data. We're getting rid of Access, we're getting rid of Excel." And we didn't do anything incredibly complicated. We didn't go into some futuristic next generation cloud computing. We just built some, I think, relatively straightforward SQL databases, relational databases. We unified the data, we started cleaning the data, we started improving the processes. And just from that immediate transformation, you saw these huge, huge improvements in what we were capable of.

Brad Langley: And what did you do with the data once you could look at it across all the companies?

Mark Waclawiak: Well, electric operations in particular, has always been incredibly hungry for data when it comes to making decisions. So much of their work has had to rely on the subjective if they don't have the data to make those decisions. And so what we wanted to do is really transform from the subjective to the objective and use data to make better decisions. So right off the bat, we started developing significantly more advanced reporting, but reporting tailored to the needs really of our operational people. When it comes to vegetation management for example, not only did we trim this tree or how is the circuit performing, but being able to map vegetation outages and vegetation data across our entire system so our arborists and vegetation managers could start really getting insights into how the system is performing. Similar with things like deficiency maintenance notifications, not just saying, hey, here's a report on customer or company equipment failure for the system, but mapping out your different outages and types of outages, and combining that with deficiency data so we can make smarter decisions about where do we address both of our maintenance programs and our investment programs.

And so it really was first, understanding what are the problems that we want to solve, because the analytics is really a means to the end. The value is in making different decisions out in the field.

Brad Langley: Can you dig in a bit to the types of roles and positions that are in the operational performance group? What is it made up of specifically?

Mark Waclawiak: What we did is established three different departments: a reporting department, which was really analyst focused and analytics focused, of being able to provide accurate and reproducible results for any type of reliability needs. And that's critical for internal and external. We're heavily regulated and our regulators all have different expectations around reliability reporting. We also wanted to make sure that our analytical and data science work was not just living in the theoretical or overly academic. And so we also built an engineering team of people that had 10 to 30 years worth of power engineering experience within the industry. And they also were then able to act really as SMEs and really, I'd say, experts on how the electrical grid works. Because then that third department and that critical part of it, was that data science department. So that department has data scientists who we've been fortunate to recruit. Really, really brilliant PhDs from the Astrophysics Department of Rochester Institute of Technology.

We also have data engineers from a diverse background to do a lot of that underlying infrastructure development work and pipeline development. And then data analysts to really work through the data to solve those daily problems. And I think that was something where I was confident that I could teach electrical systems and I could teach what fault current means to a data scientist. I would not be able to teach Python to a series of electrical engineers. And so we took the approach of we are going to bring in people whose real specialization was data, and turn them into utility data scientists.

Brad Langley: Talk us through the AI work you're doing and maybe specifically, you believe AI needs a strong foundation. What do you mean when you say AI needs a strong foundation?

Mark Waclawiak: So I consistently use a metaphor of you can't really build a house from the roof down. And so what AI really hinges on is do you have good data in very available structures, that allows you to build this on top of? And AI in itself is a tool, and so by being able to pull together really strong data sets like outage data, asset data, weather data, and then structuring in a way that you can get these different data sets to essentially interact and communicate with one another, then you can start building these advanced AI models. But the foundational aspect of it is, if you're trying to build in AI models, whether it's machine learning or GenAI on a series of disparate Excel spreadsheets or Access databases, you're doomed to fail. You really do need good quality data in good quality, I'd say, systems, and they don't have to be complicated. And I think that's something that I've always wanted to really stress is our AI development is really built on simple SQL relational databases.

Brad Langley: These supporting databases sound really important and foundational to work you're doing. Can you talk us through a bit more on these supporting databases?

Mark Waclawiak: So I kind of said when we started from reliability, we were looking at a series of Excel spreadsheets and Access databases, and we essentially dismantled all of that and built pipelines from our different systems to just a centralized SQL server. It allowed us to then bring in additional data from, say, if we wanted to have ECC comments, what the dispatchers saw when the outage was happening; if we wanted to bring in data from customers or vice versa, integrate this outage data into customer-centric metrics like [inaudible 00:13:25] so you can match it in with what the customer's experience and records are. You start getting all of these different capabilities from just having a simple SQL server database. And we started with outage data and then we quickly moved on to asset data. The core of the utility business is these electric assets and we have millions and millions and millions of asset records.

So then developing a SQL server database with asset records, and then right after that, program data. Now I talked about our outage data being in a series of Excel spreadsheets. Our program data in many ways is in a series of Excel spreadsheets. Now, when you centralize program data into a SQL server database, you've got three relatively simple, straightforward relational databases that are all now essentially interlinked, and so you can start developing really powerful tools and integrating new data sets.

Another data set that we brought in for our unified outage database, was weather data. I mean, weather is one of the most critical external variables for our system. And so now that we had all of this nice clean, well-structured outage data, we brought in daily weather data from every weather station in the Northeast, for the past 10 years. And started doing modeling of how have wind speeds changed over those years, and how have those changing wind speeds changed the impact of trees in upstate New York? You really give yourself the opportunity to start asking these questions that we previously thought were unanswerable, just because we took the approach of centralizing all of our data in very clean, very well-structured tables in just simple relational databases.
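
The kind of question described here, how wind and tree-related outages have moved together over the years, might be sketched as follows. The station codes, table layouts, and readings are invented for illustration, and SQLite stands in for SQL Server; this is not the team's actual model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical tables: daily station readings and outage records.
    CREATE TABLE daily_weather (
        station      TEXT,
        obs_date     TEXT,
        max_wind_mph REAL,
        PRIMARY KEY (station, obs_date));
    CREATE TABLE outages (
        outage_id  TEXT PRIMARY KEY,
        station    TEXT,   -- nearest weather station (assumed mapping)
        start_date TEXT,
        cause      TEXT);

    INSERT INTO daily_weather VALUES
        ('S1', '2014-03-01', 20.0), ('S1', '2014-03-02', 40.0),
        ('S1', '2023-03-01', 38.0), ('S1', '2023-03-02', 44.0);
    INSERT INTO outages VALUES
        ('O-1', 'S1', '2014-03-02', 'tree'),
        ('O-2', 'S1', '2023-03-01', 'tree'),
        ('O-3', 'S1', '2023-03-02', 'tree'),
        ('O-4', 'S1', '2023-03-02', 'equipment');
    """
)

# Count tree outages per station-day first, so that joining them to the
# weather table cannot duplicate readings and distort the wind average.
yearly = conn.execute(
    """
    WITH tree_by_day AS (
        SELECT station, start_date, COUNT(*) AS tree_outages
        FROM outages
        WHERE cause = 'tree'
        GROUP BY station, start_date)
    SELECT substr(w.obs_date, 1, 4)         AS year,
           AVG(w.max_wind_mph)              AS avg_max_wind_mph,
           SUM(COALESCE(t.tree_outages, 0)) AS tree_outages
    FROM daily_weather w
    LEFT JOIN tree_by_day t
           ON t.station = w.station AND t.start_date = w.obs_date
    GROUP BY year
    ORDER BY year
    """
).fetchall()

for year, wind, trees in yearly:
    print(year, wind, trees)
```

Comparing the yearly rows side by side is only a starting point; it shows the two series in one place without claiming that one causes the other.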

Brad Langley: One of your big AI projects is the GeoMesh platform, which uses machine learning models to understand how different parts of the grid respond to different variables. You could probably describe GeoMesh a lot better than I can. Could you tell us what it is in your own words?

Mark Waclawiak: Yeah, so our GeoMesh platform is really a way for us to, I'd say, better model our system when it comes to all of these different data sets. If I want to know how has our vegetation impact in the Finger Lakes changed over the past 10 years, you really need a geospatial kind of model to do it. And so what we wanted to do is build something where we could run models that were of any type of question. Like I said, one of our big things is we wanted to be able to answer questions we previously thought unanswerable. And whether that's vegetation or assets or weather, all of these challenges have a geospatial aspect to it.

So we essentially modeled our entire kind of service territory and broke it down into these little chunks, which allows us to essentially run these historical models and these machine learning models, with the integration of new data to say, okay, if I want to know what part of our territory has experienced the most wind over 35 miles per hour on a daily basis, I can essentially instantly model that. If I want to see how vegetation impact has changed as a result of those differing wind speeds, we can model that. We can continuously add on these different data sets as we secure them, as we put them into good positions in order to consistently understand our grid better and better.

What are the changing aspects of flood risk on our territory? That's a geospatial question. And then you start getting into the questions about the future, and I think that's where GeoMesh is really powerful. A lot of our use for it right now is looking at the historical, but GeoMesh also allows us to do geospatial modeling for what is our grid going to look like over the next 10 years? When you look at forecasts for electrification, not just for EVs, but heat pumps and all the other different kind of forecasts for beneficial electrification, that's a challenge that has a geospatial aspect. Like I said, our territory is huge, spanning New York, Maine and Connecticut. So being able to model not just how electrification is coming into our system, but DERs and other kind of residential and commercial solar and other aspects, these are things that as a utility, we need to be prepared for five, ten years ahead of time in order to invest properly.
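
The idea of breaking a territory into little chunks and asking which chunk has seen the most days with wind over 35 miles per hour can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how GeoMesh is built: the square latitude/longitude cells, the 0.1-degree cell size, and the sample observations are all invented.

```python
import math
from collections import Counter

CELL_DEG = 0.1             # assumed cell size in degrees (hypothetical)
WIND_THRESHOLD_MPH = 35.0  # threshold quoted in the interview


def cell_for(lat, lon, cell_deg=CELL_DEG):
    """Return the (row, column) index of the grid cell containing a point."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


# Invented observations: (latitude, longitude, date, max wind in mph).
OBSERVATIONS = [
    (41.31, -72.93, "2023-03-01", 38.0),
    (41.35, -72.91, "2023-03-01", 42.0),  # same cell and day as the row above
    (41.31, -72.93, "2023-03-02", 20.0),
    (41.31, -72.93, "2023-03-03", 36.0),
    (43.66, -70.26, "2023-03-01", 50.0),
]


def windy_days_per_cell(observations, threshold=WIND_THRESHOLD_MPH):
    """Count distinct days per cell with any reading above the threshold."""
    windy = {
        (cell_for(lat, lon), day)
        for lat, lon, day, wind in observations
        if wind > threshold
    }
    return Counter(cell for cell, _day in windy)


windy_days = windy_days_per_cell(OBSERVATIONS)
windiest_cell, day_count = windy_days.most_common(1)[0]
print(windiest_cell, day_count)
```

Because each cell is just a key, another dataset such as vegetation outages or flood risk could be binned with the same `cell_for` function and compared cell by cell.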

Brad Langley: It sounds like a very impressive piece of tech. How was it developed? Maybe geek out a little bit on how this came to be.

Mark Waclawiak: I mean, one of the things that I'm incredibly proud of is we've developed almost everything in-house. That team that we've been building since October of 2020, those data scientists, those data engineers, those data analysts, they've been developing these straightforward SQL databases. And then on top of that, they've just been using Python to essentially build out all of these different platforms. GeoMesh is fundamentally at its heart, a homegrown data science platform built by our data scientists, essentially using available packages. And I think that's something that I find as incredibly exciting. Yeah, it's very impressive technology. It's incredibly powerful technology and it's something that we built.

Brad Langley: And I imagine that a piece of tech like this is constantly evolving. It's not a static solution by any stretch. So I mean, do you have a vision for what you hope GeoMesh can achieve or is there a next objective y'all are working towards?

Mark Waclawiak: I think one of the big things is modeling not only historical weather patterns, but modeling simulations of future weather and understanding how do we best need to invest in the grid in terms of resiliency and reliability, to meet a lot of those changing variables. We can't just look at the past 10 years to know what the next 10 years are going to look like. We've already seen rapid change that was not always anticipated just from a historical perspective, so GeoMesh and its geospatial modeling capabilities are going to continue to evolve based on the problems we're trying to solve. And the problems we are trying to solve here at AVANGRID are problems that the utilities in general are solving.

Brad Langley: So what's next? What other big things you guys working on? I think I heard of a computer vision project, predictive health analytics. Maybe give us a sense of what you guys are tackling next.

Mark Waclawiak: Yeah. We have a big computer vision project that is already in production in terms of poles and transformers and expanding that out to other asset classes. Not only using computer vision to identify what assets are in what pictures, because as a utility, we take an incredible amount of photos of our system as part of day-to-day operations. And historically, you would have to rely on manual intervention and manual classification of these images. I mean, we have over 2 million poles. So if you have a couple of photographs of each pole, you're talking about millions and millions and millions of images. So developing computer vision technology, not only to identify what's in the picture, but the health of the asset, I think that's critical. Because if you can take a look at 5 million photographs of poles and through computer vision, identify which poles are in [inaudible 00:21:20] need of intervention or other challenges, you put yourself at a real advantage as a utility when it comes to making decisions about your capital investment.

And we're taking a similar approach with our predictive health analytics for substation assets. We are taking data from our suite of substation breakers and integrating data sets that were not necessarily available, to make decisions about intervention for end-of-life for these circuit breakers. Because again, as a utility, we have a duty to invest our money as best as possible to improve our service for our customers. And if we can use data for something like poles or circuit breakers to know, hey, this circuit breaker, it's got a year left and this might have five, you can really prudently invest that money to always stay a step ahead of your major equipment failures. So to us, that kind of intersection of using these AI technologies and developing them again in-house so that we own the technology, as these models improve, they benefit our customers, and we're not paying more money to a vendor as they improve either or as we use them more. It's something we're really excited about.

Brad Langley: Last question for you. We call this show With Great Power, which is a nod to the power industry. It's also a Spider-Man quote, "With great power comes great responsibility." What is the superpower that you bring to the energy transition?

Mark Waclawiak: I think if I would have to say a superpower, it would be the golden touch. Because every single data set we've come across, every challenge, every kind of broken system or process, we figured out a way to really make it work. Really to turn this data into gold for our operations personnel, for our planning personnel, for our customers. And to me, that's one of the most exciting parts of the job.

Brad Langley: That's quite the superpower. Well, Mark, thank you so much for your time. Really impressive work you and your team are doing, and I appreciate you sharing it with us.

Mark Waclawiak: No, absolutely. Thank you, Brad.

Brad Langley: Mark Waclawiak is the senior manager of operational performance at AVANGRID. With Great Power is produced by GridX in partnership with Latitude Studios. Delivering on our clean energy future is complex. GridX exists to simplify the journey. GridX is the enterprise rate platform that modern utilities rely on to usher in our clean energy future. We design and implement emerging rate structures and we increase consumer investment in clean energy, all while managing the complex billing needs of a distributed grid. Our production team includes Aaron Hardick and Mary Catherine O'Connor. Anne Bailey is our senior editor. Steven Lacey is our executive editor. The original theme song is from Sean Marquand. Roy Campanella mixed the show. The GridX production team includes Jenny Barber, Samantha McCabe, and me, Brad Langley. If this show is providing value for you and we really hope it is, we'd love it if you could help us spread the word. You can rate and review us at Apple and Spotify, and you can share a link with a friend, colleague, or the energy nerd in your life. As always, thanks for listening. I'm Brad Langley.
