Sunday, August 30, 2015

My Katrina Story - How I Helped Get the Power Back On in New Orleans

With the ten-year anniversary of Hurricane Katrina hitting New Orleans upon us, I have a story to tell.

In 2005, I was working at ESCA, a company that wrote software to manage electrical grids. In a power company's control center, displayed on huge screens, the electrical grid and its state were represented graphically. Something like this:

(Actually, ESCA had been an independent company, first purchased by the French company ALSTOM, then sold to another French company AREVA, and then sold back to ALSTOM. It's just easier to call it ESCA.)

I could go on in minute detail about the issues with ESCA and the software. But a short list is:

  • A lot of twenty-plus-year-old legacy software, much of it written in Fortran.
  • Software written to run on VAX VMS (although by the time I left ESCA, only Windows and Linux were supported).
  • A horrible, HORRIBLE user interface designed by people who had no business doing UI design ("to get to that action, you have to select this new mode, with completely different main-level menus, and dig down through three layers of submenus").

TagNotes (written in C++), the application I worked on, was for "tagging" electrical equipment on the network. Back before everything was automated, the tag was a physical tag attached to a piece of equipment to note something, e.g. "Do not turn on this switch because this line is being worked on." With TagNotes, a tag icon would be put on the grid display next to the device being tagged. Some tags were just informational; others prevented any change of state of the device (such as closing a switch) for safety or grid management.

All the data for the grid control was kept in a proprietary database, something designed 20 years earlier, written in Fortran, definitely not relational, and definitely not up to modern standards. The tables in the database could vary in size, but the total space the database took was fixed and could not be changed except by a complete rebuild of the system, which took many hours.

To see a list of the tags, there were separate views for your ordering preference: by date of creation, tag type, creator, and a few more that I no longer remember. The tags were in their own database with each view requiring a separate table with the tags listed in the desired viewing order (that is, not like a relational database where a single table can be queried with an ORDER BY clause to get the desired order).

This is where Katrina comes in. A few days after Katrina hits, New Orleans' electrical utility, Entergy, needs to get the electrical grid back up. And to keep track of work to be done, devices need tags, A LOT OF TAGS. So many in fact, that the tag database gets full and they can't add any more tags, seriously hampering the recovery effort. They need a fix, and they need it now.

I'm on an emergency phone call with Entergy, trying to understand the issue and figure out a solution. The tag database is full, and there's no time to spend the hours to rebuild the system to expand the database. And then an idea for a fix comes to me. I ask, "Do you need all the views of the tags right now? Could you get by with, say, just two, like tags in order of creation, and tag type?"

The seat-of-the-pants fix is to only have two tables of tag lists, which will have space to expand in the limited size of the database with the unneeded views and their tables gone. The code that creates the tables is a small part of the system and can be rebuilt in under 10 minutes. I look through the source code (the Fortran source code) and tell the Entergy engineers what needs to change. Entergy has the source code and the ability to rebuild what's needed. And they now have space in the database for all the tags they need.
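For readers who think in code, here is a toy Python sketch of why dropping views freed up room. It is not the real database (that was Fortran, and I no longer have the details); the class name `TagStore`, the view names, and the slot counts are all invented for illustration. The only assumptions taken from the story are that the total space is fixed and that every view keeps its own table of tags in its own order.

```python
from operator import itemgetter


class TagStore:
    """Hypothetical model: a fixed pool of slots shared by one table per view."""

    def __init__(self, total_slots, views):
        self.total_slots = total_slots          # fixed; changing it means a full rebuild
        self.tables = {view: [] for view in views}

    def capacity(self):
        # Every tag occupies one slot in each view's table.
        return self.total_slots // len(self.tables)

    def tag_count(self):
        # All tables hold the same tags, so any one of them gives the count.
        return len(next(iter(self.tables.values())))

    def add_tag(self, tag):
        if self.tag_count() >= self.capacity():
            raise OverflowError("tag database is full")
        for view, table in self.tables.items():
            table.append(tag)
            table.sort(key=itemgetter(view))    # keep each table in its view's order

    def drop_views(self, keep):
        # The emergency fix: keep only the tables for the views still needed.
        self.tables = {v: t for v, t in self.tables.items() if v in keep}
```

With four views and 12 slots, this toy store holds only 3 tags. After `drop_views({"created", "tag_type"})` the same 12 slots hold 6, with no change to the total size.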

And that's my Katrina story.
