This past October 29 marked the two-year anniversary of Superstorm Sandy making landfall in the Northeastern US. Despite ample warning, critical infrastructure wasn't prepared for the storm's impact, and essential services failed. Reflecting on this anniversary, I began thinking about the lessons we learned, or should have learned, from past experience. I'll relate a story from a close friend and respected banking technologist:
“Do you remember where you were on August 14th, 2003? I ask because that day dramatically changed the way I think about disaster recovery and business continuity. Around noon I was making plans to join my friends to watch the first preseason NFL game of the year and enjoy some wings. By all measures it was shaping up to be a normal late-summer Thursday evening. That's when my pager went off: 'Priority 1, Severity 1. Begin Recovery for All XXXXX Applications.'…
This was not a drill. As the lead architect for a handful of commercial banking applications at one of the largest banks in the United States, I had designed our disaster recovery capabilities, contributed to both the documentation and testing of the overall business continuity plan, and was the designated lead technical resource in the event of an outage. I had tested these systems and processes annually and had always been confident in our ability to recover effectively. Of course, that confidence was based on the limited scope of our threat analysis and downstream impact analysis. In a post-9/11 world we had planned for a major metropolitan outage, for a lack of personnel availability, and for the availability of technical resources. Our systems were highly redundant, our production systems were geographically separated from our DR systems, and our technical staff were based in multiple cities around the US.
We had not planned for the Northeast Blackout of 2003. In a matter of minutes a cascading blackout rolled across most of the Northeastern US and Southeastern Canada, leaving both our production datacenters and our DR datacenters, hundreds of miles away, without power. Nobody really knew what was happening, how widespread the blackout was, or what had caused it. We didn't know if it would last an hour, a day, a week, or a month. We didn't know if it was an accident, an attack, or something else. We did know that we hadn't planned for THIS. The production datacenter had a secondary grid-based redundant power source as its primary backup, with diesel generators as the last line of defense. Our disaster recovery (DR) site had only diesel generators as its backup. All of a sudden, our ability to stay online and meet both government regulations and the needs of our global customers depended on the answer to a single question: how do we keep the diesel tanks full? After all, everyone in a six-state region was going to need the same limited resource, putting extraordinary strain on an unprepared diesel supply chain.
We dodged a shotgun blast. As luck would have it, power was restored 10 hours later, and our business continuity plan and disaster recovery capabilities had performed well. Our business partners were satisfied, our customers felt no impact, and our bankers were impressed. My team, however, was not feeling quite so hot about the whole thing. Our design, plan, and testing had proven compliant and effective only because they hadn't been pushed to their breaking point, which was likely only 10-12 hours away. We came THIS close to failing all of our stakeholders, and we knew it.
What I learned that day changed the way I think about DR and BC in a regulated environment. Being compliant with the regulator was not enough to protect my business, my customers, or my shareholders from the unthinkable. I would have to rethink the threat landscape, look at multiple scenarios for business impact analysis, and extend my planning beyond our internal systems to external resources over which we had little control. To do all of this, along with the rest of my job, I was going to need help. I was going to need my vendor community to step up to the plate and show me how to plan for the unthinkable. I was at a top bank with the financial resources and technical talent to attack this head on, yet I still needed my vendor community to be my trusted advisor. I can only imagine how much help the 13,000 smaller US banks and credit unions need to protect their business.”
Thinking about this story, I can't help but consider the lessons the banking industry should have learned. Despite the very similar impacts, scope, and geography, many banks in the Northeast were unprepared for Hurricane Sandy, and I have heard the same war stories over and over again at banking conferences across the US. The opportunity for solutions providers should be clear: the banking industry needs your help. Avnet is here to ensure that our valued partners have all the resources they need to become that trusted advisor to the banking industry before, during, and after the unthinkable happens.