Yesterday, I shared some Golden Nuggets on the benefits of exercising your Crisis Teams and why we exercise. Today, I am going a little deeper on another major hidden and often overlooked benefit that exercises create.
Whether it is for Crisis Teams, Incident Management teams (or whatever you like to call your team), Business Continuity Teams, or especially Information Technology Disaster Recovery (ITDR) teams, frequent, repeated exercises build confidence.
That means confidence among the team(s) themselves, confidence among the managers and executives of the business, and confidence from your customers and business partners. The most important place to build this confidence is among the teams that are doing the recovery work.
As you might expect, failing to conduct exercises with your teams has the opposite effect. It can cause your team to break down and destroy their confidence, which in turn hurts recovery times and the overall recovery.
Let me provide some deeper insight by using an example from some previous work I did.
Several years ago, I was consulting for a major airline, helping some of the IT teams develop Disaster Recovery Plans, move beyond tabletop walkthroughs to “functional” exercises, and document those exercises so they would get credit during an audit.
It is important for me to state here that this was a project based on an internal audit outcome. I was working with the bottom performers on remediation based on that audit. These were groups that either:
- Had zero plans in place
- Never conducted an ITDR exercise beyond a tabletop walkthrough
- Conducted a functional exercise but didn’t document it properly and received no credit for doing the exercise
I want to talk about a particular group within that project that I worked with and why they never conducted anything more than a tabletop walkthrough, and why they lacked confidence and were afraid to even think of doing anything functional.
During my first meeting with this group, I specifically asked the simple question:
Why haven’t you done a functional failover exercise in the past?
The reply may come as a surprise to many of you but didn’t surprise me at all. The response they provided to me was that they weren’t allowed to do anything beyond a tabletop walkthrough.
My follow-up question to them was, who said that they were not allowed to conduct a functional exercise?
The response: “The Business” (specifically operations).
After some discussion, I learned that the “business side” in the operations leadership felt that the systems and application were too critical to do a functional failover exercise while the application was running in production.
However, the systems and application had never been formally deemed, or signed off as, too critical for such an exercise. Yet every time the team submitted a request to conduct a functional failover exercise, operations would reject it and say it was too critical.
Normally, when the business deems a set of systems and/or applications too critical for these failover exercises, it elevates them as such, signs off on that decision, and accepts the risk of not having the exercises done.
Not really the best decision as there are ways to do these exercises even while in production. But that is not the purpose of this story.
You see, this team had not only lost confidence but also felt that business leadership distrusted their capabilities. So much so that, after I had worked with them on both a runbook and a tabletop walkthrough, when I proposed that they request permission to conduct a functional failover exercise, I was told, “there’s no point, they’ll never sign off on it.”
I told them, let me worry about that, you just pick a date and submit the request.
Behind the scenes, I was working with my engagement manager to either get the business to approve the request, or bump the criticality up to properly accept the risk, and sign off on it.
We got the approval.
Over the next 30 days, I worked with that team on their runbook to ensure that every step was in there and that they knew how to properly document and track the failover exercise, including backout procedures.
When the day of the exercise came, they performed wonderfully and did everything right.
They hit a glitch late in the exercise and couldn’t complete a 100% successful failover, but they did achieve the following:
- They learned a lot. They ran into several issues during the exercise and were able to overcome them and move forward
- They properly documented what they were doing: capturing logs, taking screenshots of before and after states, and taking notes as they moved through the process for later use
- They completed an after-action and discussed lessons learned, things that went wrong, and things that went well
All of this, even though the outcome wasn’t a successful failover during the exercise. They learned immensely during the exercise. They learned they could depend on one another to complete their assigned tasks. And the business learned they could trust the team to do the failover exercise, without disrupting the production environment.
The most important part: they were happy as a team and gained massive confidence in their own capabilities. This allowed them to continue to conduct exercises, gain further confidence, and learn new skills.
In the end, a successful exercise isn’t always about a successful failover or other such success. In fact, you can learn a great deal when you fail. And when you learn and build the lessons into your plans, that is when the real success comes.
That, along with the confidence you gain, will boost you and your team during the next exercise or incident.
So. Get out there. Exercise and build confidence in yourself and your team.