In Cybersecurity today, many metrics and Key Performance Indicators (KPIs) can be used not only to gauge what the threat landscape looks like, but also to evaluate how effective and efficient an IT Security team is in combatting threat variants. Two of these metrics can be deemed amongst the most important:
- The Mean Time to Detect (referred to in the rest of this article as the “MTTD”).
- The Mean Time to Respond (referred to in the rest of this article as the “MTTR”).
On average, it can take seven months or even longer for an IT Security team just to detect that a threat variant is lurking in its IT and Network Infrastructure. There are numerous reasons for this, some of which include the following:
- A lack of training of the IT Security team.
- The IT Security teams in businesses, whether large or small, are often overburdened with what are known as “False Positives”. These are warnings and alerts that are triggered even though no real threat is present. As a result, there is a delay in detecting the threat variants that are real and currently active.
- There could be a lack of technology at the place of business that would speed up the process of determining which warnings and alerts are real. Although Generative AI can be used in this regard, the models would have to be trained and continually optimized, which could constrain the resources of the IT Security team even more.
- Overall, there could be a severe lack of proactiveness on the part of the IT Security team. Many businesses today still have a “reactive mindset”. This simply means that an IT Security team becomes more alert only after it has experienced an actual security breach.
- The Cyberattacker of today is also becoming very covert and stealthy, which can lengthen the time it takes for the IT Security team to detect their presence. One of the primary goals of the Cyberattacker of today is to move in slowly and traverse the IT and Network Infrastructure in what is known as a “lateral fashion.”
Even after a threat variant is detected, the hope is that the IT Security team will be able to respond to it quickly and contain the security breach if one is actually happening. Once again, this can be a slow and delayed process, especially if there is no Incident Response (IR) Plan in place. Today, however, businesses are coming to understand the importance of having a solid IR Plan, spurred by the lessons learned from the COVID-19 pandemic.
Not only must a business have an IR Plan in place to mitigate a security breach, but the plan must also be rehearsed on at least a quarterly basis to keep it updated, with the lessons learned from each iteration incorporated into it.
Now that some of the major reasons have been provided as to why an IT Security team can be slow to detect a threat variant and to respond to and contain it (which of course results in a very high MTTD and MTTR, respectively), the key question is how to bring the long time values associated with these two metrics down to a much more reasonable level. Thus, the primary goal of this article is to propose a solution in which the time frame of seven months can be brought down to just a matter of minutes. What we propose here is a combination of Cybersecurity technologies that can be envisioned to work with one another in a holistic environment. The starting point of this proposed solution will be what is known as the “Digital Person.”
However, it is especially important to keep in mind that this proposed solution is still theoretical. It has not yet been demonstrated in a test environment, which is one of the long-term goals. It is highly anticipated that many tweaks and adjustments will be made to this solution once that stage is reached, but in the interim, there is a strong belief that it could very well work to bring down the current values of the MTTD and MTTR metrics.
An Overview of the MTTD and MTTR
The MTTD
The MTTD can be technically defined as follows:
“Mean time-to-detect (MTTD) measures the amount of time it takes for an organization to discover a security incident. The lower this metric is, the faster and more dependable its detection systems are. Early detection makes a significant difference in the overall cost of incident response.”1
To illustrate this point further, suppose that Company XYZ has been impacted by a security breach, or alternatively, that a Cyberattacker has entered through a backdoor in the IT and Network Infrastructure but has not yet deployed a malicious payload or made any further advances. For the reasons cited earlier in this article, the IT Security team simply does not detect that either of these two incidents, or anything similar, has happened. As a result, it can be quite a long time until either one (or even both) of these incidents is formally noticed or discovered.
This lag time is what the MTTD metric measures. The longer the lag period is, the worse the impact can be, not only for the business in question, but also for key stakeholders, employees, and even customers. That is why the sooner any of these events is detected, the better the chances are of a successful mitigation.
The MTTD is typically calculated as follows:
MTTD = Σ (The Time It Takes to Detect Each Cyber Incident Over a Certain Time Period) / The Total Number of Cyber Incidents That Have Occurred in That Period
As one can see, this can be subjective initially, as choosing a realistic time period to report on is entirely up to the CISO and his/her IT Security team. But for illustration purposes, let us examine the following scenario:
7 Months to Detect a Cyber Incident / 1 Cyber Incident Has Occurred
The period chosen here is seven months. So, by dividing the two variables, we can see that the MTTD is seven months, which is not acceptable at all.
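The same arithmetic can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function name `mean_time_to_detect` and the incident log are hypothetical, and seven months is approximated here as 210 days.

```python
from datetime import datetime, timedelta


def mean_time_to_detect(incidents):
    """Average the gap between when each incident began and when it was detected.

    `incidents` is a list of (started, detected) datetime pairs.
    """
    if not incidents:
        raise ValueError("at least one incident is required")
    # Sum the individual detection times, then divide by the incident count.
    total = sum((detected - started for started, detected in incidents), timedelta())
    return total / len(incidents)


# Hypothetical log with a single incident that took about seven months
# (assumed here to be 210 days) to detect.
started = datetime(2024, 1, 1)
incidents = [(started, started + timedelta(days=210))]

mttd = mean_time_to_detect(incidents)
print(mttd.days)  # 210
```

With more than one incident in the log, the same function returns the average detection time across all of them, which is how the metric would normally be reported over a given period.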
The MTTR
The MTTR can be technically defined as follows:
“Mean time to repair (MTTR), sometimes referred to as mean time to recovery, is a metric that is used to measure the average time it takes to repair a system or piece of equipment after it has failed.”2
For illustration purposes again, suppose that the IT Security team at Company XYZ has finally determined that a security breach has occurred, or even that a Cyberattacker is lurking secretly in the IT and Network Infrastructure. Now the plan of action must be to contain the security breach or completely eradicate the Cyberattacker. The time it takes to do all of this is reflected in the MTTR metric. Once again, the longer the lag time, the worse it will be for the company and all the parties involved (as just described).
However, it is important to make a distinction at this time. The term “response” can be subjective. For example, to one person it can mean acknowledging that a threat variant has been found and that plans are currently being developed to contain it. To another individual, it can mean not only that but also taking proactive steps immediately to correct the situation. For purposes of this article, the MTTR will refer to the latter scenario.
The MTTR can be computed as follows:
MTTR = Σ (The Time Spent on Each Containment, Mitigation, or Repair) / The Total Number of Containments, Mitigations, and Repairs That Are Needed
Just like the MTTD, the MTTR can be subjective. For example, the time spent and the total number of remediations needed will depend on the type of threat being dealt with. For some threats, the solutions are already known, and it is just a matter of implementing them based on past experience. But if the threat variant is new and complex, the time and effort needed can be difficult to ascertain from the outset.
But for purposes of this article, we will illustrate the MTTR with a simple example, which is as follows:
6 Total Hours for Repairs on One Threat Variant / 1 Repair That Is Needed
Thus, the MTTR will be six total hours. Just like the MTTD, the lower the time threshold is, the better.
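As a rough sketch of the same calculation, the Python snippet below averages the time spent on each remediation. The function name `mean_time_to_respond` and the second, multi-incident list of durations are hypothetical and are included only to show how the average behaves with more than one repair.

```python
from datetime import timedelta


def mean_time_to_respond(durations):
    """Average the time spent on each containment, mitigation, or repair.

    `durations` is a list of timedelta values, one per remediation.
    """
    if not durations:
        raise ValueError("at least one remediation is required")
    return sum(durations, timedelta()) / len(durations)


# The example from the article: one repair that took six hours.
single = mean_time_to_respond([timedelta(hours=6)])
print(single.total_seconds() / 3600)  # 6.0

# Hypothetical case with three remediations of 6, 2, and 4 hours.
several = mean_time_to_respond(
    [timedelta(hours=6), timedelta(hours=2), timedelta(hours=4)]
)
print(several.total_seconds() / 3600)  # 4.0
```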
Up Next: Why the MTTD and MTTR are of utmost importance
The next article in this series will explain why the MTTD and MTTR are of great importance to businesses of all sizes.
Sources/References:
1. What is Mean Time to Detect (MTTD)? | Lumifi Cybersecurity
2. What is Mean Time to Repair (MTTR)? | IBM
Ravi Das is an Intermediate Technical Writer for a large IT Services Provider based in South Dakota. He also has his own freelance business through Technical Writing Consulting, Inc.
He holds the Certified In Cybersecurity certificate from the ISC(2).