In 2023, 1271 incidents were reported to European authorities under eIDAS, NISD, and EECC, a 20% increase compared to the previous year. With more regulations entering into force in the next few years (such as NIS2 and CRA), more organisations will fall in scope for mandatory incident reporting, which will most likely push this number even higher. Maybe you will be among them and are wondering how to prepare yourself.
In the previous entry in this series, we introduced the three pillars for building resilience: Respond, Sustain, and Recover. In this blogpost, we will explore the first pillar, Respond, which is crucial for mitigating the impact of an incident, particularly in the case of a ransomware attack.
Why exactly should we prepare for a potential incident? Why not wait until after our first incident, once we have more experience to learn from and can therefore spend less time on theory and research? While there is merit in learning and adapting your response based on previous incidents, proactive preparation can significantly mitigate the impact of such an incident in the first place. Some of the benefits include:
How and where can you start your incident preparedness journey? In this blogpost, we will cover some of the most important aspects you can work on before facing an incident.
As described earlier in this series of blogposts, multiple teams operate when dealing with a crisis or incident, each with very different but important objectives. Depending on the incident and its impact on the Business, one, multiple, or all teams may be involved. Each company has a different structure, different governance, and different standards and regulations to comply with. This inevitably has an impact on how incidents are handled. Therefore, you need to define the scope of your (Cybersecurity) Incident Response Process:
All of these points will drastically change the scope of your incident management process, and defining it correctly is the first step to ensure you do not interfere with another team’s process.
With this scope in mind, you can start determining and drafting the documents that you need. Among these potential documents, we recommend two in particular: the Incident Response Plan and Incident Response Playbooks.
Now that you have defined the scope of your Incident Response Process, we need to figure out how to trigger it. This also raises the question of defining “What is an Incident?”. This seems easy, but has a lot of components to it:
You may also want to have different types of Incident severity or priority based on specific criteria, such as the type of Incident (this is where an Incident Taxonomy can be useful, such as the ENISA RSIT), the scale of the attack, the criticality of the systems impacted, or even the type of user impacted. By defining these criteria, you can start forming a table or matrix defining types of incidents:
| Incident Severity | Potential criteria | Examples |
| --- | --- | --- |
| Low | – Only one user or device impacted – Very low impact to Business – Belongs to an incident type typically considered low impact for your company (e.g. phishing) – Incident can be handled by 1 or 2 incident handlers | – A malicious executable was detected and blocked, but some analysis should be done – A user provided company credentials through a phishing email, and their account needs to be reset |
| Medium | – Impacts multiple devices, potentially servers – Low to medium impact to Business – Belongs to an incident type typically considered medium impact for your company (e.g. malware, DDoS) – Might require notification to relevant authorities depending on applicable regulations (e.g. NIS2, DORA) – Resolution requires collaboration between a few teams, prompting the involvement of the Incident Response Team | – Compromise of a single server leads to downtime of low-criticality applications and requires restoration from backups – A Denial of Service attack affects the availability of public-facing services for a few hours |
| High | – Impacts multiple departments, crown jewel assets (e.g., Active Directory, backup servers), or even the company as a whole – Significant impact to Business – Belongs to an incident type typically considered high impact for your company (e.g. ransomware) – Must notify relevant authorities depending on applicable regulations (e.g. NIS2, DORA) – Resolution requires strong collaboration between multiple teams and prompts escalation to the Crisis Management Team | – Large-scale ransomware outbreak – Large data breach impacting sensitive data about customers and/or intellectual property – Running out of coffee beans in the morning? |
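To make the matrix above easier to apply consistently, the criteria can be encoded as a small classification helper. The following is only a minimal sketch under stated assumptions: the field names, the `TYPE_BASELINE` mapping, the thresholds, and the choice to treat unknown incident types as Low are all hypothetical and should be replaced by your own criteria.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class Incident:
    # Hypothetical fields, loosely based on the criteria in the matrix.
    incident_type: str
    devices_impacted: int
    servers_impacted: bool = False
    crown_jewels_impacted: bool = False


# Assumed baseline severity per incident type (examples taken from the matrix).
TYPE_BASELINE = {
    "phishing": Severity.LOW,
    "malware": Severity.MEDIUM,
    "ddos": Severity.MEDIUM,
    "ransomware": Severity.HIGH,
}


def classify(incident: Incident) -> Severity:
    """Return the highest severity triggered by any criterion."""
    # Assumption: unknown incident types start at LOW.
    severity = TYPE_BASELINE.get(incident.incident_type.lower(), Severity.LOW)
    # Multiple devices or any server raises the incident to at least MEDIUM.
    if incident.devices_impacted > 1 or incident.servers_impacted:
        severity = max(severity, Severity.MEDIUM)
    # Crown jewel assets (e.g. Active Directory, backup servers) mean HIGH.
    if incident.crown_jewels_impacted:
        severity = Severity.HIGH
    return severity
```

For example, `classify(Incident("phishing", 1))` yields `Severity.LOW`, while the same phishing incident with `crown_jewels_impacted=True` is raised to `Severity.HIGH`. Taking the highest matching criterion is a deliberately cautious design choice; a real matrix may weigh criteria differently.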
As part of your Incident Response efforts for large incidents, multiple people will need to play a role, whether it’s coordinating teams and communications, analysing the systems, implementing measures, or taking decisions. Among the roles you may want to define, we have:
For each of these roles, ensure you define at least two stakeholders who can assume the responsibilities linked to the role. This ensures that, if the primary role owner is unavailable, someone can fill in for them and fulfil their duties.
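The "at least two stakeholders per role" rule is easy to check automatically if your roster lives in a structured file. The sketch below assumes a simple mapping of role name to a list of people; the role names and people used in the usage note are hypothetical placeholders, not roles prescribed by this blogpost.

```python
def roles_missing_backup(roster: dict[str, list[str]], minimum: int = 2) -> list[str]:
    """Return the roles that have fewer than `minimum` distinct owners."""
    return sorted(
        role
        for role, owners in roster.items()
        # A set is used so the same person listed twice only counts once.
        if len(set(owners)) < minimum
    )
```

Called on a roster such as `{"Incident Manager": ["Alice", "Bob"], "Communications Lead": ["Carol"]}`, the function returns `["Communications Lead"]`, flagging the role that still needs a backup owner.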
Dealing with an incident often requires quick actions to contain it as fast as possible and avoid further impact to the Business. However, some of these actions may have unintended consequences for the Business and add unnecessary impact to the situation (e.g. cutting the internet uplink to stop a data exfiltration, which also halts the Business and affects income-generating processes). In such cases, it may be better to verify with the responsible teams or owners that the actions can be taken safely. At the same time, we do not want to bombard top management with approval requests every time we need to erase a phishing email from mailboxes or reset the password of a compromised account…
This is where having a clearly defined mandate for your Incident Response Team is important: which actions can we take without needing any prior approval, and under which circumstances? This can be a very simple “do whatever it takes to protect the business”, or a very specific and highly defined list of actions and criteria. Among the important actions to consider, in particular for ransomware: can we isolate a workstation or server, can we isolate a network segment or the entire network, can we disable a user account, can we pre-emptively isolate backup systems, …
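A specific mandate of this kind can be written down as a lookup table that handlers (or tooling) consult before acting. The sketch below is purely illustrative: the action names, the severity thresholds in `PRE_APPROVED`, and the rule that unlisted actions always need approval are assumptions, not recommendations for your organisation.

```python
SEVERITY_ORDER = ["low", "medium", "high"]

# Hypothetical mandate: the minimum incident severity at which the Incident
# Response Team may take each action without prior approval.
PRE_APPROVED = {
    "disable_user_account": "low",
    "isolate_workstation": "low",
    "isolate_server": "medium",
    "isolate_backup_systems": "medium",
    "isolate_network_segment": "high",
}


def needs_approval(action: str, severity: str) -> bool:
    """Return True if `action` requires prior approval at this severity."""
    minimum = PRE_APPROVED.get(action)
    if minimum is None:
        # Actions outside the mandate always require approval.
        return True
    # Pre-approved only once the incident is at least as severe as `minimum`.
    return SEVERITY_ORDER.index(severity) < SEVERITY_ORDER.index(minimum)
```

With this table, `needs_approval("isolate_workstation", "low")` is `False`, whereas `needs_approval("isolate_network_segment", "medium")` is `True`. Defaulting to "approval required" for anything not listed keeps the mandate explicit rather than implied.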
If you followed all the previous steps in the blogpost and documented your process thoroughly, your Incident Response Plan is probably becoming a sizeable document, with multiple sections, subsections, graphs, matrices, and appendices. You also created a beautiful and complete Ransomware Playbook with every single measure you can take to respond to such an incident. In fact, it’s so well prepared that you now have 100+ pages on this Playbook, and appendices are dangerously reaching double-letters territory. You couldn’t be better prepared even if you tried to… But would you be able to navigate such documents during an actual incident or crisis? Where’s that prioritisation matrix again? Wait, what was the step after the initial containment action again?
These long documents are perfect for defining the complete process but can become cumbersome to use when dealing with an actual incident. This is where you can introduce short summaries, checklists, flowcharts, and other types of infographics that condense the information in an easily digestible way. The goal of these one- or two-pagers is to guide your response and briefly remind you of steps and actions to discuss. Your experience does the rest (and if you hit a hole in your memory, the full document is still there to help you out).
Just as Monopoly prepares you to lose your entire fortune to your relatives and friends, Dungeons & Dragons prepares you to face a horde of goblins and dragons, and Escape Rooms prepare you to free yourself from the basement of a maniacal scientist, tabletop and simulation exercises prepare you to face potential incidents and crises.
Such an exercise typically lasts a few hours and puts you in a fictitious situation, in this case an incident scenario. The goal is to test your reaction when faced with these situations and determine if there are any lessons to be learned from this test. For ransomware in particular, such tabletops allow you to test the entire process, from the initial detection to the recovery, while also covering the analysis of the incident, protection of backups and containment of infected systems, ensuring sustainability of the operations with the Business Continuity Team, recovering from the incident via backups in collaboration with the Disaster Recovery Team, and reporting to authorities and other stakeholders through the Crisis Management Team.
Exercises come in all shapes and sizes, from a casual read-through of the process (also sometimes called tabletops), to simulations or even full interruptions (e.g., with the use of a “Chaos Monkey”) that will disrupt some part of your business whenever it wishes to. A Simulation Exercise is a good middle ground: it allows you to discuss your processes and places you in a situation that challenges your response, but it doesn’t impact your Business and production environment directly.
In the end, the best approach might be a combination of all types of exercises, where you can test multiple scenarios over time in different types of exercises. This is where your Exercise Test Plan comes into play. With this document, your goal is to define the types of exercises you want to organise, at which frequency, and for which types of impacts or scenarios. This allows you to plan multiple years in advance and see your progress over time.
Whichever method you choose, make sure you use it as an opportunity to determine areas of improvement for your processes. Review your documents regularly, test them in a fictitious setting via a mix of tests and exercises, and learn from your potential mistakes. This ongoing cycle of review and refinement will help you build stronger resilience over time.
In this blogpost, we discussed the importance of defining our Incident Response Process through Plans and Playbooks:
There are of course other actions you can take to further improve your readiness to face incidents, such as creating procedures for specific tasks (e.g. how to recover the Active Directory from scratch), automating complex tasks (e.g., creating a script that will lock every account that you provide in a CSV file), and training your teams on new techniques.
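As an illustration of the automation idea above, here is a minimal sketch of a script that locks every account listed in a CSV file. It makes several assumptions: the CSV has a header row with a `username` column, and the actual locking is done by a `lock` callable that you supply (hypothetical here, since the real call depends on your directory service). The `dry_run` default is a safety choice so that nothing is locked by accident.

```python
import csv
from typing import Callable, Iterable


def lock_accounts(
    csv_lines: Iterable[str],
    lock: Callable[[str], None],
    dry_run: bool = True,
) -> list[str]:
    """Lock each unique account named in the 'username' column.

    `csv_lines` can be an open text file or any iterable of CSV lines.
    Returns the usernames that were (or, in dry-run mode, would be) locked.
    """
    processed: list[str] = []
    seen: set[str] = set()
    for row in csv.DictReader(csv_lines):
        username = (row.get("username") or "").strip()
        # Skip empty values and duplicates.
        if not username or username in seen:
            continue
        seen.add(username)
        if not dry_run:
            # `lock` is the caller-supplied, environment-specific function.
            lock(username)
        processed.append(username)
    return processed
```

A typical use would be to open the CSV file with `open(path, newline="")`, run the function once in dry-run mode to review the list of accounts, and then run it again with `dry_run=False` once the list has been validated.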
Now that we explored the “Respond” pillar, stay tuned for the next entry in this series of blogposts, where we will explore the “Sustain” pillar.
Damien is a Cyber Strategy & Architecture consultant within NVISO, with a main focus on Incident Readiness. Through creating, reviewing, and testing multiple incident response plans, playbooks, and exercises for companies of different industries and sizes, he gained insights on how to run efficient and effective incident management operations.