This is the first post in a multipart blog series on Detection Engineering, and more specifically on practicing Detection-as-Code in Detection Engineering. Throughout this series, we’ll dive deep into concepts, strategies, and practical blueprints that you can adapt to fit your own workflows, from building a detection engineering repository to validating detections, automating documentation, and delivering detections at scale to numerous managed environments. We’ll also explore how to effectively test and monitor your detections to ensure they stay reliable.
In this first part we are going through the basic terminology and concepts of a Detection-as-Code approach in Detection Engineering.
But first things first:
Detection Engineering is the practice of designing, developing, testing, and maintaining threat detection logic.
Depending on the size, needs, and maturity of the organization, a detection engineer can either be dedicated to detection engineering or hold a broader role that includes tasks outside the strict definition above, which fall into a bit of a grey area.
While detection engineers are capable of carrying out such tasks and can certainly provide input on the data needed to support their work, that doesn’t necessarily make those tasks part of detection engineering, strictly speaking.
So, before discussing how to implement a Detection-as-Code approach, we should first go through the basic steps of developing detections. Just as software engineers use the Software Development Life Cycle (SDLC), a framework for planning, building, and maintaining software, detection engineers use the Detection Development Life Cycle (DDLC) [2] [3]. The DDLC is a structured approach consisting of six phases to design, implement, test, and maintain threat detection capabilities.
There are a few variations of the framework online but the one I personally like the most is the following:
A few words about each step:
Requirement Gathering
In this phase, the goal is to understand which threat needs to be detected and to identify the detection goals. This includes understanding the risk and urgency in order to prioritize accordingly, as well as defining the success criteria for each detection.
Design
This phase includes selecting the appropriate data sources and events, identifying the fields needed, and mapping the detection to relevant taxonomies. The questions that should be asked during this phase are: What’s the best way to catch it? What logs or telemetry are needed? The design phase also includes considering edge cases, performance concerns, and potential evasion techniques.
Development
In development, the detection rule or logic is created. The goal is to write the query, rule, or detection logic in the language of choice, either platform-specific (e.g. KQL, EQL, SPL) or platform-agnostic (e.g. Sigma). The detection should be developed according to the design and be sufficiently documented.
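As a rough illustration of what a developed and documented detection can look like when it is treated as code, here is a minimal Python sketch. The `Detection` class, its field names, and the placeholder query are assumptions made for this example and do not follow any particular platform's rule format.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Detection:
    """Hypothetical container for one detection and its documentation."""
    name: str
    description: str
    query: str            # the detection logic itself
    query_language: str   # e.g. "kql", "eql", "spl" or "sigma"
    data_sources: list[str]
    tags: list[str] = field(default_factory=list)
    false_positives: list[str] = field(default_factory=list)
    references: list[str] = field(default_factory=list)


detection = Detection(
    name="Encoded PowerShell command",
    description="PowerShell started with an encoded command argument.",
    # Placeholder logic, not valid syntax for any real query language.
    query="process_name = powershell.exe AND command_line CONTAINS -enc",
    query_language="pseudo",
    data_sources=["process_creation"],
    tags=["execution"],
    false_positives=["Software deployment tools that wrap PowerShell"],
)

# Serialising the object gives a reviewable, diff-friendly artefact
# that can live in version control next to the other detections.
serialized = json.dumps(asdict(detection), indent=2)
print(serialized)
```

Keeping the logic and its metadata in one structured object is one possible design; a YAML or TOML file per detection would serve the same purpose.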
Testing and Deployment
Testing includes validating the detection by replaying attack data or simulating the triggering behaviour to verify that it fires correctly, and tuning it to minimize false positives and false negatives. Deployment includes delivering the validated detections to the production environments.
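The replay idea can be sketched in a few lines of Python. The `matches` function stands in for real detection logic, and the sample events and their field names are invented for illustration; in practice the replayed data would come from recorded or simulated attack telemetry.

```python
def matches(event: dict) -> bool:
    """Toy detection logic: PowerShell started with an encoded command."""
    name = event.get("process_name", "").lower()
    command_line = event.get("command_line", "").lower()
    return name == "powershell.exe" and " -enc" in command_line


# Events that should trigger the detection (hypothetical samples).
attack_events = [
    {"process_name": "powershell.exe",
     "command_line": "powershell.exe -enc SQBFAFgA"},
]

# Events that should not trigger it (hypothetical samples).
benign_events = [
    {"process_name": "powershell.exe",
     "command_line": "powershell.exe -File backup.ps1"},
    {"process_name": "cmd.exe", "command_line": "cmd.exe /c dir"},
]

# False negatives: attack events the detection failed to flag.
missed = [e for e in attack_events if not matches(e)]
# False positives: benign events the detection flagged anyway.
false_alarms = [e for e in benign_events if matches(e)]

print(f"missed attacks: {len(missed)}, false alarms: {len(false_alarms)}")
```

A check like this can run on every change to the detection, so a regression is caught before deployment.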
Monitoring
Monitoring is without question an important phase. It means continuously reviewing and adjusting the detection to the monitored environments in order to maintain acceptable false positive and false negative rates. It is within this stage that thresholds are tuned, exclusions are added, and detections are even decommissioned if they are deemed unworthy or unmaintainable.
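One simple way to spot detections that need tuning is to track how many of their alerts analysts closed as false positives. The sketch below assumes a list of (detection, verdict) pairs and an arbitrary 70% threshold; both the data and the threshold are illustrative assumptions, not recommended values.

```python
from collections import Counter

# Hypothetical alert dispositions: (detection name, analyst verdict).
alerts = [
    ("encoded_powershell", "true_positive"),
    ("encoded_powershell", "false_positive"),
    ("rare_admin_logon", "false_positive"),
    ("rare_admin_logon", "false_positive"),
    ("rare_admin_logon", "false_positive"),
    ("rare_admin_logon", "true_positive"),
]


def false_positive_ratio(alerts):
    """Return the share of alerts per detection that were false positives."""
    totals, false_positives = Counter(), Counter()
    for detection, verdict in alerts:
        totals[detection] += 1
        if verdict == "false_positive":
            false_positives[detection] += 1
    return {d: false_positives[d] / totals[d] for d in totals}


ratios = false_positive_ratio(alerts)

# Assumed threshold: flag detections where 70% or more alerts were noise.
THRESHOLD = 0.7
needs_tuning = sorted(d for d, r in ratios.items() if r >= THRESHOLD)
print(needs_tuning)
```

This only covers false positives; false negatives cannot be derived from alert verdicts and need the testing activities described in the other phases.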
Continuous Testing
This phase involves automated testing and regular threat simulations. Continuous testing ensures the detection remains accurate, resilient to changes, and aligned with new variations or adaptations of the detected technique.
Now that we have an abstract idea of the detection development lifecycle, we should define what a Detection-as-Code approach means for Detection Engineering and how we can use it to implement and develop automations for the phases described above.
Detection-as-Code (DaC) is the practice of managing and developing threat detections using software engineering principles.
What characteristics of software development are we able to apply in threat detection development though?
We can summarize the key aspects of the above principles in the following graph.
One may ask: why should we go through the trouble of introducing complexity into our processes, or put effort into developing automations, if we can simply maintain the detections on the SIEM platform? There are many benefits to practicing Detection-as-Code:
Collaboration
Collaboration improves when team members review each other’s work and offer their point of view through pull request (PR) reviews.
Consistency
It enforces consistency through common formats, standardized field values, and taxonomies, making the detection library easier to maintain and search.
Quality
Quality is achieved through validation of schemas, query syntax, and best practices, combined with peer reviews.
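Schema validation can be as small as a function that checks required fields and allowed values before a detection is merged. The field names and severity values below are assumptions chosen for the example; a real repository would define its own schema, possibly with a dedicated schema library.

```python
# Assumed schema: required fields and their expected Python types.
REQUIRED_FIELDS = {
    "name": str,
    "description": str,
    "query": str,
    "severity": str,
    "tags": list,
}
ALLOWED_SEVERITIES = {"low", "medium", "high", "critical"}


def validate(detection: dict) -> list[str]:
    """Return a list of human-readable schema errors (empty if valid)."""
    errors = []
    for field_name, expected_type in REQUIRED_FIELDS.items():
        if field_name not in detection:
            errors.append(f"missing field: {field_name}")
        elif not isinstance(detection[field_name], expected_type):
            errors.append(
                f"wrong type for {field_name}: expected {expected_type.__name__}"
            )
    severity = detection.get("severity")
    if isinstance(severity, str) and severity not in ALLOWED_SEVERITIES:
        errors.append(f"unknown severity: {severity}")
    return errors


good = {
    "name": "Encoded PowerShell command",
    "description": "PowerShell started with an encoded command argument.",
    "query": "placeholder query",
    "severity": "medium",
    "tags": ["execution"],
}
# Missing description, unknown severity, and tags given as a string.
bad = {"name": "Broken rule", "query": "placeholder query",
       "severity": "urgent", "tags": "windows"}

print(validate(good))
print(validate(bad))
```

Run as part of a PR check, a validator like this rejects incomplete detections before a reviewer has to look at them.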
Efficiency
Efficiency is increased by automating development and testing, reducing the manual effort required.
Scaling
Structured formats, clear taxonomies, and automation make it possible to manage and deliver a large and evolving library of detections efficiently.
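To make the scaling point concrete, here is a small sketch of one possible pattern: a single base detection combined with per-environment overrides. The environment names, the `threshold` and `enabled` fields, and the merge strategy are all assumptions for illustration.

```python
import copy

# One base detection shared by every environment (hypothetical fields).
base = {
    "name": "Multiple failed logons",
    "query": "placeholder query",
    "threshold": 5,
    "enabled": True,
    "exclusions": [],
}

# Per-environment adjustments; an empty dict means "use the base as-is".
overrides = {
    "client_a": {"threshold": 10},
    "client_b": {"enabled": False},
    "client_c": {},
}


def render_for_environment(base: dict, override: dict) -> dict:
    """Return a copy of the base detection with the overrides applied."""
    rendered = copy.deepcopy(base)  # never mutate the shared base
    rendered.update(override)
    return rendered


deployments = {
    env: render_for_environment(base, override)
    for env, override in overrides.items()
}

for env, detection in deployments.items():
    print(env, detection["threshold"], detection["enabled"])
```

Keeping the overrides separate from the base means a fix to the shared logic reaches every environment, while local tuning stays local.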
Improved Documentation
Documentation of developed detections is vastly improved by enforcing the population of metadata fields like references, tags, investigation notes, known false positives and others.
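Once metadata fields are populated consistently, documentation can be generated from them. The sketch below renders a plain-text page from a detection dictionary; the field names, section titles, and sample values are assumptions made for this example.

```python
# Assumed mapping of documentation section titles to metadata fields.
SECTIONS = [
    ("Tags", "tags"),
    ("Known false positives", "false_positives"),
    ("Investigation notes", "investigation_notes"),
    ("References", "references"),
]


def render_doc(detection: dict) -> str:
    """Render a plain-text documentation page for one detection."""
    name = detection["name"]
    lines = [name, "=" * len(name), "", detection["description"], ""]
    for title, key in SECTIONS:
        # Make gaps visible instead of silently skipping empty fields.
        values = detection.get(key) or ["(none documented)"]
        lines.append(f"{title}:")
        lines.extend(f"  - {value}" for value in values)
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


detection = {
    "name": "Encoded PowerShell command",
    "description": "PowerShell started with an encoded command argument.",
    "tags": ["execution"],
    "false_positives": ["Software deployment tools that wrap PowerShell"],
    "investigation_notes": ["Decode the argument and review the script."],
}

doc = render_doc(detection)
print(doc)
```

Printing "(none documented)" for empty fields is a deliberate choice here: it turns missing metadata into something a reviewer will notice.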
The first answer that comes to mind is clearly Managed Security Service Providers (MSSPs). For MSSPs, a Detection-as-Code approach allows for scalable, consistent, and efficient management of detections across multiple clients. By adopting software engineering principles and treating detection logic in a version-controlled and automated manner, MSSPs can standardize detection rule quality and deploy content efficiently at scale, thereby increasing the quality of their services while reducing manual effort and operational costs. But is it only for MSSPs?
No, an in-house SOC that implements Detection-as-Code gains significant advantages in consistency and security maturity. Detection-as-Code promotes maintainable and well-documented detections, making it easier to track changes. Continuous testing and monitoring allow for early identification of gaps or noisy rules, so the detection engineering team can quickly fine-tune detections without overwhelming analysts with false positives. This leads to faster issue resolution, better management of resources, and a more effective and focused SOC.
In this first part we provided definitions, described the basic concepts and processes of detection engineering and Detection-as-Code, and set clear expectations for practicing Detection-as-Code.
The next blog of this series will be about designing and structuring a repository that will host detection content.
[1] https://cyb3rops.medium.com/about-detection-engineering-44d39e0755f0
[2] https://medium.com/snowflake/detection-development-lifecycle-af166fffb3bc
[3] https://www.picussecurity.com/hubfs/Presentations/Enabling the Detection Development Lifecycle with Attack Simulation.pdf
[4] https://github.com/resources/articles/devops/ci-cd
Stamatis Chatzimangou
Stamatis is a member of the Threat Detection Engineering team at NVISO’s CSIRT & SOC and is mainly involved in Use Case research and development.