(cross posted from Dark Reading, and inspired by a previous version of this blog)
For years, organizations deploying Security Information and Event Management (SIEM) or similar tools have struggled to decide what data to collect in their security operations platforms. So the dreaded question, “what data sources should I integrate into my SIEM first?”, lives on.
How to approach answering this?
First, the best answer to this question is “output-driven SIEM”: what your SIEM collects depends on your security monitoring needs and use cases, and on how you prioritize them based on your risks. Any popular list of top log sources aggregated from many organizations will end up being useless for organizations with different security needs and challenges!
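To make the output-driven idea concrete, here is a minimal sketch that starts from use cases and derives a log source ranking from them. The use case names, the 1 to 5 risk scores, and the scoring rule (sum of risk across use cases) are all illustrative assumptions, not a standard method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    name: str
    risk: int       # hypothetical 1-5 priority taken from your own risk assessment
    sources: tuple  # log sources this use case needs


def rank_sources(use_cases):
    """Score each log source by the summed risk of the use cases that need it."""
    scores = {}
    for uc in use_cases:
        for src in uc.sources:
            scores[src] = scores.get(src, 0) + uc.risk
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Illustrative use cases only; yours will differ, which is the whole point.
USE_CASES = [
    UseCase("compromised account detection", 5, ("identity provider", "vpn")),
    UseCase("malware beaconing", 4, ("edr", "dns")),
    UseCase("data exfiltration", 3, ("proxy", "dns")),
]

for source, score in rank_sources(USE_CASES):
    print(f"{source}: {score}")
```

Change the use cases or their risk scores and the ranking changes with them, which is why a generic "top log sources" list rarely transfers between organizations.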
While the output-driven SIEM approach has been known for 10+ years, many organizations are still looking for best practices in collection before they decide how they plan to use the data. In fact, large organizations often decide to integrate a log source into their SIEM or SecOps platform based on factors other than pure security necessity.
Overall, such factors often include:
Finally, if a SIEM product charges per volume of data collected, the cost of introducing a new data source into the platform may be one of the deciding factors. For example, will you include a data source that will consume 10% of your overall SIEM license if you only plan to use it as context (valuable though it may be) for another data source, that is, if you don’t plan to write any detection rules or apply other detection logic based on this telemetry? A popular example here would be DHCP logs: how many detections rely solely on DHCP logs? None, or very few at most. [many Chronicle clients do, BTW]
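The cost question above is simple arithmetic, and it can help to write it down. The sketch below assumes a license priced by daily ingest volume; the volumes, the annual cost, and the function names are hypothetical and chosen only to reproduce the 10% example.

```python
def source_license_cost(source_gb_per_day, licensed_gb_per_day, annual_license_cost):
    """Return (share of license, annual cost attributable to the source)."""
    if licensed_gb_per_day <= 0:
        raise ValueError("licensed volume must be positive")
    share = source_gb_per_day / licensed_gb_per_day
    return share, share * annual_license_cost


def cost_per_detection(source_cost, detection_count):
    """Annual cost per detection that uses the source; None for context-only sources."""
    if detection_count == 0:
        return None
    return source_cost / detection_count


# Hypothetical numbers: DHCP logs at 50 GB/day on a 500 GB/day license.
share, cost = source_license_cost(50, 500, 200_000)
print(f"DHCP uses {share:.0%} of the license, about {cost:,.0f} per year")
print("Cost per detection:", cost_per_detection(cost, 0))
```

A `None` result marks a context-only source: there is no detection to divide the cost by, so the case for collecting it has to rest on its value for investigation and enrichment.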
As a result, experiences with SIEM deployments (going back to 2002) taught us that few people will include DNS or DHCP logs during their initial phases of SIEM roll-out. In fact, some will never include them in their SIEM at all! When asked why, those people explain that while they are convinced of the general utility of DNS logs, they do not see much value in each individual message that costs money to collect.
These logs are essentially “sparse value logs”: the value is in having the bulk of them rather than in a few particularly valuable messages such as, say, Windows Security Event ID 1102 (audit log cleared). As a result, SIEM operators have doubts about paying to include this data in their SIEM.
In fact, this gave rise to an architecture where one product is used for high-value logs while another product augments it by storing the more voluminous logs. Such architectures do work if the products have good APIs (for example, to query one telemetry repository from another), but it is useful to remember that they offer no advantages other than cost.
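A minimal sketch of that two-tier architecture follows, with in-memory stand-ins for both products. The `Repository` class, its `search` method, and the choice of which sources count as high-value are assumptions for illustration; real products expose their own query APIs.

```python
from dataclasses import dataclass, field

# Hypothetical tiering decision: which sources go to the (expensive) SIEM.
HIGH_VALUE_SOURCES = frozenset({"windows_security", "edr"})


@dataclass
class Repository:
    """In-memory stand-in for a log store with a simple query API."""
    name: str
    events: list = field(default_factory=list)

    def search(self, **criteria):
        """Return stored events whose fields match all given criteria."""
        return [e for e in self.events
                if all(e.get(k) == v for k, v in criteria.items())]


def route(event, siem, bulk_store):
    """Send high-value events to the SIEM and everything else to the bulk store."""
    target = siem if event["source"] in HIGH_VALUE_SOURCES else bulk_store
    target.events.append(event)
    return target.name


def enrich_alert(alert, bulk_store):
    """Query the bulk store from the SIEM side to add DHCP context to an alert."""
    leases = bulk_store.search(source="dhcp", ip=alert["ip"])
    return {**alert, "hostnames": sorted({lease["hostname"] for lease in leases})}


siem = Repository("siem")
bulk = Repository("bulk_store")
route({"source": "dhcp", "ip": "10.0.0.7", "hostname": "laptop-42"}, siem, bulk)
route({"source": "windows_security", "event_id": 1102, "ip": "10.0.0.7"}, siem, bulk)

alert = siem.search(event_id=1102)[0]
print(enrich_alert(alert, bulk))
```

The `enrich_alert` step is where the architecture depends on good APIs: without a way to query the bulk store from the SIEM side, the cheap tier holds context that analysts cannot easily reach.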
Naturally, I want to test my assumptions, so I obviously did this (and this too)
So, yes, top log sources change over time: the firewall and server logs that flooded SIEM tools in the early 2000s have been supplemented with new critical sources such as:
At the same time, some of the classic sources remain very popular and very useful:
Also, some log sources qualify as “newly popular” even though some organizations have been collecting them for years, if not decades. These include:
Finally, if you integrate a new log source type, make sure that you monitor that its telemetry is actually arriving in your SIEM!
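One simple way to monitor for telemetry arrival is to compare each source's most recent event time against an expected maximum silence. The sketch below assumes you can obtain a last-seen timestamp per source from your SIEM; the source names and thresholds are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def find_silent_sources(last_seen, max_silence, now):
    """Return sources that were never seen or have been quiet longer than allowed.

    last_seen:   dict of source name -> datetime of the most recent event
    max_silence: dict of source name -> timedelta of tolerated silence
    """
    silent = []
    for source, limit in max_silence.items():
        seen = last_seen.get(source)  # None if the source never delivered anything
        if seen is None or now - seen > limit:
            silent.append(source)
    return sorted(silent)


# Hypothetical expectations: chatty sources get tight thresholds.
MAX_SILENCE = {
    "firewall": timedelta(minutes=15),
    "dns": timedelta(minutes=15),
    "cloud_audit": timedelta(hours=6),
}

now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
last_seen = {
    "firewall": now - timedelta(minutes=5),
    "dns": now - timedelta(hours=2),
    # "cloud_audit" was integrated but has never delivered an event
}

print(find_silent_sources(last_seen, MAX_SILENCE, now))
```

Driving the check from the list of expected sources, rather than from the sources that have sent data, is what catches an integration that never worked in the first place.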
So, to have your SIEM perform better, do the following:
One More Time on SIEM Telemetry / Log Sources … was originally published in Anton on Security on Medium.
*** This is a Security Bloggers Network syndicated blog from Stories by Anton Chuvakin on Medium authored by Anton Chuvakin. Read the original post at: https://medium.com/anton-on-security/one-more-time-on-siem-telemetry-log-sources-b0a88572dac9?source=rss-11065c9e943e------2