Since I spend so much time talking to and researching SOCs and SOC analysts, I often get asked, "What's the biggest difference between high- and low-growth SOCs?"
The answer? Expectations.
First, what do I mean by growth? I'm talking about places where analysts can grow their abilities. These are the places that take complete novices and help them achieve competence or take experienced analysts and help them specialize. Growing human ability.
The organizations that support growth well are those where leadership has high (but realistic) expectations for analysts. These most often center on expecting the analyst to make reliable, evidence-driven decisions confidently.
Like lots of folks, I'm pretty miffed by the lack of robust virtualization support on Apple M1 hardware. I hope that gets fixed soon. But, it also got me to thinking about decision making at big vendors like Apple and others.
For example, the security community (myself included) is often critical of Microsoft for some of their decision-making when it comes to usability/flexibility vs. security. Two things immediately come to mind...
1. Macros. The idea that they exist, are default usable, and the UI pushes users more toward enabling them than disabling them.
2. Default logging configs. Fairly minimal with lots of sec relevant stuff left out (integrate sysmon already!).
A lot of tips about good writing are rooted in the psychology of your reader. For example, if you want your reader to understand a risk (a probability), is it better to express that as a relative frequency (1 in 20) or a percentage (5%)?
Typically, people understand risk better as a frequency. For example, consider the likelihood of a kid dropping out of high school. You could say that 5% of kids drop out, or that 1 in 20 does. Why is the latter more effective?
First, it's something you can more easily visualize. There's some evidence you might be converting the percentage into the frequency representation in your head anyway. Weber et al. (2018) discuss this here: frontiersin.org/articles/10.33…
Abstractions are something analysts have to deal with in lots of forms. Abstraction is the process of taking away characteristics of something to represent it more simply. So, what does that look like? 1/
Well, speaking broadly, let's say that I tell you I had scrambled eggs with parsley and tarragon for breakfast. You can probably picture that very clearly in your mind and it will be fairly accurate to reality. However... 2/
What if I tell you I just had eggs? Or only that I had breakfast? Your perception of reality may differ greatly from what I actually ate. The abstraction increases opportunity for error.
One of my research areas that I write about often is curiosity and how it manifests in infosec education and practice. A topic that relates to curiosity is boredom, which I've done some recent reading on. I thought I'd share a bit about that. 1/
First, what is boredom? A consensus definition is that boredom is the uncomfortable feeling of wanting to engage in satisfying activity but being unable to do so. 2/
When you're bored, two things happen: 1. You want to do something but don't want to do anything. 2. You are not mentally occupied in a way that leverages your capacities or skills.
Let's talk about some lessons gathered from how a student over the weekend quickly went from struggling on an investigation lab and...
to finished and...
"I don’t know if you just Yoda’d the hell out of me or what"
This particular student emailed and said they were stuck and gave me some misc facts they had discovered. I responded and asked them to lay out a timeline of what they knew already so that we could work together to spot the gaps. 2/
The truth is that when this inquiry is taken seriously, it doesn't often result in us having to spot those gaps together at all because the student figures it out on their own. Why does this happen? Two main reasons... 3/
One of the things I absolutely love about our new @sigma_hq course is that the final challenge includes building your own new rule (we provide a bunch of ideas) with the option of actually submitting it to the public repo. Folks learn and contribute detection value back to the community.
@sigma_hq As part of that, @DefensiveDepth walks students through the process, even if they've never used git before. The Sigma community also does a great job of providing input and additional testing.
It's awesome to watch it all come together. I'm looking at a rule in the public repo now written by a student who didn't know anything about Sigma a month ago. It's been tested, vetted, and now it'll help folks find some evil.
I don't know who needs to hear this today but cyber security work is really hard. Even at the entry level, it's difficult work.
People around you too easily forget that because of the curse of knowledge -- we can't remember what it was like to not know something we know.
Prevalence of incomplete information, lots of inputs, tons of tacit knowledge, an ill-defined domain, high working memory demands, poor tooling and UX, lack of best practices, interpersonal challenges... I could go on. It's really hard.
Even if everybody around you seems to make it look easy -- it isn't. This stuff is complex, difficult, and mentally demanding.
One of the more helpful things new analysts can do is to read about different sorts of attacks and understand the timeline of events that occurred in them. This enables something called forecasting, which is an essential skill. Let's talk about that. 1/
Any alert or finding that launches an investigation represents a point on a potential attack timeline. That timeline already exists, but the analyst has to discover its remaining elements to decide if it's malicious and if action should be taken. 2/
Good analysts look at an event and consider what sort of other events could have led to it or followed it that would help them make a judgment about the sequence's disposition. 3/
Investigative Experience -- Tuning detection involves investigating alerts from signatures so you need to be able to do that at some level. A year or two of SOC experience is a good way to start.
Detection Syntax -- You have to be able to express detection logic. Suricata for network traffic, Sigma for logs, YARA for files. Learn those and you can detect a lot of evil. They translate well to vendor-specific stuff.
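To make that concrete, here's a sketch of what a minimal Sigma rule can look like. The title, binary name, and level here are hypothetical, invented for illustration; they're not from a real ruleset:

```yaml
title: Suspicious Hypothetical Tool Execution   # illustrative only
status: experimental
logsource:
    category: process_creation
    product: windows
detection:
    selection:
        Image|endswith: '\suspicious-tool.exe'   # hypothetical binary name
    condition: selection
level: medium
```

The same basic shape (describe the log source, describe the matching logic, combine with a condition) carries over conceptually to Suricata and YARA, even though their syntaxes differ.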
This relates to my 4th and 5th reasons why these decisions happen -- AV company tactics and giving folks what they need to tune rules. That actually means GIVING analysts the rule logic. I could go on and on about this.
Most companies don't want to give out their rule logic because they see it as a sensitive trade secret. This is nonsense. A rule set isn't a detection company's most valuable intellectual property; it's their processes for creating those rules and the staff that do the work.
Limiting access to detection logic makes it harder for your customer. It is MUCH more difficult to investigate alerts when you don't know what they are actually detecting and how they're doing it.
I've seen many good analysts give clear, compelling explanations as to why tuning is important but fail to convince the decision-makers that this needs a dedicated person or a day a week from an existing person.
The thing that needs to become more commonly accepted is that if you decide your company needs a SOC, then that has to include a detection tuning capability. It also needs to be run by people who've seen this thing work well.
Some of these are companies that developed their own "standard" for expressing detection logic and don't even use it in most of their tools 😂
This comes from a lot of places. Usually, someone develops a detection tool by themselves or as part of a small or isolated team and they choose what they want; then the project grows and it becomes painful to change.
There's often interesting public discussion about vendor detection tools and what they detect vs expectations. There's some interesting decision making that happens behind the scenes at these vendors when it comes to how they manage detection signatures. A thread... 1/
At a vendor, when you build a detection ruleset for lots of customers, you have to approach things a bit uniquely because you don't control the network where these rules are deployed and can't tune them yourself. 2/
One facet of this challenge is a decision regarding how you prioritize rule efficacy...we're talking accuracy/precision and the number of false positives that analysts have to investigate and tune. 3/
I'm really excited to share that our newest online class, Detection Engineering with Sigma, is open this morning. You can learn more and register at learnsigmarules.com.
The course is discounted for launch until next Friday.
If you're not familiar with @sigma_hq, you should be! It's the open standard detection signature format for logs. Said another way, Sigma is for logs what Snort/Suricata are for network traffic and YARA is for files.
Perhaps the best thing about Sigma is that you can easily convert its rules into LOTS of other formats using the Sigmac tool. Things like Elastic, Splunk, Graylog, NetWitness, Carbon Black, and so on.
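As a sketch of what that conversion looks like on the command line (the rule path here is an assumption about your repo layout):

```shell
# Convert a Sigma rule into a Splunk search.
# -t selects the target backend, -c applies a field-mapping config for it.
sigmac -t splunk -c splunk-windows rules/windows/process_creation/my_rule.yml
```

Swap the backend and config to target other platforms from the same rule file.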
One of the unique challenges of forensic analysis is that we're focused both on determining what events happened and the disposition of those events (benign or malicious). A failure to do one well can lead to mistakes with the other. 1/
Generally speaking, analysts interpret evidence to look for cues that spawn more investigative actions. Those cues can be relational (indicate the presence of related events), dispositional (indicate the malicious or benign nature of something), or even both at the same time. 2/
Not only do we have to explore relationships, but we also have to characterize and conceptualize them. That means we're constantly switching between cause/effect analysis and pattern matching of a variety of sorts. 3/
One of the things we struggle with in investigations as analysts is even talking about them in an educated way. Someone asks you how you found something and it's, "I looked in the logs". Well, no... you did a lot more than that! 1/
You identified a cue that made you think there were other related events to be found, and those events could indicate an attack. Then you considered which of those events would be most meaningful to disposing the timeline you found. 2/
After that, you formed an investigative question that helped you home in on exactly what you're looking for. With the question formed, you queried the log evidence to return a data set that you hoped would provide an answer. 3/
When I write about analyst skills I often want to add a section about metacognitive skills. However, it's sometimes redundant because those skills appear alongside all the other skills analysts leverage.
For example, good analysts often know their limitations. They know what evidence sources they are weak in (metacognitive knowledge) and seek alternative investigative pathways to reach conclusions (knowledge regulation). That's essential metacognitive stuff.
Sometimes that's easy to deal with. There are a lot of ways to prove program execution (OS logs, prefetch, registry, and so on) and most mid-level analysts are comfortable with at least one of them. Not knowing one isn't a massive burden because you can use others.
Let's talk about PREVALENCE ANALYSIS. This is one of the most useful concepts for analysts to understand because it drives so many actions. Prevalence is basically what proportion of a population shares a specific characteristic. 1/
First, prevalence is often an anomaly detection technique. Let's say you've found a process running with a name you don't recognize. If it's running on every host on the network, you might say it's more likely to be benign. 2/
If the process is only running on one or a couple of hosts, that could be a bit more suspicious. Is there a common thread between the hosts? A pattern? There's more work here. 3/
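A minimal sketch of that prevalence check, assuming you can pull a mapping of hostnames to running process names out of your EDR or inventory tooling (the inventory data and the 5% threshold below are made up for illustration):

```python
from collections import Counter

def process_prevalence(host_processes):
    """Given {host: [process names]}, return {process: fraction of hosts running it}."""
    total_hosts = len(host_processes)
    counts = Counter()
    for procs in host_processes.values():
        counts.update(set(procs))  # count each process at most once per host
    return {proc: n / total_hosts for proc, n in counts.items()}

def rare_processes(host_processes, threshold=0.05):
    """Flag processes present on a small fraction of hosts as candidates for review."""
    return {p: f for p, f in process_prevalence(host_processes).items()
            if f <= threshold}

# Hypothetical inventory: svchost.exe is everywhere, oddball.exe on one host.
inventory = {f"host{i}": ["svchost.exe"] for i in range(40)}
inventory["host0"].append("oddball.exe")

print(rare_processes(inventory))  # oddball.exe stands out at 1-in-40 prevalence
```

The low-prevalence items are where the "is there a common thread between the hosts?" questions start, not an automatic verdict of malice.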
Upon notification of potential malware infection, SOC analysts tend to spend more time trying to confirm the malware infection, whereas IR/DF analysts tend to assume infection and move toward understanding impact.
Usually, this results in different investigative actions. Confirming infection focuses more on the leading portion of the timeline relevant to the current event. Exploring impact focuses more on the trailing portion of the timeline.
Sometimes the investigative actions can look the same, but that depends on the malware and how the infection presents. Even with similar investigative actions, the intent is different.
The natural thing for inexperienced analysts to want to do is jump to the worst case scenario and begin investigating that thing. After all, the bad thing is very bad! But, that's usually a bad idea for at least three reasons. 1/
First, all investigations are based on questions. You use existing evidence to drive questions whose answers you pursue in evidence. If there is no evidence that indicates the very bad thing, you are probably jumping the gun by looking for it. It's a reach. 2/
Second, the very bad thing is often very hard to investigate. Exfil is a prime example. The techniques for investigating and proving data exfil are often time-consuming and cognitively demanding. Now you're distracting yourself from the actual evidence you already have. 3/