Jon Hencinski

Nov 6 • 14 tweets • 6 min read

What does a #SOC tour look like when the team is remote?

TL;DR - Not a trip to a room with blinky lights - but instead a discussion about mission, mindset, ops mgmt, results and a demo of the tech and process that make our SOC “Go”.

SOC tour in the 🧵...

Our SOC tour starts with a discussion about mission. I believe a key ingredient to high performing teams is a clear purpose and “Why”.

What’s our mission? It's to protect our customers and help them improve.

Our mission is deliberately centered around problem solving and being a strategic partner for our customers. Notice that there are zero mentions of looking at as many security blinky lights as possible. That’s intentional.

Next, we talk about culture and guiding principles - key ingredients for any #SOC. I think about culture as the behaviors and beliefs that exist when management isn’t in the room.

Culture isn't memes on a slide - it's behavior and mindset.

Next, with a clear mission and mindset - how are we organized as a team to get there? Less experienced analysts are backed by seasoned responders. If there's a runaway alert (it happens), there's a team of D&R Engineers monitoring the situation, ready to respond.

The tour then focuses on operations management and how we do this for a living. You have to have intimate knowledge of what your system looks like so you know when something requires attention. Is it a rattle in the system (transient issue) or a big shift in work volume?

With solid ops management we’re able to constantly learn from our analysts and optimize for the decision moment. We watch patterns and make changes to reduce manual effort. We hand off repetitive tasks to bots because automation unlocks fast and accurate decisions.

Next, our tour focuses on how we think about investigations. Great investigations are stories (based on evidence of course).

When we identify an incident we investigate to determine what happened, when, how it got there, and what we need to do about it. Stories.

Next, how we think about quality control in our #SOC. We make a couple key points:

1. We don’t trade quality for efficiency
2. You can measure quality in a SOC
3. QC checks run daily, based on a set of manufacturing ISO standards, to spot failures and drive improvements

Sidenote: Steal/copy our SOC QC guiding principles:

- We’re going to use industry standards to sample
- The sample has to be representative of the population and done daily
- Measurements of the sample need to be accurate and precise
- Metrics we produce need to be digestible
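A minimal sketch of what "sample daily, measure accurately, keep it digestible" could look like. The thread doesn't name the specific standards, so this swaps in a plain simple random sample and a Wilson score interval as stand-ins. The function names, the sample size of 50, and the alert counts are all hypothetical.

```python
import math
import random


def daily_qc_sample(closed_alert_ids, sample_size, seed=None):
    """Draw a simple random sample (without replacement) of the day's closed alerts."""
    rng = random.Random(seed)
    k = min(sample_size, len(closed_alert_ids))
    return rng.sample(closed_alert_ids, k)


def wilson_interval(defects, n, z=1.96):
    """Wilson score interval for a defect rate (z=1.96 is roughly 95% confidence)."""
    if n == 0:
        raise ValueError("cannot score an empty sample")
    p = defects / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# Hypothetical day: 400 closed alerts, reviewers check 50 and flag 2 as defective.
closed = [f"alert-{i}" for i in range(400)]
sample = daily_qc_sample(closed, sample_size=50, seed=7)
defects = 2
low, high = wilson_interval(defects, len(sample))
print(
    f"reviewed {len(sample)} alerts, "
    f"defect rate {defects / len(sample):.1%} (95% CI {low:.1%} to {high:.1%})"
)
```

Reporting the interval next to the point estimate is one way to keep the metric digestible: a small sample with a wide interval is a prompt to review more, not proof of a quality problem.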

What about #SOC results? Let's talk about it. Yes, alert-to-fix in <30 minutes is quite good. But a high degree of automation and SOC retention are equally important.

Before the tour ends we share insights. The security incidents we detect become insights for every customer.

“Identity is the new endpoint”: there's a lot of BEC in M365, and MFA fatigue attacks are up.

You can d/l a copy of our Quarterly Threat Report here:
expel.com/expel-quarterl…

Then we jump into our platform and provide a demo of the tech and process that enable the #SOC to complete their mission. Here's a video capturing some of the items we cover: expel.com/managed-securi…

Finally, we stop by the #SOC. Most of our analysts will be remote - but a tour is about so much more than seeing a room with monitors and blinky lights. I believe a great SOC tour highlights the people, culture and mindset behind tech and process.


More from @jhencinski

Jon Hencinski

@jhencinski

Nov 5

A good detection includes:
- Clear aim (e.g., remote process exec on DC)
- Unlocks end-to-end workflow (not just alert)
- Automation to improve decision quality
- Response (hint: not always contain host)
- Volume/work time calcs
- Able to answer, “where does efficacy need to be?”

On detection efficacy:
⁃ As your True Positive Rate (TPR) moves higher, your False Negative Rate moves with it
⁃ Our overarching detection efficacy goal will never be 100% TPR (facts)
⁃ However, TPR targets are diff based on classes of detections and alert severities

Math tells us there is a sweet spot between combating alert fatigue and controlling false negatives. Ours kind of looks like a ROC curve.

This measure becomes the overarching target for detection efficacy.

“Detection efficacy is an algebra problem not calculus.” - Matt B.
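A rough sketch of the trade-off described above. It assumes "TPR" here means the share of fired alerts that turn out to be true positives (what is often called precision). That is my reading, not something the thread spells out. The scores, labels, and thresholds are made up for illustration.

```python
def efficacy_at(threshold, events):
    """events: list of (score, is_malicious). An alert fires when score >= threshold."""
    fired = [mal for score, mal in events if score >= threshold]
    total_malicious = sum(1 for _, mal in events if mal)
    true_pos = sum(fired)
    # Share of fired alerts that were real (assumed meaning of "TPR" in the thread).
    precision = true_pos / len(fired) if fired else 0.0
    # Share of real malicious events that never produced an alert.
    fnr = (total_malicious - true_pos) / total_malicious if total_malicious else 0.0
    return precision, fnr, len(fired)


# Made-up scored events, not real data: (detector score, was it actually malicious?)
events = [
    (0.95, True), (0.90, True), (0.85, False), (0.80, True),
    (0.70, False), (0.60, True), (0.50, False), (0.40, False),
    (0.30, True), (0.20, False), (0.10, False), (0.05, False),
]

for threshold in (0.1, 0.5, 0.9):
    precision, fnr, volume = efficacy_at(threshold, events)
    print(
        f"threshold={threshold:.1f} alerts={volume:2d} "
        f"precision={precision:.2f} fnr={fnr:.2f}"
    )
```

On this toy data, tightening the threshold cuts alert volume and raises precision, but the false negative rate climbs with it. Picking the target is then a matter of choosing the row you can live with for a given detection class and severity.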

Read 9 tweets

Jon Hencinski

@jhencinski

Oct 19

There’s no more strategic thing than defining where you want to get to and measuring it.

Strategy informs what "great" means, daily habits get you started (and keep you going) and measurements tell you if you’re there or not.

A 🧵 on #SOC strategy / metrics:

Before we hired our first #SOC analyst or triaged our first alert, we defined where we wanted to get to; what great looked like.

Here’s [some] of what we wrote:

We believe that a highly effective SOC:

1. leads with tech; doesn’t solve issues w/ sticky notes
2. automates repetitive tasks
3. responds and contains incidents before damage
4. has a firm handle on capacity v. loading
5. is able to answer, “are we getting better, or worse?”

Read 19 tweets

Jon Hencinski

@jhencinski

Jul 5

How to think about presenting good security metrics:

- Anchor your audience (why are these metrics important?)
- Make multiple passes with increasing detail
- Focus on structures and functions
- Ensure your audience leaves w/ meaning

Don’t read a graph, tell a story

Ex ⬇️

*Anchor your audience 1/4*

Effective leaders have a firm handle on SOC analyst capacity vs. how much work shows up. To stay ahead, one measurement we analyze is a time series of alerts sent to our SOC.

*Anchor your audience 2/4*

This is a graph of the raw trend of unique alerts sent to our SOC for review between Nov 1, 2021 and Jan 2, 2022. This time period includes two major holidays so we’ll expect some seasonality to show up around these dates.
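A small sketch of one way to look past weekly seasonality in a daily alert series: a trailing 7-day mean. The counts are synthetic, and the two holiday dates (Nov 25 and Dec 25, 2021) are my assumption about which holidays are meant. The thread doesn't say how the real graph was smoothed, if at all.

```python
from datetime import date, timedelta


def trailing_mean(values, window=7):
    """Mean of the last `window` points; None until a full window is available."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window : i + 1]) / window)
    return out


# Synthetic daily counts, not real SOC data: quieter weekends and holidays.
start = date(2021, 11, 1)
days = [start + timedelta(days=i) for i in range(63)]  # Nov 1, 2021 through Jan 2, 2022
holidays = {date(2021, 11, 25), date(2021, 12, 25)}  # assumed holiday dates
counts = [
    60 if d in holidays else (80 if d.weekday() >= 5 else 120)
    for d in days
]

smoothed = trailing_mean(counts)
for d, raw, avg in zip(days, counts, smoothed):
    if d in holidays:
        print(d.isoformat(), "raw:", raw, "7-day mean:", round(avg, 1))
```

A 7-day window always covers exactly one of each weekday, so the weekend dip cancels out and what's left is closer to the underlying trend plus any holiday effect.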

Read 17 tweets

Jon Hencinski

@jhencinski

Feb 22

Once a month we get in front of our exec/senior leadership team and talk about #SOC performance relative to our business goals (grow ARR, retain customers, improve gross margin).

A 🧵 on how we translate business objectives to SOC metrics.

As a business we want to grow Annual Recurring Revenue (ARR), retain and grow our customers (Net Revenue Retention - NRR) and improve gross margin (net sales minus the cost of services sold). There are others but for this thread we'll focus on ARR, NRR, and gross margin.
/1
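The arithmetic behind two of those measures, as a quick sketch. Gross margin follows the definition given above. The NRR formula is the commonly used one (starting ARR plus expansion, minus contraction and churn, over starting ARR), not something the thread defines. All dollar figures are hypothetical.

```python
def gross_margin(net_sales, cost_of_services_sold):
    """Gross margin in currency terms: net sales minus the cost of services sold."""
    return net_sales - cost_of_services_sold


def gross_margin_pct(net_sales, cost_of_services_sold):
    """Gross margin as a fraction of net sales."""
    return gross_margin(net_sales, cost_of_services_sold) / net_sales


def net_revenue_retention(starting_arr, expansion, contraction, churn):
    """NRR for the cohort of customers you started the period with."""
    return (starting_arr + expansion - contraction - churn) / starting_arr


# Hypothetical figures, for illustration only.
print(f"gross margin: ${gross_margin(1_000_000, 400_000):,}")
print(f"gross margin %: {gross_margin_pct(1_000_000, 400_000):.0%}")
print(f"NRR: {net_revenue_retention(1_000_000, 150_000, 20_000, 30_000):.0%}")
```

Seen this way, SOC work that lowers the cost of delivering the service (automation, less manual effort per alert) moves gross margin, while work that keeps customers renewing and expanding moves NRR.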

I think about growing ARR as the ability to process more work. It's more inputs. Do we have #SOC capacity available backed by the right combo of tech/people/process to service more work?

Things that feed more work: new customers, cross selling, new product launches.
/2

Read 18 tweets

Jon Hencinski

@jhencinski

Feb 11

Julie Zhuo's "The Making of a Manager" had a big impact on how I think about management.

One of the key lessons is that managers should focus on three areas to achieve a high multiplier effect: purpose, people, and process.

Let's apply that lesson to make a #SOC manager...

Purpose: Be clear with your team about what success looks like - and create a team and culture that guides you there. Go through the exercise of articulating your team's purpose.

The "purpose" we've aligned on at Expel in our SOC: protect our customers and help them improve.

People: To get to where you want to go, what are the traits, skills, and experiences you need to be successful?

Traits (who you are)
Skills (what you know)
Experiences (what you've encountered/accomplished)

When we hire new SOC analysts, traits >> skills.

Read 4 tweets

Jon Hencinski

@jhencinski

Nov 6, 2021

A good alert includes:
- Detection context
- Investigation/response context
- Orchestration actions
- Prevalence info
- Environmental context (e.g., src IP is scanner)
- Pivots/visual to understand what else happened
- Able to answer, "Is host already under investigation?"
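One way to picture that checklist is as a record an analyst opens. This is a sketch under my own assumptions: every field and function name is hypothetical, "prevalence" is reduced to a simple host count, and none of it reflects any real platform's schema.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    # All field names are hypothetical; each maps to one bullet in the list above.
    host: str
    detection_context: str            # what the rule is meant to catch, known gotchas
    response_guidance: list[str]      # questions to answer, data sources to check
    orchestration_actions: list[str]  # automated lookups already run for the analyst
    prevalence: int                   # how many other hosts showed the same activity
    environment_notes: list[str] = field(default_factory=list)  # e.g. "src IP is scanner"
    pivots: list[str] = field(default_factory=list)  # queries for "what else happened?"


def host_already_under_investigation(alert, open_investigation_hosts):
    """Answer the last bullet: is this host part of an open investigation?"""
    return alert.host in open_investigation_hosts


alert = Alert(
    host="dc01",
    detection_context="Remote process execution on a domain controller",
    response_guidance=["Who launched the process?", "Check authentication logs"],
    orchestration_actions=["Pulled process tree", "Looked up source IP"],
    prevalence=0,
    environment_notes=["src IP is a known vulnerability scanner"],
)
print(host_already_under_investigation(alert, {"dc01", "web07"}))
```

Keeping the "already under investigation?" check as a lookup against open investigations means the analyst gets the answer with the alert instead of having to go find it.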

Detection context. Tell me what the alert is meant to detect, when it was pushed to prod/last modified and by whom. Tell me about "gotchas" and point me to examples when this detection found evil. Also, where in the attack lifecycle did we alert? This informs the right pivots.

Investigation/response context. Given a type of activity detected, guide an analyst through response.

If #BEC, what questions do we need to answer, which data sources? If coinminer in AWS, guide analyst through CloudTrail, steps to remediate.

Orchestration makes this easier.

Read 8 tweets
