Jessica Joy Kerr
Mar 15, 2022 6 tweets 4 min read
This afternoon at #srecon, Adam Mckaig and Tahia Khan from @datadoghq spoke about the evolution of their metrics backend.
The high-level architecture looks very familiar to me. The slightly more detailed view, less so: many parts!
For scale, break up the incoming data and put it into Kafka.
hash(customer_id) -> partition_id
… but then one Kafka topic gets overloaded, so…
hash(customer_id) -> topic_id, partition_id
to send to topics in different clusters.
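A minimal sketch of that two-level routing, not Datadog's actual code. The topic and partition counts, the function names, and the choice of SHA-256 as the stable hash are all assumptions for illustration:

```python
import hashlib
from typing import Tuple

# Hypothetical sizes; the talk did not give real numbers.
NUM_TOPICS = 4
PARTITIONS_PER_TOPIC = 16


def stable_hash(key: str) -> int:
    """Hash that is identical across processes (unlike Python's built-in hash())."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def route_by_customer(customer_id: str) -> Tuple[int, int]:
    """hash(customer_id) -> (topic_id, partition_id)."""
    h = stable_hash(customer_id)
    topic_id = h % NUM_TOPICS
    # Use the remaining bits so the partition choice is independent of the topic choice.
    partition_id = (h // NUM_TOPICS) % PARTITIONS_PER_TOPIC
    return topic_id, partition_id
```

Every producer that runs the same function sends a given customer's data to the same place, with no coordination needed.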
Later, some customers are too big.
So for those customers:

hash(metric_id) -> topic, partition

Since metrics are queried individually, @datadoghq can split up data to that fine grain and each query will still only need to hit one partition. #SREcon
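The special case for big customers could look something like this. It is a sketch under assumptions: the `BIG_CUSTOMERS` set, the sizes, and the decision to include the customer in the per-metric key are mine, not from the talk:

```python
import hashlib
from typing import Tuple

# Hypothetical configuration for illustration only.
NUM_TOPICS = 4
PARTITIONS_PER_TOPIC = 16
BIG_CUSTOMERS = {"customer-big"}


def stable_hash(key: str) -> int:
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def route(customer_id: str, metric_id: str) -> Tuple[int, int]:
    """Route small customers by customer, big customers by individual metric."""
    if customer_id in BIG_CUSTOMERS:
        # Assumption: metric ids are only unique within a customer,
        # so the customer id is part of the key.
        key = f"{customer_id}/{metric_id}"
    else:
        key = customer_id
    h = stable_hash(key)
    return h % NUM_TOPICS, (h // NUM_TOPICS) % PARTITIONS_PER_TOPIC
```

A query for one metric still resolves to exactly one (topic, partition), whichever branch the customer falls into.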
Partitions still get unbalanced. Some customers, and some metrics, are way bigger than others.

So @datadoghq got smart with its partitioning, implementing Slicer based on a paper from Google.
#srecon
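A very simplified sketch of the Slicer idea as I understand it: keys hash into a large keyspace, contiguous ranges ("slices") are assigned to partitions, and a hot slice can be split and reassigned. The class and method names are hypothetical, and this leaves out everything that makes the real system hard (load measurement, merging, consistent rollout of new assignments):

```python
import bisect
import hashlib
from typing import List

KEYSPACE = 2 ** 64  # hashes fall in [0, KEYSPACE)


def stable_hash(key: str) -> int:
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class SliceTable:
    """Maps contiguous ranges of the hash space to partition ids."""

    def __init__(self, num_partitions: int) -> None:
        # Start with one equal-width slice per partition; starts[0] is always 0.
        width = KEYSPACE // num_partitions
        self.starts: List[int] = [i * width for i in range(num_partitions)]
        self.owners: List[int] = list(range(num_partitions))

    def _index(self, h: int) -> int:
        # Index of the last slice whose start is <= h.
        return bisect.bisect_right(self.starts, h) - 1

    def lookup(self, key: str) -> int:
        return self.owners[self._index(stable_hash(key))]

    def split(self, key: str, new_owner: int) -> None:
        """Halve the slice containing `key`; the upper half moves to `new_owner`."""
        i = self._index(stable_hash(key))
        start = self.starts[i]
        end = self.starts[i + 1] if i + 1 < len(self.starts) else KEYSPACE
        mid = start + (end - start) // 2
        self.starts.insert(i + 1, mid)
        self.owners.insert(i + 1, new_owner)
```

Unlike a plain modulo, only keys inside the split slice can move; every other key keeps its partition.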
The storage layer knows nothing about the partitioning scheme.
Intake and Query need the mapping from (customer, metric) to (cluster, partition) so they can send to & query from the same node.
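To illustrate that split of responsibilities, here is a toy model where intake and query share one `locate` function and the storage nodes just store whatever they are handed. All names and sizes are made up for the example:

```python
import hashlib
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical sizes for illustration.
NUM_CLUSTERS = 2
PARTITIONS_PER_CLUSTER = 8

Location = Tuple[int, int]  # (cluster, partition)


def locate(customer_id: str, metric_id: str) -> Location:
    """The shared mapping: (customer, metric) -> (cluster, partition)."""
    digest = hashlib.sha256(f"{customer_id}/{metric_id}".encode("utf-8")).digest()
    h = int.from_bytes(digest[:8], "big")
    return h % NUM_CLUSTERS, (h // NUM_CLUSTERS) % PARTITIONS_PER_CLUSTER


class StorageNode:
    """Knows nothing about partitioning; it only stores and returns points."""

    def __init__(self) -> None:
        self.series: Dict[Tuple[str, str], List[float]] = defaultdict(list)

    def append(self, series_key: Tuple[str, str], value: float) -> None:
        self.series[series_key].append(value)

    def read(self, series_key: Tuple[str, str]) -> List[float]:
        return list(self.series.get(series_key, []))


nodes: Dict[Location, StorageNode] = {
    (c, p): StorageNode()
    for c in range(NUM_CLUSTERS)
    for p in range(PARTITIONS_PER_CLUSTER)
}


def intake(customer_id: str, metric_id: str, value: float) -> None:
    nodes[locate(customer_id, metric_id)].append((customer_id, metric_id), value)


def query(customer_id: str, metric_id: str) -> List[float]:
    # Same mapping as intake, so the query hits exactly one node.
    return nodes[locate(customer_id, metric_id)].read((customer_id, metric_id))
```

Because the mapping lives only in `locate`, changing the partitioning scheme means changing intake and query together, while the storage code stays untouched.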
