Cindy Sridharan
Jun 25, 2019 · 9 tweets · 8 min read
Making distributed tracing easier with more sophisticated visualizations - @YuriShkuro

The first is color coded by service graph. The second is a heat map #QConNYC
Now @YuriShkuro is talking about a tool that compares traces.

[ed - omg this: just yesterday I was talking to a vendor at QCon and was wondering if it’d be possible to compare traces. They said their product didn’t offer this. IMO this is the most important aspect of tracing.]
And the diff tool deals with aggregate traces. You can then drill down into an individual trace. @YuriShkuro at #QConNYC

[ed - this. This so freaking much. Starting with a trace is like being in a hiding to nowhere. Need to begin with an aggregate view.]
I missed the first part of this talk, but everything I saw was 💯.

Tracing only becomes useful when you can surface the relevant information from a trace. That requires aggregate analysis.

But this isn’t without its flaws. Finding the right baseline might be hard.
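
[ed - a minimal sketch of what comparing aggregates (rather than single traces) could look like. This is not the tool from the talk. The span shape `(service, operation, duration_ms)` and the function names `aggregate` and `diff_aggregates` are assumptions made up for illustration.]

```python
from collections import Counter

# Hypothetical data model: a trace is a list of spans,
# and each span is a (service, operation, duration_ms) tuple.

def aggregate(traces):
    """Average number of times each (service, operation) appears per trace."""
    if not traces:
        return {}
    totals = Counter()
    for trace in traces:
        for service, operation, _duration_ms in trace:
            totals[(service, operation)] += 1
    return {key: count / len(traces) for key, count in totals.items()}

def diff_aggregates(baseline, candidate):
    """Map each (service, operation) to candidate_avg - baseline_avg, omitting unchanged keys."""
    base = aggregate(baseline)
    cand = aggregate(candidate)
    delta = {}
    for key in base.keys() | cand.keys():
        change = cand.get(key, 0.0) - base.get(key, 0.0)
        if change != 0:
            delta[key] = change
    return delta
```

The baseline here is just whatever set of traces gets passed in first, which is exactly where the "finding the right baseline" problem shows up.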
Here’s a tool that’s internal to Uber that helps with “root cause” (sic) analysis to drill down from business metrics to app level telemetry. @YuriShkuro at #QConNYC
One of the challenges of taming microservices complexity is dealing with data.

Tracing can help here by enabling building lineage graphs h/t @palvaro
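
[ed - a rough sketch of how span parent/child links could be turned into a lineage-style graph. The span fields (`span_id`, `parent_id`, `service`) and both function names are assumptions for illustration, not anything shown in the talk.]

```python
from collections import defaultdict

def service_edges(spans):
    """Derive caller -> callee service edges from parent/child span links."""
    service_of = {s["span_id"]: s["service"] for s in spans}
    edges = defaultdict(set)
    for s in spans:
        parent = s.get("parent_id")
        # Skip root spans and calls that stay inside one service.
        if parent in service_of and service_of[parent] != s["service"]:
            edges[service_of[parent]].add(s["service"])
    return edges

def downstream(edges, start):
    """All services reachable from `start`, i.e. where its requests (and data) flow."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen
```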
The hardest part of doing all this analysis is application instrumentation.

Doing the analysis is easy for us (Uber) - getting high-fidelity data is the challenge. - @YuriShkuro at #QConNYC
In summary:

- tracing really becomes usable when you have creative visualizations
- engineers don’t really know how their services work. Tracing helps unlock unparalleled insights.

@YuriShkuro at #QConNYC
Uber doesn’t use tracing for latency analysis. We use it for root cause analysis. - @YuriShkuro at #QConNYC

More from @copyconstruct

Nov 13, 2022
I agree that it’s misguided to suggest the only way for managers to “be technical” is by coding.

But boy, some management folks seem so virulently anti-coding in a way that’s just absurd.

Coding is definitely one way (though not the only way or even the best way) to “be technical”.
There are many ways “leadership” can contribute to the technical betterment of a project (and improve your own credibility) without writing production code:

- build small side projects using libraries your team authors, giving them feedback on code quality, testing, design etc.
There are many benefits to being able to read and use your team’s code.

People working very closely on projects given deadlines etc can have blind spots.

A strong technical leader who can comprehend code or systems can point these out.
Read 4 tweets
Jul 12, 2021
Every time I say something - anything at all - about software quality or dev productivity, I have legions pontificating about unit tests and documentation in my mentions.

So here’s another (slightly contrarian) take: unit tests/docs aren’t always the best yardstick of “quality”
In other words, if you’re looking at a project with inadequate docs/test coverage, and immediately think the way to fix it or improve it is by adding more tests/docs, then it’s possible your immediate impact on the project or team productivity might be rather meagre.
I used to think this way years ago myself; when seeing some code that was ambiguous or where it wasn’t possible to test it easily, my immediate instinct was to “fix it”.

Learning how to work around these issues without “fixing” it was a far more valuable skill.
Read 4 tweets
Oct 21, 2020
So many companies, large and small, end up solving all the wrong problems or the least important ones. Especially common when building infra tools/software.

I see this happen again and again, and we then wonder why the state of the art hasn’t improved in the past 5 years.
When there’s a problem space where frankly everything sucks at every layer, it’s common to think the way to tame this space is by taking a bottom-up approach.

This approach fails, time and time again, because it almost never provides any immediate value to users.
Most infra software is hard to use for even other infra engineers.

The UX almost universally sucks, whether it’s APIs, protos, yaml, UI, dashboards.

The thing is, a product doesn’t need to solve *all* of these problems to be great and provide immediate value to users.
Read 4 tweets
Aug 3, 2020
The paper I've been looking forward to the most is now out: zero downtime deployments at Facebook.

Disruption free release of services that speak different protocols and serve different types of requests (long lived TCP/UDP sessions, requests involving huge chunks of data etc.)
"Socket Takeover" should be familiar to traffic nerds. Transferring the listening socket over a Unix Domain Socket with ancillary message (CMSG) + SCM_RIGHTS is *precisely* how HAProxy does seamless reloads.

What *is* novel is how they transfer UDP (QUIC) socket fds.
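
A minimal sketch of the fd-passing mechanism itself, assuming Python 3.9+ on a Unix-like system (`socket.send_fds` and `socket.recv_fds` wrap the SCM_RIGHTS ancillary message). This is not Facebook's or HAProxy's code, and the function names `hand_over_listener` and `take_over_listener` are made up for illustration.

```python
import socket

def hand_over_listener(channel, listener):
    """Old process: send the listening socket's fd over a Unix domain socket (SCM_RIGHTS)."""
    socket.send_fds(channel, [b"listener"], [listener.fileno()])

def take_over_listener(channel):
    """New process: receive the fd and wrap it in a socket object."""
    _msg, fds, _flags, _addr = socket.recv_fds(channel, 1024, 1)
    # The received fd is a new descriptor referring to the same listening socket.
    return socket.socket(fileno=fds[0])
```

In a real takeover the two ends of `channel` would live in different processes; the new one starts calling `accept()` on the returned socket while the old one drains its in-flight connections.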
The second approach is one called Downstream Connection Reuse, used for long lived persistent connections. This involves rendering in-path proxies stateless and tunneling requests over H2.
Read 5 tweets
Aug 2, 2020
This is true. But there’s a corollary:

Please do not hire junior engineers unless your team/org has the bandwidth for proper mentorship.

Hiring a junior engineer is a commitment - you need to be willing to invest at least 1-2 years. A lot of teams aren’t set up for this.
Some more things to consider:

- inexperienced managers probably aren’t the best suited to hire and mentor junior engineers, unless these managers themselves have mentorship/guidance from senior managers/leadership folks. A bad manager can be a horrendous formative experience.
- mentoring junior devs remotely presents unique challenges. A lot of what I learned as a junior engineer was via osmosis - listening to conversations other senior engineers were having, even if I wasn’t a part of the conversation. This is hard to replicate in a remote setting.
Read 9 tweets
Jul 31, 2020
Wow this article from Dropbox on why @EnvoyProxy shines has tons of super 🌶 takes: A thread covering security, concurrency, opencore 1/7

The hottest take was probably:

“writing modern C++14 is not much different from using Golang or, with a stretch, one may even say Python.”
Still remember the days when @mattklein123 claimed “developer productivity was one of the highlights of modern C++” when introducing Envoy and people raising eyebrows at this claim.

Receipt:
2.

- “goroutine-per-request” model and GC overhead greatly increase memory req in high-connection services.

- thread per request model allows better security than pre-forking servers (Nginx, unicorn, gunicorn etc.)

PS @mattklein123 any plans for adopting io_uring in envoy?
Read 8 tweets
