One thing it took me quite a while to understand is how few bits of information it's possible to reliably convey to a large number of people.
When I was at MS, I remember initially being surprised at how unnuanced their communication was, but it really makes sense in hindsight.
For example, when I joined Azure, I asked people what the biggest risk to Azure was and the dominant answer was that if we had more global outages, major customers would lose trust in us and we'd lose them forever, permanently crippling the business.
Meanwhile, the only message VPs communicated was the need for high velocity. When I asked why there was no communication about the thing considered the highest risk to the business, the answer was if they sent out a mixed message that included reliability, nothing would get done.
The fear was that if they said that they needed to ship fast and improve reliability, reliability would be used as an excuse to not ship quickly and needing to ship quickly would be used as an excuse for poor reliability and they'd achieve none of their goals.
When I first heard this, I thought it was odd, but having since paid attention to what happens when VPs and directors attempt to communicate information downwards, I have to concede that it seems like the MS VPs were right and nuanced communication usually doesn't work at scale.
I've seen quite a few people in upper management attempt to convey a mixed/nuanced message since my time at MS and I have yet to observe a case of this working in a major org at a large company (I have seen this work at a startup, but that's a very different environment).
I've noticed this problem with my blog as well. E.g., I have some posts saying BigCo $ is better than startup $ for p50 and maybe even p90 outcomes and that you should work at startups for reasons other than pay.
People often read those posts as "you shouldn't work at startups".
I see this for every post, e.g., when I talked about how latency hadn't improved, one of the most common responses I got was about how I don't understand the good reasons for complexity.
I literally said there are good reasons for complexity in the post!
As noted previously, most internet commenters can't follow constructions as simple as an AND, and I don't want to be in the business of trying to convey what I'd like to convey to people who won't bother to understand an AND, since I'd rather convey nuance.
But that's because, if I write a blog post and 5% of HN readers get it and 95% miss the point, I view that as a good outcome, since it was useful for 5% of people. If you want to convey nuanced information to everyone, I think that's impossible, and I don't want to lose the nuance.
If people won't read a simple AND, there's no way to simplify a nuanced position, which will be much more complex, enough that people in general will follow it. So it's a choice between conveying nuance to the people who will actually read and dropping the nuance because most people don't read.
But it's different if you run a large org. If you send out a nuanced message and 5% of people get it and 95% of people do contradictory things because they understood different parts of the message, that's a disaster.
I see this all the time when VPs try to convey nuance.
BTW, this is why, despite being widely mocked, "move fast & break things" can be a good value. It conveys which side of the trade-off people should choose.
A number of companies I know of have put velocity & reliability/safety/etc. into their values and it's failed every time.
MS leadership eventually changed the message from velocity to reliability
First one message, then the next. Not both at once
When I checked a while ago, measured by a 3rd party, Azure reliability was above GCP and close enough to AWS that it stopped being an existential threat
Azure has, of course, also lapped Google on enterprise features & sales and is a solid #2 in cloud despite starting with infrastructure that was a decade behind Google's, technically.
I can't say that I enjoyed working for Azure, but I respect the leadership and learned a lot.
• • •
One thing I find interesting is how much tuning of config variables is necessary to get "good" application-level performance.
C-states, P-states, turbo, and prefetchers (general, "buddy", etc.) at the CPU level; GC configs at the runtime level; retry policies and thread pool sizes at the application level; and so on.
A doc I saw recently advocated for Go over Java because it has fewer GC config knobs, on the grounds that incorrect GC configs are a major cause of incidents. But not having knobs doesn't solve the problem; it's equivalent to picking an arbitrary config.
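To make the knob-count comparison concrete, here's a sketch of what GC configuration looks like on each side. The flag values and service names are illustrative assumptions, not recommendations; these are real JVM and Go runtime settings, but the "right" values depend entirely on the workload:

```shell
# Java: many interacting GC knobs. A latency-sensitive service might
# ship with something like (service.jar is a placeholder):
java -XX:+UseG1GC \
     -XX:MaxGCPauseMillis=200 \
     -XX:InitiatingHeapOccupancyPercent=45 \
     -XX:G1ReservePercent=10 \
     -Xms8g -Xmx8g \
     -jar service.jar

# Go: essentially two knobs -- GOGC (how aggressively to collect)
# and, since Go 1.19, GOMEMLIMIT (a soft memory cap):
GOGC=100 GOMEMLIMIT=8GiB ./service
```

Note that Go's defaults (e.g., GOGC=100) are themselves just a config someone picked for you, which is the point above: fewer knobs means less to misconfigure, but it doesn't mean the chosen config fits your workload.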
Some large companies have automatic config tuning frameworks, but very few companies are large enough to justify a full-stack optimization setup where apps get scheduled onto hosts whose machine-level configs are compatible with the application.
It's really interesting to be on the other side of this.
Because I'm around programmers, I usually hear programmers explain how some other field is simple, but in this thread we have the opposite: someone from another field explaining how programming/engineering is simple.
IMO, one of the most common mistaken beliefs I see is the belief that, outside of one's own field, the world is understood.
It's easy to see that, in one's own field, the world is not understood but, for some reason, people don't realize this is also true of other fields.
An example from the hardware world is Intel's Copy Exact methodology, where they tried to make every fab identical.
It's not obvious this is a good idea, because making every fab identical is very expensive, since conditions (land, etc.) are different in every location where you have a fab.
Despite increased centralization over the past 20 years, the internet feels a lot more like the wild west to me in many ways, e.g., the Google index hasn't kept up with the size of the internet, so an increasingly large fraction of the web is undiscoverable via search.
Even 10 years ago, I could basically always find old blog posts I'd read with Google.
Now, an exact string match search with site:[URL] frequently doesn't turn up the result and I have to wget the page and grep for what I'm looking for.
If the site's too large to wget and it doesn't have a custom index, I frequently can't find the page. Large commercial sites, like Twitter, sometimes build complete indexes, but it's a non-trivial effort to index something that's even 1/100th the size of Twitter, so most don't.
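The mirror-and-grep workaround above can be sketched as follows. The URL and search phrase are made up for illustration; the mirror step is faked with a local directory here so the search step is actually runnable without network access:

```shell
# In practice you'd mirror the site first, e.g. (placeholder URL):
#   wget --mirror --no-parent --convert-links https://example.com/blog/
# Fake a tiny mirror so the search below has something to run against:
mkdir -p mirror/blog
printf '<html><body>an old post about tail latency</body></html>\n' \
    > mirror/blog/post1.html

# Exact-phrase search over the mirrored pages:
# -r recursive, -i case-insensitive, -l print matching file names only
grep -ril "tail latency" mirror/
# prints: mirror/blog/post1.html
```

This is exactly the kind of brute-force local index that works fine for a personal blog but stops being practical once the site is too large to mirror.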
One thing I really liked about it was that it suggests/summarizes actionable ways to check your own thinking, which I found useful even when it was discussing a way of thinking I've had since I was a kid, because I still mess up all the time and having concrete checks helps.
Another is that it does a really good job of laying out the case for various ways of thinking. There are 6 blog posts that were on my to-do list that I think I don't need to write anymore since the book describes what I wanted to describe, but better than I would've done it.
I feel like "regretted attrition" is a curiously bad stat to track considering how widely used it is.
For one thing, it undercounts "attrition we shouldn't have had" by ignoring second-order effects that cause people to become "unregretted".
When I've worked in orgs or companies that have low total attrition (~5%), non-regretted attrition has been something like 1% or sometimes as high as 2%.
When regretted is ~15%, non-regretted will be 5% to 10%. Most of that 5% to 10% wouldn't have been non-regretted in a good org.
The same things that cause regretted attrition also cause people to burn out and do work that allows the company to call the attrition "non-regretted", but it's only non-regretted if you want to operate a company that sets people up to burn out and lose motivation.
One thing I've wondered about for a long time is why I fail interviews at such a high rate, e.g., see danluu.com/algorithms-int….
People who've mock interviewed me have a variety of theories, but I don't think any of them are really credible, so I'm going to wildly speculate.
The most commonly suggested reason is that I get nervous and that's the problem, which people believe because I do fine in their mock interviews.
That's a contributing factor, but I only get nervous because I've failed so many interviews, and I didn't use to get nervous, so there must be at least one other cause.
Another explanation that's consistent with the evidence is that when I say something "stupid sounding", people who mock interview me (and know me) assume it isn't stupid, whereas interviewers assume it is stupid, e.g.,