If you think this is even a controversial statement, you have never dealt with observability at scale OR you have done it wastefully and poorly.
When things are normal, you care about the shapes and trends and representative samples. When things are not or may not be, you care about every gorram detail of every thing.
You need a lot less than you think for confidently mapping patterns and trends.
I cannot predict all the questions I may need to ask. So I have to store raw events, not rollups. Period.
1) store aggregates (this is not observability, you can’t drill down)
2) limit horizontally (capture less context)
3) limit vertically (sample the boring stuff)
4) have infinite money and capacity and humans to burn
That’s why debugging is so fun and easy. ☺️🐝 You don’t have to use your intuition, or predict which details will be relevant to a future outage or question. Got context? Gather it!
No, you are implicitly choosing to drop all of a far more important type of data: literally all of your context.
Are you going to pay for an o11y stack 50x as much as prod? No?
Sample.