Making it so that many things can break before users are ever impacted is the goal. Making it so that any user impact is glaringly obvious and easy to identify, confirm, and mitigate is the goal.
Now we just need to expand that a smidge...and develop thru the lens of your instrumentation in production. Build for reality, not a simulacrum.
Reality is code plus architecture and infrastructure, time and elapsed time, dependencies, method of deployment, user activity, and any other concurrent activity.
What they can't do is tell you how much to trust that confidence, or how easy it will be to validate or track down any bugs, or how many users a bug actually impacts, and on and on. You need prod.
The *overwhelming majority* of bugs are far too small and subtle to trip a monitoring check and page someone. (Thank God.)
The answer is to go and look at the shit you just deployed, thru the instrumentation you shipped with it, and verify it is working as you intended.
You might make sure your instrumentation is capturing column data type, before size and after size, a was_compressed flag, time elapsed compressing, compression format, and so on.
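Something like this, say. (The field names, logger, emit_event helper, and zlib stand-in below are my own sketch, not a real schema.)

```python
# Purely illustrative: one structured event per column processed.
import json
import logging
import time
import zlib

logger = logging.getLogger("compaction")

def emit_event(event: dict) -> None:
    """Ship one structured event to your logging/observability pipeline (hypothetical helper)."""
    logger.info(json.dumps(event))

def compress_column(column_name: str, data_type: str, raw: bytes) -> bytes:
    start = time.monotonic()
    compressed = zlib.compress(raw)  # stand-in for whatever format the job actually uses
    emit_event({
        "column": column_name,
        "column_data_type": data_type,
        "before_size_bytes": len(raw),
        "after_size_bytes": len(compressed),
        "was_compressed": True,
        "compression_seconds": time.monotonic() - start,
        "compression_format": "zlib",
    })
    return compressed
```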
1) Obviously, you can watch for elevated errors in the newer version. But also (see the sketch after this list):
2) is it running only on the right data types?
3) is it reclaiming space?
4) what errors or warnings is it generating?
5) look for some data that should be skipped or bailed on. Is it actually being skipped?
6) go look at some data from the perspective of a user. Does it look ok?
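If those events land somewhere queryable, most of that checklist is a handful of aggregations. Roughly like this, assuming the field names from the sketch above plus a hypothetical error field:

```python
# A sketch of turning the per-column events back into answers,
# assuming you can pull them out of your tooling as a list of dicts.
from collections import Counter

def summarize(events: list) -> dict:
    """Aggregate per-column events into the answers you actually care about."""
    reclaimed = sum(
        e["before_size_bytes"] - e["after_size_bytes"]
        for e in events
        if e.get("was_compressed")
    )
    return {
        # 2) is it running only on the right data types?
        "data_types_seen": dict(Counter(e["column_data_type"] for e in events)),
        # 3) is it reclaiming space?
        "bytes_reclaimed": reclaimed,
        # 4) what errors or warnings is it generating?
        "errors": [e for e in events if e.get("error")],
        # 5) is the should-be-skipped data actually being skipped?
        "skipped_columns": [e["column"] for e in events if not e.get("was_compressed")],
    }
```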
And if you know you can use feature flags to immediately enable/disable the code, and history tells you that most bugs are caught swiftly and trivially... well... this is a bad example 😬
BUT! I would totally ship this instrumentation in "dry run" mode and let it run for the weekend to see what it WOULD do. 🥰
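The dry-run wrapper can be embarrassingly small. Here's a rough sketch, where `live` stands in for whatever feature-flag check you'd actually use and `store` for whatever persists the column (both hypothetical):

```python
# Sketch of dry-run mode: do the full pass and emit the full
# instrumentation either way; only write back when the flag is live.
import json
import logging
import zlib

logger = logging.getLogger("compaction")

def maybe_compress_column(column_name: str, raw: bytes, live: bool, store) -> None:
    compressed = zlib.compress(raw)
    logger.info(json.dumps({
        "column": column_name,
        "before_size_bytes": len(raw),
        "after_size_bytes": len(compressed),
        "dry_run": not live,
    }))
    if live:
        store.write(column_name, compressed)
```

Leave the flag off over the weekend and the events show exactly what it would have compressed, skipped, and reclaimed, before a single byte gets touched.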