It is cheap and easy to find known unknowns, so we should check for them CONSTANTLY.
Every known unknown is a support problem. A pattern-matching problem, an index-lookup problem.
Unless you are a world-class SRE team, your team can handle *maybe* one per week and still ship some code.
If the system recovered on its own somehow, you'd shrug uneasily and get back to work, never actually knowing what happened.
You'd document the symptoms and rely on people's memory, instead of instrumenting the system to be clearer and saner.
And your system would sink a little further into the bog of unintelligibility every time. But I get it -- I've been there. Doing things well takes time.
Write tests to check for software regressions. Write monitoring checks to check for system regressions. Run them a lot.
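
To make "monitoring checks for system regressions" concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the metric names, the thresholds, and the `Check` and `run_checks` helpers are hypothetical, not taken from any particular monitoring tool. The idea is only that each known failure mode becomes a cheap, named lookup you can run over and over.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Check:
    """One known failure mode: a metric plus the condition that means trouble."""
    name: str
    metric: str
    is_bad: Callable[[float], bool]


# Hypothetical known unknowns: failure modes we already know to look for.
# The metric names and thresholds are made up for this example.
CHECKS = [
    Check("disk nearly full", "disk_used_pct", lambda v: v >= 90.0),
    Check("replication lagging", "replica_lag_seconds", lambda v: v > 30.0),
    Check("error rate elevated", "http_5xx_per_min", lambda v: v > 50.0),
]


def run_checks(metrics: Dict[str, float],
               checks: Optional[List[Check]] = None) -> List[str]:
    """Return a description of every check that fires for this snapshot.

    A missing metric counts as a failure: a check that silently stops
    reporting is itself a regression worth knowing about.
    """
    failures = []
    for check in (CHECKS if checks is None else checks):
        value = metrics.get(check.metric)
        if value is None:
            failures.append(f"{check.name} (no data for {check.metric})")
        elif check.is_bad(value):
            failures.append(f"{check.name} ({check.metric}={value})")
    return failures


if __name__ == "__main__":
    # In practice the snapshot would come from your metrics store;
    # here it is hard-coded so the sketch runs on its own.
    snapshot = {"disk_used_pct": 95.0, "replica_lag_seconds": 1.5}
    for failure in run_checks(snapshot):
        print("FAILING:", failure)
```

Something like this could run from cron or a scheduler every minute or so. Each check is a plain threshold comparison, which is why it is cheap enough to run a lot.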