WINDOWS MVP MASTERCLASS:
The key to solving issues is RESETTING STATE. This is the secret to so much. It's why rebooting often works. But that's just the beginning.
Knowing the nuances gives you omnipotent superpowers. I do it often.
I've talked about this before, but I'm going to re-state😉 it:
I'm a senior Windows engineer at a large firm and a recognized Microsoft MVP in Datacenter Management. This thread is preparation for a talk I'm going to deliver to our own Helpdesk, as part of a series on Troubleshooting Theory. I spent 10 years in Helpdesk/Sysadmin previously.
There are a variety of starting points that call for Troubleshooting Theory. This one starts:
IT USED TO WORK AND NOW IT DOESN'T.
🧵
When developing software, programmers take a "state" (status of the contents of storage/memory) and drive it to another "state" – based on ASSUMPTIONS.
Things break when PROGRAMMER ASSUMPTIONS do not MESH WITH THE STATE a computer's data has ended up at.
State CAN include the state of the OS they do not control or have input on. Their assumptions on OS state also break programs, but we are focusing on self-contained issues today.
For examples of OS execution environment state causing issues, see below:
Imagine you are programming an automatic transmission.
You only expect the driver to invoke Reverse gear when they are in Park. So you do not validate the STATE of the speed.
The user taps the shift lever to put it into reverse. Your software switches to reverse at 40 kph.
No assumptions in your drivetrain account for the load of going 40kph to Reverse. They break in innumerable ways.
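Here's a minimal Python sketch of that transmission example. It's a toy model I'm making up for illustration (the class and method names are hypothetical, not from any real drivetrain code). The unchecked shift trusts the assumption; the checked shift validates the state the assumption depends on.

```python
from dataclasses import dataclass


@dataclass
class Transmission:
    """Toy model: gear and speed together are the 'state'."""
    gear: str = "P"
    speed_kph: float = 0.0

    def shift_unchecked(self, gear: str) -> None:
        # Assumes the car is stopped; never looks at speed_kph.
        self.gear = gear

    def shift_checked(self, gear: str) -> bool:
        # Validates the state before acting on the assumption.
        if gear == "R" and self.speed_kph > 0:
            return False  # refuse: state does not match the assumption
        self.gear = gear
        return True


# The programmer's assumption meets a state it never expected.
car = Transmission(gear="D", speed_kph=40.0)
car.shift_unchecked("R")
print(car.gear, car.speed_kph)  # reverse engaged while moving

# Same state, but the shift validates speed first and refuses.
safe = Transmission(gear="D", speed_kph=40.0)
print(safe.shift_checked("R"), safe.gear)
```

The second version doesn't fix anything that is already broken. It only shows where the missing validation would have lived.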
Now imagine you're the technician. You have to know what to replace. Admins rarely have logical visibility. Programs are rarely segmented. Replace it all. How?
For this kind of problem resolution with no intimate visibility of mechanical operation, you need to know where all the broken parts store their state so you can replace them.
Luckily, with computers, removing all state and reinstalling forces the parts to regenerate themselves.
(Yes I know mixing the idea of broken gears and state isn't a perfect analogy. I'm writing this live as you watch, I'll take these learnings for the job I get cash money for. Thanks for being my intellectual sacrifice just go with it for now thanks.)
Uninstalling and reinstalling a program will sometimes work. The issue is that programmers are rarely experts in packaging installers, or the people they pawn it off to are rarely experts in internals of the program. Or, blindly removing state is itself dangerous. So no fix.
In principle, an application developer can store state anywhere and by any method they want on a general-purpose PC. But in practice there are only a few places they tend to use:
-AppData\Local
-AppData\Roaming
-ProgramData
-HKCU\Software
-HKLM\Software
-Temp
-System Temp
-HKLM Service Parameter
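The list above can be turned into a quick checklist generator. This is a rough sketch, not a tool: it only builds strings to go look at, it deletes nothing, and it ASSUMES the app uses a folder or key named after itself (many use Vendor\Product or something more obscure). The function name and the "ExampleApp" name are made up. The environment variables and the Services\...\Parameters registry layout are the standard Windows ones.

```python
import ntpath

# Folder templates for the usual places. {app} is a vendor or product name.
FOLDER_TEMPLATES = [
    r"{LOCALAPPDATA}\{app}",     # AppData\Local
    r"{APPDATA}\{app}",          # AppData\Roaming
    r"{PROGRAMDATA}\{app}",      # ProgramData
    r"{TEMP}\{app}",             # per-user Temp
    r"{SYSTEMROOT}\Temp\{app}",  # system Temp
]
REGISTRY_TEMPLATES = [
    r"HKCU\Software\{app}",
    r"HKLM\Software\{app}",
    r"HKLM\SYSTEM\CurrentControlSet\Services\{app}\Parameters",
]


def candidate_state_locations(app, env):
    """Return places worth checking for an app's state. Read-only."""
    values = {name.upper(): value for name, value in env.items()}
    values["app"] = app
    found = []
    for template in FOLDER_TEMPLATES:
        try:
            found.append(ntpath.normpath(template.format_map(values)))
        except KeyError:
            continue  # that environment variable is not set here
    for template in REGISTRY_TEMPLATES:
        found.append(template.format(app=app))
    return found


# Fake environment so the sketch runs anywhere; on a real box pass os.environ.
fake_env = {
    "LOCALAPPDATA": r"C:\Users\alice\AppData\Local",
    "APPDATA": r"C:\Users\alice\AppData\Roaming",
    "ProgramData": r"C:\ProgramData",
    "TEMP": r"C:\Users\alice\AppData\Local\Temp",
    "SystemRoot": r"C:\Windows",
}
for place in candidate_state_locations("ExampleApp", fake_env):
    print(place)
```

Treat the output as a list of places to inspect by hand before you remove anything.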
Expanding scope, this is a good example of state.
Your mental model for how things work is a simple datastore with y/n, but there are so many prerequisites and other hidden process markers that the result is a completely unintelligible black box.
Programmers didn't anticipate it.
Please understand virtually no developer documents how to REMOVE STATE because being a programmer is not a grant of knowing how systems – even their own – behave. Culture speaks of programmers as gods. They are barely skating by – like anyone.
Your scale of experience matters.
Let's get into a real-world example from THIS week where I had to identify multiple PROGRAM STATES and neutralize them before successful operation.
Literally I am the highest Windows escalation tier for a major firm. I am teaching you exactly what it takes to do what I do.
Okay, I'm going to brief you like a senior engineer then break it down.
We have an application that uses an IIS website it configures to communicate with a SQL Express database, in concert with a Windows service that takes its commands and provides additional coordination logic.
Due to the size of your environment and underlying issues in the OS where logarithmic numbers of commands fail, this application is filling the 10GB allotment for SQL Express databases, at which point it enters unsupported territory. The following is your mission as a Sr Eng:
You install and log in to SQL Management Studio and survey every table. You gain an understanding of how the app works and queues jobs.
It's triggering exponential jobs you cannot cancel without deep SQL surgery that isn't worth it.
Via SQL syntax you export the table of custom settings.
You uninstall the application and delete the SQL database. You reinstall, but the setup detects SQL is already installed and does not perform any SQL configuration, leading to launching a broken IIS website. You clean out all traces of the SQL install.
You uninstall and try to install again. It's still not performing any SQL action. You enable command line logging on the server to see if it's even launching the SQL installer. It's not. There's still a trace somewhere that is making it skip SQL.
You uninstall anything SQL and delete all SQL registry keys.
You run the product installer with /? on the command line to see the options and get a better idea of its internals. There's an extract option. You extract and browse through the payload for a better understanding. You attempt to launch the installer with a verbose UI option you found.
Success! It's prompting to install SQL and goes through that process. Now the IIS website installs... and it gives a credential error.
Your previous install of the application must have cached a local DB user password. Yes, it was in the registry, under an obscure key named after the developer.
Using the knowledge you've gained you uninstall product and SQL, wipe registry of SQL and product, and clean up temp folders and program directory for good measure.
You install again and it works! Congrats you have solved the issue. Now you use the exported config to reimplement.
Now, I did have some help, in that the developer had some documentation about using a custom SQL server that gave me some clues on how it detected whether SQL was already there.
But everything else came entirely from past experience banging my head against similar problems and knowing where state could be stored.
Your job with an IT troubleshooting mental toolkit is to keep expanding the possibilities you are aware of and can check.
For example, knowing basic SQL syntax let me extract some critical configuration from the broken application. If I didn't know that, I'd have been stuck.
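To make "basic SQL syntax" concrete, here's a small sketch of dumping a settings table to CSV before you wipe anything. Big assumptions: I'm using Python's built-in sqlite3 as a stand-in so it runs anywhere (the real app was on SQL Express), and the table name app_settings and its rows are invented. The pattern is what matters: SELECT everything, keep the column names, save it somewhere outside the thing you're about to delete.

```python
import csv
import io
import sqlite3


def export_table_to_csv(conn, table):
    """Return the whole table as CSV text, header row first."""
    # Table names cannot be bound as query parameters, so validate it.
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    headers = [column[0] for column in cursor.description]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(headers)
    writer.writerows(cursor)
    return buffer.getvalue()


# Stand-in database with a hypothetical settings table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE app_settings (name TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO app_settings VALUES (?, ?)",
    [("PollInterval", "15"), ("TargetGroup", "Servers")],
)

exported = export_table_to_csv(conn, "app_settings")
print(exported)
```

With the export in hand, the database stops being precious and you're free to reset it.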
Now, it's very easy to get in over your head very quickly. Into situations that you end up not being able to solve.
This is WHY you want to OWN SMALL, LOW CRITICALITY SERVICES as early in your career as possible. Adopt literally anything nobody wants to run or maintain.
Normally there are logs you should become intimately familiar with. Unfortunately, many Windows installers are... black boxes to some extent. You can extract the logic sometimes or get debug logging turned on if you can pass from EXE to MSI. This one was too complex.
This thread was not laid out perfectly but taught me a bunch of stuff on how to approach this in the future.
I will respond with future State scenarios as they arise.
tl;dr people end up reinstalling Windows because a program's state is broken and they don't know how to figure out how to reset it so they reset the entire system's State.
But sometimes Windows is just fucked. It's a flood-title car. Microsoft will tell you to toss it if you call them.
A good way to get a feeling for where Windows applications keep temp files and state is Winapp2.ini
Search any notable application name and see where _SOME_ of its data is kept. This includes cache and sometimes state.
This can be used with @bleachbit.
raw.githubusercontent.com/MoscaDotTo/Win…
Note that Winapp2.ini tries to be as safe as possible, BUT YOU CAN SERIOUSLY MESS UP YOUR WINDOWS INSTALL JUST DELETING EVERYTHING IT IDENTIFIES. The stuff it deletes is sometimes not designed to be deleted, so you are entering unknown state. But it's a great learning opportunity.
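If you'd rather search Winapp2.ini with a script than by eye, something like this sketch works for simple entries. The sample entry is MADE UP by me in the Winapp2.ini layout (section name, Detect, FileKeyN, RegKeyN); it is not copied from the real file. The function name is hypothetical too. It only reads and lists, it deletes nothing, and the real file may contain lines that Python's configparser needs more lenient handling for.

```python
import configparser

# Invented entry in the Winapp2.ini layout, for illustration only.
SAMPLE = r"""
; Hypothetical entry, not from the real Winapp2.ini.
[Example App *]
LangSecRef=3021
Detect=HKCU\Software\ExampleVendor\ExampleApp
Default=False
FileKey1=%LocalAppData%\ExampleVendor\ExampleApp\Cache|*.*|RECURSE
FileKey2=%AppData%\ExampleVendor\ExampleApp\Logs|*.log
RegKey1=HKCU\Software\ExampleVendor\ExampleApp\RecentFiles
"""


def locations_for(ini_text, needle):
    """Map matching section names to their FileKey/RegKey values."""
    parser = configparser.ConfigParser(
        interpolation=None,   # values contain % signs
        delimiters=("=",),    # paths contain colons
        strict=False,         # tolerate duplicate sections
    )
    parser.optionxform = str  # keep key case, e.g. FileKey1
    parser.read_string(ini_text)
    found = {}
    for section in parser.sections():
        if needle.lower() not in section.lower():
            continue
        found[section] = [
            value
            for key, value in parser.items(section)
            if key.startswith(("FileKey", "RegKey"))
        ]
    return found


for section, places in locations_for(SAMPLE, "example app").items():
    print(section)
    for place in places:
        print("   ", place)
```

Read the results as "here is where this app keeps SOME of its data", not as a list that is safe to delete.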
I recently had a pernicious issue with Win11 Quick Access list giving errors trying to edit it.
I knew BleachBit+Winapp2.ini could clear those registry keys. I used it, it RESET THE STATE, WINDOWS REINITIALIZED THE QUICK ACCESS LIST STATE, and the problem was solved.
Windows would be such a better OS if you could reinitialize components in a supported way.
This is an example of where using THE LEVERS YOU HAVE to cause the program to re-run through various state calculations can produce desirable effects!
You don't need a debugger you just need to understand broken program state can be self-correcting by taking it through its paces.
Imagine how early in the game, and in the save file, your character's gender gets stored. Who knows what kind of re-pulls and re-calculations of data are triggered when you change your gender 30 hours later.
Of course this can also introduce a whole other raft of state issues...
Fun fact: Where Windows stores whether these checkboxes show as checked or unchecked in "Visual Effects" is a DIFFERENT REGISTRY VALUE than where it stores the computed bitmask Windows ACTUALLY draws from.
This screen does NOT SHOW YOU THE ACTUAL STATE. It can become inconsistent.
The exposed surface of a program's state to you as a user is not necessarily driven by the ACTUAL logic.
However, manipulating the exposed controls can cause the program to re-run the actual logic. That's not always reliable either, but it's a possibility you can try.
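Here's a toy Python model of that idea. To be loud about it: this is NOT how Windows lays out its registry values. The names, bits, and storage are invented. It only shows one store feeding the checkboxes, another feeding the real behavior, and how toggling an exposed control can make the program recompute the real one.

```python
# Hypothetical bit assignments for two made-up effects.
BITS = {"animations": 0b01, "shadows": 0b10}


class VisualEffectsToy:
    def __init__(self):
        # Two separate stores: what the UI shows vs. what actually gets used.
        self.ui_checkboxes = {"animations": True, "shadows": True}
        self.computed_mask = 0b11

    def displayed(self):
        """What the settings screen would show."""
        return dict(self.ui_checkboxes)

    def effective(self):
        """What the drawing code would actually use."""
        return {name: bool(self.computed_mask & bit) for name, bit in BITS.items()}

    def apply(self, name, checked):
        """Changing a control rewrites the UI store AND recomputes the mask."""
        self.ui_checkboxes[name] = checked
        mask = 0
        for effect, bit in BITS.items():
            if self.ui_checkboxes[effect]:
                mask |= bit
        self.computed_mask = mask


effects = VisualEffectsToy()

# Something else clobbers the computed value; the checkboxes never notice.
effects.computed_mask = 0b01
print(effects.displayed() == effects.effective())  # inconsistent

# Toggle the exposed control off and on to force a recompute.
effects.apply("shadows", False)
effects.apply("shadows", True)
print(effects.displayed() == effects.effective())  # consistent again
```

In this toy the recompute always rebuilds the mask from the checkboxes. Real programs may not, which is why this trick is worth trying but not something to count on.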
