WINDOWS MVP MASTERCLASS:
The key to solving issues is RESETTING STATE. This is the secret to so much. It's why rebooting often works. But that's just the beginning.
Knowing the nuances gives you omnipotent superpowers. I do it often.
I've talked before, but I'm going to re-state😉 it:
I'm a senior Windows engineer at a large firm and a recognized Microsoft MVP in Datacenter Management. This thread is preparation for a talk I'm going to deliver to our own Helpdesk as part of a series on Troubleshooting Theory. I spent 10 years in Helpdesk/Sysadmin previously.
There are a variety of incepting causes to invoke Troubleshooting Theory. This one starts:
IT USED TO WORK AND NOW IT DOESN'T.
🧵
When developing software, programmers take a "state" (status of the contents of storage/memory) and drive it to another "state" – based on ASSUMPTIONS.
Things break when PROGRAMMER ASSUMPTIONS do not MESH WITH THE STATE a computer's data has ended up at.
State CAN include the state of the OS they do not control or have input on. Their assumptions on OS state also break programs, but we are focusing on self-contained issues today.
For examples of OS execution environment state causing issues, see below:
Imagine you are programming an automatic transmission.
You only expect the driver to invoke Reverse gear when they are in Park. So you do not validate the STATE of the speed.
The user taps the shift lever to put it into reverse. Your software switches to reverse at 40 kph.
No assumptions in your drivetrain account for the load of going 40kph to Reverse. They break in innumerable ways.
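To make the transmission analogy concrete, here is a minimal sketch. The class and method names are made up for illustration; the point is only that the naive path never checks the state it assumes.

```python
from dataclasses import dataclass


class UnsafeShiftError(Exception):
    """Raised when a gear change is requested in a state the code did not plan for."""


@dataclass
class Transmission:
    # Hypothetical model: just a gear letter and a speed.
    gear: str = "P"
    speed_kph: float = 0.0

    def shift_to_reverse_naive(self) -> None:
        # Programmer assumption: the car is parked.
        # Nothing checks speed_kph, so at 40 kph this still "succeeds".
        self.gear = "R"

    def shift_to_reverse_validated(self) -> None:
        # Same transition, but the assumption is checked against actual state.
        if self.speed_kph > 0:
            raise UnsafeShiftError(
                f"refusing Reverse at {self.speed_kph} kph; expected a stopped vehicle"
            )
        self.gear = "R"


# Naive path: the drivetrain ends up in a state nobody designed for.
car = Transmission(gear="D", speed_kph=40.0)
car.shift_to_reverse_naive()
naive_gear = car.gear

# Validated path: the bad transition is rejected and the gear is unchanged.
car = Transmission(gear="D", speed_kph=40.0)
try:
    car.shift_to_reverse_validated()
    validated_outcome = "shifted"
except UnsafeShiftError:
    validated_outcome = "rejected"
```

Most real programs look like the naive version somewhere, which is why state they never expected breaks them.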
Now imagine you're the technician. You have to know what to replace. Admins rarely have logical visibility. Programs are rarely segmented. Replace it all. How?
For this kind of problem resolution with no intimate visibility of mechanical operation, you need to know where all the broken parts store their state so you can replace them.
Luckily, with computers, removing all state and reinstalling forces the parts to regenerate themselves.
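A rough sketch of what "regenerate themselves" means, assuming a toy program that keeps its state in one JSON file (the file name and the defaults are invented for this example):

```python
import json
import tempfile
from pathlib import Path

# Hypothetical defaults the program writes on first run.
DEFAULTS = {"schema": 1, "queue": []}


def load_state(path: Path) -> dict:
    """Return saved state, creating it from defaults if the file is absent."""
    if not path.exists():
        path.write_text(json.dumps(DEFAULTS))
    return json.loads(path.read_text())


def reset_state(path: Path) -> None:
    """The technician move: remove the stored state entirely."""
    path.unlink(missing_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    state_file = Path(tmp) / "state.json"

    first = load_state(state_file)            # fresh install: defaults are written
    state_file.write_text("{not valid json")  # state drifts into something unexpected

    try:
        load_state(state_file)
        broken = False
    except json.JSONDecodeError:
        broken = True                         # the program's assumption no longer holds

    reset_state(state_file)
    recovered = load_state(state_file)        # the program regenerates its own state
```

Nothing in the program was "repaired". The bad state was removed and the normal first-run path did the rest.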
(Yes I know mixing the idea of broken gears and state isn't a perfect analogy. I'm writing this live as you watch, I'll take these learnings for the job I get cash money for. Thanks for being my intellectual sacrifice just go with it for now thanks.)
Uninstalling and reinstalling a program will sometimes work. The issue is that programmers are rarely experts in packaging installers, or the people they pawn it off to are rarely experts in internals of the program. Or, blindly removing state is itself dangerous. So no fix.
In principle, an application developer can store state anywhere and by any method they want in a general-purpose PC. But often there's only a few things they do:
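As a starting checklist, here is a sketch that lists the usual per-application locations on Windows. This list is my own shorthand, not exhaustive, and the vendor/product names are hypothetical. It only builds strings; it does not read or delete anything. Services, scheduled tasks and databases can hold state too and are not covered.

```python
import os
from pathlib import PureWindowsPath


def candidate_state_locations(vendor: str, product: str, env=None) -> dict:
    """Return common places a Windows app may keep state, as plain strings."""
    env = dict(os.environ) if env is None else env

    def under(var: str) -> str:
        # Fall back to the literal %VAR% form when the variable is not set,
        # so the output is still readable on a non-Windows machine.
        base = env.get(var, f"%{var}%")
        return str(PureWindowsPath(base) / vendor / product)

    return {
        "files": [
            under("APPDATA"),        # roaming per-user data
            under("LOCALAPPDATA"),   # local per-user data and caches
            under("PROGRAMDATA"),    # machine-wide data
            under("PROGRAMFILES"),   # install directory
            str(PureWindowsPath(env.get("TEMP", "%TEMP%"))),  # temp files
        ],
        "registry": [
            rf"HKCU\Software\{vendor}\{product}",
            rf"HKLM\SOFTWARE\{vendor}\{product}",
            rf"HKLM\SOFTWARE\WOW6432Node\{vendor}\{product}",  # 32-bit apps on 64-bit Windows
        ],
    }


# "Contoso" and "Widget" are placeholder names for the example.
locations = candidate_state_locations("Contoso", "Widget", env={})
```

Check the vendor name as well as the product name: state is often filed under whoever wrote the software, not what it is called.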
Expanding scope, this is a good example of state.
Your mental model for how things work is a simple datastore with y/n, but there are so many prerequisites and other hidden process markers that the result is a completely unintelligible black box.
Programmers didn't anticipate it.
Please understand virtually no developer documents how to REMOVE STATE because being a programmer is not a grant of knowing how systems – even their own – behave. Culture speaks of programmers as gods. They are barely skating by – like anyone.
Your scale of experience matters.
Let's get into a real-world example from THIS week where I had to identify multiple PROGRAM STATES and neutralize them before successful operation.
Literally I am the highest Windows escalation tier for a major firm. I am teaching you exactly what it takes to do what I do.
Okay, I'm going to brief you like a senior engineer then break it down.
We have an application that uses an IIS website it configures to communicate with a SQL Express database, in concert with a Windows service that takes its commands and provides additional coordination logic.
Due to the size of your environment and underlying issues in the OS where logarithmic numbers of commands fail, this application is filling the 10GB allotment for SQL Express databases, at which point it enters unsupported territory. The following is your mission as a Sr Eng:
You install+login to SQL Management Studio and survey every table. You gain an understanding of how the app works and queues jobs.
It's triggering exponential jobs you cannot cancel without deep SQL surgery that isn't worth it.
Via SQL syntax you export the table of custom settings.
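The export itself is nothing fancy. Here is a minimal sketch of the idea, using Python's built-in sqlite3 as a stand-in for SQL Express so it runs anywhere. The table name, columns and values are invented, and the real export was done with SQL tooling, not this script.

```python
import csv
import io
import sqlite3


def export_table(conn, table: str, out) -> int:
    """Write every row of `table` to `out` as CSV, header first. Returns the row count."""
    cur = conn.cursor()
    # Table names cannot be bound as query parameters; only pass trusted names here.
    cur.execute(f'SELECT * FROM "{table}"')
    writer = csv.writer(out)
    writer.writerow([col[0] for col in cur.description])  # column names
    rows = cur.fetchall()
    writer.writerows(rows)
    return len(rows)


# Stand-in database with a hypothetical settings table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CustomSettings (name TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO CustomSettings VALUES (?, ?)",
    [("RetryLimit", "5"), ("Region", "EU")],
)

buffer = io.StringIO()
exported = export_table(conn, "CustomSettings", buffer)
conn.close()
```

The point is to get the one piece of state you care about out of the database before you destroy everything else.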
You uninstall the application and delete the SQL database. You reinstall, but the setup detects SQL is already installed and does not perform any SQL configuration, leading to launching a broken IIS website. You clean out all traces of the SQL install.
You uninstall and try to install again. It's still not performing any SQL action. You enable command line logging on the server to see if it's even launching the SQL installer. It's not. There's still a trace somewhere that's causing it to skip SQL.
You uninstall anything SQL and delete all SQL registry keys.
You run the product installer with /? on the command line to see the options and get a better idea of its internals. There's an extract option. You extract and browse through the payload for a better understanding. You attempt to launch the installer with a verbose UI option you found.
Success! It's prompting to install SQL and goes through that process. Now the IIS website installs... and it gives a credential error.
Your previous install of the application must have cached a local DB user password. Yes, it was in the registry under an obscure name of the developer.
Using the knowledge you've gained you uninstall product and SQL, wipe registry of SQL and product, and clean up temp folders and program directory for good measure.
You install again and it works! Congrats you have solved the issue. Now you use the exported config to reimplement.
Now, I did have some help: the developer had some documentation about using a custom SQL server that gave me some clues on how it detected whether SQL was already there.
But everything else came entirely from past experience banging my head against similar problems and knowing where state could be stored.
Your job with an IT troubleshooting mental toolkit is to keep expanding the possibilities you are aware of and can check.
For example, knowing basic SQL syntax let me extract some critical configuration from the broken application. If I didn't know that, I'd have been stuck.
Now, it's very easy to get in over your head very quickly. Into situations that you end up not being able to solve.
This is WHY you want to OWN SMALL, LOW CRITICALITY SERVICES as early in your career as possible. Adopt literally anything nobody wants to run or maintain.
Normally there are logs you should become intimately familiar with. Unfortunately, many Windows installers are... black boxes to some extent. You can extract the logic sometimes or get debug logging turned on if you can pass from EXE to MSI. This one was too complex.
This thread was not laid out perfectly but taught me a bunch of stuff on how to approach this in the future.
I will respond with future State scenarios as they arise.
tl;dr people end up reinstalling Windows because a program's state is broken and they don't know how to figure out how to reset it so they reset the entire system's State.
But sometimes Windows is just fucked. It's a flood title. Microsoft will tell you to toss it if you call them.
A good way to get a feeling for where Windows applications keep temp files and state is Winapp2.ini
Search any notable application name and see where _SOME_ of its data is kept. This includes cache and sometimes state.
Note that Winapp2.ini tries to be as safe as possible, BUT YOU CAN SERIOUSLY MESS UP YOUR WINDOWS INSTALL JUST DELETING EVERYTHING IT IDENTIFIES. The stuff it deletes is sometimes not designed to be deleted, so you are entering unknown state. But it's a great learning opportunity.
I recently had a pernicious issue with Win11 Quick Access list giving errors trying to edit it.
I knew BleachBit+Winapp2.ini could clear those registry keys. I used it, it RESET THE STATE, WINDOWS REINITIALIZED THE QUICK ACCESS LIST STATE, and the problem was solved.
Windows would be such a better OS if you could reinitialize components in a supported way.
This is an example of where using THE LEVERS YOU HAVE to cause the program to re-run through various state calculations can produce desirable effects!
You don't need a debugger you just need to understand broken program state can be self-correcting by taking it through its paces.
Imagine how early in the game, and in the save file, your character's gender is stored. Who knows what kind of re-pulls and re-calculations of data are caused when you change your gender 30 hours later.
Of course this can also introduce a whole other raft of state issues...
Fun fact: Windows stores whether these checkboxes show as checked or unchecked in "Visual Effects" in a DIFFERENT REGISTRY VALUE than the one holding the computed bitmask Windows ACTUALLY draws from.
This screen does NOT SHOW YOU THE ACTUAL STATE. It can become inconsistent.
The exposed surface of a program's state to you as a user is not necessarily driven by the ACTUAL logic.
However, manipulating the exposed controls can cause the program to re-run the actual logic. That's not always reliable either, but it's a possibility you can try.
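Here is a toy model of that split between displayed state and actual state. The key names and the single-bit mask are invented for the example and do not correspond to real registry values.

```python
class VisualEffectsModel:
    """Toy model: the value the UI shows and the value the renderer reads are stored separately."""

    def __init__(self):
        # Hypothetical storage: one value for the checkbox, one for the computed mask.
        self.store = {"ui_checkbox": True, "computed_mask": 0b1}

    def displayed(self) -> bool:
        # What the settings screen shows you.
        return self.store["ui_checkbox"]

    def effective(self) -> bool:
        # What the drawing code actually consults.
        return bool(self.store["computed_mask"] & 0b1)

    def set_via_ui(self, enabled: bool) -> None:
        # The UI path writes the checkbox AND recomputes the mask.
        self.store["ui_checkbox"] = enabled
        self.store["computed_mask"] = 0b1 if enabled else 0b0


model = VisualEffectsModel()

# Something else changes only the real value; the screen still shows "checked".
model.store["computed_mask"] = 0b0
inconsistent = model.displayed() != model.effective()

# Toggling the exposed control off and on re-runs the logic that writes both values.
model.set_via_ui(False)
model.set_via_ui(True)
consistent_again = model.displayed() and model.effective()
```

In this sketch the toggle always repairs the mismatch. In a real program the UI path may not rewrite everything, which is why it's something to try, not a guarantee.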
The thing about Active Directory, is you can't understand any of it unless you begin from the past before it. You cannot examine it from the future. You will get only nonsensicals.
And that's really where most commentators fail. They don't know why. Because there is a reason.
The reasons Active Directory fails is deeper than technology. It is from inception, to ironically be more open than you conceive. It is the sourcing of philosophy in staff whose only job was one portion. Whose users, absolute experts. Whose salary paid one. This... didn't happen.
Active Directory is truly beautiful. But it's a beauty you can only experience in the world it was envisioned for. Outside, it is a horror of hacks trying to address things you can only ascribe hate. Decades later. But trust me, it is beautiful. I wish you could see it, how I do.
I live on a secluded area of my street with little traffic, but I purposefully make my surveillance evident, and you know what? Every dog walker picks up their poop.
👏Always👏be👏engineering👏perception👏
Even on the gate I don't lock, I have a fake lock that makes it appear always padlocked. I have spike strips that are just plastic on areas where you could boost over my fence.
I do the same thing in enterprise security. We appear to have three different top-tier antivirus, running on a malware analysis VM, with debug tools running, and more traces like that.
This is your playground they're in and stop denying yourself the freedom to fake it.
One of the most interesting artifacts of Windows was in Vista, when they laid out their most optimistic dreams of how what they built would be used. A real tragedy, reading how they hoped the troubleshooting framework would be adopted for proactive remediation. It was just killed.
Windows has only had a few true revolutions. 95, NT, 2000 Server (Active Directory), XP, Vista, and 8.
Windows 7, Windows 10, they are the inheritors of surviving the revolution. They are the good times. Unfortunately I don't know what Windows 11 is.
What the common person doesn't understand is that Windows is the only OS on Earth that does what it does. The support matrix for Windows 10 is the most profound and mathematically extreme in human history.
Windows 11 was a hard-cut. A cruel one. One you'd never understand why.
==Training Lesson==
INVESTIGATION NARRATIVE: SSH Kill la Killed 🧵
My job is to solve the Weird Problems as the Final escalation tier. I do this with generalist knowledge and practical experience.
New InfoSec/IT entrants often ask what this looks like in practice. Follow below.
NOTE: You can mute this thread if not interested it will be long.
I have a seedbox in Europe to coalesce torrent downloads from other servers, with a 10GbE uplink to many other similar colocated servers hosting the content. I then collect finished files over SSH file copy at my leisure.
In some scenarios you can increase overall transfer speeds by running multiple sessions simultaneously, like a multi-lane highway. This can help saturate your connection, which I was not getting.
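A minimal sketch of the multi-lane idea, assuming a local file copy as a stand-in for one SSH transfer session (the real thing ran over SSH; the file names and the lane count here are arbitrary):

```python
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def copy_one(src: Path, dest_dir: Path) -> Path:
    # Stand-in for a single transfer session; swap in a real transfer here.
    return Path(shutil.copy2(src, dest_dir / src.name))


def copy_parallel(sources, dest_dir: Path, lanes: int = 4):
    """Run up to `lanes` transfers at once and return the destination paths."""
    with ThreadPoolExecutor(max_workers=lanes) as pool:
        return list(pool.map(lambda s: copy_one(s, dest_dir), sources))


with tempfile.TemporaryDirectory() as tmp:
    src_dir = Path(tmp) / "remote"
    dst_dir = Path(tmp) / "local"
    src_dir.mkdir()
    dst_dir.mkdir()

    # Eight small dummy files to move.
    sources = []
    for i in range(8):
        part = src_dir / f"part{i}.bin"
        part.write_bytes(bytes([i]) * 1024)
        sources.append(part)

    copied = copy_parallel(sources, dst_dir, lanes=4)
    copied_names = sorted(p.name for p in copied)
    sizes_match = all(p.stat().st_size == 1024 for p in copied)
```

Whether more lanes actually helps depends on where the bottleneck is, so treat the lane count as something to experiment with.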
In 2009, I got on a helicopter piloted by my friend. We lifted off with careless abandon, in the online mode of Grand Theft Auto 4, for the first time. We were normally talkative, but we both fell into wordlessness as we flew at night through this impossible city. And I realized.
Every story can be told here. Labor of untold people who toiled to Truman Show you made a city we flew by with only glance. On the streets, raced-by. There are innumerable conceits, things started and never finished. Left over from dreams aborted. But someone made this. For what?
A city never runs out of stories. A city is not reorganized for every allegorical plunder. The artists who strained for years to make this analogy have their effort thrown away on conclusion of an arc written by another or abandoned by player. But they made a city. For what?
So my outsider impression is all cloud AI services have essentially nuked themselves in endless layers of safety and political conformance, while also desperately trying to save on compute. If you've watched o1 work it has layers of reasoning for "safety" before it answers.
And that cloud AI is essentially in a death spiral of mainstreaming concerns instead of delivering. Yes you've created a corpus of the sins of humanity and you're not remotely brave enough to just be a fucking adult about what your API returns.
The Google AI disaster is just the essential denial of how this technology works. It literally delivers the average signal. The proctologist is going to be an old white guy. That's the average. And you've taken it on yourself to deny this technology you built to say exactly that.