LaurieWired
researcher @google; serial complexity unpacker; https://t.co/Vl1seeNgYK ex @ msft & aerospace
Oct 17 4 tweets 3 min read
You’re (probably) measuring application performance wrong.

Humans have a strong bias for throughput.

"I can handle X requests per second."

Real capacity engineers use response-time curves.
It all comes down to queueing theory.

Unfortunately, computers don’t degrade gracefully under load.

70% CPU is smooth sailing. 95% is a nightmare.

Programmers (incorrectly) focus on the absolute value, when really they should be looking at the derivative.
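A minimal sketch of why, assuming the textbook M/M/1 queueing model and a hypothetical 10 ms service time. Mean response time is R = S / (1 − ρ), which explodes as utilization ρ approaches 1:

```python
# M/M/1 queueing: mean response time R = S / (1 - rho).
# S = service time, rho = utilization.
service_time_ms = 10.0  # hypothetical per-request service time

for utilization in (0.50, 0.70, 0.90, 0.95, 0.99):
    response_ms = service_time_ms / (1.0 - utilization)
    print(f"{utilization:.0%} busy -> {response_ms:7.1f} ms mean response")
```

Throughput looks fine the whole way up; latency at 95% is 6x worse than at 70%.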
Oct 14 4 tweets 2 min read
GPU computing before CUDA was *weird*.

Memory primitives were graphics-shaped, not computer-science-shaped.

Want to do math on an array? Store it as an RGBA texture.

A fragment shader does the processing. *Paint* the result into a big rectangle.
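A rough sketch of the old pattern in plain NumPy (no real GPU here; the names are illustrative): pack the array into RGBA texels, run one small function per texel like a fragment shader pass, then read the "painted" result back.

```python
import numpy as np

# 1. Pack a flat float array into an RGBA "texture": 4 values per texel.
data = np.arange(32, dtype=np.float32)
texture = data.reshape(-1, 4)            # rows are texels: R, G, B, A

# 2. The "fragment shader": the same tiny function runs on every texel.
def fragment_shader(texel):
    return texel * 2.0 + 1.0             # elementwise math, no branching

# 3. "Paint" the quad: apply the shader per texel, then read back the
#    render target as a flat array again.
result = np.apply_along_axis(fragment_shader, 1, texture).reshape(-1)
print(result[:8])
```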
As you hit the more theoretical sides of Computer Science, you start to realize almost *anything* can produce useful compute.

You just have to get creative with how it’s stored.

The math might be stored in a weird box, but the representation is still valid.
Oct 13 5 tweets 2 min read
Colleges do a terrible job of teaching C++.

It’s not “C with Classes”. Injected into curricula as a demonstration of early CS concepts, it leaves many with a sour taste.

Students later fall in love with the first language that *doesn’t* feel that way.

Admittedly, professors are in a tough spot.

To teach the concept, you fundamentally have to constrain the scope of the language. Many schools choose C++ out of practicality.

Controversially, I think toy languages that *aren't* industry standards are better suited for this.
Oct 3 4 tweets 2 min read
DDR5 is unstable garbage.

Max out your memory channels? Flaky.
Temperature a bit too hot? Silent Throttle with no logs.
Too “Dense” of a stick? Good luck training.

Last gen was rock solid by comparison. Here's what happened.
More than ever, manufacturers have been pushing memory to the absolute limits.

JEDEC, the standards committee, is pretty conservative.

Yet the moment DDR5 launched, everyone threw JEDEC out the window.

Intel + AMD's memory controllers were *not* ready to handle it.
Oct 2 4 tweets 3 min read
Virtual Machines render fonts. It’s kind of insane.

TrueType has its own instruction set, memory stack, and function calls.

You can debug it like assembly. It’s also exploitable:
Anytime you can run code (albeit very limited code), someone will take advantage of it.

TrueType (TT) is unfortunately famous for many Windows Kernel zero days.

TT has bounded memory, so it's not Turing-complete…but you can still do crazy things with it.
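To make that concrete, here's a toy stack machine in the same spirit (a simplified sketch, not the real bytecode format; actual TT opcodes like PUSHB, ADD, and DUP operate on 26.6 fixed-point values):

```python
# Toy TrueType-style hinting VM: a stack, opcodes, debuggable like asm.
def run(program):
    stack = []
    for op, *args in program:
        if op == "PUSH":
            stack.extend(args)            # PUSHB: push constants
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)           # pop two values, push the sum
        elif op == "DUP":
            stack.append(stack[-1])       # duplicate the top of stack
    return stack

print(run([("PUSH", 64, 32), ("ADD",), ("DUP",)]))  # -> [96, 96]
```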
Oct 1 4 tweets 3 min read
This processor doesn’t (officially) exist.

Pre-production Engineering Samples sometimes make it into the grey market.

Rarer still are Employee Loaner Chips. Ghosts abandoned before ever becoming products:
A few days ago, someone found an Intel Pentium Extreme 980.

No laser etched model number; just some scribbled sharpie.

In 2004, Intel (very publicly) canceled the 4 GHz Pentium 4…yet here it is.

It's a hint at some internal politics.
Sep 29 4 tweets 2 min read
A common Programmer brag is being extremely adept at keyboard shortcuts.

Tiling WMs, TUIs, Vim keybindings everywhere, etc...

But is it actually faster?

Apple spent $50 million on R&D in 1989 to prove otherwise:
Bruce “Tog” Tognazzini, head of UI testing at Apple, claimed their research showed:

1. Users *felt* keyboard was faster
2. Stopwatch tests proved mouse was faster

Hold on. Apple had a huge conflict of interest; they were trying to sell the public on the idea of the mouse.
Sep 26 4 tweets 3 min read
Modern Radio Communication is crazy good.

On the Apollo moon landings, the spacecraft used a ~20W Downlink.

Today, we can get that down to about 0.001W.

Waveguides, phased arrays, and of course software make the difference:
First things first, keep it cold. Crazy cold.

Thermal noise kills SNR. Keep an amplifier at ~10 Kelvin, and we get close to fundamental limits (the quantum noise floor)!

For S-Band (common for space), that’s about a 5x power reduction.
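The temperature math, back-of-envelope (the bandwidth figure is an assumption): thermal noise power is N = k_B·T·B, so a colder front end means a proportionally lower noise floor. The end-to-end win is smaller than the raw temperature ratio because sky and antenna noise also feed the system temperature.

```python
# Thermal noise floor: N = k_B * T * B. Colder amplifier, lower floor.
k_B = 1.380649e-23            # Boltzmann constant, J/K
bandwidth_hz = 1e6            # assumed 1 MHz channel

for temp_k in (290.0, 10.0):  # room-temperature vs cryo-cooled amplifier
    noise_w = k_B * temp_k * bandwidth_hz
    print(f"T = {temp_k:5.0f} K -> noise floor {noise_w:.2e} W")
```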
Sep 25 4 tweets 2 min read
Encryption is kind of a lie.

Data can be encrypted at rest, and even in transit…but not “in use”.

Fundamentally, CPUs execute arithmetic instructions on decrypted plaintext, even with secure enclaves.

But what if we got *really* clever:

Mathematically, there is a solution. It’s just really, really slow.

Fully Homomorphic Encryption allows for arithmetic computation *on* encrypted data.

When it was first published in 2009, each individual (x86-equivalent) operation took 30 minutes!

AKA, about 10^12 times slower.
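Real FHE schemes are far too involved for a toy demo, but the core trick (arithmetic on ciphertexts) shows up even in textbook RSA, which is *multiplicatively* homomorphic. A deliberately insecure sketch:

```python
# Textbook RSA is partially homomorphic: E(a) * E(b) mod n = E(a * b).
# We multiply *ciphertexts* and never decrypt the inputs. FHE extends
# this idea to arbitrary additions and multiplications.
p, q, e = 61, 53, 17                  # laughably small demo parameters
n = p * q
d = pow(e, -1, (p - 1) * (q - 1))     # private exponent (Python 3.8+)

enc = lambda m: pow(m, e, n)
dec = lambda c: pow(c, d, n)

a, b = 6, 7
ct = (enc(a) * enc(b)) % n            # math performed on encrypted data
assert dec(ct) == a * b
print(dec(ct))                        # 42
```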
Sep 24 4 tweets 2 min read
Airbags are terrible for hearing.

~170 dB explosions of air aren't exactly pleasant.

Yet, milliseconds before impact, a clever blast of pink noise can reduce hearing loss by 40%!

Mercedes solved it with software 10+ years ago. Still, no one has copied it:
If you’re cool, you can activate the stapedius reflex voluntarily. (ear rumblers unite!)

For everyone else, Mercedes’s blast of noise manually activates the middle-ear muscles.

A taut eardrum reduces sound pressure by ~15 dB; a LOT on a logarithmic scale.
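The decibel arithmetic, for scale (power ratio = 10^(dB/10)):

```python
# Decibels are logarithmic: power ratio = 10 ** (dB / 10).
for db in (3, 15, 40):
    print(f"-{db} dB -> {10 ** (db / 10):.1f}x less sound power")
# -15 dB from a taut eardrum is roughly a 31.6x power reduction.
```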
Sep 22 4 tweets 2 min read
SSDs are pretty reliable in a technical sense.

That is, unless you make a really, really bad mistake in firmware.

HP had a line of ~20 different Enterprise SSD models for datacenter use.

After exactly 3 years, 270 days, and 8 hours of power-on time, every one irrecoverably bricks.
If you’re a programmer, you might already guess what happened.

Hint: the bug happens at 32,768 hours of operation. That's 2^15.

That’s right: it’s a signed 16-bit integer overflow.
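A minimal reproduction of the failure mode (HP's firmware internals aren't public, so the counter layout is an assumption): a power-on-hours counter held in a signed 16-bit integer wraps negative one hour past 32,767.

```python
import ctypes

# 3 years + 270 days + 8 hours = 32,768 hours = 2**15: one past INT16_MAX.
hours = ctypes.c_int16(32767)   # power-on hours, signed 16-bit
print(hours.value)              # 32767, all fine
hours.value += 1                # one more hour of uptime...
print(hours.value)              # -32768: the wrap that bricks the drive
```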
Sep 16 4 tweets 3 min read
Everyone knows that the x86 ISA is big.

Modern CPUs have ~1000+ mnemonics. Guess how many make up 90% of compiled C/C++ code?

TWELVE. I'm not kidding.

The question is…what if we shrank it?
x86 suffers from what you would call “long tail syndrome”.

A huge number of legacy instructions, each used <0.01% of the time.

SHRINK is a cool paper proposing that we *emulate* the old stuff instead.

AKA Instruction Recycling.
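You can see the long tail yourself. A rough sketch that tallies mnemonic frequency in a local binary via objdump (assumes binutils is installed and /bin/ls is a compiled ELF):

```python
import collections, subprocess

# Disassemble a binary and count how often each mnemonic appears.
asm = subprocess.run(["objdump", "-d", "/bin/ls"],
                     capture_output=True, text=True).stdout

counts = collections.Counter()
for line in asm.splitlines():
    parts = line.split("\t")    # addr, raw bytes, "mnemonic operands"
    if len(parts) == 3 and parts[2].strip():
        counts[parts[2].split()[0]] += 1

total, covered = sum(counts.values()), 0
for rank, (mnemonic, n) in enumerate(counts.most_common(12), 1):
    covered += n
    print(f"{rank:2}. {mnemonic:10} {100 * covered / total:5.1f}% cumulative")
```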
Sep 11 4 tweets 3 min read
How do you program an unknown CPU?

The original specs are gone; no compilers exist, and the ISA is completely unrecognized.

It happens more often than you think, behind very closed doors.

It's almost always military hardware.
There is *one* glimpse of this that I know of in the wild.

In 2012, two Russian PhD researchers gave a presentation at RECon.

The talk was titled quite innocuously; they had been tasked with reversing a single, unknown binary.

No hardware. No datasheet. No documentation.
Sep 8 4 tweets 2 min read
If you store data forever, you don’t use any power.

Sounds ridiculous, but computation doesn’t *actually* require energy.

The field of Reversible Computing experiments with “reverse” programming languages, compilers, and yes, even CPUs:
The main source of heat in processors is *erasing* bits.

Landauer’s principle: irreversibly destroying information carries a thermodynamic cost.

The trick is to perform "uncomputing" instead:

Compute -> Copy Result -> Uncompute
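Here's the dance sketched in ordinary (decidedly non-reversible) Python, purely to illustrate the pattern. Every step is an XOR, which is its own inverse, so on a genuinely reversible machine nothing would ever be erased:

```python
# Bennett's trick: compute into zeroed scratch, copy the answer out,
# then run the computation again so scratch returns to exactly zero.
def f(x):
    return (x * 3) & 0xFF      # the function we want to evaluate

x, scratch, result = 42, 0, 0
scratch ^= f(x)                # Compute (XOR into zeroed register)
result ^= scratch              # Copy Result (another reversible XOR)
scratch ^= f(x)                # Uncompute: scratch is 0 again
assert scratch == 0 and result == f(x)
print(result)                  # 126
```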
Sep 3 5 tweets 3 min read
It’s time to get rid of frame rates.

In weird corners of the internet, researchers and standards committees discuss frameless video containers.

Sensor data as a continuous function, down-sampled to any frame rate you want.

Here's what it'll look like in 10 years:
There are two schools of thought, depending on the crowd you hang around.

NeurIPS folks tend to like continuous-time fields (software).

Hardcore EE types discuss event-based sensing (hardware, timestamps).

Bear with me, it's easier than it sounds:
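A rough sketch of the event-based flavor (the tuple layout is illustrative, in the spirit of event cameras): the sensor emits (timestamp, x, y, polarity) events, and "frame rate" becomes nothing more than the bin width you pick at read time.

```python
# Same event stream, any frame rate: frames are just time bins.
events = [(0.0012, 5, 9, +1), (0.0031, 5, 9, -1), (0.0208, 2, 4, +1)]

def to_frames(events, fps, width=10, height=10):
    frames = {}
    for t, x, y, pol in events:
        idx = int(t * fps)                  # which frame bin this lands in
        frame = frames.setdefault(idx, [[0] * width for _ in range(height)])
        frame[y][x] += pol                  # accumulate brightness change
    return frames

print(sorted(to_frames(events, fps=1000)))  # fine bins:   [1, 3, 20]
print(sorted(to_frames(events, fps=24)))    # coarse bins: [0]
```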
Sep 2 5 tweets 3 min read
Much like humans, CPUs heal in their sleep.

CPUs are *technically* replaceable / wear items. They don’t last forever.

Yet, the moment stress is removed, transistor degradation (partially) reverses.

It's called Bias Temperature Instability (BTI) recovery:
Transistors are little switches.

When you hold a switch on, especially when it’s hot, a bit of charge gets stuck where it shouldn’t.

Every time that happens, it gets a little bit harder to switch.

In other words, the transistor gets a little “lazier”.
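A toy model with made-up constants (BTI drift is commonly fit with a power law in stress time, and some fraction relaxes during idle; none of these numbers come from a real datasheet):

```python
# Illustrative only: threshold-voltage drift grows as a power law of
# stress time, then partially relaxes once the stress is removed.
def vth_drift_mv(stress_hours, a=5.0, n=0.2):   # assumed fit constants
    return a * stress_hours ** n

stressed = vth_drift_mv(10_000)      # a long stretch of hot, busy uptime
recovered = stressed * (1 - 0.4)     # assume ~40% of the drift relaxes
print(f"{stressed:.1f} mV drift -> {recovered:.1f} mV after idle recovery")
```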
Aug 29 4 tweets 3 min read
I’ve been on a filesystem kick, and it’s interesting to see the DNA of older ideas pop up in modern designs.

BeFS was crazy in the 90s; the whole architecture was basically a searchable database.

Skimming through their book…it sounds a lot like current Apple FS design.
Turns out there’s a good reason for that.

“Practical File System Design” was written by Dominic Giampaolo in 1999, for BeOS.

Giampaolo joined Apple in 2002, where he became lead architect for…APFS.

APFS shipped in 2016; it's funny to see the same ideas 17 years later.
Aug 21 4 tweets 2 min read
Why do most uninterruptible power supplies still use old, lead-acid battery tech?

Nearly every battery in your house (phone, watch, even electric car) is lithium based...except UPSs.

It all has to do with battery chemistry. Lead-Acid has some unique advantages:
Contrary to what you might think, lithium batteries are not a “straight upgrade”.

Lead-Acid handles being “floated” at near ~100% capacity for years.

Considering UPSs spend 99.9% of their life sitting at full charge…waiting for an outage, it's an ideal use case.
Aug 19 4 tweets 2 min read
The West has a blindspot when it comes to alternative CPU designs.

We’re so entrenched in the usual x86, ARM, RISC-V world, that most people have no idea what’s happening over in China.

LoongArch is a fully independent ISA that’s sorta MIPS…sorta RISC-V…and sorta x87!
Of course, Loongson (the company) realizes that most software is compiled for x86 and ARM.

Thus, they decided to add some hefty translation layers (LBT) built into the hardware.

LBT gives you four extra scratch registers, x86 and ARM eflags, and an x87(!) stack pointer.
Aug 15 5 tweets 3 min read
“lp0 on fire” is a real Linux printer error.

It’s not a joke. In the 50s, computerized printing was an experimental field.

At LLNL (yes, the nuclear weapons lab), cathode ray tubes created a xerographic printer.

...it would occasionally catch fire.
State-of-the-art at the time, the printer was modified with external fusing ovens and hit a whopping…

1 page per second!

In the event of a stall, fresh paper would continuously shoot into the oven, causing aggressive combustion.
Aug 14 4 tweets 3 min read
PCIe link designs are incredibly complex.

Standard FR-4 PCB material is basically at its limit. Every year it's harder to keep up with the new standards.

At what point do we flip the architecture on its head...

GPU as the motherboard, CPU as the peripheral?
It’s not a new idea.

In the Pentium era, Intel created Single Edge Contact Cartridges (SECC).

Instead of being socketed, the CPU was basically slapped on a glorified RAM stick.

It was later abandoned due to cost and cooling issues, but in the modern era it's starting to make sense.