LaurieWired
researcher @google; serial complexity unpacker; https://t.co/Vl1seeNgYK ex @ msft & aerospace
Oct 3 4 tweets 2 min read
DDR5 is unstable garbage.

Max out your memory channels? Flaky.
Temperature a bit too hot? Silent Throttle with no logs.
Too “Dense” of a stick? Good luck training.

Last gen was rock solid by comparison. Here's what happened.
More than ever, manufacturers have been pushing memory to the absolute limits.

JEDEC, the standards committee, is pretty conservative.

Yet the moment DDR5 launched, everyone threw JEDEC out the window.

Intel + AMD's memory controllers were *not* ready to handle it.
Oct 2 4 tweets 3 min read
Virtual Machines render fonts. It’s kind of insane.

TrueType has its own instruction set, memory stack, and function calls.

You can debug it like assembly. It’s also exploitable:
Anytime you can run code (albeit very limited code), someone will take advantage of it.

TrueType (TT) is unfortunately famous for many Windows Kernel zero days.

TT is memory bound, therefore not Turing-complete…but you can still do crazy things with it.
Oct 1 4 tweets 3 min read
This processor doesn’t (officially) exist.

Pre-production Engineering Samples sometimes make it into the grey market.

Rarer still are Employee Loaner Chips. Ghosts abandoned before ever becoming products:
A few days ago, someone found an Intel Pentium Extreme 980.

No laser etched model number; just some scribbled sharpie.

In 2004, Intel (very publicly) canceled the 4 GHz Pentium 4…yet here it is.

It's a hint at some internal politics.
Sep 29 4 tweets 2 min read
A common Programmer brag is being extremely adept at keyboard shortcuts.

Tiling WMs, TUIs, Vim keybindings everywhere, etc...

But is it actually faster?

Apple spent $50 million on R&D in 1989 to prove otherwise:
Bruce “Tog” Tognazzini, head of UI testing at Apple, claimed their research showed:

1. Users *felt* keyboard was faster
2. Stopwatch tests proved mouse was faster

Hold on. Apple had a huge conflict of interest; they were trying to sell the public on the idea of the mouse.
Sep 26 4 tweets 3 min read
Modern Radio Communication is crazy good.

On the Apollo moon landings, the spacecraft used a ~20W Downlink.

Today, we can get that down to about 0.001W.

Waveguides, phased arrays, and of course software make the difference:
First things first, keep it cold. Crazy cold.

Thermal noise kills SNR. Keep an amplifier at ~10 Kelvin, and we get close to fundamental limits (quantum noise floor)!

For S-Band (common for space), that’s about a 5x power reduction.
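
A rough sketch of the arithmetic behind those numbers, assuming the standard thermal noise relation N = kTB. The 1 MHz bandwidth and the 50 K baseline are hypothetical values, chosen only so the ratio lands near the ~5x figure; neither comes from the thread.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K, exact SI value


def thermal_noise_watts(temp_kelvin, bandwidth_hz):
    """Thermal noise power N = k * T * B."""
    return BOLTZMANN * temp_kelvin * bandwidth_hz


bandwidth_hz = 1e6  # hypothetical receiver bandwidth
warm_noise = thermal_noise_watts(50, bandwidth_hz)  # hypothetical warmer baseline
cold_noise = thermal_noise_watts(10, bandwidth_hz)  # ~10 K amplifier from the thread

# Noise scales linearly with temperature, so the same SNR needs
# proportionally less signal power from the transmitter.
noise_ratio = warm_noise / cold_noise
print(f"noise reduction: {noise_ratio:.1f}x")

# The Apollo-era ~20 W downlink versus ~0.001 W today, expressed in decibels.
downlink_gap_db = 10 * math.log10(20 / 0.001)
print(f"downlink power gap: {downlink_gap_db:.1f} dB")
```

The cold amplifier only accounts for a small slice of that gap; the rest has to come from antennas, coding, and processing.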
Sep 25 4 tweets 2 min read
Encryption is kind of a lie.

Data can be encrypted at rest, and even in transit…but not “in use”.

Fundamentally, CPUs execute arithmetic instructions on decrypted plaintext; even with secure enclaves.

But what if we got *really* clever:

Mathematically, there is a solution. It’s just really, really slow.

Fully Homomorphic Encryption allows for arithmetic computation *on* encrypted data.

When it was first published in 2009, each individual (x86) operation took 30 minutes!

AKA, about 10^12 times slower.
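
To get a feel for "arithmetic on encrypted data", here is a toy sketch using textbook RSA, which happens to be homomorphic for multiplication only. This is NOT Fully Homomorphic Encryption and is not secure; the tiny key and the helper names `encrypt` and `decrypt` are assumptions made purely for illustration.

```python
# Toy textbook RSA key (far too small to be secure; illustration only).
p, q = 61, 53
n = p * q          # public modulus, 3233
e, d = 17, 2753    # public and private exponents, e * d = 1 mod (p-1)(q-1)


def encrypt(message):
    return pow(message, e, n)


def decrypt(ciphertext):
    return pow(ciphertext, d, n)


c1 = encrypt(6)
c2 = encrypt(7)

# Multiply the ciphertexts without ever decrypting them...
c_product = (c1 * c2) % n

# ...and the result decrypts to the product of the plaintexts.
print(decrypt(c_product))  # 42
```

FHE generalizes this idea so that both addition and multiplication work on ciphertexts, which is enough to express arbitrary computation.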
Sep 24 4 tweets 2 min read
Airbags are terrible for hearing.

~170 dB explosions of air aren't exactly pleasant.

Yet, milliseconds before impact, a clever blast of pink noise can reduce hearing loss 40%!

Mercedes solved it with software 10+ years ago. Still, no one has copied it:
If you’re cool, you can activate the stapedius reflex voluntarily. (ear rumblers unite!)

For everyone else, Mercedes’ blast of noise manually activates the middle ear muscles.

A taut eardrum reduces sound pressure by ~15dB; a LOT on a logarithmic scale.
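
How much is ~15 dB? A quick sketch using the standard decibel definitions (20·log10 for pressure, 10·log10 for power); the function names are just this example's own.

```python
def db_to_pressure_ratio(db):
    """Sound pressure ratio for a level difference in dB."""
    return 10 ** (db / 20)


def db_to_power_ratio(db):
    """Sound power (energy) ratio for a level difference in dB."""
    return 10 ** (db / 10)


reduction_db = 15
pressure_ratio = db_to_pressure_ratio(reduction_db)
power_ratio = db_to_power_ratio(reduction_db)

print(f"pressure cut by ~{pressure_ratio:.1f}x")  # ~5.6x
print(f"power cut by ~{power_ratio:.1f}x")        # ~31.6x
```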
Sep 22 4 tweets 2 min read
SSDs are pretty reliable in a technical sense.

That is, unless you make a really, really bad mistake in firmware.

HP had a line of ~20 different Enterprise SSD models for datacenter use.

In exactly 3 years, 270 days and 8 hours, every one is irrecoverably bricked.
If you’re a programmer, you might already guess what happened.

Hint. The bug happens at 32,768 hours of operation time. 2^15.

That’s right, it’s a signed 16-bit integer overflow.
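
A small sketch of what that overflow looks like, using Python's `ctypes.c_int16` to mimic a signed 16-bit counter. The helper name `to_int16` is hypothetical, and this is not HP's firmware, only the arithmetic.

```python
import ctypes

HOURS_AT_FAILURE = 2 ** 15  # 32,768 power-on hours


def to_int16(value):
    """Wrap an integer the way a signed 16-bit variable would."""
    return ctypes.c_int16(value).value


print(to_int16(32767))  # 32767, the largest value that fits
print(to_int16(32768))  # -32768, one hour later the counter goes negative

# Convert the failure point into the thread's "3 years, 270 days and 8 hours".
years, remaining_hours = divmod(HOURS_AT_FAILURE, 365 * 24)
days, hours = divmod(remaining_hours, 24)
print(years, days, hours)  # 3 270 8
```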
Sep 16 4 tweets 3 min read
Everyone knows that the x86 ISA is big.

Modern CPUs have ~1000+ mnemonics. Guess how many make up 90% of compiled C/C++ code?

TWELVE. I'm not kidding.

The question is…what if we shrank it?
x86 suffers from what you would call “long tail syndrome”.

A huge number of legacy instructions are used <0.01% of the time.

SHRINK is a cool paper proposing that we *emulate* the old stuff instead.

AKA Instruction Recycling.
Sep 11 4 tweets 3 min read
How do you program an unknown CPU?

The original specs are gone; no compilers exist, and the ISA is completely unrecognized.

It happens more often than you think, behind very closed doors.

It's almost always military hardware.
There is *one* glimpse of this that I know of in the wild.

In 2012, two Russian PhD researchers gave a presentation at REcon.

The talk was titled quite innocuously; they had been tasked with reversing a single, unknown binary.

No hardware. No datasheet. No documentation.
Sep 8 4 tweets 2 min read
If you store data forever, you don’t use any power.

Sounds ridiculous, but computation doesn’t *actually* require energy.

The field of Reversible Computing experiments with “reverse” programming languages, compilers, and yes, even CPUs:
The main source of heat in processors is *erasing* bits.

Known as Landauer’s principle, irreversibly destroyed information has a thermodynamic cost.

The trick is to perform "uncomputing" instead:

Compute -> Copy Result -> Uncompute
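
A minimal sketch of both ideas, under stated assumptions: the Landauer bound uses the standard k·T·ln 2 formula at an assumed 300 K room temperature, and the compute/copy/uncompute pattern is shown with simulated reversible gates (CNOT and Toffoli). The function `reversible_and` and its scratch-bit layout are illustrative choices, not taken from any specific reversible CPU or language.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K, exact SI value

# Minimum energy to erase one bit at an assumed room temperature of 300 K.
landauer_joules = BOLTZMANN * 300 * math.log(2)
print(f"{landauer_joules:.2e} J per erased bit")


def cnot(control, target):
    """Reversible copy: flips target when control is 1."""
    return control, target ^ control


def toffoli(c1, c2, target):
    """Reversible AND: flips target when both controls are 1."""
    return c1, c2, target ^ (c1 & c2)


def reversible_and(a, b):
    scratch, out = 0, 0
    a, b, scratch = toffoli(a, b, scratch)  # Compute: scratch = a AND b
    scratch, out = cnot(scratch, out)       # Copy Result into out
    a, b, scratch = toffoli(a, b, scratch)  # Uncompute: scratch returns to 0
    return a, b, scratch, out


for a in (0, 1):
    for b in (0, 1):
        print(reversible_and(a, b))
```

The scratch bit always ends at 0 without ever being erased, so no intermediate information is destroyed along the way.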
Sep 3 5 tweets 3 min read
It’s time to get rid of frame rates.

In weird corners of the internet, researchers and standards committees discuss frameless video containers.

Sensor data as a continuous function, down-sampled to any frame rate you want.

Here's what it'll look like in 10 years:
There are two schools of thought, depending on the crowd you hang around.

NeurIPS folks tend to like continuous-time fields (software).

Hardcore EE types discuss event-based sensing (hardware, timestamps).

Bear with me, it's easier than it sounds:
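
A toy sketch of the event-based view, assuming each event is a hypothetical `(timestamp_seconds, x, y, polarity)` tuple. The sample events and the `events_to_frames` helper are made up for illustration; real event cameras and container formats differ. The point is only that timestamped data can be binned into any frame rate after the fact.

```python
from collections import Counter

# Hypothetical sensor events: (timestamp_seconds, x, y, polarity)
events = [
    (0.001, 3, 4, +1),
    (0.012, 3, 4, -1),
    (0.018, 5, 1, +1),
    (0.041, 3, 4, +1),
]


def events_to_frames(events, fps):
    """Accumulate per-pixel brightness changes into frames at the chosen rate."""
    frames = {}
    for timestamp, x, y, polarity in events:
        frame_index = int(timestamp * fps)
        frames.setdefault(frame_index, Counter())[(x, y)] += polarity
    return frames


# Same events, two different frame rates chosen at playback time.
print(sorted(events_to_frames(events, fps=25)))
print(sorted(events_to_frames(events, fps=100)))
```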
Sep 2 5 tweets 3 min read
Much like humans, CPUs heal in their sleep.

CPUs are *technically* replaceable / wear items. They don’t last forever.

Yet, the moment stress is removed, transistor degradation (partially) reverses.

It's called Bias Temperature Instability (BTI) recovery:
Transistors are little switches.

When you hold a switch on, especially when it’s hot, a bit of charge gets stuck where it shouldn’t.

Every time that happens, it gets a little bit harder to switch.

In other words, the transistor gets a little “lazier”.
Aug 29 4 tweets 3 min read
I’ve been on a filesystem kick, and it’s interesting to see the DNA of older ideas pop up in modern designs.

BeFS was crazy in the 90s; the whole architecture was basically a searchable database.

Skimming through their book…it sounds a lot like current Apple FS design.
Turns out there’s a good reason for that.

“Practical File System Design” was written by Dominic Giampaolo in 1999, for BeOS.

Giampaolo joined Apple in 2002, where he became lead architect for…APFS.

APFS was announced in 2016; it's funny to see the same ideas 17 years later.
Aug 21 4 tweets 2 min read
Why do most uninterruptible power supplies still use old, lead-acid battery tech?

Nearly every battery in your house (phone, watch, even electric car) is lithium based...except UPSs.

It all has to do with battery chemistry. Lead-Acid has some unique advantages:
Contrary to what you might think, lithium batteries are not a “straight upgrade”.

Lead-Acid handles being “floated” at near ~100% capacity for years.

Considering UPSs spend 99.9% of their life sitting at full charge…waiting for an outage, it's an ideal use-case.
Aug 19 4 tweets 2 min read
The West has a blindspot when it comes to alternative CPU designs.

We’re so entrenched in the usual x86, ARM, RISC-V world, that most people have no idea what’s happening over in China.

LoongArch is a fully independent ISA that’s sorta MIPS…sorta RISC-V…and sorta x87!
Of course, Loongson (the company) realizes that most software is compiled for x86 and ARM.

Thus, they decided to build some hefty translation support (LBT) into the hardware.

LBT gives you four extra scratch registers, x86+ARM eflags, and an x87(!) stack pointer.
Aug 15 5 tweets 3 min read
“lp0 on fire” is a Linux error message that means “printer on fire.”

It’s not a joke. In the 50s, computerized printing was an experimental field.

At LLNL (yes, the nuclear testing site), cathode ray tubes created a xerographic printer.

...it would occasionally catch fire.
State-of-the-art at the time, the printer was modified with external fusing ovens to hit a whopping…

1 page per second!

In the event of a stall, fresh paper would continuously shoot into the oven, causing aggressive combustion.
Aug 14 4 tweets 3 min read
PCI link bus designs are incredibly complex.

Standard FR-4 PCB material is basically at its limit. Every year it's harder to keep up with the new standards.

At what point do we flip the architecture on its head...

GPU as the motherboard, CPU as the peripheral?
It’s not a new idea.

In the Pentium era, Intel created Single Edge Contact Cartridges (SECC).

Instead of being socketed, the CPU was basically slapped on a glorified RAM stick.

The design was later abandoned due to cost and cooling issues, but in the modern era it's starting to make sense.
Aug 13 4 tweets 2 min read
Is your compiler a boy or a MAN?

Created by Donald Knuth, it’s a test to check if recursion is implemented properly.

It was originally written in ALGOL 60, a precursor to C, but can be adapted to nearly any language.

It really stresses the stack and heap, pushing insane call depths:

It’s a fun little program that creates an explosion of self-referential calls in just a few lines of code.

At a recursion depth of 20, the call stack already hits the millions!

Keep in mind, this test was designed in the 1960s; yet even modern systems struggle.
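
For reference, a minimal Python sketch of the test, adapted from the shape of Knuth's ALGOL 60 original. The helper names `const` and `man_or_boy` are this sketch's own, and the recursion limit is raised as a precaution. The well-known expected result for k = 10 is -67.

```python
import sys

# The default recursion limit can be too tight for deeper runs.
sys.setrecursionlimit(5000)


def a(k, x1, x2, x3, x4, x5):
    def b():
        # b mutates the k of the enclosing call to a, then recurses,
        # passing itself along as the new x1.
        nonlocal k
        k -= 1
        return a(k, b, x1, x2, x3, x4)

    return x4() + x5() if k <= 0 else b()


def const(value):
    """Wrap a constant as a zero-argument callable (stand-in for call-by-name)."""
    return lambda: value


def man_or_boy(k):
    return a(k, const(1), const(-1), const(-1), const(1), const(0))


print(man_or_boy(10))  # -67
```

Raising k makes the nested closures and call depth grow very quickly, which is exactly what the test is designed to stress.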
Aug 6 4 tweets 3 min read
Ring 0 is a highly-privileged state on CPUs.

Negative Ring Levels have even *higher* privilege. You just haven’t heard of them.

For x86, Ring -1 is Hardware Virtualization, Ring -2 is System Management Mode, Ring -3 is Intel ME / AMD PSP.

Arm gets even weirder:
Negative rings are mostly due to x86 being really old; as the ISA got more complex, we got "above 0" states.

Armv8 moves in a positive direction; higher numbers have more privilege. From EL0 (user space) to EL3 (State-Switching).

Apple does something extra funky:
Aug 2 4 tweets 2 min read
An early rule you learn in computer science is:

“Never store currency as floats”

Nearly every popular language has special, built-in types for money. But why?

The *majority* of money-like numbers have no exact float representation, and the errors accumulate over time:
Go ahead and try this. Let’s add three dimes. Open up a python terminal, and type in:

0.10 + 0.10 + 0.10

Uh oh. See that little remainder?

It may seem trivial, but this mistake happens more often than you’d expect!
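
Here is that experiment as a short script, plus two common workarounds: Python's standard-library `decimal.Decimal`, and plain integer cents. The variable names are just for this sketch.

```python
from decimal import Decimal

# Binary floats cannot represent 0.10 exactly, so the error leaks out.
float_total = 0.10 + 0.10 + 0.10
print(float_total)          # 0.30000000000000004
print(float_total == 0.30)  # False

# Decimal values built from strings keep exact base-10 digits.
decimal_total = Decimal("0.10") + Decimal("0.10") + Decimal("0.10")
print(decimal_total)                     # 0.30
print(decimal_total == Decimal("0.30"))  # True

# Another common approach: store the smallest unit (cents) as an integer.
cents_total = 10 + 10 + 10
print(cents_total)  # 30
```

Note that `Decimal` should be constructed from strings (or integers); `Decimal(0.10)` would faithfully copy the float's error.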