Shawn Presser: Looking for AI work. DMs open. ML discord: projects:
Feb 6 12 tweets 3 min read
Being forced to learn Haskell had an upside: I’m able to reason about the type signatures of the functions I use, even in Python. I didn’t think that way before.
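
To make "reasoning about type signatures in Python" concrete, here is a minimal sketch that is not from the thread. It writes Haskell's `map :: (a -> b) -> [a] -> [b]` with Python type hints. The name `map_list` and the sample inputs are made up for illustration.

```python
from typing import Callable, Iterable, List, TypeVar

A = TypeVar("A")
B = TypeVar("B")

def map_list(f: Callable[[A], B], xs: Iterable[A]) -> List[B]:
    """Apply f to every element; the signature says A's go in and B's come out."""
    return [f(x) for x in xs]

# len has type str -> int here, so the result must be a list of ints.
lengths = map_list(len, ["haskell", "python"])
print(lengths)
```

Reading only the signature, you know the result has one B per input A, without looking at the body.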

I’m less enthusiastic about Haskell than other languages, but I was surprised there was any benefit at all. Why is this useful? And how is it different from my prior mental model?

Before, I thought of functions as little machines. Passing a function into map was like telling a Roomba to clean your house. Whether you ask a Roomba or a maid or clean it yourself, there’s …
Jun 5, 2021 9 tweets 5 min read
So, I'm a huge fan of FF7 speedrunning. There's a certain boss that has an 8% chance of killing you at the start of the fight. But speedrunner Caleb seems to die much more than 8% of the time.

To my delight, @AceZephyr1 made a *fully automated testing harness*. Incredible! The goal is to statistically verify whether Caleb's luck is worse than 8%. There might be something else going on. For example, FF7 uses a separate RNG for enemy encounter rate, and you can manipulate it by walking a certain number of steps in certain rooms.
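
One way to frame "is his luck worse than 8%?" is a one-sided binomial test: if the true rate were 8%, how likely is it to see at least this many deaths? The sketch below is not the harness from the thread. The tallies (`attempts`, `deaths`) are hypothetical placeholders, not real data.

```python
from math import comb

def binomial_tail(n: int, k: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more deaths in n fights."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical tallies for illustration only.
attempts = 100
deaths = 15

p_value = binomial_tail(attempts, deaths, 0.08)
print(f"P(at least {deaths} deaths in {attempts} fights at 8%) = {p_value:.4f}")
```

A small value would suggest the 8% assumption does not fit the observed fights, which is the kind of signal that points to something else going on, such as RNG manipulation.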
Jun 4, 2021 6 tweets 4 min read
Wow. I'm SSH'd into a TPU v3-8. It has 96 CPUs and 335GB of RAM. Incredible. I installed npm:

snap install npm
npm i -g http-server
sudo http-server -p 80

Then I added Cloudflare DNS.

Presto: a 96-core NodeJS website (for the next 3h):

It was so easy! If you haven't heard about SSH'ing into TPU VMs, it's a new feature! @jekbradbury's team recently released it:

They've been working on this for quite some time. And holy moly, it was worth the wait.
Jun 3, 2021 7 tweets 4 min read
Discovery for my notes: I came up with a variant of FFT I call "FST" (for Fast Shawn Transform, ha)

- FST is its own inverse: fst(fst(x)) = x
- FST of an NxM signal returns NxM real numbers. No phase!
- FST is frequency space, just like FFT. Multiplication is convolution.

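Code:

import numpy as np
from numpy.fft import fft, fft2

def area(x):
    # element count of the last (up to) two axes; body reconstructed from the surviving `[-2:])`
    return np.prod(np.shape(x)[-2:])

def fst(x):
    # real part of the FFT of x + 1j*x, scaled so that fst is its own inverse
    return fft((1 + 1j) * x).real / (area(x) ** 0.5)

def fst2(x):
    # 2-D version, using fft2
    return fft2((1 + 1j) * x).real / (area(x) ** 0.5)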

>>> fst(fst(np.arange(5)))
[0, 1, 2, 3, 4]
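
A quick numerical check of the "own inverse" and "NxM real numbers" claims, as a sketch. It assumes `area` is `np.prod` over the last two dimensions, and it uses random test data made up for the check.

```python
import numpy as np
from numpy.fft import fft, fft2

def area(x):
    # Assumption: element count over the last (up to) two axes.
    return np.prod(np.shape(x)[-2:])

def fst(x):
    return fft((1 + 1j) * np.asarray(x)).real / (area(x) ** 0.5)

def fst2(x):
    return fft2((1 + 1j) * np.asarray(x)).real / (area(x) ** 0.5)

rng = np.random.default_rng(0)
signal = rng.standard_normal(16)      # 1-D test signal
image = rng.standard_normal((8, 12))  # N x M test "image"

# Applying the transform twice should give back the input (up to float error).
print(np.allclose(fst(fst(signal)), signal))
print(np.allclose(fst2(fst2(image)), image))

# The 2-D transform of an N x M real array is again N x M and real.
print(fst2(image).shape, np.isrealobj(fst2(image)))
```

Note that `fst` as written is only normalized correctly for 1-D input, since `area` multiplies the last two dimensions; use `fst2` for images.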
Jun 2, 2021 5 tweets 1 min read
So this is incredibly strange and cool. For my notes:

It's well-known that if you take the FFT of an NxN image, you only need NxN floats to recover the original image. But usually those are (NxN)/2 complex numbers, e.g. rfft2 is complex.

I've discovered a real-only alternative:

Here's how it works. Suppose you have a picture of a cat. First, you multiply the cat by (1 + 1j), so that you end up with a complex number where both the .real and the .imag parts are the cat image. Then you take the FFT of that.
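
The excerpt stops after the FFT step. As a sketch of what that step produces, and assuming the next step keeps only the real part (as the fst code in the Jun 3 thread does), the snippet below checks the identity (1 + 1j)(a + bj) = (a - b) + (a + b)j on a spectrum. The random 8x8 array `cat` is a stand-in for the image.

```python
import numpy as np
from numpy.fft import fft2, rfft2

rng = np.random.default_rng(0)
cat = rng.random((8, 8))  # stand-in for an 8x8 grayscale image

spectrum = fft2(cat)             # ordinary complex FFT of the image
combined = fft2((1 + 1j) * cat)  # FFT of cat + 1j*cat

# The FFT is linear, so combined == (1 + 1j) * spectrum.
# Its real part mixes the real and imaginary parts of the ordinary spectrum.
same_real = np.allclose(combined.real, spectrum.real - spectrum.imag)
same_imag = np.allclose(combined.imag, spectrum.real + spectrum.imag)
print(same_real, same_imag)

# rfft2 stores a half-plane of complex values; combined.real is a full
# plane of real values with the same shape as the input.
print(rfft2(cat).shape, rfft2(cat).dtype)
print(combined.real.shape, combined.real.dtype)
```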
Oct 25, 2020 12 tweets 5 min read
Suppose you wanted to train a world-class GPT model, just like OpenAI. How? You have no data.

Now you do. Now everyone does.

Presenting "books3", aka "all of bibliotik"

- 196,640 books
- in plain .txt
- reliable, direct download, for years:…

thread 👇

I wrote up some details here:…

In OpenAI's papers on GPT-2 and 3, you'll notice references to datasets named "books1" and "books2".

books1 appears to be bookcorpus, or similar.

But OpenAI will not release information about books2; a crucial mystery.
May 28, 2020 14 tweets 4 min read
lol. So, we're doing some image processing with TPUs. We want to save the results directly to our cloud bucket, rather than having the results be transmitted to our VM, saved locally, then uploaded to our cloud bucket. Got a funny idea...

I guess this will be a ramble: TPUs support a limited number of operations. But what you get in exchange is a blazingly-fast TPU.

A TPU consists of 8 cores, plus a CPU. (Yes, the TPU has a CPU -- weird concept, but think of it like a big computer with 8 GPUs. Obviously, a computer with GPUs has a CPU.)
Jan 31, 2020 10 tweets 3 min read
Success: I trained ResNet-50 on imagenet to 75.9% top-1 accuracy in 3.51 minutes using a 512-core TPUv3.

(480,000 images per second. 224x224 res JPG.)

Before you think highly of me, all I did was run Google’s code. It was hard though.

Logs:…

It uses the code from their official MLPerf imagenet benchmark.…

(3.51 minutes for v3-512 is slightly faster than their posted results of 3.85min, too!)
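
For a sense of scale, here is the arithmetic on the numbers quoted above, as a rough sketch. It assumes the 480,000 images/s rate held for the whole 3.51 minutes, which may overstate things if the wall-clock time includes startup or evaluation.

```python
minutes = 3.51               # wall-clock time quoted in the thread
images_per_second = 480_000  # throughput quoted in the thread
cores = 512                  # TPUv3 cores
posted_minutes = 3.85        # the posted result the thread compares against

per_core = images_per_second / cores
total_images = images_per_second * minutes * 60  # assumes a constant rate
speedup = posted_minutes / minutes

print(f"{per_core:.1f} images/s per core")
print(f"{total_images / 1e6:.1f}M images if that rate held for the whole run")
print(f"{(speedup - 1) * 100:.1f}% faster than the posted time")
```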