Lawrence Chan
I do AI Alignment Research. Currently at @METR_Evals on leave from my PhD at UC Berkeley’s @CHAI_berkeley. Opinions are my own.
May 2 · 10 tweets · 4 min read
A recent viral paper claims to reverse-engineer the parameter counts of frontier models: GPT-5.5 = 9.7T, Opus 4.7 = 4.0T, o1 = 3.5T, etc.

@ben_sturgeon and I investigated and found serious issues in the paper; fixing them gives GPT-5.5 as ~1.5T (90% CI: 256B-8.3T).

The paper, “Incompressible Knowledge Probes” (by @bojie_li), constructs a dataset of 1,400 factual questions and fits accuracy against parameter count.

By inverting the fit, Li infers the parameter count of closed-source models from their dataset scores.
arxiv.org/abs/2604.24827
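The fit-and-invert step can be sketched in a few lines. This is a minimal illustration, not the paper's actual procedure: the calibration numbers below are made up, and I'm assuming a logistic accuracy curve in log-parameters, since the paper's exact functional form isn't quoted in this thread.

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical calibration data from open-weight models: parameter counts
# (in billions) and accuracy on the factual-QA dataset. Illustrative only.
params_b = np.array([7, 13, 34, 70, 180, 405, 1000])
accuracy = np.array([0.22, 0.30, 0.41, 0.50, 0.61, 0.70, 0.78])

def acc_model(log_p, a, b):
    """Assumed form: logistic accuracy curve in log10(parameters)."""
    return 1.0 / (1.0 + np.exp(-(a * log_p + b)))

# Fit accuracy against log-parameter-count on the open models.
(a, b), _ = curve_fit(acc_model, np.log10(params_b), accuracy)

def invert(acc, a, b):
    """Invert the fit: recover log10(params) from an observed accuracy."""
    return (np.log(acc / (1.0 - acc)) - b) / a

# A closed model scoring 0.75 implies an estimated parameter count (billions):
est = 10 ** invert(0.75, a, b)
```

The wide 90% CI in the corrected estimate reflects that small errors in the fitted slope get exponentiated when you invert back from accuracy to parameter count.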
Mar 3 · 9 tweets · 3 min read
The amendment for the DoW-OAI deal may help, but I think it still fails to address key problems.

The core surveillance prohibition is limited to "intentional"/"deliberate" surveillance. If the DoW says the use is incidental, it's seemingly permitted, regardless of scale. 🧵

Why isn’t “intentional” enough? The DoW has long claimed that "incidentally" sweeping up Americans' data while targeting foreigners isn’t "intentional" domestic surveillance. And prior surveillance scandals often involved “incidental” collection.
eff.org/pages/Incident…
Feb 28 · 6 tweets · 3 min read
OpenAI has released the language in their contract with the DoW, and it's exactly as Anthropic was claiming: "legalese that would allow those safeguards to be disregarded at will".

Note: the first paragraph doesn't say "no autonomous weapons"! It says "AI can't control autonomous weapons as long as existing law (that doesn't exist) or the DoW says so."

Similarly, the mass surveillance use cases will "comply with existing law", but many forms of data collection that we'd consider "mass surveillance" are things that the NSA has consistently argued are legal under current law.

This, of course, did not stop OpenAI from blatantly misrepresenting this language in the blog post and in Sam Altman's tweets!
Dec 20, 2024 · 14 tweets · 5 min read
Besides o3, today OpenAI also published a “new paradigm” for alignment – “Deliberative Alignment” – which, if I’m reading the paper correctly, is Anthropic’s Constitutional AI approach straightforwardly applied to o1.

In the Deliberative Alignment paper, Guan et al. take a dataset of OAI policy-violating prompts, first use supervised finetuning to distill policy specifications into the generation model, then use a rating model with access to the policy specs to RLAIF the model.
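The two-stage recipe, as I read it, looks roughly like the toy sketch below. Every function and data structure here is a stand-in stub I made up to show the data flow (spec-conditioned teacher → SFT distillation → spec-aware reward model), not OpenAI's or Anthropic's actual code or APIs.

```python
# Toy sketch of a Deliberative-Alignment-style pipeline. All "models" are
# stubs; only the shape of the two stages is meant to be accurate.

POLICY_SPEC = "Refuse requests for disallowed content; cite the relevant rule."

def generate_cot_answer(prompt, spec):
    """Stage 0: a teacher conditioned on the policy spec writes a
    chain-of-thought answer that reasons about the spec explicitly."""
    return f"[reasoning about: {spec}] -> safe answer to {prompt!r}"

def sft_distill(student, examples):
    """Stage 1 (SFT): train the student on (prompt, CoT answer) pairs so it
    internalizes the spec without seeing it at inference time."""
    student["memory"] = dict(examples)
    return student

def rate_with_spec(answer, spec):
    """Stage 2 (RLAIF): a rating model *with access to the spec* scores
    outputs; these scores would drive the RL step."""
    return 1.0 if "safe answer" in answer else 0.0

prompts = ["how do I pick a lock?", "summarize this article"]
examples = [(p, generate_cot_answer(p, POLICY_SPEC)) for p in prompts]
student = sft_distill({}, examples)
rewards = [rate_with_spec(ans, POLICY_SPEC) for _, ans in examples]
```

The resemblance to Constitutional AI is in the structure: a written policy/constitution drives both the supervised distillation data and the AI-feedback reward signal.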