Kyle Lo
language model pretraining @allen_ai, co-lead of data research for OLMo w/ @soldni, he/him, more fun stuff at 👉🏻https://t.co/5Hm9cx3mC1🧋
Nov 20 7 tweets 5 min read
we released Olmo 3! lot of exciting stuff but wanna focus on:

🐟Olmo 3 32B Base, the best fully-open base model to-date, near Qwen 2.5 & Gemma 3 on diverse evals

🐠Olmo 3 32B Think, first fully-open reasoning model approaching Qwen 3 levels

🐡12 training datasets corresp to different staged training recipes, all open & accessible

since I'm a pretraining person, I'll share some of my fav Base model ideas:
🦈Invest in your experimental design!

"Fit scaling ladders" is right, but there's more we can understand about suitability of different benchmarks for scaling experiments

We create evals better suited to different compute scales: our "easy" set of tasks+metrics supports very small scale experiments, and we then switch to our "main" set of evals, on which smaller models are below the noise floor
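To make the "below noise floor" idea concrete, here is a minimal sketch of one way to check whether a benchmark gives usable signal at small scale. It is an illustration, not the Olmo 3 procedure: the signal-to-noise definition, the `min_snr` threshold, the function names, and all the scores are hypothetical.

```python
from statistics import mean, stdev

def signal_to_noise(scores_by_scale):
    """scores_by_scale maps a model scale to a list of scores, one per seed.

    Signal: spread of the mean score across scales.
    Noise: average seed-to-seed standard deviation within a scale.
    (Both definitions are assumptions made for this sketch.)
    """
    means = [mean(scores) for scores in scores_by_scale.values()]
    signal = max(means) - min(means)
    noise = mean(stdev(scores) for scores in scores_by_scale.values())
    return signal / noise if noise > 0 else float("inf")

def split_tasks(results, min_snr=2.0):
    """Split tasks into those usable at these scales and those too noisy."""
    usable, too_noisy = [], []
    for task, scores_by_scale in results.items():
        if signal_to_noise(scores_by_scale) >= min_snr:
            usable.append(task)
        else:
            too_noisy.append(task)
    return usable, too_noisy

# Made-up scores for two small model scales, three seeds each.
results = {
    "easy_task": {"small": [0.30, 0.31, 0.29], "medium": [0.40, 0.41, 0.39]},
    "hard_task": {"small": [0.25, 0.27, 0.23], "medium": [0.26, 0.24, 0.25]},
}

usable, too_noisy = split_tasks(results)
print("usable at small scale:", usable)
print("below noise floor:", too_noisy)
```

In this toy data the improvement on `easy_task` between scales is much larger than the seed-to-seed variation, while on `hard_task` the difference between scales is smaller than the variation across seeds, so it cannot rank small models.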
Apr 25, 2023 11 tweets 4 min read
The Semantic Reader project combines AI & HCI research to explore the future of scientific reading. We ask: “Can we create intelligent, interactive, and accessible reading interfaces for research papers, even atop existing PDFs?” 1/n
semanticscholar.org/reader/67a5bac…

Papers can be hard to read! In the last 2 years we've prototyped 10 interfaces to better support scholarly reading. We address 5 key user challenges: Discovery, Comprehension, Efficiency, Synthesis, and Accessibility. 2/10