Breakthrough AI to solve the world's biggest problems.
› Get our newsletter: https://t.co/tvb1VpySfL
Jan 30 • 4 tweets • 3 min read
Here is Tülu 3 405B 🐫, our open-source post-training model that surpasses the performance of DeepSeek-V3! The last member of the Tülu 3 family demonstrates that our recipe, which includes Reinforcement Learning from Verifiable Rewards (RLVR), scales to 405B, with performance on par with GPT-4o and surpassing prior open-weight post-trained models of the same size, including Llama 3.1.
Benchmarking Tülu 3. Interesting finding: the Reinforcement Learning from Verifiable Rewards (RLVR) framework improved MATH performance more significantly at larger scale (405B compared with 70B and 8B), similar to the findings in the DeepSeek-R1 report.
Nov 21, 2024 • 8 tweets • 5 min read
Meet Tülu 3 -- a set of state-of-the-art instruct models with fully open data, eval code, and training algorithms.
We invented new methods for fine-tuning language models with RL and built upon best practices in the community to scale synthetic instruction and preference data.
It’s a state-of-the-art LLM and we are releasing it with all pre-training data and code. Let’s get to work on understanding the science behind LLMs. Learn more about the framework and how to access it here: blog.allenai.org/olmo-open-lang…
Huge shout out to all our partners including @databricks, @AMD, @LUMIhpc and @KempnerInst for their support in making the OLMo framework possible. We are excited to build a future together where AI is truly open.
Nov 7, 2021 • 4 tweets • 2 min read
How large an emergency fund do I need? Do I have enough time to grab lunch before my next meeting?
We intuitively solve questions like these every day. Renowned physicist Enrico Fermi had a particular knack for it — these questions have become well known as Fermi Problems.
1/N
Solving Fermi Problems requires recursive decomposition, science/commonsense reasoning, abstraction, and creativity. The inherent complexity of these problems makes them an ideal candidate for #AI reasoning.
2/N
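To make "recursive decomposition" concrete, here is a minimal Python sketch using the emergency-fund question from the start of the thread. It is an illustration only, not the method behind the thread: the `Quantity` class, the `estimate` function, and every number below are hypothetical assumptions. An unknown quantity is split into factors, each factor is either guessed directly or split again, and the estimate is the product of the leaves.

```python
from dataclasses import dataclass, field
from math import prod
from typing import List, Optional


@dataclass
class Quantity:
    """A node in a Fermi decomposition tree (hypothetical helper)."""
    name: str
    value: Optional[float] = None          # direct guess, if we have one
    factors: List["Quantity"] = field(default_factory=list)


def estimate(q: Quantity) -> float:
    """Return the direct guess, or recursively multiply the sub-estimates."""
    if q.value is not None:
        return q.value
    if not q.factors:
        raise ValueError(f"{q.name} has neither a value nor factors")
    return prod(estimate(f) for f in q.factors)


# All figures are made-up placeholders, chosen only to show the mechanics.
monthly_expenses = Quantity(
    "monthly expenses",
    factors=[Quantity("weekly spend", 500.0), Quantity("weeks per month", 4.0)],
)
fund = Quantity(
    "emergency fund",
    factors=[monthly_expenses, Quantity("months of cover", 6.0)],
)

print(estimate(fund))  # 500 * 4 * 6 = 12000.0
```

Multiplying factors is only one way to combine sub-estimates; real Fermi problems also mix in sums, ratios, and unit conversions, which this sketch leaves out.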