Our top picks
▪️ SigLIP 2
▪️ Intuitive Physics Understanding Emerges from Self-Supervised Pretraining on Natural Videos
▪️ Native Sparse Attention
▪️ OctoTools
▪️ ReLearn
▪️ On the Trustworthiness of Generative Foundation Models
▪️ S*: Test-Time Scaling for Code Generation
▪️ Autellix (Serving Engine for LLM Agents)
▪️ Is That Your Final Answer? Test-Time Scaling Improves Selective Question Answering
▪️ SurveyX
▪️ From RAG to Memory: Non-Parametric Continual Learning for LLMs
▪️ How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM?
▪️ Train Small, Infer Large
▪️ Eager Updates for Overlapped Communication and Computation in DiLoCo
▪️ S^2R: Teaching LLMs to Self-verify and Self-correct via RL
▪️ Logic-RL
▪️ Discovering Highly Efficient Low-Weight Quantum Error-Correcting Codes with RL
▪️ Armap
▪️ Thinking Preference Optimization
▪️ Rethinking Diverse Human Preference Learning through Principal Component Analysis
▪️ Craw4LLM
▪️ LLMs and Mathematical Reasoning Failures
▪️ Small Models Struggle to Learn from Strong Reasoners
▪️ Flow-of-Options: Diversified and Improved LLM Reasoning by Thinking Through Options
Google researchers introduced new attentional-bias strategies for LLMs and reimagined the "forgetting" process, replacing it with "retention."
All of this is wrapped up in Miras, their new framework for designing efficient AI architectures from 4 building blocks (see the sketch after the list):
• Memory architecture – how the memory is built
• Attentional bias – how the model focuses
• Retention gate – how it forgets or keeps information
• Memory learning algorithm – how it’s trained
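To make the decomposition concrete, here is a hypothetical Python sketch of a Miras-style spec; the names and types are mine, not the paper's API:

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical illustration of Miras's four design axes (field names are
# mine, not the authors' code): a sequence-model variant is a choice of
# these four components.
@dataclass
class MirasSpec:
    memory: Callable             # memory architecture: how state is stored (vector, matrix, MLP, ...)
    attentional_bias: Callable   # internal objective the memory minimizes when fitting (key, value) pairs
    retention_gate: Callable     # regularizer deciding how strongly old memory is kept vs. overwritten
    learning_rule: Callable      # memory learning algorithm: how the memory is updated/trained
```

In the paper's framing, existing architectures correspond to particular settings of these four slots.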
Details 🧵
1. Forgetting? No, it's “retention”
Instead of saying the model forgets, Google researchers use the idea of retention. So the term "forget gate" turns into "retention gate."
The model doesn't erase past memory; it just decides not to hold on to some things as tightly.
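A minimal sketch of that idea, assuming a matrix-valued memory and a scalar gate (my simplification, not the paper's exact formulation):

```python
import numpy as np

# Retention gate as a convex blend: the old memory is down-weighted,
# never erased outright. retention in [0, 1]: 1.0 holds everything,
# 0.0 fully rewrites.
def retain_and_update(M_prev: np.ndarray, update: np.ndarray, retention: float) -> np.ndarray:
    return retention * M_prev + (1.0 - retention) * update

M = np.zeros((4, 4))
M = retain_and_update(M, np.eye(4), retention=0.9)  # old memory mostly kept
```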
2. New attentional biases (a combined sketch follows the list):
• Using different ℓₚ norms: Adjust sensitivity to noise (ℓ₁ resists outliers, ℓ₂ is standard, ℓ∞ targets largest errors).
• Huber loss: behaves like ℓ₂ when errors are small and like ℓ₁ when they are large, keeping learning stable in the presence of outliers.
• Memory robust to value shifts: Prepares memory for small input variations using worst-case training.
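Here is a combined sketch of those objectives applied to a memory residual r = M(k) - v, i.e. how far the memory's prediction for key k is from target value v (my notation; the paper defines these per key-value pair):

```python
import numpy as np

def lp_bias(r: np.ndarray, p: float) -> float:
    # l_p attentional bias: p=1 resists outliers, p=2 is the standard
    # regression objective, p=inf penalizes only the largest error.
    if np.isinf(p):
        return float(np.abs(r).max())
    return float(np.sum(np.abs(r) ** p))

def huber_bias(r: np.ndarray, delta: float = 1.0) -> float:
    # Huber attentional bias: quadratic (l2-like) while errors are small,
    # linear (l1-like) once they exceed delta, so outliers can't dominate.
    a = np.abs(r)
    return float(np.where(a <= delta, 0.5 * a**2, delta * (a - 0.5 * delta)).sum())
```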
▪️ The AI Scientist v2
▪️ Debug-gym
▪️ OLMoTrace
▪️ Scaling Laws for Native Multimodal Models
▪️ MegaScale-Infer
▪️ Hogwild! Inference
▪️ Self-Steering Language Models
▪️ VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks
▪️ Are You Getting What You Pay For?
▪️ MM-IFEngine
▪️ HybriMoE
▪️ C3PO
▪️ Quantization Hurts Reasoning?
▪️ Efficient Reinforcement Finetuning via Adaptive Curriculum Learning
▪️ Concise Reasoning via RL
▪️ Missing Premise exacerbates Overthinking
▪️ DDT
▪️ Adaptive Weighted Rejection Sampling
🧵
1. The AI Scientist v2 by @SakanaAILabs, @UBC, @VectorInst, and @UniofOxford
It's an autonomous LLM-based agent that formulates hypotheses, runs experiments, analyzes data, and writes papers. It uses agentic tree search with VLM feedback for iterative refinement, and it removes the human-authored code templates earlier versions relied on. Of the three papers submitted to ICLR 2025 workshops, one passed peer review with an average score of 6.33.
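A minimal sketch of the agentic-tree-search loop, with placeholder functions standing in for the LLM proposer and VLM reviewer (my illustration, not Sakana's code):

```python
import heapq
from dataclasses import dataclass, field

def propose_refinements(draft: str) -> list[str]:
    # Placeholder: in the real system an LLM edits the experiment/code here.
    return [f"{draft} > rev{i}" for i in range(2)]

def review_score(draft: str) -> float:
    # Placeholder: in the real system a VLM reviews plots and write-ups here.
    return (hash(draft) % 100) / 100

@dataclass(order=True)
class Node:
    neg_score: float                  # negated so heapq pops the best draft
    draft: str = field(compare=False)

def best_first_search(root: str, budget: int = 16) -> str:
    frontier = [Node(-review_score(root), root)]
    best = frontier[0]
    for _ in range(budget):
        node = heapq.heappop(frontier)   # expand the most promising draft
        best = min(best, node)           # min neg_score == max score
        for child in propose_refinements(node.draft):
            heapq.heappush(frontier, Node(-review_score(child), child))
    return best.draft

print(best_first_search("initial experiment plan"))
```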
2. Debug-gym by @MSFTResearch
It provides an interactive, sandboxed coding environment where LLMs learn step-by-step debugging with tools like pdb. It supports repository-level reasoning and includes benchmarks (Aider, Mini-nightmare, SWE-bench) for assessing debugging agents; a toy illustration of the loop follows.
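This is a stand-in for the kind of interaction debug-gym mediates, not its actual API: we drive pdb over a buggy script with canned commands, the way an agent would issue them one at a time, then read back the transcript.

```python
import os
import subprocess
import sys
import tempfile
import textwrap

# A deliberately buggy script for the "agent" to inspect.
buggy = textwrap.dedent("""\
    def mean(xs):
        total = 0
        for x in xs:
            total += x
        return total / (len(xs) - 1)  # bug: off-by-one denominator

    print(mean([1, 2, 3]))
""")

with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
    f.write(buggy)
    path = f.name

# Commands an agent might issue: break at the return, run, inspect state.
commands = "break 5\ncontinue\np total, len(xs)\nquit\n"
out = subprocess.run(
    [sys.executable, "-m", "pdb", path],
    input=commands, capture_output=True, text=True,
)
print(out.stdout)  # the agent reads this transcript and proposes a fix
os.unlink(path)
```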
▪️ CORLEO from Kawasaki
▪️ Demis Hassabis's @IsomorphicLabs raised $600 million in its first external round
▪️ @genspark_ai Super Agent
▪️ @OpenAI's PaperBench
▪️ @GoogleDeepMind’s Dreamer RL agent
▪️ @AnthropicAI Claude for Education
Details below 🧵
1. CORLEO, a rideable robot horse concept from Kawasaki
Just take a look ->
2. Demis Hassabis's @IsomorphicLabs has raised $600 million in its first external round, led by Thrive Capital with GV and Alphabet.
The DeepMind spinout is pushing its AI-driven drug discovery toward clinical impact across multiple therapeutic areas.