Robert Lange · Dec 16, 2021
Can memory-based meta-learning not only learn adaptive strategies 💭 but also hard-code innate behavior 🦎? In our #AAAI2022 paper @sprekeler & I investigate how lifetime, task complexity & uncertainty shape meta-learned amortized Bayesian inference.

📝: arxiv.org/abs/2010.04466
We analytically derive the optimal amount of exploration for a bandit 🎰 in which we explicitly control task complexity & uncertainty. Not learning is optimal in 2 cases (see the sketch after this list):

1⃣ Optimal behavior across tasks is a priori predictable.
2⃣ There is, on average, not enough lifetime to integrate information ⌛️
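To make the tradeoff concrete, here is a minimal Monte-Carlo sketch (not the paper's closed-form derivation) of explore-then-commit in a 2-armed Gaussian bandit; all parameter names and values are illustrative:

```python
# `gap` controls how predictable the best arm is a priori, `task_sd` the
# across-task uncertainty, `noise_sd` the reward noise, T the lifetime.
import numpy as np

rng = np.random.default_rng(0)

def lifetime_return(gap, task_sd, noise_sd, T, n_explore, n_tasks=20_000):
    """Average lifetime return across tasks from the meta-train distribution."""
    # Task-specific arm means; arm 0 is better a priori by `gap`.
    mu = np.stack([rng.normal(gap, task_sd, n_tasks),
                   rng.normal(0.0, task_sd, n_tasks)], axis=1)
    if n_explore == 0:                    # innate policy: always pull arm 0
        return T * mu[:, 0].mean()
    # Pull each arm n_explore times, then commit to the empirical best.
    est = mu + rng.normal(0.0, noise_sd / np.sqrt(n_explore), mu.shape)
    chosen = mu[np.arange(n_tasks), est.argmax(axis=1)]
    return (n_explore * mu.sum(axis=1) + (T - 2 * n_explore) * chosen).mean()

settings = {
    "short lifetime":    dict(gap=0.5, task_sd=1.0, noise_sd=1.0, T=4),
    "long lifetime":     dict(gap=0.5, task_sd=1.0, noise_sd=1.0, T=100),
    "predictable tasks": dict(gap=1.0, task_sd=0.1, noise_sd=1.0, T=100),
}
for name, kw in settings.items():
    rets = {n: lifetime_return(n_explore=n, **kw) for n in (0, 1, 2)}
    print(f"{name:18s} -> best n_explore = {max(rets, key=rets.get)}")
# Exploring wins only with long lifetimes AND a-priori unpredictable
# tasks; otherwise "not learning" (n_explore = 0) is optimal.
```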
🧑‍🔬 Next, we compared the analytical solution to the amortized Bayesian inference meta-learned by LSTM-based RL² agents 🤖

We find that memory-based meta-learning is indeed capable of learning to learn and not to learn (💭/🦎), depending on the meta-train distribution.
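For readers unfamiliar with RL², here is a minimal sketch of such an agent (PyTorch; architecture details like the hidden size are illustrative, not the paper's exact setup). The key idea: the LSTM consumes its own previous action and reward, so within-lifetime "learning" lives entirely in the hidden state while the weights are meta-trained across tasks.

```python
import torch
import torch.nn as nn

class RL2Agent(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        # Input: observation + one-hot previous action + previous reward.
        self.core = nn.LSTM(obs_dim + n_actions + 1, hidden, batch_first=True)
        self.policy = nn.Linear(hidden, n_actions)

    def forward(self, obs, prev_action_onehot, prev_reward, state=None):
        x = torch.cat([obs, prev_action_onehot, prev_reward], dim=-1)
        h, state = self.core(x, state)       # hidden state = task memory
        return self.policy(h), state

agent = RL2Agent(obs_dim=1, n_actions=2)
logits, state = agent(torch.zeros(1, 1, 1),  # (batch, time, dim)
                      torch.zeros(1, 1, 2),
                      torch.zeros(1, 1, 1))
# The hidden state persists across episodes of the SAME task and is only
# reset between tasks - this is what lets the meta-trained weights encode
# either an adaptive (learning) or an innate (non-learning) strategy.
```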
Where do inaccuracies at the edge between learning and not learning come from? 🔺 Close to the edge there exist multiple local optima corresponding to vastly different behaviors.

👉 Highlighting the challenge of optimising meta-policies close to discontinuous behavioral transitions.
Finally, we show that meta-learners overfit their respective training lifetime ⏲️ Agents may not generalise to longer time horizons if trained on short ones, and vice versa ❓ This raises questions about adaptive multi-timescale meta-policies & time-universal Meta-RL 🔎

More from @RobertTLange

Aug 29
📢 Two weeks since we released The AI Scientist 🧑‍🔬!

We want to take the time to summarize a lot of the discussions we’ve been having with the community, and give some hints about what we are working on! 🫶

We are beyond grateful for all your feedback and the community debate our work has sparked ✨
In public discussions of this paper, we frequently refer to it as the “Will Smith eating spaghetti” moment for AI Science 🍝.

While the generated papers often contain minor errors, we believe these problems, much like Will Smith's originally mis-sized fingernails, will only diminish with newer models, more compute, and better methods.

This is the worst the AI Scientist will ever be! 📈
We consciously decided to open-source all the code to democratize access to all the individual tools 🔨 introduced in the AI Scientist agent pipeline.

Everyone can critically assess its competence and its usefulness for their own projects!

One useful data point we’ve been hearing is that people have often been surprised that AI can generate interesting research ideas for their own fields at all!

Aug 13
🎉 Stoked to share The AI Scientist 🧑‍🔬 - our end-to-end approach for conducting research with LLMs, including ideation, coding, experiment execution, paper write-up & reviewing.

Blog 📰: sakana.ai/ai-scientist/
Paper 📜: arxiv.org/abs/2408.06292
Code 💻: github.com/SakanaAI/AI-Sc…

Work led together with @_chris_lu_, @cong_ml and jointly supervised by @j_foerst, @jeffclune, @hardmaru 🤗
Given a starting code template 📝 we ask an LLM to propose new research directions. It checks the novelty of its idea proposals 💡 against Semantic Scholar and scores their "interestingness" as well as "novelty". One example is a Diffusion idea on "adaptive dual-scale denoising".
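A hypothetical sketch of such a novelty check against the public Semantic Scholar Graph API (the pipeline's actual prompts and criteria are more involved):

```python
import requests

def looks_novel(idea_title: str, max_hits: int = 10) -> bool:
    resp = requests.get(
        "https://api.semanticscholar.org/graph/v1/paper/search",
        params={"query": idea_title, "limit": max_hits, "fields": "title,year"},
        timeout=10,
    )
    resp.raise_for_status()
    hits = resp.json().get("data", [])
    # Crude heuristic: an idea is "not novel" if a near-identical title exists.
    return not any(idea_title.lower() in (h.get("title") or "").lower()
                   for h in hits)

print(looks_novel("Adaptive dual-scale denoising for diffusion models"))
```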
The LLM then implements all the required code-level changes 🦾. We leverage the amazing aider tool by @paulgauthier with various LLM backends, including GPT-4o, Sonnet 3.5, DeepSeek Coder and Llama 3.1 405B.

Afterward, the AI Scientist iteratively executes experiments to obtain statistics and plots, feeding failures back to the LLM until the code runs.
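In that spirit, a simplified implement-and-repair loop built on aider's documented Python scripting interface; file names, the model choice and MAX_ITERS are illustrative assumptions, not the released pipeline:

```python
import subprocess
from aider.coders import Coder
from aider.models import Model

coder = Coder.create(main_model=Model("gpt-4o"),
                     fnames=["experiment.py", "plot.py"])
idea = "Adaptive dual-scale denoising"  # hypothetical idea string
coder.run(f"Implement this research idea in experiment.py: {idea}")

MAX_ITERS = 4
for _ in range(MAX_ITERS):
    run = subprocess.run(["python", "experiment.py"],
                         capture_output=True, text=True)
    if run.returncode == 0:
        break  # experiment succeeded; stats & plots are now on disk
    # Feed the stderr back so the LLM can repair its own code.
    coder.run(f"The experiment failed with:\n{run.stderr}\nPlease fix it.")
```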
Jun 9
📺 Exciting talk on the xLSTM architecture and the challenges of questioning the first-mover advantage of the Transformer 🤖 by @HochreiterSepp @scioi_cluster

📜: arxiv.org/abs/2405.04517
💻: github.com/NX-AI/xlstm
🗿 The LSTM architecture has been a foundational pillar of modern Deep Learning, including various breakthrough results in Deep RL (e.g. OpenAI's Dota agents), forecasting (e.g. weather) and the original seq2seq models.

💡 xLSTM tackles several challenges in scaling the original architecture to long sequences (via exponential gating and memory mixing) and distributed training (via associative memories). Furthermore, it combines several advances in training large sequence models.
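For intuition, a toy NumPy sketch of a scalar sLSTM-style update with exponential gating and the log-space stabilizer from the paper; memory mixing across heads and the matrix-memory mLSTM variant are omitted:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def slstm_step(x, h, c, n, m, W, R, b):
    """One scalar cell update; states: hidden h, cell c, normalizer n, stabilizer m."""
    z_t, i_t, f_t, o_t = W @ x + R * h + b   # four gate pre-activations
    m_new = max(f_t + m, i_t)                # log-space stabilizer
    i = np.exp(i_t - m_new)                  # exponential input gate
    f = np.exp(f_t + m - m_new)              # exponential forget gate
    c_new = f * c + i * np.tanh(z_t)         # cell state update
    n_new = f * n + i                        # normalizer tracks gate mass
    h_new = sigmoid(o_t) * c_new / n_new     # normalized hidden output
    return h_new, c_new, n_new, m_new

# Toy usage with a 3-dim input and scalar state:
rng = np.random.default_rng(0)
W, R, b = rng.normal(size=(4, 3)), rng.normal(size=4), rng.normal(size=4)
h = c = n = m = 0.0
for t in range(5):
    h, c, n, m = slstm_step(rng.normal(size=3), h, c, n, m, W, R, b)
```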
📉 The 1.3B parameter results are very impressive and the scaling results appear far from having reached a saturation point. Very much looking forward to the next generation!

Furthermore, it has recently also shown promising results on vision tasks 📸
Jun 25, 2022
🚀 I am very excited to share gymnax 🏋️ — a JAX-based library of RL environments with >20 different classic environments 🌎, which are all easily parallelizable and run on CPU/GPU/TPU.

💻[repo]: github.com/RobertTLange/g…

📜[colab]: colab.research.google.com/github/RobertT…
gymnax inherits the classic gym API design 🧑‍🎨 and allows for explicit functional control over the environment settings 🌲 and randomness 🎲

reset and step operations can leverage JAX transformations such as jit-compilation, auto-vectorization and device parallelism 🤖 (see the sketch below)
It accelerates rollouts & facilitates the distributed Anakin Podracer architecture (@matteohessel et al., 2021) 🏃 Data collection/learning runs directly on accelerators using replication/aggregation across devices.

👇 Speed comparisons for different numbers of workers, hardware setups & policies
