The Adversarial Robustness Toolbox (ART) = a framework for evaluating and defending deep learning models against adversarial security attacks
Thread⬇️
Adversarial attacks = attempts to fool a model with carefully crafted inputs.
Two threat models:
+White-box attacks: the adversary has access to the training environment and knowledge of the training algorithm
+Black-box attacks: the adversary can only query the model, with no additional knowledge of its internals
2/⬇️
The goal of ART = to provide a framework to evaluate the robustness of a neural network.
The current version of ART focuses on four types of adversarial attacks:
+evasion
+inference
+extraction
+poisoning
3/⬇️
ART is a generic Python library. It provides native integration with several deep learning frameworks such as @TensorFlow, @PyTorch, #Keras, and @ApacheMXNet
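To make the evasion category concrete, here is a minimal from-scratch sketch of the simplest evasion attack (FGSM-style) in plain NumPy. It does not use ART's own API, and the toy model, weights, and epsilon below are illustrative choices, not anything from ART's docs:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, y, w, b, eps):
    """One FGSM step: nudge x in the direction that increases the loss."""
    p = sigmoid(x @ w + b)   # model's predicted probability for class 1
    grad_x = (p - y) * w     # d(cross-entropy)/dx for a logistic model
    return x + eps * np.sign(grad_x)

# Toy logistic "model" with hand-picked weights (hypothetical, for illustration)
w = np.array([2.0, -1.0])
b = 0.0
x = np.array([1.0, 0.5])     # clean input, true label 1
y = 1.0

clean_p = sigmoid(x @ w + b)
x_adv = fgsm(x, y, w, b, eps=0.9)
adv_p = sigmoid(x_adv @ w + b)
# The adversarial point lowers the model's confidence in the true class.
```

ART packages this same idea (and many stronger attacks) behind a uniform estimator/attack interface, so the attack code stays framework-agnostic.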
If you'd like a concentrated overview of ART, click the link below. It takes you to TheSequence Edge#7, our educational newsletter. thesequence.substack.com/p/edge7 5/5
▪️ The AI Scientist v2
▪️ Debug-gym
▪️ OLMoTrace
▪️ Scaling Laws for Native Multimodal Models
▪️ MegaScale-Infer
▪️ Hogwild! Inference
▪️ Self-Steering Language Models
▪️ VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks
▪️ Are You Getting What You Pay For?
▪️ MM-IFEngine
▪️ HybriMoE
▪️ C3PO
▪️ Quantization Hurts Reasoning?
▪️ Efficient Reinforcement Finetuning via Adaptive Curriculum Learning
▪️ Concise Reasoning via RL
▪️ Missing Premise exacerbates Overthinking
▪️ DDT
▪️ Adaptive Weighted Rejection Sampling
🧵
1. The AI Scientist v2 by @SakanaAILabs, @UBC, @VectorInst, and @UniofOxford
It's an autonomous LLM-based agent that formulates hypotheses, runs experiments, analyzes data, and writes papers. It uses agentic tree search and VLM feedback for iterative refinement, removing human-authored code templates. Of three papers submitted to ICLR 2025 workshops, one passed peer review with a 6.33 score.
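The "agentic tree search" above can be sketched as a best-first search over candidate drafts, each scored by a feedback signal. This is a hedged toy sketch, not the system's real components: `expand_fn` and `score_fn` stand in for the LLM proposer and VLM/reviewer feedback.

```python
import heapq

def tree_search(root, expand_fn, score_fn, budget=10):
    """Best-first search: always expand the highest-scoring node seen so far."""
    best = root
    # Max-heap via negated scores; a counter breaks ties between equal scores.
    frontier = [(-score_fn(root), 0, root)]
    counter = 1
    steps = 0
    while frontier and steps < budget:
        neg_score, _, node = heapq.heappop(frontier)
        if -neg_score > score_fn(best):
            best = node              # keep the best draft found so far
        for child in expand_fn(node):
            heapq.heappush(frontier, (-score_fn(child), counter, child))
            counter += 1
        steps += 1
    return best
```

For example, with `expand_fn = lambda n: [n + 1, n + 2]` and `score_fn = lambda n: -abs(n - 5)`, the search starting at 0 homes in on 5 within a small budget.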
2. Debug-gym by Microsoft Research
Provides an interactive sandboxed coding environment for LLMs to learn step-by-step debugging using tools like pdb. It supports repository-level reasoning and includes benchmarks (Aider, Mini-nightmare, SWE-bench) to assess debugging agents.
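A toy sketch of the core idea (not debug-gym's actual API): an agent interacts with pdb by issuing scripted commands and reading the debugger's output, here done with plain stdlib `pdb` over string buffers.

```python
import io
import pdb

def buggy_divide(a, b):
    return a / b  # crashes when b == 0

# Scripted debugger session, the way a debugging agent might drive pdb:
# inspect both arguments, then continue execution.
commands = io.StringIO("p a\np b\nc\n")
out = io.StringIO()

debugger = pdb.Pdb(stdin=commands, stdout=out)
try:
    debugger.runcall(buggy_divide, 6, 0)
except ZeroDivisionError:
    pass  # the agent would read the traceback and propose a fix

transcript = out.getvalue()  # contains the inspected values, e.g. 6 and 0
```

The transcript buffer is what an agent would parse to decide its next debugging step.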
▪️ CORLEO from Kawasaki
▪️ Demis Hassabis's @IsomorphicLabs raised $600 million in its first external round
▪️ @genspark_ai Super Agent
▪️ @OpenAI's PaperBench
▪️ @GoogleDeepMind’s Dreamer RL agent
▪️ @AnthropicAI Claude for Education
Details below 🧵
1. CORLEO - a rideable four-legged robot concept from Kawasaki
Just take a look ->
2. Demis Hassabis's @IsomorphicLabs has raised $600 million in its first external round, led by Thrive Capital with GV and Alphabet.
The DeepMind-born biotech firm advances its AI drug discovery toward clinical impact across various therapeutic areas.
2. @Google has made Gemini 2.5 Pro (experimental) free for all.
Formerly a $19.99/month perk, it now comes with file uploads, app integration, and the new Canvas tool. It's a strategic move to flood the market with Google's top model for reasoning and STEM.
Our top 2
▪️ XAttention
▪️ Inside-Out: Hidden Factual Knowledge in LLMs
▪️ RWKV-7 "Goose"
▪️ ϕ-Decoding
▪️ Frac-connections
▪️ DAPO
▪️ Reinforcement learning for reasoning in small LLMs
▪️ MetaLadder
▪️ Measuring AI ability to complete long tasks
▪️ Why do multi-agent LLM systems fail?
▪️ Agents play thousands of 3D video games
▪️ GKG-LLM
▪️ Privacy, Synthetic Data, and Security
▪️ Scale-wise distillation of diffusion models
▪️ Multimodal chain-of-thought reasoning
▪️ Survey on evaluation of LLM-based agents
▪️ Stop overthinking: A survey on efficient reasoning
▪️ Aligning multimodal LLM with human preference
🧵
1. XAttention by @MIT, @Tsinghua_Uni, @sjtu1896 and @nvidia
Speeds up inference with block-sparse attention and antidiagonal scoring
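The block-selection idea can be sketched in NumPy: score each attention block by an antidiagonal sum and keep only the highest-scoring blocks. This is a hedged illustration, not the paper's method: it sums only each block's main antidiagonal as a cheap proxy, and the block size, keep count, and test matrix are invented for the example.

```python
import numpy as np

def antidiagonal_score(block):
    # Sum of the block's main antidiagonal (a cheap stand-in for the
    # fuller antidiagonal statistics the paper uses).
    return float(np.trace(np.fliplr(block)))

def select_blocks(attn, block=4, keep=2):
    """Return a boolean mask over blocks: True = keep this block."""
    n = attn.shape[0] // block
    scores = np.array([[antidiagonal_score(attn[i*block:(i+1)*block,
                                                j*block:(j+1)*block])
                        for j in range(n)] for i in range(n)])
    mask = np.zeros_like(scores, dtype=bool)
    for i in range(n):
        # Keep the top-`keep` blocks in each block-row.
        mask[i, np.argsort(scores[i])[-keep:]] = True
    return mask
```

With a toy 8x8 attention matrix whose mass sits in the top-right 4x4 block, `select_blocks(attn, block=4, keep=1)` marks exactly that block in the first block-row.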