The Adversarial Robustness Toolbox (ART) = a Python library for machine learning security that lets you evaluate and defend deep learning models against adversarial attacks
Thread⬇️
Adversarial attacks come in two main threat models:
+White-box attacks: the adversary has full access to the model – its architecture, parameters, and training algorithm
+Black-box attacks: the adversary can only query the model, with no knowledge of its internals
2/⬇️
The goal of ART = to provide a framework to evaluate the robustness of a neural network.
The current version of ART focuses on four types of adversarial attacks:
+evasion
+inference
+extraction
+poisoning
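To make the evasion category concrete, here's a minimal, framework-free sketch of the classic FGSM idea: nudge an input in the direction of the loss gradient until the model misclassifies it. The toy logistic model and all names below are illustrative – this is not ART's actual API (in ART you'd wrap a framework model in an estimator and call an attack like `FastGradientMethod`).

```python
# Toy evasion attack (FGSM-style) on a hand-rolled logistic model.
# Illustrative sketch only -- not ART's API.
import math

def predict(w, b, x):
    """Logistic model: P(class 1 | x)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def fgsm(w, b, x, y, eps):
    """One FGSM step: move each feature by eps in the sign of the
    loss gradient. For logistic loss, dL/dx_i = (p - y) * w_i."""
    p = predict(w, b, x)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * math.copysign(1.0, gi) for xi, gi in zip(x, grad)]

w, b = [2.0, -1.0], 0.0
x, y = [0.5, 0.2], 1          # true label is 1
x_adv = fgsm(w, b, x, y, eps=0.5)
print(predict(w, b, x))       # confident on the clean input
print(predict(w, b, x_adv))   # flips below 0.5 after the attack
```

The same gradient-sign trick scales to deep networks; ART's value is providing these attacks (and matching defenses) pre-built for real frameworks.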
3/⬇️
ART is a generic Python library. It provides native integration with several deep learning frameworks such as @TensorFlow, @PyTorch, #Keras, @ApacheMXNet
For a condensed overview of ART, click the link below to read TheSequence Edge#7, our educational newsletter. thesequence.substack.com/p/edge7 5/5
Chain-of-Experts (CoE) - a new kind of model architecture.
It builds on the Mixture-of-Experts (MoE) idea that a model can choose different experts each round.
➡️ As a new addition, experts work in a sequence, one after the other, within a layer.
CoE keeps the number of active experts the same as before, but:
- Uses up to 42% less memory
- Unlocks over 800× more effective expert combinations
- Improves performance
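The combination claim follows from sequential routing multiplying choices across iterations: MoE picks one set of k experts, while CoE makes an independent pick at every round. A back-of-envelope calculation (N, k, C below are small illustrative numbers, not the paper's configuration, which yields the reported >800× figure):

```python
# Counting expert combinations: MoE vs CoE (illustrative numbers).
from math import comb

N = 8   # experts in the layer (illustrative)
k = 2   # experts activated per routing step (illustrative)
C = 2   # sequential iterations in CoE (illustrative)

moe_combos = comb(N, k)        # MoE: one choice of k experts -> 28
coe_combos = comb(N, k) ** C   # CoE: a fresh choice each round -> 784
print(moe_combos, coe_combos, coe_combos // moe_combos)
```

The ratio grows as C(N, k) raised to the number of extra rounds, so even C = 2 multiplies the space dramatically at realistic expert counts.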
Here's how it works:
1. In CoE:
- The model picks a small group of experts.
- Each expert transforms the current hidden state of a token.
- The outputs are combined using gating weights.
- A residual connection helps keep the information stable.
So, the final result is the token after it's been processed by C rounds of experts, with each round learning from the last.
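The four steps above – pick experts, transform the hidden state, combine via gating weights, add a residual – can be sketched in plain Python. The expert and router functions here are toy stand-ins, not the real architecture:

```python
# One CoE round, sketched with toy experts (illustrative, not the paper's code).
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def coe_iteration(h, experts, router, top_k=2):
    """One round: route, transform, gate-combine, add residual."""
    scores = router(h)                          # router scores every expert
    top = sorted(range(len(experts)),
                 key=lambda i: scores[i], reverse=True)[:top_k]
    gates = softmax([scores[i] for i in top])   # gating weights over chosen experts
    mixed = [0.0] * len(h)
    for g, i in zip(gates, top):
        out = experts[i](h)                     # each expert transforms the state
        mixed = [m + g * o for m, o in zip(mixed, out)]
    return [hi + mi for hi, mi in zip(h, mixed)]  # residual keeps info stable

# Toy experts (scalings) and a toy router:
experts = [lambda h, s=s: [x * s for x in h] for s in (0.5, 1.0, 2.0, -1.0)]
router = lambda h: [sum(h) * s for s in (0.1, 0.3, 0.2, 0.05)]

h = coe_iteration([1.0, 2.0], experts, router)
```

Running C such rounds back-to-back, each consuming the previous round's output, gives the chained behavior the thread describes.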
2. Adaptive routing:
Each iteration has its own router, so the model can "change its mind" about which experts to use as it learns more. For example:
- In the first step, it might send the token to general experts.
- In later steps, it can route to more specialized ones, depending on how the token has evolved.
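That "change its mind" behavior comes from each round owning its own router. A minimal sketch (residual and gating omitted to isolate the routing; all weights and experts are made up):

```python
# Per-iteration routing sketch: each round has its own router, so the
# chosen expert can change as the hidden state evolves. Illustrative only.
def route(h, router_weights, top_k=1):
    scores = [sum(w * x for w, x in zip(ws, h)) for ws in router_weights]
    return sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:top_k]

experts = [lambda h: [x + 1.0 for x in h],   # stand-in "general" expert
           lambda h: [x * -1.0 for x in h]]  # stand-in "specialized" expert

routers = [
    [[1.0, 0.0], [0.0, 1.0]],   # round-1 router weights
    [[0.0, 1.0], [1.0, 0.0]],   # round-2 router: its own, independent weights
]

h = [2.0, -1.0]
picks = []
for router_weights in routers:
    (i,) = route(h, router_weights)   # re-decide based on the current state
    picks.append(i)
    h = experts[i](h)

print(picks)  # different experts chosen in different rounds
```

Because round 2 scores the *updated* hidden state with fresh weights, its pick can differ from round 1's – exactly the general-then-specialized pattern described above.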
1. @Google introduced Gemini 2.5 Flash and Pro as stable and production-ready, and launched Gemini 2.5 Flash-Lite in preview – its fastest and most cost-efficient model yet.
Flash-Lite outperforms 2.0 Flash-Lite on coding, math, science, reasoning, and multimodal benchmarks. It offers lower latency, a 1 million-token context window, multimodal input, and connections to tools like Google Search and code execution.
▪️ Institutional Books 1.0 - a 242B token dataset
▪️ o3-pro from @OpenAI
▪️ FGN from @GoogleDeepMind
▪️ Magistral by @MistralAI
▪️ Resa: Transparent Reasoning Models via SAEs
▪️ Multiverse (Carnegie+NVIDIA)
▪️ Ming-Omni
▪️ Seedance 1.0 by ByteDance
▪️ Sentinel
🧵
1. Institutional Books 1.0: A 242B token dataset from Harvard Library's collections, refined for accuracy and usability
Sourced from 1,075,899 scanned books across 250+ languages via the Google Books project, the dataset includes both raw and post-processed text and detailed metadata.
A high-reliability LLM for math, science, and coding. It beats o1-pro and o3 in expert evaluations of clarity, instruction-following, and accuracy. Includes tool access (web search, code execution, vision) but responds more slowly.
It replaces o1-pro for Pro/Team users (OpenAI also cut the price of o3 by 80%).
▪️ @HuggingFace helps find the best model for a given size
▪️ NVIDIA’s Jensen Huang and @ylecun disagree with the predictions of Anthropic’s Dario Amodei
▪️ @AIatMeta’s Superintelligence Gambit
▪️ @Google adds a voice to Search
▪️ Mattel and @OpenAI: brains to Barbie
▪️ Projects in ChatGPT
2. @Nvidia’s Jensen Huang: “I disagree with almost everything he says”
At VivaTech in Paris, he took aim at Anthropic’s Dario Amodei, scoffing at his dire predictions about AI replacing half of entry-level jobs.
Huang argues for open, responsible development – not “dark room” AI monopolies. @ylecun agrees 👇