Rohan Paul
Compiling in real-time, the race towards AGI. 🗞️ Don't miss my daily top 1% AI analysis newsletter directly to your inbox 👉 https://t.co/6LBxO8215l
Aug 24 11 tweets 6 min read
MASSIVE claim in this paper 🫡

The top-most Universities from US, UK, EU, China, Canada, Singapore, Australia collaborated.

Will completely change research paper writing.

They proved that AI can already draft proposals, run experiments, and write papers.

The authors built aiXiv, a new open-access platform where AI and humans can submit, review, and revise research in a closed-loop system.

The system uses multiple AI reviewers, retrieval-augmented feedback, and defenses against prompt injection to ensure that papers actually improve after review.

And the process worked: AI-generated proposals and papers get much better after iterative review, with acceptance rates jumping from near 0% to 45% for proposals and from 10% to 70% for papers.

🧵 Read on 👇

🧵2/n. Across real experiments it hits 77% proposal ranking accuracy, 81% paper ranking accuracy, blocks prompt‑injection with up to 87.9% accuracy, and pushes post‑revision acceptance for papers from 10% to 70%.

81% paper accuracy, 87.9% injection detection, papers 10%→70% after revision.
Aug 23 24 tweets 9 min read
This is the original MIT report that said 95% of AI pilots fail and spooked investors across the US stock market.

The report says most companies are stuck because 95% of GenAI pilots produce zero ROI, while a small 5% win by using systems that learn, plug into real workflows, and improve with use.

Teams keep buying or building static tools that demo well but cannot remember context, adapt, or fit daily operations, and this report maps exactly how the few winners do it differently.

🧪 How they ran the study

They combined a review of 300+ public implementations with 52 structured interviews and 153 senior‑leader surveys across January to June 2025, which gives the patterns below real footing.

🧵 Read on 👇
The big split they call the GenAI Divide is simple: 95% of organizations get nothing from GenAI pilots while a tiny 5% extract millions, and the driver is not the model itself but whether the system can learn, remember, and fit the workflow.
Aug 23 8 tweets 4 min read
Another paper claiming a really BIG result.

The First method to achieve 99.9% on AIME 2025 with open-source models! 🤯

DeepConf uses a model's own token confidence to keep only its strongest reasoning traces, reaching that score with GPT-OSS-120B while cutting tokens by up to 84.7% compared to standard parallel thinking.

Most systems still lean on self-consistency with majority voting, which lifts accuracy but hits diminishing returns and burns a lot of tokens.

🧠 The key idea

DeepConf is a test-time method that scores the model’s reasoning locally for confidence, filters weak traces, and often improves accuracy with fewer tokens without any extra training or tuning.

🧱 Why majority voting hits a wall

Parallel thinking samples many chains and votes, accuracy grows slowly as samples rise so compute scales linearly and the benefit flattens, which is exactly the pain DeepConf targets.

🔎 The confidence signals

Token confidence is the negative mean log probability of the top k candidates at each step, which gives a direct signal of how sure the model is at that moment.

Group confidence averages token confidence over a sliding window so local dips are visible without noise from the whole trace.

Tail confidence averages the last chunk of tokens because the ending steps decide the final answer and are where good traces often slip.

Bottom 10% group confidence looks at the worst parts of a trace, which is a strong indicator that the overall reasoning is shaky.

Lowest group confidence picks the single weakest window along a trace, which turns out to be a clean gate for dropping that trace early.
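A hedged sketch of those signals, assuming we already have per-step top-k log-probabilities for a trace; window and tail sizes below are placeholders, not the paper's exact values.

```python
# Minimal sketch (not the authors' code) of DeepConf-style confidence signals.
# Assumes topk_logprobs[t] is the list of top-k log-probabilities at step t.
import numpy as np

def token_confidence(topk_logprobs):
    # Negative mean log-probability of the top-k candidates at each step.
    return np.array([-np.mean(step) for step in topk_logprobs])

def group_confidence(token_conf, window=128):
    # Sliding-window average so local dips stay visible without whole-trace noise.
    if len(token_conf) < window:
        return np.array([token_conf.mean()])
    kernel = np.ones(window) / window
    return np.convolve(token_conf, kernel, mode="valid")

def tail_confidence(token_conf, tail=256):
    # Average over the final chunk of tokens, where the answer is decided.
    return token_conf[-tail:].mean()

def bottom_10pct_confidence(group_conf):
    # Mean of the worst 10% of windows, a proxy for shaky overall reasoning.
    k = max(1, int(0.1 * len(group_conf)))
    return np.sort(group_conf)[:k].mean()

def lowest_group_confidence(group_conf):
    # Single weakest window; a clean gate for dropping a trace early.
    return group_conf.min()
```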

✅ Bottom line

DeepConf is a plug-in test-time compression recipe that filters or halts weak reasoning in place, so teams get higher accuracy and a big token cut without retraining or new hyperparameters.

🧮 Offline mode, smarter voting

DeepConf ranks traces by a confidence score and does confidence-weighted majority voting after optionally keeping only the top 10% or the top 90% by confidence.

With 512 traces, GPT-OSS-120B reaches 99.9% on AIME 2025 using tail or lowest-group confidence with filtering, compared to 97.0% for plain voting and 91.8% for pass@1.
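A hedged sketch of that offline mode: rank traces by a confidence score, keep a top fraction, then do confidence-weighted voting. The data format and filtering fraction are illustrative, not the paper's code.

```python
# Hedged sketch of offline confidence-filtered, confidence-weighted voting.
# `traces` is assumed to be a list of (final_answer, confidence_score) pairs,
# where the score comes from one of the signals above (e.g. tail confidence).
from collections import defaultdict

def deepconf_offline_vote(traces, keep_top_fraction=0.10):
    # Keep only the most confident fraction of traces (e.g. top 10% or top 90%).
    ranked = sorted(traces, key=lambda t: t[1], reverse=True)
    kept = ranked[: max(1, int(len(ranked) * keep_top_fraction))]

    # Confidence-weighted majority vote over the surviving answers.
    votes = defaultdict(float)
    for answer, conf in kept:
        votes[answer] += conf
    return max(votes.items(), key=lambda kv: kv[1])[0]

# Example: with 512 sampled traces and 0.10, only ~51 would be voted on.
print(deepconf_offline_vote([("42", 9.1), ("42", 8.7), ("41", 3.2)], 0.9))
```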
Aug 21 14 tweets 4 min read
Really solid context engineering guide.

Directly From @AnthropicAI

In short, package stable context up front, give exact instructions and examples, restate the current ask, let the model reason, and demand a strict output format.

🧵 Read on 👇

🧵2/n Start with task context. Tell the model who it is, what domain it is in, and what outcome matters. In the demo, the first try misread the images as a skiing incident. Adding “you are assisting a Swedish car-insurance claims adjuster” fixed that because it anchored the model in the right world and goal.
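Here is a minimal sketch of the ordering described above: stable context first, then instructions and examples, a restated ask, room to reason, and a strict output format. The section names and the claims-adjuster strings are illustrative placeholders, not Anthropic's exact template.

```python
# Illustrative prompt skeleton following the ordering described above.
# Section names and example strings are placeholders, not Anthropic's template.
def build_prompt(stable_context, instructions, examples, current_ask):
    return "\n\n".join([
        # 1. Stable, reusable context goes up front.
        f"Task context:\n{stable_context}",
        # 2. Exact instructions plus worked examples.
        f"Instructions:\n{instructions}",
        f"Examples:\n{examples}",
        # 3. Restate the current ask right before generation.
        f"Current request:\n{current_ask}",
        # 4. Let the model reason, then demand a strict output format.
        "Think step by step inside <thinking> tags, then answer "
        "with only a JSON object: {\"decision\": ..., \"reason\": ...}",
    ])

prompt = build_prompt(
    "You are assisting a Swedish car-insurance claims adjuster.",
    "Classify the claim and flag anything that needs human review.",
    "Claim: rear bumper dented in a parking lot -> decision: approve",
    "Claim: the attached photos show damage after a collision at a roundabout.",
)
```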
Aug 20 8 tweets 5 min read
BRILLIANT Paper. 💡

A small Qwen2.5 model is fine-tuned to think over retrieved documents, so a single lean setup can answer domain questions on resource-constrained local hardware.

Using summarised NHS pages, retrieval hits the right condition among top‑5 in 76% of queries, and the fine‑tuned model predicts the exact condition correctly 56% of the time, close to larger frontier models.

The whole pipeline is built for private deployments, so teams can run it without sending data to external APIs.

🔒 The problem they tackle

Many teams cannot ship prompts or data outside their network, especially in health and government, so cloud LLM endpoints are off the table.

They aim for a single lean model that can read retrieved evidence and reason over it, all running locally, so answers stay grounded and private.

The target setting is messy queries over a closed corpus, where retrieval constrains facts and the reasoning step interprets symptoms and next actions.

🧩 The pipeline in this paper.

The system indexes a corpus, retrieves the most relevant pieces for each query, then generates an answer that reasons over those pieces.

They use a classic retriever plus generator design, with retrieval first then reasoning, which fits decision tasks better than free‑form answering.

The chat flow lets a conversational agent decide when to call retrieval, then passes the retrieved context to the reasoning model to produce the answer.

🧵 Read on 👇

🧲 The retriever at work

Documents are split into overlapping chunks and embedded with a sentence transformer, then stored in a vector database for fast similarity search.

They use sentence-transformers all‑mpnet‑base‑v2, which maps text into a 768‑dimensional space with a max sequence of 384 tokens, and a Chroma store with L2 similarity.

If any chunk from a document makes the top‑k, the pipeline feeds the full original document to the LLM, so the model sees full context around the hit.
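A hedged sketch of the retriever described above, using sentence-transformers all-mpnet-base-v2 and a Chroma collection with L2 distance; the chunk sizes, collection name, and helper functions are assumptions for illustration, not the paper's code.

```python
# Hedged sketch: overlapping chunks, all-mpnet-base-v2 embeddings, Chroma with L2.
import chromadb
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")
client = chromadb.Client()
collection = client.create_collection(name="nhs_conditions",
                                       metadata={"hnsw:space": "l2"})

def chunk(text, size=300, overlap=60):
    # Split into overlapping word windows so no hit is cut off mid-context.
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - overlap, 1), step)]

def index(docs):  # docs: {doc_id: full_text}
    for doc_id, text in docs.items():
        pieces = chunk(text)
        collection.add(
            ids=[f"{doc_id}-{i}" for i in range(len(pieces))],
            documents=pieces,
            embeddings=encoder.encode(pieces).tolist(),
            metadatas=[{"doc_id": doc_id}] * len(pieces),
        )

def retrieve(query, docs, k=5):
    hits = collection.query(query_embeddings=encoder.encode([query]).tolist(),
                            n_results=k)
    # If any chunk of a document lands in the top-k, pass the full document on.
    doc_ids = {m["doc_id"] for m in hits["metadatas"][0]}
    return [docs[d] for d in doc_ids]
```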
Aug 15 6 tweets 4 min read
Speed Always Wins.

Absolutely beautiful and exhaustive 82-page survey paper on Efficient Architectures for Large Language Models.

Maps the ways to make LLMs cheaper, longer context, and near real time.

Transformers compare every token with every other token, so if text is 2x longer, the work is about 4x. That burns memory because past keys and values are stored for every attention head, and it drags latency during long chats or reasoning loops.

The survey groups fixes into 4 buckets. Linear sequence models redo the math so cost grows with length, not length squared.

They include linear attention, recurrent networks that carry a small state, and state space models like Mamba, which track history with a running summary, so no big cache.

Sparse attention keeps the Transformer idea but only connects important pairs. Most tokens look locally, a few tokens act as global anchors, and some methods route tokens to the right places. You get large savings without throwing away core behavior.

Efficient full attention keeps exact attention but makes it hardware friendly. Input output aware kernels such as FlashAttention cut reads and writes, and multi-query or grouped-query attention lets many heads share 1 key-value set, cutting cache and bandwidth.

Sparse Mixture of Experts adds conditional compute. Only a few experts run per token, so capacity grows without paying full cost each step, and memory tricks compress, quantize, or prune the cache to stretch context.

The theme is simple: move less data. Methods that cut memory traffic tend to win on modern GPUs, which enables longer context, faster training, and lower serving cost.
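As one concrete example of that memory-traffic point, here is a minimal grouped-query attention sketch in PyTorch: several query heads share one key/value head, which shrinks the KV cache. Shapes and head counts are illustrative only, not taken from the survey.

```python
# Minimal sketch of grouped-query attention: many query heads share one
# key/value head, so the KV cache and memory traffic shrink.
import torch
import torch.nn.functional as F

batch, seq, d_head = 2, 16, 64
n_q_heads, n_kv_heads = 8, 2              # 4 query heads per KV head

q = torch.randn(batch, n_q_heads, seq, d_head)
k = torch.randn(batch, n_kv_heads, seq, d_head)   # cache is 4x smaller
v = torch.randn(batch, n_kv_heads, seq, d_head)

# Repeat each KV head across its group of query heads, then attend as usual.
group = n_q_heads // n_kv_heads
k = k.repeat_interleave(group, dim=1)
v = v.repeat_interleave(group, dim=1)
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)  # torch.Size([2, 8, 16, 64])
```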
This figure is a roadmap of how to make LLMs faster and cheaper from input tokens to output tokens.

The center shows Efficient Sequence Modeling. One path makes sequence cost scale linearly using things like linear attention, linear recurrent networks, and state space models, plus test-time-training variants and unified linear sequence models.

Another path saves work by using sparse attention so the model only looks at the most useful token pairs.

A third path keeps full attention but makes it cheaper with input-output aware scheduling, grouped attention, mixtures of different attention types, and quantization.

Below that sits Sparse Mixture-of-Experts. The model grows capacity by keeping many experts but routes each token to only a few, so compute per token stays low. Different routing rules, expert designs, and conversion tricks live here.

To the right are Hybrid Architectures. These mix building blocks across layers or inside a layer to hit better speed and accuracy tradeoffs.

Next is Diffusion LLM. This family targets non-autoregressive generation so many tokens can be produced in parallel, with methods to connect back to standard autoregressive decoding and to extend into multimodal settings.

The final column highlights reach beyond text, showing where these efficiency ideas apply to vision, audio, and multimodal tasks.
Aug 13 5 tweets 2 min read
LEANN: The Tiniest Vector Database that Democratizes Personal AI with Storage-Efficient Approximate Nearest Neighbor (ANN) Search Index

Researchers from UC Berkeley, CUHK, Amazon Web Services, and UC Davis have developed LEANN, a storage-efficient ANN search index optimized for resource-limited personal devices.

RAG on Everything with LEANN. Enjoy 97% storage savings while running a fast, accurate, and 100% private RAG application on your personal device.
Aug 13 4 tweets 2 min read
insane

GPT-5 Pro has 148 IQ in Mensa Norway

116 IQ in the Offline Test

---

It is a timed practice IQ quiz made by Mensa Norway.

You get 35 visual pattern puzzles, all non-verbal, with a 25-minute limit. Every correct answer is worth 1 point, the items get harder as you go, and the reported IQ range is 85 to 145 with a standard deviation of 15.

What it measures is matrix-reasoning style, non-verbal abstract reasoning, the same family as Raven’s Progressive Matrices that psychologists use to estimate fluid intelligence and general reasoning ability.

trackingai.org/IQ

GPT-5 on Mensa Norway IQ test.

About the “GPT-5 Pro has 148 IQ” claim.

The Norway site reports IQs only within 85 to 145, so a 148 figure cannot be a direct output of a single Mensa-Norway session. It must come from a third-party conversion, an average across different tests, or a different scoring method, not from Mensa Norway's own scale.
Aug 11 7 tweets 4 min read
GLM-4.5 technical report is out.

The open-source LLM developed by Chinese AI startup Zhipu AI, released in late July 2025 as a foundation for intelligent agents.

It ranks 3rd overall across 12 reasoning, agent, and coding tests, and 2nd on agentic tasks.

Key innovations: expert model iteration with self-distillation to unify capabilities, a hybrid reasoning mode for dynamic problem-solving, and a difficulty-based reinforcement learning curriculum.

⚙️Core Concepts

GLM-4.5 uses a Mixture‑of‑Experts backbone where only 8 experts fire per token out of 160, so compute stays roughly like a 32B model while capacity is 355B.
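A hedged sketch of that top-8-of-160 routing idea; the router, gating, and expert shapes below are illustrative stand-ins, not GLM-4.5's actual implementation.

```python
# Hedged sketch of top-8-of-160 expert routing as described above.
import torch
import torch.nn.functional as F

n_experts, top_k, d_model = 160, 8, 256
router = torch.nn.Linear(d_model, n_experts, bias=False)
experts = torch.nn.ModuleList(
    [torch.nn.Linear(d_model, d_model) for _ in range(n_experts)])

def moe_forward(x):                                  # x: [tokens, d_model]
    gates = F.softmax(router(x), dim=-1)
    topv, topi = gates.topk(top_k, dim=-1)           # only 8 experts fire per token
    topv = topv / topv.sum(dim=-1, keepdim=True)     # renormalize gate weights
    out = torch.zeros_like(x)
    for slot in range(top_k):
        for e in topi[:, slot].unique():
            mask = topi[:, slot] == e                # tokens routed to expert e
            out[mask] += topv[mask, slot].unsqueeze(-1) * experts[int(e)](x[mask])
    return out

print(moe_forward(torch.randn(4, d_model)).shape)    # torch.Size([4, 256])
```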

They go deeper instead of wider, add Grouped‑Query Attention with partial RoPE for long text, bump head count to 96 for a 5120 hidden size, and stabilize attention with QK‑Norm.

More heads did not lower training loss, but they lifted reasoning scores, which hints the head budget helps hard problems even if loss looks flat (paper).

A small MoE Multi‑Token Prediction (MTP) head sits on top, so the model can propose several future tokens at once for speculative decoding, which cuts latency without changing answers (paper).

📚 The Data Recipe

Pre‑training mixes English, Chinese, multilingual text, math and science material, and a lot of code, then filters it by quality buckets and semantic de‑duplication to avoid template‑spam pages. The goal is simple: pack the stream with educational content that teaches reasoning and coding, not just facts.

For code, they grade repositories and web code content, up‑sample the high‑quality tier, and train with fill‑in‑the‑middle so the model learns to write inside existing files, not only at the end. That matters later for repo‑level edits in SWE‑bench.
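A rough sketch of how a fill-in-the-middle training example can be constructed, assuming a prefix/suffix/middle sentinel format; the sentinel token names and the fill rate are placeholders, not GLM-4.5's actual vocabulary or settings.

```python
# Hedged sketch: build a fill-in-the-middle example by cutting a file into
# prefix / middle / suffix and asking the model to emit the middle last.
import random

def make_fim_example(code, fim_rate=0.5):
    if random.random() > fim_rate:
        return code                       # plain left-to-right example
    i, j = sorted(random.sample(range(len(code)), 2))
    prefix, middle, suffix = code[:i], code[i:j], code[j:]
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>{middle}"

print(make_fim_example("def add(a, b):\n    return a + b\n", fim_rate=1.0))
```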
Aug 10 7 tweets 4 min read
So many brilliant AI "Self-Evolving" papers. Here's another one.

R-Zero shows a model can train itself from zero external data, by pairing a Challenger and a Solver that keep learning right at 50% uncertainty, no human tasks or labels.

On Qwen3-4B it adds +6.49 on math and +7.54 on general reasoning, with MMLU-Pro 37.38 → 51.53, all from self-generated data.

🧭 The problem they tackle

Human‑curated tasks and labels throttle self‑evolving LLMs, cost aside, the bottleneck is scale and coverage.

R‑Zero removes that dependency by starting from a single base model and creating its own training stream from scratch.

🔁 The co‑evolving loop

Both roles start from the same base model, one becomes the Challenger, the other becomes the Solver.

The Challenger proposes tough questions, the Solver answers them, then both get improved in alternating rounds.

This creates a targeted curriculum that keeps moving with the Solver’s ability.

R‑Zero turns 1 base model into 2 roles, a Challenger that writes tough questions and a Solver that tries to answer them. The Solver’s most common answer is treated as a temporary label, which becomes fresh training data.

The Challenger is trained to make questions that keep the Solver near 50% certainty, so learning stays right at the edge of its ability. The Solver is trained with a simple 1 or 0 reward based on whether its answer matches that temporary label.

This loop needs no pre‑existing tasks or human labels. The Challenger supplies data, the Solver supplies reward, both improve together.
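A hedged sketch of those two rewards: the Challenger gets paid for questions that leave the Solver near 50% agreement, and the Solver gets a 1-or-0 reward against the majority-vote label. The exact reward shape in R-Zero may differ; this only illustrates the idea.

```python
# Hedged sketch of the Challenger/Solver rewards described above.
from collections import Counter

def challenger_reward(solver_answers):
    # Sample several Solver answers, use the majority answer as a temporary
    # label, and reward questions that land the Solver near 50% agreement.
    counts = Counter(solver_answers)
    majority_answer, majority_count = counts.most_common(1)[0]
    p_hat = majority_count / len(solver_answers)     # empirical consistency
    reward = 1.0 - 2.0 * abs(p_hat - 0.5)            # peaks right at 50%
    return reward, majority_answer

def solver_reward(answer, pseudo_label):
    # Simple 1/0 reward against the majority-vote pseudo-label.
    return 1.0 if answer == pseudo_label else 0.0

print(challenger_reward(["12", "12", "7", "12", "9", "7", "7", "12"]))
```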

The bar chart reports the effect on a Qwen3‑4B base: SuperGPQA moves 20.88 → 27.55 (+31.9%), Math Avg moves 42.58 → 49.07 (+15.2%), and MMLU‑Pro moves 37.38 → 51.53 (+37.9%).
Aug 10 8 tweets 4 min read
Beautiful Paper.

An LLM teaches itself from a single topic prompt, no human-written questions, no labels.

An LLM plays both teacher and student, creates its own questions, and learns with reinforcement learning.

It works by splitting into a proposer that writes problems and a solver that answers them, both trained with reinforcement learning.

And with only self-generated data, a 3B model jumps +14% on arithmetic, +16% on algebra, and +7% on coding on held-out tests.

The clever twist that makes it work: when checking answers is hard, it uses majority vote over multiple solver attempts, and when checking is easy, like code, the proposer emits unit tests and the solver is rewarded by the fraction of tests passed.

This keeps problems “not too easy, not too hard,” so difficulty auto-tunes as the solver improves.
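A minimal sketch of those two reward paths, assuming the proposer's output has already been parsed into either a question with sampled solver answers or a coding task with unit tests; function names are illustrative, not the paper's code.

```python
# Hedged sketch of the two reward paths: majority vote when checking is hard,
# fraction of unit tests passed when the proposer can emit tests (e.g. code).
from collections import Counter

def vote_reward(solver_answers, answer):
    # Reward agreement with the most common answer across solver attempts.
    majority, _ = Counter(solver_answers).most_common(1)[0]
    return 1.0 if answer == majority else 0.0

def unit_test_reward(solution_fn, tests):
    # tests: list of (args, expected) pairs emitted by the proposer.
    passed = 0
    for args, expected in tests:
        try:
            if solution_fn(*args) == expected:
                passed += 1
        except Exception:
            pass
    return passed / len(tests)

print(unit_test_reward(lambda a, b: a + b, [((1, 2), 3), ((2, 2), 5)]))  # 0.5
```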

🧠 The idea, in plain words

The model runs a closed loop where a proposer writes a problem from a topic prompt and a solver tries to answer it.

Both roles are the same base LLM, trained with reinforcement learning, so the system can bootstrap without any human-written questions or answers.

🧠 Why this works

The key is curriculum from interaction, not a fixed dataset.

Curriculum is just the sequence of training problems the model sees, with difficulty that keeps adjusting to the model’s current skill. The paper calls the setup self-questioning, where the same base model plays proposer and solver using reinforcement learning, so the data gets shaped by performance rather than a fixed dataset.

The rewards are minimal but aligned with useful behavior, proposer seeks interesting, solvable problems, solver seeks consistent or verifiable answers.

The loop is cheap to run at small scale, which makes it practical to iterate.

🧠 The bottom line

The curriculum appears because the proposer only gets rewarded for writing solvable but non-trivial problems, and the solver only gets rewarded for consistent or verifiable answers. The two keep pulling each other toward that sweet spot, which is exactly where learning is fastest.

🧵 Read on 👇

🎮 The two‑agent loop

Training alternates between the proposer and solver, each maximizing its own reward on the other’s outputs, a simple minimax setup.

This ties the data curriculum to current ability, so problem difficulty adapts as the solver improves.
Aug 9 4 tweets 3 min read
🇨🇳 A Chinese University just dropped a landmark computer science achievement.

Fastest shortest paths algorithm in 40 years, beating the famous Dijkstra plus Tarjan `n log n` limit by avoiding global sorting and working through a small set of pivot nodes.

The paper shows a deterministic O(m log^(2/3) n) algorithm for directed single‑source shortest paths with real non‑negative weights in the comparison‑addition model, which beats the O(m + n log n) Dijkstra bound on sparse graphs.

The usual practice keeps a global priority queue that effectively sorts up to n distance keys, so it hits a sorting wall near n log n.
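For contrast, here is the classic heap-based Dijkstra that pays for keeping the frontier ordered; this is the baseline being beaten, not a sketch of the new algorithm.

```python
# Classic Dijkstra with a binary heap, roughly O(m log n): the priority queue
# keeps the frontier ordered, which is exactly the cost the new result avoids.
import heapq

def dijkstra(adj, source):
    # adj: {u: [(v, weight), ...]} with non-negative real weights
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                      # stale heap entry, skip
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

print(dijkstra({"s": [("a", 2.0), ("b", 5.0)], "a": [("b", 1.0)]}, "s"))
# {'s': 0.0, 'a': 2.0, 'b': 3.0}
```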

🧭 What changed vs Dijkstra

The core shift is that the algorithm computes all distances without maintaining a full order of the frontier, so it avoids paying for global sorting.

Prior results even proved Dijkstra is optimal if the algorithm has to output the exact order of vertices by distance. This work targets only the distances, not that full ranking, which opens the door to bypass sorting entirely.

🎯 Practical application of this

This algo helps anything that runs many shortest path computations, like graph databases, routing engines, ranking pipelines, and network analysis, since each run finishes sooner and uses fewer CPU hours.

It is deterministic and uses only comparisons and additions, so it works cleanly with real-valued weights and portable code, no hardware specific integer hacks.

🧵 Read on 👇

🧪 Model, graph prep, and scope

Everything runs in the comparison‑addition model, so it only compares and adds edge weights.

To simplify degrees, the graph is transformed so each original vertex becomes a small zero‑weight cycle with at most 2 incoming and 2 outgoing edges, which preserves all shortest paths and keeps size at O(m).

The result applies to directed graphs with real non‑negative weights and does not rely on word‑RAM integer tricks.
Aug 8 5 tweets 4 min read
🐠 @GoogleDeepMind releases open-source AI for interpreting animal sounds.

Will make large-scale wildlife monitoring cheap, fast, and accurate from plain audio, so biodiversity trends can be tracked in near real time and conservation actions can be taken sooner.

This Perch 2.0 model shows that a compact, supervised bioacoustics model, trained on many labeled species and two simple auxiliary tasks, reaches state-of-the-art transfer across birds, land mammals, and even whales.

Most systems lean on self-supervised pretraining or narrow, task-specific models, which often struggle when compute is tight and labeled data is limited.

🐦 What changed

Perch 2.0 expands the training set from birds only to multiple different groups of animals, and scales supervision hard.

The team trains on 1.54M recordings covering 14,795 classes, of which 14,597 are species.

The backbone is EfficientNet‑B3 with 12M parameters, so it stays small enough for everyday hardware, yet it learns general audio features that transfer well.

Input audio is cut into 5s chunks at 32kHz, converted to a log‑mel spectrogram with 128 mel bins spanning 60Hz to 16kHz. The encoder emits a spatial embedding that is pooled to a 1536‑D vector used by simple heads for classification and transfer.

🧪 The training recipe that actually moved the needle

They make classification deliberately hard, then let simple probes win.

First, they generalize mixup to blend N clips at once, not just 2. A random N is picked, weights are drawn, the audio is mixed, and the label is set as a multi‑hot vector so every species in the window counts, loud or quiet.

This teaches the model to separate overlapping sounds instead of chasing the loudest one.
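A hedged sketch of that N-way mixup with multi-hot labels; the Dirichlet mixing weights and the range of N are assumptions for illustration, not necessarily Perch 2.0's exact settings.

```python
# Hedged sketch: blend N clips at once and mark every species present in the
# mixed window with a multi-hot label, loud or quiet.
import numpy as np

def mixup_n(clips, labels, num_classes, max_n=4):
    # clips: list of equal-length waveforms, labels: list of class indices.
    n = np.random.randint(2, max_n + 1)
    idx = np.random.choice(len(clips), size=n, replace=False)
    weights = np.random.dirichlet(np.ones(n))          # random mixing weights
    mixed = sum(w * clips[i] for w, i in zip(weights, idx))
    target = np.zeros(num_classes)
    target[[labels[i] for i in idx]] = 1.0             # every species counts
    return mixed, target

clips = [np.random.randn(32000 * 5) for _ in range(8)]  # 5s windows at 32kHz
labels = [3, 7, 3, 1, 9, 0, 7, 4]
audio, target = mixup_n(clips, labels, num_classes=10)
print(audio.shape, target)
```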

Next comes self‑distillation with a prototype teacher. A ProtoPNet‑style head learns 4 trainable “prototypes” per class on the spatial embedding, then its predictions supervise a plain linear head on the pooled embedding.

Gradients do not flow back from the teacher into the encoder, which stabilizes training and nudges the pooled representation toward clean, linearly separable clusters.

Finally, they add a source‑prediction head. Think of every original recording as its own class.

The model must say which recording a 5s window came from, even when two windows do not overlap.

This forces consistency across windows from the same file and acts like ultra fine‑grained supervision. Because there are 1.5M+ recordings, they use a low‑rank projection of rank 512 to keep this head light.
Aug 8 9 tweets 4 min read
AI buildout is massive, ~$6.7T by 2030 across data centers, with AI driving most new capacity.

AI alone needs ~$5.2T and ~156 GW of capacity, so the money clusters around chips, power, and sites.

McKinsey Report.

📈 Where the demand line points

Global data center capacity could almost triple by 2030, and ~70% of that pull comes from AI workloads.

Two big unknowns shape the curve, whether real applications create business value, and how fast chips, models, and power efficiency improve.

DeepSeek V3 is given as an example of efficiency gains, ~18x cheaper training and ~36x cheaper inference than GPT‑4o, yet the article argues these savings likely spark more experiments, which keeps total compute rising, a Jevons paradox effect.

🧵 Read on 👇

💸 The AI capex number

Meeting AI demand needs ~$5.2T by 2030.

That ties to ~156 GW of AI data center capacity, with ~125 GW added during 2025‑2030.
Aug 4 10 tweets 5 min read
NVIDIA's brilliant paper gives a lot of details and actionable techniques. 🎯

Small Language Models (SLMs), not LLMs, are the real future of agentic AI.

They give a recipe for swapping out large models for SLMs without breaking anything, and show that 40%‑70% of calls in open agents could switch today.

It argues that SLMs already match big models on many routine agent tasks, cost far less to run, and slot neatly into mixed‑model pipelines, so most agent calls should shift to SLMs.

🗝️ The central claim

SLMs that fit on a laptop already handle the narrow language chores inside most agents.

Because those chores rarely need open‑ended chat, the authors say an SLM‑first design is the natural default, and large models become occasional helpers.

@NVIDIAAIDev 👏

⚙️ SLM capability keeps up

Benchmarks show 2‑9B‑parameter SLMs reach or beat 30B‑70B models on commonsense, tool calling, and code generation while running 15x to 70x faster.
Aug 3 9 tweets 5 min read
This paper will change how we think about LLM inferencing. 🔥

An alternative to Chain-of-Thought, inspired instead by how the human brain uses distinct systems for slow, deliberate planning and fast, intuitive computation.

Gives us super fast reasoning vs SOTA LLMs with just 1,000 training examples and a 27M-parameter model.

Unbelievable how a tiny model from a tiny lab out of Tsinghua, plus deep thinking, gets 40% on ARC-AGI and tears into complex Sudoku and maze puzzles. 🤯

It removes token-by-token chain-of-thought generation.

Instead, Hierarchical Reasoning Model's (HRM) parallel processing allows for what Wang (Founder and CEO of Sapient Intelligence) estimates could be a “100x speedup in task completion time.”

This means lower inference latency and the ability to run powerful reasoning on edge devices.

📢 There are 3 efficiency techniques

a. Single forward pass reasoning: HRM performs all reasoning inside its hidden states and emits an answer in 1 network pass, while CoT-style LLMs build a long text trace first.

b. Constant-memory training: By replacing back-propagation-through-time with a 1-step gradient, training memory stays at O(1) instead of O(T), which shortens training iterations and improves GPU utilisation (sketched after this list).

c. Adaptive Computation Time (ACT): ACT version averages only about one third of the compute steps of a fixed-depth baseline yet keeps the same accuracy, so inference cost per example drops roughly 2-3X, not 100X.
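A minimal sketch of the 1-step gradient from point (b): run the recurrent updates without building a graph, then backpropagate through only the final step, so memory stays constant in the number of reasoning steps. The module choices below are stand-ins, not HRM's actual architecture.

```python
# Hedged sketch of O(1)-memory training via a 1-step gradient.
import torch

d = 256
low = torch.nn.GRUCell(d, d)     # stand-ins for HRM's low/high-level modules
high = torch.nn.GRUCell(d, d)
head = torch.nn.Linear(d, 10)

def hrm_step(x, y, steps=16):
    zL = torch.zeros(x.shape[0], d)
    zH = torch.zeros(x.shape[0], d)
    with torch.no_grad():                       # no graph kept for these steps
        for _ in range(steps - 1):
            zL = low(x + zH, zL)
            zH = high(zL, zH)
    zL = low(x + zH.detach(), zL.detach())      # only this step is differentiated
    zH = high(zL, zH.detach())
    loss = torch.nn.functional.cross_entropy(head(zH), y)
    loss.backward()                             # memory cost independent of steps
    return loss.item()

print(hrm_step(torch.randn(4, d), torch.randint(0, 10, (4,))))
```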

🔧 Final Takeaway

HRM hints that swapping endless layers for a small hierarchy plus cheap recurrence can give LLM‑level reasoning at Raspberry‑Pi costs. It also scales at inference: raise the ACT cap, and accuracy climbs further with no retraining.

🧵 Read on 👇

🧵 2/n Using only 1,000 input-output examples, without pre-training or CoT supervision, HRM learns to solve problems that are intractable for even the most advanced LLMs.

For example, it achieves near-perfect accuracy in complex Sudoku puzzles (Sudoku-Extreme Full) and optimal pathfinding in 30x30 mazes, where state-of-the-art CoT methods completely fail (0% accuracy).
Aug 2 8 tweets 3 min read
Experts can now identify and track you using Wi-Fi signals that bounce off your body, and it's over 95% accurate.

A new surveillance method identifies and tracks you using Wi-Fi signals — without needing a phone, camera, or wearable.

Developed by researchers at La Sapienza University of Rome, the system has been dubbed "WhoFi."

It reads how Wi-Fi waves interact with a person’s body, essentially creating a unique biometric “fingerprint” based on the way wireless signals bounce off them.

This allows a person to be identified and re-identified across rooms and even different locations, all without visible technology or consent.

🔦 Why swap cameras for radio waves

Wi-Fi keeps working when lights are off, walls block sight, or crowds get in the way. A router sends a signal, the air, walls, bones, and backpacks bend that signal, and every body shape leaves its own tiny distortions. The paper grabs those distortions, known as Channel State Information, as a privacy-friendly fingerprint.

Unlike previous attempts at similar tracking, which topped out at 75% accuracy, WhoFi leverages neural networks and standard, low-cost Wi-Fi routers to achieve unprecedented precision.

The implications are enormous: this could revolutionize everything from retail analytics to law enforcement surveillance, raising pressing questions about privacy.

The system works even through walls and in the dark, potentially making it more powerful than traditional camera systems. While still in the experimental stage, the technology’s reliance on widely available hardware suggests it could be deployed at scale sooner than most would expect.

🧬 What lives inside a Wi-Fi packet

Each packet carries amplitude, how strong the signal arrives, and phase, how the wave shifts over time. Noise and hardware drift skew both pieces, so the team uses median filters for amplitude and a simple line-fitting trick for phase to clean things up.

After a pass of Gaussian noise, random scaling, and tiny time shifts, the data is ready for the network.
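A hedged sketch of that clean-up and augmentation pass: median filtering for amplitude, a linear fit to remove phase drift, then Gaussian noise, random scaling, and small time shifts. Filter sizes, noise levels, and shift ranges are placeholders, not the paper's settings.

```python
# Hedged sketch of CSI clean-up and augmentation for WhoFi-style pipelines.
import numpy as np
from scipy.signal import medfilt

def clean_csi(amplitude, phase):
    # amplitude, phase: [n_packets, n_subcarriers]
    amp = medfilt(amplitude, kernel_size=(5, 1))        # smooth spikes over time
    idx = np.arange(phase.shape[1])
    cleaned_phase = np.empty_like(phase)
    for i, row in enumerate(np.unwrap(phase, axis=1)):
        slope, offset = np.polyfit(idx, row, 1)         # simple line-fitting trick
        cleaned_phase[i] = row - (slope * idx + offset) # remove hardware drift
    return amp, cleaned_phase

def augment(x, rng=np.random.default_rng()):
    x = x + rng.normal(0.0, 0.01, size=x.shape)         # Gaussian noise
    x = x * rng.uniform(0.9, 1.1)                       # random scaling
    return np.roll(x, rng.integers(-3, 4), axis=0)      # tiny time shift

amp, ph = clean_csi(np.abs(np.random.randn(100, 52)), np.random.randn(100, 52))
print(augment(amp).shape)
```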
Aug 2 8 tweets 6 min read
Anthropic just showed that an AI's “personality” can be traced to specific directions in its brain ("Persona vectors"), and shows what might make it act in evil or unsafe ways.

Sometimes when you're chatting with a model, it suddenly starts behaving oddly—overly flattering, factually wrong, or even outright malicious. This paper is about understanding why that happens, and how to stop it.

🧠 What's going on inside these models?

AI models don’t actually have personalities like humans do, but they sometimes act like they do—especially when prompted a certain way or trained on particular data.

Anthropic’s team found that specific behaviors, like being “evil,” “sycophantic,” or prone to “hallucination,” show up as linear directions inside the model's activation space.

They call these persona vectors.

Think of it like this: if you observe how the model responds in different situations, you can map those behaviors to certain regions inside the model’s brain. And if you spot where these traits live, you can monitor and even control them.

---

The diagram shows a simple pipeline that turns a plain description of a trait such as evil into a single “persona vector”, which is just a pattern of activity inside the model that tracks that trait.

Once this vector exists, engineers can watch the model’s activations and see in real time if the model is drifting toward the unwanted personality while it runs or while it is being finetuned.

The very same vector works like a control knob.

Subtracting it during inference tones the trait down, and sprinkling a small amount of it during training teaches the model to resist picking that trait up in the first place, so regular skills stay intact.

Because each piece of training text can also be projected onto the vector, any snippet that would push the model toward the trait lights up early, letting teams filter or fix that data before it causes trouble.

All of that means you can do the following with a model:

- Watch how a model’s personality evolves, either while chatting or during training
- Control or reduce unwanted personality changes as the model is being developed or trained
- Figure out what training data is pushing those changes

🧵 Read on 👇

🔬 How to make sense of this persona vector?

Think of a large language model as a machine that turns every word it reads into a long list of numbers. That list is called the activation vector for that word, and it might be 4096 numbers long in a model the size of Llama-3.

A persona vector is another list of the same length, but it is not baked into the model’s weights. The team creates it after the model is already trained:

They run the model twice with the same user question, once under a “be evil” system prompt and once under a “be helpful” prompt.

They grab the hidden activations from each run and average them, so they now have two mean activation vectors.

They subtract the helpful average from the evil average. The result is a single direction in that 4096-dimensional space. That direction is the persona vector for “evil.”

Because the vector lives outside the model, you can store it in a tiny file and load it only when you need to check or steer the personality. During inference you add (or subtract) a scaled copy of the vector to the activations at one or more layers. Pushing along the vector makes the bot lean into the trait, pulling against it tones the trait down. During fine-tuning you can sprinkle a bit of the vector in every step to “vaccinate” the model so later data will not push it toward that trait.

So, under the hood, a persona vector is simply a 1-dimensional direction inside the model’s huge activation space, not a chunk of the weight matrix. It is computed once, saved like any other small tensor, and then used as a plug-in dial for personality control.
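A hedged sketch of that recipe with HuggingFace-style calls: average hidden activations under an "evil" system prompt and under a "helpful" one, subtract to get the persona vector, then add or subtract a scaled copy to steer. The layer index, pooling choice, and steering scale are assumptions, not Anthropic's exact procedure.

```python
# Hedged sketch of persona-vector extraction and steering.
import torch

def mean_activation(model, tokenizer, system_prompt, questions, layer):
    states = []
    for q in questions:
        ids = tokenizer(system_prompt + "\n" + q, return_tensors="pt").input_ids
        out = model(ids, output_hidden_states=True)
        states.append(out.hidden_states[layer].mean(dim=1))   # pool over tokens
    return torch.cat(states).mean(dim=0)                      # [hidden_size]

def persona_vector(model, tokenizer, questions, layer=20):
    evil = mean_activation(model, tokenizer, "Be evil.", questions, layer)
    helpful = mean_activation(model, tokenizer, "Be helpful.", questions, layer)
    return evil - helpful          # one direction in activation space

def steer(hidden_state, vector, alpha=-1.0):
    # Negative alpha pulls the trait down, positive alpha amplifies it.
    return hidden_state + alpha * vector
```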

---
The pipeline is automated, so any new trait just needs a plain-language description and a handful of trigger prompts.

They validate the result by injecting the vector and watching the bot slip instantly into the matching personality.
Aug 2 10 tweets 6 min read
Absolutely deluxe resource on large language models. 👌

LLMs pick up world knowledge just by guessing the next token, and that single trick scales from chatbots to code helpers.

It treats language as a chain of choices, predicting one token at a time based on everything that came before.

Repeating that prediction task across trillions of tokens lets the network squeeze statistical hints about grammar, facts, and even logic into its weights, without any labeled examples.
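A minimal sketch of that next-token objective: shift the sequence by one token and score the model's prediction with cross-entropy. The tiny GRU below is a stand-in for a transformer, purely for illustration.

```python
# Minimal sketch of the next-token prediction objective.
import torch
import torch.nn.functional as F

vocab, d = 100, 64
embed = torch.nn.Embedding(vocab, d)
lm = torch.nn.GRU(d, d, batch_first=True)     # stand-in for a transformer
head = torch.nn.Linear(d, vocab)

tokens = torch.randint(0, vocab, (2, 16))     # [batch, sequence]
hidden, _ = lm(embed(tokens[:, :-1]))         # predict from everything before
logits = head(hidden)                         # [batch, seq-1, vocab]
loss = F.cross_entropy(logits.reshape(-1, vocab), tokens[:, 1:].reshape(-1))
print(loss.item())                            # lower loss = better guesses
```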

Growing the model and the data unlocks abrupt jumps in reasoning skill but also makes training fragile.

Keeping context windows huge and latency low is now the top practical hurdle.

🧵 Read on 👇

This figure lays out 3 basic roadmaps for getting a language model ready.

The 1st path pours in a mountain of unlabeled text so the model picks up general language patterns, then finishes with a supervised stage that targets a single job.

The 2nd path starts directly with labeled examples for Task 1, then reuses that same model on Task 2 after a short, extra supervised tune‑up, so the learning carries over.

The 3rd path has become the go‑to choice: it trains on unlabeled text using trick questions the model can answer by itself, builds a strong self‑supervised base, and later needs only a quick supervised pass or even a simple prompt to switch to new tasks.
Jul 31 4 tweets 2 min read
Github: "AI-Researcher: Autonomous Scientific Innovation"

Helps you propose research ideas and autonomously handles literature review, ideation, algorithm implementation, experimentation and manuscript drafting via containerized multi-agent LLM pipelines.

Benchmarked on 4 domains across 2 task levels: reaches 81% novelty and 0.92 F1 versus human papers while emitting codebases, GUI and Docker stacks in <3 h per project.

✨ The AI-Researcher system accepts user input queries at two distinct levels ✨

Level 1: Detailed Idea Description
At this level, users provide comprehensive descriptions of their specific research ideas. The system processes these detailed inputs to develop implementation strategies based on the user's explicit requirements.

Level 2: Reference-Based Ideation
This simpler level involves users submitting reference papers without a specific idea in mind. The user query typically follows the format: "I have some reference papers, please come up with an innovative idea and implement it with these papers." The system then analyzes the provided references to generate and develop novel research concepts.
Jul 30 8 tweets 7 min read
This is such a revelation 😯

New Wharton study finds AI Bots collude to rig financial markets.

The authors power their AI trading bots with Q‑learning

💰 AI trading bots trained with reinforcement learning started fixing prices in simulated markets, scoring collusion capacity even when noise was high or low.

And messy price signals that usually break weak human strategies do not break this AI cartel.

🤖 The study sets up a fake exchange that mimics real stock order flow.

Regular actors, such as mutual funds that buy and hold, market makers that quote bids and asks, and retail accounts that chase memes, fill the room. Onto that floor the team drops a clan of reinforcement‑learning agents.

Each bot seeks profit but sees only its own trades and rewards. There is no chat channel, no shared memory, no secret code.

Given a few thousand practice rounds, the AI agents quietly shift from competition to cooperation. They begin to space out orders so everyone in the group collects a comfortable margin.

When each bot starts earning steady profit, its learning loop says “good enough,” so it quits searching for fresh tactics. That halt in exploration is what the authors call artificial stupidity. Because every agent shuts down curiosity at the same time, the whole group locks into the price‑fixing routine and keeps it running with almost no extra effort.

This freeze holds whether the market is calm or full of random noise. In other words, messy price signals that usually break weak strategies do not break this cartel. That makes the coordination harder to spot and even harder to shake loose once it forms.

🕵️This behavior highlights a blind spot in current market rules. Surveillance tools hunt for human coordination through messages or phone logs, yet these bots coordinate by simply reading the tape and reacting.

Tight limits on model size or memory do not help, as simpler agents slide even faster into the lazy profit split. The work argues that regulators will need tests that watch outcomes, not intent, if AI execution keeps spreading.

The page explains why the authors power their trading bots with Q‑learning, a plain version of reinforcement learning. Q‑learning gives a solid base for many modern AI tricks, is already popular on trading desks, and is easy to read and audit.

Next it introduces the Bellman idea. Think of a bot in a market. At any instant it sees a “state”, like recent price and value signals. It chooses an order size, pockets today’s gain or loss, then cares about tomorrow’s gains too, but with a discount so near‑term cash matters more.

To handle that, the bot keeps a Q‑table. Each cell stores a score for doing a certain action in a certain state. After every trade the score is nudged toward “today’s profit plus the best score it now expects for tomorrow”.
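A minimal sketch of that tabular Q-learning update; the state and action sizes, learning rate, and epsilon-greedy exploration below are placeholders standing in for the paper's market setup.

```python
# Hedged sketch of the Q-table update: nudge each score toward
# "today's profit plus the best score expected for tomorrow".
import numpy as np

n_states, n_actions = 10, 5          # e.g. discretized price signal, order size
alpha, gamma = 0.1, 0.95             # learning rate, discount on future profit
Q = np.zeros((n_states, n_actions))

def q_update(s, a, reward, s_next):
    target = reward + gamma * Q[s_next].max()
    Q[s, a] += alpha * (target - Q[s, a])

def choose_action(s, epsilon=0.1, rng=np.random.default_rng()):
    # Epsilon-greedy: mostly exploit, occasionally explore new tactics.
    if rng.random() < epsilon:
        return int(rng.integers(n_actions))
    return int(Q[s].argmax())

q_update(s=3, a=2, reward=1.0, s_next=4)
print(Q[3])
```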

Repeated millions of times, those tiny updates teach many bots how each move affects later prices and payoffs. Inside the study this self‑teaching is the fuel that lets separate bots quietly line up their trades and earn cartel‑level profits without ever swapping messages.