Latest Twitter Threads by @bookwormengr on Thread Reader App

Jan 9 • 6 tweets • 5 min read

AI talent density by global metro areas - mega thread, bookmark.
==================

Demographics is destiny; compute alone is overrated in the Age of Research (though @SemiAnalysis_ may not agree).

In this thread, let us analyse the demographic distribution of AI talent globally. The stats are shocking.

- China exceeds the USA. Tiny Singapore matches all of Europe (no wonder major labs are opening bases in Singapore).
- The Beijing area has the highest talent density in the world.
- Beijing Haidian district >> SF Cerebral Valley (there are more tier 1 labs in this area than in all of San Francisco: Moonshot, MiniMax, Zhipu AI, ByteDance SEED and many, many others...).
- China has 3 metro areas with research output comparable to the entire Bay Area (more than 10% of the global total each). Each of these has a high concentration of robotics firms as well.
- The USA has only one major research cluster contributing more than 10% of AI research: of course, the San Francisco Bay Area.

You may say that US labs like OpenAI and Anthropic don't publish, and that is why this is the case.

But do you really think China-based labs like DeepSeek, MiniMax, Moonshot and Z AI, with 400+ staff, publish as much as they could? How many papers have you seen from Chinese robotics firms?
They publish maybe 5-10 papers a year, far below the number of experiments they conduct.

GCR has other corporate labs that publish more, like Alibaba, ByteDance SEED and Tencent. There are more labs on the block (Xiaomi, Meituan etc.), but they are balanced by American ones like Google, Microsoft, Amazon, Salesforce, Nvidia etc.

Most of the difference is made by the strong research culture at Chinese universities, as well as at emerging Asian universities like NUS, NTU, KAIST etc.

What follows are region specific maps showing distribution of talent.

(Source: the AI talent density maps are produced on the basis of influential papers published. NeurIPS selection is taken as the proxy.)
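For anyone who wants to redo this kind of count, here is a minimal sketch of how a "share of global output by metro" number can be computed. The metro labels and paper counts are made-up placeholders, not the data behind the maps, and assigning each paper to a single metro is a simplifying assumption.

```python
from collections import Counter

def metro_shares(paper_metros):
    """Return each metro's fraction of all accepted papers.

    paper_metros: one metro label per accepted paper (assumption:
    every paper is assigned to exactly one metro area).
    """
    counts = Counter(paper_metros)
    total = sum(counts.values())
    return {metro: count / total for metro, count in counts.items()}

# Placeholder data, NOT real counts: 10 papers across 3 hypothetical metros.
papers = ["metro_a"] * 6 + ["metro_b"] * 3 + ["metro_c"] * 1
shares = metro_shares(papers)

# Metros contributing strictly more than 10% of the total.
major = sorted(m for m, share in shares.items() if share > 0.10)
print(shares)
print(major)
```

With real data, the 10% threshold is the one used in this thread to call a metro a major research cluster.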

@shaunrein @teortaxesTex @bgurley @chamath @DavidSacks @MohapatraHemant @natolambert @Scobleizer @ClementDelangue @aakrit @svembu @balajis @naval @rohanpaul_ai @SemiAnalysis_ @deedydas @adityaag @pmarca @elonmusk @dwarkesh_sp

China AI talent density as fraction of global by metro regions:

The USA has only one region with more than 10% of total AI research output (SF Bay Area). China has 3.

The Beijing area leads the world, with Tsinghua University, Peking University, University of Chinese Academy of Sciences, Beihang University and Beijing Institute of Technology all showing impressive research output. Tsinghua and Peking University beat any other university in the world.

The Beijing area is also home to the likes of Moonshot, Zhipu AI, MiniMax, ByteDance SEED and many others.

Most of these universities and labs are located in the north-west part of the city, in Haidian district (the area where the Summer Palace is).

That is why I say, Beijing Haidian >> SF Cerebral Valley

Time to stop looking down on China, maybe?

May 31, 2025 • 5 tweets • 2 min read

Mind blown by Perplexity Pro Lab 🧵

It prepares Deep Research reports with diagrams, charts, graphs.

Prompt:

- What type of HBM attached on MI300X GPU from AMD?
- How much memory and memory layer it has? Is it 8, 12, 16 layers?
- Compare it with H100 from Nvidia

Attaching a screenshot of the marvellous report I got.

If you are not firing at least 3 queries on it per day, NGMI!

@Jukanlosreve @rwang07 @teortaxesTex @AravSrinivas

It gave me back everything I wanted and so much more, without my asking for it. Look at the beautiful comparison table it prepared.

I love Deep Research (OpenAI, Gemini etc.), but the Perplexity one is far more readable in this new format.

Feb 7, 2025 • 4 tweets • 5 min read

India's 🇮🇳 path to AGI: #1
====================
Arjun vs Eklavya 🏹

In this series I will publish notes on building AGI in India. We will discuss technology, but in a manner that most will understand.

In the series, I will cover the steps to achieve 1) reasoning models, 2) adding vision, voice, lidar, motion capabilities, 3) learning on the fly, memory, 4) embodied AI, 5) computational efficiency, 6) semiconductors and power needed, 7) GPU & ASIC architectures, 8) supply chain and bottlenecks, 9) control & ethics.

I promise, you will learn a lot 💕.

Today we will talk about step 1, i.e. building reasoning models.

There are two approaches for teaching models how to reason:
1. Imitation Learning (technical name SFT): here a small model looks at the big model's reasoning process and learns from it. Let us call it Arjuna mode.
2. Self-play Learning (technical name RL): here a model learns through a large amount of trial and error. Let us call it Eklavya mode.

Imitation Learning - Arjuna mode:
-----------------------------------------
First, why does imitation learning from large reasoning models like DeepSeek R1 (671 billion parameters) work?

It is because it teaches smaller models how to
1/ search for an answer,
2/ verify it, and
3/ backtrack on finding a wrong answer or making a mistake.

The hardest part for LLMs is to verify their own answers objectively and change the trajectory if mistakes were made.
Imitation learning from R1 helps smaller LLMs to do that (recall that you can see R1 doing exactly this in its output).

Imitation learning, however, requires a high quality dataset to show how the "masters" like DeepSeek R1 reason to arrive at a solution.

There is such a dataset available.
Thanks to DeepSeek for permitting it (MIT License) and thanks to Prime Intellect @PrimeIntellect for publishing the 1 million plus reasoning trajectories from DeepSeek R1.

Indian teams can get started with this dataset. As I show in the next tweet, even 1K reasoning trajectories are sufficient if certain conditions are met.

You can also do Self Play (RL) - Eklavya mode - with this dataset, as it has the verifiers needed for running GRPO with RLVR (reinforcement learning with verifiable rewards). Do not worry, we shall cover it tomorrow 🙂
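Until then, here is a minimal sketch of the core idea of GRPO with a verifier: sample a group of answers for the same question, score each one with a verifier, and rate every answer relative to its own group. The `verify` function, the sample answers and the use of the population standard deviation are illustrative assumptions, not the dataset's actual verifier or any lab's training code.

```python
from statistics import mean, pstdev

def verify(answer: str, reference: str) -> float:
    """Hypothetical rule-based verifier: 1.0 for an exact match, else 0.0."""
    return 1.0 if answer.strip() == reference.strip() else 0.0

def group_relative_advantages(rewards, eps=1e-6):
    """Normalise rewards within one group of samples for the same prompt.

    Each advantage is (reward - group mean) / (group std + eps), so answers
    better than the group average get a positive advantage.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; eps avoids division by zero
    return [(r - mu) / (sigma + eps) for r in rewards]

# Four sampled answers to one question whose verified answer is "42".
samples = ["42", "41", "42 ", "7"]
rewards = [verify(s, "42") for s in samples]
advantages = group_relative_advantages(rewards)

for s, r, a in zip(samples, rewards, advantages):
    print(repr(s), r, round(a, 3))
```

In a real training loop these advantages would weight the policy-gradient update for each sampled answer; that part is left out here.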

We have mavericks @bhash, @paraschopra who are going to surprise us.🦾

@pranavmistry & team is already further along 👏.

@svembu, @vikramchandra, @AbhijitChavda, @AshwiniVaishnaw, @Iyervval, @HindolSengupta @TVMohandasPai, @ThePrintIndia, @ShekharGupta

If you want others to know about this series, please retweet it, and follow me.

Let us spread the knowledge broadly, and let us build AGI in India.

India will not stay behind in this industrial revolution.

primeintellect.ai/blog/synthetic…

India's 🇮🇳 path to AGI: #1 continued
====================
Nature versus Nurture and
Less is More for Reasoning:

Contrary to current thinking, the main idea of this research paper is that reasoning does not need to be taught; it can be unlocked in a strong base model.
That is, nature matters more than nurture.

Advantage:
1. They needed just 1000 examples, and training can be completed for less than $30 (India, take note).

Caveats:
1. This only works if your base model is strong, i.e. it has seen a lot of mathematical or reasoning content in what is known as "pre-training". That is why model A gets so much better than model B below. But pre-training on mathematical content is quite easy, so this should not be hard.
2. You need to be careful about choosing those 1000 training examples that will unlock the reasoning. It is a lot of work, but not that expensive.

A. Qwen2.5-32B-Instruct achieved 57.1% on AIME2024
B. Qwen1.5-32B-Chat only achieved 10.0% on AIME2024

(AIME 2024 is a set of hard problems from an American Maths competition).

arxiv.org/pdf/2502.03387

Mar 28, 2020 • 5 tweets • 3 min read

@ReSt_AsSuReD2 Nope, there is a sound model behind it.

Both models below result in similar total mortality initially and in flooding of hospitals:

1) Medium R0 (2) and high mortality rate (1%) - used by governments

2) High R0 (3 or higher) and very low mortality (0.01%)
1/n

@ReSt_AsSuReD2 However, the latter model (which is now being backed by prominent experts, e.g. Oxford) is more likely to be true (as it accounts for asymptomatic carriers) and predicts that the virus will soon burn out (run out of people to infect).

While USA has close to 100k patients 2/n
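To play with the two scenarios above, here is a minimal sketch using a basic discrete-time SIR model. The 10-day infectious period, the seed fraction and the absence of any delay between infection and death are assumptions of this sketch, not parameters from the models cited in the thread, and it does not model when each epidemic was seeded, which matters a lot for comparing early mortality.

```python
def simulate_sir(r0, ifr, days=1000, infectious_days=10.0, seed_fraction=1e-6):
    """Run a daily-step SIR model on a population normalised to 1.

    r0: basic reproduction number; ifr: infection fatality rate (fraction).
    Returns (final attack rate, list of cumulative deaths per day as a
    fraction of the population).
    """
    gamma = 1.0 / infectious_days   # recovery rate per day (assumed)
    beta = r0 * gamma               # transmission rate per day
    s, i = 1.0 - seed_fraction, seed_fraction
    cumulative_infected = seed_fraction
    deaths_by_day = []
    for _ in range(days):
        new_infections = beta * s * i
        recoveries = gamma * i
        s -= new_infections
        i += new_infections - recoveries
        cumulative_infected += new_infections
        # Simplification: deaths are counted at the moment of infection.
        deaths_by_day.append(ifr * cumulative_infected)
    return cumulative_infected, deaths_by_day

# Scenario 1: medium R0 (2), mortality 1%. Scenario 2: high R0 (3), mortality 0.01%.
attack_1, deaths_1 = simulate_sir(r0=2.0, ifr=0.01)
attack_2, deaths_2 = simulate_sir(r0=3.0, ifr=0.0001)

print("share ever infected:", round(attack_1, 3), round(attack_2, 3))
print("final deaths per capita:", deaths_1[-1], deaths_2[-1])
```

In both runs the outbreak ends on its own once enough people have been infected, which is the "burn out" effect mentioned above; the higher-R0 run infects a larger share of the population before that happens.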
