Rohan Pandey
descending cross-entropy to ascend entropy @PeriodicLabs || prev research @OpenAI @CarnegieMellon '23
Jul 27, 2025 5 tweets 2 min read
A thread of my favorite Thomas Daniell aquatints from "Oriental Scenery: 24 Views in Hindoostan", etched during his tour of India between 1795 and 1807

1. Daśāśvamedha Ghāṭ, on the Ganges in Vārāṇasī (May 1796)
2. Hindu Temples at Agori, on the Sone River in Bihar (Sept 1796)
Jun 29, 2025 8 tweets 2 min read
Recently had a good chat with @tamaybes. He thinks we aren’t yet in the GPT-3 era of RL and as it scales, cross-task OOD generalization will emerge.

It’s difficult to empirically study this at current scale, but let’s take it as true: what does this mean for custom RL plays? 🧵

RL is attractive for solving domain-specific tasks because all you need is a small problem set and a verifier.

OpenAI’s RFT team, Thinking Machines, and Applied Compute are all pitching customers on RL as a service.

So what happens if o5 zero-shots all customer tasks?
May 22, 2025 5 tweets 2 min read
Looking for people to help build 3 fun RL environments for Sanskrit:
- metrical/chandas poetry composition
- surface morphology/Paninian rendering
- literature source retrieval

These can all be verified with OSS packages and implemented in <2 days. Short 🧵 on the project ideas!

Sanskrit poetry is metrical: each meter/chandas has syllabic constraints. LLMs often struggle to use meter correctly.

Write a verifier using Chandas, generate a dataset of poem topic + meter requests, and use an LLM grader if reward hacking occurs.
github.com/sanskrit-coder…
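A minimal sketch of what the reward side of such a meter environment could look like. It assumes the syllable weights of each line have already been computed by a scansion tool (the Chandas package's own API is not shown here), and the names `line_score`, `meter_reward`, and `INDRAVAJRA` are hypothetical. Indravajrā is used only as an example of a fixed 11-syllable meter.

```python
# Hypothetical reward for a metrical-composition RL environment.
# Input: one weight string per line of verse, e.g. "GGLGGLLGLGG",
# where G = guru (heavy syllable) and L = laghu (light syllable).
# Producing these strings from Sanskrit text is left to a scansion tool.

INDRAVAJRA = "GGLGGLLGLGG"  # example target pattern (11 syllables per line)


def line_score(weights: str, pattern: str) -> float:
    """Fraction of positions that match; 0.0 if the syllable count differs."""
    if len(weights) != len(pattern):
        return 0.0
    matches = sum(w == p for w, p in zip(weights, pattern))
    return matches / len(pattern)


def meter_reward(lines: list[str], pattern: str = INDRAVAJRA) -> float:
    """Mean line score over the verse; an empty verse scores 0.0.

    Simplification: the line-final syllable is matched strictly, although
    traditional prosody often allows it to count as heavy.
    """
    if not lines:
        return 0.0
    return sum(line_score(line, pattern) for line in lines) / len(lines)
```

A partial-credit score like this is one possible design; a strict pass/fail verifier would simply return 1.0 only when every line matches exactly.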
Jan 22, 2025 11 tweets 4 min read
Deciphering the Indus Valley script would revolutionize our understanding of Indian history. Recently, @yajnadevam's Sanskrit hypothesis has gained steam & many asked me: is it legit?

So I spent the last couple hours analyzing his decipherment with o3, and here's what I found 🧵

For background, his scheme is a straightforward mapping from IVC symbols to Sanskrit sounds, but note that it ignores:
- aspiration (k vs kh)
- retroflection (त vs ट)
- sibilant place of articulation (स vs श vs ष)

These differences are critical in Sanskrit (especially Vedic).
Jan 13, 2025 12 tweets 6 min read
The Indus Valley seals' most common motif—the unicorn—is always found flanked by a mysterious object that has drawn far less scholarship.

What is this lamp-looking item? And can Vedic literature tell us anything about the Indus Valley Civilization's supposedly lost religion? 🧵

First, some context: academic consensus has traditionally been that the IVC had *no* continuity with subsequent Indian civilization: they flourished, collapsed, and civilization restarted once the Aryans arrived.

This view is crumbling in light of recent archaeological evidence.
May 28, 2024 10 tweets 5 min read
📢 Excited to finally be releasing my NeurIPS 2024 submission!

Is Chinchilla universal? No! We find that:
1. language model scaling laws depend on data complexity
2. gzip effectively predicts scaling properties from training data

As compressibility 📉, data preference 📈.
🧵⬇️
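For readers who want to try the idea, here is a small sketch of measuring how compressible a text sample is with gzip. It assumes compressibility is taken as compressed size divided by raw size; the exact measure and preprocessing used in the paper may differ, and `gzip_compressibility` is a hypothetical helper name.

```python
import gzip


def gzip_compressibility(text: str) -> float:
    """Return compressed size / raw size in bytes.

    Lower values mean the text compresses well (more redundancy);
    values near or above 1.0 mean gzip found little to exploit.
    """
    raw = text.encode("utf-8")
    if not raw:
        raise ValueError("cannot measure compressibility of empty text")
    return len(gzip.compress(raw)) / len(raw)


if __name__ == "__main__":
    repetitive = "ab" * 5000
    print(f"repetitive sample: {gzip_compressibility(repetitive):.4f}")
```

Comparing this ratio across candidate training corpora gives a rough, cheap proxy for data complexity before running any scaling experiments.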
Chinchilla claims their 1-to-1 parameter-data scaling law is agnostic to the type of textual training data used 🤨

But @ArmenAgha @AIatMeta find code-gen scaling prefers parameters 😳

@deepseek_ai team further noticed that scaling with cleaner data also prefers parameters 🤔
May 24, 2024 16 tweets 6 min read
Sacred fire for Vedic ritual is chiefly produced by Agnimanthana (fire-churning), outlined in the Śatapatha Brāhmaṇa.

But one specific chip of wood it requires has puzzled scholars for the last 800+ years.

Here's how I solved it with some wilderness survival sleuthing 🧵⬇️

At the center of Vedic Hinduism lies yajña, the fire sacrifice.

Correct performance of a yajña requires strict adherence to instructions provided in the Saṃhitā & Brāhmaṇa texts of the Veda, composed over 3000 years ago.

And the fire used in yajña must be of sacred origin.
Nov 11, 2023 5 tweets 2 min read
Introducing Tarsier 🙈, an open-source Python library that enables web interaction with multimodal LLMs like GPT-4! Here’s a demo of a Tarsier agent navigating through Google to watch the OpenAI DevDay announcement:

Tarsier provides two fundamental utilities:
1. ability to tag interactable elements with a unique id.
This helps the LLM understand which elements it can act on, and it also provides a mapping from the LLM's choice back to the underlying element.
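The tagging idea can be sketched in a few lines of plain Python. This is not Tarsier's actual API: `Element` and `tag_elements` are hypothetical names, and the elements are assumed to have already been extracted from the page.

```python
from dataclasses import dataclass


@dataclass
class Element:
    """An interactable page element (hypothetical, simplified)."""
    xpath: str   # how to locate the element on the page
    label: str   # human-readable description shown to the LLM


def tag_elements(elements: list[Element]) -> tuple[str, dict[int, Element]]:
    """Give each element a unique id.

    Returns the tagged text to place in the LLM prompt and an
    id -> element map for resolving the LLM's choice afterwards.
    """
    id_to_element = dict(enumerate(elements))
    tagged = "\n".join(f"[{i}] {el.label}" for i, el in id_to_element.items())
    return tagged, id_to_element


if __name__ == "__main__":
    prompt_text, mapping = tag_elements([
        Element("//input[@name='q']", "Search box"),
        Element("//button[1]", "Search"),
    ])
    print(prompt_text)
    # If the LLM answers "click 1", the id resolves to a concrete element:
    print(mapping[1].xpath)
```

Keeping the id-to-element map outside the prompt means the LLM only ever has to output a small integer, which the agent then translates into a real page action.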