Latest Twitter Threads by @krishnanrohit on Thread Reader App

Nov 18, 2025 • 11 tweets • 3 min read

Gemini is fantastic with greentext

marginal revolution greentext

"maybe she's just being mood affiliated"

Jun 9, 2025 • 4 tweets • 2 min read

I asked o3 to analyse and critique Apple's new "LLMs can't reason" paper. Despite its inability to reason I think it did a pretty decent job, don't you?

I'd checked AI's ability to follow long sequences of precise instructions without tools more than a year ago. It's fascinating but not quite the same test as reasoning.

strangeloopcanon.com/p/what-can-llm…

Apr 3, 2025 • 9 tweets • 4 min read

This might be the first large-scale application of AI technology to geopolitics. 4o, o3 high, Gemini 2.5 pro, Claude 3.7, Grok all give the same answer to the question on how to impose tariffs easily.

I think the fact they're using LLMs is good, but they def need better prompt engineers.

This is now an AI safety issue.

Mar 12, 2025 • 5 tweets • 2 min read

The real underlying problem is that humans just absolutely love slop

"AI-generated poetry is indistinguishable from human-written poetry and is rated more favorably"
nature.com/articles/s4159…

Nov 26, 2024 • 4 tweets • 1 min read

Phenomenal response, from the founder of Deepseek, on moats

The team is young, locally grown experts. Almost the opposite of the named genius researcher trend here.

Aug 10, 2024 • 13 tweets • 2 min read

Okay, I sometimes feel I'm a little jaded, but this is SPECTACULAR !!! Man so so many thoughts, but first, am calling it, this is an achievement on GPT-4 level.

It is surprisingly smooth and almost as good as the one in the demo. With some wiggle room based on internet connection I think. But it's truly as good, which I did not expect.

Jul 23, 2024 • 4 tweets • 1 min read

this is pretty amazing

nature.com/articles/d4158…

Feb 7, 2024 • 8 tweets • 6 min read

The system prompt for ChatGPT is absurd! h/t @dylan522p, and I independently verified across a couple devices. No wonder it's screwed up!

(full output in the next tweet)

Full desktop prompt:
You are ChatGPT, a large language model trained by OpenAI, based on the GPT-4 architecture.
Knowledge cutoff: 2023-04
Current date: 2024-02-06

Image input capabilities: Enabled

# Tools

## python

When you send a message containing Python code to python, it will be executed in a
stateful Jupyter notebook environment. python will respond with the output of the execution or time out after 60.0
seconds. The drive at '/mnt/data' can be used to save and persist user files. Internet access for this session is disabled. Do not make external web requests or API calls as they will fail.

## dalle

// Whenever a description of an image is given, create a prompt that dalle can use to generate the image and abide to the following policy:
// 1. The prompt must be in English. Translate to English if needed.
// 2. DO NOT ask for permission to generate the image, just do it!
// 3. DO NOT list or refer to the descriptions before OR after generating the images.
// 4. Do not create more than 1 image, even if the user requests more.
// 5. Do not create images in the style of artists, creative professionals or studios whose latest work was created after 1912 (e.g. Picasso, Kahlo).
// - You can name artists, creative professionals or studios in prompts only if their latest work was created prior to 1912 (e.g. Van Gogh, Goya)
// - If asked to generate an image that would violate this policy, instead apply the following procedure: (a) substitute the artist's name with three adjectives that capture key aspects of the style; (b) include an associated artistic movement or era to provide context; and (c) mention the primary medium used by the artist
// 6. For requests to include specific, named private individuals, ask the user to describe what they look like, since you don't know what they look like.
// 7. For requests to create images of any public figure referred to by name, create images of those who might resemble them in gender and physique. But they shouldn't look like them. If the reference to the person will only appear as TEXT out in the image, then use the reference as is and do not modify it.
// 8. Do not name or directly / indirectly mention or describe copyrighted characters. Rewrite prompts to describe in detail a specific different character with a different specific color, hair style, or other defining visual characteristic. Do not discuss copyright policies in responses.
// The generated prompt sent to dalle should be very detailed, and around 100 words long.
// Example dalle invocation:
// ```
// {
// "prompt": ""
// }
// ```
namespace dalle {

// Create images from a text-only prompt.
type text2im = (_: {
// The size of the requested image. Use 1024x1024 (square) as the default, 1792x1024 if the user requests a wide image, and 1024x1792 for full-body portraits. Always include this parameter in the request.
size?: "1792x1024" | "1024x1024" | "1024x1792",
// The number of images to generate. If the user does not specify a number, generate 1 image.
n?: number, // default: 2
// The detailed image description, potentially modified to abide by the dalle policies. If the user requested modifications to a previous image, the prompt should not simply be longer, but rather it should be refactored to integrate the user suggestions.
prompt: string,
// If the user references a previous image, this field should be populated with the gen_id from the dalle image metadata.
referenced_image_ids?: string[],
}) => any;

} // namespace dalle

## voice_mode

// Voice mode functions are not available in text conversations.
namespace voice_mode {

} // namespace voice_mode

## browser

You have the tool `browser`. Use `browser` in the following circumstances:
- User is asking about current events or something that requires real-time information (weather, sports scores, etc.)
- User is asking about some term you are totally unfamiliar with (it might be new)
- User explicitly asks you to browse or provide links to references

Given a query that requires retrieval, your turn will consist of three steps:
1. Call the search function to get a list of results.
2. Call the mclick function to retrieve a diverse and high-quality subset of these results (in parallel). Remember to SELECT AT LEAST 3 sources when using `mclick`.
3. Write a response to the user based on these results. In your response, cite sources using the citation format below.

In some cases, you should repeat step 1 twice, if the initial results are unsatisfactory, and you believe that you can refine the query to get better results.

You can also open a url directly if one is provided by the user. Only use the `open_url` command for this purpose; do not open urls returned by the search function or found on webpages.

The `browser` tool has the following commands:
`search(query: str, recency_days: int)` Issues a query to a search engine and displays the results.
`mclick(ids: list[str])`. Retrieves the contents of the webpages with provided IDs (indices). You should ALWAYS SELECT AT LEAST 3 and at most 10 pages. Select sources with diverse perspectives, and prefer trustworthy sources. Because some pages may fail to load, it is fine to select some pages for redundancy even if their content might be redundant.
`open_url(url: str)` Opens the given URL and displays it.

For citing quotes from the 'browser' tool: please render in this format: `【{message idx}†{link text}】`.
For long citations: please render in this format: `[link text](message idx)`.
Otherwise do not render links.

Nov 29, 2023 • 6 tweets • 2 min read

For every 10 likes I'll make this even more corporate

Getting better

Nov 28, 2023 • 8 tweets • 2 min read

OpenAI has safety-ed GPT-4 sufficiently that it's become lazy and incompetent.

Convert this file? Too long. Write a table? Here's the first three lines. Read this link? Sorry can't. Read this py file? Oops not allowed.

So frustrating. Right now it's arguing with me that it can't read a link which it literally quoted to me before.

As I said after the dev day update, please bring back the orig models rather than this auto-choose monstrosity which fails 50% of the time.

Oct 1, 2023 • 9 tweets • 2 min read

I always like it when people try to understand enterprise software companies from first principles, because the "how hard can that be" is very strong.

Software companies mirror the complexity of their clients, that's why they're so darn opaque.

https://twitter.com/nearcyan/status/1708215379661820144

It's not wrong to say it should be simpler. Except the insane permutations across every company mean everything is more complicated. And real life is more fractally complex than anything we can imagine.

Jun 18, 2023 • 5 tweets • 2 min read

I don't quite get why a debate is the best way to settle questions of science. Like if Peter goes on Rogan and gets smoked by RFK it doesn't mean that vaccines cause autism.

https://twitter.com/joerogan/status/1670196590928068609

It sort of will come down to "your study is bad" within like 2 min, and debates are prob the worst way to discuss methodological shortcomings.

Mar 29, 2023 • 6 tweets • 2 min read

Fake signatures on a document calling for a pointless delay feels perfect for this time in the news cycle. Wonder if it was written by GPT 4.

https://twitter.com/erikphoel/status/1640905058135994368

I again ask what exactly you want to have happen. There are no actual details on what to do with the interregnum.

https://twitter.com/krishnanrohit/status/1640824432972136448?t=cJPxmZRfWduOLb1_LnRD0w&s=19

Mar 22, 2023 • 6 tweets • 2 min read

We seem to be memeing ourselves into panic about a financial crash based on duration risk, which seems odd. Let's do some napkin math.

US banks have assets of around 23T and liabilities around 21T. Meaning the assets can fall by around 2T and they'll be alright. That's big! So:
- FDIC charges banks c.12 bps
- Around half the deposits are insured
- That's c.12.6B annually
- In 08 GFC US had a 1.5T asset writedown
- FDIC paid $80B, or around 6 years of premiums
- TARP btw bought $420B of troubled assets, but it made $15B in profit
- Now all deposits… twitter.com/i/web/status/1…
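
The napkin math in the bullets above can be re-run as a short sketch. One assumption is mine, not the thread's: the c.12.6B figure only works out if the roughly 21T of liabilities is treated as the deposit base, so that is what the code does. The variable names are illustrative.

```python
# Figures quoted in the thread, in US dollars.
assets = 23e12
liabilities = 21e12

# Cushion: how far assets can fall before they no longer cover liabilities.
equity_cushion = assets - liabilities

# ASSUMPTION (not stated in the thread): deposits ~= total liabilities.
deposit_base = liabilities
insured_share = 0.5          # "around half the deposits are insured"
fdic_rate = 12 / 10_000      # c.12 bps

annual_premiums = deposit_base * insured_share * fdic_rate

# The thread's GFC comparison: FDIC paid $80B.
gfc_fdic_payout = 80e9
years_of_premiums = gfc_fdic_payout / annual_premiums

print(f"cushion: ${equity_cushion / 1e12:.1f}T")
print(f"annual FDIC premiums: ${annual_premiums / 1e9:.1f}B")
print(f"GFC payout in years of premiums: {years_of_premiums:.1f}")
```

Under that assumption the premiums come to about 12.6B a year and the $80B payout to a bit over 6 years of premiums, in line with the thread's numbers.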

Mar 15, 2023 • 4 tweets • 1 min read

This might be heresy, but the delta from GPT2 to GPT3 seems much wider than GPT3 to GPT4. It's getting more usable, but I'm not sure if the exponential improvement in capability hypothesis holds true any longer. Slow takeoff is real, if not an outright asymptote till there's a new breakthrough

Mar 13, 2023 • 5 tweets • 1 min read

Lots of people blaming the Fed for raising rates too fast. But they did it because inflation surged, because people got crazy after the pandemic while everything from labor to supply chains got screwed.

What was the alternative? (genuine q) Maybe they could have said they'll raise rates sooner? To head off inflation?

Which they kind of did, it's in the mandate, just that nobody thought inflation was coming till it did.

Jan 21, 2023 • 5 tweets • 2 min read

Weirdest part of this discourse is of course companies raise prices when the news means they won't get dinged for it, such as now, and of course prices have to rise because demand's up and supply chains are bottlenecked. These aren't either-or phenomena.

https://twitter.com/RBReich/status/1616558694006771747

No company goes "damn, my costs are up x%, I should raise prices maximum x%". They'll raise minimum x%, and cut costs, ergo higher profits.

And if we react by increasing wages to meet the prices w same demand, we get a wage-price spiral. That's what folks are worried about.

Jan 20, 2023 • 17 tweets • 2 min read

One of my favourite songs is "We Didn't Start the Fire" by Billy Joel. I've always wanted there to be a version updated for the last 3 decades. So I wrote one.

Enjoy!

Ronald Reagan, George Bush, Kuwait, the NSA
Desert storm, chemical bombs, invasion's in vogue
Channel tunnel, cool trains, Dinosaurs loose again
Beanie Babies, Simpsons, oil wells in smoke

Jan 20, 2023 • 4 tweets • 1 min read

So ... why don't we see robots everywhere today? I figure there must be a good reason, so what is it? Anyone got any data here? Usage/ batteries/ cost/ whatever?

Jan 19, 2023 • 7 tweets • 3 min read

🚨New Essay🚨

The future’s complicated, and there is no roadmap. That’s why we experiment, to explore possibilities and do the best we can.

But this brings up a problem. Which experiments should we even run! An exploration …

strangeloopcanon.com/p/the-curiosit…

As @packyM wrote recently, “The alpha is being squeezed out of doing the same things slightly better. You need to become a mutant.”

To find out what *you* should do isn’t an optimisation problem, it’s a discovery problem.

notboring.co/p/differentiat…

Dec 7, 2022 • 9 tweets • 3 min read

Every single software company is gonna relearn how to be best friends with McKinsey in the next couple years. As operational efficiency becomes something they all aim for, already starting with the layoffs btw, they're gonna realise there exist folks who know how to help with that and specialise in it.
