Stanislas Polu
_co-founder+engineer(https://t.co/fCirsLjeo2), _alumni(https://t.co/8jAnpFAkp1, https://t.co/e99AaHzlA0, https://t.co/4jg6knqi2S, https://t.co/kXE6PNf8xH)
Jan 31, 2023 7 tweets 1 min read
A thread on what I believe are the 2 hottest AI/LLM research questions for 2023 🔥

I hear left, right, and center, and believe myself, that there is a super strong need in the market right now for a Chinchilla-style, RETRO-style model.

First, Chinchilla-style, because we want davinci-003 performance at 70B parameters (which is roughly what saturates 1 GPU at inference).
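The "70B parameters on 1 GPU" sizing can be sanity-checked with back-of-the-envelope arithmetic. The sketch below is only an illustration and not something from the thread: it counts weight memory alone (no activations or KV cache), and the 80 GB GPU capacity and the byte widths per parameter are assumptions.

```python
# Back-of-the-envelope weight memory for a dense model.
# Assumptions (not from the thread): an 80 GB accelerator, weights only
# (no activations or KV cache), and 1 GB = 1e9 bytes.

GPU_MEMORY_GB = 80  # assumed single-GPU capacity

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(n_params: float, precision: str) -> float:
    """Return the memory needed to hold the weights, in GB."""
    return n_params * BYTES_PER_PARAM[precision] / 1e9


def fits_on_one_gpu(n_params: float, precision: str) -> bool:
    """True if the weights alone fit in the assumed GPU memory."""
    return weight_memory_gb(n_params, precision) <= GPU_MEMORY_GB


if __name__ == "__main__":
    for precision in BYTES_PER_PARAM:
        gb = weight_memory_gb(70e9, precision)
        fits = fits_on_one_gpu(70e9, precision)
        print(f"70B @ {precision}: {gb:.0f} GB of weights, fits on one GPU: {fits}")
```

Under these assumptions, 70B parameters take about 140 GB of weights in fp16 and about 70 GB in int8, so a single 80 GB GPU is roughly saturated only with 8-bit weights.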
Dec 13, 2022 6 tweets 3 min read
For the past couple of weeks I've been experimenting with a new way to interact with LLMs: a GPT-based assistant that has access to the content of my browser tabs.

It's called XP1 🧵

... and it's now available here: producthunt.com/posts/xp1

It lets you submit queries to a prompted `text-davinci-003` assistant and, while doing so, lets you search and select some of your tabs to inject their content into the context of the assistant.
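The tab-injection idea can be sketched as plain prompt assembly. This is a minimal illustration, not XP1's actual implementation: the `Tab` type, the `build_prompt` function, the prompt layout, and the per-tab character budget are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Tab:
    """Hypothetical stand-in for a browser tab the user selected."""
    title: str
    url: str
    text: str


def build_prompt(query: str, tabs: List[Tab], max_chars_per_tab: int = 2000) -> str:
    """Assemble a completion prompt with the selected tab content ahead of the query.

    Each tab's text is truncated to a character budget (an assumed, crude
    proxy for a token limit) so the prompt stays within the context window.
    """
    sections = []
    for tab in tabs:
        excerpt = tab.text[:max_chars_per_tab]
        sections.append(f"TAB: {tab.title} ({tab.url})\n{excerpt}")
    context = "\n\n".join(sections)
    return (
        "You are an assistant with access to the user's selected browser tabs.\n\n"
        f"{context}\n\n"
        f"QUERY: {query}\nANSWER:"
    )
```

The returned string would then be sent as the prompt of a completion request. Truncating by characters is the simplest possible budget; a real tool would more likely count tokens.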
Dec 12, 2022 10 tweets 2 min read
A list of predictions for 2023 for the field of LLMs 🧵

There will be an open-source Chinchilla-style LLM released this year at the level of text-davinci-*. Maybe not from the ones we expect 🤔 This will obliterate ChatGPT usage and enable various types of fine-tuning / soft-prompting and cost/speed improvements.
Nov 12, 2022 6 tweets 2 min read
Weekend side-project: try to reproduce @chillzaza_'s Toolbot demo with a Dust app.

Got it to work pretty well (with a bonus tool, "question to SQL"): dust.tt/spolu/a/b39f8e…

Here's how I went about it 👇🧵

I started by crafting few-shot examples to teach the model to generate few-shot examples for a new tool "instruction". Here's the dataset: dust.tt/spolu/a/b39f8e…
Mar 9, 2022 8 tweets 2 min read
Request for (AI) Startups

A 2022 list of AI-related startup ideas (I would use or see myself working on if I had the time) 🧵

"Copilot but for your entire OS"

This one is an oldie but it comes back every now and then. Some old ideas here: spolu.notion.site/Neuromancer-31…

By-product: opportunity to reinvent the OS
Jun 20, 2018 6 tweets 1 min read
There are a lot of approaches in recent AI research (CNN obv, HER, VAE, or more recently world models, ...) that definitely feel like “hypothetically something close to how our brains work”.

So much so that it begs the question of whether we should, instead of solely attempting to replicate our brain functions, maybe try to interface with our brains?