andrew gao
Training a Latin LLM (highlights) @stanford; prev @LangChainAI @pika_labs @nomic_ai; Z Fellow 🇺🇸; views my own
Jun 5 18 tweets 5 min read
We embedded 250,000 works of art 🎨 from The Met using @nomic_ai's new SOTA #multimodal embeddings model!

It's the *first ever* semantic search tool of its kind 👩‍🎨 🔎
Search with smart queries like "oil painting with flowers & dogs".

How we did it & how to use it👇
Image: Nomic Atlas 2D interactive visualization of 250,000 works of art from the Met museum

@nomic_ai @WilliamGao1729 Nomic just released their newest model, Nomic-Embed-Vision.
It transforms the way we can interact with images. Instead of using CLIP to get text captions for models and using those captions for semantic search, you can work directly with image embeddings!👇
Image
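A minimal sketch of what "working directly with image embeddings" can look like: the text query and every image are vectors in one shared space, and search is just ranking by cosine similarity. The 3-d vectors and file names below are made up for illustration; real embeddings would come from the model and are much longer. This is not the actual Nomic pipeline.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query_embedding, image_embeddings, top_k=2):
    """Rank images by similarity to the query embedding, best match first."""
    scored = [
        (name, cosine_similarity(query_embedding, vec))
        for name, vec in image_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Hypothetical toy vectors standing in for real model output.
image_embeddings = {
    "flowers_and_dogs.jpg": [0.8, 0.2, 0.1],
    "portrait.jpg": [0.1, 0.9, 0.2],
    "landscape.jpg": [0.0, 0.2, 0.9],
}
query_embedding = [0.9, 0.1, 0.0]  # pretend this is the embedded text query

results = search(query_embedding, image_embeddings)
```

No captioning step is involved: the query vector is compared against the image vectors themselves.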
May 13 11 tweets 3 min read
To everyone disappointed by @openai today... don't be. The livestream was for a general consumer audience.

The cool stuff is "hidden" on their site.

I am really excited. (Text to 3D??)
🧵🧵

@OpenAI 1/ Lightyears ahead of anyone at having text in AI generated images. Gorgeous
Image
Image
May 8 15 tweets 8 min read
🔔the guy who invented the LSTM just dropped a new LLM architecture! (Sepp Hochreiter)

Major component is a new parallelizable LSTM.
⚠️one of the major weaknesses of prior LSTMs was the sequential nature (can't be done at once)

Everything we know about the xLSTM: 👇👇🧵
Image

1/
Three major weaknesses of LSTMs that make Transformers better:
"Inability to revise storage decisions"
"Limited storage capacities"
"Lack of parallelizability due to memory mixing".

SEE THE GIF, if you don't get it. LSTMs are sequential which basically means you have to go through the green boxes (simplified) one after the other. You need the results from the prior box before you can move on.

Transformers don't do this. They parallelize operations across tokens, which is a really really big deal.

So how did Sepp and team solve?
Keep reading: 👇👇
GIF credit: Michael Phi
towardsdatascience.com/illustrated-gu…
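A toy sketch of the sequential-vs-parallel point above, assuming scalars stand in for token vectors. It is neither the xLSTM nor a real Transformer, just the dependency pattern: the recurrence needs the previous state before it can take the next step, while each attention output depends only on the raw inputs, so positions can be computed in any order (or all at once).

```python
import math

def recurrent_states(xs, w_h=0.5, w_x=1.0):
    """Toy recurrence h_t = tanh(w_h * h_{t-1} + w_x * x_t): strictly in order."""
    h = 0.0
    states = []
    for x in xs:
        # Step t cannot start until step t-1 has produced h.
        h = math.tanh(w_h * h + w_x * x)
        states.append(h)
    return states

def causal_attention_at(xs, t):
    """Toy causal self-attention output for position t, from raw inputs only."""
    scores = [xs[t] * xs[s] for s in range(t + 1)]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(score - peak) for score in scores]
    total = sum(weights)
    return sum(w * xs[s] for s, w in enumerate(weights)) / total

def attention_outputs(xs, order=None):
    """Compute every position; `order` can be any permutation of positions."""
    order = list(range(len(xs))) if order is None else list(order)
    out = [None] * len(xs)
    for t in order:
        out[t] = causal_attention_at(xs, t)
    return out

xs = [0.2, -1.0, 0.7, 1.5]
states = recurrent_states(xs)
forward = attention_outputs(xs)
backward = attention_outputs(xs, order=reversed(range(len(xs))))
```

Here `forward` and `backward` come out identical because no position waits on another position's output. That independence is what lets the work be spread across tokens.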
May 7 15 tweets 5 min read
gpt2-chatbot RETURNS! it's now TWO similarly performing models.

i've been testing them.

everything i can tell you 👇🧵 #gpt2
Image

1/ first of all, @sama posted this cryptic tweet a few days ago.
that tweet contains the name of one of the two new GPT2 models.

can I confirm that it is from OpenAI? no. However, model creators need to work with @lmsysorg to add the model and it seems strange for LMSYS team to allow someone to pretend

how good are the mystery models? 👇👇👇🧵👀
Apr 29 8 tweets 4 min read
uh.... gpt2-chatbot just solved an International Math Olympiad (IMO) problem in one-shot

the IMO is insanely hard. only the FOUR best math students in the USA get to compete

prompt + its thoughts 🧵
Image

megathread:
Apr 29 21 tweets 9 min read
🧵megathread of speculations on "gpt2-chatbot": tuned for agentic capabilities?

some of my thoughts, some from reddit, some from other tweeters

my early impression is 👇
Image
Image
Image
1/

there's a limit of 8 messages per day so i didn't get to try it much but it feels around GPT-4 level, i don't know yet if I would say better... (could be placebo effect and i think it's too easy to delude yourself)

it sounds similar but different to gpt-4's voice

as for agentic abilities...
Mar 17 10 tweets 3 min read
here's your DEEP DIVE into @grok's architecture!
I just went through the model.py for this 314B open source behemoth with *no strings attached*.

👇🧵
Image

@grok 1. Basics:
314 B, mixture of 8 experts (2 active)
86B active parameters

It's using Rotary Embeddings #rope instead of fixed positional embeddings

📜👇👇
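A back-of-envelope check on those numbers, assuming (my simplification, not something stated in the repo) that the parameters split cleanly into one shared part plus 8 equal-sized experts, with 2 experts active per token. Then shared + 8 × expert = 314B and shared + 2 × expert = 86B, which can be solved directly:

```python
def split_parameters(total_b, active_b, num_experts, active_experts):
    """Solve shared + num_experts*e = total and shared + active_experts*e = active."""
    per_expert_b = (total_b - active_b) / (num_experts - active_experts)
    shared_b = total_b - num_experts * per_expert_b
    return shared_b, per_expert_b

# Figures from the thread, in billions of parameters.
shared_b, per_expert_b = split_parameters(
    total_b=314, active_b=86, num_experts=8, active_experts=2
)
print(f"shared ~{shared_b:.0f}B, per expert ~{per_expert_b:.0f}B")
```

Under that assumption each expert is about 38B and the shared part about 10B. The real accounting may differ depending on what counts as "active".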
Mar 17 8 tweets 2 min read
HOLY SH*T @grok IS 314 BILLION PARAMETERS

Mixture of 8 Experts, not RLHFd/moralized

THIS IS HUGE
Image

@grok Grok repo as a TXT for your prompting convenience:
drive.google.com/file/d/1YGKcT5…
Mar 13 9 tweets 4 min read
ATTN: Bioinformatics/Comp Bio people

Save this thread for a walkthrough of #Devin AI designing primers for flu detection from scratch!

I asked "Help me identify DNA primer candidates for the most recent flu strain in the US"

Here's what happened! 🧵
#biotech #compbio

1.
I didn't give it much detail and my request implicitly required it to determine what the most recent flu strain in the US was, and how to get that data.

It intelligently used the NCBI API to get the H1N1 sequence data for 2023 (most recent flu season)

It wrote Python to retrieve the data and installed Biopython. Then, it looked at the FASTA file to see what it looked like.

There was an error in some of the code it wrote so it fixed it.

||
||
v
Image
Mar 12 21 tweets 10 min read
i never believe recorded demos so I reached out to the @cognition_labs team for early access to try for myself and got it!

will be sharing my unfiltered opinions on #devin here.

🧵🧵

1/n The first task I asked it for was a website where you play chess against an LLM. You make a move, the move is communicated to GPT-4 via a prompt, GPT-4 replies, and the reply is converted into a move that is reflected on the chessboard.

So quite a few moving parts.

I was curious to see if Devin:
1. Would figure out how to accurately use the GPT-4 API because most LLMs don't actually know how to use it, and there are conflicting versions online bc OpenAI changed the API in November
2. Would appropriately ask for an API key and securely handle it.
3. Would deal with package errors
4. Would understand how to prompt an LLM to make a chess move and return it in a precise notation (FEN in this case)

Here is what happened:
2/n
Image
Aug 10, 2023 19 tweets 6 min read
A 🧵 of peer reviewed published scientific research where the authors left out a key coauthor 😉:

“As an AI language model…”

@MicrobiomDigest
Image

This one is published on “predatory-publishing.com” so who knows

Peer reviewed in the journal of internet banking and commerce

Stock price prediction is so tired 😴
Image
Aug 10, 2023 20 tweets 4 min read
Go to Google Scholar and look up ‘“As an AI language model” -“ChatGPT”’
Image

A thread:
May 11, 2023 6 tweets 1 min read
🧵I had the opportunity to stay in South Africa over the summer.

They have a custom called "stokvel", which is a friend-based credit union. It's genius + super interesting.

About half of South Africans take part and $2.5B USD is put in each year.

👇👇
Image

You form a group of 12 with friends. Each month, you pitch in a set amount, such as $100. One out of 12 months, it's your turn to receive that month's money ($1200). The rest of the months, you put $100 in and one of your friends gets the money.
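The rotation above can be sketched as monthly cash flows for one member. One assumption on my part: in your payout month you still pitch in your own $100, which is how 12 × $100 adds up to the $1200 pot. The function and parameter names are just for illustration.

```python
def stokvel_cash_flows(members=12, contribution=100, payout_month=1):
    """Net cash flow per month for one member who is paid out in `payout_month`."""
    pot = members * contribution  # everyone's contribution for that month
    flows = []
    for month in range(1, members + 1):
        received = pot if month == payout_month else 0
        flows.append(received - contribution)
    return flows

first = stokvel_cash_flows(payout_month=1)   # paid out in month 1
last = stokvel_cash_flows(payout_month=12)   # paid out in month 12
```

Every member nets out to zero over the full cycle (`sum(first) == sum(last) == 0`). What changes with your position is timing: the first member gets $1100 net up front and pays it back over 11 months, while the last member pays in for 11 months and collects at the end.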

Why is this good?
May 4, 2023 8 tweets 3 min read
WTF: Mind reading is here.

Researchers invented a new #AI method to convert brain signals into video. See the results for yourself

Published in Nature yesterday: nature.com/articles/s4158…

What are the implications? Is this the biggest paper of 2023?

#CEBRA

Check out the homepage of CEBRA here: cebra.ai
And the abstract:
Image
Mar 17, 2023 5 tweets 3 min read
Offering FREE access to #GPT4 and comparing with #GPT3.

Retweet for instant access (follow so I can DM you).

💻🔮💫✨
#openai #ai #tech #chatgpt
Image

Is water wet?
GPT4 gives a direct answer.

GPT3 responds "As an AI language model..."
Image
Mar 17, 2023 9 tweets 4 min read
🧵Racial minorities are 40% of the US population but only 5% of jurors.

For @Stanford CS109, I built an interactive site that explores the probabilities of jury selection and race.

cs109.olafblitz.repl.co

👇 more info
Image

@Stanford 1/n

In the #AhmaudArbery trial, only 1 in 12 jurors was Black, even though the local population is 27% Black.

The defense struck 11 of the 12 Black prospective jurors.

In another case, a jury pool of 105 people in Stockton, CA had 0 Black people.

The website teaches probability in the context… twitter.com/i/web/status/1…
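A minimal sketch of the kind of probability question involved, using the numbers above. It assumes each of the 12 jurors is an independent random draw from a population that is 27% Black, which is a simplification (real selection involves jury pools, strikes, and so on) and is not necessarily how the site itself models it.

```python
from math import comb

def prob_at_most(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

# Chance that a randomly drawn 12-person jury has at most 1 Black juror
# when the local population is 27% Black.
p_arbery = prob_at_most(k=1, n=12, p=0.27)
print(f"P(at most 1 Black juror out of 12) = {p_arbery:.3f}")
```

Under those assumptions, a jury with one or zero Black jurors would come up by chance roughly 12% of the time.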
Mar 17, 2023 4 tweets 2 min read
for some reason, on desktop Twitter, it's always "load 34 tweets". Never 24 or 63. Just 34. twitter.com/i/web/status/1…

@Scobleizer's tweet is caught in this screenshot lol
Mar 17, 2023 10 tweets 2 min read
You can't do everything and be everything.

Someone who spends 100% effort on A will beat you who does 50% A, 50% B.

Jack of all trades is master of none.

🧵

Yes, synergy is a thing, many Nobel Prize winners have hobbies and are great musicians and artists. Not what I'm talking about here.

1/n
Mar 16, 2023 8 tweets 3 min read
all my friends who read a lot when young are successful now

i think reading is probably the most important thing you can do in elementary school

if ur brain is like an LLM, you should maximize the # of tokens it sees while training

i read so much in elementary school that writing is very easy for me. i liken it to next token prediction.

SAT/ACT grammar came very easy for me, i just went off of what sounded right. idioms as well, even though my family isn't native english speaker. but id read all them
Mar 15, 2023 7 tweets 2 min read
“Money doesn’t buy happiness” is a coping mechanism.

Sure money isn’t 1:1 with happiness 💰 😆

But I’d rather be crying with 10M in the bank than 10 bucks.

Don’t settle for less.

Caveats to this claim but I’m too lazy to write them out
Mar 15, 2023 8 tweets 4 min read
🧵 EVERY new #AI feature @Google launched, and how YOU can take advantage!

This is going to save you hundreds of hours.👇👇👇
Image

I have always been bullish on Google even though everyone says they’re behind. Google has the money and talent, but more importantly, existing users locked into their massive product ecosystem.

1: One click Powerpoint! Put in a topic. Seems to be Stable Diffusion images 1/n #ai
Image
Image
Image