Eugene Yan
ML, RecSys, LLMs @ Amazon; led ML @ Alibaba, startups. Figuring out what actually works. Writing to learn & teach. Building https://t.co/nsyKoR8tSA, https://t.co/jJRZ8MOSnj.
Oct 30, 2024 6 tweets 3 min read
Evaluating LLM output is hard. It's the bottleneck to scaling AI products for many teams

A key mistake is defining eval criteria w/o actually LOOKING AT THE DATA. This leads to irrelevant / unrealistic criteria + wasted effort

Thus, I built AlignEval (AlignEval.com).

The key insight: In addition to aligning AI to human context (prompting, RAG), we also need to calibrate human criteria to actual AI output.

By working backwards from the data, AlignEval helps you build better evals.

Screenshots & how it was built here: eugeneyan.com/writing/aligne…
Jun 18, 2023 8 tweets 3 min read
Our paper club recently revisited some of the earlier language modeling papers. Here's a one-liner for each.

---

Attention: Query, Key, and Value are all you need*

*Also position embeddings, multiple heads, feed-forward layers, skip-connections, etc
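
A minimal sketch of the Query/Key/Value part in plain Python, assuming single-head scaled dot-product attention with no masking, learned projections, or position embeddings. The function names are illustrative and not taken from the paper's code.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.

    queries, keys, and values are lists of equal-length float vectors;
    keys and values must contain the same number of vectors.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        outputs.append(
            [
                sum(w * v[j] for w, v in zip(weights, values))
                for j in range(len(values[0]))
            ]
        )
    return outputs
```

When every key is identical, the weights are uniform and the output is simply the mean of the values.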

arxiv.org/abs/1706.03762

GPT: Decoder is all you need.

(Also, pre-training + finetuning 💪)

cdn.openai.com/research-cover…
May 8, 2023 5 tweets 3 min read
Ran a simple benchmark (Mandelbrot sets) between Mojo & Python. The speedup is impressive, and it benefits from Python's libraries.

• Python: 1,184ms
• Mojo: 27ms 🤯
• Python (vectorized): 240ms
• Mojo (vectorized): 2ms

Here's a GitHub gist if you want to try it yourself: gist.github.com/eugeneyan/1d2e…

(Couldn't download the notebook for some reason)
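
For context, the non-vectorized Python baseline in a benchmark like this is typically an escape-time loop. Here's a rough sketch of one. It is not the code from the gist, and the grid bounds and `max_iter` default are assumptions.

```python
def escape_iterations(c: complex, max_iter: int = 200) -> int:
    """Count iterations of z = z*z + c before |z| exceeds 2 (capped at max_iter)."""
    z = 0j
    for i in range(max_iter):
        z = z * z + c
        if abs(z) > 2.0:
            return i
    return max_iter


def mandelbrot_grid(width: int, height: int, max_iter: int = 200):
    """Evaluate escape counts on a width x height grid (both must be > 1).

    The plotted region (x in [-2.0, 0.6], y in [-1.5, 1.5]) is an assumed default.
    """
    x_min, x_max, y_min, y_max = -2.0, 0.6, -1.5, 1.5
    grid = []
    for row in range(height):
        y = y_min + (y_max - y_min) * row / (height - 1)
        grid.append(
            [
                escape_iterations(
                    complex(x_min + (x_max - x_min) * col / (width - 1), y), max_iter
                )
                for col in range(width)
            ]
        )
    return grid
```

Every pixel runs its own Python-level loop, which is why vectorizing (or compiling) this kind of code pays off so much.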
May 7, 2023 5 tweets 3 min read
An insider's view on China's scale and tech, the 996 work ethic, and Alibaba's acquisition of Lazada. corecursive.com/software-world…

Years later, I'm still boggled by the scale and how we had to use a completely different tech stack (spoiler alert: it's mostly Ali Java).

Yea, there were one-click deploys, rollbacks, A/B tests, you name it.

Also, there were SQL commands that were both powerful and scary (and borderline questionable 🙈). Any data analyst on the street became a median data scientist.
Apr 11, 2023 5 tweets 2 min read
Over the past few weekends, I've experimented with using LLMs to build a simple assistant.

Here's a write-up of what I built, how I built it, and its potential. Also, some shortcomings of embedding retrieval, with solutions from search & recsys.

eugeneyan.com/writing/llm-ex…

Here's my first project dabbling with LLMs for the humble `summarize`

Apr 3, 2023 15 tweets 7 min read
This weekend, I had a blast building a personal board of advisors using BeautifulSoup, @LangChainAI , and @pinecone.

`/board` provides advice to questions on tech, leadership, and life. It also provides links to sources for further reading!

`/ask-ey` does something similar for my own site, eugeneyan.com. And because I'm more familiar with my own writing, I can better spot shortfalls such as not answering based on a source when expected, or when a source is irrelevant.
Apr 2, 2023 6 tweets 3 min read
This is why CS fundamentals continue to be crucial: LLaMA 30B only needs 4 GB of memory if we use mmap().

Not sure why this works but one reason could be that 30B weights are sparse. Thus, lazy loading the fraction of needed weights reduces memory usage.

github.com/ggerganov/llam…

Correction: I guess all we can say is that it now (only) uses the actual memory required instead of 2x memory required.

Still an awesome improvement nonetheless!

news.ycombinator.com/item?id=354000…
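
To make the mmap() idea concrete, here's a small Python sketch. It is only an illustration of memory-mapping a file and touching a slice of it, not how llama.cpp (which is C/C++) actually loads weights. The file contents, sizes, and function names are made up for the demo.

```python
import mmap
import os
import tempfile


def read_slice(path: str, offset: int, length: int) -> bytes:
    """Memory-map a file read-only and return `length` bytes starting at `offset`.

    The OS pages data in on demand, so only the pages backing this slice
    need to be read, rather than the whole file being copied into a buffer.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return mm[offset:offset + length]


def demo() -> bytes:
    """Write a ~1 MB dummy 'weights' file, then read 4 bytes from the middle."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(bytes(range(256)) * 4096)  # repeating 0..255 pattern
        path = tmp.name
    try:
        return read_slice(path, 256 * 10, 4)
    finally:
        os.remove(path)
```

In line with the correction above, the likely win is avoiding a second in-memory copy of the file, not needing fewer weights.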
Nov 10, 2022 13 tweets 7 min read
I'm trying to get up to speed on the subject of text2image and diffusion models. What are some good resources I should consult?

Here's what I've found helpful so far.

The 2015 paper by Sohl-Dickstein et al. that demonstrates how to systematically add noise to data (e.g., images) via forward diffusion and then learn a reverse diffusion process that returns structured data from noise.

arxiv.org/abs/1503.03585
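
A rough sketch of the forward (noising) half, assuming the Gaussian case where each step scales the data by sqrt(1 - beta) and adds noise with variance beta. The function name and any schedule values are illustrative, and the learned reverse process is not shown.

```python
import math
import random


def forward_diffuse(x0, betas, rng):
    """Apply Gaussian forward diffusion steps to a list of floats.

    Each step computes x_t = sqrt(1 - beta_t) * x_{t-1} + sqrt(beta_t) * noise,
    where noise is drawn from a standard normal via `rng`.
    """
    x = list(x0)
    for beta in betas:
        keep = math.sqrt(1.0 - beta)
        noise_scale = math.sqrt(beta)
        x = [keep * xi + noise_scale * rng.gauss(0.0, 1.0) for xi in x]
    return x
```

With beta = 0 the data passes through unchanged, while many small positive betas gradually wash out the original signal.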
Jul 31, 2022 5 tweets 2 min read
Unfortunate, but true.

“Simplicity is a great virtue but it requires hard work to achieve it and education to appreciate it. And to make matters worse: complexity sells better.” – Edsger Dijkstra

Adding tweets on this theme, starting with this.

Jun 28, 2022 6 tweets 2 min read
I'm learning how to write better Python libraries. What packages do you recommend I emulate?

sklearn, pytorch, fastai come to mind. Anything else?

It may be helpful to read the principles behind each implementation. Here's a paper on PyTorch's design principles

arxiv.org/abs/1912.01703
Jun 2, 2022 9 tweets 3 min read
What are some design patterns that are common in data and machine learning code? Here are a couple I've identified in my work.

Factory: Interface for creating objects
• sklearn.datasets, torchvision.datasets
• torch.utils.data.Dataset, torch.utils.data.DataLoader
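
Here's a toy sketch of the factory idea in plain Python. The class names, the registry, and `make_dataset` are hypothetical and only illustrate the pattern. They are not how sklearn or torchvision implement their dataset loaders.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List


class Dataset(ABC):
    """Common interface that every concrete dataset implements."""

    @abstractmethod
    def rows(self) -> List[dict]:
        ...


class ToyClicks(Dataset):
    def rows(self) -> List[dict]:
        return [{"user": 1, "clicked": True}, {"user": 2, "clicked": False}]


class ToyRatings(Dataset):
    def rows(self) -> List[dict]:
        return [{"user": 1, "rating": 4.5}]


# Maps a name to a constructor, so callers never reference concrete classes.
_REGISTRY: Dict[str, Callable[[], Dataset]] = {
    "clicks": ToyClicks,
    "ratings": ToyRatings,
}


def make_dataset(name: str) -> Dataset:
    """Factory: build a dataset object from its registered name."""
    try:
        return _REGISTRY[name]()
    except KeyError:
        raise ValueError(f"Unknown dataset: {name!r}") from None
```

Callers depend only on `make_dataset` and the `Dataset` interface, so adding a new dataset is a one-line registry change.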
Mar 17, 2022 6 tweets 2 min read
What are some good papers/tech blogs sharing how bandits are used in recommendation systems?

Netflix shared how they use bandits to select the best artwork for each media

netflixtechblog.com/artwork-person…
Nov 25, 2021 7 tweets 3 min read
I'm excited to share something I've been working on for a while—ApplyingML.com!

It collects the tacit, tribal, ghost knowledge on how to apply machine learning from papers, guides, and interviews with ML practitioners.

I was inspired by the unexpected popularity of the applied-ml repo (github.com/eugeneyan/appl…) and @vboykis's sharing of her ghost knowledge (veekaybee.github.io/2021/03/26/dat…).
Mar 2, 2021 4 tweets 2 min read
While starting AWS, before writing any code, engineers spent 18 months writing documents on how best to serve customers.

Similarly, before I build anything, I write docs. Here, I'll share three docs I write and reveal the framework that structures them

eugeneyan.com/writing/writin…

In addition, I'm writing about design docs for data science / machine learning next.

Have you come across relevant material? Comment here or DM me please. Thank you!
Aug 4, 2020 7 tweets 2 min read
Early this year, I compiled a curriculum of the best books, essays, and videos on writing non-fiction.

I was surprised by how much I didn't learn about writing in school.

Here's some of the best and least common advice I've come across (a thread👇).

eugeneyan.com/writing/what-i…

Writing is 80% preparation, 20% writing

“I consume information, think, and write. I would say I spend about 80% of my time consuming and thinking, and only 20% of my time writing.” – @dollarsanddata
May 10, 2020 5 tweets 3 min read
Regular note-taking didn't work for me.

Notes stay separate—connections are not made.

I fixed this by building a Zettelkasten using @RoamResearch.

(Zettelkasten originates from highly prolific sociologist Niklas Luhmann. He wrote 70 books & 500 scholarly articles.)

Thread 👇

Information vs. knowledge (1/4)

Here's how a Zettelkasten works:

• Write each idea you come across on a card
• Link idea cards to other relevant idea cards (idea -> idea link)
• Sort cards into broader topic boxes (idea -> topic link)
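
The three steps above can be sketched as a tiny data structure. This is just one way to model it, and the class and method names are made up for illustration. It is not how Roam stores notes.

```python
from collections import defaultdict


class Zettelkasten:
    """Idea cards with idea -> idea links and idea -> topic links."""

    def __init__(self):
        self.cards = {}                  # card_id -> idea text
        self.links = defaultdict(set)    # card_id -> linked card_ids
        self.topics = defaultdict(set)   # topic -> card_ids filed under it

    def add_card(self, card_id: str, idea: str) -> None:
        """Write one idea on one card."""
        self.cards[card_id] = idea

    def link(self, a: str, b: str) -> None:
        """Link two idea cards in both directions."""
        self.links[a].add(b)
        self.links[b].add(a)

    def file_under(self, card_id: str, topic: str) -> None:
        """Sort a card into a broader topic box."""
        self.topics[topic].add(card_id)

    def related(self, card_id: str) -> list:
        """Return the ids of cards directly linked to this one, sorted."""
        return sorted(self.links[card_id])
```

The bidirectional links are the point: following them from any card surfaces connections that separate notes would hide.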

Here's how you can build a digital one using Roam.
May 3, 2020 17 tweets 3 min read
14 Ideas on How to Grow Your Business by Writing (THREAD)

Ideas are all from @shl and @david_perell

Notes below 👇

(1/14) Everyone is a writer

• Engineers, designers, managers, etc.
• You can't avoid writing, might as well do it well
• Gumroad relies a lot on writing, so does Amazon