Leonie
Developer advocate @elastic_devs helping devs build search-powered agents | Google Developer Expert (Kaggle) | Views are my own
Jun 30 6 tweets 2 min read
Getting LLMs to output valid JSON is one of the most common production tasks.

But most benchmarks can't tell if your model actually does it well.

Here's how the team at @LiquidAI built IFStruct to measure exactly this (and how they trained a 350M model to beat models 10x its size). 🧵

Most evals do one of two things:
> force the model's output using hard rules
> score content quality alongside format.

The gap IFStruct fills is to answer the question:

"Can a model follow a schema when a user asks for it in plain language?"
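To make that question concrete, here is a minimal sketch of a format-only check: does the raw model output parse as JSON and match the keys and types the user asked for? This is not IFStruct's actual scoring; `SCHEMA` and `follows_schema` are hypothetical names made up for illustration.

```python
import json

# Hypothetical schema for illustration: required key -> expected Python type.
SCHEMA = {"name": str, "city": str, "tags": list}


def follows_schema(raw_output: str, schema: dict) -> bool:
    """Return True if raw_output is a JSON object with exactly the
    schema's keys and a value of the expected type for each key."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError:
        # Chatty preambles or truncated output fail here.
        return False
    if not isinstance(data, dict) or set(data) != set(schema):
        return False
    return all(isinstance(data[key], expected) for key, expected in schema.items())


good = '{"name": "Ada", "city": "London", "tags": ["math"]}'
chatty = 'Sure! Here is the JSON: {"name": "Ada"}'
print(follows_schema(good, SCHEMA), follows_schema(chatty, SCHEMA))
```

Note that the check only looks at format. It says nothing about whether the values are correct, which is the separation the thread describes.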
Oct 16, 2024 12 tweets 4 min read
ColBERT is a new retrieval model.

While common dense retrieval models are
- either fast
- or effective,

ColBERT promises to be both:
Fast & effective.

Let’s dive in!

ColBERT leverages BERT while introducing the new “late interaction” mechanism:

Col → Contextualized Late Interaction
BERT → over BERT

Fun fact: it’s pronounced /koʊlˈbɛər/ after Stephen Colbert in reference to his Late Show
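A rough sketch of what “late interaction” means in practice: query and document are encoded separately into per-token vectors, and the score is the sum, over query tokens, of the best-matching document token (often called MaxSim). The toy vectors below are made up; a real setup would take them from a BERT encoder, and `maxsim_score` is a hypothetical helper name.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def maxsim_score(query_vecs, doc_vecs):
    """Late-interaction score: for each query token vector, take the
    maximum cosine similarity over all document token vectors, then sum."""
    q = [normalize(v) for v in query_vecs]
    d = [normalize(v) for v in doc_vecs]
    return sum(
        max(sum(a * b for a, b in zip(qv, dv)) for dv in d)
        for qv in q
    )


# Toy 2-dimensional "token embeddings" (assumed values, not real model output).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[2.0, 0.0], [0.0, 3.0]]  # has a match for both query tokens
doc_b = [[1.0, 0.0]]              # matches only the first query token

print(maxsim_score(query, doc_a), maxsim_score(query, doc_b))
```

Because document vectors do not depend on the query, they can be computed ahead of time, and only the cheap max-and-sum step runs at query time.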
Aug 27, 2024 15 tweets 4 min read
I’ve been learning about Retrieval-Augmented Generation (RAG) for over a year.

Here’s how I’d approach it if I had to start all over again today:

1. Understand that RAG is just a subset of building LLM-powered apps.

RAG just describes explicitly that you are using an external knowledge source.

Any resource you find on building LLM-powered apps could be applicable to RAG as well.
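As a minimal sketch of the “external knowledge source” part: retrieve a few relevant documents, then put them into the prompt. The word-overlap retriever below is a toy stand-in (a real app would likely use a vector or keyword search engine), and `retrieve` and `build_prompt` are hypothetical helper names. No LLM is called here.

```python
def retrieve(question, documents, k=2):
    """Toy retriever: rank documents by how many words they share with the question."""
    q_words = set(question.lower().split())
    ranked = sorted(
        documents,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, documents, k=2):
    """Augment the question with retrieved context before sending it to an LLM."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


docs = [
    "ColBERT uses late interaction",
    "LightGBM is a gradient boosting library",
    "Paris is the capital of France",
]
print(build_prompt("what is the capital of france", docs, k=1))
```

Everything after `build_prompt` is ordinary LLM app building, which is why general resources on LLM-powered apps carry over.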
Aug 13, 2024 14 tweets 4 min read
New to fine-tuning LLMs?
Confused by all the jargon?

Me, too.

So, I did a little deep dive into LLM fine-tuning.
Here’s what I understood:

Let’s take one step back before we get into the details of fine-tuning.

At the highest level, there are two types of model training:

Pre-training (for LLM training):
• Input: Large corpus of unlabeled raw data
• Algorithm: Unsupervised or self-supervised learning
• Output: Base model

Fine-tuning (for LLM alignment):
• Input: Smaller, more refined, labeled data
• Algorithm: Supervised or reinforcement learning
• Output: Fine-tuned model
Jan 2, 2023 31 tweets 12 min read
This thread will be a collection of the unstructured learnings of my "30 days of time series" challenge.

Day 1 of #30daysoftimeseries:

Started reviewing top solutions of past Kaggle forecasting comps.

Seems like the M5 competition in 2020 was a turning point for which models are used.

- Before M5: mix of classical and ML models
- After M5: mainly ML models like LightGBM and NNs.