Alex Reibman 🖇️
Oct 6, 2021 · 13 tweets · 7 min read
Big tech teams win because they have the best ML Ops. These teams:
- Deploy models at 10x speed
- Spend more time on data science, less on engineering
- Reuse rather than rebuild features

How do they do it? An architecture called a Feature Store. Here's how it works
🧵 1/n
In almost every ML/data science project, your team will spend 90-95% of the time building data cleaning scripts and pipelines

Data scientists rarely get to put their skills to work because they spend most of their time outside of modeling
Enter: The Feature Store

This specialized architecture has:
- Registry to look up/reuse previously built features
- Feature lineages
- Batch+stream transformation pipelines
- Offline store for historical lookups (training)
- Online store for low-latency lookups (live inferences)
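
To make those pieces concrete, here's a toy in-memory sketch of a registry, an offline store, and an online store. Everything in it (`ToyFeatureStore`, `avg_order_value`, integer timestamps) is made up for illustration, and lineage tracking and transformation pipelines are left out. It's not how any real feature store is implemented.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class ToyFeatureStore:
    """Hypothetical in-memory feature store; every name here is illustrative."""

    # Registry: feature name -> human-readable description
    registry: Dict[str, str] = field(default_factory=dict)
    # Offline store: append-only history of (entity_id, feature, timestamp, value)
    offline: List[Tuple[str, str, int, Any]] = field(default_factory=list)
    # Online store: latest (timestamp, value) per (entity_id, feature)
    online: Dict[Tuple[str, str], Tuple[int, Any]] = field(default_factory=dict)

    def register(self, name: str, description: str) -> None:
        self.registry[name] = description

    def write(self, entity_id: str, name: str, timestamp: int, value: Any) -> None:
        if name not in self.registry:
            raise KeyError(f"feature {name!r} is not registered")
        # Keep the full history for training lookups
        self.offline.append((entity_id, name, timestamp, value))
        # Keep only the freshest value for inference lookups
        key = (entity_id, name)
        current = self.online.get(key)
        if current is None or timestamp >= current[0]:
            self.online[key] = (timestamp, value)

    def get_online(self, entity_id: str, name: str) -> Any:
        """Low-latency path: a single dictionary lookup."""
        return self.online[(entity_id, name)][1]

    def get_historical(self, entity_id: str, name: str, as_of: int) -> Optional[Any]:
        """Training path: the value as it was known at time `as_of`."""
        rows = [
            (ts, value)
            for (e, n, ts, value) in self.offline
            if e == entity_id and n == name and ts <= as_of
        ]
        if not rows:
            return None
        return max(rows, key=lambda row: row[0])[1]


store = ToyFeatureStore()
store.register("avg_order_value", "Mean order amount per user")
store.write("user_1", "avg_order_value", timestamp=1, value=10.0)
store.write("user_1", "avg_order_value", timestamp=5, value=20.0)
```

The online lookup returns only the latest value, while the historical lookup answers "what did we know at time t?", which is what you want when building training sets.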
You can think of the feature store as a "Feature API" made just for data scientists.

Anyone with access can view, pull, and contribute features for their own models. Over time, the feature store will eliminate countless hours of redundant feature engineering work
MLOps done right can supercharge your company

For example, the @NetflixEng team uses an ML architecture with FS-like capabilities called Metaflow. With Metaflow, data scientists can push models to production in 1 week or less on average

They have 600+ models deployed today
The feature store has 4 basic functionalities:

1. Feature Transform

This is the main tool for writing and saving features to your feature store. Typically, this takes the form of a job or service orchestration tool such as @ApacheAirflow

Basically: Read, Transform, Write
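
Here's a rough sketch of the Read, Transform, Write pattern in plain Python. The data (`RAW_ORDERS`), the feature name, and the dict standing in for the feature store are all assumptions for illustration. In practice an orchestrator such as Airflow would schedule a job like `run_job`; no Airflow API is used here.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical raw data; a real job would read from a warehouse or event stream
RAW_ORDERS: List[dict] = [
    {"user_id": "u1", "amount": 10.0},
    {"user_id": "u1", "amount": 20.0},
    {"user_id": "u2", "amount": 5.0},
]


def read(source: Iterable[dict]) -> List[dict]:
    """Read: pull raw rows from the source."""
    return list(source)


def transform(rows: List[dict]) -> Dict[str, float]:
    """Transform: compute the average order amount per user."""
    totals: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for row in rows:
        totals[row["user_id"]] += row["amount"]
        counts[row["user_id"]] += 1
    return {user: totals[user] / counts[user] for user in totals}


def write(features: Dict[str, float], store: Dict[Tuple[str, str], float], name: str) -> None:
    """Write: save each computed value under (entity, feature name)."""
    for user, value in features.items():
        store[(user, name)] = value


def run_job(source: Iterable[dict], store: Dict[Tuple[str, str], float]) -> None:
    write(transform(read(source)), store, "avg_order_value")


feature_store: Dict[Tuple[str, str], float] = {}
run_job(RAW_ORDERS, feature_store)
```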
2. Feature Discovery

This is the hardest part to get right. If you want data scientists to reuse features, you need an intuitive UI that lets them search for them.

@databricks's feature registry has some basic components. But there's ample room for improvement
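
Underneath any discovery UI sits some kind of searchable metadata. A minimal sketch, assuming a hypothetical `FeatureMeta` record and a simple case-insensitive keyword match (this is not the Databricks registry API, and a real search would likely add ranking and filters):

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FeatureMeta:
    """Hypothetical registry entry; the field names are illustrative."""
    name: str
    description: str
    owner: str
    tags: Tuple[str, ...] = ()


REGISTRY: List[FeatureMeta] = [
    FeatureMeta("avg_order_value", "Mean order amount per user", "growth-team", ("orders", "revenue")),
    FeatureMeta("days_since_signup", "Days since account creation", "risk-team", ("users",)),
    FeatureMeta("order_count_30d", "Number of orders in the last 30 days", "growth-team", ("orders",)),
]


def search(registry: List[FeatureMeta], query: str) -> List[FeatureMeta]:
    """Return entries whose name, description, or tags contain the query."""
    needle = query.lower()
    return [
        meta
        for meta in registry
        if needle in meta.name.lower()
        or needle in meta.description.lower()
        or any(needle in tag.lower() for tag in meta.tags)
    ]


order_features = [meta.name for meta in search(REGISTRY, "order")]
```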
3. Feature Serving

The Offline Store is a historical data store for feature discovery and model training

The idea behind the Online Store is that, rather than running feature transforms during inference (slow), you can pre-compute them and cache them in the online store for quick lookups
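
The pre-compute-then-look-up idea can be sketched like this. The events, the `slow_transform` aggregate, and the 50.0 threshold "model" are all hypothetical stand-ins; the point is only that the inference path does a cached lookup and never runs the transform.

```python
from typing import Dict, List

# Hypothetical raw purchase amounts per user
EVENTS: Dict[str, List[float]] = {
    "user_1": [12.0, 8.0, 10.0],
    "user_2": [100.0],
}


def slow_transform(amounts: List[float]) -> float:
    """Stand-in for an expensive feature computation (here: a mean)."""
    return sum(amounts) / len(amounts)


def refresh_online_store(events: Dict[str, List[float]]) -> Dict[str, float]:
    """Batch job: pre-compute the feature for every user ahead of time."""
    return {user_id: slow_transform(amounts) for user_id, amounts in events.items()}


ONLINE_STORE = refresh_online_store(EVENTS)


def predict(user_id: str, online_store: Dict[str, float], default: float = 0.0) -> bool:
    """Inference path: one cached lookup, no transform at request time."""
    avg_spend = online_store.get(user_id, default)
    # Toy "model": flag users whose average spend exceeds a made-up threshold
    return avg_spend > 50.0
```

Users the batch job hasn't seen fall back to a default value here; how to handle cold-start entities is a design choice this sketch doesn't try to settle.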
Want to use a feature store yourself? You're in luck! There are a few open source options out there

1. @feast_dev is a fantastic open source feature store that plays nicely with both GCP and AWS

2. @logicalclocks offers @hopsworks, an end-to-end ML pipeline with a feature store included. Also open source
Here's what some (closed source) big players use:

@UberEng's Michelangelo has an end-to-end feature engineering -> model training -> model deployment pipeline. Largely built around Spark's MLlib

@WixEng also has a nifty architecture that stores feature data with protobufs
Want to buy instead of building your own? Here are some cool startups bringing feature stores to the market

1. @TectonAI - Staffed by some of the original Feast developers
2. @stream_sql - Founded by the minds behind Michelangelo
3. @databricks - Feature store just left beta
And that's it :)

If you enjoyed it, I post threads like this on the regular, on topics ranging from AI for UI and fintech to crypto (skepticism) and data science

