Gus (🤖🧠+🐍+🥑🗣️)
AI Developer Advocate @google - Gemma 💎 - Machine Learning 🤖🧠 - Google AI ⚙️🧠 - DevRel 🥑🗣️ find me also at: https://t.co/3nrTwEKoJ0

Nov 6, 2021, 7 tweets

How can we cut a 3-minute load time to 1 second?
⚡️⚡️⚡️🤯

As a Pandas🐼 user, the read_csv method might be very dear 💕 to you.
But even with a lot of tuning, it can still be slow on large files.

Let's make it faster!!!

[1 ⚡️ min]

1/7🧵

As an ML developer or Data Scientist, [re]loading data is something you do many, many times a day!

Long loading times make experimentation annoying: every time you reload, you "pay" the time tax

2/7🧵

One trick to make loading faster is to use a faster file format!

Let's try the Feather file format.

It is a portable file format that uses the Arrow IPC format: arrow.apache.org/docs/python/ip…

3/7🧵

How can we do that?
Simple!

Every Pandas DataFrame has a to_feather method! Easy as that👍🏾

Later you can load the file back with the Pandas function read_feather!
Boom, 1.16s loading time!⚡️⚡️⚡️

4/7🧵

The first time you load your dataset, you can use all the tricks from my previous threads.

After that, you save it in the Feather format and keep using that file for the rest of your work on the following days!

5/7🧵

Being efficient with slow tasks helps you spend more time on more interesting work!

Loading a dataset in 1s means that your line of thought won't be broken so often, helping keep you in the flow!

6/7🧵

For more daily ML, Python and Career content, follow @gusthema and don't miss out on these kinds of tips!

Don't forget to share this thread with your friends, let's save everyone some daily minutes!

7/7🧵
