How can we change a 3-minute load time to 1 second?
⚡️⚡️⚡️🤯
As a Pandas🐼 user, the read_csv method might be very dear 💕to you.
But even with a lot of tuning, it will still be slow.
Let's make it faster!!!
[1 ⚡️ min]
1/7🧵
As an ML developer or Data Scientist, [re]loading data is something you do many, many times a day!
Long loading times can make experimentation annoying: every time you reload, you "pay" the time-tax
2/7🧵
One trick to make loading faster is to use a faster file format!
Let's try the Feather file format.
It is a portable file format that uses the Arrow IPC format: arrow.apache.org/docs/python/ip…
3/7🧵
How can we do that?
Simple!
Pandas has a to_feather method on the data frame! Easy as that👍🏾
Later you can use the Pandas method read_feather!
Boom, 1.16s loading time!⚡️⚡️⚡️
4/7🧵
The first time you load your dataset, you can use all the tricks from my previous threads, like this one:
After that, save it in the Feather format and keep using that file for the rest of your work over the following days!
5/7🧵
Being efficient with slow tasks helps you spend more time on more interesting work!
Loading a dataset in 1s means that your line of thought won't be broken so often, helping keep you in the flow!
6/7🧵
For more daily ML, Python and Career content, follow @gusthema and don't miss out on these kinds of tips!
Don't forget to share this thread with your friends. Let's save everyone some minutes every day!
7/7🧵
