Gus (🤖🧠+🐍+🥑🗣️)
Nov 6, 2021 7 tweets 3 min read
How can we cut a 3-minute load time down to 1 second?
⚡️⚡️⚡️🤯

If you are a Pandas🐼 user, the read_csv function might be very dear 💕to you.
But even with a lot of tuning, it can still be slow on large files.

Let's make it faster!!!

[1 ⚡️ min]

1/7🧵
As an ML developer or Data Scientist, [re]loading data is something you do many, many times a day!

Long loading times make experimentation annoying: every time you reload, you "pay" the time-tax

2/7🧵
One trick to make loading faster is to use a faster file format!

Let's try the Feather file format.

It is a portable file format that uses the Arrow IPC format: arrow.apache.org/docs/python/ip…

3/7🧵
How can we do that?
Simple!

Pandas has a to_feather method on the DataFrame! Easy as that👍🏾

Later you can load the file back with the Pandas function read_feather!
Boom, 1.16s loading time!⚡️⚡️⚡️

4/7🧵
The first time you load your dataset, you can use all the tricks from my previous threads.

After that, save it in the Feather format and keep using that file for the rest of your work over the following days!

5/7🧵
Being efficient with slow tasks helps you spend more time on more interesting work!

Loading a dataset in 1s means that your line of thought won't be broken so often, helping keep you in the flow!

6/7🧵
For more daily ML, Python and Career content, follow @gusthema and don't miss out on these kinds of tips!

Don't forget to share this thread with your friends, let's save everyone some daily minutes!

7/7🧵


More from @gusthema

Mar 14, 2023
Ok, since today is Pi (π) day, maybe it's a good day to learn about it a little bit!

Here are some fun facts about the number Pi
🤓

👇
Pi is the ratio of a circle's circumference to its diameter.

It is an irrational number, meaning it cannot be expressed as the ratio of two integers.

And its digits NEVER repeat in a regular pattern.

👇
As of today, 100 trillion digits of Pi have been calculated!!

The "last" 11 digits are: 43095295560

If all the digits were written to a txt file with regular ASCII encoding (1 byte per digit), that would be a 100-terabyte file! 🤯

You can learn more about it here:
cloud.google.com/blog/products/…

👇
Read 13 tweets
Mar 13, 2023
Learning how to apply Machine Learning to the Audio domain can be tricky: some aspects of the data are not obvious, and it's not as popular a topic as Image or Text

Don't worry! I got you covered!

Here are some tutorials to get you started:

👇
The first one you should take a look at is the Recognizing Keywords tutorial:

tensorflow.org/tutorials/audi…

This tutorial goes over some of the basics and it's a great start

👇
Building a model from scratch with great results is hard
😓

Stand on the shoulders of giants using a pre-trained model!

Here is a tutorial doing just that: tensorflow.org/tutorials/audi…

This model can easily be used on mobile devices and in the browser!

👇
Read 8 tweets
Mar 8, 2023
🐦🦅🦆🦉🦜+ 🤖🧠 = 💰

I've been working with ML for the Audio domain for a while

At first I couldn't understand much but as I kept reading I managed to figure out some things.

Let me share some of the basic theory with you:
🎙️🧑‍🏫

👇
Sound is a vibration that propagates as an acoustic wave.

It has some properties:
• Frequency
• Amplitude
• Speed
• Direction

For us, Frequency and Amplitude are the important features.

en.wikipedia.org/wiki/Sound#Sou…

👇
An important aspect is that sounds are a mixture of component sinusoidal waves (waves that follow a sine curve) of different frequencies.

From the equation below:
• A is amplitude
• f is frequency
• t is time

👇

gist.github.com/gustheman/9101…
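The equation image isn't reproduced here. Assuming it is the usual sinusoid y(t) = A · sin(2π · f · t), which matches the A, f and t listed above, here is a minimal sketch that samples two such waves and mixes them. The function names, sample rate and the 440 Hz / 880 Hz values are my own illustrative choices, not taken from the gist.

```python
import math


def sine_wave(amplitude, frequency_hz, duration_s, sample_rate_hz):
    """Sample y(t) = A * sin(2 * pi * f * t) at evenly spaced times."""
    n_samples = round(duration_s * sample_rate_hz)
    return [
        amplitude * math.sin(2 * math.pi * frequency_hz * (i / sample_rate_hz))
        for i in range(n_samples)
    ]


def mix(*waves):
    """Add waves sample by sample: a sound as a sum of sinusoids."""
    return [sum(samples) for samples in zip(*waves)]


# Two component waves with different frequency and amplitude.
low = sine_wave(amplitude=1.0, frequency_hz=440.0, duration_s=0.01, sample_rate_hz=8000)
high = sine_wave(amplitude=0.5, frequency_hz=880.0, duration_s=0.01, sample_rate_hz=8000)

mixture = mix(low, high)
print(len(mixture), "samples in the mixed signal")
```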
Read 15 tweets
Sep 12, 2022
"How do I learn Python?"
🤔

3 tips:

• Do one basic tutorial 🤓
• Practice, practice, practice 💪🏾
• Start/Find a project to apply what you learned 🧐

"Ok Gus, how about some links?"

👇
I have three very good Python tutorials to get you started:

1⃣ Kaggle course: kaggle.com/learn/python

• Kaggle Kernels allow you to try the code in the browser
• Very good pace of content
• Fun and challenging puzzles

👇
2⃣ THE Python tutorial: docs.python.org/3/tutorial/

This is the official one and I like it very much

It explains how things work very directly, without fun exercises, but some people prefer this approach

👇
Read 9 tweets
Aug 29, 2022
When we talk about Decision Trees in Machine Learning, one of the most popular and powerful algorithms is Gradient Boosted Decision Trees

Do you know how it works?🤔

Let me give you an easy explanation of how it works…👀

👇
Gradient Boosted Decision Trees (GBDT) is an ensemble method -> it's based on a set of other smaller models

The smaller models are just Simple Decision trees, similar to the Random Forest algorithm

👇
Random Forest and GBDT both use a set of basic Simple Decision Trees but are trained and work differently

The idea of the GBDT algo:

➡️to improve your model's prediction, you add new trees that make its error (distance from its prediction to the real label) smaller

🤔😵‍💫

👇
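To make the "add new trees that shrink the error" idea concrete, here is a minimal sketch, not a real GBDT library. It assumes a regression problem with one numeric feature, squared error, and depth-1 trees (stumps). All function names, the toy data and the learning rate are made up for illustration.

```python
def fit_stump(xs, residuals):
    """Fit a depth-1 tree: pick the split that best predicts the residuals."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue  # a split must put points on both sides
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]  # (threshold, left_value, right_value)


def predict(base, stumps, learning_rate, x):
    """Start from the base value and add every stump's scaled correction."""
    total = base
    for threshold, left_value, right_value in stumps:
        total += learning_rate * (left_value if x <= threshold else right_value)
    return total


def fit_boosted_stumps(xs, ys, n_trees=20, learning_rate=0.5):
    """Each new stump is trained on what the current model still gets wrong."""
    base = sum(ys) / len(ys)
    stumps, errors = [], []
    for _ in range(n_trees):
        preds = [predict(base, stumps, learning_rate, x) for x in xs]
        errors.append(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))
        residuals = [y - p for y, p in zip(ys, preds)]
        stumps.append(fit_stump(xs, residuals))
    return base, stumps, errors


# Toy data (hypothetical): y grows roughly linearly with x.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.2, 1.9, 3.1, 3.9, 5.2, 6.1, 6.8, 8.1]

base, stumps, errors = fit_boosted_stumps(xs, ys)
print(f"training MSE before any tree: {errors[0]:.4f}")
print(f"training MSE before the last tree: {errors[-1]:.4f}")
```

With squared error, the residual (label minus prediction) is exactly the direction that reduces the error, which is why each new tree is fit to it.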
Read 9 tweets
Jul 28, 2022
What are Python🐍 decorators🎀?

The Decorator is an object-oriented pattern that allows behavior to be added to individual objects

Decorators can be more efficient than subclassing, and in some cases they can make your code 1000 times faster!👀🤯

👇
Python supports this pattern and you can apply decorators to functions like this:

@some_decorator
def my_function():
    ...  # some code

This is a form of metaprogramming.

👇
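There are many built-in decorators available, like lru_cache (from functools) and dataclass (from dataclasses). A singleton decorator is a common pattern too, but you have to write it yourself.

lru_cache creates an automatic cache for you.
The cache, in the code below, gives a ~1000 times improvement!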
⚡️🤯
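The original code image isn't reproduced here, so this is a minimal sketch of the same idea using a recursive Fibonacci function, a common example that I'm assuming for illustration. The function names and n=24 are my choices, and the speedup you measure will vary with n and your machine.

```python
import time
from functools import lru_cache


def fib_plain(n):
    """Naive recursion: recomputes the same values over and over."""
    return n if n < 2 else fib_plain(n - 1) + fib_plain(n - 2)


@lru_cache(maxsize=None)
def fib_cached(n):
    """Same logic, but each result is computed once and then cached."""
    return n if n < 2 else fib_cached(n - 1) + fib_cached(n - 2)


def time_call(fn, n):
    """Return the function result and how long the call took."""
    start = time.perf_counter()
    result = fn(n)
    return result, time.perf_counter() - start


plain_result, plain_seconds = time_call(fib_plain, 24)
cached_result, cached_seconds = time_call(fib_cached, 24)

print(f"plain:  {plain_result} in {plain_seconds:.6f}s")
print(f"cached: {cached_result} in {cached_seconds:.6f}s")
print(fib_cached.cache_info())
```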
Read 5 tweets
