Ramon Fritsch
Jan 14, 14 tweets, 4 min read
Let me tell you how I took the CPU usage of an 8-core server down from 800% to 140%.
A while ago I got an email from @linode saying the server was peaking at 770% for a few hours. It was just a heads-up, since dedicated servers are designed to sustain such load for a long time without any issues.
First I suspected a DDoS attack. Looking at the network traffic chart, I noticed most of the traffic was happening between servers. (@letsavee has 2 main servers, www and database.)
Ran `nethogs` on the www machine to inspect this substantial traffic. All of it was being sent/received by the www NodeJS processes.
Ran `nethogs` on the database machine as well. All of it was being sent/received by the MongoDB process.
Ran `mongostat` to take a look at the query load on Mongo. Ouch, ~15000 queries in the queue. I don't remember ever seeing that many before.
Decided to take a closer look at the queries. Running `db.setProfilingLevel(2, 0);` on Mongo makes it log EVERY SINGLE QUERY into a file for later inspection. I let it run for a few minutes, then switched back to the previous setting.
Next I downloaded that log file and wrote a custom script to count how many queries each table had during that period. It turned out to be mostly the teamusers table.
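The counting script itself isn't shown in the thread. As a rough sketch of the idea, assuming the log uses the structured JSON format that MongoDB 4.4+ writes (one JSON object per line, with the namespace under `attr.ns`), something like this would tally queries per table. The function name and the sample lines are made up for illustration.

```javascript
// Hypothetical helper: tally logged operations per namespace ("db.collection").
// Assumes one JSON document per line with the namespace in `attr.ns`,
// as in MongoDB 4.4+ structured logs; adjust the parsing for other formats.
function countQueriesByNamespace(lines) {
  const counts = new Map();
  for (const line of lines) {
    let entry;
    try {
      entry = JSON.parse(line);
    } catch {
      continue; // skip lines that are not valid JSON
    }
    const ns = entry && entry.attr && entry.attr.ns;
    if (typeof ns !== "string") continue; // entry carries no namespace
    counts.set(ns, (counts.get(ns) || 0) + 1);
  }
  // Busiest namespaces first.
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

// Made-up sample lines, only to show the shape of the input.
const sampleLines = [
  '{"c":"COMMAND","msg":"Slow query","attr":{"ns":"app.teamusers"}}',
  '{"c":"COMMAND","msg":"Slow query","attr":{"ns":"app.teamusers"}}',
  '{"c":"COMMAND","msg":"Slow query","attr":{"ns":"app.users"}}',
  '{"c":"NETWORK","msg":"Connection accepted","attr":{}}',
  "not json",
];

console.log(countQueriesByNamespace(sampleLines));
```

For a real log, the lines could come from reading the downloaded file and splitting it on newlines.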
Then I inspected the code to look for bad smells. We're low-key rolling out the Teams plan to test for adoption. This made the check for whether a user has valid Pro privileges more complex, as it now has to run an aggregation to see if the user is a member of a valid paying team.
Found the query and introduced a cache layer for it, invalidating the cache every time something related to this query changes (user is added to a team, team admin's subscription expires…). This is super easy to get out of sync, so I use this technique only in a few strategic places.
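The thread doesn't show the cache code, so here is only a minimal in-memory sketch of the pattern, not the actual implementation. `ProStatusCache`, `loadProStatus` and the user names are hypothetical, and the loader stands in for the expensive aggregation.

```javascript
// Minimal cache-with-invalidation sketch. `loadProStatus` is a hypothetical
// placeholder for the expensive aggregation, not the real query.
class ProStatusCache {
  constructor(loadProStatus) {
    this.loadProStatus = loadProStatus;
    this.byUserId = new Map();
  }

  // Return the cached answer, or compute and store it on a miss.
  get(userId) {
    if (!this.byUserId.has(userId)) {
      this.byUserId.set(userId, this.loadProStatus(userId));
    }
    return this.byUserId.get(userId);
  }

  // Call whenever something the query depends on changes for these users,
  // e.g. a user joins a team or a team admin's subscription expires.
  invalidate(userIds) {
    for (const userId of userIds) this.byUserId.delete(userId);
  }
}

// Usage with a fake loader that counts how often the "database" is hit.
let dbHits = 0;
const payingUsers = new Set(["alice"]);
const cache = new ProStatusCache((userId) => {
  dbHits += 1;
  return payingUsers.has(userId);
});

cache.get("bob"); // miss: calls the loader
cache.get("bob"); // hit: served from memory
payingUsers.add("bob"); // e.g. bob is added to a paying team
cache.invalidate(["bob"]); // without this, bob would stay cached as non-Pro
console.log(cache.get("bob"), dbHits);
```

Every code path that changes team membership or subscriptions has to remember to call `invalidate`, which is where the out-of-sync risk comes from. With several server processes, each one would also hold its own copy of a cache like this.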
Here's what those scary CPU/Network charts look like now. Huge drop!
CPU: 800% to 130%
Private Network In: 52Mbps to 3Mbps
Of course this also had an impact on the database server. YEAH!
The best part was a power user sending a DM saying he felt the difference right away. He mentioned his page load time dropped from ~30s to ~6s and the API got much faster as well.

It makes all the effort worth it!