Adarsh · Feb 9 · 9 tweets
tldr

spent the weekend debugging why our scraping workers kept crashing 😅
turns out processing everything in parallel was eating up memory and causing OOM errors
switched to sequential processing, added some GC triggers, and bumped up memory limits.
back to 95%+ uptime now 🎉
1. how it was working before:
we were scraping 7 platforms (X, reddit, youtube, tiktok, instagram, threads, pinterest) all at once using Promise.all()
worked fine initially. but as we scaled and sessions got bigger, memory started spiking...
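
a rough sketch of that fan-out (scrapePlatform / saveResults are placeholder names, not the actual code):

```js
// old approach (sketch): fan out to every platform at once
const PLATFORMS = ['x', 'reddit', 'youtube', 'tiktok', 'instagram', 'threads', 'pinterest'];

async function runSession(session) {
  // all seven scrapes run concurrently, so all of their results sit in memory together
  const results = await Promise.all(
    PLATFORMS.map((platform) => scrapePlatform(platform, session)) // placeholder scraper
  );
  // nothing gets persisted until the slowest platform has finished
  await saveResults(results.flat()); // placeholder persistence
}
```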
2. first attempt to fix:
thought it was just a memory limit issue. bumped from 512MB to 1GB in the dockerfile
workers still crashed 💥
checked logs: "JavaScript heap out of memory" errors everywhere
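
side note: that error comes from V8's own heap cap, which isn't necessarily the same as the container limit. a minimal check, assuming a plain Node worker (the flag in the comment is the usual way to raise it):

```js
// sanity check: what heap limit is the worker actually running with?
const v8 = require('node:v8');

const limitMb = v8.getHeapStatistics().heap_size_limit / 1024 / 1024;
console.log(`V8 heap limit: ${Math.round(limitMb)} MB`);
// bumping the container to 1GB doesn't necessarily change this number; the node
// process may also need something like: node --max-old-space-size=1024 worker.js
```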
3. second attempt:
ok maybe it's the batch size? reduced SESSION_BATCH_SIZE from 5 to 2
still crashing. tried adding manual garbage collection triggers
nope. still OOM errors 😤
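
for reference, a manual GC trigger is basically this (only does anything when node is started with --expose-gc):

```js
// manual GC trigger (sketch) - requires starting node with --expose-gc
function forceGc() {
  if (typeof global.gc === 'function') {
    global.gc(); // asks V8 for a full collection, but it can't free objects that are still referenced
  }
}
```

which is why it didn't help here: the Promise.all() results were still referenced, so there was nothing for GC to reclaim.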
4. third attempt:
maybe we're loading too much data? limited the existing-IDs lookup to 500 per platform
added retry logic with exponential backoff for transient errors
workers still dying. status page showing red segments every few hours 📉
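
the backoff part looked roughly like this (a generic sketch, not the exact code):

```js
// generic retry with exponential backoff (sketch)
async function withRetry(fn, { attempts = 3, baseDelayMs = 1000 } = {}) {
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt === attempts) throw err; // out of retries: surface the error
      const delayMs = baseDelayMs * 2 ** (attempt - 1); // 1s, 2s, 4s, ...
      await new Promise((resolve) => setTimeout(resolve, delayMs)); // wait before retrying
    }
  }
}
```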
5. the realization:
was staring at memory graphs and it hit me - we're holding ALL platform data in memory at once
twitter + reddit + youtube + tiktok + instagram + threads + pinterest = 💥
even with 1GB, when each platform returns hundreds of posts, it adds up fast
6. the final fix:
switched to sequential processing. one platform at a time, save to DB immediately, then move on
added explicit GC after each platform finishes
kept the 1GB limit but now it's actually enough
added retry logic so transient API errors don't kill the whole job
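
putting it together, roughly (placeholder names again; scrapePlatform / savePosts aren't the real functions, PLATFORMS and withRetry are from the sketches above):

```js
// new approach (sketch): one platform at a time, persist immediately, then release
async function runSession(session) {
  for (const platform of PLATFORMS) {
    // only one platform's results live in memory at any given moment
    const posts = await withRetry(() => scrapePlatform(platform, session));
    await savePosts(platform, posts); // save to DB right away (placeholder persistence)
    if (typeof global.gc === 'function') {
      global.gc(); // explicit GC between platforms, needs --expose-gc
    }
  }
}
```

slower end-to-end than the parallel version, but peak memory is now roughly one platform's worth instead of all seven at once.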
7. results:
went from intermittent crashes to 95%+ uptime ✨
response times stabilized (no more 2s spikes)
workers process jobs reliably now
sometimes slower is better. especially when "faster" means your service is down
8. bonus win:
was running on a 13€/month server (16GB RAM) but only using like 2-3GB
after the fix, downgraded to 5€/month (8GB) and it's still more than enough 💰
