Ming
Mar 15 · 15 tweets · 3 min read
1/9 Every bulk RNA-seq experiment I run goes through the same 7 checks before I trust the results.

I've been burned enough times to know: if you skip QC, you will find out the hard way. Usually during a meeting with your collaborator.

Here's my checklist:
2/9 Check 1: FastQC + MultiQC on raw reads.

Before anything else. You're looking for adapter contamination, GC bias, per-base quality drops, and overrepresented sequences.

I've caught entire lanes of garbage data at this step. Five minutes that save you days.
3/9 Check 2: Mapping rate.

After alignment (STAR, HISAT2, whatever you use), check the percentage of uniquely mapped reads. I want to see >70% for most human/mouse experiments.

Low mapping rate? Could be contamination, wrong reference genome, or your library prep went sideways.
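
A minimal sketch of this check in Python, assuming STAR was the aligner. STAR writes a `Uniquely mapped reads %` line to each sample's `Log.final.out`; the helper names (`uniquely_mapped_pct`, `flag_low_mapping`), the sample names, and the numbers below are made up for illustration. Other aligners such as HISAT2 report their summary in a different layout and would need their own parser.

```python
def uniquely_mapped_pct(log_text: str) -> float:
    """Return the 'Uniquely mapped reads %' value from STAR Log.final.out text."""
    for line in log_text.splitlines():
        key, sep, value = line.partition("|")
        if sep and key.strip() == "Uniquely mapped reads %":
            return float(value.strip().rstrip("%"))
    raise ValueError("no 'Uniquely mapped reads %' line found")


def flag_low_mapping(logs: dict[str, str], threshold: float = 70.0) -> list[str]:
    """Names of samples whose unique mapping rate falls below the threshold."""
    return [name for name, text in logs.items()
            if uniquely_mapped_pct(text) < threshold]


# Toy excerpts in the Log.final.out layout; sample names and numbers are made up.
example_logs = {
    "ctrl_1": "      Uniquely mapped reads % |\t91.20%\n",
    "treat_1": "      Uniquely mapped reads % |\t48.75%\n",
}
print(flag_low_mapping(example_logs))  # ['treat_1']
```

In practice you would read each file with `open(path).read()` and pass the text in. The 70% default mirrors the rule of thumb above and should be adjusted for other organisms or library types.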
4/9 Check 3: Gene body coverage.

Run RSeQC's geneBody_coverage.py. You want a roughly even signal across the gene body for poly-A enriched libraries.
Heavy 3' bias? Degraded RNA. Heavy 5' bias? Rare, but possibly a fragmentation issue. Either way, you need to know before you start calling DE genes.
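
One way to put a number on "heavy bias" is a simple end-to-end ratio. This is a rough sketch, not RSeQC's own metric: it assumes you already have a coverage profile as a list of values ordered from the 5' end to the 3' end (for example, 100 percentile bins), and `three_prime_bias` and the toy profiles are hypothetical.

```python
def three_prime_bias(coverage, window=20):
    """Mean coverage of the last `window` bins (3' end) divided by the
    mean coverage of the first `window` bins (5' end). About 1 means even."""
    if len(coverage) < 2 * window:
        raise ValueError("coverage profile is shorter than two windows")
    five_prime = sum(coverage[:window]) / window
    three_prime = sum(coverage[-window:]) / window
    if five_prime == 0:
        return float("inf")  # no 5' coverage at all
    return three_prime / five_prime


# Toy profiles, 5' -> 3', one value per percentile bin (made-up numbers).
even = [100.0] * 100
degraded = [10.0 + 2.0 * i for i in range(100)]  # coverage climbs toward the 3' end

print(three_prime_bias(even))      # 1.0
print(three_prime_bias(degraded))
```

A ratio well above 1 points to 3' bias and a ratio well below 1 to 5' bias. Where to draw the line is a judgment call, so compare samples against each other before setting a cutoff.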
5/9 Check 4: PCA plot. The most important plot in the entire analysis.

Do your samples cluster by the biology (treatment, condition) or by batch, lane, or extraction date?
If your PC1 is "which day the RNA was extracted," you have a batch effect problem. Fix it now or your DE results are noise.
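
A small sketch of the PCA check using NumPy, assuming a genes-by-samples matrix of raw counts. It uses a plain log2(count + 1) transform for simplicity; a variance-stabilizing transform from a DE package would be the more usual choice. `pca_scores`, the toy counts, and the condition and batch labels are all made up.

```python
import numpy as np


def pca_scores(counts: np.ndarray, n_components: int = 2):
    """PCA on log2(count + 1). Rows are genes, columns are samples.
    Returns (per-sample scores, fraction of variance explained per PC)."""
    logged = np.log2(counts + 1.0)
    # Put samples in rows, then center each gene across samples.
    x = logged.T - logged.T.mean(axis=0)
    u, s, _ = np.linalg.svd(x, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]


# Toy data: 6 genes x 4 samples. Three genes shift with extraction day,
# one gene shifts with treatment.
counts = np.array([
    [100, 110, 900, 950],
    [80, 85, 700, 720],
    [50, 55, 400, 420],
    [20, 40, 22, 41],
    [10, 11, 10, 12],
    [200, 210, 205, 195],
], dtype=float)
condition = ["ctrl", "treat", "ctrl", "treat"]
batch = ["day1", "day1", "day2", "day2"]

scores, explained = pca_scores(counts)
for cond, day, pc1 in zip(condition, batch, scores[:, 0]):
    print(f"{cond:6s} {day:5s} PC1={pc1:+.2f}")
print("variance explained:", np.round(explained, 3))
```

In this toy example PC1 splits the samples by extraction day, not by treatment, which is exactly the pattern to worry about. Coloring the same scores by every piece of metadata you have is the quickest way to spot it.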
6/9 Check 5: Sample-to-sample correlation heatmap.

Complements the PCA. Hierarchical clustering should group replicates together.

If one replicate clusters with the wrong group, you either have a sample swap or an outlier. I've caught mislabeled samples this way more than once.
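
The same idea without the heatmap: for each sample, find its most correlated partner and check whether the two share a group label. This is a simplified stand-in for hierarchical clustering, assuming log-scale expression profiles; `pearson`, `suspicious_samples`, and the toy values are hypothetical.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def suspicious_samples(profiles, groups):
    """Samples whose best-correlated partner carries a different group label."""
    flagged = []
    for name, values in profiles.items():
        best = max((other for other in profiles if other != name),
                   key=lambda other: pearson(values, profiles[other]))
        if groups[best] != groups[name]:
            flagged.append(name)
    return flagged


# Toy log-expression profiles over 5 genes (made-up numbers).
# "treat_3" is labeled as treated but looks like a control.
profiles = {
    "ctrl_1": [5.0, 8.0, 2.0, 7.0, 3.0],
    "ctrl_2": [5.1, 7.9, 2.1, 7.0, 3.0],
    "ctrl_3": [5.0, 8.1, 2.0, 6.9, 3.1],
    "treat_1": [8.0, 3.0, 6.0, 2.0, 7.0],
    "treat_2": [8.1, 3.1, 5.9, 2.0, 7.1],
    "treat_3": [5.5, 7.5, 2.5, 6.5, 3.5],
}
groups = {name: name.split("_")[0] for name in profiles}

print(suspicious_samples(profiles, groups))  # ['treat_3']
```

A flagged sample is a prompt to go back to the sample sheet, not proof of a swap. A real outlier can trip the same check.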
7/9 Check 6: Library complexity / duplication rate.

Run Picard's MarkDuplicates, or just check the duplication estimate FastQC already gave you in Check 1. High duplication (>60%) usually means you started with too little input material or over-amplified the library.
Your "20 million reads" might actually be 5 million unique reads. That changes everything for statistical power.
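
The arithmetic behind that statement, as a tiny sketch. `unique_reads` is a hypothetical helper, and the 75% figure is simply the duplication rate that turns 20 million reads into 5 million. Keep in mind that in RNA-seq some duplicates come from highly expressed genes and are real signal, so treat the result as a conservative estimate.

```python
def unique_reads(total_reads: int, duplication_rate: float) -> int:
    """Estimated non-duplicate reads, given a duplication fraction in [0, 1]."""
    if not 0.0 <= duplication_rate <= 1.0:
        raise ValueError("duplication_rate must be a fraction between 0 and 1")
    return round(total_reads * (1.0 - duplication_rate))


print(unique_reads(20_000_000, 0.75))  # 5000000
print(unique_reads(20_000_000, 0.60))  # 8000000
```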
8/9 Check 7: Count distribution and filtering.

After quantification, look at the distribution of counts per gene. Filter low-count genes (I typically require >10 counts in at least n samples where n = your smallest group size).
Also check for genes driving >5% of total counts. One mitochondrial gene eating half your library is more common than you think.
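
Both rules fit in a few lines of plain Python. This is a sketch under assumptions: the count table is a dict of gene name to per-sample counts, the gene names and numbers are made up, and the 5% check is applied to counts summed over all samples (checking each sample separately would be stricter). `keep_gene` and `dominant_genes` are hypothetical helpers.

```python
def keep_gene(counts, min_count=10, min_samples=3):
    """True if the gene has more than min_count reads in at least min_samples samples."""
    return sum(c > min_count for c in counts) >= min_samples


def dominant_genes(count_table, max_fraction=0.05):
    """Genes whose summed counts exceed max_fraction of all counts in the table."""
    gene_totals = {gene: sum(row) for gene, row in count_table.items()}
    grand_total = sum(gene_totals.values())
    return [gene for gene, total in gene_totals.items()
            if total > max_fraction * grand_total]


# Toy table: 3 control + 3 treated samples, so the smallest group size is 3.
count_table = {
    "mito_gene": [52000, 48000, 50000, 51000, 49000, 50000],
    "housekeeping_gene": [900, 1100, 1000, 950, 1050, 1000],
    "gene_a": [0, 2, 15, 1, 0, 3],
    "gene_b": [12, 14, 11, 0, 1, 2],
}

kept = [gene for gene, row in count_table.items() if keep_gene(row, min_samples=3)]
print(kept)                         # ['mito_gene', 'housekeeping_gene', 'gene_b']
print(dominant_genes(count_table))  # ['mito_gene']
```

Note that `gene_b` survives because it clears the threshold in all three samples of one group, which is the point of tying n to the smallest group size.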
9/9 I run these 7 checks on every single dataset. No exceptions.

It takes about 30 minutes for a typical experiment. I've written Snakemake pipelines that automate most of it.
The alternative is spending two weeks on a differential expression analysis, presenting results, and having someone ask "did you check for batch effects?" while you stare at the floor.

Ask me how I know.
