Ming
Director of bioinformatics at AstraZeneca. YouTube at chatomics. On my way to helping 1 million people learn bioinformatics. Also talks about leadership.
Mar 15 15 tweets 3 min read
1/9 Every bulk RNA-seq experiment I run goes through the same 7 checks before I trust the results.

I've been burned enough times to know: if you skip QC, you will find out the hard way. Usually during a meeting with your collaborator.

Here's my checklist:
2/9 Check 1: FastQC + MultiQC on raw reads.

Before anything else. You're looking for adapter contamination, GC bias, per-base quality drops, and overrepresented sequences.

I've caught entire lanes of garbage data at this step. Five minutes that saves you days.
Mar 13 16 tweets 3 min read
1/ AI won't replace you.
But a biologist using AI will.
Especially in bioinformatics, where the questions never stop coming.
2/
In 2013, if you had a problem, you went to Biostars or StackOverflow.
You waited days. Or you searched for hours.
Mar 12 14 tweets 3 min read
Anthropic just dropped a 33-page guide on building skills for Claude.

I read the whole thing. Here's what actually matters:
1/ First, what's a skill?

A folder with a SKILL.md file. That's it. You write instructions in markdown, and Claude follows them every time instead of you re-explaining your workflow in every conversation.
Mar 12 8 tweets 2 min read
Claude Code can't see your browser.

But it can run Playwright. And that changes everything for testing web apps. github.com/microsoft/play…
1/ Setup takes one command:

npm init playwright@latest

Claude Code runs it, scaffolds the config, and starts writing tests. Headless by default. No browser window needed.
Mar 9 10 tweets 2 min read
Claude Code keeps asking for permission every 3 seconds and you're about to lose your mind.

Here's how to fix it, from nuclear to surgical:
1/ The nuclear option: --dangerously-skip-permissions

claude --dangerously-skip-permissions "build the web app"

Claude bypasses ALL permission prompts. Runs until completion. Zero interruptions.

Anthropic's own engineers use it. But always in a container. Never on bare metal.
Mar 8 7 tweets 2 min read
6 links on workflow to make your life easier 🧵
Bioinformatics analysis involves a lot of steps. Here are 6 links on workflows to make your life easier:

1. Hundreds of workflow tools and engines github.com/pditommaso/awe…
2. See also the list from the CWL wiki github.com/common-workflo…
Mar 6 10 tweets 2 min read
Claude Code has a memory problem. Every new session starts from scratch.

claude-mem fixes that. And it just crossed 33,000 stars on GitHub.
Here's how it works.

It runs as a Claude Code plugin. Every session, it quietly captures what Claude does -- tool calls, file reads, code changes -- and compresses them into semantic summaries.
Mar 4 12 tweets 2 min read
Batch effects once caused 162 patients to be misclassified.

28 of them received incorrect or unnecessary chemotherapy.

The culprit? Contaminated RNA extraction that introduced technical artifacts into the data.
This is a patient safety problem disguised as a bioinformatics problem.

A review in Genome Biology lays out the full picture.
Mar 3 10 tweets 3 min read
Think you know it? 8 links to BETTER understand principal component analysis (PCA) 🧵 👇
1. PCA in action, my blog post on calculating SVD and PCA with #rstats divingintogeneticsandgenomics.rbind.io/post/pca-in-ac…
Mar 2 8 tweets 3 min read
5 tools to visualize genomic datasets 🧵
1. karyoploteR bernatgel.github.io/karyoploter_tu…
I used it to plot single-cell ATAC-seq tracks github.com/crazyhottommy/…, more examples rpubs.com/crazyhottommy/…
Feb 27 14 tweets 4 min read
10 websites for drawing scientific figures 👇
1. BioRender biorender.com
Feb 19 14 tweets 3 min read
Why are there intronic reads in your bulk RNA-seq data?
You're not alone—it's common, and the reasons are more layered than you think.
Let’s break it down. 🧵
1/
Your RNA-seq targets mature mRNA, right?
So why do you see intronic reads?
Here’s why—some technical, some biological.
Feb 18 14 tweets 4 min read
🧵 10 free bioinformatics tools you should know in 2026. These will save you time, money, and headaches.
1/ NCBI BLAST: Still the gold standard for sequence alignment. Compare your sequences against massive databases.

Challenge: use Claude Code to rewrite BLAST to make it faster! I know someone did it. blast.ncbi.nlm.nih.gov
Feb 17 13 tweets 3 min read
🧵 Need gene lengths for every mouse gene? Bioconductor annotation packages make this simple. Here's how to get gene lengths using TxDb.Mmusculus and org.Mm.eg.db.
1/ Why gene lengths matter: You need them for normalizing ChIP-seq signals in gene bodies (like H3K36me3). For RNA-seq, exon lengths are used to calculate RPKM or TPM.
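A minimal Python sketch of the TPM calculation mentioned above, assuming you already have one length (summed exon length, in base pairs) per gene. The thread itself works in R with Bioconductor; the function name `tpm` and the toy counts and lengths below are made up for illustration.

```python
def tpm(counts, lengths_bp):
    """Convert raw read counts to TPM, given per-gene lengths in base pairs."""
    # Step 1: reads per kilobase (RPK), i.e. counts divided by length in kb.
    rpk = [c / (length / 1000) for c, length in zip(counts, lengths_bp)]
    # Step 2: scale so that all values in the sample sum to one million.
    scale = sum(rpk) / 1e6
    return [r / scale for r in rpk]


# Hypothetical toy data: three genes whose counts grow with their length.
counts = [100, 200, 300]
lengths_bp = [1000, 2000, 3000]

values = tpm(counts, lengths_bp)
print(values)
```

In this toy sample the longer genes collect more reads only because they are longer, so after length normalization all three genes end up with the same TPM.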
Feb 17 16 tweets 3 min read
🧵 OpenClaw went from 0 to 180,000 GitHub stars in 72 hours. Then changed its name twice. Then taught itself to modify its own code. Here's why this AI agent is different from everything else you've tried.
1/ Every AI tool you use lives in the cloud. ChatGPT, Claude, Copilot. You type in a browser, it answers, you copy-paste. OpenClaw runs on YOUR machine. Locally. With full system access.
Feb 12 12 tweets 3 min read
1/ If you're in bioinformatics, you're staring at matrices all day.
RNA-seq? Gene x sample.
scRNA-seq? Gene x cell.
Everything is a matrix.
But I never learned how to think in matrices. And I regret it.
2/
No one told me in school:
To survive bioinformatics, you don’t just need R or Python.
You need linear algebra.
Not fancy. Just the fundamentals.
Jan 23 17 tweets 3 min read
Claude keeps suggesting outdated tools for my RNA-seq analysis.
Then I learned about skills.
Now it actually helps instead of creating work.
1/
A Claude skill is just a text file (markdown) with instructions.
I still remember learning Markdown 12 years ago, and now it is the language these AI tools speak!
Jan 19 14 tweets 3 min read
1/ Women scheduled surgery after being told they had rare BRCA1 variants.
The genetic test was wrong.

University of Exeter analyzed 50,000 samples to find out how often this happens.

The results should worry anyone who's downloaded their 23andMe raw data.
2/
SNP chips test genetic variation at hundreds of thousands of positions across your genome. They're the technology behind most direct-to-consumer genetic testing.

For common variants, they work brilliantly. For rare disease-causing mutations, they fail spectacularly.
Jan 5 16 tweets 3 min read
1/ Your genomics analysis is wrong because you are not careful about this.
A collider is a variable that's the common outcome of two other variables.

When you filter on it, you force those two variables to correlate - even when they're completely independent in the full dataset.
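A small simulation can make this concrete. This is a sketch on made-up data, not anything from a real genomics dataset: `x` and `y` are drawn independently, the collider is simply their sum, and the cutoff of 1.0 is an arbitrary choice for illustration.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return cov / (vx * vy) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000

# Two independent variables.
x = [rng.gauss(0, 1) for _ in range(n)]
y = [rng.gauss(0, 1) for _ in range(n)]

# The collider: a common outcome of both x and y.
collider = [a + b for a, b in zip(x, y)]

# Correlation in the full dataset (should be close to zero).
r_full = pearson(x, y)

# Filter on the collider, keeping only samples above an arbitrary cutoff.
kept = [(a, b) for a, b, c in zip(x, y, collider) if c > 1.0]
r_filtered = pearson([a for a, _ in kept], [b for _, b in kept])

print(f"full dataset: r = {r_full:.3f}")
print(f"after filtering on the collider: r = {r_filtered:.3f}")
```

In the full dataset the correlation sits near zero. After filtering, it turns clearly negative: among the kept samples, a low `x` can only have passed the cutoff if `y` was high, and vice versa.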
Dec 2, 2025 11 tweets 4 min read
10 courses for my dream bioinformatics curriculum:

1. Unix Commands with Greg Wilson youtube.com/watch?v=U3iNcB…
2. Statistics and R with Rafael Irizarry rafalab.dfci.harvard.edu/pages/harvardx…
3. Modern Statistics for Modern Biology with Susan Holmes and Wolfgang Huber: a great resource for statistical applications in biology. huber.embl.de/msmb/
Nov 15, 2025 9 tweets 3 min read
1/ Exploratory Data Analysis (EDA) is the first step in any data analysis journey. When working with RNA-seq data, one of the most commonly used techniques is Principal Component Analysis (PCA). But what exactly is PCA, and why does it matter? Let’s break it down. 🧵👇
2/ What is PCA?

PCA is a mathematical method used to simplify complex datasets. It finds patterns by identifying directions (called principal components) that capture the most variation in the data. Read my post: divingintogeneticsandgenomics.com/post/pca-in-ac…
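To show what "the direction that captures the most variation" means, here is a rough pure-Python sketch that finds only the first principal component. It uses power iteration on the covariance matrix, which is a swapped-in technique chosen to avoid dependencies, not the SVD approach from the blog post. The helper name `first_pc` and the toy data (a second feature that is roughly twice the first) are assumptions for illustration.

```python
import random


def first_pc(rows, iters=200):
    """Return (direction, fraction of variance explained) for the first PC."""
    n, p = len(rows), len(rows[0])
    # Center each feature (column) at zero.
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Power iteration: repeatedly multiply by cov and renormalize.
    v = [1.0] * p
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(val * val for val in w) ** 0.5
        v = [val / norm for val in w]
    # Variance along v, divided by the total variance across features.
    var_along_v = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                      for i in range(p))
    total_var = sum(cov[i][i] for i in range(p))
    return v, var_along_v / total_var


# Hypothetical toy data: feature 2 is about twice feature 1, plus small noise.
rng = random.Random(1)
rows = []
for _ in range(200):
    t = rng.gauss(0, 1)
    rows.append([t, 2 * t + rng.gauss(0, 0.1)])

direction, explained = first_pc(rows)
print(direction, explained)
```

Because the two features move together, a single direction (pointing roughly along "one step in feature 1, two steps in feature 2") accounts for almost all of the variation. For real RNA-seq matrices you would use an established implementation rather than this sketch.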