Caleb Lareau
Assistant Professor at Memorial Sloan Kettering. Interested in computational and translational immunology. cat dad.
Jun 29, 2023 22 tweets 8 min read
Out today in @NatureGenet, our work utilizing single-cell multi-omics to define specific immune cell subsets that experience purifying selection against pathogenic mitochondrial DNA. Check out a thread about this work below: 1/22 nature.com/articles/s4158… I'm so grateful to have worked with a terrific team of co-authors. A special thanks to @LeifLudwig (and his new group!), the @Agarwal_Lab for enabling this work with precious patient samples, and supervision and support from @Satpathology, @bloodgenes, and Aviv Regev 🙏 2/22
Mar 31, 2021 25 tweets 8 min read
I'm genuinely excited to see the extension of methods to discern clonal structure from mtDNA variants-- something that I've thought about over the years. However, I'm going to comment on a couple of points in this pre-print to hopefully show why mtDNA tracing isn't easy 1/n Disclosure: I am the author of mgatk, a method that is compared against in this new approach (Mquad), so I must admit that I was a bit peeved by the statement "there is a lack of effective computational methods to identify informative mtDNA variants" (to qualify this thread) 2/n
Oct 30, 2019 31 tweets 10 min read
A common assumption in #singlecell data is that 1 cell = 1 barcode. You've heard of cell doublets, but what about barcode doublets? In work with the great team of Sai Ma, @fabianamduarte, and @JD_Buenrostro, we quantify this effect: biorxiv.org/content/10.110…. Thread: 1/n In this paper, we coin the term 'barcode multiplets' to describe instances where a barcode was derived from a droplet with more than one high-quality oligonucleotide sequence. We use the term multiplet as we often find instances where there are more than two barcodes! 2/n
Feb 20, 2019 20 tweets 5 min read
Thanks @aayushraman @tangming2005. Here's a few thoughts on how I see the landscape of tools (thread) 1/n In HiChIP data analyses, there are two primary problems that we are trying to solve. A) Which anchors (i.e. genomic loci) should be used as a feature set and B) which loops (i.e. interactions between pairs of loci) are important in the data. 2/n