I sometimes see #SingleCell papers where the authors treat a dimensionality-reduced plot as ground truth. Here's a quick, simple example showing why that can be problematic. 1/n
Here are three random distributions plotted at the vertices of an equilateral triangle. As you can see, the mean distance between any two clusters is equal. 2/n
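For anyone who wants to reproduce the setup, here is a minimal sketch. The thread doesn't say how the data were generated, so the side length (1), the Gaussian noise (sd 0.1), the 500 points per cluster, and the helper names `make_clusters` and `centroid` are all assumptions made for illustration.

```python
import math
import random

# Assumed values, not taken from the thread: side length, noise sd, points per cluster.
SIDE, SD, N = 1.0, 0.1, 500
VERTICES = {1: (0.0, 0.0), 2: (SIDE, 0.0), 3: (SIDE / 2, SIDE * math.sqrt(3) / 2)}


def make_clusters(seed=0):
    """Return {label: [(x, y), ...]} with isotropic Gaussian noise around each vertex."""
    rng = random.Random(seed)
    return {
        label: [(rng.gauss(cx, SD), rng.gauss(cy, SD)) for _ in range(N)]
        for label, (cx, cy) in VERTICES.items()
    }


def centroid(points):
    """Mean x and mean y of a list of 2-D points."""
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


clusters = make_clusters()
centers = {label: centroid(pts) for label, pts in clusters.items()}

# Distance between cluster centers for each of the three pairs.
centroid_distances = {
    (a, b): math.dist(centers[a], centers[b]) for a, b in [(1, 2), (1, 3), (2, 3)]
}
for pair, d in centroid_distances.items():
    print(pair, round(d, 3))
```

All three printed distances should land close to the side length, which is the property the rest of the thread tries (and fails) to keep in 1-D.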
Right now we're using two dimensions. What if we want to reduce down to 1-D? Well, one simple solution would be to just get rid of the y values. The distances between clusters 3 & 1 and 3 & 2 are about equal, but now 1 & 2 are farther apart than either of those pairs. Not great. 3/n
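The effect of dropping y can be checked on the triangle's vertices alone. This sketch uses noise-free cluster centers and assumes a side length of 1, with cluster 3 at the apex, which matches the description above but is otherwise my choice.

```python
import math

# Cluster centers on an equilateral triangle (side length 1 is an assumption).
centers = {1: (0.0, 0.0), 2: (1.0, 0.0), 3: (0.5, math.sqrt(3) / 2)}
pairs = [(1, 2), (1, 3), (2, 3)]

# Distances in the original 2-D space: all equal to the side length.
dist_2d = {p: math.dist(centers[p[0]], centers[p[1]]) for p in pairs}

# "Reduce" to 1-D by keeping only x, then measure the same pairs again.
dist_1d = {p: abs(centers[p[0]][0] - centers[p[1]][0]) for p in pairs}

print(dist_2d)
print(dist_1d)  # (1, 2) stays at 1.0; (1, 3) and (2, 3) shrink to 0.5
```

Clusters 1 & 2 keep their full separation while cluster 3 lands halfway between them, so the 1-D picture says 1 & 2 are twice as far apart as any pair involving 3.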
What about something fancier? Here is the first principal component of a PCA run on our two dimensions. Clusters 1 & 3 are almost on top of each other now and none of the distances are equal. Also not great. 4/n
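Here is a rough sketch of that PCA step using only the standard library. The data settings (side length 1, sd 0.1, 500 points per cluster, seed 0) are assumptions, and the first principal component is computed with the closed-form angle for a 2x2 covariance matrix instead of a PCA library. Because this layout is nearly symmetric, the PC1 direction is driven largely by sampling noise, so which clusters end up overlapping may differ from the plot described above.

```python
import math
import random

rng = random.Random(0)
# Assumed data: 500 points per cluster, sd 0.1, triangle side length 1.
vertices = {1: (0.0, 0.0), 2: (1.0, 0.0), 3: (0.5, math.sqrt(3) / 2)}
labels, points = [], []
for label, (cx, cy) in vertices.items():
    for _ in range(500):
        labels.append(label)
        points.append((rng.gauss(cx, 0.1), rng.gauss(cy, 0.1)))

# Center the data and build the entries of the 2x2 sample covariance matrix.
n = len(points)
mx = sum(x for x, _ in points) / n
my = sum(y for _, y in points) / n
sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)

# Closed-form angle of the leading eigenvector of a symmetric 2x2 matrix.
theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
ux, uy = math.cos(theta), math.sin(theta)

# PC1 score: projection of each centered point onto that direction.
pc1 = [(x - mx) * ux + (y - my) * uy for x, y in points]

# Mean PC1 score per cluster, then the three pairwise gaps, smallest first.
pc1_mean = {
    k: sum(s for s, lab in zip(pc1, labels) if lab == k) / labels.count(k)
    for k in vertices
}
gaps = sorted(abs(pc1_mean[a] - pc1_mean[b]) for a, b in [(1, 2), (1, 3), (2, 3)])
print(pc1_mean)
print(gaps)
```

Whatever direction PC1 picks, the largest gap is the sum of the other two, so the three gaps can't all match the way they did in 2-D.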
What about something EVEN FANCIER? UMAP has an... interesting solution. Again, the distances are completely unequal, and now we've split up one of our clusters. Really not great. 5/n
So why did they all fail? Because I set up an impossible problem. There is no way to preserve in 1-D the information I encoded in 2-D. That isn't to say that dimensionality reduction is always terrible, but it ALWAYS loses information. This is important to keep in mind! 6/n
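One way to see why no 1-D layout can work, written out as a short argument. The symbols $p_1, p_2, p_3$ (the positions of the three cluster centers on the line, sorted left to right) and $d$ (the common distance) are introduced here for illustration.

```latex
p_1 \le p_2 \le p_3
\quad\Longrightarrow\quad
|p_3 - p_1| = |p_2 - p_1| + |p_3 - p_2|

\text{If all three pairwise distances equal } d:\qquad
d = d + d \;\Longrightarrow\; d = 0
```

So the only way to keep three clusters equidistant on a line is to stack them on the same point, which throws away the separation instead.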
I think dimensionality reduction is great and I use it in all my papers, but it's not ground truth! Your high-D dataset has more information than your reduced dimensions, so don't throw that away! Analyze in high-D; visualize in 2-D. Thanks for coming to my TED talk. n/n

Thread by David Glass, PhD (@david_r_glass)
