Discover and read the best of Twitter Threads about #Bioinformatics

Most recent (24)

Only a matter of time before a paper formalized this exercise:

Automated #scRNAseq cell type annotation with GPT4, evaluated across five datasets, 100s of tissues & cell types, human and mouse.

A🧵below with my thoughts on how such tools will change how #Bioinformatics is done.
I'll start with a quick summary of the paper, so that we're all on the same page.

(The paper is also a super quick read: only 3 pages of text, 1 of which is GPT prompts.)

Here's the link to the preprint

biorxiv.org/content/10.110…
The paper looked at 4 already-annotated public datasets: Azimuth, Human Cell Atlas, Human Cell Landscape, Mouse Cell Atlas.

Differentially expressed genes (DEGs) characterizing every cluster in these studies were generally available with the publications & were also downloaded.
Read 18 tweets
1/ Are you a bioinformatics researcher looking for powerful tools to analyse your data? Check out @Bioconductor! Here are some of my favorite packages for #bioinformatics analyses.
2/ First up: #DESeq2 by @mikelove. This package provides a method for differential gene expression analysis of RNA-seq data. It's widely used and highly cited in the field, and it's perfect for identifying genes that are differentially expressed between samples.
3/ Next, I recommend #edgeR. Like DESeq2, edgeR is a package for differential gene expression analysis of RNA-seq data. It's particularly useful for smaller sample sizes and can detect differential expression with greater sensitivity.
Read 10 tweets
1/ If you're new to #bioinformatics and looking to start learning, there are a ton of great resources out there to help you get started. Here are some sites, #Github repos, and books to check out:
2/ First up, the website Bioinformatics.org has a ton of resources for learning bioinformatics, including tutorials, forums, and #software tools. Check it out here: bioinformatics.org
3/ Next, the Github repository Awesome Bioinformatics is a great place to find curated lists of resources, including #tutorials, #courses, and #software tools. Check it out here: github.com/danielecook/Aw…
Read 9 tweets
1/ If you're interested in learning #bioinformatics as a #novice, you're in the right place! Bioinformatics is a field that combines biology, computer science, and statistics to analyse large-scale biological data. Here are some resources to get you started:
2/ First, you'll need to learn some basic biology. #KhanAcademy has a comprehensive series of videos on biology that can help you understand the fundamentals. Check it out here: khanacademy.org/science/biology
3/ Next, you'll need to learn some basic #programming skills. #Codecademy has a great course on #Python, which is a popular programming language in bioinformatics. Check it out here: codecademy.com/learn/learn-py…
Read 7 tweets
I did a deep dive on the different workflow management (WFM) tools for #Bioinformatics Data Analysis a few years ago, and since then there have been a few extra entrants in this segment, still mostly concentrated in serving the Next-Generation Sequencing field.
A few years ago, two communities dominated the open-source WFM ecosystem, Nextflow and Snakemake, and two platforms dominated the commercial offerings, DNAnexus and Illumina BaseSpace.
Since then, Seqera Labs, a company started by the founders of Nextflow, has begun offering enterprise support for Nextflow workflows in the cloud. They offer the extra level of support that some organizations require to run Nextflow on their data analysis setups.
Read 7 tweets
1/ Bioinformatics is an essential part of modern biology, and R is a powerful programming language that has become the standard tool for bioinformatics analysis. #rstats #bioinformatics #datascience
2/ R provides an extensive collection of packages for bioinformatics analysis, including tools for gene expression analysis, sequencing data analysis, and network analysis. #rstats #bioinformatics
3/ Bioconductor is an open-source software project that provides tools for the analysis and comprehension of genomic data. It contains more than 1,800 packages for bioinformatics analysis. #rstats #bioinformatics
Read 7 tweets
1/6: Venn diagrams are commonly used in bioinformatics to visualize the overlap of different sets of genes or proteins. There are several R packages available for creating these diagrams, including VennDiagram, ggvenn, and ggVennDiagram. #rstats #datascience #bioinformatics
2/6: VennDiagram is a widely used package for creating classic Venn diagrams with up to five sets. It offers a range of options for customizing the appearance of the diagram, including font size, color, and label placement. #rstats #bioinformatics
3/6: One of the advantages of VennDiagram is the ability to easily incorporate statistical analyses. For example, you can calculate the significance of the overlap between different sets of genes or proteins and display this information on the diagram. #rstats #bioinformatics
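To make the overlap statistic concrete, here is a minimal sketch of one common way to test whether two gene sets overlap more than chance would predict: a one-sided hypergeometric test. It is written in plain Python rather than R and is not the VennDiagram package's own code; the function name `overlap_pvalue`, the gene symbols, and the universe size of 20 are illustrative assumptions.

```python
from math import comb


def overlap_pvalue(universe_size, size_a, size_b, overlap):
    """One-sided hypergeometric p-value: P(X >= overlap).

    Models set B as size_b genes drawn without replacement from a
    universe of universe_size genes, size_a of which belong to set A.
    """
    max_overlap = min(size_a, size_b)
    total = comb(universe_size, size_b)
    # Sum the probabilities of seeing `overlap` or more shared genes.
    tail = sum(
        comb(size_a, k) * comb(universe_size - size_a, size_b - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical toy example: two small gene sets in a 20-gene universe.
set_a = {"TP53", "BRCA1", "MYC", "EGFR"}
set_b = {"TP53", "MYC", "KRAS"}
shared = set_a & set_b

p = overlap_pvalue(
    universe_size=20,
    size_a=len(set_a),
    size_b=len(set_b),
    overlap=len(shared),
)
print(f"{len(shared)} shared genes, p = {p:.4f}")
```

The result depends heavily on the universe size, so the background should be the set of genes that could actually have been detected in the experiment, not an arbitrary number.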
Read 6 tweets
1/ In 2021, DeepMind made headlines when it announced that it had developed an algorithm called AlphaFold that could predict the 3D structure of proteins with remarkable accuracy. Here's what you need to know about this groundbreaking technology. #bioinformatics #AlphaFold #AI
2/ Proteins are essential building blocks of life, and their structure is critical to understanding how they function. Determining the structure of a protein can be a long and complex process, but AlphaFold is changing that.
3/ AlphaFold uses deep neural networks to predict the 3D structure of a protein based on its amino acid sequence. By training on a vast database of known protein structures, AlphaFold can accurately predict the structure of a protein in a matter of days, rather than years.
Read 7 tweets
Today, I want to share 3 Bioconductor/R packages that are useful for chemoinformatics - the field that applies computational methods to solve problems in drug discovery, molecular design, and other areas. Let's get started! #rstats #datascience #bioinformatics
1/ The first package on our list is "ChemmineR". With ChemmineR, you can manipulate, visualize, and analyze large chemical datasets. It includes functions for clustering compounds, predicting properties, and more. Check it out at bioconductor.org/packages/relea…
2/ "ChemmineOB" is a package that extends ChemmineR with tools for working with Open Babel, software that converts chemical data between different formats. With ChemmineOB, you can perform tasks such as generating 3D coordinates, converting file formats, and more.
Read 4 tweets
Why are biologists adopting #julialang #sciml? Performance, metaprogramming, and the development of new abstractions are improving software tools for #computationalbiology #systemsbiology #bioinformatics. Check out this new paper in Nature Methods!

nature.com/articles/s4159…
In this we detail how #julialang's core compute model gives faster code, with a detailed calculation of the effects of the #python interpreter and kernel launching costs on simulation performance. It's pretty cool how one can pen and paper calculate the 100x expected difference.
Julia's ecosystem has a complete set of tools for mathematical modeling (#sysbio), #bioinformatics, #machinelearning, and #datascience which we contextualize in the field of biology.
Read 7 tweets
Disheartening how little regard there is for consent in the medical-health-data community. For example this new study looking into suicide "prediction". This is a #BigProblem #MentalHealth #BioInformatics #DigitalPhenotyping #Suicidology #DataEthics nature.com/articles/s4174…
2/ In this study, with MASSIVE personal health datasets provided by Kaiser Permanente @kpthrive, HealthPartners @_HealthPartners, and Henry Ford Health System @HenryFordHealth, the researchers applied models (including "neural" networks) to try to predict suicidal risk.
3/ Spoiler alert: it didn't really work (comment: maybe put resources on things we already know work, like human connection & actual care & support?). Sadly, this result will probably just cause more effort to "improve" modeling. This is about $: making it & cutting costs.
Read 18 tweets
Do you need to analyze Spatial Transcriptomics data, but are lost in the endless sea of methods?

Here's an explainer of the new @NatureComms paper benchmarking 18 spatial cellular deconvolution methods🧵🧵

nature.com/articles/s4146…
This thread is organized as follows:

1️⃣ Intro to Spatial Transcriptomics
2️⃣ Intro to Cellular Deconvolution
3️⃣ Methods benchmarked
4️⃣ Datasets used (real & simulated)
5️⃣ Performance assessment
6️⃣ Benchmarking results
7️⃣ Accuracy
8️⃣ Robustness
9️⃣ Usability
🔟 Guidelines
1️⃣ What is Spatial Transcriptomics & why is it important?

Spatial Transcriptomics (Method of the Year 2020) is a fast-evolving field.

It holds great potential to further our understanding of development & disease, by placing cells in their native spatial tissue context.
Read 25 tweets
There was little online material to learn bioinformatics 10 years ago when I started.

I curated ten resources to learn bioinformatics for FREE 🧵👇
1/ Data Analysis for the Life Sciences Series buff.ly/3Z7F1ha by Rafa at DFCI. You can find the courses on edX buff.ly/3mapP4m
2/ buff.ly/3SDZasD Applied Computational Genomics Course at UU
Read 12 tweets
I need to raise awareness about an important point in #scRNAseq data analysis, which, in my opinion, is not acknowledged enough:

‼️In practice, most cell type assignment methods will fail on totally novel cell types. Biological/expert curation is necessary!

Here's one example👇
Last year, together with @LabPolyak @harvardmed, we published a study in which we did something totally awesome: we experimentally showed how a TGFBR1 inhibitor drug 💊 prevents breast tumor initiation in two different rat models!

Here's a detailed thread on this paper:
As you can imagine, this is a big thing. Treating tumors is already hard, preventing them is even harder!

Obviously, the most burning question for us then became: what is the drug actually doing to prevent tumor initiation?

Or, what is different in treated vs. control cells?
Read 17 tweets
🚨New #SpatialTranscriptomics #Bioinformatics data resource out in @naturemethods.

SODB, a platform with >2,400 manually curated spatial experiments from >25 spatial omics technologies & interactive analytical modules.

This🧵will walk you through all the features of SODB [1/33]
First, some background.

Spatial technologies complement classical genomics by also providing information about spatial context & tissue organization in:

- embryogenesis
- disease development
- normal tissue homeostasis

The field has exploded 🔥 in the past 2 years. [2/33]
But, data from different studies is stored in different configurations/repositories, such as:

- GEO
- Zenodo
- figshare
- SingleCellPortal
- IONPath for MIBI
- 10XGenomics website

This makes data sharing & re-analysis challenging.

Databases exist, but have limitations. [3/33]
Read 33 tweets
1/ My favourite gene has 15 transcripts, which one should I use for further analysis? To report the position of variants? To design primers for? 🤯

This #tweetorial will show you how to filter and prioritise transcripts… 🧵

#genomics #bioinformatics #Ensembltraining 🧬
2/ To start, look for the #canonical tag in the flags column of the transcript table. The canonical transcript is based on conservation, expression, concordance with @appris_cnio and @uniprot, length, clinically important variants and completeness.
3/ Many Ensembl #canonical transcripts will also be the #MANESelect, which is our collaboration with @NCBI. These transcripts match perfectly with RefSeq transcripts, so are the best to report variant location.
Read 8 tweets
I want to talk today about a methodological issue in #genomics research that has been around a long time but is still a major problem.
The reason is that today I reviewed another manuscript that has this exact problem.
First some background.
In genomics research we often profile how genes are switched on and off in disease and development, and in these profiling tests we identify dozens to thousands of genes that could play a role in those processes.
Gene names don’t tell us about their function. We could dig into the literature on them, but the lists are so big it takes too long. So we often use tools to summarise whether genes belonging to certain functional groups are over-represented.
Read 24 tweets
Find a great team for junior #Bioinformatics folks. A thread…

1/N
0. Why the right team?

Your lifelong work habits form in your first 1-2 roles. They’re so hard to change (and it's hard to find folks who will invest in that change). Please find the right team to set yourself up for lifelong work success. Here’s what I’d look for going back in time.
1. No more than five people.

You need very focused mentorship, and a large team won’t do that. You need to see a variety of approaches to solve problems, effectively organize your time, and execute on projects. Soft+hard skills. Your team are your wingmen.
Read 9 tweets
The science of #immunotherapy can cure a patient's otherwise incurable cancer.

But sometimes immunotherapy fails completely.

Shockingly, we hardly know why.

A meta-analysis of #Genomics & #Transcriptomics in >1,000 immunotherapy-treated patients aims to better understand why🧵
This 2021 @CellCellPress paper is one of the best #DataScience #Bioinformatics resources out there for understanding the genetic determinants of response to immune checkpoint inhibitors (ICIs).

cell.com/cell/fulltext/…
Some context:

PD-1 & PD-L1 inhibitors are examples of ICIs.

ICI is a type of immunotherapy that un-blocks the immune system & allows it to mount attacks🤺

It does so by inhibiting checkpoints (such as PD-1 & PD-L1): proteins that keep the immune system from attacking the body's own cells.
Read 28 tweets
Learning R is an essential step for practicing Bioinformatics.
Here are 10 resources that will help you with R... 👇🧵

#Rstats #Bioinformatics #Biology #programming
(1). R for Data Science: r4ds.had.co.nz
Covers various topics for data analysis, visualization and programming with R
(2). An Introduction to R: cran.r-project.org/doc/manuals/R-…

An R manual from CRAN that covers basic and advanced R topics
Read 12 tweets
1/4 "Using #PEPMatch, a newly developed #bioinformatics package which predicts #peptide similarity within specific #amino #acid mismatching parameters consistent with published #MHC binding capacity,..."

nature.com/articles/s4159…
2/4 "... we discovered that #nucleocapsid #protein shares significant overlap with 22 #multiple #sclerosis (#MS)-associated proteins, including #myelin #proteolipid #protein (#PLP)."

#MultipleSclerosis
3/4 "Further computational evaluation demonstrated that this #overlap may have #critical #implications for #Tcell responses in #multiple #sclerosis (#MS) patients and is likely unique to #SARSCoV2 among the major #human #coronaviruses."
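The idea quoted in this thread, finding peptides that match within a set number of amino acid mismatches, can be sketched with a naive sliding-window search. This is only an illustration of the concept and makes no claim about PEPMatch's actual API or algorithm; the function names and both sequences below are made up for the example.

```python
def hamming(a, b):
    """Count positions where two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def find_near_matches(query, protein, max_mismatches):
    """Return (start, window, mismatches) for each window of the protein
    that differs from the query peptide in at most max_mismatches residues."""
    k = len(query)
    hits = []
    for start in range(len(protein) - k + 1):
        window = protein[start:start + k]
        distance = hamming(query, window)
        if distance <= max_mismatches:
            hits.append((start, window, distance))
    return hits


# Hypothetical toy sequences, not real viral or human proteins.
query_peptide = "ACDEF"
target_protein = "MMACDEFKKACDQFLL"

for start, window, distance in find_near_matches(query_peptide, target_protein, 1):
    print(start, window, distance)
```

A scan like this costs time proportional to protein length times peptide length per query, so proteome-scale searches would normally rely on indexing instead.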
Read 4 tweets
Bench to bedside series: Lung COPD part 1/3
Respiratory histology (via @drawittoknowit)
Health & COPD Lung @TheLancet
#4KMedEd #meded #foamed #medtwitter #MedEd #MedTwitter #Pulmtwitter #scRNAseq #Bioinformatics
Bench to bedside series: Lung COPD part 3/3
#scRNAseq paper: Human distal airways contain a multipotent secretory cell that can regenerate alveoli
1. RASCs (new cell-type) + #stemcell properties in distal airways
2. faulty RASC-to-AT2 transformation in COPD
#Bioinformatics #MedEd
Read 4 tweets
#Bioinformatics is a field that involves using computers and other technological tools to analyze and interpret biological data, such as DNA sequences or protein structures. It is an interdisciplinary field that combines biology, computer science, and information technology.
Some basic concepts in bioinformatics include:
DNA sequences: The sequence of nucleotides (A, C, G, and T) in DNA determines an organism's genetic information. Bioinformatics tools are used to analyze and compare sequences to understand how they differ between organisms and how they can be used to study biological processes.
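As a small illustration of what analyzing and comparing sequences can mean in practice, the sketch below counts bases, computes GC content, and lists where two already-aligned sequences differ. The function names and the two short sequences are hypothetical, and real comparisons usually need a proper alignment step first.

```python
from collections import Counter


def base_composition(seq):
    """Count how often each nucleotide occurs in the sequence."""
    return Counter(seq.upper())


def gc_content(seq):
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def differing_positions(seq_a, seq_b):
    """List (index, base_a, base_b) wherever two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = zip(seq_a.upper(), seq_b.upper())
    return [(i, a, b) for i, (a, b) in enumerate(pairs) if a != b]


# Hypothetical toy sequences for illustration only.
reference = "ATGCGTAC"
variant = "ATGAGTCC"

print(base_composition(reference))
print(gc_content(reference))
print(differing_positions(reference, variant))
```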
Read 8 tweets
Healthy Lung vs. Lung with Chronic Obstructive Pulmonary Disease (COPD)
h/t @PatologCritica
#4KMedEd #meded #foamed #medtwitter #MedEd #MedTwitter #Pulmtwitter #lung #COPD #INNOMed
Read 5 tweets
