Jinwoo Leem
May 19, 2022
Really excited to announce that AntiBERTa is now published in @Patterns_CP! Here we describe a transformer model that demonstrates understanding of antibody sequences 🧵 (1/6)

#machinelearning #antibodies #drugdiscovery
cell.com/patterns/fullt…
We pre-train a transformer model based on RoBERTa, exclusively on full-length antibody/B-cell receptor sequences, using the masked language modelling (MLM) objective. FYI, other similar transformers include BioPhi (@prihodad), AbLang (@HegelundOlsen) and AntiBERTy (@jeffruffolo) (2/6)
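For anyone unfamiliar with the MLM objective, here is a minimal sketch of BERT/RoBERTa-style masking applied to an antibody sequence. It is not the AntiBERTa code: the `mask_sequence` helper, the 15% masking rate, the `<mask>` token string and the example heavy-chain fragment are all assumptions made for illustration.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
MASK = "<mask>"  # assumed mask token string


def mask_sequence(seq, mask_prob=0.15, rng=None):
    """Return (tokens, labels) for one masked-language-modelling example.

    Each residue is selected with probability mask_prob. A selected residue
    is replaced by MASK 80% of the time, by a random amino acid 10% of the
    time, and left unchanged 10% of the time. labels holds the original
    residue at selected positions and None everywhere else, so the loss is
    only computed where the model has to make a prediction.
    """
    if rng is None:
        rng = random.Random()
    tokens = list(seq)
    labels = [None] * len(tokens)
    for i, residue in enumerate(seq):
        if rng.random() >= mask_prob:
            continue  # position not selected for prediction
        labels[i] = residue
        roll = rng.random()
        if roll < 0.8:
            tokens[i] = MASK
        elif roll < 0.9:
            tokens[i] = rng.choice(AMINO_ACIDS)
        # otherwise keep the original residue in place
    return tokens, labels


# Hypothetical heavy-chain fragment, used only as example input.
example = "EVQLVESGGGLVQPGGSLRLSCAAS"
tokens, labels = mask_sequence(example, rng=random.Random(0))
print(" ".join(tokens))
```

The model is then trained to recover the residues stored in `labels` from the corrupted `tokens`, which is how it can learn sequence features without V gene, mutation or provenance labels.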
We show that the embeddings pick up nuanced features of BCR/antibody sequences. For example, V gene usage, mutational load, and remarkably, B cell provenance. This is all done in a zero-shot setting, i.e. none of these labels were provided during pre-training. (3/6)
Transformers are powered by self-attention and AntiBERTa is no exception. We see that the self-attention maps correlate broadly with positions of contact. While not perfect, AntiBERTa does seem to understand some pairwise dependencies (4/6)
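One simple way to quantify "attention correlates with contacts" is to ask what fraction of the highest-attention residue pairs are real structural contacts. The sketch below is an assumption about how such a check could look, not the analysis from the paper: `top_attention_precision` is a hypothetical helper, and the 4x4 attention and contact matrices are toy values.

```python
def top_attention_precision(attention, contacts, k):
    """Fraction of the k highest-attention residue pairs that are contacts.

    attention: square list of lists, attention[i][j] = weight from i to j.
    contacts:  square list of lists of booleans (True = residues in contact).
    Only pairs with i < j are scored, and attention is symmetrised because
    the weight from i to j generally differs from the weight from j to i.
    """
    n = len(attention)
    if n < 2 or k < 1:
        raise ValueError("need at least two residues and k >= 1")
    pairs = []
    for i in range(n):
        for j in range(i + 1, n):
            score = 0.5 * (attention[i][j] + attention[j][i])
            pairs.append((score, i, j))
    pairs.sort(reverse=True)  # highest symmetrised attention first
    top = pairs[:k]
    hits = sum(1 for _, i, j in top if contacts[i][j])
    return hits / len(top)


# Toy example: 4 residues, made-up attention weights and contacts.
toy_attention = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.6, 0.2, 0.1],
    [0.1, 0.1, 0.2, 0.6],
    [0.3, 0.1, 0.5, 0.1],
]
toy_contacts = [
    [False, False, False, False],
    [False, False, True, False],
    [False, True, False, True],
    [False, False, True, False],
]
print(top_attention_precision(toy_attention, toy_contacts, k=2))
```

In practice the attention matrix would come from one head (or an average over heads) of the model, and the contact map from a solved structure with some distance cutoff; both choices affect the number you get.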
Finally, we fine-tune the model for paratope prediction and show that it can achieve SOTA performance. This helps us think about novel ways in which we can investigate convergence in repertoire datasets, such as using Paratyping (@EveRichardson20). (5/6)
All in all, a huge thanks to the @alchemabtx team, and I want to particularly thank our stellar head of tech @jakegalson, along with @lauramitch29, @jhrf and @all_your_bayes. (6/6)
Oh, and reviewers #1-3, thank you all! You made the manuscript better.


More from @ideasbyjin

Nov 23, 2022
Ran some experiments! #ImmuneBuilder is better than #ESMFold (@alexrives / @TomSercu), right? There are advantages of using the latter, and now a 🧵 on why getting carried away with low RMSDs is only a part of the story...(1/5)
#machinelearning #proteins
First, the metrics are RMSDs based on aligning the C/N/CA/CB atoms across the chain, then calculating the RMSD across a region, i.e. align every residue of the VH, then calculate the RMSD across CDRH3, or CDRH1, etc. This is on ~35 antibodies of the ImmuneBuilder test set (2/5)
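To make the metric concrete, here is a minimal sketch of "align on the whole chain, then score one region". It assumes NumPy, uses the Kabsch algorithm for the superposition (the tweet does not say which alignment method was used), and assumes the predicted and reference atoms are already matched one-to-one; `kabsch` and `region_rmsd` are hypothetical helper names.

```python
import numpy as np


def kabsch(mobile, target):
    """Rigid-body fit of mobile onto target (both arrays of shape (n, 3)).

    Returns a rotation matrix R and translation t such that
    mobile @ R.T + t is the least-squares superposition onto target.
    """
    mob_centre = mobile.mean(axis=0)
    tgt_centre = target.mean(axis=0)
    # Covariance between the centred coordinate sets.
    H = (mobile - mob_centre).T @ (target - tgt_centre)
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (a reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_centre - R @ mob_centre
    return R, t


def region_rmsd(pred, ref, region_mask):
    """Align on all atoms of the chain, then compute RMSD on one region.

    pred, ref:   (n_atoms, 3) coordinates of matched backbone atoms,
                 e.g. C/N/CA/CB for every residue of the VH.
    region_mask: boolean array of length n_atoms, True for atoms in the
                 region of interest (e.g. CDRH3).
    """
    R, t = kabsch(pred, ref)
    aligned = pred @ R.T + t
    diff = aligned[region_mask] - ref[region_mask]
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))
```

Because the fit is done on the whole chain, a loop that is internally well modelled but placed slightly wrong relative to the framework still gets penalised, which is different from aligning on the loop alone.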
ESMFold's CDRH3 accuracies are better than what I expected. Where it's let down is on the "canonical" CDRs. It would've been nice to compare the VH-VL orientations and talk about how ImmuneBuilder doesn't generate D-amino acids, etc. (3/5)
Nov 19, 2022
Pretty wild that there have been at least 4-5 antibody-specific structure prediction tools this year, all based on #deeplearning

1. DeepAb/IgFold (@jeffruffolo)
2. ABLooper/ImmuneBuilder (@brennanaba)
3. Equifold (@jaehyeon_lee_ml)
4. t-AbFold

What does this mean? 🧵(1/6)
First, it's pretty crazy we even have antibody-specific tools, since #AlphaFold2, #ESMFold, #OmegaFold, all do a decent job at antibody modelling. However, antibody-specific tools have -some- feature that's necessary (e.g. being MSA-free) (2/6)
The demand is likely due to interest from pharma & biotech, but we don't have anywhere near the same level of interest for other polymorphic proteins like TCRs and MHCs (🤔). Regardless, with such interest, I think an antibody-specific CASP should be resurrected! (3/6)
Jun 6, 2022
Exciting work from @KathyYWei1, @AmeyaHarmalkar & @proteinrosh on predicting scFv developability. TLDR: transformers and CNNs can potentially help prioritise mutation sites for enhancing stability (1/5) #antibodies #machinelearning #proteineng
Context: a single-chain Fv (scFv) is an antibody construct whose heavy and light chains are linked. It's not the conventional "Y" shape molecule, and is useful for engineering / phage display, etc. See @AlissaHummer's post blopig.com/blog/2021/07/a… (2/5)
Thermostability (measured by TS50, the temperature when scFv loses binding) is weakly predicted by 0-shot and fine-tuning via transformers (ESM-1v + ESM-1b). CNNs using sequence and structural (energy) convolutions perform better (?) [hard to tell, sorry!🙈] (3/5)
May 26, 2022
A "negative" result, but phenomenal thought piece from @naturalantibody / @antibodymap. TLDR predicting antibody-antigen interactions is pretty darn hard (1/5) #antibodies #machinelearning #alphafold

naturalantibody.com/use-case/deepm…
Predicting Ab-Ag interactions is a sub-problem of the protein-protein interaction problem. There are many facets to consider here, including but not limited to, identifying the correct antigen (let alone the correct epitope), the correct paratope, orientation, etc (2/5)
@antibodymap's team first show that for true Ab-Ag pairs (i.e. those where we know the Ab binds the antigen) and false Ab-Ag pairs (i.e. where an Ag was randomly assigned to an Ab), the pLDDT scores are incomparable, suggesting score-based discrimination is HARD. (3/5)
