There's (quite rightly) a lot of raving about the new AlphaFold DB resource today. Thought I'd weigh in with my own example. DNA-dependent protein kinase catalytic subunit (uniprot.org/uniprot/P78527) is an enormous beast - a single chain of 4,128 residues! (1/10)
There are now 19 structures of this guy in the wwPDB, all post-resolution-revolution cryoEM structures. The first (2017) was reconstructed to a character-building 4.3 Å. I'm told the model building was extremely challenging, ... (2/10)
... and a glance at the overall structure shows why. The thing is almost entirely composed of short helix after short helix after short helix... without clear sidechain information, getting out of step with the map would be really, really easy. (3/10)
Anyway, about 6 months ago I happened to be idly looking at one of the more recent higher-resolution (mid-3 Å) structures in ISOLDE, and picked out (and corrected) a few out-of-register helices. I spotted a few other issues and meant to go through the whole thing, ... (4/10)
... but never found the few days of time I expected it would take. Anyway, the AlphaFold DB announcement brought it back to mind. To my initial disappointment P78527 didn't come up in a search, but then I read the fine print: (5/10)
"... we have attempted to fold most sequences in the UniProt reference proteome that are between 16 and 2700 amino acids long and contain only standard amino acids. For human proteins only, longer sequences are available split into fragments in the bulk download." (6/10)
So I downloaded the full database, and sure enough there are no fewer than 14 overlapping models of 1400 residues each, covering the entire chain (cryo-EM model in CA trace, AlphaFold models as ribbons). (7/10)
Take a subset of the AlphaFold models that together provide complete coverage, use these as templates for ISOLDE's adaptive distance and torsion restraints, and rebuilding the cryo-EM model becomes almost trivial. (8/10)
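For anyone wanting to script the "subset with complete coverage" step, here's a rough sketch of one way to do it: a greedy interval cover over the fragment residue ranges. To be clear, this isn't how the selection was done for this thread, and the fragment boundaries below are made up (14 evenly spaced 1400-residue windows over a 4,128-residue chain). The real residue ranges would need to be read from the downloaded fragment files. The names `minimal_cover` and `fragments` are purely illustrative.

```python
def minimal_cover(fragments, chain_length):
    """Greedy interval cover: choose few fragments spanning residues 1..chain_length.

    fragments is a list of (first_residue, last_residue) tuples, 1-based, inclusive.
    """
    ordered = sorted(fragments)
    chosen = []
    covered_to = 0  # last residue covered so far
    i = 0
    while covered_to < chain_length:
        best = None
        # Among fragments starting at or before the next uncovered residue,
        # keep the one that reaches furthest along the chain.
        while i < len(ordered) and ordered[i][0] <= covered_to + 1:
            if best is None or ordered[i][1] > best[1]:
                best = ordered[i]
            i += 1
        if best is None or best[1] <= covered_to:
            raise ValueError(f"no fragment covers residue {covered_to + 1}")
        chosen.append(best)
        covered_to = best[1]
    return chosen


# HYPOTHETICAL fragment layout, for illustration only: 14 windows of 1400
# residues, evenly spaced so the last one ends at residue 4128.
chain_length, window, n_fragments = 4128, 1400, 14
fragments = []
for k in range(n_fragments):
    start = 1 + round(k * (chain_length - window) / (n_fragments - 1))
    fragments.append((start, start + window - 1))

chosen = minimal_cover(fragments, chain_length)
for first, last in chosen:
    print(f"use fragment covering residues {first}-{last}")
```

In practice you'd probably want to keep some overlap between neighbouring templates rather than the bare minimum, so treat the output as a starting point.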
So far, for every site of disagreement between cryo-EM and AlphaFold models (with the exception of long-range interdomain contacts), it's been the AlphaFold one that was closer to correct. For example, this little site here. (9/10)
Upshot: while AlphaFold clearly isn't a *replacement* for experimental structures by any stretch, it's already very clear that it's going to make the task of *building* experimental structures both much easier and much less error prone. Welcome to the future! (fin)

