Examples from our Protein Common Assembly Db now on @biorxiv_bioinfo with @QifangXu. Viable assemblies for all X-ray structures (EPPIC, @josemduarte), clustered w/ cryoEM, NMR assemblies by Pfam architectures + stoichiometry + symmetry + interfaces. @buildmodels @PDBeurope @pdbj_en
We count "crystal forms (CFs)" - unique crystal unit cells, or cryo-EM or NMR experiments - for each cluster, and the number of UniProt sequences across each cluster. More experimental data --> more support for relevance of the biological assembly. Also #PDB's bioassemblies, PISA, EPPIC.
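A minimal sketch of the counting described above, assuming each PDB entry has already been assigned a cluster, a crystal-form label, and a UniProt ID. The record layout and every ID below are made-up placeholders for illustration, not the database's actual schema or data.

```python
from collections import defaultdict

# Toy records: all field names and values are hypothetical placeholders.
records = [
    {"entry": "E1", "crystal_form": "CF1", "uniprot": "UNP_A", "cluster": 1},
    {"entry": "E2", "crystal_form": "CF1", "uniprot": "UNP_A", "cluster": 1},
    {"entry": "E3", "crystal_form": "CF2", "uniprot": "UNP_B", "cluster": 1},
    {"entry": "E4", "crystal_form": "CF3", "uniprot": "UNP_A", "cluster": 2},
]

def summarize(records):
    """Count distinct crystal forms, UniProt IDs and entries per cluster."""
    cfs = defaultdict(set)
    unps = defaultdict(set)
    entries = defaultdict(set)
    for r in records:
        cfs[r["cluster"]].add(r["crystal_form"])
        unps[r["cluster"]].add(r["uniprot"])
        entries[r["cluster"]].add(r["entry"])
    return {
        c: {"CFs": len(cfs[c]), "UNPs": len(unps[c]), "entries": len(entries[c])}
        for c in cfs
    }

summary = summarize(records)
print(summary)
```

Entries sharing a crystal form count once toward CFs, which is why the CF count, not the entry count, is the measure of independent experimental support.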
PCNA example from 1st tweet. Largest cluster (#8) of Pfam arch (PCNA_N)_(PCNA_C) has 28 crystal forms (CFs) out of 36 CFs for this Pfam architecture in whole PDB (78%). 19 UNPs of 23 in whole PDB (83%). 55 entries of 64 in whole PDB (86%). dunbrack2.fccc.edu/ProtCAD/Result…
Of 31 CFs (#CFs_UNParch) for these Uniprots in whole PDB, 28 (#CFs_UNPClus) have this assembly (0.9 or 90%). Of 55 entries in cluster, 47 have this assembly in PDB's biol. assemblies (85%). EPPIC and PISA have 32 (58%). Click [8] button in 1st column -> Pymol script & cif files.
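The percentages quoted for the PCNA cluster can be re-derived from the counts in these two tweets. This is only a worked check of the arithmetic; the dictionary labels and the `percent` helper are my own names, not fields from the website.

```python
# Counts quoted in the thread for PCNA cluster #8: (part, whole).
counts = {
    "CFs in cluster / CFs for Pfam architecture": (28, 36),
    "UNPs in cluster / UNPs for Pfam architecture": (19, 23),
    "entries in cluster / entries for Pfam architecture": (55, 64),
    "CFs with assembly / CFs for these UniProts": (28, 31),
    "entries with assembly in PDB biol. assemblies": (47, 55),
    "entries with assembly in EPPIC and PISA": (32, 55),
}

def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)

percentages = {label: percent(p, w) for label, (p, w) in counts.items()}
for label, value in percentages.items():
    print(f"{label}: {value}%")
```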
This well-known assembly is present in some crystals but actually missing from PDB biological assemblies. Example.
Here are other examples. MHC Class II alpha/beta chains with peptide. Find by searching for Pfam (MHC_II_alpha) or Uniprot (DRA_HUMAN) or PDB (1DLH). 53 CFs, 30 UNPs, 69 entries, 67 PDBBA. dunbrack2.fccc.edu/ProtCAD/Result…
14-3-3 proteins -- homodimer with peptides. 56 CFs, 11 UNPs, 297 entries. We cluster Pfam architectures with and without peptides separately. dunbrack2.fccc.edu/ProtCAD/Result…
CD1 proteins (Pfam MHC_I_3) with T-cell receptors. Two clusters. Cluster 1 (alpha-yellow, beta-cyan TCRs), 11 CFs, 65 entries, all CD1D_MOUSE or HUMAN. Cluster 2 (alpha-orange, beta-blue TCRs), 8 CFs, CD1A, CD1B, CD1C, CD1D. dunbrack2.fccc.edu/ProtCAD/Result…
FGFR and FGF heterotetrameric complex (stoichiometry A2B2). Includes FGFR1 and FGFR2; FGF1, FGF2, FGF10. 5 CFs, only 2 are annotated in PDB as tetramer. Rest as heterodimer. dunbrack2.fccc.edu/ProtCAD/Result…
We include deposited biological assemblies (from authors and PISA) in our clustering. Here are actin filament assemblies (A3, A4, A5, A8) combined. 27 CFs, 30 PDB entries, 6 UNPs. Filaments are challenging to differentiate from crystallization-induced chains of monomers.
Hormone receptor homodimers and heterodimers (estrogen receptor, PPAR-gamma, RXR heterodimers, etc.). 79 CFs, 39 UNPs, 347 entries. Dimer is in PDB biological assemblies for only 226 of these entries (65%). Not sure why so many are missing.
(Neur_chain)_(Neur_membrane) Pfam architecture. Pentamers of ligand-gated ion channels. 39 CFs, 21 UNPs, 144 PDB entries.
GPCR Galpha complexes with bound antibodies in gray (most such complexes). 31 CFs (in two different clusters with V_set (nanobodies) and (V_set)_(V_set) (scFvs)).
The U1 snRNP, including a complex of 7 human proteins: SNRPB, SNRPD1, SNRPD2, SNRPD3, SNRPE, SNRPF and SNRPG. 18 crystal forms, 26 entries, 10 species. 19 homoheptamers (mostly Archaea). 7 heteroheptamers (S. cerevisiae and S. pombe; human).
In the works for a LONG while - a new clustering of antibody CDR structures to update North et al (2011)/PyIgClassify. Clusters now have high electron density support & at least 10 sequences. DBSCAN helped to remove noise points. @biorxiv_bioinfo @build_models @PDBeurope @PDBj_en
The website has been updated w/ the new data. dunbrack2.fccc.edu/PyIgClassify2. Downloads include mmCIF coordinates, data per domain & per cdr, & fasta sequences. Paper is work by Simon Kelow, Bulat Faezov (@Bulat41292666), website by @QifangXu. Advice from Jared Adolf-Bryfogle @jadolfbr.
We show that some old clusters are mostly poor fits to the electron density. Clusters on left side have stable electron density. Those on the right are misfit (and retired from our nomenclature). Red arrows point to peptide flips (dPSI of 180 at residue N, dPHI of 180 at residue N+1).
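A rough sketch of how the peptide-flip signature just described (psi changing by about 180 at residue N, phi changing by about 180 at residue N+1) could be detected between two conformations of the same loop. The function names, the 45-degree tolerance and the toy angles are assumptions for illustration, not the method used in the paper.

```python
def angle_diff(a, b):
    """Signed difference a - b in degrees, wrapped to [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0

def find_peptide_flips(conf_a, conf_b, tol=45.0):
    """Return indices N where psi(N) and phi(N+1) both differ by ~180 degrees.

    conf_a, conf_b: equal-length lists of (phi, psi) tuples in degrees.
    tol: assumed tolerance around 180 degrees (hypothetical default).
    """
    flips = []
    for n in range(len(conf_a) - 1):
        dpsi = abs(angle_diff(conf_a[n][1], conf_b[n][1]))
        dphi = abs(angle_diff(conf_a[n + 1][0], conf_b[n + 1][0]))
        if dpsi >= 180.0 - tol and dphi >= 180.0 - tol:
            flips.append(n)
    return flips

# Toy (phi, psi) values; residue index 1 carries the flip.
conf_a = [(-60.0, -45.0), (-60.0, 140.0), (-120.0, 130.0)]
conf_b = [(-60.0, -45.0), (-60.0, -40.0), (60.0, 130.0)]
flips = find_peptide_flips(conf_a, conf_b)
print(flips)
```

Wrapping the difference matters because dihedrals are periodic: 170 and -170 are only 20 degrees apart, not 340.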
I can't figure out what the hell this image is supposed to be. That's not what RNA or DNA look like. Scientific American (@sciam) should promote scientific visual literacy not pseudoscientific stock photos. @WrongHandedDNA will love this.
@WrongHandedDNA is one of my favorite twitter accounts. I post to it when I am in this kind of mood:
I checked some more getty images. These two images are screenshots of the first page for "DNA helix". 17 left-handed and 13 right-handed and many indeterminate. Some not-handed - one strand is always in front. I like the one with >30 bp per turn.
@deray OK, give me a few minutes to write a thread. But I'll start with - "it's like the difference between squirrels and raccoons." Quite different species. Different branches of vertebrates. Ultimately related (like humans and mice) but distantly.
@deray First, viruses are particles that are much smaller than cells (like human skin cells, nerve cells). They have some kind of shell (of proteins or lipids - fat-like molecules) and inside they have a chromosome (like our 46) - information to build new copies of the virus.
@deray On the outside of the shell, on many viruses there are some proteins that stick out of the surface. In coronavirus, this is called the spike protein. Influenza A has "hemagglutinin" protein. These are totally different, unrelated proteins between flu and coronavirus.