We started from a fundamental question that Burkhard Rost addressed decades ago!
How large is the sequence attractor of a given protein fold?
Which positions can be varied in sequence so that the fold does not care and which ones are not changeable?
(+)
Using reverse-folding algorithms (ProteinMPNN, Caliby) + structure prediction + local frustration analysis,
we redesigned sequences for fixed backbones.
We used FrustraEvo to analyse:
Which positions are free to vary and which are energetically constrained?
(+)
We analysed this in alpha-globins, whose protein-protein interaction sites are known to be highly frustrated. There, sequences are remodelled to largely decrease frustration!
That makes sense, as ProteinMPNN maximises stability, which conflicts with function.
(+)
Our next family to analyse was the beta-lactamases, whose catalytic sites are also known to be highly frustrated.
To our surprise, reverse-folded sequences were not remodelled. The native, highly frustrated identities were always recovered in the designs. This made no sense!
(+)
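A minimal sketch of how "native identities are always recovered" can be quantified: for each position, count the fraction of designed sequences that keep the native residue. The function name and the toy sequences are hypothetical and are not taken from the study; the sketch assumes fixed-backbone designs, so every design has the same length as the native sequence.

```python
def native_recovery(native: str, designs: list[str]) -> list[float]:
    """Return, per position, the fraction of designs that keep the native residue."""
    if not designs:
        raise ValueError("need at least one designed sequence")
    if any(len(d) != len(native) for d in designs):
        raise ValueError("fixed-backbone designs must match the native length")
    return [
        sum(d[i] == aa for d in designs) / len(designs)
        for i, aa in enumerate(native)
    ]


# Toy data for illustration only (not from the study).
native = "MKSAV"
designs = ["MKTAV", "MRSAV", "MKSLV", "MKSAV"]

recovery = native_recovery(native, designs)
# 0-based positions where every design recovers the native residue.
always_recovered = [i for i, r in enumerate(recovery) if r == 1.0]
print(recovery, always_recovered)
```

Positions with recovery close to 1.0 across many designs are the ones worth cross-checking against the frustration pattern.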
Why do reverse-folded sequences maintain energetic conflicts if they maximise stability and the sequence-to-structure fit? Is there a bias in the methods? Has ProteinMPNN memorised catalytic sites, so that it just imprints their identities when it sees something similar? (+)
We mutated all catalytic residues to valine in silico to explicitly minimise frustration. We also used experimentally mutated structures. Reverse-folded sequences from both kinds of input still recovered the native identities. Is there a memory in ProteinMPNN?
We pushed ProteinMPNN to its limits. We used the maximum temperature to maximise sequence variability. We also retrained it after deleting all annotated and predicted enzymes from the training set!
Some signal was gone but some was not! Is this still evolutionary information leakage?
(+)
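To illustrate why a higher sampling temperature increases sequence variability, here is a generic temperature-scaled softmax over residue scores. It is a sketch of the general technique, not ProteinMPNN's own code; the logits are made-up values for four hypothetical residue types.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn scores into probabilities; a higher temperature flattens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; higher means more varied sampling."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Hypothetical scores for four residue types at one position.
logits = [4.0, 1.0, 0.5, 0.0]
low = softmax_with_temperature(logits, 0.1)
high = softmax_with_temperature(logits, 2.0)
print(entropy(low), entropy(high))
```

At low temperature nearly all probability sits on the top-scoring residue; at high temperature the alternatives get sampled too. That is why a residue that is still recovered at maximum temperature is a strong signal.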
We then decided to use de novo designed folds, as they are designed to maximise foldability and stability and have no (known) function. Top7 is very interesting.
Both the original design and the ProteinMPNN designs (Baker vs Baker metaverse) contain highly frustrated residues (+)
We conclude that some frustration cannot be erased from the sequences, even when reverse folding to maximise the sequence-structure fit, and that it behaves as a spandrel. This frustration is not the consequence of adaptive evolution but a consequence of the fold architecture.
(+)
In his posthumous article, Dan Tawfik & colleagues propose that ancestral enzymatic function could have been seeded by unstable hotspots in proteins that could bind small molecules such as phosphate with low affinity by separating them from the environment... (+)
Such sites could later have evolved to become more complex and give rise to modern catalytic sites. We know that frustration can represent those ancestral hotspots. A spandrel that has no reason other than to facilitate a complex structure could be one of those ancestral hotspots (+)
This idea is also compatible with the theory of platonic folds. Sequences don't really code structures but fall into attractors defined by folds. Folds are the consequence of physical and biochemical rules given the available amino acids and how they interact with solvent (+)
Folds represent basins in sequence space & sequences diffuse through that space until they fall into one of these folds. We know that sequences are not evenly distributed across structure space, something shown by Christine Orengo years ago. Are superfolds super platonic attractors? (+)
If you want a complex architecture, it makes sense that you cannot have a completely frustration-free structure. You need "hinge" residues so you can adopt such folds. Maybe these are our frustration spandrels, which can later be exapted for function, as Gould proposed (+)
Our study involved 1000s of predictions & calculations but it still only covers a few folds as case studies. Is this something more general? We will study all known enzyme families to complete this idea, but we want to present this initial work as potential evidence (+)
This has been tremendous work by Miriam and collaborators. It is my first paper as a group leader so it scares me a bit, but it was a great adventure! Let's see what reviewers say! Comments welcome! (+)
We have recently lost 2 great scientists in our field. Amos Bairoch & Peer Bork who not only have inspired my work since my early years but also built theory, tools & databases without which this work would not have been possible. This work is dedicated to their memory. RIP❤️.
• • •
🚨We've updated our preprint "Frustration, Dynamics, and Catalysis" ✨
A short 🧵1/8: We expanded the conceptual connection between energy landscape theory and catalysis, added new figures, and clarified how local frustration shapes enzyme function. arxiv.org/pdf/2505.00600
2/8 Proteins aren't perfectly optimized machines, they’re frustrated. But that’s a good thing.
This review explores how local energetic frustration enables flexibility, catalysis, and evolvability, rather than locking proteins into rigid forms.
3/8 🧠 What is local frustration?
It’s when some interactions in a folded structure are energetically "suboptimal", leading to competing conformations (Conformational Substates or CS).
These CS enable functionally important motions (FIMs) needed for catalysis.
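As a rough sketch of what "energetically suboptimal" means in practice, a local frustration index can be written as a Z-score that compares the native energy of an interaction with the energies of decoy alternatives. The sign convention (higher means the native is more favourable than the decoys) and the cutoffs 0.78 and -1 are the values commonly quoted for frustration analysis and are treated as assumptions here; the energies are made-up numbers.

```python
import statistics


def frustration_index(native_energy, decoy_energies):
    """Z-score of the native energy against decoys; positive means the native is favourable."""
    mean = statistics.fmean(decoy_energies)
    sd = statistics.pstdev(decoy_energies)
    if sd == 0:
        raise ValueError("decoy energies must not all be identical")
    return (mean - native_energy) / sd


def classify(index, minimal_cutoff=0.78, high_cutoff=-1.0):
    """Label an interaction using assumed cutoffs (see the lead-in)."""
    if index >= minimal_cutoff:
        return "minimally frustrated"
    if index <= high_cutoff:
        return "highly frustrated"
    return "neutral"


# Made-up decoy energies (arbitrary units) for illustration.
decoys = [-1.0, 0.0, 1.0, 2.0, 3.0]
labels = [classify(frustration_index(e, decoys)) for e in (-2.0, 1.0, 3.5)]
print(labels)
```

An interaction whose native energy is worse than most decoys ends up "highly frustrated": most alternatives would be more favourable at that site.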
[Thread 🧵] Only now do I have time to tell you about the great Birds of a Feather session that we organised last Tuesday from the @Bioinfo4women project and that I chaired during #ECCB2022 @ECCBinfo! A nice discussion followed! (+)
We started our discussion with @mjrementeria, who explained why it is important to increase diversity within research groups and gave recommendations for doing it at the institutional level (+)
I followed by explaining what happens if we don't make our biases conscious. I showcased how, despite our best intentions, we ended up having only male keynotes at the 1st LA-SCS (@IscbLascs), which I chaired in 2014. Since then gender parity is a must in our events (+)
Each session was 1.5 hours long. During the 1st 1/2 hour we screened all posters and selected our favourite 5. Then we selected each reviewer's favourite poster + those posters selected by +1 reviewer. We went through the preselected posters and evaluated them in depth (+)
During the pre-screening we evaluated the poster design, the presenter's willingness to explain the work to the audience, and originality (+)
[Thread] [1/13] After 10 years of being part of the @iscbsc, within @iscb I am running as its "Board of Directors Representative". It would be an honor and a huge responsibility to represent our community in the BoDs! You can log in and see my candidate statement and vote!
[2/13] I became involved in the @iscbsc Student Council in 2011/2012 when we created @RSGArgentina of which I was its 1st president and subsequent advisor until very recently. @RSGArgentina was the only active RSG in Latin America at that point, since @RSGBrazil was inactive.
[3/13] One of my main objectives for more than a decade has been to strengthen & boost the development of our LatAm community. In 2014 I chaired the 1st Latin American Student Council Symposium, which helped us get organised and boost the LA crew! bmcbioinformatics.biomedcentral.com/articles/10.11…
#5SAJIB has come to an end and, with it, so has my membership of @RSGArgentina after 10 years of activities. It has been an incredible experience. Here is the video of the talk I gave on Wednesday about how we grew as a community (+)
We started the Argentine bioinformatics student group when I was 22 and in the final year of my bioinformatics degree. In 10 years I graduated, did a PhD and am now in my 2nd postdoc. When we started there was almost nothing; we built everything from scratch (+)
I was 1 of the first 7 graduates of an undergraduate bioinformatics degree in Argentina. We were all bewildered; we did not really know where we were heading. There was not much of a community at the Latin American level either. We wrote the @RSGArgentina proposal in 2011 and it was made official in 2012 (+)