Giovanni Pascarella
Research Scientist @ RIKEN IMS

Jul 25, 2022, 20 tweets

Beyond ecstatic to share that our paper is finally out @CellCellPress!
It's been a while since I last highlighted the main findings, so here's a refresher for new and old readers, with some new data added during the peer-review process (-18)
doi.org/10.1016/j.cell…

The background: roughly one base out of 3 in our genomes belongs to either an Alu or L1 repeat element. These are retrotransposons involved in a lot of cool things; here's an excellent primer if you want to know more about transposable elements (-17)
doi.org/10.1186/s13059…

However, our relationship with Alu and L1 is...well, complicated. For example, due to the high sequence identity of their millions of copies, they can "confuse" the homology-based repair mechanism that fixes DNA double-strand breaks (DSBs) and cause non-allelic homologous recombination (NAHR) (-16)

NAHR can be bad news for the genome because it generates inversions, deletions and duplications. Alu and L1 in fact have been frequently found at breakpoints of mutations in cancer and genetic disorders. Alu can also generate complex rearrangements during DNA replication (-15)

With 10-50 DSBs happening per cell in a day and millions of copies of Alu and L1 interspersed in the genome of each cell, the number of NAHR events has been hypothesized to be very high...however no study so far had delved into a global survey of somatic NAHR in the genome (-14)

We initially used capture + sequencing of Alu and L1 to generate repeat-enriched libraries from 10 neurotypical donors, for which we assayed bulk samples of kidney, liver and 3 brain cortical regions further divided in neuronal and glial fractions (-13)

To detect NAHR in short and long DNA reads we developed TE-reX, a bioinformatic tool designed to find NAHR of repeats from split reads. This was an excellent collaboration with the twitterless Martin Frith at @UTokyo_News_en (-12)
gitlab.com/mcfrith/te-rex

TE-reX identified millions of putative somatic NAHR events in our capture-seq libraries, and we validated >100 of these by PCR+Sanger using primers annealing on non-repeat regions flanking each recombined repeat (-11)

Overall we found more NAHR events in kidney and liver, while the brain was enriched with intra-chromosomal NAHR and NAHR of proximal repeats. In brain samples, NAHR was inversely correlated with chromosome length, reminiscent of what happens for meiotic recombination (-10)

NAHR of intra-chromosomal inverted repeats can only generate inversions, while NAHR of direct repeats causes deletions and duplications, with an expected deletion-to-duplication ratio of 2:1. This ratio is almost perfectly reflected in our data for NAHR of proximal direct repeats (-9)
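
To make the orientation logic concrete, here is a minimal Python sketch of how a split read joining two repeats could be sorted into inversion, deletion or duplication. It is an illustration only and not TE-reX's actual code: the function name `classify_nahr`, the `(chrom, start, strand)` tuple layout and the toy coordinates are all assumptions made for this example.

```python
from collections import Counter

def classify_nahr(first, second):
    """Classify a split read from its two aligned parts, given in read order.

    Each part is a hypothetical (chrom, start, strand) tuple.
    """
    chrom_a, start_a, strand_a = first
    chrom_b, start_b, strand_b = second
    if chrom_a != chrom_b:
        return "inter-chromosomal"
    if strand_a != strand_b:
        # The read switches strand at the junction: inverted repeats recombined.
        return "inversion"
    # Same strand: does the read jump ahead on the reference, or jump back?
    jumps_ahead = start_b > start_a
    if strand_a == "-":
        # A minus-strand read walks the reference from right to left.
        jumps_ahead = not jumps_ahead
    return "deletion" if jumps_ahead else "duplication"

# Toy split reads with made-up coordinates (not data from the paper).
toy_reads = [
    (("chr1", 1_000, "+"), ("chr1", 9_000, "+")),  # skips ahead
    (("chr1", 9_000, "-"), ("chr1", 1_000, "-")),  # skips ahead, minus strand
    (("chr1", 9_000, "+"), ("chr1", 1_000, "+")),  # jumps back
    (("chr1", 1_000, "+"), ("chr1", 9_000, "-")),  # strand switch
]
counts = Counter(classify_nahr(a, b) for a, b in toy_reads)
del_to_dup = counts["deletion"] / counts["duplication"]
print(counts, del_to_dup)
```

The toy reads were picked by hand so the tally is easy to follow; a real analysis would also need mapping-quality filters and a check that both parts actually fall inside annotated repeats.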

We found that repeats in peri-centromeric regions of several chromosomes are particularly "hot" and recombined more than expected from a random background NAHR distribution. Check out this very NAHR-prone locus on chr21 (one dot = one 100kb bin): (-8)
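
For anyone curious how "one dot = one 100kb bin" translates into numbers, below is a rough Python sketch of binning event positions and flagging bins that exceed a background expectation. The thread doesn't spell out the background model, so the uniform expectation, the `fold` cutoff, the function names and the toy positions are all assumptions for illustration.

```python
from collections import Counter

BIN_SIZE = 100_000  # 100 kb per bin

def bin_counts(positions, bin_size=BIN_SIZE):
    """Count events per fixed-size bin; keys are bin start coordinates."""
    return Counter((pos // bin_size) * bin_size for pos in positions)

def hot_bins(counts, expected_per_bin, fold=2.0):
    """Return bin starts whose count is at least `fold` times the expectation."""
    return sorted(b for b, n in counts.items() if n >= fold * expected_per_bin)

# Toy event positions on one chromosome (made up, not data from the paper).
toy_positions = [12_000, 150_000, 160_000, 170_000, 199_999, 250_000]
counts = bin_counts(toy_positions)

# Assumed background: events spread evenly over the 3 bins covered here.
expected = len(toy_positions) / 3
print(hot_bins(counts, expected))
```

A uniform expectation is the crudest possible background; anything realistic would at least account for how many repeats sit in each bin.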

Trying to understand when/how NAHR originates in tissues, we discovered that differentiation of iPSCs in cortical neurons causes increased NAHR of proximal repeats, and neuron-specific increase of NAHR in B compartments at the expense of NAHR in A compartments (-7)

Analysis of NAHR in sporadic #Alzheimers and #Parkinsons samples showed disease-specific NAHR signatures. Intra-chrom. NAHR was enriched in several AD/PD tissues compared to controls. We also found tissue- and disease-specific increase of SVs caused by NAHR in AD/PD genes (-6)

To cross-validate capture-seq NAHR data, we sequenced on @nanopore PromethION WGS libraries of temporal cortex of control, PD and AD donors + kidney/liver from 3 controls. TE-reX identified >500k NAHR events in WGS data, less than 1% shared with capture-seq (-5)

Profiling of NAHR in @nanopore data largely confirmed capture-seq findings, with some exceptions. For example, the count of NAHR per million reads was higher in PD vs Controls/AD. Intra-chromosomal NAHR of proximal repeats was enriched in all samples (-4)
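
Normalizing to "per million reads" is what makes libraries of different depth comparable. A minimal Python sketch of that arithmetic follows; the function name `nahr_per_million` and the example numbers are hypothetical and are not values from the paper.

```python
def nahr_per_million(nahr_events, total_reads):
    """Scale a raw NAHR event count to events per million sequenced reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    # Multiply first so round numbers stay exact in floating point.
    return nahr_events * 1_000_000 / total_reads

# Two made-up libraries with the same raw count but different depth.
shallow = nahr_per_million(500, 2_000_000)
deep = nahr_per_million(500, 10_000_000)
print(shallow, deep)
```

With the same raw count, the shallower library ends up with the higher normalized value, which is why raw counts alone can't be compared across samples.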

We confirmed an almost exact deletion to duplication ratio of 2:1 in all samples for NAHR of direct proximal repeats. PD samples had more deletions vs Controls/AD. We also confirmed enrichment of NAHR in peri-centromeric regions. Alignment to T2T CHM13 had no impact on this (-3)

Overall, we show that somatic NAHR of Alu/L1 is an important source of genomic diversity in health, development and disease. But not everything is perfect! Bulk samples = no direct measure of NAHR/cell. Relationship b/ween NAHR and DNA repair in non-dividing cells is unclear (-2)

Lots of work lies ahead also to understand if NAHR profiles in PD/AD are just a consequence of cell/tissue specific DNA damage, or if there is more to this. And what's the role of recombination in neural differentiation? We hope to be able to tell you more about this soon (-1)

This has been a fantastic collaboration with bright people inside and outside @RIKEN_IMS, accompanied by a smooth and productive peer-review process with the superlative editorial team of @CellCellPress! It's been a long journey...that has only just begun. Thanks to all! /END

The article will be freely accessible for 50 days at this link:
authors.elsevier.com/a/1fTWa_278y%7…
