I've reviewed quite a few reproductive #epigenetics papers over the last several years, and I find myself saying the same things over and over. So, a thread 🧵 on things to consider and important information to include in your manuscript 1/
There are many studies using #microarray data to measure DNAm. Beta values are great for visualization, but should not be used for statistical tests. Here are some citations explaining why - link.springer.com/article/10.118…
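For anyone who wants to see what the alternative looks like: the usual choice for statistical testing is the M-value, the base-2 logit of beta. Here is a minimal sketch of the conversion. The function names and the `eps` clamp are my own illustrative choices, not taken from any particular package.

```python
import math


def beta_to_m(beta, eps=1e-6):
    """Convert a methylation beta value (0..1) to an M-value.

    M = log2(beta / (1 - beta)). The eps clamp is an assumed, arbitrary
    guard that keeps the log finite when beta is exactly 0 or 1.
    """
    b = min(max(beta, eps), 1.0 - eps)
    return math.log2(b / (1.0 - b))


def m_to_beta(m):
    """Convert an M-value back to a beta value, e.g. for plotting."""
    return 2.0 ** m / (2.0 ** m + 1.0)
```

The idea would be to run the tests on M-values, then convert back to beta (or report the beta difference) for figures and interpretation.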
Multiple test correction needs to be performed when you are performing multiple tests.
If you do not do this, you should at least be transparent about it and clearly explain the rationale for not doing so 5/
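As one example of what a correction can look like, here is a rough sketch of the Benjamini-Hochberg false discovery rate adjustment in plain Python. This is just one common option (Bonferroni and others exist), and the function name is my own. In practice you would probably use an established stats package instead.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down to the smallest, so that each
    # adjusted value is never larger than the one ranked above it.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

You would then call a site significant if its adjusted p-value is below your chosen FDR threshold, and state that threshold in the methods.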
This is a big one! CORRELATION != CAUSATION
Unless you include functional studies that demonstrate a causal mechanism, you can only infer association. 6/
Validating results vs. Confirming results.
If you perform a different technique on the same samples and see your change in DNAm or gene expression, this is confirming the results.
To validate the results you need to see that change in an independent cohort. 7/
GO analyses - if you are not looking at every gene or CpG site in the genome, you should correct for the background in which your experiment is performed.
E.g., the 450K and 850K arrays are enriched for neuro and developmental genes, so those terms are more likely to come up by chance 8/
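To show why the background matters, here is a simplified sketch of a one-sided hypergeometric enrichment test where the gene universe is passed in explicitly. The function and parameter names are hypothetical. It also ignores other array-specific biases, such as the number of probes per gene, so treat it as an illustration rather than a full method.

```python
from math import comb


def enrichment_pvalue(hits_in_term, n_hits, term_in_background, n_background):
    """One-sided hypergeometric p-value, P(X >= hits_in_term).

    n_background       : genes that could have been detected (e.g. on the array)
    term_in_background : background genes annotated to the GO term
    n_hits             : genes in your significant list
    hits_in_term       : significant genes annotated to the GO term
    """
    max_k = min(n_hits, term_in_background)
    total = comb(n_background, n_hits)
    tail = sum(
        comb(term_in_background, k)
        * comb(n_background - term_in_background, n_hits - k)
        for k in range(hits_in_term, max_k + 1)
    )
    return tail / total


# Made-up numbers: the same 10 of 50 hits fall in the term, but the term
# covers 10% of a hypothetical array background vs. 2% of a whole genome.
p_array = enrichment_pvalue(10, 50, 100, 1000)
p_genome = enrichment_pvalue(10, 50, 400, 20000)
```

With these invented counts, the same result looks far less surprising against the array background than against the whole genome, which is exactly the problem with using the wrong universe.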
Preprocessing and normalization of data is important! Don’t skip over it, you spent a lot of time doing that. Give yourself the credit! 9/
If you have numerous datasets and/or models, tables are helpful to summarize. I find myself getting lost in some of the larger papers 10/
The terms hyper- and hypomethylated are not defined well. It’s never clear to me whether authors are talking about less/more methylation compared to control, or saying that a site has high or low methylation in absolute terms. Make sure to define this or avoid the terms 11/
Anyway! Hopefully this will help someone!