An opinionated thread to give context for
1/8
e.g. in "standard" analysis for brain mapping onlinelibrary.wiley.com/doi/full/10.10…, for machine learning in brain imaging
sciencedirect.com/science/articl…
or more generally in "hypothesis driven" statistical testing
go.gale.com/ps/anonymous?i…
2/8
Machine learning provides these, and tree-based models need few data transformations.
3/8
Cross-validation and permutation importance provide these, once we have chosen input (exogenous) and output (endogenous) variables.
4/8
I no longer trust such endeavors, including mine.
5/8
scikit-learn.org/stable/modules…
They are robust to data distribution and support missing values (even outside MAR settings arxiv.org/abs/1902.06931)
6/8
But applying them without thousands of data points (as I tried for many years) is hazardous. Get more data, or change the question (e.g. analyze across cohorts).
7/8
For more, push me to write a paper on this.
8/8