Here's a a couple dozen tweets on how this happens, and what to do about it. #tweetstorm
That overreliance on data is especially bad with small n, or in complex domains, or in domains with non-predictive theory.
This is basically @nntaleb's case of induction by turkeys - they weren't killed yet, so they are safe. In fact, even with very large n, all data is conditional, not absolute.
That means we should identify what the key uncertainties about our conclusions are. If possible, we should also find where further evidence would help, consider gathering it.
Theory predicts that X% of height is genetic, conditioned on current distribution of population genetic variation, and on sufficient nutrition as a child, and on age, and on not having osteoporosis.
If you liked the tweetstorm, check out my other work!