Glad to see someone else take a proper look at the data, and to see our raw analysis replicated. After looking briefly at Jesse's results, I agree: they are largely in agreement!
That said, there are some enormous issues in interpretation and presentation here to address. (1/7) [Image]
The most inaccurate part, to me, is Table 1. It relies on an arbitrary 20% cutoff of all chordates to dismiss many samples with wildlife DNA/RNA. I've put it below, with a corrected version on the right without this cutoff. Multiple virus-positive samples had raccoon dog DNA. (2/7) [Image]
This cutoff is wrong for a few reasons. #1: wildlife samples have naturally more species diversity, so fewer hits >20%. #2: the virus is generally rare in the samples, and the host likely is too. #3: mammalian DNA% is really what is relevant for guessing who shed the virus. (3/7)
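To make points #1 and #3 concrete, here is a minimal sketch of how a fixed share cutoff behaves. The function name, species labels, and read counts are all invented for illustration; none of them come from the actual dataset or from either analysis.

```python
def species_above_cutoff(read_counts, cutoff=0.20):
    """Return the species whose share of the supplied reads exceeds `cutoff`.

    `read_counts` maps a species label to a read count. The caller decides
    the denominator by choosing which species to pass in (for example all
    chordates, or mammals only).
    """
    total = sum(read_counts.values())
    if total == 0:
        return []
    return sorted(s for s, c in read_counts.items() if c / total > cutoff)


# Hypothetical sample with six equally abundant species: each is ~16.7%,
# so nothing clears a 20% cutoff even though every species is present.
diverse_sample = {f"species_{i}": 100 for i in range(6)}
print(species_above_cutoff(diverse_sample))  # → []

# Hypothetical sample where a mammal is 15% of all chordate reads...
all_chordates = {"fish_X": 700, "bird_Y": 150, "mammal_A": 150}
print(species_above_cutoff(all_chordates))  # → ['fish_X']

# ...but 100% of the mammalian reads once the denominator is mammals only.
mammals_only = {"mammal_A": 150}
print(species_above_cutoff(mammals_only))  # → ['mammal_A']
```

The same species can pass or fail depending only on the denominator and on how many other species share the sample, which is why the choice of cutoff matters so much.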
Second, here is exactly what we wrote about correlation analyses. I've put it next to Bloom's correlation with the species labeled. Inferring the meaning of this juxtaposition is left as an exercise to the reader. (4/7) [Image]
Third, the most lacking part of the somewhat silly correlation figures made to date (I do think we can all agree!) is that they do not show error bars. Here I've remade Bloom Fig 5b (linear axes) with some basic 95% confidence intervals. These don't say anything... (5/7) [Image]
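For readers who want to see how wide such intervals can get, one common way to put a 95% interval on a Pearson correlation is the Fisher z-transformation. This is only a sketch: the thread does not say which method was used for the remade figure, and the function names here are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for a correlation r estimated from n pairs.

    Uses the Fisher z-transformation: atanh(r) is roughly normal with
    standard error 1 / sqrt(n - 3). Requires n > 3 and |r| < 1.
    """
    if n <= 3:
        raise ValueError("need more than 3 observations")
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# With few samples the interval is wide: r = 0 from 28 pairs gives
# roughly (-0.37, 0.37).
low, high = fisher_ci(0.0, 28)
print(round(low, 2), round(high, 2))
```

With small sample counts, an interval like this can easily span zero, which is the sense in which a bare correlation coefficient says very little.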
Finally, correlations, abundances... meh, never going to "prove" much, either way. I'd prefer to do science that makes predictions:
The market was identified as the epidemiological center of the virus in the city, and a stall as the viral hotspot in the market last year.
Afterward, the data came out: the exact right species were in the exact spot they were predicted to be, the very center of all known epidemiological links within the city, and the hot spot within the market. And the 2 viral lineages were found nearby.
Look for predictions that work! (7/7)

