Calling Bullshit

2 May, 26 tweets, 9 min read

In our course, we spend a couple of lectures talking about how to tell whether scientific research is legit.

callingbullshit.org/tools/tools_le…

Appearing in a journal from a reputable scientific publisher is a good start. But it's no guarantee.

@lastpositivist pointed us to a paper from a journal produced by the largest scientific publisher, Elsevier.

sciencedirect.com/science/articl…

The study looks for the genetic basis of psychic ability.

Yes, you read that right.

To do that, I suppose you have to start with psychics. And they did.

"Candidate cases were vetted for their psychic claims."

Vetted? How?

By self-report, followed by a couple of "tests"

(Interesting twist: the main test involved "precognitive ability" to guess a random number that a computer hadn't selected yet.)

Of course any good study needs a control group, and so they created one. Fair enough.

Interesting that all "vetted psychics" were Caucasian women over 37.

We are not told the demographics of the full 3000+ who filled out the original survey. Any guesses?

The control group then completed the same tests of psychic ability.

Here's the thing.

They did just as well as the group of "vetted psychics" did!

But wait! The text says they did better on the Remote Viewing test. Maybe that's the one that really matters, and the other tests aren't good tests.

Nope. Read the fine print under the table.

The authors *claim* the difference reached statistical significance in the text, but in the fine print note that it didn't actually reach significance once you account for multiple comparisons.

If you ever see this sort of thing in a paper (either a claim that "most Xs are bigger than Ys, even though it's not significant", or worse yet, a claim in the text that a difference is significant coupled with fine print in a table saying that it is not), be very wary!
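
To see why that fine print matters, here's a minimal sketch of a multiple-comparisons correction. The p-values below are made up for illustration and are not taken from the paper, and Bonferroni is just one common correction that I'm assuming here; the function names are hypothetical too.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """For each p-value, report whether it survives a Bonferroni correction."""
    # With m tests, each one must clear alpha / m instead of alpha.
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]


def familywise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one false positive across independent tests
    when there is no real effect in any of them."""
    return 1 - (1 - alpha) ** num_tests


# Hypothetical p-values from five tests (NOT from the paper).
hypothetical_p_values = [0.04, 0.30, 0.55, 0.72, 0.91]

uncorrected = [p < 0.05 for p in hypothetical_p_values]
corrected = bonferroni_significant(hypothetical_p_values)

print(uncorrected)  # the 0.04 result looks significant on its own
print(corrected)    # nothing survives once all five tests are accounted for
print(round(familywise_error_rate(5), 3))
```

Run enough tests and one of them will probably cross p < 0.05 by luck alone, which is why the corrected result is the one to believe.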

Let's take stock of where we are: We've got a group of "vetted psychics" and a control group of random non-psychics, who have scored the same on the researchers' own test of psychic ability.

And where we're going: We want to find genetic differences associated with the differences in psychic ability THAT WE DIDN'T FIND BETWEEN OUR CASE AND CONTROL GROUPS.

How would you proceed at this point? I sure wouldn't waste the genetic tests. But the authors are bolder than I.

They went ahead and collected some sort of unspecified DNA sequence data. Whole genome sequence I would guess? Remarkably, I don't think the paper ever tells us.

And what did they find?

No differences between vetted psychics and controls in protein coding regions.

Nope, all they found was that all of the case samples but only two of the control samples had an A instead of a G in an intron, a part of the gene that is cut out before it is used to code for a protein. Such differences are generally expected to have no physiological effect.

Let's take stock again.

Now we've got a group of vetted psychics and a group of controls who have

(1) no differences in psychic ability on the researchers' own choice of tests and

(2) no genetic differences.

Should be a pretty open-and-shut case, right?

Well, not according to the Discussion section of the paper.

The authors argue that the reason they didn't see any statistical difference in ability was not because there was none, but rather "likely due to the small number of participants".

A couple of observations here.

First, if I look for a difference in flying ability between penguins and ostriches and find none, there's a more likely hypothesis than "the small number of participants".

Second, sample size is under the control of the investigators, yet these investigators argue that "the performance tests were not powered to detect differences."

Why on earth would you run a study you knew was not powered to detect the differences you were looking for?
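
For anyone who hasn't run a power calculation before, here's a rough sketch of what "powered to detect differences" means. It uses a normal approximation for a two-sided, two-group comparison of means; the effect size and group sizes are hypothetical, not the paper's, and the function names are my own.

```python
from statistics import NormalDist


def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is the true difference in means divided by the
    standard deviation (Cohen's d); both groups have n_per_group people.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Probability the test statistic lands in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


def n_needed(effect_size, target_power=0.8, alpha=0.05):
    """Smallest per-group sample size that reaches the target power."""
    n = 2
    while approx_power(effect_size, n, alpha) < target_power:
        n += 1
    return n


# Hypothetical scenario: a medium-sized true effect (d = 0.5).
for n in (10, 20, 50, 100):
    print(n, round(approx_power(0.5, n), 2))

print("per-group n for 80% power:", n_needed(0.5))
```

The point is that this calculation can be done before recruiting anyone, so "we weren't powered" is a design choice, not bad luck.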

But we shouldn't make too much of those tests anyway, the authors tell us. The psychics reported that the "tasks did not adequately capture their particular set of skills."

That's what I told Mrs. McClellan after I failed seventh grade social studies.

She gave me an F anyway.

Here I pause simply to note that the authors must travel in different circles than I do.

Besides, the authors note, there were some differences between cases (psychics) and controls.

Cases were more likely to believe in paranormal phenomena.

No kidding. You just screened them for *believing they were psychic.*

Next we get detailed cross-cultural sociogenetic analysis.

Wow. That sounds like some serious science.

No. Actually it's a term that's appeared only once before—in a different paper by one of this study's authors.

Don't be intimidated by jargon! Or at least make sure it's real.

Anyone with a bit of background in population genetics should read the above. It's wild. To summarize, the authors find a single base-pair difference in a non-coding region of the genomes of a very small sample of people who don't differ in psychic ability, and from this posit a selective explanation.

Namely, modernization led to relaxed selection on psychic ability (which, I remind you, is determined by this single base-pair difference in a non-coding region and not manifested in the phenotypes of the study subjects), and this is driving a shift in allele frequencies.

Another tip: the extraordinary amount of effort here put into the analysis of a complete non-result should itself be a red flag, even if you don't know enough population genetics to realize that GPT-3 could have written something more convincing.

Next the authors note that the more common allele in their controls is less common in the population at large. They forget that the controls are just that—random controls—and not people selected for lack of psychic ability.

So they conclude that psychic ability must be common!

Honestly I'm a bit exhausted at this point and feel like I just finished watching Inception backward, stoned. I'll leave you with the conclusion of the paper and remind you that it is based on finding no genetic differences between two small groups with no phenotypic differences.

More from @callin_bull

Calling Bullshit

@callin_bull

1 May

Here's an interesting example of how many of our lessons you can pack into one story.

We start with a forward reference headline. (Answer: Yellow).

Then a completely meaningless subhead, unless by perfect combination they mean minimal supply, maximal demand. (They don't).

To be fair, the FOX story manages not to claim that color has a causal effect on resale value, though the implication seems to be there.

The study itself, run by an online car sales site, fails this test badly, using causal language throughout.

iseecars.com/car-color-study

The problem is, this study is hopelessly confounded. It doesn't control for make and model, options, sticker price, etc.

Calling Bullshit

@callin_bull

19 Mar

https://twitter.com/MLevitt_NP2013/status/1372571271117029385

Nobel Laureate Michael Levitt is worried about declining sperm counts, but he says he hasn't done any reading yet.

Should he panic? Let's dig a little deeper.

He links to an article by Erin Brockovich. (Yes, *that* Erin Brockovich!)

Here's the money quote.

In our class, one of the fundamental rules for spotting bullshit is this:

"If something seems too good or too bad to be true, it probably is."

Zero sperm counts in 2045 sounds pretty bad.

When you see something like that, it's time to dig deeper and track back to the source.

Calling Bullshit

@callin_bull

3 Mar

My colleague, epidemiologist @joel_c_miller, has done a great job of debunking mis- and disinformation throughout the pandemic. In this great thread, he takes on the claim that COVID is basically harmless, and any excess deaths are due to fear and stress from social precautions.

https://twitter.com/joel_c_miller/status/1367006825598521347

Instead of calling the person an idiot, he does a nice job of explaining how you might test such a hypothesis, and then looks to the data to show that this story about fear and stress is entirely unsupported. The whole thing is well worth a read.

But there's something else interesting here.

The fear-and-stress argument is introduced with a historical account of an experiment supposedly conducted by the medieval Persian philosopher Avicenna / Ibn Sīnā.

The story is *total bullshit.*

Avicenna did no such experiment.

Calling Bullshit

@callin_bull

18 Feb

I love seeing journalists do a textbook job of calling bullshit on the misleading use of quantitative data.

Here's a great example. @RonDeSantisFL claimed that despite having schools open, Florida is 34th / 50 states in pediatric covid cases per capita.
nbcmiami.com/news/local/des…

I don't know for certain what set off their bullshit detector, but one rule we stress in our class is that if something seems too good or too bad to be true, it probably is.

DeSantis's claim is a candidate.

Below, a quote from our book.

The very next paragraph of the book suggests what to do when this happens: trace back to the source. This is a key lesson in our course as well, and at the heart of the "think more, share less" mantra that we stress. Don't share the implausible online until you've checked it out.

Calling Bullshit

@callin_bull

5 Dec 20

In science, people tend to be most interested in positive results — a manipulation changes what you are measuring, two groups differ in meaningful ways, a drug treatment works, that sort of thing.

Journals preferentially publish positive results that are statistically significant — they would be unlikely to have arisen by chance if there wasn't something going on.

Negative results, meanwhile, are uncommon.

Knowing that journals are unlikely to publish negative results, scientists don't bother to write them up and submit them. Instead they end up buried in file drawers, or these days, file systems.

This is known as the file drawer effect.

(Here p<0.05 indicates statistical significance.)
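
Here's a toy simulation of the file drawer effect, as a sketch under one loud assumption: every study is testing an effect that does not exist, so each p-value is just a uniform random draw between 0 and 1. The number of studies, the seed, and the function name are all made up for illustration.

```python
import random


def simulate_file_drawer(num_studies=10_000, alpha=0.05, seed=0):
    """Simulate studies of a non-existent effect and 'publish' only
    the statistically significant ones.

    Returns (number published, number left in the file drawer).
    """
    rng = random.Random(seed)
    # Under a true null hypothesis, p-values are uniformly distributed.
    p_values = [rng.random() for _ in range(num_studies)]
    published = [p for p in p_values if p < alpha]
    return len(published), num_studies - len(published)


published, in_drawer = simulate_file_drawer()
print("published (all false positives):", published)
print("left in the file drawer:", in_drawer)
```

Roughly 1 in 20 of these studies gets published, and every one of them is wrong. A reader who sees only the journals never learns about the rest.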

Calling Bullshit

@callin_bull

3 Dec 20

Jevin West was away today so in lecture I was able to sneak in one of my favorite topics, observation selection effects.

Let's start with a little puzzle.

In Portugal, 60% of families with kids have only one child. But 60% of kids have a sibling.

How can this be?

People are all over this one! And some are out ahead of me (looking at you, @TimScharks). We'll get there, I promise!

There are fewer big families, but the ones there are account for lots of kids.

If you sampled 20 families in Portugal, you'd see something like this.
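
One made-up set of 20 families that fits both numbers can be checked in a few lines. The family sizes below are hypothetical, chosen only to make the arithmetic work out; they are not real Portuguese data.

```python
def sibling_puzzle(family_sizes):
    """Return (share of families with exactly one child,
    share of kids who have at least one sibling)."""
    num_families = len(family_sizes)
    total_kids = sum(family_sizes)
    one_child_families = sum(1 for size in family_sizes if size == 1)
    kids_with_sibling = sum(size for size in family_sizes if size > 1)
    return one_child_families / num_families, kids_with_sibling / total_kids


# Hypothetical sample: 12 one-child families, 6 with two kids, 2 with three.
hypothetical_families = [1] * 12 + [2] * 6 + [3] * 2

family_share, kid_share = sibling_puzzle(hypothetical_families)
print(f"{family_share:.0%} of families have one child")
print(f"{kid_share:.0%} of kids have a sibling")
```

The 8 larger families are a minority of families, but they hold 18 of the 30 kids. Count by family and you get one answer; count by kid and you get another.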

Now let's think about class sizes.

Universities boast about their small class sizes, and class sizes play heavily into the all-important US News and World Report college rankings.

For example, @UW has an average class size of 28.

Pretty impressive for a huge state flagship.
