Thread on relationships between researchers and statistical consultants. Prompted by a few recent tweets, but not only those as this is a recurring and always-relevant conversation.
On the "researcher seeking stats help" side, there is an often-justified feeling that statistical consultants are difficult to work with (even those in good faith) and sometimes downright unhelpful or unpleasant.
So - let's address those right up front as part of this thread about making these relationships productive & relatively happy.
1. Yes, the statistical consultant should approach any new meeting with fresh eyes and ears, and should not carry any past bad experiences into a new meeting.
2. Yes, the statistical consultant should listen to the needs of the researcher & provide statistical support that actually meets the researcher's needs, not just use the person's problem to try to do whatever they like best.
These are definitely real peeves that I have with *my* colleagues (the statisticians) when I hear stories from folks who say they tried to meet with a statistician, but found it unhelpful or unproductive.
Why do these things happen? I have a few thoughts, and then some suggestions for how to minimize the risk of a bad experience / maximize the chance of a productive relationship.
1. Sometimes it's that the statistician is ill-suited for the role of consulting statistician. Technical expertise /=/ being good at working on applied problems. As is likely the case in any profession, statisticians vary in their skills, knowledge, and communication.
2. Sometimes it's that the researcher doesn't really understand the collaboration model that the statistician's position expects of them. I think I wrote a thread on this a while back...will maybe try to cover this in a separate thread.
3. Sometimes it's that the statistician has been worn down by months/years of being brought poorly done or conceived projects, poorly formatted datasets, on short deadlines ("My abstract is due tomorrow, can you just do this real quick for me?") and they've lost patience.
NOTE: none of these are excuses for being rude or unhelpful to a prospective collaborator (or anyone, really). And yeah, some statisticians are just bad at being collaborative statisticians.
But, sometimes it's also because the researcher may have (inadvertently or not) pushed one of the buttons common to those who have spent any time in a stats consultant-type role.
So, if you're a researcher seeking statistical support from someone, what can you do to try to minimize the risks of a bad experience or maximize the chances of a good experience?
1. Engage your statistician as early as possible in your project. This will vary from project to project, I'm sure, and sometimes you might say "too late for that..." but the point is to just make it as early as possible. Before you submit your grant / collect data if you can.
2. If you're able to engage them that early, invite them to sit in on discussions of the data collection process and give them a chance to provide feedback & suggestions on how you'll record & store the data.
2b. (one of the top pet peeves of stats consultants is being brought horrendously formatted data sheets by people who didn't know better)
3. Early engagement is also crucial to give the stats consultant a chance to comment on the design, planned analyses, and expected outputs of the project.
3b. Again, one of the biggest pet peeves, and one which probably leads to some frustrated stat-consultant behavior, is when someone comes in (often with one of those poorly formatted datasets!) literally asking for the impossible.
3c. once I was brought a dataset of 40 patients with 4 outcomes & told they wanted a risk prediction model.
Is that an excuse to be rude? No, but a likely response to requests like this is "Sorry, but the truth is you can't really do that with the data you have"
(Yes, yes, there are *polite* ways to say that and less-polite ways, and it should be obvious that one should be *polite* about saying that & better still follow it with "...but here's what we *can* realistically do with the data you have...")
3d. Early engagement also avoids this frighteningly common scenario: "Hi, I was told you help people with stats, I have an abstract that's due tomorrow and just need a few p-values to finish it up. Can you meet this afternoon at 4PM? I've attached my dataset to this email"
4. Ask the statistician what their expectations/needs are for the division of labor and share yours in return.
4b. Do you expect them to "just" provide some feedback on analyses that you're doing? Do you expect them to do all of the analyses and produce tables and figures, etc? Both can work, but make sure both sides know what the expectations are.
5. This is probably the hardest one to know before going in, but it's something to learn when you initiate contact or have that first meeting: ask what their funding model is and what their position's needs or expectations are, and be respectful of their response.
5b. Some of the hard feelings here are created by people flat out not knowing how different statisticians are expected to work (cont...)
5c. I had one exchange where a physician said they had emailed all the professors in the statistics department & were surprised that none took on the project.
5d. I asked if he had offered to pay them for the time. No, he said, but since they were statistics professors in the same University that he was, shouldn't they just do the project for free? After all, he would make them an author!
I'll try to add some more thoughts here later, or a full thread, about different collaborative models for stats consultancy inside and outside academia, but the main takeaway is - if you don't already know the statistician's funding model and expectations, ask them.
(and if their terms won't work for you...well, feel free to move on! If they need to be paid and you can't pay - clearly a different solution is needed - but it's kinda silly to try to guilt people with a plate full of work that pays their salary into doing your thing for free)
Oh, also - this isn't just about payment - it's about understanding what their professional needs are, too, and whether taking on your project actually does anything for them
(and more broadly, the expectations of their position / division / department / etc and where they sit in the overall structure...if they're a hired consultant *for your group* things are different than if they're a professor in a different dept or school)
Has anyone in *medicine* (or otherwise, but particularly interested in US academic medicine) actually proposed a study where they said they'd use an alpha threshold above 0.05? How was it received? (cont)
(Also, please do me a favor, spare me the arguments about NHST being a flawed paradigm on this particular thread)
Clearly not all studies have the same tradeoffs of a false-positive vs a false-negative finding, and in some cases a higher alpha threshold seems like it should be warranted...
@Jabaluck @_MiguelHernan @aecoppock I think (perhaps unsurprisingly) that this shows “different people from different fields see things differently because they work in different contexts” - the scenario you painted here is not really possible with how most *medical* RCTs enroll patients & collect baseline data
@Jabaluck @_MiguelHernan @aecoppock The workflow for most medical RCTs (excepting a few trial designs…which I’ll try to address at the end if I have time) is basically this:
@Jabaluck @_MiguelHernan @aecoppock 1. Clinics/practices/hospitals know that they are enrolling patients in such-and-such trial with such-and-such criteria.
Amusing Friday thoughts: I've been reading Stuart Pocock's 1983 book Clinical Trials: A Practical Approach (do not concern yourself with the reason).
There is a passage on "Statistical Computing" in Chapter 11 of the book that one might have expected to age poorly, but it is in fact remarkable how well several of the statements have held up.
"I would like to refer briefly to the frequent misuse of statistical packages. Since they make each analysis task so easy to perform, there is a real danger that the user requests a whole range of analyses without any clear conception of what he is looking for."
Fun thread using some simulations modeled on the ARREST trial design (presented at @CritCareReviews a few months ago) to talk through some potential features you might see when we talk about “adaptive” trials
DISCLAIMER: this is not just a “frequentist” versus “Bayesian” thread. Yes, this trial used a Bayesian statistical approach, but there are frequentist options for interim analyses & adaptive features, and that’s a longer debate for another day.
DISCLAIMER 2: this is just a taste using one motivational example for discussion; please don’t draw total sweeping generalizations about “what adaptive trials do” from this thread, as the utility of each “feature” must always be carefully considered in that specific context
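Since the thread's actual simulation code isn't reproduced here, here's a minimal sketch (in R, in the spirit of the simulation threads below) of one ingredient you often see in Bayesian adaptive designs: an interim look that computes the posterior probability that one arm is better, using a simple Beta-Binomial model. The counts, priors, and stopping threshold are made-up illustration values, not the ARREST trial's actual design or data.

```r
# Sketch of one "adaptive" ingredient: an interim look computing the posterior
# probability that one arm is better, via a simple Beta-Binomial model.
# Counts, priors, and threshold below are illustrative assumptions only,
# NOT the ARREST trial's actual design or data.
set.seed(1)

# Hypothetical interim data: successes / patients per arm
x_trt <- 12; n_trt <- 15
x_ctl <- 7;  n_ctl <- 15

# Beta(1, 1) priors -> Beta posteriors, summarized by Monte Carlo draws
draws_trt <- rbeta(100000, 1 + x_trt, 1 + n_trt - x_trt)
draws_ctl <- rbeta(100000, 1 + x_ctl, 1 + n_ctl - x_ctl)

# Posterior probability the treatment arm has the higher success rate
prob_superior <- mean(draws_trt > draws_ctl)
prob_superior

# A pre-specified rule might stop for efficacy if this exceeds some threshold
prob_superior > 0.975
```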
Here is a little intro thread on how to do simulations of randomized controlled trials.
This thread will take a while to get all the way through & posted, so please be patient. Maybe wait a few minutes and then come back to it.
This can be quite useful if you’re trying to understand the operating characteristics (power, type I error probability, potential biases introduced by early stopping rules) of a particular trial design.
I will use R for this thread. It is free. I am not interested in debates about your favorite stats program at this time.
If you want to do it in something else, the *process* can still be educational; you’ll just have to learn to mimic this process in your preferred program.
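For a flavor of what such a simulation looks like, here's a minimal sketch of a two-arm trial with a binary outcome, repeated many times to estimate power and type I error probability. The sample size, event probabilities, and alpha are illustrative assumptions, not the thread's actual worked example.

```r
# Minimal sketch: simulate a 2-arm RCT with a binary outcome many times to
# estimate power and type I error. All numbers (n, event probabilities, alpha)
# are illustrative assumptions, not values from the thread itself.
set.seed(42)

simulate_trial <- function(n_per_arm, p_control, p_treatment) {
  control   <- rbinom(n_per_arm, size = 1, prob = p_control)
  treatment <- rbinom(n_per_arm, size = 1, prob = p_treatment)
  # Simple test of two proportions as the final analysis
  test <- prop.test(c(sum(treatment), sum(control)), c(n_per_arm, n_per_arm))
  test$p.value
}

n_sims <- 5000

# Power: simulate under a true treatment effect (30% vs 20% event rate)
p_power <- replicate(n_sims, simulate_trial(250, p_control = 0.30, p_treatment = 0.20))
mean(p_power < 0.05)

# Type I error: simulate under the null (no difference between arms)
p_null <- replicate(n_sims, simulate_trial(250, p_control = 0.30, p_treatment = 0.30))
mean(p_null < 0.05)
```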
Here’s a brief follow-up thread answering a sidebar question to the last 2 weeks’ threads on interim analyses in RCTs and stopping when an efficacy threshold is crossed
The “TL;DR” summary of the previous lesson(s): yes, an RCT that stops early based on an efficacy threshold will tend to overestimate the treatment effect a bit, but that doesn’t actually mean the “trial is more likely to be a false positive result”
(Also, it seems that this is generally true for both frequentist and Bayesian analyses, though the prior may mitigate the degree to which this occurs in a Bayesian analysis)
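As a rough illustration of the overestimation point (not the actual analysis behind those threads), here's a small frequentist simulation with a single naive interim look: among simulated trials that stop early for efficacy, the average estimated effect exceeds the true effect used to generate the data. The effect size, sample sizes, and stopping threshold are made-up values.

```r
# Sketch: trials stopped early for efficacy tend to overestimate the effect.
# One interim look at half the data with a naive stopping threshold; all
# numbers are illustrative assumptions, not a real trial's design.
set.seed(123)

true_effect <- 0.5   # true mean difference (continuous outcome, SD = 1)
n_per_arm   <- 100
interim_n   <- 50
stop_alpha  <- 0.01  # stricter threshold at the interim look (naive example)

one_trial <- function() {
  trt <- rnorm(n_per_arm, mean = true_effect)
  ctl <- rnorm(n_per_arm, mean = 0)
  # Interim analysis on the first half of patients in each arm
  interim <- t.test(trt[1:interim_n], ctl[1:interim_n])
  if (interim$p.value < stop_alpha) {
    return(c(stopped_early = 1,
             estimate = mean(trt[1:interim_n]) - mean(ctl[1:interim_n])))
  }
  final_estimate <- mean(trt) - mean(ctl)
  c(stopped_early = 0, estimate = final_estimate)
}

results <- t(replicate(10000, one_trial()))
# Average estimated effect among trials stopped early vs. run to completion;
# the early-stopped group sits above the true effect of 0.5
tapply(results[, "estimate"], results[, "stopped_early"], mean)
```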