For the last 2.5 years, my daughters and I have been rating breakfast places in the #charlottesville #cville area. We rated 51 restaurants from 1 (worst) to 10 (best) on taste, presentation, menu, ambiance, & service. We also recorded cost per person.

Here's what we learned. 1/
Across 51 restaurants we spent $1,625.36 pre-tip, an average cost of $9.91/person (sometimes other family members joined).

Cheapest per person: Duck Donuts $3.10, Sugar Shack $3.41, Bojangles $4.30.

Most expensive per person: The Ridley $27.08, Farm Bell Kitchen $17.81, Fig $17.44.
Averaging all 5 ratings across all raters is one way to determine an overall rating. The grand average is 7.1 out of 10 w/ a range of 4.8 to 9.1. How strongly related are cost per person and overall rating?

r=0.36

Just 13% of the variation in quality is associated with cost.
Assuming a linear relationship, this modest link between cost and quality means that a one-point higher quality rating is associated with a $12.47 greater cost per person.
But, this modest relationship between cost & quality also means that there are restaurants that exceed the expected quality given their cost (good values) or fall short of expected quality given their cost (bad values). The best & worst values (quality vs cost) are pictured.
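Not part of the original thread, but for anyone who wants to reproduce this value analysis from the shared data, here is a minimal sketch. The file name and column names (restaurant, overall, cost_per_person) are my assumptions about the spreadsheet's layout, not its actual headers.

```python
import pandas as pd
import numpy as np

# Hypothetical tidy file: one row per restaurant with a mean overall
# rating ("overall") and "cost_per_person". Column names are guesses,
# not the actual spreadsheet headers.
df = pd.read_csv("breakfast_ratings.csv")

r = df["overall"].corr(df["cost_per_person"])        # Pearson r (thread: 0.36)
print(f"r = {r:.2f}, shared variance = {r**2:.0%}")  # r^2 is the ~13% figure

# Fit overall rating as a linear function of cost; restaurants above the
# line are "good values", below the line are "bad values".
slope, intercept = np.polyfit(df["cost_per_person"], df["overall"], deg=1)
df["value"] = df["overall"] - (intercept + slope * df["cost_per_person"])
print(df.sort_values("value", ascending=False)[["restaurant", "value"]])
```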
Regardless of cost, the top rated breakfast places were Thunderbird Cafe, Quirk Cafe, & Croby's Urban Vittles (now closed). And, the bottom rated were Dunkin', Bojangles, and Cavalier Diner.
When considering taste, presentation, menu, ambiance, & service separately, we get some additional insights.

The top row of the box shows the correlation of each dimension with cost per person. Cost is most strongly associated with presentation & ambiance, and only weakly with taste, menu, & service.
And, among the five ratings, taste & presentation are strongly related, as is ambiance with service and presentation. The other pairs are only modestly related.
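A correlation matrix like the one in the figure could be rebuilt along these lines, assuming the same hypothetical per-restaurant columns as in the sketch above:

```python
# Cost listed first so the top row of the matrix mirrors the figure:
# the correlation of cost per person with each of the five dimensions.
cols = ["cost_per_person", "taste", "presentation", "menu", "ambiance", "service"]
print(df[cols].corr().round(2))
```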

This suggests that personal priorities about dining experience will lead to a different set of top breakfast places.
The Taste Top 10 features Bluegrass (now closed), Charlie and Litza's, Quality Pie, and Oakhurst; the Bottom 10 is anchored by Cav Diner, Taco Bell, Bojangles, and Tip Top.
The Menu Top 10 features Bluegrass, IHOP, Fig, and Thunderbird; the Bottom 10 is anchored by Starbucks, Quality Pie, Bowerbird, and Juice Laundry.

[Our idiosyncratic interests in types of breakfast are most apparent in the menu ratings.]
And, the Presentation, Ambiance, and Service Top and Bottom 10 are pictured. Few places received top or bottom marks on all dimensions.
The observed variance across dimensions is interesting. Quality Pie had the most extreme spread (stdev=3.0), with some of the highest ratings for taste and presentation and some of the lowest for ambiance and menu. So good, but a disaster inside and right on a busy road outside.
Others with high variation across dimensions were:

Taco Bell (2.4): surprisingly good service/ambiance, terrible presentation

Dunkin' (1.9): Good taste, disastrous presentation/ambiance

Starbucks (1.9): Good taste, terrible presentation/menu
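The cross-dimension spread behind these numbers (e.g., Quality Pie's stdev of 3.0) can be computed per restaurant; a sketch under the same assumed column names:

```python
dim_cols = ["taste", "presentation", "menu", "ambiance", "service"]
# Sample standard deviation of each restaurant's five dimension ratings.
df["dim_spread"] = df[dim_cols].std(axis=1)
print(df.sort_values("dim_spread", ascending=False)[["restaurant", "dim_spread"]].head(5))
```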
Finally, there was variation across raters. In blue, all three raters were positively correlated, with 36% to 65% shared variance -- lots of idiosyncrasy in our assessments. In red, Joni's quality ratings were the least correlated with cost.
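If the sheet stores one overall-rating column per rater (the rater column names below are hypothetical), the between-rater agreement and each rater's correlation with cost could be checked like this:

```python
raters = ["haven", "joni", "brian"]                         # hypothetical rater columns
print((df[raters].corr() ** 2).round(2))                    # pairwise shared variance (thread: 36%-65%)
print(df[raters].corrwith(df["cost_per_person"]).round(2))  # each rater's r with cost
```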
Our individual Top and Bottom 10s are pictured. There was consensus on Thunderbird Cafe being the best breakfast place in the Charlottesville area. And, despite being a donut loving family, we had a terrible (dirty restaurant) breakfast at Dunkin'.
Individual interests also played out in unique ways. If Joni woke up in a bad mood, ratings for that day's restaurant suffered (sorry Farm Bell Kitchen, Michaels' Diner, and Oakhurst).
Finally, all data are publicly accessible for reanalysis and for creating better visualizations. docs.google.com/spreadsheets/d…
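The spreadsheet link above is truncated, so the sheet ID below is only a placeholder; a publicly shared Google Sheet can generally be pulled straight into pandas via its CSV export URL:

```python
import pandas as pd

SHEET_ID = "YOUR_SHEET_ID_HERE"  # placeholder -- the real ID is in the (truncated) link above
url = f"https://docs.google.com/spreadsheets/d/{SHEET_ID}/export?format=csv"
df = pd.read_csv(url)
print(df.head())
```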

This concludes the first ever breakfast rating open science twitter thread.
A few visual highlights from Haven...
