Happy Thursday! Today, I'd like to introduce and discuss various approaches, innovations, and resources for introducing Bayesian statistics to undergraduates! I am sure I will miss something good, so feel free to add yours or the ones you know.
First, a little bit of history. Bayesian methods became widely used thanks to the computational advances of the early 1990s, including the Gibbs sampler and Metropolis-Hastings algorithms (e.g. Gelfand and Smith (1990)).
However, even before that revolutionary advance, innovative educators had designed ways to introduce Bayes to students, e.g. emphasizing the intuition behind specifying a prior for a data analysis problem while relying on numerical integration, Franck et al. (1988).
After the computational advances, Bayesian education also took off, though mainly at the graduate level. At the undergraduate level, the 1996 JSM had an invited session on the advantages, disadvantages, rationale, and method for teaching intro stats from a Bayesian perspective.
Three papers later appeared in The American Statistician, all coming out of that JSM session. Berry (1997) discussed how to teach Bayesian elementary statistics with real applications in science. Albert (1997) proposed teaching Bayes' rule from a data-oriented approach.
The third paper, Moore (1997), argued that it was premature to teach Bayes ideas and methods in an intro stats course and discussed 4 obstacles: #1 Bayesian techniques were little used. #2 Bayesians had not agreed on standard approaches to standard problem settings.
#3. Conditional probability can be confusing to beginners. #4. The teaching and learning of Bayesian inference might impede the trend toward experience with real data and a better balance among data analysis, data production, and inference.
Nevertheless, in the late 1990s and early 2000s, materials were developed for introductory Bayes teaching and learning, e.g. Statistics: A Bayesian Perspective by Berry (1995) and Workshop Statistics: Discovery with Data, A Bayesian Approach by Albert and Rossman (2001).
In the 2000s, many textbooks and papers were published, mainly on graduate-level Bayesian education for statisticians and / or scientists in other fields. For statistics grad students, there are (in order of 1st edition) #1. Bayesian Data Analysis by Gelman et al. (now in 3rd edition).
#2. A First Course in Bayesian Statistical Methods by Hoff. #3. Bayesian Essentials with R by Marin and Robert (it has a companion R package). #4. Bayesian Statistical Methods by Reich and Ghosh.
Bayes methods also became widely used in other sciences, so teaching them caught on there as well, most notably in psychology and cognitive science, e.g. Doing Bayesian Data Analysis by Kruschke and Bayesian Cognitive Modeling by Lee and Wagenmakers.
In addition to psychology and cognitive science, marketing and business also had textbooks developed for Bayes, e.g. Bayesian Statistics and Marketing by Rossi, Allenby, and McCulloch.
Statisticians also brainstormed and contributed to Bayesian education for non-statisticians, e.g. Gelman (2008) and Utts and Johnson (2008) in The American Statistician.
Now, back to teaching Bayes to undergraduates! Over the years, many innovative teaching strategies have been proposed and published. In addition to Franck et al. (1988), there was Kuindersma and Blais (2007), on teaching Bayesian model comparison with a physics application.
In the last few years, there was an active-learning exercise on Bayesian inference with M&M's (Eadie et al. 2019) and a web simulator for Bayes' theorem with an application to the search for the USS Scorpion (Barcena et al. 2019).
There was also an approach to teaching Bayes' theorem by looking at strength of evidence as predictive accuracy (Rouder and Morey, 2019). All of these are great teaching tools for the classroom!
What about semester-long Bayesian courses and curriculum design? Witmer (2017) presented ideas on teaching an undergraduate Bayesian course that uses Markov chain Monte Carlo and that can be a second course or, for strong students, a first course in statistics.
More recently, in 2020, @JStatEd published a Bayes cluster consisting of 5 articles on Bayes methods and the undergraduate statistics and data science curriculum: tandfonline.com/toc/ujse20/28/…
Hoegh (2020) in "Why Bayesian Ideas Should Be Introduced in the Statistics Curricula and How to Do So", discussed exactly what the title suggested. It also contains a rich compilation of materials that can be used to create Bayesian courses in the Supplementary Materials.
Hu (2020) in "A Bayesian Statistics Course for Undergraduates: Bayesian Thinking, Computing, and Research" presented approaches to designing an upper level elective course for students with calculus and probability backgrounds.
Albert and Hu (2020) in "Bayesian Computing in the Undergraduate Statistics Curriculum" provided an overview of the various options for implementing Bayesian computational methods motivated to achieve particular learning outcomes.
Albert (2020) in "Review of Statistical Rethinking: A Bayesian Course with Examples in R and Stan, Second Edition, by Richard McElreath, Chapman and Hall, 2020" reviewed McElreath's best-selling Statistical Rethinking.
And lastly Johnson et al. (2020) in "Teaching an Undergraduate Course in Bayesian Statistics: A Panel Discussion" gathered five Bayes educators as a panel with a goal of providing information and advice about teaching a course in Bayesian statistics for undergraduates.
CAUSE recently gathered a few authors from the @JStatEd Bayes cluster on a webinar on "Bayesian Methods and the Statistics and Data Science Curriculum". Link to the slide deck and recorded webinar: causeweb.org/cause/webinar/…
24 years after the 1996 JSM session on intro stats and Bayes, in 2020 another JSM session on "Thinking Beyond the P-Value: Advancing Bayesian Education for the Undergraduates" took place. Link to the slide deck and Q&A: github.com/monika76five/t…
To help you brainstorm to create your own Bayes for undergraduates course, I will next include links to a few courses with course material publicly available online. Please add yours or the ones you know, if I miss them!!
Statistical Rethinking by Richard McElreath, suitable for undergrads and grads: github.com/rmcelreath/sta…
Bayesian and Modern Statistics at Duke University by Jeff Miller (suitable for undergrads and grads): jwmi.github.io/BMS/index.html
Introduction to Bayesian data analysis at University of Potsdam by Shravan Vasishth, with a focus on linguistics and psychology: vasishth.github.io/IntroductionBa…
Bayesian Statistics at Vassar College by Monika Hu (suitable for undergrads): github.com/monika76five/U…
Bayesian Statistics at Carleton College by Adam Loy (suitable for undergrads): github.com/aloy/math315-f…
Also, a list of suitable textbooks for Bayes for undergrads!
Bayes Rules! An Introduction to Bayesian Modeling with R: bayesrulesbook.com
One treasure I have discovered recently is ThinkBayes2, on which I am basing an independent study with some students who are interested in Bayes and have some Python background. We are all enjoying the book and the Jupyter notebooks: github.com/AllenDowney/Th…
It's amazing to see so many innovative approaches to Bayes pedagogy and more and more statistics and data science educators working to include Bayes in their undergraduate curriculum. I have no doubt that more is to come :)
I would add that Bayes educators seem to have had a long and ongoing discussion about what software packages to use in an undergrad course: plain R / Python vs JAGS / BUGS vs Stan vs Stan-based packages. These discussions reflect the important role computing plays in teaching Bayes.
Okay I think I am done for now. Thank you for reading this super-long thread. I am sure I must have missed some good things to share so please feel free to add!
Happy Friday!! Today I'd like to describe two important approaches to data privacy research and applications: synthetic data and differential privacy. I hope to generate more interest in this area among researchers and practitioners!
1/n Data privacy and data confidentiality are important topics for statisticians, computer scientists, and really, anyone who offers their own data and consumes data!
2/n Statistical agencies, in particular, are under legal obligations to protect the privacy and confidentiality of survey and census respondents, e.g. U.S. Title 26.
Let’s talk vectorization! You may have heard about or experienced how simple NumPy array ops (such as dot product) run significantly faster than for loops or list comprehension in Python. How? Why? Thread incoming.
Suppose we are doing a dot product on two n-dim vectors. In a Python for loop, scalars are individually loaded into registers, and operations are performed on the scalar level. Ignoring the sum, this gives us n multiplication operations.
NumPy makes this faster by employing vectorization, where you can load multiple scalars into registers and get many products for the price of one operation (SIMD). SIMD — single instruction, multiple data — is a backbone of NumPy vectorization.
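To make the comparison concrete, here is a minimal sketch of the two approaches described above. The helper name `dot_loop` and the vector length are assumptions made for illustration. Whether `np.dot` actually uses SIMD instructions depends on how NumPy was built and on your hardware, and any gap you measure also includes the interpreter overhead the loop pays on every element.

```python
import numpy as np

def dot_loop(a, b):
    """Dot product with a plain Python loop: one scalar multiply per step."""
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total

# Two n-dim vectors (n = 1000 is an arbitrary choice for this sketch).
rng = np.random.default_rng(0)
a = rng.random(1000)
b = rng.random(1000)

loop_result = dot_loop(a, b)   # n scalar multiplications in the interpreter
vec_result = np.dot(a, b)      # one call into NumPy's compiled routine

# Both routes should agree up to floating-point rounding.
print(np.isclose(loop_result, vec_result))
```

To see the speed difference on your own machine, you could time both calls with the standard library's `timeit` module.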
Today I will be talking about some of the data structures we use regularly when doing data science work. I will start with numpy's ndarray.
What is an ndarray? It's numpy's abstraction for describing an array, or a group of numbers. In math terms, "array" is a catch-all term used to describe matrices or vectors. Behind the scenes, it essentially describes memory using several key attributes:
* pointer: the memory address of the first byte in the array
* type: the kind of elements in the array, such as floats or ints
* shape: the size of each dimension of the array (ex: 5 x 5 x 5)
* strides: number of bytes to skip to proceed to the next element along each dimension
* flags
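The attributes in the list above can be inspected directly on any array. This is a small sketch, assuming a 5 x 5 x 5 array of 64-bit floats in NumPy's default row-major layout; the variable names are only illustrative.

```python
import numpy as np

# A 5 x 5 x 5 array of 8-byte floats, stored in default (C, row-major) order.
arr = np.zeros((5, 5, 5), dtype=np.float64)

pointer = arr.ctypes.data   # memory address of the first byte, as an int
elem_type = arr.dtype       # element type: float64
shape = arr.shape           # (5, 5, 5)
strides = arr.strides       # (200, 40, 8): bytes to step along each dimension
flags = arr.flags           # layout info such as C_CONTIGUOUS and WRITEABLE

# A transpose is a view: same pointer, reversed shape and strides, no copy.
view = arr.T
same_memory = view.ctypes.data == pointer

print(elem_type, shape, strides)
print(view.strides, same_memory)
```

The transpose at the end hints at why this design is useful: many reshaping operations only need to change the shape and strides, not the underlying bytes.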
During my leave I’ve really enjoyed reading about the inspiring women trailblazers in statistics who paved the way for us. Here are some of my favourite quotes in chronological order. Please share yours! #WSDS
Florence Nightingale states in her essay Cassandra 👇
🖼 source: Wikimedia commons
I’m really looking forward to attending this 👇 #Nightingale2020 has been one of the few things worth celebrating this year! Her lessons on sanitation couldn’t be more relevant. #WSDS
As part of the bicentenary celebrations of the birth of the first woman elected a fellow of @RoyalStatSoc, at the society we’ve also organised several events throughout the year rss.org.uk/news-publicati…
Support mechanisms for students and early career researchers have become ever so important during the pandemic, yet more difficult to provide.
🖼️Another beautiful and on-point creation by @allison_horst
As a consequence, the power and potential of the support they receive from online communities like this one have been strengthened by the circumstances. I have personally valued them more than ever.
When I registered to curate this account earlier in the year I didn’t know there was going to be either a pandemic or elections. I just thought it would be a nice way to return to work after extended maternity leave, and a great way to get my confidence & stats interests back.