# Teaching statistics

Alex Reinhart – Updated May 31, 2019

See also Statistical misconceptions, Pedagogy more generally, and Statistical programming languages for thoughts on programming in early statistics classes.

We usually teach statistics in the same way it was taught to us: with lectures, backed by a tedious textbook and perhaps some labs using antiquated statistical software. We joke that students usually hate introductory statistics courses, but it is no joke, as evidenced by the people who routinely react as though I’ve admitted to an interesting sexually transmitted disease when I say I study statistics. (The reactions tend to be worse than when I was a physics major, surprisingly enough.)

Unfortunately, the literature on methods to teach statistics is fairly thin.

## High-level overview

• [To read] Garfield, J., & Ben-Zvi, D. (2007). How students learn statistics revisited: A current review of research on teaching and learning statistics. International Statistical Review, 75(3), 372–396. doi:10.1111/j.1751-5823.2007.00029.x. http://dx.doi.org/10.1111/j.1751-5823.2007.00029.x

• [To read] Zieffler, A., Garfield, J., Alt, S., Dupuis, D., Holleque, K., & Chang, B. (2008). What does research suggest about the teaching and learning of introductory statistics at the college level? A review of the literature. Journal of Statistics Education, 16(2). https://ww2.amstat.org/publications/jse/v16n2/zieffler.pdf

• De Veaux, R. D., & Velleman, P. F. (2008). Math is music; statistics is literature. Amstat News, 54–58.

Subtitle “Why Are There No Six-Year-Old Novelists?” Answer: there are mathematical prodigies because mathematics is self-contained and can be learned piece by piece. There are no literature prodigies because literature depends on life experience. Similarly, statistics “requires not just rules and axioms, but life experience and ‘common sense.’” Statisticians must ask if a method is appropriate, evaluate its implications, and communicate the result (visually or in writing) to others – skills not required in a math class, and devilishly difficult to teach. We must embrace and emphasize this to do justice to statistics’s important role at the center of science.

## Assessment

The first question is: How well do we teach statistics?

See also Student assessment.

• DelMas, R., Garfield, J., Ooms, A., & Chance, B. (2007). Assessing students’ conceptual understanding after a first course in statistics. Statistics Education Research Journal, 6(2), 28–58. https://www.stat.auckland.ac.nz/~iase/serj/SERJ6(2)_delMas.pdf

A valiant attempt to devise a test, the Comprehensive Assessment of Outcomes in Statistics, to judge students’ conceptual understanding of statistical concepts after an introductory course. Includes results from 763 students at 20 different colleges, which are terrifying:

• 12% of students know “the purpose of randomization in an experiment.”
• 50% know “that correlation does not imply causation.”
• Only 60% understand why “an experimental design with random assignment supports causal inference.”
• 31% of students understand “that statistics from small samples vary more than statistics from large samples.”
• 50% of students confuse random assignment with random sampling, or think that random assignment reduces sampling error, up from 36% before taking the course.
• More students think “Causation can be inferred from correlation” (36%) after the course than before (27%).
• Fully a third of students think that “Rejecting the null hypothesis means that the null hypothesis is definitely false.”

However, my reading of the CAOS exam itself was not positive: questions seem poorly phrased, at least one has a seriously ambiguous interpretation, and many seem to test whether you remember a fact (small p values mean significance) rather than whether you understand why it is true.

• Allen, K. (2006). The Statistics Concept Inventory: Development and analysis of a cognitive assessment instrument in statistics (PhD thesis). University of Oklahoma. https://ssrn.com/abstract=2130143

Questions here also aren’t very inspiring.

• Lane-Getaz, S. J. (2017). Is the p-value really dead? Assessing inference learning outcomes for social science students in an introductory statistics course. Statistics Education Research Journal, 16(1), 357–399.

Presents the Reasoning about P-values and Statistical Significance assessment, which reveals a number of misconceptions about p values held by students. Unfortunately, it focuses heavily on definitions and formal inference instead of concepts.

• Other inventories, with varying degrees of publicly available information, are the Statistical Reasoning Assessment (SRA), the Quantitative Reasoning Quotient (QRQ), and the Assessment of Inferential Reasoning in Statistics (AIRS). See also e-ATLAS.

• Surprisingly (or not), I can’t find much more data on student understanding in intro stats courses. Surely this kind of data is necessary to understand what our courses actually teach.

• For designing a new concept inventory, see the advice in Madsen, A., McKagan, S., & Sayre, E. C. (2014). Best practices for administering concept inventories. arXiv. https://arxiv.org/abs/1404.6500

For statistics, it seems like a worthwhile starting place would be trawling the literature for common misconceptions and errors expressed by students in intro classes, then explicitly designing questions to elicit these. See Statistical misconceptions.

## Interesting case studies and curricula

• Wild, C. J., Pfannkuch, M., Regan, M., & Horton, N. J. (2011). Towards more accessible conceptions of statistical inference. Journal of the Royal Statistical Society: Series A (Statistics in Society), 174(2), 247–295. doi:10.1111/j.1467-985x.2010.00678.x

Proposes building up to inference over several years (e.g. in high school) in much the same way as science classes gradually build better and better approximations of physics. The most interesting aspect is that their proposed progression is almost entirely visual, and pares statistics down to the fewest possible core concepts. They believe that sampling variation is the “critical element”, so they directly illustrate sampling from real finite populations, and deliberately show cases where sampling variation gives us results directly the opposite of what we know to be true. A compelling article.
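To make the "results directly the opposite of what we know to be true" point concrete, here is a minimal simulation sketch. It is not the authors' visual approach or their data: the two finite populations are invented for illustration (hypothetical heights in cm, with group A exactly 3 cm taller on average), and `reversal_rate` is a made-up helper name. It repeatedly draws samples without replacement and counts how often the sample difference points the wrong way.

```python
import random
from statistics import mean


def reversal_rate(pop_a, pop_b, sample_size, n_repeats=2_000, seed=1):
    """Fraction of repeated samples whose difference in means has the
    opposite sign to the true difference between the two populations."""
    rng = random.Random(seed)
    true_diff = mean(pop_a) - mean(pop_b)
    reversals = 0
    for _ in range(n_repeats):
        # Sample without replacement from each finite population.
        sample_a = rng.sample(pop_a, sample_size)
        sample_b = rng.sample(pop_b, sample_size)
        sample_diff = mean(sample_a) - mean(sample_b)
        if sample_diff * true_diff < 0:
            reversals += 1
    return reversals / n_repeats


# Hypothetical finite populations (assumed for illustration only):
# 400 heights in cm per group, with group A shifted up by exactly 3 cm.
pop_b = list(range(150, 190)) * 10
pop_a = [height + 3 for height in pop_b]

rates = {n: reversal_rate(pop_a, pop_b, n) for n in (5, 25, 100)}
for n, rate in rates.items():
    print(f"sample size {n:>3}: wrong direction in {rate:.1%} of samples")
```

Under these assumptions, small samples should reverse the known population difference far more often than large ones, which is the behavior the article wants students to see before any formal inference is introduced.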

• Dierker, L., Kaparakis, E., Rose, J., Selya, A., & Beveridge, D. (2012). Strength in numbers: A multidisciplinary, project-based course in introductory statistics. Journal of Effective Teaching, 12(2), 4–14. http://eric.ed.gov/?id=EJ1092198

Describes an intro project-based course for nonmajors. Students read material from CMU’s OLI before class, so the course can have a 2-to-1 lab-to-lecture ratio. Students are presented with several curated datasets at the very beginning, asked to pick a research topic and review the literature, then learn methods they can use to answer their research question, managing to get through EDA, inference, regression, ANOVA, and logistic regression, all in one semester. The project provides strong motivation for the course material and OLI lets the class be mostly labs with hands-on data analysis (students get to use SAS, Stata, SPSS and R). Course evaluations show the students really liked the course, but there’s no check to see if they managed to grasp all the concepts in such a short time.

• Wagaman, A. (2016). Meeting student needs for multivariate data analysis: A case study in teaching an undergraduate multivariate data analysis course. The American Statistician, 70(4), 405–412. doi:10.1080/00031305.2016.1201005

An early course (taught both immediately after an intro class and without it) taking an algorithmic approach to data analysis, covering classification and clustering in the belief these require fewer mathematical prerequisites.

• [To read] De Veaux, R. D. et al. (2017). Curriculum guidelines for undergraduate programs in data science. Annual Review of Statistics and Its Application, 4(1), 15–30. doi:10.1146/annurev-statistics-060116-053930

## Active learning

I’m a fan of peer instruction (see Pedagogy), but there have been only a few evaluations of similar active learning methods in statistics:

• Carlson, K. A., & Winquist, J. R. (2011). Evaluating an active learning approach to teaching introductory statistics: A classroom workbook approach. Journal of Statistics Education, 19(1). http://ww2.amstat.org/publications/jse/v19n1/carlson.pdf

Finds significantly improved attitudes towards statistics in a flipped-classroom approach where students read material before class, complete a short homework assignment on the reading, and then receive a brief lecture summary. After the lecture was a workbook activity in which students worked out examples and answered conceptual questions, with live feedback from the instructor.

• Shinaberger, L. (2017). Components of a flipped classroom influencing student success in an undergraduate business statistics course. Journal of Statistics Education, 25(3), 122–130. doi:10.1080/10691898.2017.1381056

A gradual transformation of a business statistics class to being fully flipped, mostly with think-pair-share activities in class. Tests the usefulness of pre-class reading quizzes, formative assessment via online homeworks with instant feedback, and forcing students to do homework at a reasonable pace by only allowing access to later homework after doing earlier homework with a satisfactory grade.

• Nielsen, P. L., Bean, N. W., & Larsen, R. A. A. (2018). The impact of a flipped classroom model of learning on a large undergraduate statistics class. Statistics Education Research Journal, 17(1), 121–140. https://iase-web.org/documents/SERJ/SERJ17(1)_Nielsen.pdf

A reasonably well-designed experiment comparing a flipped classroom (partly inspired by peer instruction) to conventional lecture. The results were underwhelming: final exam scores and quiz scores were slightly higher for the flipped class, but not substantially, and no concept inventory was used to assess understanding separately from the final exam.

• Strayer, J. F., Gerstenschlager, N. E., Green, L. B., McCormick, N., McDaniel, S., & Holmes Rowell, G. (2019). Toward a full(er) implementation of active learning. Statistics Education Research Journal, 18(1), 63–82. https://iase-web.org/documents/SERJ/SERJ18(1)_Strayer.pdf

A very interesting qualitative study of how instructors implement active learning. Their “Modules for Teaching Statistics with Pedagogies using Active Learning” (MTStatPAL) project developed new course materials for active learning, pilot tested them in classrooms, and then took extensive notes on how instructors adopted them in their classes. One unifying theme was “Mathematical and Statistical Authority” (MSA): instructors found it difficult to relinquish their MSA during active learning activities, preferring to provide students with correct answers rather than making time to listen to students and elicit their reasoning. The instructors did ask questions and talk to students as they did active learning activities, but mostly to push them to the correct answer. Interviews with instructors found they struggled with this as well.

## Developing expertise

Students find it difficult to develop expert thinking, particularly in tasks like selecting the right analysis or the right graphics to use for a problem. They may be able to interpret results, but the meta-skill of picking the right analysis to interpret is only taught implicitly.

• Lovett, M. (2001). A collaborative convergence on studying reasoning processes: A case study in statistics. In S. M. Carver & D. Klahr (Eds.), Cognition and instruction: Twenty-five years of progress. Psychology Press.

Discusses think-aloud interviews and cognitive task analyses with students as they analyze data with Minitab. Suggests that “by using the statistics package interface cues”, students used “a basic guess-and-test strategy in order to generate analysis” rather than a systematic method, so “on average, the most informative statistical analysis… was the sixth analysis attempted by these students.” An automated interactive tutor which had students plan their analyses and gave immediate feedback seemed to quickly improve student skills.

## Randomization-based intro courses

I am easily swayed by the claim that teaching probability theory, distributions, t tests and their requisite assumptions, chi-squared tests, etc. is all wasted effort on nonmathematical students who need an introductory statistics course for basic statistical literacy. A traditional course needs months of background material (probability, distributions, the central limit theorem) before it can get to any interesting statistical ideas, like inference.

Randomization-based courses, where we just bootstrap and permute everything, are appealing because a simple two-sample permutation test can be explained on the second day of class, without weeks of background. We can get right into the logic of inference and sampling variation without the overhead.
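As a rough illustration of how little machinery this requires, here is a sketch of a two-sample permutation test and a percentile bootstrap interval using only Python's standard library. The data are invented (hypothetical scores for two groups), the function names are my own, and the add-one p-value correction and percentile interval are just one common choice each, not something taken from any of the curricula below.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the group labels are arbitrary,
        # so shuffle the pooled values and re-split them.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction counts the observed labelling as one permutation.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


def bootstrap_interval(group_a, group_b, n_resamples=10_000, level=0.95, seed=0):
    """Percentile bootstrap interval for the difference in group means."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping its original size.
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        diffs.append(mean(resample_a) - mean(resample_b))
    diffs.sort()
    lower_index = int((1 - level) / 2 * n_resamples)
    upper_index = int((1 + level) / 2 * n_resamples) - 1
    return diffs[lower_index], diffs[upper_index]


# Hypothetical data, invented for illustration only.
treatment = [12, 15, 14, 16, 13, 17]
control = [10, 9, 11, 12, 8, 10]

observed_diff, p_value = permutation_test(treatment, control)
lower, upper = bootstrap_interval(treatment, control)
print(f"observed difference in means: {observed_diff}")
print(f"permutation p-value: {p_value:.4f}")
print(f"bootstrap interval: ({lower:.2f}, {upper:.2f})")
```

The whole argument is visible in the loop: shuffle the labels, recompute the statistic, and see how often chance alone does as well as the data did. Nothing here depends on the central limit theorem or a named distribution.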

There’s a group at Hope College that has developed a textbook and a series of papers on the effectiveness of the method:

• Tintle, N., Chance, B., Cobb, G., Rossman, A., Roy, S., Swanson, T., & VanderStoep, J. (2016). Introduction to statistical investigations. Wiley. http://math.hope.edu/isi/

Chapter 1 is free on their website; I found it underwhelming, jumping into statistical significance on the first page with a confusing example, before the idea of sampling variation is even mentioned. I’d prefer a curriculum starting with quantifying uncertainty in estimates, and only incidentally developing tests, instead of treating tests as the primary goal. If the rest of the book is as confusing as the first few pages, I suspect the results in the following studies could be greatly improved with better course materials.

• Chance, B., Wong, J., & Tintle, N. (2016). Student performance in curricula centered on simulation-based inference: A preliminary report. Journal of Statistics Education, 24(3), 114–126. doi:10.1080/10691898.2016.1223529

Sort of a comprehensive update of the articles below, using results from their curriculum adopted by a bunch of different instructors at various institutions. The key line: “The overall average pre-test score was 0.498 and the overall average post-test score was 0.582.” That’s not very convincing, and they attempt to slice and dice the results in all sorts of ways (experience of instructor, gender, type of institution, GPA, and on and on), but don’t address the key question: why can’t we do better?

• Tintle, N., VanderStoep, J., Holmes, V.-L., Quisenberry, B., & Swanson, T. (2011). Development and assessment of a preliminary randomization-based introductory statistics curriculum. Journal of Statistics Education, 19(1). http://www.amstat.org/publications/jse/v19n1/tintle.pdf

Tests the randomization-based curriculum with CAOS: overall results are comparable to the traditional curriculum, with improvements in inference concepts.

• Tintle, N., Rogers, A., Chance, B., Cobb, G., Rossman, A., Roy, S., Swanson, T., & VanderStoep, J. (2014). Quantitative evidence for the use of simulation and randomization in the introductory statistics course. In ICOTS9.

Use of their curriculum doubles improvement on CAOS for students: average scores went from 44.9% before to 56.5% after the course, which I suppose is notable because it means ordinary intro statistics did even worse.

• Swanson, T., VanderStoep, J., & Tintle, N. (2014). Student attitudes toward statistics from a randomization-based curriculum. In ICOTS9.

Measures student perceptions of statistics after a simulation-based course, compared to traditional intro stats. Not much of a difference.

• Tintle, N., Chance, B., Cobb, G., Roy, S., Swanson, T., & VanderStoep, J. (2015). Combating anti-statistical thinking using simulation-based methods throughout the undergraduate curriculum. The American Statistician, 69(4), 362–370. doi:10.1080/00031305.2015.1081619

No data here, just a comprehensive argument that simulation-based inference better teaches the core concepts of statistics.

• Maurer, K., & Lock, D. (2016). Comparison of learning outcomes for simulation-based and traditional inference curricula in a designed educational experiment. Technology Innovations in Statistics Education, 9(1). https://escholarship.org/uc/item/0wm523b0

Proper crossover trial of simulation-based teaching of inference concepts, backed by the ARTIST tests for assessment. Finds only a 7% improvement on test scores over conventional instruction.