See also Teaching statistics for statistics-specific topics. Also, see Student assessment for ways to see what students are actually learning.
See also Course evaluations, Cognitive task analysis.
Docktor, J. L., & Mestre, J. P. (2014). Synthesis of discipline-based education research in physics. Physical Review Special Topics - Physics Education Research, 10(2), 020119. doi:10.1103/physrevstper.10.020119
A thorough review of physics education research, including many aspects of pedagogy and assessment. (See also the supplemental section on suggested future work.)
Fraser, J. M. et al. (2014). Teaching and physics education research: Bridging the gap. Reports on Progress in Physics, 77(3), 032401. doi:10.1088/0034-4885/77/3/032401
A review of the reasons that evidence-based teaching methods are not widely used (in physics):
People don’t approach teaching with scientific rigor, instead thinking of it as an “art” featuring lots of charisma and energy. Since teaching is imagined as very specific to instructors and classes, education research can be discounted as not applying to anything outside the very specific setting in which it was conducted.
Lecturers believe in lectures. They have to cover lots of material, they learned it from lectures, and so their students will learn it from lecture. But, as discussed below, their students won’t learn it from lecture, and would be better engaged with other teaching methods.
Instructors worry that focusing on concepts will mean their students can’t solve traditional problems or do math. But the evidence shows students in conceptual classes do as well or better on quantitative questions. And, importantly, “Evidence suggests that coverage of material does not necessarily result in learning, and students ultimately retain only a fraction of the content delivered,” and “a focus on conceptual understanding tends to enhance retention”.
Instructors think students don’t like nontraditional courses, or that they demand too much time and attention from students. But interactive courses don’t require extra class time, and time use surveys suggest they don’t have a significantly different workload, though there aren’t extensive studies on this. Student satisfaction can vary, but is often good.
Faculty don’t have the time to read through the extensive education literature, design new course materials, and rebuild their courses. The authors concede this, suggesting incentives need to be realigned: instructors should get timely peer review on their teaching (though good methods to do so are not yet available), review the literature to find premade methods that work, and adjust their time priorities. (No need to spend hours writing careful lecture notes when spending the time making interactive activities would be more useful.)
Ambrose et al., How Learning Works: Seven Research-Based Principles for Smart Teaching (2010). A good review of general pedagogical research, intended as a practical guide for instructors.
Dunlosky, J., Rawson, K. A., Marsh, E. J., Nathan, M. J., & Willingham, D. T. (2013). Improving students’ learning with effective learning techniques. Psychological Science in the Public Interest, 14(1), 4–58. doi:10.1177/1529100612453266
A review of ten different learning techniques that students can use on their own. Finds that practice tests and distributed practice (spacing practice over a longer period of time) have high utility; that elaborative interrogation, self-explanation, and interleaved practice have moderate utility; and that there is little evidence for benefits of summarization, highlighting, mnemonics, imagery, and rereading.
Pashler, H., McDaniel, M., Rohrer, D., & Bjork, R. (2008). Learning styles: Concepts and evidence. Psychological Science in the Public Interest, 9(3), 105–119. doi:10.1111/j.1539-6053.2009.01038.x
It’s tempting to worry about students’ “learning styles”: whether they prefer visual, auditory, textual, or other explanation strategies. Turns out there’s no evidence that they learn better if they’re taught in the style they prefer.
Kirschner, P. A., & De Bruyckere, P. (2017). The myths of the digital native and the multitasker. Teaching and Teacher Education, 67, 135–142. doi:10.1016/j.tate.2017.06.001
No, “digital native” students do not have magical technology skills – they mostly use technology for passive consumption – and do not have magical abilities to multitask. In fact, those who are “good” at multitasking do worse at the tasks; the brain is single-core with a high context-switching overhead. A review of various studies in education.
Husmann, P. R., & O’Loughlin, V. D. (2018). Another nail in the coffin for learning styles? Disparities among undergraduate anatomy students’ study strategies, class performance, and reported VARK learning styles. Anatomical Sciences Education. doi:10.1002/ase.1777
A study of a large undergraduate course found that “student performance in anatomy was not correlated with their score in any VARK [visual, auditory, reading/writing, and kinesthetic] categories” and that “students did not report study strategies that correlated with their VARK assessment”. In other words, students did not seem to study using strategies that matched their preferred learning style, and their preferred learning style did not correlate with their course performance.
Shepherd, M. D., & Sande, C. C. van de (2014). Reading mathematics for understanding—from novice to expert. The Journal of Mathematical Behavior, 35, 74–86. doi:10.1016/j.jmathb.2014.06.003
An interesting think-aloud study exploring how graduate students and professors read advanced mathematical texts. More experienced readers “read the meaning”, meaning they don’t always read every mathematical detail aloud and instead decode them on the fly. Compared to an earlier paper doing a similar procedure with undergrads, found that experts are more likely to frequently check their understanding, come up with examples to test their understanding, and carefully examine given examples to try to understand them.
Singer, L. M., & Alexander, P. A. (2017). Reading on paper and digitally: What the past decades of empirical research reveal. Review of Educational Research. doi:10.3102/0034654317722961
A slightly odd systematic review of the literature on reading in print vs. electronically. Spends most of its time discussing whether the reviewed studies adequately defined terms or reported all details of their experiments, but does conclude “when longer texts are involved or when individuals are reading for depth of understanding and not solely for gist, print appears to be the more effective processing medium”.
Singer, L. M., & Alexander, P. A. (2016). Reading across mediums: Effects of reading digital and print texts on comprehension and calibration. The Journal of Experimental Education. doi:10.1080/00220973.2016.1143794
I don’t like any paper that uses “concomitantly” in a sentence. Regardless, a within-subjects design presenting college students either print or digital texts, followed by short-answer comprehension questions. 69% of participants thought they did best on the comprehension questions when they read the digital text, but average scores were higher for the print text. Students are apparently not good judges of which medium works well for them. (There is a chi-squared test in Table 3 justifying this conclusion that I don’t understand; what exactly is presented in the table? The text says 69% preferred digital, but the table shows 69/90 = 77% did.)
See also Cognitive task analysis on methods for figuring out how experts solve problems.
Chi, M. T. H., Feltovich, P. J., & Glaser, R. (1981). Categorization and representation of physics problems by experts and novices. Cognitive Science, 5(2), 121–152. doi:10.1207/s15516709cog0502_2
Asked experts and novices to categorize physics problems into groups. Experts grouped problems by the physical concepts involved – conservation of momentum, Newton’s laws, etc. – and novices tended to use surface features, like inclined planes or springs. Interviews with experts and novices asking for their “basic approach” to each problem showed that experts talk about “the major principles they would apply to solve the problems”, which were quite consistent, while novices “were unable to produce any but the most general kind of abstracted solution methods”, giving vague descriptions like “I think of formulas that give their relationships”. Discusses the differences this reveals between how novices perceive and represent problems and how experts do.
Van Heuvelen, A. (1991). Learning to think like a physicist: A review of research‐based instructional strategies. American Journal of Physics, 59(10), 891–897. doi:10.1119/1.16667
Introductory physics students think in a completely different way from expert physicists: students try to match problems to formulas, rather than building qualitative understanding. An example:
The test included a conservation of energy problem in which a spring launched an object up into the air. For many of the students, this was a spring problem. They searched their minds for spring equations. Over 50% of the students used the most recent “spring” equation they had encountered – an equation for simple harmonic motion… For them, the final test was an effort to find an equation to solve spring problems, inclined plane problems, cable problems, and so forth.
Students don’t use the diagrams and qualitative reasoning used by the instructors (like free-body diagrams), possibly because they “do not understand the meaning of basic quantities and concepts that are represented in the diagrams” and because they rarely get in-class practice doing so.
Atkinson, R. K., Derry, S. J., Renkl, A., & Wortham, D. (2000). Learning from examples: Instructional principles from the worked examples research. Review of Educational Research, 70(2), 181–214. doi:10.3102/00346543070002181
A review of research on student learning from worked examples. Finds they are beneficial for students, particularly before they have developed structural understanding of problems – the phase when they often match problems to solution strategies using surface features. Worked examples can illustrate how an expert would recognize the deeper features. One can also prompt students to self-explain, e.g. by leaving blanks in the solutions. There are caveats: many worked examples are insufficiently detailed, and many students study from them ineffectively.
There’s a lot of emphasis on having students work problems to improve their understanding, and it’s folk wisdom that students don’t really understand until they do homework problems. But:
Rohrer, D., Dedrick, R. F., Hartwig, M. K., & Cheung, C.-N. (2019). A randomized controlled trial of interleaved mathematics practice. Journal of Educational Psychology. doi:10.1037/edu0000367
Assigned 7th-grade math classrooms to either complete “blocked” worksheets (meaning they cover only one type of problem) or “interleaved” worksheets that spread problem types over multiple weeks, so students have to recognize which kind of problem they’re solving and which methods are suitable. Students got feedback on their worksheet answers after completing them. Appears to be a well-designed and well-powered study. Found a d = 0.83 improvement on a comprehensive test at the end, suggesting that interleaved practice has a major effect on outcomes.
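To get a feel for what d = 0.83 means, one common translation is the “probability of superiority”: the chance that a randomly chosen student from the interleaved group outscores a randomly chosen student from the blocked group. This is only a rough sketch. It assumes normally distributed scores with equal variances in both groups, which the paper may not claim, and the function name is my own.

```python
from math import sqrt
from statistics import NormalDist


def probability_of_superiority(d):
    """Chance that a random treated student beats a random control student.

    Assumes both groups are normal with equal variance, so the difference
    of two independent scores has mean d and standard deviation sqrt(2)
    (in units of the common standard deviation).
    """
    return NormalDist().cdf(d / sqrt(2))


# d = 0.83 is the effect size reported in the study; under the assumptions
# above this comes out to roughly 0.72.
print(round(probability_of_superiority(0.83), 2))
```

Under these assumptions, an effect of zero would give exactly 0.5, a coin flip between the two groups.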
Kim, E., & Pak, S.-J. (2002). Students do not overcome conceptual difficulties after solving 1000 traditional problems. American Journal of Physics, 70(7), 759–765. doi:10.1119/1.1484151
Looks at 27 first-year students in a physics class at Seoul National University. Because of the Korean examination system, each had solved an average of 1,500 physics practice problems while preparing for the entrance exam; however, their conceptual understanding was still quite poor, and they were not able to connect the math back to the concepts.
The small sample size and lack of any experimental control limit the conclusions that can be drawn here.
Byun, T., & Lee, G. (2014). Why students still can’t solve physics problems after solving over 2000 problems. American Journal of Physics, 82(9), 906–913. doi:10.1119/1.4881606
An expanded version with 49 students, still finding a lack of relationship between extensive problem-solving and conceptual understanding.
Webb, D. J. (2017). Concepts first: A course with many improved educational outcomes as well as parity for underrepresented minority groups. American Journal of Physics, 85(8), 628–632. doi:10.1119/1.4991371
An experiment with a “concepts-first” physics class, in which the first portion of the semester is spent entirely on concepts (using active learning), followed by a few weeks learning the algebra to apply the concepts to actual problems. Students in this course did better on the final exam and better on the FCI than an ordinary active learning class and a lecture-based class, and underrepresented minorities also reached parity with the rest of the class.
It would be nice if students were intrinsically motivated to work hard, but often they are not. What can we do to motivate them?
Chevalier, A., Dolton, P., & Lührmann, M. (2017). “Making it count”: Incentives, student effort and performance. Journal of the Royal Statistical Society: Series A (Statistics in Society), 181(2), 323–349. doi:10.1111/rssa.12278
For a large intro economics class, students were given online quizzes after each lecture, with immediate feedback on their performance and explanations of the answers. Perhaps unsurprisingly, they find “that tournament incentives [a prize for the top score] and participation incentives [the quiz unlocks access to exercise solutions] are ineffective in increasing quiz participation. In contrast, making the quiz count towards the final grade [even just a few percent] substantially increases participation.” This effect was strongest “for students at and below median ability, resulting in a reduction of the grade gap by 17%”. They found no evidence that quiz participation increased to the detriment of participation in other activities or during other weeks of the course.
PhysPort has some resources on encouraging student engagement in active learning, since active learning doesn’t work unless the students are active.
Peer Instruction is a teaching method from Eric Mazur and colleagues at Harvard, designed and tested on introductory physics courses. It’s a flipped classroom approach that tries to give students many opportunities to screw up and be corrected, rather than letting their misunderstandings persist until exams.
Unfortunately I haven’t seen studies testing peer instruction in statistics courses. (Please send me some, if they exist!) In physics, however, it has dramatic results, doubling student learning:
More generally, “active learning” strategies seem much more effective when compared to traditional lecturing:
Kishimoto, C. T., Anderson, M. G., & Salamon, J. P. (2018). Flipping the large-enrollment introductory physics classroom. https://arxiv.org/abs/1807.03850
Interesting description of active learning applied to a large (100+ students) physics class with only one TA, so methods requiring lots of instructor time are ruled out. Uses in-class worksheets which are carefully scaffolded to build up the concepts, with occasional clicker questions and mini-lectures, and student reading before class. Shows some modest improvements in learning over lecture-based classes.
Alten, D. C. D. van, Phielix, C., Janssen, J., & Kester, L. (2019). Effects of flipping the classroom on learning outcomes and satisfaction: A meta-analysis. Educational Research Review, 28, 100281. doi:10.1016/j.edurev.2019.05.003
A meta-analysis of flipped classroom research. Finds a Hedges’ g of 0.36 for learning outcomes for flipped classes, so flipped classes improve learning outcomes by about a third of a standard deviation on average. There was a lot of variation, though – in some cases the flipped class was worse. Reducing classroom time in the flipped class has a negative effect, while having quizzes has a positive effect (presumably by improving engagement with the readings/videos?).
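For reference, Hedges’ g is the standardized mean difference (Cohen’s d with a pooled standard deviation) multiplied by a small-sample bias correction. Here is a minimal sketch using the usual approximate correction factor 1 − 3/(4(n1 + n2) − 9). The function name and the exam scores are hypothetical, invented for illustration, and the meta-analysis may have computed g differently for individual studies.

```python
import math
from statistics import mean, variance


def hedges_g(treatment, control):
    """Standardized mean difference with small-sample bias correction."""
    n1, n2 = len(treatment), len(control)
    # Pooled variance: sample variances weighted by degrees of freedom.
    pooled_var = (
        (n1 - 1) * variance(treatment) + (n2 - 1) * variance(control)
    ) / (n1 + n2 - 2)
    cohens_d = (mean(treatment) - mean(control)) / math.sqrt(pooled_var)
    # Approximate correction; shrinks d slightly, most for small samples.
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    return cohens_d * correction


# Hypothetical exam scores, not taken from any study in the meta-analysis.
flipped_scores = [72, 85, 78, 90, 66, 81]
lecture_scores = [70, 75, 68, 82, 64, 73]
print(hedges_g(flipped_scores, lecture_scores))
```

With large samples the correction factor is very close to 1, so g and d are nearly identical; the distinction mostly matters for the small studies that meta-analyses often include.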
Intro stats labs often take the form “Here’s a simulation of phenomenon X [the central limit theorem, sampling distributions, …]. Press this button and see what happens.”
In physics, lecture demonstrations have a similar tenor: “Here’s this apparatus, now watch what happens when I press the button.”
However, this doesn’t teach students anything unless they predict the behavior in advance, so they have a chance to realize they’re wrong: Crouch, C., Fagen, A. P., Callan, J. P., & Mazur, E. (2004). Classroom demonstrations: Learning tools or entertainment? American Journal of Physics, 72(6), 835–838. doi:10.1119/1.1707018
Further, physics demos are often misinterpreted by students, who misremember the outcome of the demo in ways consonant with their misconceptions. Asking them to predict the outcome in advance (assuming they have learned the conceptual framework needed to do so) reduces this problem dramatically: Miller, K., Lasry, N., Chu, K., & Mazur, E. (2013). Role of physics lecture demonstrations in conceptual learning. Physical Review Special Topics - Physics Education Research, 9(2). doi:10.1103/physrevstper.9.020113
Wieman, C. E., Perkins, K. K., & Adams, W. K. (2008). Oersted medal lecture 2007: Interactive simulations for teaching physics: What works, what doesn’t, and why. American Journal of Physics, 76(4), 393–399. doi:10.1119/1.2815365
An excellent overview of the lessons learned in PhET, a project to build interactive physics simulations. Their demos require $10,000–40,000 to build, with a team of faculty, programmers, and science education specialists, with extensive testing with real students to ensure the demos meet their goals. They discuss several cases where apparently well-designed demos were leading the students to misconceptions, and how they found these problems and eliminated them. This process is discussed in more detail in the two papers below:
Adams, W. K. et al. (2008a). A study of educational simulations part 1 – engagement and learning. Journal of Interactive Learning Research, 19(3), 397–419.
Adams, W. K. et al. (2008b). A study of educational simulations part II – interface design. Journal of Interactive Learning Research, 19(4), 551–577.
Holmes, N., Olsen, J., Thomas, J. L., & Wieman, C. E. (2017). Value added or misattributed? A multi-institution study on the educational benefit of labs for reinforcing physics content. Physical Review Physics Education Research, 13(1), 010129. doi:10.1103/PhysRevPhysEducRes.13.010129
Measures differences in final exam scores between students who took an optional physics lab section and those who didn’t, and finds no meaningful difference. (Though are final exam scores a good way to measure conceptual understanding?) Makes a very good point about labs:
First, goals to reinforce content often come hand-in-hand with increased structure, as it becomes important for students to observe a particular “correct” result. When one examines the cognitive activities in which students are engaged while completing such lab course activities, they are dominated by following instructions to collect specified data using unfamiliar equipment, and following specified procedures to analyze the data and write up reports in a specified format. Although the relevant physics concepts were central to the thinking of the instructor that designed and built the experiments, those concepts get little, if any, attention from the student carrying out the assigned activities using that apparatus.
Labs should be restructured to eliminate extraneous cognitive load and focus more strongly on the physics concepts.
Wilcox, B. R., & Lewandowski, H. (2017). Developing skills versus reinforcing concepts in physics labs: Insight from a survey of students’ beliefs about experimental physics. Physical Review Physics Education Research, 13(1), 010108. doi:10.1103/physrevphyseducres.13.010108
Uses the Colorado Learning Attitudes about Science Survey for Experimental Physics (E-CLASS) to see what students think about physics after taking a lab course. Found that students in skills-based courses, in which students chose their own analysis methods and had to figure out their setup and apparatus, had “more expertlike postinstruction responses and more favorable shifts” than students in concepts-based courses that used the labs to verify concepts taught in lectures.
This seems to contradict the Holmes et al. (2017) paper above, but perhaps not too much. Holmes et al. note that many labs involve following rote instructions and writing reports in specified formats, with little time actually thinking about physics. It’s possible that in skills-based labs, students also don’t spend much time thinking about physics – E-CLASS doesn’t measure that – but develop better attitudes about how physics works, rather than just thinking labs exist to get the “right” answer that matches whatever they were taught in class.
Kulik, C.-L. C., Kulik, J. A., & Bangert-Drowns, R. L. (1990). Effectiveness of mastery learning programs: A meta-analysis. Review of Educational Research, 60(2), 265–299. doi:10.3102/00346543060002265
Shows that mastery learning programs “raise final examination scores by about 0.5 standard deviations, or from the 50th to the 70th percentile”, with five key moderators:
Studies with large effect sizes are, first of all, likely to examine teaching in the social sciences rather than in mathematics, the natural sciences, or humanities. Second, the studies are likely to use locally developed rather than nationally standardized tests as criterion measures of student achievement. Third, the mastery programs in studies with large effect sizes require students to move through course material at the teacher’s pace, not at individual student rates. Fourth, the mastery programs in these studies also require students to perform at a high level on unit quizzes (e.g., 100% correct). And fifth, in studies that report strong effects, control students receive less quiz feedback than experimental students do. Equating the amount of quiz feedback for experimental and control students reduces the size of the mastery effect.
Point 2 is a bit concerning, in case the tests are biased toward the mastery class (e.g. the test is written to cover exactly what the mastery class does, not the control class), but the results are promising.
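The “0.5 standard deviations, or from the 50th to the 70th percentile” conversion can be checked with the standard normal distribution. The sketch below assumes exam scores are roughly normal, which is my assumption for the check and not something stated in the quote, and the function name is made up for illustration.

```python
from statistics import NormalDist


def percentile_after_shift(effect_size_sd, start_percentile=50.0):
    """Percentile reached by a student who starts at start_percentile
    and improves by effect_size_sd standard deviations, assuming
    normally distributed scores."""
    std_normal = NormalDist()
    z_start = std_normal.inv_cdf(start_percentile / 100)
    return 100 * std_normal.cdf(z_start + effect_size_sd)


# A 0.5 SD gain moves a median student to about the 69th percentile,
# consistent with the "about 0.5" and "70th percentile" in the quote.
print(round(percentile_after_shift(0.5), 1))
```

The same function shows that the gain in percentile terms depends on where a student starts: a 0.5 SD improvement moves a student near the median much further in percentile rank than one already near the top.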
Rajapaksha, A., & Hirsch, A. S. (2017). Competency based teaching of college physics: The philosophy and the practice. Physical Review Physics Education Research, 13(2), 020130. doi:10.1103/physrevphyseducres.13.020130
Describes an experimental competency-based intro physics class, in which students had certain defined competencies to master and could repeatedly take assessments to master them. Homeworks could also be resubmitted with revisions. Students proceeded through material at their own pace. Demonstrated major improvements in FCI scores.
Gutmann, B., Gladding, G., Lundsgaard, M., & Stelzer, T. (2018). Mastery-style homework exercises in introductory physics courses: Implementation matters. Physical Review Physics Education Research, 14, 010128. doi:10.1103/physrevphyseducres.14.010128
Describes an implementation of mastery learning in an intro physics class, and changes that had to be made when the initial implementation frustrated students and led them to adopt unproductive behaviors (like repeatedly guessing on quizzes until getting questions they had already seen the answers to). Shows that student attitudes are important; the point of mastery learning had to be introduced at the beginning of the semester to make students understand its purpose.
Interteaching comes from operant psychology, and frames itself as providing positive reinforcement for learning behavior instead of aversive consequences (like course failure). Seems focused on courses with lots of reading and conceptual material, like psychology, rather than math or hard sciences. Most research on interteaching seems to come from one group.
The basic idea – read before class, discuss in class, get targeted feedback – seems to match with Peer Instruction’s methods, with the addition of a “prep guide” and some different incentive systems.
Saville, B. K., Lambert, T., & Robertson, S. (2011). Interteaching: Bringing behavioral education into the 21st century. The Psychological Record, 61(1), 153–166.
Comprehensive description of interteaching and early experiments on its effectiveness. The method:
First, the teacher constructs a preparation (prep) guide consisting of questions designed to guide students through a reading assignment. The questions cover a range of formats, often proceeding from simpler definitional-type questions to more complex application and synthesis questions… The teacher then distributes the prep guide to students (e.g., via a course Web page), who then have several days to complete the prep-guide items before class. In class, students first hear a brief clarifying lecture that reviews selected material from the previous class period. After the lecture, students form pairs to discuss the prep guide… If students discuss the material thoroughly, the pair discussions should last approximately two thirds of the class period… After students have discussed the prep guide thoroughly, they complete a record sheet, which provides the teacher with feedback on how the discussions went and which material was difficult to understand.
The method also involves frequent testing (“at least five times per semester”), plus participation points and a small incentive grade from your partner’s exam performance: if you discuss a prep guide with someone and one of its questions later appears on the exam, you get a few points if both of you do well on that exam question.
Henderson, C., Yerushalmi, E., Kuo, V. H., Heller, P., & Heller, K. (2004). Grading student problem solutions: The challenge of sending a consistent message. American Journal of Physics, 72(2), 164–169. doi:10.1119/1.1634963
Asks instructors to grade some example student solutions to physics problems and then interviews them on their grading decisions. Finds that
The latter two points contradict the first: students clearly benefit by giving solutions without reasoning, since instructors are less likely to deduct points for incorrect reasoning if the reasoning is left unclear. Suggests placing the burden of proof on students: answers will not be accepted unless their reasoning is justified. I would have liked to see an experiment or analysis of student solutions to see if students are, consciously or not, following these incentives to show less reasoning.