Below are all of my presentations, including slides and video when available. You might also be interested in my peer-reviewed publications.
Delphi COVIDcast: Publishing early indicators of COVID-19, from massive surveys and medical data. Seminar, Department of Biostatistics, University of Pittsburgh. August 20, 2020.
Since the beginning of the COVID-19 pandemic, a crucial problem has been obtaining timely, high-quality data about its spread, ideally at the county level. In this presentation, I’ll give an overview of Delphi’s COVIDcast project, which now maintains over 500 million observations, mostly daily, county-level signals. Some of this data comes from public sources, such as reports of confirmed cases and deaths, but much of it is unique to Delphi. Using examples from medical data, I’ll illustrate how this data gives unprecedented insight into the pandemic, and show how the COVIDcast API makes it possible for any researcher to benefit from it.
After discussing the project generally, I’ll focus on one data source in particular: an online survey conducted in partnership with Facebook and the University of Maryland. The survey reaches 70,000 respondents daily in the United States, asking about symptoms, COVID testing, social contact, mental health, and many other items; with 10 million responses in the United States and many more outside it, it represents the largest research survey ever conducted outside a national census. Aggregate data from the survey is publicly available and is used in Delphi’s COVID case forecasts; individual responses are available to researchers. I’ll discuss insights gained from the data, potential areas for future research, and how researchers can get access to pursue their own research goals using the surveys.
Support for COVID-19 Research through the Symptom Surveys (with Curtiss Cobb and Frauke Kreuter).
UIDP Webinar, June 23, 2020.
The COVID-19 symptom surveys are designed to help researchers better monitor and forecast the spread of COVID-19. In partnership with the University of Maryland and Carnegie Mellon University, Facebook invites its users to take surveys conducted by these two partner universities, in which they self-report COVID-19-related symptoms. The surveys may be used to generate new insights on how to respond to the crisis, including heat maps of self-reported symptoms. This information may help health systems plan where resources are needed and potentially when, where, and how to reopen parts of society.
Using Cognitive Task Analysis to Uncover Misconceptions in Statistical Inference Courses (with Mikaela Meyer and Josue Orellana).
eCOTS 2020, May 19, 2020. Video poster presentation.
When taking an introductory undergraduate statistical inference course, students struggle to approach problems in the ways experts do. However, articulating how experts solve problems is difficult, and instructors might not be able to detect the misconceptions students harbor when solving these problems. To enable research into student learning, we propose combining cognitive task analysis, a research method from cognitive science, and think-aloud interviews with graduate and undergraduate students to better understand the steps students and experts take to solve statistical inference questions. After using cognitive task analysis to break down the discrete cognitive skills needed to solve simple mathematical statistics problems, we used think-aloud interviews to determine how students and experts applied these skills and to identify skills students lack. In this presentation, we will discuss our analysis of coded think-aloud interview transcripts and our cognitive task analysis of problems where students must identify the relevant variable to operate on, then apply mathematical rules to obtain their answer.
Exploring how students reason about correlation and causation (with Ciaran Evans, Philipp Burckhardt, Rebecca Nugent, and Gordon Weinberg).
eCOTS 2020, May 19, 2020. Video poster presentation.
Introductory statistics students find it notoriously difficult to reason about correlation and causation: when does correlation imply causation, and when does it not? Through think-aloud interviews with students and assessment results, we have seen that while students often recognize that correlation does not necessarily imply causation, they usually fail to understand why. To aid student understanding, we introduce a new activity to our introductory statistics labs that explores correlation vs. causation through simple causal diagrams. In this poster, we present preliminary results from this lab activity, along with findings from our think-aloud research, to shed light on how students reason about this fundamental statistical concept.
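The core idea behind such a causal diagram can be illustrated with a small simulation (a hypothetical example of my own, not taken from the lab activity itself): a confounder Z drives both X and Y, so X and Y are strongly correlated even though neither causes the other, and the association vanishes once Z is accounted for.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Confounder Z causes both X and Y; X has no effect on Y.
z = rng.normal(size=n)
x = 2 * z + rng.normal(size=n)
y = -3 * z + rng.normal(size=n)

# X and Y are strongly (negatively) correlated...
r_xy = np.corrcoef(x, y)[0, 1]

# ...but after removing Z's contribution (using the true coefficients,
# for simplicity), the residual correlation is essentially zero.
rx = x - 2 * z
ry = y + 3 * z
r_partial = np.corrcoef(rx, ry)[0, 1]

print(f"corr(X, Y) = {r_xy:.2f}, partial corr given Z = {r_partial:.2f}")
```

A regression or partial-correlation version of the same check, adjusting for Z with estimated rather than true coefficients, gives the same qualitative answer.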
Using think-aloud interviews to assess student understanding of statistics concepts (with P Burckhardt, P W Elliott, C Evans, K Lin, A Luby, M Meyer, J Orellana, R Yurko, G Weinberg, J Wieczorek, and R Nugent).
USCOTS 2019, May 16-18, 2019. Breakout session.
Assessing student understanding of statistics concepts is quite difficult: conceptual questions are hard to write clearly, students often interpret questions in unexpected ways, and they may choose answers (even the correct one) for unexpected reasons. This makes it hard to assess student learning of concepts, but as we continuously improve our introductory statistics courses, we need tools to measure what students actually understand.
In this breakout session, we will report on a year-long exploratory project to build an assessment using a powerful tool: think-aloud interviews. Audience members will learn to use think-aloud interviews to elicit student misconceptions and revise assessment questions, providing a practical method they can use in their own courses and research to better assess student learning. We will then share surprising misconceptions discovered during our own round of 36 student interviews, and summarize our assessment’s results from several hundred students in several introductory courses.
Identifying misconceptions of introductory data science using a think-aloud protocol (with S Hyun, P Burckhardt, P Elliott, C Evans, K Lin, A Luby, C P Makris, J Orellana, J Wieczorek, R Yurko, G Weinberg, and R Nugent).
eCOTS 2018, May 23, 2018. Video poster presentation.
Think-aloud interviews can provide assessment designers with insights into student thinking that may not be clear from test responses alone. We present results from preliminary rounds of think-aloud interviews with introductory students and describe surprising misconceptions we have identified, along with insights from our experience designing assessments and performing think-aloud interviews.
A Spatio-Temporal Statistical Model of Crime Hotspots (with Daniel S. Nagin).
American Society of Criminology Annual Meeting, Philadelphia, PA, November 17, 2017.
The concentration of crime into small “hotspots” has been widely observed across many different cities and types of crimes. Current tools to understand the causes and dynamics of crime hotspots are limited. A variety of mapping tools, for example, have been proposed to detect hotspots in crime data, but these tools cannot connect detected clusters to the events or covariates that may have caused them. Separately, methods such as Risk Terrain Modeling attempt to identify spatial features that predict crime rates, such as gang territories, bars, or poverty, but consider only chronic hotspots, not accounting for temporary flare-ups.
We propose a statistical model that accounts for spatial and temporal variation in crime by modeling both spatial features and near-repeat and retaliatory crimes. This allows it to capture the birth and death of crime hotspots and the reasons they appear, and to statistically test hypotheses about each predictive variable. We demonstrate the model on a large dataset of crimes in Pittsburgh, Pennsylvania, showing its utility in understanding the dynamics of crime.
Point process modeling with spatiotemporal covariates for predicting crime (with Joel Greenhouse and Xizhen Cai).
JSM 2016, Chicago, IL, August 3, 2016.
Extensive research has shown that crime tends to be concentrated in hotspots: small pockets with above-average rates of crime. Criminologists and law enforcement agencies want to better predict crime hotspots and understand the factors that cause them, in order to target interventions. Prior research suggests that past crime hotspots, spatial features (like bus stops or bars), and leading indicators (like 911 calls) are all predictive of future crime, but no proposed predictive policing model can account for all of these factors. We have adapted a previous self-exciting point process model to incorporate past crime data, leading indicators, spatial features, and spatial covariates (like population density or zoning data), and developed new tools to evaluate the performance of the model and select variables. We present the basic model and demonstrate its application to seven years of Pittsburgh crime data, comparing its fits to previous hotspot models. These results can be used to better guide crime prevention programs and police patrols.
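To give a feel for the self-exciting idea, here is a minimal sketch of a Hawkes-style conditional intensity with a constant background rate and a Gaussian spatial kernel. All parameter names and values here are illustrative assumptions, not the model or estimates from the talk; the actual model replaces the constant background with functions of spatial features and leading indicators.

```python
import math

def intensity(t, x, y, past_events, mu=0.5, theta=0.2, omega=1.0, sigma=0.3):
    """Conditional intensity lambda(t, x, y) of a toy self-exciting process.

    mu      -- constant background rate (illustrative; a real model would
               make this a function of spatial covariates)
    theta   -- expected number of "offspring" events triggered per event
    omega   -- temporal decay rate of the triggering effect
    sigma   -- spatial bandwidth of the triggering kernel
    """
    rate = mu
    for t_i, x_i, y_i in past_events:
        if t_i < t:
            dt = t - t_i
            d2 = (x - x_i) ** 2 + (y - y_i) ** 2
            # Exponential decay in time, Gaussian decay in space:
            rate += (theta * omega * math.exp(-omega * dt)
                     * math.exp(-d2 / (2 * sigma ** 2))
                     / (2 * math.pi * sigma ** 2))
    return rate

# A recent crime at (1, 1) raises the intensity nearby above background:
events = [(0.0, 1.0, 1.0)]
print(intensity(0.1, 1.0, 1.0, events) > 0.5)  # prints True: near-repeat boost
```

Fitting such a model amounts to estimating the background and triggering parameters by maximum likelihood; the near-repeat boost is what lets it represent temporary flare-ups rather than only chronic hotspots.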
Statistics Done Wrong: Pitfalls in Experimentation.
LASER Workshop, Washington, DC, October 16, 2014. Video.
Most research relies on statistical hypothesis testing to report its conclusions, but the seeming precision of statistical significance actually hides many possible biases and errors. The prevalence of these errors suggests that most published results are exaggerated or false. I will explain some of these errors, such as inadequate sample sizes, multiple comparisons, and the keep-looking bias, and their impact on published results. Finally, I will suggest solutions to these problems, including statistical improvements and changes to scientific funding and publication practices.
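The multiple-comparisons problem in particular is easy to demonstrate with a few lines of simulation (a minimal sketch of my own, not material from the talk): if every hypothesis tested is truly null, testing 20 of them at the 0.05 level still yields at least one false positive about two-thirds of the time.

```python
import random

random.seed(0)
alpha, m, trials = 0.05, 20, 20_000

# Analytic family-wise error rate for m independent tests of true nulls:
fwer = 1 - (1 - alpha) ** m  # about 0.64 for m = 20

# Monte Carlo check: under the null, each test rejects with probability alpha.
hits = sum(
    any(random.random() < alpha for _ in range(m))
    for _ in range(trials)
)
print(f"analytic FWER = {fwer:.2f}, simulated = {hits / trials:.2f}")
```

Corrections such as Bonferroni (testing each hypothesis at alpha/m) trade power for control of exactly this inflated error rate.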