See also Pedagogy, Think-aloud interviews, Item response theory.

Cognitive task analysis is a way of improving our understanding of how students and experts perform tasks, so we can determine what specifically is lacking in student thinking and perhaps devise interventions to encourage students to think more like experts. It typically involves theorizing the steps involved in performing a task, then using think-aloud interviews with students and experts to determine what cognitive skills real people use.

Lovett, M. C. (1998). Cognitive task analysis in service of intelligent tutoring system design: A case study in statistics. In *International conference on intelligent tutoring systems* (pp. 234–243). Springer Berlin Heidelberg. doi:10.1007/3-540-68716-5_29

A good overview of cognitive task analysis. It can be theoretical (drawing a diagram of which steps we think should be involved in a specific task) or empirical (using think-alouds to study how students or experts complete a task). Lovett gives examples from a study of exploratory data analysis, where students skip several steps (like identifying the types of the variables and the appropriate analyses) that experts perform routinely. (That study is explored in more depth here: Lovett, M. (2001). A collaborative convergence on studying reasoning processes: A case study in statistics. In S. M. Carver & D. Klahr (Eds.), *Cognition and instruction: Twenty-five years of progress*. Psychology Press.)

Feldon, D. F. (2007). The implications of research on expertise for curriculum and pedagogy. *Educational Psychology Review*, *19*(2), 91–110. doi:10.1007/s10648-006-9009-0

Why is cognitive task analysis important? Basically: experts can’t articulate their actual problem solving strategies. Reviews research on the nature of expertise, showing that a lot of expert thinking is essentially automated, and experts “fabricate consciously reasoned explanations for their automated behaviors” which do not necessarily correlate with what they actually do. Gives an example from nursing, where careful interviews found that “more than one-third of the individual cues (25 out of 70) used to correctly diagnose infants across the most commonly reported form of infant distress were not listed in any of the existing medical research or training literature” – that is, expert nurses used reasoning which was sufficiently automated that it had never before been written down. However, when cognitive task analyses are used to elicit reasoning from experts in detail, the results can be used to teach novices much more effectively than when experts teach without them.

Tofel-Grehl, C., & Feldon, D. F. (2013). Cognitive task analysis–based training. *Journal of Cognitive Engineering and Decision Making*, *7*(3), 293–304. doi:10.1177/1555343412474821

A review of studies on applying cognitive task analysis to develop training and teaching materials, showing that when training is based on a thorough cognitive task analysis, student learning improves. There are some limitations based on the quality and reporting of the underlying studies, but this still is a very good set of references.

Rios, L., Pollard, B., Dounas-Frazer, D. R., & Lewandowski, H. (2019, January). Using think-aloud interviews to characterize model-based reasoning in electronics for a laboratory course assessment. https://arxiv.org/abs/1901.02423

This paper never actually uses the term “cognitive task analysis”, but it really is one. The authors explore the “Modeling Framework for Experimental Physics” describing how “physicists revise their models to account for the newly acquired observations, or change their apparatus to better represent their models”, and compare the framework to how students actually solve lab problems in think-aloud interviews to draw conclusions about how students reason experimentally. A good practical example.

Wilcox, B. R., Caballero, M. D., Rehn, D. A., & Pollock, S. J. (2013). Analytic framework for students’ use of mathematics in upper-division physics. *Physical Review Special Topics - Physics Education Research*, *9*(2). doi:10.1103/physrevstper.9.020119

Introduces the ACER framework:

In order to solve the back-of-the-book or exam-type problems that ACER targets, one must determine which mathematical tool is appropriate (activation) and construct a mathematical model by mapping the particular physical system onto appropriate mathematical tools (construction). Once the mathematical model is complete, there is often a series of mathematical steps that must be executed in order to reduce the solution into a form that can be readily interpreted (execution). This final solution must then be interpreted and checked to ensure that it is consistent with known or expected results (reflection).

The framework was based on experts solving problems and documenting their steps. Analyzed student solutions to physics exam problems to see which steps they struggled with the most, noting that students struggled to activate the right solution method and to construct the right mathematical expression for the situation. Most students do not reflect on solutions unless prompted.

Wilcox, B. R., & Corsiglia, G. (2019). A cross-context look at upper-division student difficulties with integration. https://arxiv.org/abs/1908.00474

Applies the ACER framework to problems requiring integration in upper-division physics classes, such as integrating over charge distributions in electrostatics. Used both student exam solutions and interviews where pairs of students worked together to solve problems. Again, found that students struggled to activate the right solution method and identified specific problems students had with constructing the math (such as picking the differential element or the right vector between charge and particle experiencing the charge). Reflection again was rare without prompting.

One can model student responses to questions in terms of the underlying skills required to answer those questions, and estimate which skills each student possesses. You might also want to “find” the underlying skills for each question automatically, by processing a bunch of response data and trying to find underlying latent skills. The models to do this are generally known as cognitive diagnosis models (CDMs) or diagnostic classification models (DCMs).

George, A. C., Robitzsch, A., Kiefer, T., Groß, J., & Ünlü, A. (2016). The R package CDM for cognitive diagnosis models. *Journal of Statistical Software*, *74*(2). doi:10.18637/jss.v074.i02

A good overview of cognitive diagnosis models (CDMs), in which each question requires a set of cognitive skills to complete and the probability of answering correctly is a function of the cognitive skills held by the student. Used to establish skill profiles for students.

CDMs involve three matrices: $X$, $Q$, and $\alpha$. $X$ is the response matrix, so $X_{si} = 1$ if student $s$ answered item $i$ correctly. The skills required to answer the questions are broken down into $K$ discrete skills, and $Q$ is the skill matrix, where $Q_{ik} = 1$ if skill $k$ is relevant to solving item $i$. $\alpha$ is the matrix of skills mastered by students, so $\alpha_{sk} = 1$ if student $s$ has mastered skill $k$. CDMs take $Q$ as known and try to group students into skill “profiles” based on identical rows of $\alpha$.
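To make the roles of the three matrices concrete, here is a minimal sketch of one common CDM, the DINA model, which the notes above do not single out; it is used here only as an illustration. In DINA a student is expected to answer an item correctly only if they have mastered every skill the item requires, with per-item "slip" and "guess" probabilities allowing for errors and lucky guesses. The function names, the toy matrices, and the slip and guess values are all made up for this example.

```python
import numpy as np

def ideal_response(alpha, Q):
    """eta[s, i] = 1 if student s has mastered every skill that item i requires."""
    # alpha: (students, skills) and Q: (items, skills), both 0/1 matrices.
    # (1 - alpha) @ Q.T counts, for each student and item, the required
    # skills that the student has NOT mastered.
    missing = (1 - alpha) @ Q.T
    return (missing == 0).astype(int)

def dina_prob_correct(alpha, Q, slip, guess):
    """P(X[s, i] = 1) under the DINA model (an assumed choice of CDM)."""
    eta = ideal_response(alpha, Q)
    # Students with all required skills succeed unless they slip;
    # everyone else succeeds only by guessing.
    return np.where(eta == 1, 1.0 - slip, guess)

# Toy data, invented for illustration: 3 items, 2 skills, 3 students.
Q = np.array([[1, 0],    # item 1 needs skill 1
              [0, 1],    # item 2 needs skill 2
              [1, 1]])   # item 3 needs both skills
alpha = np.array([[1, 1],   # student 1 has both skills
                  [1, 0],   # student 2 has only skill 1
                  [0, 0]])  # student 3 has neither
slip = np.array([0.1, 0.1, 0.2])     # per-item slip probabilities
guess = np.array([0.2, 0.25, 0.1])   # per-item guess probabilities

eta = ideal_response(alpha, Q)
probs = dina_prob_correct(alpha, Q, slip, guess)
print(eta)
print(probs)
```

Fitting a CDM runs this logic in reverse: given the observed $X$ and the known $Q$, it estimates the item parameters and the most plausible row of $\alpha$ for each student.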

Liu, J., Xu, G., & Ying, Z. (2013). Theory of self-learning Q-matrix. *Bernoulli*, *19*(5A), 1790–1817. doi:10.3150/12-bej430

Establishes conditions for the consistent estimation and unique identification of $Q$ from response data in different CDM settings, and suggests some (combinatorial) estimators.

Chen, Y., Liu, J., Xu, G., & Ying, Z. (2015). Statistical analysis of Q-matrix based diagnostic classification models. *Journal of the American Statistical Association*, *110*(510), 850–866. doi:10.1080/01621459.2014.934827

Extension of the previous results to a wider class of CDMs, with a new, more practical estimator for $Q$.

Xu, G., & Shang, Z. (2018). Identifying latent structures in restricted latent class models. *Journal of the American Statistical Association*, *113*(523), 1284–1295. doi:10.1080/01621459.2017.1340889

Further extensions to the identifiability results.

Desmarais, M. C., & Naceur, R. (2013). A matrix factorization method for mapping items to skills and for enhancing expert-based Q-matrices. In *International Conference on Artificial Intelligence in Education* (pp. 441–450). doi:10.1007/978-3-642-39112-5_45

An attempt to build a semi-automated way to deduce the $Q$ matrix from data, instead of taking it as given. Based on using nonnegative matrix factorization to approximate $X \approx Q S$, where $S$ is the student skill matrix; uses an iterative updating procedure to improve upon an expert-suggested $Q$ based on observed data. Tested on a dataset of only 20 items, 536 respondents, and 8 skills; their factorization seems to improve slightly upon the expert-suggested $Q$, but the dataset isn’t big enough to make the effect dramatic.
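As a rough illustration of the factorization idea, the sketch below uses the standard multiplicative-update rule for nonnegative matrix factorization, optionally starting from an expert-suggested Q. It is not the authors' procedure: the update rule, the `nmf` function, the small offset added to the expert Q (so that zero entries are able to change), and the toy data are all assumptions made for this example. For the shapes to conform, the response matrix here is arranged as items × students.

```python
import numpy as np

def nmf(X, n_skills, Q_init=None, n_iter=200, seed=0, eps=1e-9):
    """Approximate X (items x students) by Q (items x skills) @ S (skills x students)."""
    rng = np.random.default_rng(seed)
    n_items, n_students = X.shape
    if Q_init is None:
        Q = rng.random((n_items, n_skills)) + 0.1
    else:
        # Multiplicative updates can never move an entry away from exactly 0,
        # so nudge the expert matrix slightly to let the data revise it.
        Q = np.asarray(Q_init, dtype=float) + 0.01
    S = rng.random((n_skills, n_students)) + 0.1
    for _ in range(n_iter):
        # Multiplicative updates for the squared-error objective;
        # both factors stay nonnegative throughout.
        S *= (Q.T @ X) / (Q.T @ Q @ S + eps)
        Q *= (X @ S.T) / (Q @ S @ S.T + eps)
    return Q, S

# Toy data, invented for illustration: 4 items, 5 students, 2 skills.
Q_expert = np.array([[1, 0],
                     [0, 1],
                     [1, 1],
                     [1, 0]])
X = np.array([[1, 0, 1, 1, 0],
              [0, 1, 1, 0, 0],
              [1, 1, 1, 1, 0],
              [1, 0, 1, 1, 0]], dtype=float)

Q_hat, S_hat = nmf(X, n_skills=2, Q_init=Q_expert)
error = np.linalg.norm(X - Q_hat @ S_hat)
print(np.round(Q_hat, 2))
print("reconstruction error:", round(float(error), 3))
```

The fitted `Q_hat` is real-valued, so some thresholding or comparison against the expert matrix would still be needed to read it as a 0/1 skill assignment.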

Matsuda, N., Furukawa, T., Bier, N., & Faloutsos, C. (2015). Machine beats experts: Automatic discovery of skill models for data-driven online course refinement. In *Proceedings of the 8th International Conference on Educational Data Mining* (pp. 101–108).

Another way to deduce $Q$, using both student response data and a bag-of-words analysis of the question texts, claiming improved prediction performance over expert-designed $Q$ matrices and also improved interpretability (because you can pick out words attached to the skills).

Zhang, C., Taylor, S. J., Cobb, C., & Sekhon, J. (2019). Active matrix factorization for surveys. arXiv. https://arxiv.org/abs/1902.07634

Framed in terms of survey design, but relevant here too. If you have a bunch of questions relevant to different skills, how do you actively choose the next question to show to a student to best improve your estimates of their skills profile? Uses matrix factorization to make an active learning method for choosing questions.

Silva, M. A. da, Liu, R., Huggins-Manley, A. C., & Bazán, J. L. (2018). Incorporating the Q-matrix into multidimensional item response theory models. *Educational and Psychological Measurement*, 001316441881489. doi:10.1177/0013164418814898

Connects CDMs to multidimensional IRT models (see Item response theory). MIRT models allow multiple latent traits for each student, and questions load on these latent traits; this makes the model involve quite a lot of parameters and quickly requires lots of data to estimate well. Proposes a MIRT model that uses a $Q$ matrix: where an entry of $Q$ is 0, the corresponding loading is forced to zero; otherwise it is estimated from the data. This reduces the dimensionality of the estimation problem.
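The effect of the Q matrix on the loadings can be sketched as a simple mask. The example below assumes a compensatory logistic MIRT form (a weighted sum of latent traits plus an intercept, passed through a logistic function), which the summary above does not specify; the function name and all numbers are invented for illustration.

```python
import numpy as np

def mirt_prob_correct(theta, loadings, intercepts, Q):
    """P(X[s, i] = 1) for an assumed compensatory logistic MIRT model with a Q mask."""
    # theta: (students, traits); loadings and Q: (items, traits); intercepts: (items,).
    masked = loadings * Q                   # loadings where Q is 0 are forced to zero
    logits = theta @ masked.T + intercepts  # (students, items)
    return 1.0 / (1.0 + np.exp(-logits))

# Toy values, invented for illustration: 3 items, 2 latent traits, 2 students.
Q = np.array([[1, 0],
              [0, 1],
              [1, 1]])
loadings = np.array([[1.2, 0.7],
                     [0.5, 0.9],
                     [0.8, 1.1]])
intercepts = np.array([0.0, -0.5, 0.3])
theta = np.array([[0.0, 0.0],
                  [1.0, -1.0]])

probs = mirt_prob_correct(theta, loadings, intercepts, Q)
n_free = int(Q.sum())   # loadings left to estimate under the mask
print(np.round(probs, 3))
print(n_free, "free loadings instead of", loadings.size)
```

In this toy case the mask leaves 4 free loadings instead of 6; with many items and traits the saving is what makes estimation feasible on smaller datasets.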

Koedinger, K. R., & McLaughlin, E. A. (2016). Closing the loop with quantitative cognitive task analysis. In *Proceedings of the 9th International Conference on Educational Data Mining* (pp. 412–417). http://www.educationaldatamining.org/EDM2016/proceedings/paper_152.pdf

Demonstrates an example where intuition would suggest one form of practice would help students learn to solve certain problems, but a cognitive task analysis suggests a different form of practice would be most helpful because it targets the missing cognitive skill. The example is algebra story problems; data from a variety of problems involving different cognitive tasks suggests the problem is “not in the story problem comprehension but in the production of more complex symbolic forms”, so practice building more symbolic forms would transfer to solving word problems.