Extensions and applications of item response theory

Published: 4 February

In his thesis, Joakim Wallmark shows how advanced statistics can make tests such as the Swedish Scholastic Aptitude Test (högskoleprovet) fairer and more accurate.

Author

Joakim Wallmark

Supervisors

Professor Marie Wiberg, Umeå University; Associate Professor Maria Josefsson, Umeå University

Opponent

Dr. Patricia Martinkova, Charles University, Prague

Defended at

Umeå University

Date of defence

2025-02-07

Abstract in English

This doctoral thesis focuses on Item Response Theory (IRT), a statistical method widely used in fields such as education and psychology to analyze response patterns on tests and surveys. In practice, IRT models are estimated from collected test data. This lets researchers assess how effectively each item measures the underlying trait that the test aims to evaluate (such as subject knowledge or personality characteristics) and estimate each individual’s level of that trait. Unlike traditional methods that simply sum predetermined item scores, IRT accounts for the difficulty of each item and its ability to measure the intended trait.
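To make the contrast with sum scores concrete, the sketch below uses the two-parameter logistic (2PL) model, a standard IRT model. It is an illustration only and not necessarily one of the models studied in the thesis. The item parameters, the response patterns, and the grid-search estimator are all assumptions chosen for readability.

```python
import math

def p_correct(theta, a, b):
    """2PL item response function: probability of a correct answer
    given trait level theta, discrimination a, and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def log_likelihood(theta, responses, items):
    """Log-likelihood of a 0/1 response pattern at trait level theta."""
    ll = 0.0
    for x, (a, b) in zip(responses, items):
        p = p_correct(theta, a, b)
        ll += math.log(p) if x == 1 else math.log(1.0 - p)
    return ll

def estimate_theta(responses, items):
    """Maximum-likelihood trait estimate via a simple grid search
    over theta in [-4, 4] with step 0.01."""
    grid = [i / 100.0 for i in range(-400, 401)]
    return max(grid, key=lambda t: log_likelihood(t, responses, items))

# Hypothetical item parameters (discrimination a, difficulty b),
# ordered from easiest to hardest.
items = [(1.0, -1.5), (1.0, -0.5), (1.5, 0.5), (2.0, 1.5)]

# Two test takers with the same sum score but different patterns.
easy_pair = [1, 1, 0, 0]  # solved the two easy items
hard_pair = [0, 0, 1, 1]  # solved the two hard, more discriminating items

sum_easy, sum_hard = sum(easy_pair), sum(hard_pair)
theta_easy = estimate_theta(easy_pair, items)
theta_hard = estimate_theta(hard_pair, items)

print("sum scores:", sum_easy, sum_hard)
print("trait estimates:", theta_easy, theta_hard)
```

Both test takers get the same sum score, but the 2PL estimate is higher for the one who answered the harder, more discriminating items correctly, because the model weights each response by how informative the item is.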

The thesis consists of four research articles, each addressing different aspects of IRT and its applications. The first article focuses on test equating, ensuring that scores from different versions of a test are comparable. Equating methods with and without IRT are compared using simulations to explore the advantages and disadvantages of incorporating IRT into the kernel equating framework. The second and third articles introduce and compare different types of IRT models. Through simulations and real test data examples, these studies demonstrate that more flexible models can better capture the true relationships between test responses and the underlying traits being measured.
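As a rough illustration of what equating does, the sketch below implements linear equating, a much simpler classical method than the kernel equating framework examined in the thesis. It assumes two test forms taken by equivalent groups, and the score lists are made-up data.

```python
from statistics import mean, pstdev

def linear_equate(x, scores_x, scores_y):
    """Map a score x on form X to the scale of form Y so that the
    equated scores share form Y's mean and standard deviation."""
    mu_x, mu_y = mean(scores_x), mean(scores_y)
    sd_x, sd_y = pstdev(scores_x), pstdev(scores_y)
    return mu_y + (sd_y / sd_x) * (x - mu_x)

# Made-up scores from two equivalent groups, one per test form.
form_x = [10, 12, 14, 16, 18]
form_y = [13, 16, 19, 22, 25]

equated_mid = linear_equate(14, form_x, form_y)
equated_top = linear_equate(18, form_x, form_y)
print(equated_mid, equated_top)
```

Here form X yields lower and less spread-out scores than form Y, so a given form X score is mapped to a higher value on the form Y scale. Kernel equating and IRT-based methods pursue the same goal with smoother, more flexible score distributions.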

Finally, the IRTorch Python package is presented in the fourth study. IRTorch supports various IRT models and estimation methods and can be used to analyze data from different types of tests and surveys. In summary, the thesis demonstrates how IRT-based equating methods can serve as an alternative to traditional equating methods, how more flexible IRT models can improve the precision of test results, and how user-friendly software can make advanced statistical models accessible to a wider audience.