On the validity of reading assessments. Relationships between teacher judgements, external tests and pupil self-assessments
Stefan Johansson
Professor Monica Rosén and Senior Lecturer Eva Myrberg
Professor Astrid Pettersson, Stockholms universitet
Göteborgs universitet
2013-02-25
Institutionen för pedagogik och specialpedagogik
Abstract
Many stakeholders in schools and in society at large have an interest in the validity of the conclusions drawn on the basis of different forms of assessment. This thesis examines validity aspects of different forms of assessment, namely teacher judgements, results on a standardised reading test, and pupil self-assessments in grades 3 and 4. Data were obtained from the international study PIRLS 2001 (Progress in International Reading Literacy Study, 2001), in which approximately 11,000 pupils and 700 teachers participated. Structural equation modeling with latent variables was the main method of analysis. One of the most important findings was that teachers were able to estimate pupils' language skills well within their own class, whereas they found it harder to do so in a consistent and equivalent manner across classrooms. The results also indicate that pupil-level factors (gender and socioeconomic status (SES)) influenced teachers' estimates of pupils' skills. Girls and pupils with higher SES received slightly higher teacher judgements than boys and pupils with lower SES who had the same results on PIRLS. One explanation may be that teachers take into account more aspects than the test can measure, for example oral ability. The teacher's level of competence was important both for pupils' results in school and for how well the teacher judged pupils' knowledge. Furthermore, the results showed that pupils' self-assessments corresponded relatively well with both the teacher's judgement and the pupil's test result.
The purpose of this thesis is to examine validity issues in different forms of assessment (teacher judgements, external tests, and pupil self-assessments) in Swedish primary schools. The data were drawn from a large-scale study, PIRLS 2001, in which more than 11,000 pupils and some 700 teachers from grades 3 and 4 participated. The primary method used in these secondary analyses to investigate the validity of the assessment forms is multilevel Structural Equation Modeling (SEM) with latent variables. An argument-based approach to validity was adopted, in which possible weaknesses in the assessment forms were addressed.
A fairly high degree of correspondence between teacher judgements and test results was found within classrooms: a correlation of .65 was obtained for 3rd graders, a finding well in line with results documented in previous research. Grade 3 teachers' judgements correlated more highly with test results than those of grade 4 teachers. The longer period of time spent with the pupils, as well as differences in the teachers' education, were suggested as plausible explanations. The gender and socioeconomic status (SES) of the pupils had a significant effect on teacher judgements, in that girls and pupils with higher SES received higher judgements from teachers than their test results accounted for.
Teachers with higher levels of formal competence were shown to have pupils with higher achievement levels. Pupil achievement was measured with both teacher judgements and PIRLS test results. Furthermore, higher correspondence between judgements and test results was demonstrated for teachers with higher levels of competence.
Comparisons of achievement across classrooms were shown to be problematic when based on teachers' judgements. The judgements reflected different achievement levels, despite the fact that test results indicated similar performance levels across classrooms.
Pupil self-assessments correlated slightly less strongly with both teacher judgements and test results than teacher judgements and test results did with each other. However, in spite of their young age, pupils assessed their knowledge and skills in the reading domain relatively well. No differences in self-assessments were found between pupils of different gender or SES.
In summary, the studies of the three forms of assessment led to the conclusion that all have certain limitations. The strengths and weaknesses of the different assessment forms are discussed.