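The Poisson-Gamma model for speed tests
Norman Verhelst
Frans Kamphuis
National Institute for Educational Measurement
Arnhem, The Netherlands

The student monitoring system
• Measurement of individual development
  – Common scale
• Estimation of distribution (norms)
  – Twice per grade (M3, E3, …, M8)
• Several subjects
  – Arithmetic
  – Reading comprehension
  – Technical reading

Two types of speed tests
• Basic observation is the time to complete a task
  – AVI cards
• Basic observation is the number of completed subtasks within the time limit
  – Tempotests (TT)
  – Three Minute Test (TMT)

Example tempotest (E4)
(Dutch test items; each item offers three look-alike words, one of which fits the sentence.)
• Op de politieschool spelen ze [ook / rook / koor] een soort toneel. (At the police academy they also act out a kind of play.)
• Het lijkt wel wat op 'politie en boefje [spelen / stelpen / slepen]'. (It looks a bit like playing 'cops and robbers'.)
• Net zoals op de basisschool. (Just like in primary school.)
• Wat [poe / doe / boe] je bij een gevecht? (What do you do in a fight?)
• Je pistool trekken? (Draw your pistol?)
• Nee, dat mag [zomen / zomaar / zomer] niet. (No, you cannot just do that.)

Example TMT
• Easy version
  – as, fee, oom, uur, zee, oor, …, poot (=150)
• Hard version
  – banden, geluid, tante, beker, kuiken, koffer, …, brandweerwagen (=150)

Models
• Measurement model: Poisson
  – What is the relation between the (latent) ability and the test performance?
• Structural model: Gamma
  – What is the distribution of the latent ability in one or more populations (M3, E3, M4, …, M8)?

Measurement model: Poisson (1)
$x_{vi}$ : observation (number read / number correct)
$v$ : student index
$i$ : task index
$$P(x_{vi}; \lambda) = \frac{\lambda^{x_{vi}}}{x_{vi}!}\, e^{-\lambda}, \qquad (x_{vi} = 0, 1, 2, 3, \dots)$$

Measurement model: Poisson (2)
$$P(x_{vi}; \lambda_{vi}) = \frac{\lambda_{vi}^{x_{vi}}}{x_{vi}!}\, e^{-\lambda_{vi}}, \qquad (x_{vi} = 0, 1, 2, 3, \dots)$$
$$\lambda_{vi} = \tau_i\, \sigma_i\, \theta_v$$
$\tau_i$ : time limit (in minutes)
$\sigma_i$ : easiness of task $i$ (dimensionless)
$\theta_v$ : ability (#subtasks/minute)

Parameter estimation: incomplete design (JML)
$d_{vi} = 1$ if student $v$ took task $i$, and $0$ otherwise.
statistics:
$$s_v = \sum_{i=1}^{k} d_{vi}\, x_{vi} \quad\text{and}\quad t_i = \sum_{v} d_{vi}\, x_{vi}$$
normalisation: a constraint on the $\sigma_i$, $i = 1, \dots, k$, is needed because $\theta_v$ and $\sigma_i$ are only determined up to a common scale factor.
estimation equations:
$$\theta_v = \frac{s_v}{\sum_i d_{vi}\, \tau_i\, \sigma_i}, \qquad \sigma_i = \frac{t_i}{\tau_i \sum_v d_{vi}\, \theta_v}$$

Person parameters
$\hat\delta_v = \sum_i d_{vi}\, \tau_i\, \hat\sigma_i$ is the corrected reading time (weights: $\hat\sigma_i$)
$$\hat\theta_v = \frac{s_v}{\hat\delta_v}$$
$$E(\hat\theta_v \mid \theta_v) = \theta_v$$
$$SE(\hat\theta_v) = \sqrt{\frac{\theta_v}{\hat\delta_v}} \approx \sqrt{\frac{\hat\theta_v}{\hat\delta_v}}$$

Design TMT
• 3 difficulty levels (1, 2, 3)
• For each level: three parallel versions (a, b, c)
• Each student participates twice: mid-year (medio) and end (eind) of the same grade
• At each administration: 3 cards of levels 1, 2 and 3 (in that sequence)
• M3: only cards 1 and 2

Design for groups 4-7 (27 booklets)
booklet   mid-year (medio)   end (eind)
          1  2  3            1  2  3
 1        a  a  a            b  b  b
 2        a  a  b            b  b  c
 3        a  a  c            b  b  a
 4        a  b  a            b  c  b
 5        a  b  b            b  c  c
 6        a  b  c            b  c  a
 7        a  c  a            b  a  b
 8        a  c  b            b  a  c
 9        a  c  c            b  a  a
10        b  a  a            c  b  b
11        b  a  b            c  b  c
12        b  a  c            c  b  a
13        b  b  a            c  c  b
14        b  b  b            c  c  c
15        b  b  c            c  c  a
16        b  c  a            c  a  b
17        b  c  b            c  a  c
18        b  c  c            c  a  a
19        c  a  a            a  b  b
20        c  a  b            a  b  c
21        c  a  c            a  b  a
22        c  b  a            a  c  b
23        c  b  b            a  c  c
24        c  b  c            a  c  a
25        c  c  a            a  a  b
26        c  c  b            a  a  c
27        c  c  c            a  a  a

Two step procedure
• Estimate the task parameters $\sigma_i$
  – JML = CML
• Estimate the latent distribution while fixing the task parameters at their CML estimates

Advantage
If $X_1$ and $X_2$ are independent Poisson variables with parameters $\lambda_1$ and $\lambda_2$, then $X_1 + X_2$ is Poisson distributed with parameter $\lambda_1 + \lambda_2$.
$$s_v = \sum_i d_{vi}\, x_{vi} \sim \text{Poisson}\Big(\theta_v \sum_i d_{vi}\, \tau_i\, \sigma_i\Big) = \text{Poisson}(\theta_v\, \delta_v)$$

Structural model: distribution of reading speed (θ)
$$g(\theta; \alpha, \beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, \theta^{\alpha - 1} \exp(-\beta\theta)$$
$$E(\theta) = \frac{\alpha}{\beta}, \qquad Var(\theta) = \frac{\alpha}{\beta^{2}}$$

Marginal distribution of the sum score s
$$f(s) = \int_0^{\infty} P(s \mid \theta)\, g(\theta)\, d\theta = \int_0^{\infty} \frac{(\delta\theta)^{s}}{s!}\, e^{-\delta\theta}\, \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, \theta^{\alpha - 1} e^{-\beta\theta}\, d\theta$$

Negative Binomial (Gamma-Poisson)
$$f(s) = \frac{\Gamma(\alpha + s)}{s!\, \Gamma(\alpha)} \cdot \frac{\beta^{\alpha}\, \delta^{s}}{(\beta + \delta)^{\alpha + s}}$$
$$p = \frac{\beta}{\beta + \delta}, \qquad 1 - p = \frac{\delta}{\beta + \delta}$$
$$f(s) = \frac{\Gamma(\alpha + s)}{s!\, \Gamma(\alpha)}\, p^{\alpha} (1 - p)^{s}$$

Negative binomial
$$\Gamma(\alpha + 1) = \alpha\, \Gamma(\alpha), \qquad \Gamma(\alpha + s) = \Gamma(\alpha) \prod_{j=0}^{s-1} (\alpha + j)$$
$$f(s) = \frac{\prod_{j=0}^{s-1} (\alpha + j)}{s!}\, p^{\alpha} (1 - p)^{s}$$

EAP
$$\theta \mid s \sim \text{Gamma}(\alpha + s,\ \beta + \delta)$$
$$E(\theta \mid s) = \frac{\alpha + s}{\beta + \delta}, \qquad SD(\theta \mid s) = \frac{\sqrt{\alpha + s}}{\beta + \delta}$$

Reliability
$$\rho_{SS'} = 1 - p$$

Validation (tempo test)
[Figure: M4; horizontal axis from 25 to 175, vertical axis from 0 to 25]

Validation (tempo test)
[Figure: expected (exp) and observed (obs) curves for M4 and E4; horizontal axis: observed scores (25 to 175), vertical axis from 0.00 to 1.00]

Validation (TMT)
[Figure: M3; horizontal axis from 0 to 200, vertical axis from 0 to 30]

Latent class model
• Population consists of two latent classes of size π and 1 − π respectively
• The latent variable is gamma distributed in each class
• Parameters
  – π
  – α1 and β1
  – α2 and β2
• EM-algorithm

M3 (π = 0.54)
[Figure: curves for class 1, class 2 and the mixture; horizontal axis: theta (words per minute), 0 to 100]

Validation (TMT)
[Figure: M3; horizontal axis from 0 to 200, vertical axis from 0 to 30]

Validation (TMT)
[Figure: expected (exp) and observed (obs) curves for M3 and E3; horizontal axis: number of words read (0 to 250), vertical axis from 0.00 to 1.00]

Norms (TMT)
[Figure: curves for M3, E3, M4, E4, M5, E5, M6, E6, M7, E7 and M8; horizontal axis: theta (= words per minute), 0 to 120; vertical axis from 0.00 to 1.00]

Thank you

Example: student v
Task i   d_vi   τ_i   σ_i    d_vi τ_i σ_i
1        0      8     0.93   -
2        1      8     1.11   8.88
3        0      6     0.85   -
4        1      6     1.05   6.30
5        0      5     1.09   -
                      δ_v:   15.18

$s_v = 122$
$$\hat\theta_v = \frac{122}{15.18} = 8.04 \quad \text{(subtasks/minute on a standard task)}$$
$$SE(\hat\theta_v) = \frac{\sqrt{122}}{15.18} = 0.73$$

Problems
• SE(π) large
• Local maxima?
• Thick right tail of observations
• >2 classes?
  – Initial estimates
• Homogeneity of test material
• Local independence

Simulation E3
[Figure: size of class 1 (0 to 1) against average of class 1 (10 to 40)]

real π = 0.51; estimated π = 0.93
[Figure: observed (Obs.) and expected (Exp.) cumulative frequency (0 to 1000) against score (0 to 250)]

Averages (1000 replications)
        Class 1   Class 2   Overall
Mean    28.15     44.07     35.99
SD      2.71      3.22      0.43

Standard deviations (1000 rep.)
        Class 1   Class 2   Overall
Mean    13.31     17.44     17.66
SD      2.21      1.68      0.47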
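
Worked sketch (not part of the original slides)
The numbers on the "Example: student v" slide can be reproduced with a few lines of Python, and the same corrected reading time can then be fed into the Gamma-Poisson formulas. This is only a sketch under stated assumptions: it takes the rate per task to be time limit × easiness × ability, the ability to be gamma distributed with shape alpha and rate beta, and the sum score given ability to be Poisson with mean delta × theta. The function names and the values alpha = 4.0 and beta = 0.5 are hypothetical and chosen purely for illustration; they are not estimates from the presentation.

```python
import math


def person_estimate(d, tau, sigma, s):
    """Corrected reading time, ability estimate and its standard error.

    d     : 0/1 indicators, 1 if the student took the task
    tau   : time limits in minutes
    sigma : task easiness parameters (treated as known)
    s     : the student's sum score over the tasks taken
    """
    delta = sum(dv * t * sg for dv, t, sg in zip(d, tau, sigma))
    theta = s / delta                  # subtasks per minute on a standard task
    se = math.sqrt(theta / delta)      # equals sqrt(s) / delta
    return delta, theta, se


def nb_logpmf(s, alpha, beta, delta):
    """Log of the negative binomial (Gamma-Poisson) marginal of the sum score."""
    p = beta / (beta + delta)
    return (math.lgamma(alpha + s) - math.lgamma(s + 1) - math.lgamma(alpha)
            + alpha * math.log(p) + s * math.log1p(-p))


def eap(s, alpha, beta, delta):
    """Posterior mean and SD of theta given s: Gamma(alpha + s, beta + delta)."""
    return (alpha + s) / (beta + delta), math.sqrt(alpha + s) / (beta + delta)


# Values taken from the "Example: student v" slide.
d = [0, 1, 0, 1, 0]
tau = [8, 8, 6, 6, 5]
sigma = [0.93, 1.11, 0.85, 1.05, 1.09]
delta_v, theta_v, se_v = person_estimate(d, tau, sigma, s=122)
print(f"delta = {delta_v:.2f}, theta = {theta_v:.2f}, SE = {se_v:.2f}")
# prints: delta = 15.18, theta = 8.04, SE = 0.73

# Hypothetical gamma parameters, for illustration only.
alpha, beta = 4.0, 0.5

# The marginal probabilities should sum to (numerically) one.
total = sum(math.exp(nb_logpmf(s, alpha, beta, delta_v)) for s in range(2000))

# EAP estimate for the same student under the hypothetical prior.
post_mean, post_sd = eap(122, alpha, beta, delta_v)
print(f"marginal total = {total:.6f}")
print(f"EAP = {post_mean:.2f}, posterior SD = {post_sd:.2f}")
```

With a long corrected reading time relative to beta, the EAP stays close to the direct estimate s/delta, because the data term delta dominates the prior rate beta in the posterior.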