The Poisson model for tests under time pressure

The Poisson-Gamma model
for speed tests
Norman Verhelst
Frans Kamphuis
National Institute for Educational Measurement
Arnhem, The Netherlands
The student monitoring system
• Measurement of individual development
– Common scale
• Estimation of distribution (norms)
– Twice per grade (M3, E3,…,M8)
• Several subjects
– Arithmetic
– Reading comprehension
– Technical reading
Two types of speed tests
• Basic observation is the time to complete
a task
– AVI cards
• Basic observation is the number of
completed subtasks within the time limit
– Tempotests (TT)
– Three Minute Test (TMT)
Example tempotest (E4)
• Op de politieschool spelen ze ook rook
koor een soort toneel
• Het lijkt wel wat op ‘politie en boefje
spelen stelpen slepen’.
• Net zoals op de basisschool.
• Wat poe doe boe je bij een gevecht?
• Je pistool trekken?
• Nee, dat mag zomen zomaar zomer niet.
Example TMT
• Easy version
– as
– fee
– oom
– uur
– zee
– oor
– …
– poot (=150)
• Hard version
– banden
– geluid
– tante
– beker
– kuiken
– koffer
– …
– brandweerwagen (=150)
Models
• Measurement model: Poisson
– What is the relation between the (latent)
ability and the test performance?
• Structural model: Gamma
– The distribution of the latent ability in one or
more populations? (M3, E3, M4,…,M8)
Measurement model: Poisson (1)
$x_{vi}$: observation (number read / number correct)
$v$: student index
$i$: task index
$$P(x_{vi}; \lambda) = \frac{\lambda^{x_{vi}}}{x_{vi}!}\, e^{-\lambda}, \qquad x_{vi} = 0, 1, 2, 3, \ldots$$
Measurement model: Poisson (2)
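$$P(x_{vi}; \lambda) = \frac{\lambda^{x_{vi}}}{x_{vi}!}\, e^{-\lambda}, \qquad x_{vi} = 0, 1, 2, 3, \ldots$$
$$\lambda = \lambda_{vi} = \tau_i\, \theta_v\, \sigma_i$$
$\tau_i$: time limit (in minutes)
$\sigma_i$: easiness of task $i$ (dimensionless)
$\theta_v$: ability (#subtasks/minute)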
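A minimal sketch of the measurement model, assuming the rate is the product of time limit, ability and easiness as on this slide. The function name `poisson_pmf` and the numerical values (a one-minute task of easiness 1.0, a reader at 60 subtasks per minute) are hypothetical and serve only as an illustration.

```python
import math

def poisson_pmf(x, theta, sigma, tau):
    """P(x; lambda) with lambda = tau * theta * sigma."""
    lam = tau * theta * sigma
    # Work on the log scale to avoid overflow in lam**x and x!
    return math.exp(x * math.log(lam) - lam - math.lgamma(x + 1))

# Hypothetical values, not taken from the slides
probs = [poisson_pmf(x, theta=60.0, sigma=1.0, tau=1.0) for x in range(400)]
total = sum(probs)                                # should be very close to 1
mean = sum(x * p for x, p in enumerate(probs))    # should be close to lambda = 60
```

The expected count equals the rate, so doubling the time limit or the easiness doubles the expected number of completed subtasks.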
Parameter estimation:
incomplete design (JML)
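Statistics: $s_v = \sum_{i=1}^{k} d_{vi}\, x_{vi}$ and $t_i = \sum_{v} d_{vi}\, x_{vi}$, where $d_{vi} = 1$ if task $i$ was administered to student $v$ and $0$ otherwise
Normalisation: $\prod_{i=1}^{k} \sigma_i = 1$
Estimation equations:
$$\theta_v = \frac{s_v}{\sum_i d_{vi}\, \tau_i\, \sigma_i}, \qquad \sigma_i = \frac{t_i}{\tau_i \sum_v d_{vi}\, \theta_v}$$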
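Person parameters
$$\hat\delta_v = \sum_i d_{vi}\, \tau_i\, \hat\sigma_i$$
$\delta_v$ is the corrected reading time (weights: $\sigma_i$)
$$\hat\theta_v = \frac{s_v}{\hat\delta_v}$$
$$E(\hat\theta_v \mid \theta_v) = E\!\left(\frac{s_v}{\hat\delta_v} \,\Big|\, \theta_v\right) = \theta_v$$
$$SE(\hat\theta_v) = \sqrt{\frac{\hat\theta_v}{\hat\delta_v}}$$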
Design TMT
• 3 difficulty levels (1, 2, 3)
• For each level: three parallel versions (a, b, c)
• Each student participates twice: mid-year (medio) and
end of the same grade
• At each administration: 3 cards of levels 1, 2
and 3 (in that sequence)
• M3: only cards 1 and 2
Design for grades 4-7
      Mid (medio)    End (eind)
No.   1   2   3      1   2   3
1     a   a   a      b   b   b
2     a   a   b      b   b   c
3     a   a   c      b   b   a
4     a   b   a      b   c   b
5     a   b   b      b   c   c
6     a   b   c      b   c   a
7     a   c   a      b   a   b
8     a   c   b      b   a   c
9     a   c   c      b   a   a
10    b   a   a      c   b   b
11    b   a   b      c   b   c
12    b   a   c      c   b   a
13    b   b   a      c   c   b
14    b   b   b      c   c   c
15    b   b   c      c   c   a
16    b   c   a      c   a   b
17    b   c   b      c   a   c
18    b   c   c      c   a   a
19    c   a   a      a   b   b
20    c   a   b      a   b   c
21    c   a   c      a   b   a
22    c   b   a      a   c   b
23    c   b   b      a   c   c
24    c   b   c      a   c   a
25    c   c   a      a   a   b
26    c   c   b      a   a   c
27    c   c   c      a   a   a
Two step procedure
• Estimate the task parameters σi
– JML = CML
• Estimate the latent distribution while fixing the
task parameters at their CML estimates
Advantage
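If $X_1$ and $X_2$ are independent Poisson variables with parameters $\lambda_1$ and $\lambda_2$,
then $X_1 + X_2$ is Poisson distributed with parameter $\lambda_1 + \lambda_2$.
Hence, summing over the tasks taken by student $v$:
$$s_v = \sum_i x_{vi} \sim \mathcal{P}\Bigl[\theta_v \sum_i \tau_i \sigma_i\Bigr] = \mathcal{P}(\theta_v \delta_v)$$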
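The additivity property can be checked numerically by convolving two Poisson distributions and comparing the result with a single Poisson distribution whose parameter is the sum. This is only an illustrative check; the helper names and the two rates below are hypothetical.

```python
import math

def poisson_pmf(x, lam):
    """Poisson probability of x events with rate lam, computed on the log scale."""
    return math.exp(x * math.log(lam) - lam - math.lgamma(x + 1))

def convolved_pmf(s, lam1, lam2):
    """P(X1 + X2 = s) for independent Poisson X1, X2, by direct convolution."""
    return sum(poisson_pmf(j, lam1) * poisson_pmf(s - j, lam2) for j in range(s + 1))

# Hypothetical rates for two tasks taken by the same student
lam1, lam2 = 71.4, 50.6
max_gap = max(abs(convolved_pmf(s, lam1, lam2) - poisson_pmf(s, lam1 + lam2))
              for s in range(250))
```

Because the sum score is again Poisson, the latent distribution can be estimated from the sum scores alone once the task parameters are fixed.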
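Structural model:
distribution of reading speed (θ)
$$g(\theta; \alpha, \beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, \theta^{\alpha - 1} \exp(-\beta\theta)$$
$$E(\theta) = \frac{\alpha}{\beta}, \qquad \operatorname{Var}(\theta) = \frac{\alpha}{\beta^{2}}$$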
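Marginal distribution of the sum score s
$$f(s) = \int_0^{\infty} P(s \mid \theta)\, g(\theta)\, d\theta = \frac{\beta^{\alpha}}{\Gamma(\alpha)} \int_0^{\infty} \frac{(\delta\theta)^{s}}{s!}\, e^{-\delta\theta}\, \theta^{\alpha - 1} e^{-\beta\theta}\, d\theta$$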
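Negative binomial
(Gamma-Poisson)
$$f(s) = \frac{\Gamma(\alpha + s)}{s!\, \Gamma(\alpha)} \cdot \frac{\beta^{\alpha}\, \delta^{s}}{(\beta + \delta)^{\alpha + s}}$$
$$p = \frac{\delta}{\beta + \delta}, \qquad 1 - p = \frac{\beta}{\beta + \delta}$$
$$f(s) = \frac{\Gamma(\alpha + s)}{s!\, \Gamma(\alpha)}\, p^{s} (1 - p)^{\alpha}$$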
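Negative binomial
$$\Gamma(\alpha + 1) = \alpha\, \Gamma(\alpha) \quad\Rightarrow\quad \frac{\Gamma(\alpha + s)}{\Gamma(\alpha)} = \prod_{j=0}^{s-1} (\alpha + j)$$
$$f(s) = \frac{\prod_{j=0}^{s-1} (\alpha + j)}{s!}\, p^{s} (1 - p)^{\alpha}$$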
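A short sketch of the negative binomial sum-score distribution, written once with log-gamma functions and once with the product form. The function names and the parameter values (alpha = 4, beta = 0.1, delta = 3) are hypothetical; they are chosen only so that the check is easy to follow.

```python
import math

def neg_binomial_pmf(s, alpha, beta, delta):
    """f(s) = Gamma(alpha+s) / (s! Gamma(alpha)) * p^s * (1-p)^alpha, p = delta/(beta+delta)."""
    p = delta / (beta + delta)
    log_f = (math.lgamma(alpha + s) - math.lgamma(alpha) - math.lgamma(s + 1)
             + s * math.log(p) + alpha * math.log(1.0 - p))
    return math.exp(log_f)

def neg_binomial_pmf_product(s, alpha, beta, delta):
    """Same distribution via prod_{j<s} (alpha + j) / s!, suitable for small s."""
    p = delta / (beta + delta)
    coef = 1.0
    for j in range(s):
        coef *= (alpha + j) / (j + 1)   # accumulates prod (alpha + j) / s!
    return coef * p ** s * (1.0 - p) ** alpha

# Hypothetical parameter values
alpha, beta, delta = 4.0, 0.1, 3.0
probs = [neg_binomial_pmf(s, alpha, beta, delta) for s in range(3000)]
total = sum(probs)                               # close to 1
mean = sum(s * f for s, f in enumerate(probs))   # close to delta * alpha / beta
```

The marginal mean of the sum score is the corrected time multiplied by the mean ability, which gives a quick sanity check on fitted parameters.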
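EAP
$$\theta \mid s \sim \mathrm{Gamma}(\alpha + s,\; \beta + \delta)$$
$$E(\theta \mid s) = \frac{\alpha + s}{\beta + \delta}$$
$$SD(\theta \mid s) = \frac{\sqrt{\alpha + s}}{\beta + \delta}$$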
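The posterior mean and standard deviation follow directly from the Gamma posterior. The sketch below computes them and also shows that the EAP is a weighted average of the prior mean and the estimate s/delta, with weight p = delta/(beta + delta) on the latter. The function name `eap` and the prior values alpha = 2 and beta = 0.5 are hypothetical; s = 122 and delta = 15.18 are the values used in the student example of these slides.

```python
import math

def eap(s, delta, alpha, beta):
    """Posterior mean and SD of theta given sum score s, for a Gamma(alpha, beta) prior."""
    shape = alpha + s      # posterior shape
    rate = beta + delta    # posterior rate
    return shape / rate, math.sqrt(shape) / rate

# Hypothetical prior; s and delta as in the student example
alpha, beta = 2.0, 0.5
s, delta = 122, 15.18
post_mean, post_sd = eap(s, delta, alpha, beta)

# Shrinkage view: weight p on the estimate s / delta, weight 1 - p on the prior mean
p = delta / (beta + delta)
shrunk = (1.0 - p) * (alpha / beta) + p * (s / delta)
```

With a long corrected reading time, p approaches one and the EAP approaches s/delta.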
Reliability
$$\rho_{SS'} = p = \frac{\delta}{\beta + \delta}$$
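One way to see why the reliability equals p, assuming reliability is defined as the ratio of true-score variance to observed-score variance and that the sum score is Poisson with mean delta times theta, with theta following a Gamma(alpha, beta) distribution:

```latex
\begin{align*}
\operatorname{Var}(s)
  &= E\bigl[\operatorname{Var}(s \mid \theta)\bigr]
   + \operatorname{Var}\bigl(E[s \mid \theta]\bigr)
   = \delta\,\frac{\alpha}{\beta} + \delta^{2}\,\frac{\alpha}{\beta^{2}}, \\
\rho_{SS'}
  &= \frac{\delta^{2}\alpha/\beta^{2}}
          {\delta\alpha/\beta + \delta^{2}\alpha/\beta^{2}}
   = \frac{\delta}{\beta + \delta} = p .
\end{align*}
```

Under these assumptions the reliability depends on the corrected reading time and on beta only, and it increases with testing time.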
Validation (tempo test)
[Figure: M4; horizontal axis 25 to 175, vertical axis 0 to 25]
Validation (tempo test)
[Figure: expected (exp) and observed (obs) curves for M4 and E4; horizontal axis: observed scores, 25 to 175; vertical axis: 0.00 to 1.00]
Validation (TMT)
[Figure: M3; horizontal axis 0 to 200, vertical axis 0 to 30]
Latent class model
• Population consists of two latent classes
of size π and 1 - π respectively
• The latent variable is gamma distributed in
each class
• Parameters
– π
– α1 and β1
– α2 and β2
• EM-algorithm
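A partial sketch of one EM iteration for the two-class model, assuming each class yields a negative binomial sum-score distribution as in the single-population case. Only the E-step and the update of π are shown; updating the Gamma parameters needs a numerical optimiser and is left out. All function names and parameter values are hypothetical.

```python
import math

def log_nb(s, alpha, beta, delta):
    """Log of the negative binomial probability of sum score s."""
    p = delta / (beta + delta)
    return (math.lgamma(alpha + s) - math.lgamma(alpha) - math.lgamma(s + 1)
            + s * math.log(p) + alpha * math.log(1.0 - p))

def e_step(scores, deltas, pi, a1, b1, a2, b2):
    """Posterior probability of class 1 for every student."""
    resp = []
    for s, dl in zip(scores, deltas):
        l1 = math.log(pi) + log_nb(s, a1, b1, dl)
        l2 = math.log(1.0 - pi) + log_nb(s, a2, b2, dl)
        m = max(l1, l2)                      # subtract the maximum for stability
        w1, w2 = math.exp(l1 - m), math.exp(l2 - m)
        resp.append(w1 / (w1 + w2))
    return resp

def update_pi(resp):
    """M-step for the class size: the mean posterior probability of class 1."""
    return sum(resp) / len(resp)

# Hypothetical data and starting values (class 1 slower than class 2)
scores = [20, 40, 90, 150]
deltas = [2.0, 2.0, 2.0, 2.0]
resp = e_step(scores, deltas, pi=0.5, a1=4.0, b1=0.2, a2=9.0, b2=0.2)
new_pi = update_pi(resp)
```

In a full implementation these two steps would be repeated together with the parameter updates until the likelihood stops improving, preferably from several starting values.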
[Figure: M3 (π = 0.54); curves for class 1, class 2 and the mixture; horizontal axis: theta (words per minute), 0 to 100]
Validation (TMT)
[Figure: M3; horizontal axis 0 to 200, vertical axis 0 to 30]
Validation (TMT)
[Figure: expected (exp) and observed (obs) curves for M3 and E3; horizontal axis: number of words read, 0 to 250; vertical axis: 0.00 to 1.00]
Norms (TMT)
[Figure: curves for M3, E3, M4, E4, M5, E5, M6, E6, M7, E7 and M8; horizontal axis: theta (= words per minute), 0 to 120; vertical axis: 0.00 to 1.00]
Thank you
Example: student v
Task i   d_vi   τ_i   σ_i    d_vi τ_i σ_i
1        0      8     0.93   -
2        1      8     1.11   8.88
3        0      6     0.85   -
4        1      6     1.05   6.30
5        0      5     1.09   -
δ_v:                         15.18
$s_v = 122 \;\Rightarrow\; \hat\theta_v = \frac{122}{15.18} = 8.04$ (subtasks/minute on a standard task)
$SE(\hat\theta_v) = \frac{\sqrt{122}}{15.18} = 0.73$
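The arithmetic of this example can be reproduced in a few lines. The sketch assumes the table is read as five tasks of which only tasks 2 and 4 were administered (d = 1), with time limits 8, 8, 6, 6, 5 and easiness values 0.93, 1.11, 0.85, 1.05, 1.09; the variable names are illustrative.

```python
import math

# Values as read from the example table (assumed reading of the flattened slide)
d = [0, 1, 0, 1, 0]                       # 1 = task administered to student v
tau = [8, 8, 6, 6, 5]                     # time limits in minutes
sigma = [0.93, 1.11, 0.85, 1.05, 1.09]    # task easiness
s_v = 122                                 # observed sum score

# Corrected reading time: only administered tasks contribute
delta_v = sum(di * ti * si for di, ti, si in zip(d, tau, sigma))
theta_hat = s_v / delta_v                 # subtasks per minute on a standard task
se_theta = math.sqrt(s_v) / delta_v       # equals sqrt(theta_hat / delta_v)

print(round(delta_v, 2), round(theta_hat, 2), round(se_theta, 2))
```

The two ways of writing the standard error agree because the sum score equals the ability estimate multiplied by the corrected reading time.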
Problems
• SE(π) large
• Local maxima?
• Thick right tail of observations
• >2 classes?
– Initial estimates
• Homogeneity of test material
• Local independence
Simulation E3
[Figure: size class 1 (vertical axis, 0 to 1) against average class 1 (horizontal axis, 10 to 40)]
real pi = 0.51; estimated pi = 0.93
[Figure: observed (Obs.) and expected (Exp.) cumulative frequency, 0 to 1000, against score, 0 to 250]
Averages (1000 replications)
        Class 1   Class 2   Overall
Mean    28.15     44.07     35.99
SD      2.71      3.22      0.43
Standard deviations (1000 rep.)
        Class 1   Class 2   Overall
Mean    13.31     17.44     17.66
SD      2.21      1.68      0.47