[Rasch] IRT based reliability
johnbarnard at bigpond.com
Thu Jun 27 11:29:45 EST 2013
Mike is of course correct, and I know that he tries to keep his responses
short. As for question 1, in Rasch and IRT we prefer to work with the
concept of "information" rather than reliability. Information can be used
to calculate an SEM for each item and each person, so the SEM is not a
single constant as it is in CTT. This is a powerful concept, and a
"reliability index" can be estimated from it if that is what you want.
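To make the link between information and SEM concrete, here is a minimal
sketch for the dichotomous Rasch model, where an item's information at
ability theta is p(1 - p) and the SEM of a person measure is
1 / sqrt(total information). The function names and the example item
difficulties are illustrative assumptions, not taken from any particular
software.

```python
import math

def rasch_probability(theta, difficulty):
    """Probability of a correct response under the dichotomous Rasch model (logits)."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def total_information(theta, difficulties):
    """Sum of the item informations p * (1 - p) at ability theta."""
    total = 0.0
    for b in difficulties:
        p = rasch_probability(theta, b)
        total += p * (1.0 - p)
    return total

def person_sem(theta, difficulties):
    """Standard error of the person measure: 1 / sqrt(information)."""
    return 1.0 / math.sqrt(total_information(theta, difficulties))

# Hypothetical item difficulties, in logits.
items = [-1.0, -0.5, 0.0, 0.5, 1.0]

# The same test gives a different SEM at each ability level.
for theta in (0.0, 1.5, 3.0):
    print(theta, round(person_sem(theta, items), 3))
```

The SEM is smallest where the items are targeted (theta near 0 here) and
grows for persons far from the item difficulties, which is why it cannot be
a single constant for everyone.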
As for question 2, in CTT the reliability index is used to compute a single
SEM, which is applied to all test takers to indicate the precision of the
measures. A CAT is a different ball game. Appropriate, well-targeted items
are administered to each test taker, and each test taker has a unique SEM,
which decreases as more items are administered and the item difficulties
converge on that individual's ability estimate. So in a CAT each test
taker gets a very precise measure. Reliability has to do with the test as a
whole, whilst the SEM is about the precision of the individual measures. In
CAT the latter is usually of more importance and interest.
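A rough sketch of how a person's SEM shrinks during a CAT, assuming
dichotomous Rasch items that all sit the same distance (offset, in logits)
from the person's ability. With perfect targeting (offset 0) each item
contributes the maximum information of 0.25. The function name and the
offsets are illustrative assumptions; a real CAT re-estimates ability after
every item.

```python
import math

def sem_after_items(n_items, offset=0.0):
    """SEM after n_items Rasch items located `offset` logits from the ability."""
    p = 1.0 / (1.0 + math.exp(-offset))   # success probability on each item
    info_per_item = p * (1.0 - p)          # at most 0.25, reached when offset == 0
    return 1.0 / math.sqrt(n_items * info_per_item)

# Well-targeted items (offset 0) versus items 2 logits off target.
for n in (5, 10, 20, 40):
    print(n, round(sem_after_items(n), 3), round(sem_after_items(n, offset=2.0), 3))
```

Under these assumptions the SEM falls with 1 / sqrt(n), so quadrupling the
number of items halves the SEM, and off-target items always leave a larger
SEM than the same number of well-targeted ones.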
Prof John J Barnard (D.Ed.;Ph.D.;Ed.D.)
Executive Director: EPEC Pty Ltd
From: rasch-bounces at acer.edu.au [mailto:rasch-bounces at acer.edu.au] On Behalf
Of Mike Linacre
Sent: Wednesday, 26 June 2013 10:17 PM
To: rasch at acer.edu.au
Subject: Re: [Rasch] IRT based reliability
Thank you for your questions, Ou Zhang.
1. What index do you use in calculating IRT based reliability?
Reply: We all use Spearman (1910) Reliability coefficients = "true"
variance / observed variance, computed for the current person sample.
In CTT, these are known as KR-20 and Cronbach Alpha and are computed
from raw scores. In Rasch methodology, Reliability is computed from the
estimated person measures (locations) and their standard errors:
"Test" Reliability = (Observed Person Measure Variance - Mean Person
Error Variance) / (Observed Person Measure Variance)
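As an illustration of this formula, here is a minimal sketch that computes
the reliability from a list of person measures and their standard errors.
The function name and the data are hypothetical, and the use of the
population variance (dividing by N) is an assumption; software may use a
different convention.

```python
from statistics import mean, pvariance

def rasch_reliability(measures, standard_errors):
    """(observed variance - mean error variance) / observed variance."""
    observed_variance = pvariance(measures)            # variance of the person measures
    error_variance = mean([se ** 2 for se in standard_errors])  # mean squared SE
    return (observed_variance - error_variance) / observed_variance

# Hypothetical person measures (logits) and their standard errors.
measures = [-1.2, -0.4, 0.1, 0.6, 1.5]
standard_errors = [0.45, 0.38, 0.36, 0.38, 0.48]
print(round(rasch_reliability(measures, standard_errors), 3))
```

If the mean error variance exceeds the observed variance the result is
negative; this sketch returns that value unchanged rather than clipping it
at zero.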
2. How to model the reliability in CAT?
Reply: We can compute CAT Rasch reliability using the formula in 1., but
the Reliability coefficient will be lower than for a fixed-length test
because the CAT tests are shorter and so the Error Variances are larger.
If we want higher CAT Reliability, then we administer more items to each
person. However, rather than Reliability for a person sample, CAT test
design focuses on the precise measurement of persons near cut-points or
in crucial measurement ranges. There is usually no advantage in
attempting to obtain high precision (= high Reliability) for very low or
very high performers.
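The effect of test length on reliability can be sketched with the same
formula, assuming a fixed "true" person variance and perfectly targeted
dichotomous Rasch items that each contribute information 0.25. Both the
function name and the numbers are illustrative assumptions.

```python
def expected_reliability(true_variance, n_items, info_per_item=0.25):
    """Reliability implied by a given true variance and test length."""
    error_variance = 1.0 / (n_items * info_per_item)   # squared SEM for every person
    observed_variance = true_variance + error_variance
    return (observed_variance - error_variance) / observed_variance

# Shorter tests have larger error variance and so lower reliability.
for n in (10, 20, 40, 80):
    print(n, round(expected_reliability(1.0, n), 3))
```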
Can anyone suggest good References?