[Rasch] FW: c-parameter
Margaret Wu
wu at edmeasurement.com.au
Mon Jun 4 21:32:33 EST 2012
Hi Steve and David
I think we are not discussing the same issue. I agree that if you define
"discrimination" as discrimination at each threshold, then, sure, for the
Rasch partial credit model you need to have equal discrimination. That's
why I said in my earlier post that for the Rasch partial credit model, the
category scores need to be integers with no jumps, so that the conditional
probabilities on adjacent categories are of the Rasch form (with equal
discrimination).
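
As a minimal sketch of the point about integer scores with no jumps: under
a partial-credit-type model, the log-odds of category k versus category k-1
change with ability at a rate equal to the difference between the two
category scores. The function names and the step values below are
illustrative assumptions, not taken from any package.

```python
import math

def category_probs(theta, scores, deltas):
    """Category probabilities for one item.

    scores: score assigned to each category, e.g. [0, 1, 2]
    deltas: step parameters, one per category above the lowest
            (hypothetical values, for illustration only)
    """
    # Cumulative sums of the step parameters: 0, d1, d1 + d2, ...
    cum = [0.0]
    for d in deltas:
        cum.append(cum[-1] + d)
    weights = [math.exp(s * theta - c) for s, c in zip(scores, cum)]
    total = sum(weights)
    return [w / total for w in weights]

def adjacent_logit(theta, scores, deltas, k):
    """Log-odds of category k versus category k-1 at ability theta."""
    p = category_probs(theta, scores, deltas)
    return math.log(p[k] / p[k - 1])

def adjacent_slope(scores, deltas, k, theta=0.0):
    """Change in the adjacent-category logit per unit increase in theta."""
    return (adjacent_logit(theta + 1.0, scores, deltas, k)
            - adjacent_logit(theta, scores, deltas, k))

deltas = [-0.5, 0.5]
# Scores 0, 1, 2: every adjacent pair has slope 1 (the Rasch form).
rasch_slopes = [adjacent_slope([0, 1, 2], deltas, k) for k in (1, 2)]
# Scores 0, 1, 3: the jump from 1 to 3 gives the second pair slope 2.
jump_slopes = [adjacent_slope([0, 1, 3], deltas, k) for k in (1, 2)]
```

With scores 0, 1, 2 both slopes are 1, so each adjacent pair behaves like a
dichotomous Rasch item. With scores 0, 1, 3 the second slope is 2, which is
no longer equal discrimination at each threshold.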
I am using "discrimination" of an item in the sense of the item information
function, as Ray pointed out. If a partial credit item has 5 score categories
(and fits the PC model), it will provide more information about the
abilities of students than an item with 2 score categories, because the
5 score categories can divide respondents into 5 groups of (roughly)
increasing ability. In this sense, an item with 5 score categories is more
"discriminating" than an item with 2 score categories. Whether a partial
credit item can "support" more score categories depends on how the item
works (I will say it depends on how "discriminating" the item is). It's not
an arbitrary decision made by the test writers. That is, if a 5-category
partial credit item fits the PC model, then collapsing it to 2 categories
will give an item that does not fit the PC model. Conversely, if an item
fits a partial credit model with 3 score categories, we can't make 5 score
categories out of it and still have it fit the PC model. In real life, we
make these decisions about the number of score categories when we write the
item and the marking guide.
These decisions need to be checked out after trial analysis, in very much
the same way as we check whether all dichotomous items fit the Rasch model.
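
To illustrate the information argument, here is a minimal sketch that
compares the item information of a 5-category partial credit item with that
of a dichotomous item. It relies on the standard result that, for a partial
credit item with integer scores, the item information at a given ability
equals the variance of the item score at that ability. The step parameters
are made-up values, and both items are simply assumed to fit the model.

```python
import math

def pcm_probs(theta, deltas):
    """Partial credit model probabilities for scores 0..len(deltas)."""
    cum = [0.0]
    for d in deltas:
        cum.append(cum[-1] + d)
    weights = [math.exp(k * theta - c) for k, c in enumerate(cum)]
    total = sum(weights)
    return [w / total for w in weights]

def item_information(theta, deltas):
    """Item information = variance of the item score given theta."""
    p = pcm_probs(theta, deltas)
    mean = sum(k * pk for k, pk in enumerate(p))
    return sum((k - mean) ** 2 * pk for k, pk in enumerate(p))

five_category = [-1.5, -0.5, 0.5, 1.5]  # hypothetical steps, scores 0..4
two_category = [0.0]                    # dichotomous item, scores 0..1

for theta in (-2.0, 0.0, 2.0):
    print(theta,
          round(item_information(theta, five_category), 3),
          round(item_information(theta, two_category), 3))
```

The dichotomous item can never provide more than 0.25 units of information,
while the 5-category item in this sketch provides more at each of the
ability values printed.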
If we run a generalized 2PL model, we will get estimates of item category
scores. This information can inform us about the maximum score an item can
"support". And if we round the item category scores to integers (with no
gaps), we can still have a Rasch model. Otherwise we need to make some
educated guesses in re-scoring the items. Item writers may be quite good at
estimating item difficulties, but it's very hard to estimate item
discrimination (or the appropriate maximum score for a partial credit item).
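
The rounding step can be sketched as below. This is only one possible
reading of "integers with no gaps": the helper name is hypothetical, the
lowest category is assumed to be anchored at a score of 0, categories whose
scores round to the same integer are treated as merged, and the ordering of
the categories is not checked. The estimated scores are made-up numbers.

```python
import math

def round_category_scores(estimated_scores):
    """Round estimated category scores and check for gaps.

    Returns (rounded, no_gaps). no_gaps is True when the distinct rounded
    scores are exactly 0, 1, 2, ... so they can serve as partial credit
    scores. Ties mean two categories would be collapsed into one score.
    """
    # Round half up, avoiding Python's round-half-to-even behaviour.
    rounded = [int(math.floor(s + 0.5)) for s in estimated_scores]
    distinct = sorted(set(rounded))
    no_gaps = distinct == list(range(len(distinct)))
    return rounded, no_gaps

print(round_category_scores([0.0, 0.9, 2.2, 2.8]))  # supports scores 0..3
print(round_category_scores([0.0, 1.1, 1.3, 2.1]))  # two categories tie
print(round_category_scores([0.0, 0.8, 3.1]))       # gap between 1 and 3
```

In the last case the rounded scores skip 2, so the item would need some
educated re-scoring rather than a mechanical rounding.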
When I ask item writers how they decide on what an item should be marked out
of, they nearly always tell me that more difficult items should have a higher
maximum score. It's easy to see why item difficulties should not be used to
determine item scores. Just think of a simple test of dichotomous items where
each item has a score of 1 (for equal discrimination), yet the items surely
have very different difficulties.
I agree that the 2PL model does not have sufficient statistics for item
parameters, nor the desirable measurement properties of the Rasch model, but
one can make use of the 2PL results to improve how a partial credit item is
scored - whether it should have a maximum score of 5 or 3, or whatever. I
am calling the maximum score of a partial credit item the weight of the
item. Weighting items is essentially what a 2PL model does. That's the
similarity between 2PL and partial credit models.
David, what I referred to as the generalized 2PL model is an extension of the
MCML model originally formulated by Ray Adams and Mark Wilson. In the MCML
model, the score matrix, B, is assigned, not estimated. I extended the MCML
to include the estimation of the score matrix, B. B is a score matrix at the
item category level. I have already added this to a beta version of
ConQuest. I will try to write up some examples to illustrate what we are
discussing here.
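
To make the role of B concrete, here is a minimal sketch for a single item
on one dimension. It assumes the usual published form of the model, in which
the probability of category k is proportional to exp(b_k * theta + a_k . xi),
where b_k is the category score from B, a_k is the category's row of a design
matrix A, and xi is the vector of item parameters. The function name and all
numbers are illustrative assumptions and are not ConQuest output.

```python
import math

def mcml_item_probs(theta, b_scores, a_rows, xi):
    """Category probabilities for one item from its scores and design rows."""
    linear = [b * theta + sum(a_j * x_j for a_j, x_j in zip(a, xi))
              for b, a in zip(b_scores, a_rows)]
    top = max(linear)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in linear]
    total = sum(weights)
    return [w / total for w in weights]

# A 3-category partial credit item with step parameters xi = (d1, d2).
xi = [-0.5, 0.5]
a_rows = [[0, 0], [-1, 0], [-1, -1]]

# Scores assigned in advance (Rasch partial credit scoring).
assigned = mcml_item_probs(0.3, [0, 1, 2], a_rows, xi)

# Scores treated as estimated quantities (hypothetical values).
estimated = mcml_item_probs(0.3, [0.0, 0.8, 2.3], a_rows, xi)
```

With the assigned scores 0, 1, 2 this reduces to the partial credit model.
Replacing them with freely estimated values is what turns the same structure
into the generalized 2PL described above.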
Regards
Margaret