[Rasch] Fan: > Is replicating Fan's research worthwhile? I'm voting no for now...
ici_kalt at yahoo.com
Sun Nov 4 09:29:26 EST 2007
We have discussed many times that the Rasch model is paradigmatic and that IRT tries to describe data. But we have to establish once and for all that the Rasch model IS NOT a part of IRT: even though the models look similar, the background is different. Is it possible to make a signed declaration to this effect? If not, let's sign a declaration saying that IRT and the Rasch model are the same thing, but we must be clear about it (I prefer to say the Rasch model IS NOT a part of IRT). Now, accepting that IRT only intends to describe data, I move on to the second part.
It bothers me to think that CTT and Rasch go in different ways, or even in opposite directions. This is not the case and we all know it; otherwise, how could we accept that the number of right responses is sufficient to calculate the Rasch measure?
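The sufficiency of the raw score can be checked with a small sketch. It assumes the dichotomous Rasch model with known item difficulties; the difficulty values, the two response patterns and the function names are hypothetical, chosen only for illustration. The maximum-likelihood update uses the responses only through their sum, so two very different patterns with the same raw score receive the same measure.

```python
import math

def rasch_p(theta, b):
    """Probability of a right response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def ml_person_measure(responses, difficulties, tol=1e-10, max_iter=100):
    """Newton-Raphson ML estimate of a person measure, item difficulties known.

    The responses enter only through the raw score, which is the
    sufficiency property. Extreme scores have no finite estimate.
    """
    raw = sum(responses)
    if raw == 0 or raw == len(responses):
        raise ValueError("extreme score: no finite ML estimate")
    theta = 0.0
    for _ in range(max_iter):
        probs = [rasch_p(theta, b) for b in difficulties]
        expected = sum(probs)                    # expected raw score at theta
        info = sum(p * (1.0 - p) for p in probs)  # test information at theta
        step = (raw - expected) / info
        step = max(-1.0, min(1.0, step))          # damp large steps
        theta += step
        if abs(step) < tol:
            break
    return theta

# Hypothetical item difficulties in logits, easiest to hardest.
difficulties = [-1.5, -0.5, 0.0, 0.5, 1.5]
pattern_a = [1, 1, 1, 0, 0]   # ordered pattern, raw score 3
pattern_b = [0, 1, 0, 1, 1]   # disordered pattern, raw score 3

theta_a = ml_person_measure(pattern_a, difficulties)
theta_b = ml_person_measure(pattern_b, difficulties)
print(theta_a, theta_b)       # the two measures coincide
```

The two patterns differ in fit, not in measure: the disordered pattern would be flagged by a fit statistic, while the measure itself depends only on the count of right responses.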
What we can say with CTT is very close, or at least quite close, to what we can say with Rasch, and vice versa. But we have to make some distinctions.
Concerning item analysis:
1) Rasch analysis regards item difficulty and fit. We need to improve the fit models, as they depend on N, the population size.
2) CTT regards item difficulty and discrimination. If possible, could everybody please not treat the point-biserial correlation as item discrimination? rpbis is only a Pearson correlation between the persons' measures and the right-wrong dichotomy of the response: a high correlation indicates that the item relates to the remaining items of the set, so it is more closely associated with the validity of the item for measuring the same variable as the set it belongs to. This correlation has nothing to do with discrimination, even if many authors, and many of us, use rpbis to identify discrimination. Earlier definitions of discrimination (say PD) as the difference between subgroups (high and low) are more comprehensible and more correct.
Please note that there is a high correlation between rpbis and PD, so this is the reason why we use rpbis and not PD.
If we accept what I have just said, and if we know that there is a high correlation between CTT parameters and Rasch measures, then this could be the reason why we might use CTT and not Rasch.
But we prefer Rasch measures over CTT, so we should prefer PD over rpbis.
a) The model [CTT]:(Difficulty-PD) is far better than the model [CTT2]:(Difficulty-rpbis), no doubt.
b) The model [Rasch]:(measure-FIT) is much better than the model [CTT], no doubt.
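To make the two CTT indices concrete, here is a minimal sketch of how rpbis and PD could be computed for one item. Several things are my assumptions and not part of the argument above: the small 0/1 data matrix is made up, rpbis is taken against the total score including the item itself, and the high and low groups are the top and bottom 27% by total score (a common convention, but any cut could be used).

```python
import math

def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def item_indices(data, item, fraction=0.27):
    """Return (difficulty p, rpbis, PD) for one item of a 0/1 data matrix.

    rpbis: correlation between the item response and the total score.
    PD: proportion right in the high group minus proportion right in the
    low group, the groups being the top and bottom `fraction` of persons.
    """
    totals = [sum(row) for row in data]
    responses = [row[item] for row in data]
    p = sum(responses) / len(responses)
    rpbis = pearson(responses, totals)
    order = sorted(range(len(data)), key=lambda i: totals[i])  # low to high
    k = max(1, round(fraction * len(data)))
    low, high = order[:k], order[-k:]
    pd = sum(responses[i] for i in high) / k - sum(responses[i] for i in low) / k
    return p, rpbis, pd

# Hypothetical responses: 10 persons (rows) by 4 items (columns).
data = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [1, 0, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
]

p, rpbis, pd = item_indices(data, item=0)
print(p, rpbis, pd)
```

With ties in the total score the group membership depends on how the ties are ordered, so on real data the cut should be chosen with some care.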
What do we intend to do with a test? Something that follows these 3 properties according to Guttman: [UOI]: unidimensional, ordered and inclusive (we are measuring a single dimension, items can be ordered along a continuum, and if a person can answer an item of difficulty D1, he can probably answer another item easier than D1).
We are using the [UOI] model, under a probabilistic approach with [Rasch], or a deterministic one if we follow [CTT]. If we have a test with the three [UOI] properties, then [CTT] and [Rasch] will produce very close results. No doubt. This has nothing to do with magic, with the reliability of the test, or with chance: we are following the same model, so there are no surprises regarding the models.
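One simple way to see how far a response vector departs from the deterministic reading of the "ordered and inclusive" properties is to count Guttman errors, i.e. pairs of items where the easier one was failed and the harder one was answered right. This is only a sketch: the function name, the item difficulties and the two patterns are hypothetical.

```python
def guttman_errors(responses, difficulties):
    """Count pairs (easier item wrong, harder item right) for one person."""
    order = sorted(range(len(difficulties)), key=lambda i: difficulties[i])
    ordered = [responses[i] for i in order]   # responses from easiest to hardest
    errors = 0
    for i in range(len(ordered)):
        for j in range(i + 1, len(ordered)):
            if ordered[i] == 0 and ordered[j] == 1:
                errors += 1
    return errors

# Hypothetical item difficulties, easiest to hardest.
difficulties = [-1.5, -0.5, 0.0, 0.5, 1.5]
perfect = [1, 1, 1, 0, 0]      # conforms to the deterministic pattern
disordered = [0, 1, 0, 1, 1]   # same raw score, many inversions

print(guttman_errors(perfect, difficulties))
print(guttman_errors(disordered, difficulties))
```

A count of zero is the deterministic ideal; the probabilistic approach expects a few errors, mostly on items close to the person's measure.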
But if we produce a test without a model behind it, paying no attention to whether we have one dimension, whether a person answers easy or hard items, or how the difficulties are distributed, every model will provide a different picture of what we are measuring.
So what is the benefit of the Rasch model? The Rasch model helped to make clear the concept of measure as a probabilistic approach to Guttman's concept. Once this became clear, we may use [CTT] or [Rasch] and we will have convergent results. This concept of measure was not clear before Rasch, this is true. This concept is so strong that even IRT may function and provide results convergent with [Rasch] and [CTT].
If you don't follow this concept, not only is [CTT] different from [Rasch], but it appears that ONLY IRT can provide results for interpreting what is considered "THE REALITY". In my country, many people moved from [CTT2] to IRT because they need flexibility and more description, even if they cannot measure. It is not a matter of the quality of the items, of measurement, or of the definition of a metric; it is the need to explain "what I did with the test". They do not know whether they are producing a bad measure, and they don't care; what they need is to explain what they did. They need to show that there is a model that fits their data, because for them good fit means "good measurement". This is the wrong idea and the bad message that IRT provides. In the Rasch model, good fit implies good measurement; in IRT, good fit only implies good fit of a function to my data. If the data are wrong, the function represents wrong data. How can I produce good measures with wrong functions over wrong data? I would like to think that one day some people will understand this.
In addition, IRT is better for many people because it provides the pseudo-guessing parameter; this has been discussed several times by Ben Wright, so I shall not go over it again. Also, IRT provides item discrimination and the Rasch model cannot.
I have one idea that I would like to share here. I've proposed several times to see discrimination in a different way than IRT does. No paper has been accepted the way I propose it (even though there is a Rasch Measurement Transaction with my proposal, at least partially). I shall insist on it even if the idea is discarded. My idea is not to define discrimination as a function of the slope of the curve at the point of inflection. Of course, with that definition, the Rasch model has a single shape and its slope (and therefore its "discrimination") is unique for this model; this doesn't occur with 2PL or 3PL, which have the flexibility to show different slopes.
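For reference, the slope-based definition can be written out as follows. This uses the usual logistic forms without the 1.7 scaling constant; the symbols (theta for the person measure, b for item difficulty, a for the slope parameter, c for the pseudo-guessing parameter) are the conventional ones and are my notation, not taken from the discussion above.

```latex
% Rasch model: one curve shape, so one slope at the inflection point
P(\theta) = \frac{e^{\theta - b}}{1 + e^{\theta - b}}, \qquad
\frac{dP}{d\theta} = P(\theta)\,\bigl(1 - P(\theta)\bigr), \qquad
\left.\frac{dP}{d\theta}\right|_{\theta = b} = \frac{1}{4}

% 2PL: the slope at the inflection point varies with a
P(\theta) = \frac{1}{1 + e^{-a(\theta - b)}}, \qquad
\left.\frac{dP}{d\theta}\right|_{\theta = b} = \frac{a}{4}

% 3PL: the slope also depends on c
P(\theta) = c + \frac{1 - c}{1 + e^{-a(\theta - b)}}, \qquad
\left.\frac{dP}{d\theta}\right|_{\theta = b} = \frac{a\,(1 - c)}{4}
```

Under this definition every Rasch item has the same value, 1/4, whatever its difficulty, which is why the slope cannot distinguish among Rasch items.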
My proposal is to define item discrimination the same way we do in CTT, i.e. as the difference between the higher and lower groups. Just look at the Rasch curve: define your cut-off point for higher and lower persons, then obtain the difference in the area below the curve between the limits defined by the cut-off point and the extreme measures of the persons. The difference of the areas will produce different values for the Rasch curve; you will not have one single value, please try it. If we define discrimination this way, we may report what we are intending to show: that higher people respond better than lower people, and you get the amount of that difference according to the distribution of people. I know this is not an intrinsic property of the curve; discrimination is not intrinsic.
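For anyone who wants to try it, here is a rough sketch of the area calculation. It is one possible reading of the proposal, not a definitive formulation: the function names, the person range of -3 to 3 logits, the cut-off at 0 and the three item difficulties are all hypothetical, and the optional `normalize` flag (dividing each area by its interval width) is my own addition for the case where the two groups span unequal ranges. The area under the Rasch curve has the closed form ln(1 + exp(theta - b)), so no numerical integration is needed.

```python
import math

def softplus(x):
    """Numerically stable ln(1 + exp(x)), the antiderivative of the Rasch curve."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def rasch_area(b, lo, hi):
    """Area under the Rasch curve of an item with difficulty b between lo and hi."""
    return softplus(hi - b) - softplus(lo - b)

def area_discrimination(b, theta_min, cut, theta_max, normalize=False):
    """Area above the cut-off minus area below it, for one item.

    theta_min and theta_max are the extreme person measures; cut is the
    chosen boundary between the lower and higher groups.
    """
    upper = rasch_area(b, cut, theta_max)
    lower = rasch_area(b, theta_min, cut)
    if normalize:
        # Assumption: turn each area into a mean probability over its interval.
        upper /= (theta_max - cut)
        lower /= (cut - theta_min)
    return upper - lower

# Hypothetical person range and cut-off, three hypothetical item difficulties.
theta_min, cut, theta_max = -3.0, 0.0, 3.0
for b in (-2.0, 0.0, 2.0):
    print(b, area_discrimination(b, theta_min, cut, theta_max))
```

Items with different difficulties give different values even though they share the same Rasch curve shape, and the values change again if the cut-off or the person range changes.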
I don't have the Fan paper, but many other people say similar things about CTT and the Rasch model, without concern for the model behind them. Also, many people suggest that IRT is better than Rasch.
I think we have to modify some ideas regarding measurement, to define what we consider the CTT and Rasch models to be, to set a new definition for item discrimination, to limit the use of rpbis to what it was created for, and to improve the FIT models to have more robust formulations. There are a lot of things to do!
FAMILIA DE PROGRAMAS KALT.
Mariano Jiménez 1830 A
Col. Balcones del Valle
78280, San Luis Potosí, S.L.P. México
TEL (52) 44-4820 37 88, 44-4820 04 31
FAX (52) 44-4815 48 48
web page (in Spanish AND ENGLISH): http://www.ieesa-kalt.com