heene at psy.lmu.de
Wed Oct 31 06:46:31 EST 2007
Hello to all and hello especially to Paul,
Well, let's open that can of worms and I will put my head on the block.
Let's start with the Fan paper. The fact that item parameters from classical test theory (CTT) and IRT models are highly correlated is not new. Nunnally (Psychometric Theory, 1978), for example, gives a formula for approximating item slopes from the "classical" item discrimination index, and "classical" item difficulties show a strong monotone relationship with item parameters from IRT models or the Rasch model. So why worry about Rasch versus CTT? -Because the comparison of item and person parameters is a red herring.
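Just to make the point concrete: one common version of such an approximation (I don't have Nunnally's exact notation at hand, so take the function name and form as my illustration) maps an item-total biserial correlation r to a normal-ogive slope a = r / sqrt(1 - r^2). Since the mapping is strictly monotone, high correlations between CTT and IRT item parameters are almost built in:

```python
import math

def slope_from_discrimination(r):
    """Approximate an IRT item slope from the classical item
    discrimination index (item-total biserial correlation):
    a = r / sqrt(1 - r^2).  Illustrative sketch, not Nunnally's
    exact notation."""
    return r / math.sqrt(1.0 - r ** 2)

# The mapping is strictly monotone, so items ordered by classical
# discrimination stay in the same order under the IRT slope:
for r in (0.3, 0.5, 0.7):
    print(f"r = {r:.1f}  ->  a = {slope_from_discrimination(r):.3f}")
```

Which is exactly why a high correlation between the two sets of parameters tells us so little.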
As far as I see it, the crucial point in that article is that Fan admitted that, although parameters from item response models and those from CTT are highly correlated, almost thirty percent of the items didn't fit the Rasch model (see also Trevor's book for a detailed discussion of that issue, especially chapter 5). So as long as there is no fit to the Rasch model, the interpretation of the data is not warranted, because the sum score isn't a sufficient statistic. Proponents of CTT always bother me by saying: "The Rasch model is too restrictive". But then they use the unweighted sum score. So the basic "logic" is: shrugging off the model explicitly but applying it implicitly.
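The sufficiency point is easy to demonstrate numerically: under the Rasch model, the conditional probability of a response pattern given its raw score does not depend on the person parameter at all. A minimal sketch (the item difficulties below are arbitrary illustrative values, not from any real test):

```python
import itertools
import math

def rasch_p(theta, b):
    """Rasch probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def pattern_prob(pattern, theta, difficulties):
    """Probability of a full response pattern for one person."""
    prob = 1.0
    for x, b in zip(pattern, difficulties):
        p = rasch_p(theta, b)
        prob *= p if x else (1.0 - p)
    return prob

def conditional_on_score(pattern, theta, difficulties):
    """P(pattern | raw score): under the Rasch model this is
    independent of theta -- the raw score is sufficient."""
    score = sum(pattern)
    same_score = [p for p in itertools.product((0, 1), repeat=len(pattern))
                  if sum(p) == score]
    denom = sum(pattern_prob(p, theta, difficulties) for p in same_score)
    return pattern_prob(pattern, theta, difficulties) / denom

difficulties = [-1.0, 0.0, 1.5]   # arbitrary item difficulties
pattern = (1, 1, 0)               # a pattern with raw score 2

# The same conditional value comes out for very different abilities:
for theta in (-1.0, 0.0, 2.0):
    print(theta, round(conditional_on_score(pattern, theta, difficulties), 6))
```

Conditioning on the raw score removes theta entirely, and that is precisely what licenses the unweighted sum score; with unequal slopes (a 2PL, say) the same computation would still depend on theta.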
So why don't they test the data-model fit? I guess it's because falsification is no longer a virtue in psychology. For me, the main advantage of the Rasch model ("besides" specific objectivity and unidimensionality) is that it is a falsifiable model. I don't want to be arrogant, but I guess Rasch modelling could have had an impact on psychological theories if more researchers had been willing to falsify at least some of their constructs. As Guttman (1971, Psychometrika, 36(4)) puts it:
"Scale analysis has been criticized by some on interesting grounds that it does not always 'succeed' like item analysis. Scales don't always exist, but people have come to believe that they MUST exist. And some researchers are frustrated if they can't construct scores with which to continue to be [...]"
This point is also stressed in Andrich's articles (2002, 2004) about the resistance to the Rasch model.
And now we come to the can of worms... Yes, I agree with you that applying the Rasch model doesn't make the criterion validities of our tests better. And I guess they would even be lower if the scales met the requirements of the Rasch model, especially unidimensionality. As far as I can see from my Rasch analyses, most of our tests (including my own!) are hodge-podge tests and hence multidimensional, which increases criterion validities. From a sheer practical standpoint one could say that this is enough, but from a scientific standpoint one could argue:
"Test results and numerical tables are further accumulated; consequent actions affecting the welfare of persons are proposed, and even taken, on the grounds of - nobody knows what" (Spearman, 1927, p. 15).
"I have the feeling they are discovering that the real problem is not with the technology of assessment, but with the very nature of the constructs we try to assess."
I absolutely agree with you. And I think that's what Andrich (2002) means by saying: "Discovering the problem after it is solved". I also keeled over when I read the following lines of Roskam (1985):
"These considerations bring me to the question what 'specific objectivity' or 'sample independence' means. This property of the Rasch-family implies that subject parameters can be estimated independent of the (unknown) item parameters, and independent of the selection of items from a universe of content. But it can also be said that a set of items is Rasch homogeneous to the extent that all items have equal logistic regressions on a single common factor. The latter is a requirement following from specific objectivity, but I wonder if the reverse is true: do equal logistic regressions imply specific objectivity (especially if fit to the Rasch model is obtained by eliminating deviant items)? If a set of items satisfies the Rasch model, the only thing which can be said is that the association among the items can be explained by their particular regression on a single common latent variable."
-That's exactly the point: Deleting items contradicts the idea of specific objectivity.
Or, to put it more bluntly: "To throw away items that do not 'fit' unidimensionality is like throwing away evidence that the world is round" (Guttman, 1977).
-Hence, the resulting "unidimensionality" and "specific objectivity" are euphemisms.
And this reminds me of the words of Roskam (1985):
"...can we develop a process model for problem solving which generates, e.g. the Rasch model? As far as I can see, it is extremely difficult, if not impossible to devise a stochastic process model, which generates a logistic response model and contains both subject and item parameters.
In this perspective, it seems rather unlikely that the same process model would be valid for attitudes as well as for abilities, and this makes me sceptical about the general validity of any latent trait model."
So, again, I finally argue that taking the Rasch model and its implications more seriously would have led to more insights and "wow" effects. And perhaps it would have led us to neural networks earlier. But too often it is (mis-)used as a mere tool.
But, in my humble opinion, the Rasch model was too often used "as an envelop- or cover-model to connect a cognitive process with a probabilistic response mechanism" (...) but "has not, however, linked its own stochastic models with those of cognitive processes" (Roskam, 1985).
But perhaps I am completely mistaken, and I don't want to tread on anybody's toes here. As Trevor knows, I am sometimes trapped in technical details and tend to lose the overview...
You wrote: ".. an entire new area of psychological science investigation "connectionist cognitive psychology", a neurophysiological model for developmental brain networks and brain function, artificial intelligence, computational biology, and literally physical working models of the development of human intelligence, stock market trading algorithms, seriously accurate market research/brand analysis prediction models, security camera person-feature detection, biometric sensors, handwriting, and speech detection, etc, etc..."
I am no expert in neural networks, I am just a beginner, but as far as I can see, neural networks, as efficient as they are, still have their problems. And, most importantly, they are also based on numerous prior assumptions about the nature of learning and development (e.g., the number of layers, the extent of connectivity between layers, the number of units, initial activation values, etc.). I also doubt whether neural networks are really different from other "data modeling techniques", and I wonder how well their results generalize, given the risk of capitalizing on chance. But as I have already said, I am just a beginner.
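To illustrate what I mean by prior assumptions, here is a toy sketch (all names and numbers are mine, purely illustrative): even the tiniest network requires a stack of choices made before any data are seen, and its output depends on the arbitrary random start.

```python
import math
import random

def tiny_net(x, seed, n_hidden=3, init_scale=1.0):
    """One forward pass through a 2 -> n_hidden -> 1 network whose
    weights are *only* the random initial values.  Every argument
    here (layer structure, unit count, initialization scale) is a
    prior modelling choice, not something learned from data."""
    rng = random.Random(seed)
    sigmoid = lambda z: 1.0 / (1.0 + math.exp(-z))
    # random initial weights: input->hidden and hidden->output
    w1 = [[rng.uniform(-init_scale, init_scale) for _ in x]
          for _ in range(n_hidden)]
    w2 = [rng.uniform(-init_scale, init_scale) for _ in range(n_hidden)]
    h = [sigmoid(sum(w * xi for w, xi in zip(ws, x))) for ws in w1]
    return sigmoid(sum(w * hi for w, hi in zip(w2, h)))

# Same input, different random start -> different output:
print(tiny_net((1.0, 0.0), seed=1))
print(tiny_net((1.0, 0.0), seed=2))
```

Of course training would pull these outputs toward the data, but which solution one lands on can still depend on all those prior choices, and that is part of my worry about generalization.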
Andrich, D. (2002). Understanding Rasch measurement: Understanding resistance to the data-model relationship in Rasch's paradigm: A reflection for the next generation. Journal of Applied Measurement, 3(3), 325-359.
Andrich, D. (2004). Controversy and the Rasch model: A characteristic of incompatible paradigms? Medical Care, 42(1, Suppl.), I-7.
Roskam, E. E. (1985). Current issues in item response theory. In E. E. Roskam (Ed.), Measurement and personality assessment (pp. 3-20). Amsterdam: North Holland.
Guttman, L. (1977). What is not what in statistics. The Statistician, 26(2), 81-107.
Spearman, C. (1927). The abilities of man. London: Mac Millan.
Wright, B. D. (1977). Misunderstanding the Rasch model. Journal of Educational Measurement, 14(2), 97-116.