[Rasch] Noise in observations ...

Paul Barrett pbarrett at hoganassessments.com
Mon Nov 5 06:52:29 EST 2007




	
________________________________

	From: rasch-bounces at acer.edu.au
[mailto:rasch-bounces at acer.edu.au] On Behalf Of Rense
	Sent: Sunday, November 04, 2007 8:40 PM
	To: 'Mike Linacre (RMT)'; rasch at acer.edu.au
	Subject: RE: [Rasch] Noise in observations ...
	
	
	
	...
	Rasch scaling adds linear measures, straightforward equating,
and superior fit statistics to an otherwise blind system. This is far
more important than the coincidence that raw sums and linear measures
correlate highly in the mid range.

Hello Rense
 
There is a clear contrast between Rasch, IRT, and the use of sum scores
(I'm purposely avoiding the term CTT, as you need no "formal test
theory" to work with sum scores). That distinction seems best described
in terms of how each approach treats the issue of validity.
 
The Rasch and IRT approaches both propose a data model; philosophically,
the Rasch model may be considered distinct from IRT, but in the end the
researcher still has to test for model fit before the claims they might
wish to make about their measures can be regarded as valid. So, much
effort goes into isolating items which do indeed fit the model.
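For concreteness, here is a minimal sketch, in Python with numpy, of
what "testing for model fit" usually amounts to for the dichotomous
Rasch model - the conventional outfit/infit mean-square statistics
computed from standardized residuals. The function names, and the
assumption that person and item parameters have already been estimated,
are mine, purely for illustration:

import numpy as np

def rasch_p(theta, b):
    """Rasch probability of a correct response for every person-item pair."""
    # theta: person abilities, shape (n_persons,); b: item difficulties, shape (n_items,)
    return 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))

def fit_mean_squares(X, theta, b):
    """Outfit and infit mean-squares per item for a 0/1 response matrix X."""
    P = rasch_p(theta, b)             # model-expected scores
    W = P * (1.0 - P)                 # binomial variances
    Z2 = (X - P) ** 2 / W             # squared standardized residuals
    outfit = Z2.mean(axis=0)                              # unweighted mean-square
    infit = ((X - P) ** 2).sum(axis=0) / W.sum(axis=0)    # information-weighted
    return outfit, infit

Mean-squares far from 1.0 are conventionally read as misfit - though, as
noted below, how well such statistics detect violations of conjoint
additivity is exactly what is in dispute.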
 
As Moritz pointed out in an earlier message, the problem for Rasch is
that the "measurement philosophy" requires that the model fit the data
as expected by the axioms of conjoint additivity, yet the statistical
tests of fit employed seem to be unreliable indicators of this kind of
fit. There seems to be an uncomfortable silence about Moritz's email and
this issue in general, as there is with Michell's 2004 arguments
(Michell, J. (2004). Item Response Models, pathological science, and the
shape of error. Theory and Psychology, 14, 1, 121-129). If anybody would
like a pdf of this paper to see his arguments for themselves, please
email me "back-channel/privately".
 
Either way, a great deal of effort goes into attaining a test which
"fits" the data model as specified. PCA/residual analyses are then
utilized as a method of determining whether the resultant scale has
removed much of the covariance between the test items.
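Again purely as an illustration of the mechanics, a sketch in Python of
the PCA-of-residuals check: strip out the Rasch-modelled part of each
response and ask whether the items still share covariance. All names
here are hypothetical, not anyone's production code:

import numpy as np

def residual_eigenvalues(X, theta, b):
    """Eigenvalues (largest first) of the item-by-item residual correlations."""
    P = 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))  # Rasch expectations
    Z = (X - P) / np.sqrt(P * (1.0 - P))    # standardized residuals
    R = np.corrcoef(Z, rowvar=False)        # correlations among item residuals
    return np.sort(np.linalg.eigvalsh(R))[::-1]

A first eigenvalue well above the commonly quoted rule-of-thumb value of
about 2 suggests the items still share structure that the single Rasch
dimension did not absorb.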
 
But note that none of this activity is concerned with whether the test
actually measures anything of interest. The focus is on model fit and
scaling. The relation between the test scores and perhaps theoretically
or pragmatically important outcomes comes second, unless the items were
generated as part of a strong theory of a construct.
 
The "blind" sum score approach assumes "additivity" as a "convenience".
The "validity" question is that concerned solely with "the relation
between the test scores and perhaps theoretically or pragmatically
important outcomes". this may be either via linear, ordinal, or
classification means. In reality, this whole approach is "fuzzy" - no
real precision need be assigned those scores as there is no extant
psychological theory which would claim that any psychological variable
can be measured with the precision of a physical variable. I admit, some
in CTT will assume that claim (i.e. there are such things as precise,
"true scores"), but this is patent scientific nonsense once one
considers the neuroscience of the human brain, and the two-way causal
processes within which psychological variables might be observed (in
reality, sum-scores are simply a placeholder for CTT to work,
statistically speaking).
 
So, the sum-score approach is blind only to the extent that the
causality of those scores is not directly or exactly specified, and the
scores themselves are treated more as simple orders than as
equal-interval numbers - even if the methods of analysis might be linear
(for pragmatic convenience only). Other than that, its "anchors" are
more criterion-based than measurement-theory-based. So an awful lot of
activity has to go into defining the kinds of criteria which would
justify the "scale" as useful.
 
Part of me feels that the Lexile system has actually shown "the way"
this works. You acquire theoretically guided empirical observations of
the phenomenon of interest; these are cross-related to important
outcomes or theoretical expectations. You then note that there is a
"regularity" between magnitudes on one variable and another; you define
a formal measurement model for this regularity, test it, and show that
the formal model will account for those empirical relations.
 
If my brief characterization of Andrews' detailed messages is correct,
then such work not only satisfies Michell's (2004) criticisms, but also
would seem to be an exemplar of my "sum-score with criterion-validity"
approach - i.e. you spend more time acquiring empirical relations than
you do scaling, at first. Then, after you have clearly identified the
empirical relations between your "variable" and coherent theoretical or
pragmatic outcomes, you might choose to model these with an appropriate
mathematical function.
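A sketch of that last step, again with purely synthetic stand-in
numbers: given an established empirical regularity between a
theory-derived magnitude and an observed one, fit a candidate formal
model and ask how much of the relation it accounts for.

import numpy as np

rng = np.random.default_rng(1)                    # synthetic stand-ins, illustration only
theory = rng.uniform(200, 1200, 60)               # hypothetical theory-derived magnitudes
observed = 0.8 * theory + rng.normal(0, 40, 60)   # hypothetical empirical counterparts

slope, intercept = np.polyfit(theory, observed, 1)   # candidate linear model
pred = slope * theory + intercept
ss_res = ((observed - pred) ** 2).sum()
ss_tot = ((observed - observed.mean()) ** 2).sum()
print("proportion of variance accounted for:", 1.0 - ss_res / ss_tot)

Only once a model survives this kind of test would one start treating
its outputs as "measures" in any strong sense.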
 
The problem for me is that in my world of individual-difference
variables (personality, intelligence, motivation, values), there is
insufficient empirical evidence about magnitudes of these "variables"
and their relation to criterion outcomes at each level/order of
magnitude. Sure, there is the usual smorgasbord of feeble or
statistically "corrected" correlations between magnitudes and outcomes -
but these barely constitute evidence for any kind of formal model to be
fitted to the data as yet.
 
I am reminded here of two approaches to science, the deductive vs. the
abductive (Haig, B. (2005). An abductive theory of scientific method.
Psychological Methods, 10, 4, 371-388). Rasch seems more "deductive"
than a sum-score approach ... but maybe this is too far a stretch ...
 
Regards .. Paul

Paul Barrett, Ph.D.

2622 East 21st Street | Tulsa, OK  74114

Chief Research Scientist

Office | 918.749.0632  Fax | 918.749.0635

pbarrett at hoganassessments.com

hoganassessments.com <http://www.hoganassessments.com/>


