[Rasch] Fan: > Is replicating Fan's research worthwhile? I'm voting no for now...

Paul Barrett pbarrett at hoganassessments.com
Sun Nov 4 14:33:06 EST 2007




________________________________

	From: rasch-bounces at acer.edu.au
[mailto:rasch-bounces at acer.edu.au] On Behalf Of Nianbo Dong
	Sent: Saturday, November 03, 2007 7:16 PM
	To: Trevor Bond; Lang, William Steve; Jim Sick; Rasch List
	Subject: Re: RE: [Rasch] Fan: > Is replicating Fan's research
worthwhile? I'm voting no for now...
	
	
	This is an interesting discussion.
	 
	For your information: Courville's (2004) dissertation, "An
empirical comparison of item response theory and classical test theory
item/person statistics", can be downloaded from the link below:
	http://txspace.tamu.edu/handle/1969.1/1064
	 

 
Hello Nianbo
 
Thanks for this reference - this is some thesis .. and the conclusions
are again similar to Fan's. 
 
So Tailored Testing, CAT, and indeed score equating (for Mark!) would
all, apparently, work equivalently whether based upon item difficulties
and person abilities computed from CTT statistics or from Rasch ones?
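
For what it's worth, the flavour of that result is easy to reproduce by
simulation. Below is a minimal sketch of my own (Python/numpy; the
crude joint maximum-likelihood loop is purely illustrative, and nothing
here comes from Fan's or Courville's actual analyses): simulate
Rasch-conforming responses, take the CTT item difficulty as the
proportion correct, estimate the Rasch item difficulties, and correlate
the two.

    import numpy as np

    rng = np.random.default_rng(0)

    # Simulate dichotomous responses from a Rasch model
    n_persons, n_items = 500, 20
    theta = rng.normal(0.0, 1.0, n_persons)    # person abilities
    b_true = np.linspace(-2.0, 2.0, n_items)   # item difficulties
    prob = 1.0 / (1.0 + np.exp(-(theta[:, None] - b_true)))
    X = (rng.random((n_persons, n_items)) < prob).astype(int)

    # CTT item difficulty: proportion correct per item
    ctt_p = X.mean(axis=0)

    # Crude joint maximum-likelihood Rasch estimation:
    # alternate Newton steps for person and item parameters
    th, bh = np.zeros(n_persons), np.zeros(n_items)
    for _ in range(50):
        P = 1.0 / (1.0 + np.exp(-(th[:, None] - bh)))
        th += (X - P).sum(axis=1) / (P * (1 - P)).sum(axis=1)
        th = np.clip(th, -6, 6)  # zero/perfect raw scores otherwise diverge
        P = 1.0 / (1.0 + np.exp(-(th[:, None] - bh)))
        bh -= (X - P).sum(axis=0) / (P * (1 - P)).sum(axis=0)
        bh -= bh.mean()          # fix the origin of the logit scale

    # Easy item = high p-value but low difficulty, so expect a strong
    # negative correlation
    print(np.corrcoef(ctt_p, bh)[0, 1])

With complete data the item raw score is a sufficient statistic for
Rasch difficulty, so the two sets of values are perfectly
rank-correlated by construction; the Pearson correlation falls short of
-1 in magnitude only through the nonlinearity of the logit. Which is, I
suspect, much of why Fan and Courville found what they found.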
 
The measurement thesis and "promise" which sit behind Rasch scaling
(and IRT to a lesser extent) are powerful ones - but if the practical
reality is that "it just doesn't matter", then this raises the
question: had the same amount of energy and innovative thinking gone
into improving the measurement, assessment, and predictive validity of
the kinds of attributes we might wish to measure/assess, would that
have yielded bigger dividends by now, even if it meant there would
never have been a Winsteps, RUMM, or BILOG/MULTILOG?
 
Whilst some might feel these are "silly" questions, they do matter, as
companies like my own will have to decide whether to invest substantial
finances in the methodologies of the "New Rules of Measurement" or
simply continue with their current approach, which might be called
"problem-oriented" and "predictive validity first".
 
I'm not advocating that anyone stop what they are doing - but given
Michell (2004), and the charge that Rasch and IRT models depend upon
sufficient error being included in the responses (the more error, the
better the models fit), someone not schooled in the Rasch philosophy
might well ask why a tool for measurement works only when there is
enough "error" in the mix.
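
That charge can even be made concrete with the little sketch above.
Replace the stochastic simulation line with a deterministic, errorless
(perfect Guttman) response pattern - again purely my own toy
illustration:

    # Errorless (Guttman) data: correct iff ability exceeds difficulty
    X = (theta[:, None] > b_true).astype(int)

The joint maximum-likelihood loop then never settles of its own accord:
with no error in the responses, the likelihood keeps increasing as the
person and item estimates stretch apart, and they drift outward until
the clipping bound halts them. Finite, stable estimates appear only
when the data contain some stochastic "error".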
 
Unless, of course, as Mike has previously indicated, magnitudes of the
kinds of psychological attributes we hope to measure can only be
measured if their observations contain sufficient noise (as per some
examples from astrophysics, I believe). But that is some claim to make,
and it goes against almost every tenet of science, which is devoted to
reducing error in observations.
 
Anyway, just some more musings ... I can't help probing the limits of my
understanding and others' on these issues! 
 
Regards .. Paul 
 
Michell, J. (2004). Item response models, pathological science, and the
shape of error. Theory & Psychology, 14(1), 121-129.

Paul Barrett, Ph.D.

2622 East 21st Street | Tulsa, OK  74114

Chief Research Scientist

Office | 918.749.0632  Fax | 918.749.0635

pbarrett at hoganassessments.com

hoganassessments.com