[Rasch] Re-calibration Procedure

ranganaths ranganath.s at excelindia.com
Wed Jun 23 14:47:52 EST 2010

Hello Sir,


            Thanks for introducing me to the equating methods. I am reading
a white paper on "IRT Equating Methods" which says:

"Suppose that two separate LOGIST calibrations using the 3PL model have 20
items in common. The mean and the SD of the item difficulty estimates for
the 20 items in the first calibration are .76 and 1.06 while, in the second
calibration, they are .43 and .97 respectively. The parameters of the linear
transformation can be determined as follows:

                        (b1 - 0.76)/1.06  =  (b2 - 0.43)/0.97
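
Solving the quoted equality for b1 gives a linear transformation of the form
b1 = A*b2 + B, which is often called the mean/sigma method. The following is a
minimal sketch of that arithmetic, using only the four summary statistics
quoted above. The variable and function names are illustrative and do not come
from the white paper.

```python
# Summary statistics of the 20 common items, as quoted above.
mean1, sd1 = 0.76, 1.06   # first calibration
mean2, sd2 = 0.43, 0.97   # second calibration

# (b1 - mean1) / sd1 = (b2 - mean2) / sd2   =>   b1 = A * b2 + B
A = sd1 / sd2
B = mean1 - A * mean2


def to_first_scale(b2):
    """Express a difficulty from the second calibration on the first scale."""
    return A * b2 + B


print(f"A = {A:.4f}, B = {B:.4f}")
print(f"b2 = 0.43 maps to b1 = {to_first_scale(0.43):.2f}")
```

With these figures A is about 1.093 and B about 0.290, so the common-item mean
of the second calibration (.43) maps back onto the common-item mean of the
first (.76).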



            That is the case where the same 20 items are administered to 2
groups and the 2 sets of values are equated to bring them onto the same scale.
But suppose we have, say, 100 questions with item parameters already assigned,
and we take 2 of these questions to administer to the examinees. I have also
read that, in statistics, the larger the sample size, the more accurate the
estimated value. We have estimated the item parameters for these 2 questions
previously, and we now have more data to add in order to obtain more
precision, or rather to re-calibrate, adjusting the item parameters of the 2
questions to make them more precise. If re-calibration is the case, what
exercise should be done to make the estimation more precise with more data on
each



Rangnath S



From: John Barnard (EPEC) [mailto:JohnBarnard at bigpond.com] 
Sent: Monday, June 21, 2010 11:05 AM
To: 'ranganaths'; rasch at acer.edu.au
Subject: RE: [Rasch] Re-calibration Procedure


Dear Ranganath


This is quite a mouthful! It is not only for CAT, but for all serious item
banking work. If you use classical statistics and calculate the item
difficulty and discrimination (point-biserial correlation) you will have
sample-dependent information. These values can change significantly
if you administer the same item to another cohort in another test. To
counter this, equating is needed. Although classical equating methods yield
some results, modern test theory is much more robust and sophisticated.


After administration of a test you do a calibration, i.e. derive item
difficulty estimates and person ability estimates. Depending on which model
you use, you will get one, two or three parameters for each item, and ability
estimates relative to these on the same scale. Be careful: Rasch usually
standardises on item difficulty and IRT models on person ability. You cannot
simply cross over to another model once you have calibrated one data set.
The problem is that this scale is "unique" so if you calibrate some items
with some other items in a second administration, the common items will have
different parameters. An equating process is required to get them on a
common scale. This is how you build an item bank on one scale so that you
can use any subset of items to obtain comparable ability estimates. You can
see that it is not simply a process of taking the average!
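
As an illustration of putting a second calibration onto the bank scale through
common items, here is a minimal sketch for the Rasch case, where the slope is
fixed at 1 and only a shift has to be estimated. All difficulty values and
item labels below are hypothetical, made up for the example.

```python
from statistics import mean

# Hypothetical Rasch difficulties (logits) from two separate calibrations.
bank = {"Q1": -0.50, "Q2": 0.20, "Q3": 1.10}                   # bank scale
new_run = {"Q1": -0.90, "Q2": -0.20, "Q3": 0.70, "Q4": 0.35}   # second run

# Items that appear in both calibrations carry the link.
common = sorted(set(bank) & set(new_run))

# Rasch linking: the shift that makes the common-item means agree.
shift = mean(bank[i] for i in common) - mean(new_run[i] for i in common)

# Every item of the second run, including the new Q4, moves by that shift.
linked = {item: b + shift for item, b in new_run.items()}

for item in sorted(linked):
    print(item, round(linked[item], 2))
```

The common items come out with different raw values in the two runs, yet after
the shift they agree with the bank, and the new item Q4 lands on the same
scale.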


CAT is a different ball game - it is a sophisticated application which
requires items to be on a common scale. It is an efficient way to administer
fewer items without compromising precision. In "conventional" testing, if you
want to compile different tests and compare abilities (performance), the
items in the bank you use must have been equated to a common scale or you
have to do the equating afterwards with common items, people or an external
exercise. (We actually talk about the linking of items and equating of






Prof John J Barnard (DEd;PhD;EdD)
Executive Director: EPEC Pty Ltd
CEO: CAT Measures Pty Ltd
ASC: Asia, Africa and Australia
Honorary Faculty UCT; Adj. USyd


-----Original Message-----
From: rasch-bounces at acer.edu.au [mailto:rasch-bounces at acer.edu.au] On Behalf
Of ranganaths
Sent: Monday, 21 June 2010 3:13 PM
To: rasch at acer.edu.au
Subject: [Rasch] Re-calibration Procedure



            It is well known that in the case of CAT, for an item to be
included in the item bank, it needs to be calibrated, and this happens over a
period of time as the item is administered in various tests. Say, for example,
we have administered the same question Q1 in test1, test2, ..., testn. The
item Q1 gets item parameter values a1, b1 and c1 in test1, the response vector
for Q1 in test1 being v1. In the successive tests, should the response vectors
(v1, v2, ..., vn) for the same item Q1 be merged with the previous test
responses to get the calibrated values of a1, b1 and c1?
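
To make the idea of merging response vectors concrete, here is a minimal
sketch of re-estimating one item's difficulty from pooled responses. It makes
two loud assumptions that are not in the question: it uses the Rasch (1PL)
model rather than the 3PL, and it treats the person abilities from both
administrations as known and already on one common scale. The data are
simulated and every name is hypothetical.

```python
import math
import random


def p_correct(theta, b):
    """Rasch probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def estimate_difficulty(thetas, responses, iterations=25):
    """Newton-Raphson ML estimate of b with abilities treated as known."""
    b = 0.0
    for _ in range(iterations):
        probs = [p_correct(t, b) for t in thetas]
        gradient = sum(p - x for p, x in zip(probs, responses))  # dlogL/db
        information = sum(p * (1.0 - p) for p in probs)
        b += gradient / information
    information = sum(p_correct(t, b) * (1.0 - p_correct(t, b)) for t in thetas)
    return b, 1.0 / math.sqrt(information)   # estimate and standard error


def simulate(rng, n, true_b):
    """Simulate n examinees answering one item (assumed data, not real)."""
    thetas = [rng.gauss(0.0, 1.0) for _ in range(n)]
    responses = [1 if rng.random() < p_correct(t, true_b) else 0 for t in thetas]
    return thetas, responses


rng = random.Random(2010)
thetas1, v1 = simulate(rng, 200, true_b=0.5)   # first administration
thetas2, v2 = simulate(rng, 200, true_b=0.5)   # later administration

b_first, se_first = estimate_difficulty(thetas1, v1)
b_pooled, se_pooled = estimate_difficulty(thetas1 + thetas2, v1 + v2)

print(f"first test only: b = {b_first:.2f}, SE = {se_first:.3f}")
print(f"pooled vectors:  b = {b_pooled:.2f}, SE = {se_pooled:.3f}")
```

The standard error shrinks as the pooled response vector grows, which is the
gain in precision being asked about. It only holds if the abilities from the
different tests are on a common scale in the first place.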




Is it enough to have the a1, b1 and c1 values, obtain similar values for the
item in different tests, and then take the average of the corresponding
values to arrive at the final a, b and c values?




Ranganath S
