[Rasch] Reverse Digit Span Item Scoring Question

Schulz, Matt mschulz at pacificmetrics.com
Fri Dec 14 04:49:57 EST 2012

I think it's reasonable to assume the items are locally independent, so I would go with scoring method #1.  The other two methods lose information relevant to, e.g., item fit, person fit, etc.

From: rasch-bounces at acer.edu.au [mailto:rasch-bounces at acer.edu.au] On Behalf Of Fidelman, Carolyn
Sent: Thursday, December 13, 2012 9:32 AM
To: rasch at acer.edu.au
Subject: [Rasch] Reverse Digit Span Item Scoring Question

Hi All,

We are considering creating our own scale for a reverse digit span task to be administered longitudinally to children ages 5-11, and I am having trouble understanding which Rasch model to use and how to structure the data.

In such a test, a child is first asked to repeat back two numbers heard on an audio recording, but in reverse order. The child gets five such items and, on getting three of them right, goes on to the next level. Each subsequent level adds one more number to repeat back, up to a maximum of nine; most adults can manage about seven. Performance on this task is correlated with general intelligence and is predictive of a number of achievement-related outcomes. The task is modeled on Numbers Reversed, Test 7 in the Woodcock Johnson COG III battery of cognitive assessments.

My hope was that the WJ documentation would tell us which Rasch model was used with this "trials"-type data, but it doesn't really go into it, perhaps for proprietary reasons. It says that the original calibration program was devised by Woodcock in approximately 1970, and much has been done since then to develop models that fit various data types. It is not clear what the Woodcock procedure is exactly, or how it differs from or resembles the standard one-parameter Rasch, rating scale, or partial credit models. This concerns me because we need to decide whether to structure the data as discrete or clustered, and to determine whether it is also a problem that the data are hierarchical, with performances nested within item type.
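For readers weighing the candidate models: the dichotomous Rasch model gives a single probability of a correct trial, while the partial credit model gives a probability for each score category of a polytomous item (here, a cluster). A minimal sketch of both, with illustrative parameter values (not anything from the WJ calibration):

```python
import math

def rasch_dichotomous(theta, b):
    """P(correct) for one trial under the one-parameter (Rasch) model,
    given person ability theta and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def partial_credit(theta, deltas):
    """Category probabilities P(X = 0), ..., P(X = m) under the partial
    credit model; deltas holds the m step difficulties of the item."""
    csums = [0.0]                      # empty sum for category 0
    for d in deltas:
        csums.append(csums[-1] + (theta - d))
    exps = [math.exp(c) for c in csums]
    total = sum(exps)
    return [e / total for e in exps]
```

With a single step difficulty the partial credit model reduces to the dichotomous case, which is one way to see what is gained or lost by collapsing trials into clusters.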

One question I have for this test is "What is an item?" Is an item a single trial, or the 5-trial (4-trial later on) cluster? How is it scored for the purposes of score calibration? Is it 0/1 for each trial, or a scale from 0-5 (0-4 later on) for each cluster? Would we lose too much information by going to a polytomous scoring? The administration protocol is that if the child gets three right in a cluster, such as the two-number type, they move on to the type with the next greater number of digits. If they get only two in a cluster, they are considered to have achieved that span but not more, and no further items are presented.

Item type/names by number of digits to recall:


Example item scorings:

1. One record of a discrete, dichotomously scored raw score file:

1111111110111000088888888888888888        (8 = not presented)

2. One record of a clustered, dichotomously scored file:


3. One record of a clustered, polytomously scored file:


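The relationship between the three scorings can be sketched as follows. This is only an illustration: the record, the cluster sizes, and the pass mark of three are made up to match the protocol described above, and the '1'/'0'/'8' coding follows example 1; none of it reflects an actual WJ record layout.

```python
def cluster_scores(record, cluster_sizes, not_presented="8"):
    """Per-cluster polytomous scores (count of correct trials), with
    None for clusters never presented -- i.e., scoring example 3."""
    scores, pos = [], 0
    for size in cluster_sizes:
        trials = record[pos:pos + size]
        pos += size
        if set(trials) <= {not_presented}:   # whole cluster skipped
            scores.append(None)
        else:
            scores.append(trials.count("1"))
    return scores

def dichotomize(scores, pass_mark=3):
    """Collapse cluster scores to pass/fail -- i.e., scoring example 2."""
    return [None if s is None else int(s >= pass_mark) for s in scores]

# Hypothetical record: three 5-trial clusters; the third was never reached.
rec = "111111100088888"
poly = cluster_scores(rec, [5, 5, 5])   # [5, 2, None]
dich = dichotomize(poly)                # [1, 0, None]
```

Example 1 keeps every trial; `cluster_scores` and `dichotomize` show exactly which information each collapse discards.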
What we have seen in other ECLS tests is that when we violate the assumption of local item independence, through either a prompt or format effect, item parameters can be inflated and standard errors artificially low or unstable. A possible further violation in this case is that of not taking into account the extra variability between levels of data in a hierarchy, in addition to those within item sets.

Does anyone here know, informally or through some published source(1), what exact Rasch model the WJ folks used for this test? Or, regardless of that, what do you think one should do here? I am not comfortable with simply treating it as in example 1 above. Thanks for any help you can offer!


Carolyn G. Fidelman, Ph.D.
Early Childhood, International & Crosscutting Studies, NCES
Rm 9035, 1990 K St. NW | 202-502-7312 | carolyn.fidelman at ed.gov<mailto:carolyn.fidelman at ed.gov>

(1) I consulted the following:

Jaffe, L. E. (2009). Development, interpretation, and application of the W score and the relative proficiency index (Woodcock-Johnson III Assessment Service Bulletin No. 11). Rolling Meadows, IL: Riverside Publishing.


McGrew, K., Schrank, F., & Woodcock, R. (2007). Woodcock-Johnson Normative Update: Technical manual. Rolling Meadows, IL: Riverside Publishing.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: https://mailinglist.acer.edu.au/pipermail/rasch/attachments/20121213/0407d0ea/attachment.html 
