# [Rasch] Estimating Rasch Measures for Extreme Scores

Iasonas Lamprianou liasonas at cytanet.com.cy
Fri Apr 29 18:38:49 EST 2011

```
Dear colleagues,
I rarely submit requests to this list unless they are urgent and important, because I respect the time of the people who tend to reply most often. I would like to thank them. This time it is important, and that is why I am politely asking for your help. The post is long, but it has to do with the problems Rasch users face in the harsh world of academia. I think that the post concerns most of us.

I am trying to help a student with her PhD thesis (so I am writing on her behalf). She submitted her thesis and her examiners spotted some problems and she has to address them.

The problem: The PhD thesis is about the performance of students. For each student participating in the study (N>1000), the researcher has his/her scores in four subjects: language, science, maths and history. For each subject, each student has three teacher assessments, which were awarded in January, March and June. Each score runs from E (Failure) to A (Excellent). So, overall, each student has three ordinal teacher assessment measures for each of four subjects. It is a typical repeated measures case for four variables/subjects with three measures per variable/subject.

Design: Since the data are ordinal (E=1=Failure to A=5=Excellent), the researcher used a Partial Credit Rasch model with three “items” to build four Ability scales, one for each subject (the Rating Scale model did not fit well). The researcher also used all 12 scores (4 subjects X 3 measures) to produce one overall Ability ‘Academic Performance’ measure. Then, the researcher used these Rasch ability measures as dependent variables to run OLS regressions.

Issue 1:
A serious problem spotted by the examiners is that a large proportion of students (around 20%) have perfect scores (three ‘A’s) on some of the four subjects. The researcher used a Winsteps routine to find measures of ability for those students with extreme scores. The examiner has major reservations about the validity of this decision and asks whether these data (extreme scores) should be dropped. The examiner says: “If a Rasch analysis is to be used to derive attainment scores, the final distribution must provide a realistic representation of attainment. This means that the large group of candidates who achieve perfect scores (on the extreme right of the histograms) need to be properly represented. These scores need to be appropriately dealt with by Rasch (if this is possible), or they need to be removed from the analysis (with an assessment made about the impact of the resulting loss of data).”
In defense of the researcher, the distance between the “perfect score” and the “perfect-1” estimate is neither huge nor unreasonable: it is around 1.4 logits on a scale which extends from around -11 to 11 logits. When the researcher draws the scatterplot of raw scores against logits, the S-shaped curve looks beautifully smooth, and the estimates of the extreme scores look neither “too extreme” nor out of tune with the rest of the data points on the scatterplot. The distance between the “perfect score” and the “perfect-1” estimate is not grossly out of line compared to the other distances between raw-score estimates (for example, the distance between the “perfect-1” and the “perfect-2” scores is only around 0.3 logits smaller).
(a) The researcher needs strong references to defend her decision NOT to drop the extreme data estimates. Can anyone please provide strong peer-reviewed papers to support the decision to keep the extreme score estimates as valid representations of the ability of the participants?

Issue 2:
Stemming from the previous comment, one of the suggestions of the examiners is that the researcher could ditch the Rasch model and instead sum the three measures in one subject (e.g. A+B+B=5+4+4=13) and then use this sum in an OLS regression. The examiner says: “A serious discussion needs to be held about the benefits, if any, the Rasch analysis provides over a more direct analytical path (e.g. a linear regression of results averaged over three [teacher assessments])”. We all know that this is simply wrong because we cannot average ordinal measures, and the student already explains this in her Methodology section, but she probably needs more references.
(b) Can anyone please provide a list of (recent, if possible) papers in good peer-reviewed journals which explain that this is not the right thing to do?

Issue 3:
Another suggestion of the examiners is that the researcher could ditch the Rasch model and just use the ordinal measure (E=1=Failure to A=5=Excellent) as a dependent variable in a proportional odds model. This means that the researcher should run three different models for each subject (one each for the Teacher Assessments awarded in January, March and June).
(c) Can anyone please provide a list of (recent, if possible) papers in good peer-reviewed journals which explain that this is NOT better than using the Rasch model to get one linear measure instead of three ordinal ones?

I feel that the examiners did a very good job overall and were very fair and consistent. They spent a great deal of time reading every little detail in a long thesis, they spotted some important issues, and we need to credit them for this. I feel that we may want to help the student address these interesting issues to the full satisfaction of the examiners.