[Rasch] Estimating Rasch Measures for Extreme Scores

Steven L. Kramer skramer1958 at verizon.net
Sat Apr 30 02:13:12 EST 2011

I hope these aren't naive questions.

If I understand correctly, from an IRT perspective one can compute for each student either a Maximum Likelihood Estimate (MLE) score, or a Bayesian score.  The Bayesian score assumes that ability is normally distributed, and accounts for this by shrinking estimated student scores towards the mean.  For students with perfectly correct scores the MLE predicts infinite ability, but the Bayesian score predicts finite ability.  (Similarly for students who get perfectly "zero" scores.)  
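To make the contrast concrete, here is a minimal Python sketch, assuming a dichotomous Rasch model with made-up item difficulties and a standard normal prior. The function names (`mle`, `eap`) and every number are illustrative assumptions on my part; this is not a description of how WinSteps computes its estimates.

```python
import math

def p_correct(theta, b):
    """Rasch probability of success for ability theta on an item of difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def mle(raw_score, difficulties, iterations=100):
    """Newton-Raphson maximum likelihood estimate; infinite for extreme scores."""
    n = len(difficulties)
    if raw_score <= 0:
        return -math.inf
    if raw_score >= n:
        return math.inf
    theta = math.log(raw_score / (n - raw_score))  # starting value
    for _ in range(iterations):
        probs = [p_correct(theta, b) for b in difficulties]
        info = sum(p * (1.0 - p) for p in probs)
        step = (raw_score - sum(probs)) / info
        theta += step
        if abs(step) < 1e-10:
            break
    return theta

def eap(raw_score, difficulties, prior_mean=0.0, prior_sd=1.0, points=201):
    """Posterior mean of ability under a normal prior, using a simple grid."""
    lo = prior_mean - 6.0 * prior_sd
    width = 12.0 * prior_sd / (points - 1)
    grid = [lo + i * width for i in range(points)]
    log_post = []
    for theta in grid:
        # Rasch log-likelihood depends on the responses only through the raw score.
        log_lik = raw_score * theta - sum(
            math.log1p(math.exp(theta - b)) for b in difficulties
        )
        log_prior = -0.5 * ((theta - prior_mean) / prior_sd) ** 2
        log_post.append(log_lik + log_prior)
    top = max(log_post)
    weights = [math.exp(v - top) for v in log_post]
    return sum(t * w for t, w in zip(grid, weights)) / sum(weights)

# Hypothetical five-item test; the difficulties are invented for illustration.
difficulties = [-1.0, -0.5, 0.0, 0.5, 1.0]
for score in range(len(difficulties) + 1):
    print(score, mle(score, difficulties), round(eap(score, difficulties), 3))
```

In this sketch the MLE column is infinite at both extremes, while the Bayesian column stays finite and is pulled toward the prior mean for every score, not only the extreme ones.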

My questions:
Is the "Bayesian estimate" the way WinSteps computes measures for extreme scores?
If so, couldn't your student just use the Bayesian estimate for everyone, thus incorporating a consistent measurement theory?
From a Rasch point of view, what would be the downside of using a Bayesian instead of an MLE approach?

Steve Kramer
Arcadia University
  ----- Original Message ----- 
  From: Rense Lange 
  To: Iasonas Lamprianou 
  Cc: rasch list 
  Sent: Friday, April 29, 2011 10:46 AM
  Subject: Re: [Rasch] Estimating Rasch Measures for Extreme Scores

  Since getting better faculty seems out of the question, would this solve it? Re-run twice using Facets. 
    a.. Once use all items, including those giving "perfect" scores, 
    b.. Once with the perfect items removed for each student/teacher getting perfect ratings.  

  Then plot the two sets of students'/teachers' estimated parameters - they should be very similar (Y = X). While you are at it, you could show the probably non-trivial effects of differences in rater leniency/severity on the student/teacher evaluations. 
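Alongside the plot, the Y = X check could be summarised numerically. The following is a rough Python sketch; the helper name `compare_measures` and the example values are hypothetical, and real input would be the person measures exported from the two runs.

```python
import math

def compare_measures(full_run, reduced_run):
    """Summarise how closely two sets of measures for the same persons follow Y = X."""
    if len(full_run) != len(reduced_run) or len(full_run) < 2:
        raise ValueError("need two equally long lists with at least two persons")
    n = len(full_run)
    mean_x = sum(full_run) / n
    mean_y = sum(reduced_run) / n
    sxx = sum((x - mean_x) ** 2 for x in full_run)
    syy = sum((y - mean_y) ** 2 for y in reduced_run)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(full_run, reduced_run))
    slope = sxy / sxx                      # least-squares slope of Y on X
    return {
        "slope": slope,                    # close to 1 if Y = X holds
        "intercept": mean_y - slope * mean_x,  # close to 0 if Y = X holds
        "correlation": sxy / math.sqrt(sxx * syy),
        "rmsd": math.sqrt(
            sum((y - x) ** 2 for x, y in zip(full_run, reduced_run)) / n
        ),
    }

# Made-up measures (logits) for five persons, purely for illustration.
all_items = [-1.2, -0.3, 0.4, 1.1, 2.5]
perfect_items_removed = [-1.1, -0.4, 0.5, 1.0, 2.3]
print(compare_measures(all_items, perfect_items_removed))
```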

  Rense Lange

  2011/4/29 Iasonas Lamprianou <liasonas at cytanet.com.cy>

    Dear colleagues,
    I rarely submit requests to this list unless they are urgent and important, because I respect the time of the people who tend to reply most often. I would like to thank them. This time, it is important and that is why I politely request you to help me. The post is long, but it has to do with the problems Rasch users face in the harsh world of academia. I think that the post concerns most of us.

    I am trying to help a student with her PhD thesis (so I am writing on her behalf). She submitted her thesis and her examiners spotted some problems and she has to address them.

    The problem: The PhD thesis is about the performance of students.  For each student participating in the study (N>1000), the researcher has his/her score on four subjects: language, science, maths and history. For each subject, each student has three teacher assessments which were awarded in January, March and June. Each score runs from E (Failure) to A (Excellent). So, overall, each student has three ordinal teacher assessment measures for each of four subjects. It is a typical repeated measures case for four variables/subjects with three measures per variable/subject.

    Design: Since the data are ordinal (E=1=Failure to A=5=Excellent) the researcher used a Partial Credit Rasch model with three "items" to build four Ability scales, one for each subject (the Rating Scale did not have good fit). Also, the student used all 12 scores (4 subjects X 3 measures) to produce one overall Ability "Academic Performance" measure. Then, the researcher used these Rasch ability measures as dependent variables to run OLS regressions.

    Issue 1:
    A serious problem spotted by the examiners is that a large proportion of students (around 20%) have perfect scores (three "A"s) on some of the four subjects. The researcher used a Winsteps routine to find measures of ability for those students with extreme scores. The examiner has major reservations about the validity of this decision and asks whether these data (extreme scores) should be dropped. The examiner says: "If a Rasch analysis is to be used to derive attainment scores, the final distribution must provide a realistic representation of attainment. This means that the large group of candidates who achieve perfect scores (on the extreme right of the histograms) need to be properly represented. These scores need to be appropriately dealt with by Rasch (if this is possible), or they need to be removed from the analysis (with an assessment made about the impact of the resulting loss of data)."
    In the researcher's defense, the distance between the "perfect score" and the "perfect-1" estimate is neither huge nor unreasonable: it is around 1.4 logits on a scale which extends from around -11 to 11 logits. When the researcher draws the scatterplot between raw scores and logits, the sigmoid curve looks beautifully smooth and the estimates of the extreme scores look neither "too extreme" nor out of tune with the rest of the data points on the scatterplot. The distance between the "perfect score" and the "perfect-1" estimate is not grossly out of line compared to the other distances between raw-score estimates (for example, the distance between the "perfect-1" and the "perfect-2" scores is only around 0.3 logits smaller).
    (a) The researcher needs strong references to defend her decision NOT to drop the extreme data estimates. Can anyone please provide strong peer-reviewed papers to support the decision to keep the extreme score estimates as valid representations of the ability of the participants?
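For readers who want to see how such a raw-score-to-logit table arises, here is a hedged Python sketch for three partial-credit "items" with five categories each. The thresholds are invented, categories are coded 0 to 4 rather than 1 to 5, and the extreme scores are handled by moving them a fractional 0.3 score points inward before solving for the measure. That fractional adjustment is an assumption about the general kind of correction used for extreme scores, not a confirmed account of the Winsteps routine the researcher ran, so the gaps printed here will not match the thesis values.

```python
import math

def category_probs(theta, thresholds):
    """Partial Credit Model probabilities for categories 0..m of one item."""
    log_num = [0.0]
    for delta in thresholds:
        log_num.append(log_num[-1] + theta - delta)  # cumulative (theta - delta_j)
    top = max(log_num)
    weights = [math.exp(v - top) for v in log_num]
    total = sum(weights)
    return [w / total for w in weights]

def expected_score(theta, items):
    """Expected raw score over all items at ability theta."""
    return sum(
        k * p
        for thresholds in items
        for k, p in enumerate(category_probs(theta, thresholds))
    )

def measure_for_score(score, items, lo=-30.0, hi=30.0):
    """Bisection for the theta whose expected score equals the target score."""
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if expected_score(mid, items) < score:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def score_table(items, adjustment=0.3):
    """Measure for every raw score; extreme scores are pulled inward first."""
    max_score = sum(len(thresholds) for thresholds in items)
    table = {}
    for raw in range(max_score + 1):
        target = min(max(raw, adjustment), max_score - adjustment)
        table[raw] = measure_for_score(target, items)
    return table

# Hypothetical thresholds for three teacher assessments of one subject.
items = [
    [-3.0, -1.0, 1.0, 3.0],
    [-3.5, -1.2, 0.8, 2.9],
    [-2.8, -0.9, 1.1, 3.2],
]
table = score_table(items)
for raw in sorted(table):
    gap = table[raw] - table[raw - 1] if raw > 0 else float("nan")
    print(raw, round(table[raw], 2), round(gap, 2))
```

The last column shows the distance between adjacent raw-score measures, which is the quantity being compared when the "perfect score" and "perfect-1" estimates are discussed.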

    Issue 2:
    Stemming from the previous comment, one of the suggestions of the examiners is that the researcher could ditch the Rasch model and instead sum the three measures in one subject (e.g. A+B+B=5+4+4=13) and then use this sum for an OLS regression. The examiner says: "A serious discussion needs to be held about the benefits, if any, the Rasch analysis provides over a more direct analytical path (e.g. a linear regression of results averaged over three [teacher assessments])." We all know that this is simply wrong to do because we cannot average ordinal measures, and the student already explains this in her Methodology section, but she probably needs more references.
    (b) Can anyone please provide a list of (recent, if possible) papers in good peer-reviewed journals which explain that this is not the right thing to do?

    Issue 3:
    Another suggestion of the examiners is that the researcher could ditch the Rasch model and just use the ordinal measure (E=1=Failure to A=5=Excellent) as a dependent variable in a proportional odds model. This means that the researcher should run three different models for each subject (for the Teacher Assessment awarded in January, March and June).
    (c) Can anyone please provide a list of (recent, if possible) papers in good peer-reviewed journals which explain that this is NOT better than using the Rasch model to get one linear measure instead of three ordinal?
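For reference, the two alternatives in Issue 3 can be written side by side. This is a sketch in standard textbook notation, and the symbols are my own labels rather than anything taken from the thesis: Y_n is one teacher assessment for student n coded 1 to 5, x_n the student's predictors, alpha_k the cut-points, beta the regression coefficients, theta_n the student measure, and delta_ik the step difficulty of category k on assessment i.

```latex
% Proportional odds model: one model per teacher assessment,
% with the ordinal grade itself as the dependent variable.
\operatorname{logit} P(Y_n \le k \mid \mathbf{x}_n)
  = \alpha_k - \mathbf{x}_n^{\top}\boldsymbol{\beta},
  \qquad k = 1, \dots, 4

% Partial Credit Rasch model: the three assessments i = 1, 2, 3 jointly
% define one linear measure \theta_n, which then enters the OLS regression.
\ln \frac{P_{nik}}{P_{ni(k-1)}} = \theta_n - \delta_{ik},
  \qquad k = 2, \dots, 5
```

Under the first approach each subject yields three separate ordinal outcomes; under the second, the three assessments are combined into a single measure per student before any regression is run.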

    I feel that the examiners did a very good job overall and were very fair and consistent. They spent a great deal of time reading every little detail in a long thesis, they spotted some important issues and we need to credit them for this. I feel that we may want to help the student address these interesting issues to the full satisfaction of the examiners.

    Thank you for your time

    In anticipation of your help
    Jason Lamprianou
    University of Cyprus

    Rasch mailing list
    Rasch at acer.edu.au
    Unsubscribe: https://mailinglist.acer.edu.au/mailman/options/rasch/rense.lange%40gmail.com

  Rense Lange, Ph.D.
  via gmail

