Surely this is the reason we teach? To make items easier for those who learn.

Dear all,
Please do not forget that Rasch analysis aims at measuring an existing (reflective) variable, not at INVENTING a (constructive) latent trait. Please add to the discussion outlined below some reasoning about the nature of the DIFFING/NON-DIFFING items, why they are or are not relevant to the hypothesized trait, the potential CAUSES of DIF (not only p values), etc. The more Rasch is treated purely as a statistical game, the more you will strive to adapt reality to fit indexes, rather than the reverse.

Hi Alex,

DIF analyses can induce artificial DIF, which is an artefact of some items displaying real DIF. For example, when some items favour one group, some other items will inevitably appear to favour the other group. David Andrich and Curt Hagquist have a few papers discussing this topic which could be quite helpful in your case. Please see the references below.

Andrich, D., & Hagquist, C. (2012). Real and artificial differential item functioning. Journal of Educational and Behavioral Statistics, 37(3).
Andrich, D., & Hagquist, C. (2015). Real and artificial differential item functioning in polytomous items. Educational and Psychological Measurement, 75(2).
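
To make the idea of artificial DIF concrete, here is a minimal arithmetic sketch. It is not the procedure from the papers above; it only assumes that each group's item difficulties are centred at zero, so the observed DIF contrasts across items must sum to zero. The function name and the example values are made up for illustration.

```python
def observed_dif(real_dif):
    """Observed DIF contrasts (logits) when each group's item
    difficulties are centred at zero (an assumed simplification)."""
    mean_shift = sum(real_dif) / len(real_dif)
    # Centring removes the average shift from every item, so items with
    # no real DIF pick up a small contrast of the opposite sign.
    return [d - mean_shift for d in real_dif]

# Hypothetical 10-item test: one item with 1.0 logit of real DIF.
real = [1.0] + [0.0] * 9
observed = observed_dif(real)
print([round(x, 2) for x in observed])

# The same real DIF in a hypothetical 4-item test: the artificial
# contrast on the clean items is larger because fewer items absorb it.
short = observed_dif([1.0, 0.0, 0.0, 0.0])
print([round(x, 2) for x in short])
```

Under this simplification, the shorter the test and the more items with real DIF in the same direction, the larger the artificial contrast on the remaining items, which is one reason DIF can seem to appear in items that looked fine in an earlier, longer version.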



Dear Rasch researchers,

I would like to ask for your advice on a DIF-related issue. We performed a Rasch analysis of a 33-item questionnaire to choose items for a short version, in two steps. First, we examined item infit and outfit, using as thresholds mean squares outside the 0.6-1.4 range and standardized fit statistics outside +/-2.0. No items were excluded at this step (all met these criteria). Then we examined DIF for gender, age, and education (in this order), and excluded items with DIF > 0.5 logits. We excluded 3 items based on gender DIF, 4 items based on age DIF, and 12 based on education DIF.

We thus arrived at a 14-item version, for which we again examined DIF for all 3 variables. We noticed that age DIF > 0.5 logits now appeared for some items (e.g. .60, p<.000), even though these items had shown no problems in the earlier steps of the selection process. This is a noticeable and significant difference according to the Winsteps manual (http://www.winsteps.com/winman/table30_1.htm ): « [DIF CONTRAST] should be at least 0.5 logits for DIF to be noticeable. "Prob." shows the probability of observing this amount of contrast by chance, when there is no systematic item bias effect. For statistically significance DIF on an item, Prob. ≤ .05. ».

But I do not know how to judge whether this is also a substantive difference and whether I should exclude those items as well, in other words continue item selection until all items show DIF < 0.5 logits for all variables (age, gender, education). This is particularly puzzling since these items showed acceptable DIF in previous runs. I understand in principle that this can happen, but what does it mean: are these items good enough or not? Should they be kept or excluded? Are there other criteria and tests that I should consider?
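
For reference, the two-part rule quoted from the Winsteps manual can be written as a small check. This is only a sketch: the function name, the item labels, and the contrast and probability values are hypothetical, and it uses "at least 0.5 logits" as quoted, whereas our own exclusion rule above was "> 0.5 logits".

```python
def flag_dif(contrast, prob, min_contrast=0.5, alpha=0.05):
    """Flag an item when its DIF contrast is both noticeable
    (|contrast| >= min_contrast logits) and significant (prob <= alpha)."""
    return abs(contrast) >= min_contrast and prob <= alpha

# Hypothetical (contrast, prob) pairs, invented for illustration only.
items = {
    "item_a": (0.60, 0.0004),  # noticeable and significant
    "item_b": (0.55, 0.20),    # noticeable but not significant
    "item_c": (-0.20, 0.01),   # significant but too small to be noticeable
}

flagged = [name for name, (c, p) in items.items() if flag_dif(c, p)]
print(flagged)
```

A rule like this only screens items statistically; it says nothing about whether the flagged difference matters substantively.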

Any suggestions or references for further reading would be much appreciated!

Many thanks,
