# [Rasch] Item arrangement and measure invariance

Agustin Tristan ici_kalt at yahoo.com
Sun Sep 13 05:34:35 EST 2009

```
Thank you Donald, it is a useful idea to take into account for speeded tests. In such a case the pattern is: more correct answers when the item is located at the beginning of the test and fewer correct answers when it is located at the end.
Under that condition the item difficulty may increase or decrease according to the position, but the problem to solve is to explain why or how a different arrangement could always produce harder items.
Thank you.
Regards
Agustin

FAMILIA DE PROGRAMAS KALT.
Ave. Cordillera Occidental  No. 635
Colonia  Lomas 4ª San Luis Potosí, San Luis Potosí.
C.P.  78216   MEXICO
(52) (444) 8 25 50 76
(52) (444) 8 25 50 77
(52) (444) 8 25 50 78
web page (in Spanish): http://www.ieia.com.mx
web page (in English) http://www.ieesa-kalt.com/English/Frames_sp_pro.html

--- On Sat, 9/12/09, Donald Van Metre <devium at mindspring.com> wrote:

From: Donald Van Metre <devium at mindspring.com>
Subject: Re: [Rasch] Item arrangement and measure invariance
To: "Agustin Tristan" <ici_kalt at yahoo.com>
Date: Saturday, September 12, 2009, 2:25 PM


Hello,

Just a thought, but if the test is timed, the items at the end of the test are likely attempted only by the more capable students.  The weaker students will tend to get "timed out" before reaching the last items. This means your difficulty estimates for these items are based on responses from relatively few strong students, which may be depressing the difficulty measures.  If these items are moved to an earlier position, you will get more responses from the weaker students, who will more often fail, pushing the difficulty estimates upward.

I work with practice TOEFL-iBT tests, which are timed.  We see a significant fall-off in the number of candidate responses as we get deeper into the tests.  For example, on one test over 1,000 students answered item number one, but fewer than 400 reached the last one.  I'm quite sure that if we swapped the first and last items, the statistics would change.  I would expect that placing item number one at the end of the test would increase its difficulty estimate.

Cheers,

-Donald-

-----Original Message-----
From: Agustin Tristan
Sent: Sep 12, 2009 9:38 AM
To: Theo Dawson , -Rasch
Subject: Re: [Rasch] Item arrangement and measure invariance

Thank you Theo for your comments indicating different ways of teaching to the test.
The point of my question is how it can be shown (if there is a way to demonstrate it) that a new item arrangement will produce significantly different difficulties and, worse, always produce systematically higher difficulties.
If I have an item bank with previously calibrated items and I choose a set of items to produce two or more versions of the same test, arranged in different ways, the question is how it can be possible that I get systematically higher item difficulties. A bad calibration procedure? A bad sample size during the pilot test? One version administered systematically to the weakest students? Which of the versions contains the items with the "real" difficulties? Are item bank calibrations useless?

Regards
Agustin


--- On Sat, 9/12/09, Theo Dawson <theo at devtestservice.com> wrote:

From: Theo Dawson <theo at devtestservice.com>
Subject: Re: [Rasch] Item arrangement and measure invariance
To: "Agustin Tristan" <ici_kalt at yahoo.com>
Date: Saturday, September 12, 2009, 8:09 AM

There are many ways of teaching to the test. For example: specifically teaching people to game the items, focusing on content knowledge (drill and kill) over understanding and real-world applications, or "covering the material" rather than diving deeper.

Testing, especially high stakes testing, drives instruction in many ways. Those of us who make tests are accountable for their effects, many of which are harming our children's minds. Does anyone really think that 3rd graders should spend hours every day learning to fill in bubbles on standardized tests? (I work with teachers who are forced to engage in this practice.) Who is this good for? And who, in their right mind, thinks it is defensible to make major decisions about students' futures on the basis of a single test score? Anyone who really understands the concepts of validity and reliability knows this is an unethical practice. Yet testing services all over the country willingly make tests that are used in this way---and lobby for more.

During the last 10 years, I've watched more and more students learn to hate learning (because they are not part of the tiny minority that enjoys memorizing), taught too many undergraduates who have never learned to defend an answer, and encountered way too many people who haven't got a clue how to apply anything they have learned in school to a real-world problem. These students know how to memorize facts and use formulas. They know how to figure out what the teacher wants. These skills are not useful when they leave school. I have also watched good teachers lose heart because they are required to engage in teaching practices that they know are harmful to their students. Who can say this is progress?

As a cognitive developmental psychologist and test developer, I think we need to completely rethink testing—its meaning, purpose, methods, and use—if we are really interested in serving the needs of students and societies.

When I was conducting my dissertation research, one of my respondents said this, "Testing is part of the conversation between a teacher and a student that tells the teacher what the student is most likely to benefit from learning next." Today's tests have lost sight of this purpose. In fact, I would argue that they have undermined the conversation between students and teachers and helped to de-professionalize teaching. I know we can change this if we have the will.

Theo

discotest.org
testingsurvey.us

On Sep 11, 2009, at 10:02 PM, Agustin Tristan wrote:

Hello Mike, teaching to the test is out of the question at this moment, as nobody knows the items and their distribution.
Regards
Agustin


--- On Fri, 9/11/09, Mike Linacre (RMT) <rmt at rasch.org> wrote:

From: Mike Linacre (RMT) <rmt at rasch.org>
Subject: Re: [Rasch] Item arrangement and measure invariance
To: Rasch at acer.edu.au
Date: Friday, September 11, 2009, 6:20 PM

Agustin:

Are teachers "teaching to the test", so that students are expecting a specific item ordering?

Could this be a "Hawthorne Effect" in reverse?

Mike L.

At 9/11/2009, you wrote:
> A discussion here is that a new item arrangement will produce significantly different difficulties, and worse: always produce higher difficulties. I cannot find a reason to say that a different arrangement of the items will produce SYSTEMATICALLY HIGHER values of the difficulty.

_______________________________________________
Rasch mailing list
Rasch at acer.edu.au
https://mailinglist.acer.edu.au/mailman/listinfo/rasch

___________________________________
Donald Van Metre
devium at mindspring.com
Tel: (303) 415-1752  Fax: (303) 415-1796

-------------------------------------------------
Please consider the environment before you print
```
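Donald's timing argument can be checked with a small simulation. The sketch below (illustrative only; all names and the "only abilities above 0 logits reach the last item" cutoff are assumptions, not anything from the thread) generates Rasch responses to one item of true difficulty 0 logits, then computes a naive p-value-based difficulty once from everyone and once from only the stronger examinees who "reach" a late item:

```python
import math
import random

random.seed(1)

def p_correct(theta, b):
    # Rasch model: P(correct) = 1 / (1 + exp(-(theta - b)))
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# 1,000 simulated examinees with abilities ~ N(0, 1), answering one
# item whose true Rasch difficulty is 0 logits.
b_true = 0.0
people = [random.gauss(0.0, 1.0) for _ in range(1000)]
responses = [(theta, 1 if random.random() < p_correct(theta, b_true) else 0)
             for theta in people]

def naive_difficulty(sample):
    # Classical logit difficulty from the observed p-value,
    # b_hat = -logit(proportion correct); this is only centred
    # correctly when the responding group averages 0 logits.
    p = sum(score for _, score in sample) / len(sample)
    return -math.log(p / (1.0 - p))

# Item placed early: everyone reaches it.
b_front = naive_difficulty(responses)

# Item placed last on a timed test: only examinees above 0 logits
# reach it (a crude stand-in for weaker students being timed out).
reachers = [(theta, score) for theta, score in responses if theta > 0.0]
b_back = naive_difficulty(reachers)

print(f"front position (n={len(responses)}): b_hat = {b_front:+.2f}")
print(f"back position  (n={len(reachers)}): b_hat = {b_back:+.2f}")
```

The late-position estimate comes out noticeably lower (the item looks easier) because only strong examinees contribute responses, which is the direction of bias Donald describes; moving that item to the front would raise its estimate again.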
