[Rasch] Fw: Interval or nominal

Parisa Daftari Fard pdaftaryfard at yahoo.com
Wed Aug 17 15:03:49 EST 2011

Dear Professor Hess,
Thank you for your question. Nationality is currently hidden within the rater facet; I have not yet included it as a separate dummy facet. I have run a three-facet analysis with raters, question types, and items as facets. What I am trying to see is whether all native speakers agree on the appropriacy of a response or whether there is any kind of bias.
This problem is different from what you did with writing assessment. There you had many examinees. Here, 50 or so raters are rating one single questionnaire or, let's say, one single interview. Let me revise my question. In such cases, when only the rating is important, can we ignore the item measures because there is only one examinee, and focus only on the person measures (person fit), the vertical ruler table, and the bias table?
I think that, because we deliberately included some questions that we knew were not appropriate, these items are shown as the most difficult items and therefore misfit on outfit, whereas in reality we included them to see native speakers' reactions.

From: Robert Hess <Robert.Hess at ASU.edu>
To: 'Parisa Daftari Fard' <pdaftaryfard at yahoo.com>
Sent: Tuesday, August 16, 2011 10:42 PM
Subject: RE: [Rasch] Fw: Interval or nominal

I realize that this may be a somewhat redundant question, but I am unable to discern this from the discussion or the printout. Are you including the nationality of the rater as a facet in your overall model? That does not mean simply including rater as a facet; in reality you have a situation where rater is nested within nationality. I see no indication that this is being considered. As I see the problem, you're trying to establish the distinction between nationalities of raters rather than just between raters themselves. So first you have to establish that raters from (for example) the United States respond in a manner consistently similar to each other while simultaneously distinct from (for example) raters from Mexico.
To answer your initial question, your analysis is based on the assumption that you are dealing with ordinal data and that you are attempting to make the ordinal distances become equal and hence (through the application of Facets) transform into interval measures. 
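For reference, the ordinal-to-interval step described above is commonly written as a many-facet rating scale model. The form below is a sketch using the usual textbook facets (examinee, item, rater); the symbols are labels chosen here for illustration and are not taken from the posted output. It also assumes the usual orientation in which higher item, rater, and threshold values make a higher category less likely.

```latex
\ln\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = B_n - D_i - C_j - F_k
```

Here $P_{nijk}$ is the probability that rater $j$ gives category $k$ to examinee $n$ on item $i$, $B_n$ is the examinee measure, $D_i$ the item difficulty, $C_j$ the rater severity, and $F_k$ the threshold between categories $k-1$ and $k$. In the design discussed in this thread there is a single set of responses, so $B_n$ would be a constant, and an additional facet such as question type or nationality would presumably enter as one more subtracted term.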
This is what we were doing in the mid-90s when dealing with raters and writing assessment. We had a real problem with raters' scoring sites influencing the scores given to items on the test. Once we realized that raters were nested within their scoring sites, and that scoring site therefore had to become a facet, we were able to tease out this influence in trying to establish viable scores.
While this may be redundant, my hope is that it may clear up some problems.
Robert Hess.
Emeritus Prof. of Educational Measurement and Evaluation
Arizona State University
From: rasch-bounces at acer.edu.au [mailto:rasch-bounces at acer.edu.au] On Behalf Of Parisa Daftari Fard
Sent: Tuesday, August 16, 2011 12:11 AM
To: Tom Conner; rasch list
Subject: Re: [Rasch] Fw: Interval or nominal
Dear Rense and Tom,
Thank you so much for your comments. The question is whether native speakers vary in the way they rate responses, and whether there is any kind of bias among them.
We get answers either way, but I am not sure whether they are conceptually correct, as you mentioned earlier. Facets produces four important pieces of output that need to be reported (besides many other good things): item fit, person fit, the vertical ruler table, and the bias table. My problem is with the item fit and measure statistics. In the table below, two items have the highest measures. I guess the only interpretation we could make is that these situations were difficult for the examinee, so that raters gave them 1 or 2 (out of 5). But in reality we have only one questionnaire or interview, and all raters rate it. We want to see whether they vary in their decisions or not. Since raters are more important than the examinee here, do you think this scoring is coding or rating?
I appreciate your kind help in advance.
Table 3  24 items Measurement Report  (arranged by mN).
|  Total   Total   Obsvd  Fair-M|        Model | Infit      Outfit   |Estim.| Correlation |       |                     |
|  Score   Count  Average Avrage|Measure  S.E. | MnSq ZStd  MnSq ZStd|Discrm| PtMea PtExp | Group | Nu 24 items         |
|    71      51       1.4   1.32|   3.45   .27 | 1.64  2.7  1.45  2.0|  .51 |   .58   .32 |     2 | 10 2                | in subset: 2
|    73      51       1.4   1.45|   2.95   .27 | 2.00  4.0  1.80  3.4|  .32 |   .41   .33 |     4 | 19 4                | in subset: 4
|    97      51       1.9   1.86|   1.76   .22 | 1.56  2.1  1.45  1.8|  .58 |   .59   .40 |     1 |  6 1                | in subset: 1
|   105      51       2.1   1.93|   1.59   .20 |  .77 -1.0   .74 -1.1| 1.39 |   .50   .43 |     2 | 12 2                | in subset: 2
|   107      51       2.1   2.04|   1.33   .20 |  .93  -.2   .85  -.5| 1.17 |   .52   .43 |     1 |  4 1                | in subset: 1
|   119      51       2.3   2.27|    .90   .18 |  .99   .0  1.02   .1| 1.14 |   .35   .46 |     1 |  1 1                | in subset: 1
|   115      51       2.3   2.31|    .82   .19 |  .68 -1.5   .69 -1.5| 1.42 |   .53   .45 |     3 | 18 3                | in subset: 3
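As a reading aid for the Infit and Outfit MnSq columns above, the sketch below shows the standard definitions of the two mean-squares. It is not how Facets was run for this table: the observed ratings, the model-expected ratings, and the model variances are all hypothetical values made up for illustration.

```python
def fit_mean_squares(observed, expected, variance):
    """Return (infit, outfit) mean-squares for one element (e.g. one item).

    observed: ratings actually given
    expected: model-expected rating for each observation
    variance: model variance of each observation
    """
    residuals = [x - e for x, e in zip(observed, expected)]
    # Outfit: unweighted mean of squared standardized residuals,
    # so it is sensitive to isolated unexpected ratings.
    outfit = sum(r * r / v for r, v in zip(residuals, variance)) / len(residuals)
    # Infit: information-weighted, i.e. summed squared residuals
    # divided by the summed model variance.
    infit = sum(r * r for r in residuals) / sum(variance)
    return infit, outfit


# Hypothetical item: four raters close to the expected rating, one far from it.
observed = [1, 1, 2, 1, 5]
expected = [1.4, 1.4, 1.4, 1.4, 1.4]   # assumed, not from the posted output
variance = [0.5, 0.5, 0.5, 0.5, 0.5]   # assumed, not from the posted output
infit, outfit = fit_mean_squares(observed, expected, variance)
print(f"infit={infit:.2f} outfit={outfit:.2f}")
```

Values near 1.0 indicate ratings about as variable as the model expects; values well above 1.0, as in the first three rows of the table, indicate more unexpected ratings than the model predicts.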
From: Tom Conner <connert at msu.edu>
To: Parisa Daftari Fard <pdaftaryfard at yahoo.com>
Sent: Tuesday, August 16, 2011 4:21 AM
Subject: Re: [Rasch] Fw: Interval or nominal
I think your problem is conceptual. Is there a single concept that each item taps, such that the response numbers indicate more or less of that concept? If not, then a Rasch analysis is not appropriate. If so, then you simply analyze the data with Winsteps or Facets with two facets. Then do an analysis of variance with type of respondent as the independent variable.
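The second step suggested above, an analysis of variance on the estimated measures with type of respondent as the grouping factor, could be sketched as follows. The group labels and logit values are hypothetical; in practice the measures would come from the Winsteps or Facets output. The function computes only the F ratio and its degrees of freedom, not a p-value.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    groups: list of lists, one list of measures per respondent type.
    """
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual measures around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical rater measures (logits) for three respondent types.
measures_by_type = {
    "type_A": [0.2, 0.5, 0.1, 0.4],
    "type_B": [0.9, 1.1, 0.7, 1.0],
    "type_C": [0.3, 0.6, 0.2, 0.5],
}
f_ratio, df_b, df_w = one_way_anova_f(list(measures_by_type.values()))
print(f"F({df_b}, {df_w}) = {f_ratio:.2f}")
```

A large F ratio relative to the F distribution with these degrees of freedom would suggest that respondent types differ in their average measures.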


On 8/15/11 7:30 PM, Parisa Daftari Fard wrote: 
Dear Rasch members
I guess I was a bit confusing. Let me explain the situation further. In all research on speech acts, several examinees give answers to a questionnaire, and their replies are rated on a scale (for example, from zero to 3). The responses count as rated because 3 always means the response is better. In this research there are no examinees, only a questionnaire. The raters are asked to give their opinion on the appropriacy of the response to each situation. This makes the questionnaire look like a multiple-choice test.
If I run Facets on such data, I get an item measure table in which some of the items are reported as very difficult (and misfitting), not because the raters disagree in their ratings but because the answer to that item was 1 (very unsatisfactory). I guess this is not real misfit but a function of the rating score. That is, if the option "very unsatisfactory" were coded as 5, the item might not be reported as a difficult item.
I hope I have explained the situation better this time. Because this is my friend's PhD dissertation, I hope to have your help and comments on whether we are correct to use Facets for such a rating situation.
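The point about coding direction can be illustrated with a small sketch. The ratings are hypothetical and the helper `reverse_code` is a name introduced here, not part of Facets. Reversing a 1-5 scale moves an item's average rating from one end of the scale to the other, while the spread of the ratings, which is where disagreement among raters would show up, stays the same.

```python
from statistics import mean, pvariance


def reverse_code(ratings, low=1, high=5):
    """Reverse a rating scale so that `low` becomes `high` and vice versa."""
    return [low + high - r for r in ratings]


# Hypothetical ratings of one clearly inappropriate response by five raters.
ratings = [1, 1, 2, 1, 2]
reversed_ratings = reverse_code(ratings)

# The average moves to the opposite end of the scale ...
print(mean(ratings), mean(reversed_ratings))
# ... but the spread of the ratings does not change.
print(pvariance(ratings), pvariance(reversed_ratings))
```

This only shows that an item's location depends on the coding direction; whether the reported misfit would also change is a separate question that the sketch does not answer.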
I appreciate your kind help in advance.
Dear Rasch Members,
>I have faced a complex situation and I hope to have your decisive comments. In a paper we are trying to investigate the possible bias of some native speakers answering a questionnaire on speech acts. The questionnaire contains several situations, with only one answer for each situation. Raters rate the response to each situation on a five-option Likert scale with 1 = very unsatisfactory, 2 = unsatisfactory, 3 = somehow appropriate, 4 = appropriate, and 5 = most appropriate.
>For example:
>1. While eating in a restaurant, you notice an insect in your food. What would you say to complain to the waiter?
>Answer: Hey man, come here. What the hell is it? Where on the earth do you cook your food?
>1. very unsatisfactory  2. unsatisfactory  3. somehow appropriate  4. appropriate  5. most appropriate
>1. The teacher gave you a lower-than-expected grade in your exam. What would you say to complain?
>Answer: Excuse me, something might have been wrong. It’s not my grade, sir.
>1. very unsatisfactory  2. unsatisfactory  3. somehow appropriate  4. appropriate  5. most appropriate
>The problem here is that the likely rating is 2 for some questions, 3 for others, 5 for still others, and so on. Of course, native speakers (raters) are the norm, and there is no right or wrong answer to any question. What we are trying to see is how native speakers from different cultures judge the appropriacy of the answer to each specific situation (bias and variation).
>My first question is: what kind of scoring is this: nominal, ordinal, or interval?
>My second question is whether it is possible to search for bias in this type of research with Facets.
>I would greatly appreciate your reply. Thank you in advance.
>Parisa Daftarifard
>Rasch mailing list
>Rasch at acer.edu.au
>Unsubscribe: https://mailinglist.acer.edu.au/mailman/options/rasch/rense.lange%40gmail.com

Tom Conner
Professor of Sociology
Michigan State University
office: 517 355-1747
cell: 517 230-0343
"What if there were no hypothetical questions?"