Agustin Tristan ici_kalt at yahoo.com
Thu Oct 18 02:13:28 EST 2007

Hi Anthony.
  Let's review some abstractions for the units and the test construction.
  1) To align any object and count the number of alignments.
  Let's take the Item as the unit. An item is a tangible thing: you may add items to "fill" a test, and you may add up the "correct answers" on the items to produce a raw score of knowledge. You may order persons by adding up their items and make some inferences, for instance: if you have 10 correct items and I have 8 correct items, do you have more knowledge than I do? Usually we say yes.
  Let's use this idea. 
  Take a money BILL as the unit. A bill is a tangible thing: you may add bills to fill a wallet, and you may add up the "good" (not torn or destroyed) bills to produce an amount of money. You may order persons by counting their bills and make some inferences, for instance: if you have 10 bills (intact or torn, it doesn't matter) and I have 8 bills (intact or torn, it doesn't matter), do you have more money than I do? Not necessarily. It depends on the denomination of the bills (you may have 10 bills of 1 dollar and I may have 8 bills of 10 dollars, so I have more money than you even though you have more bills), and it also depends on the quality of the bills, since torn bills cannot be used.
  In fact, when we add items we should ask about the denomination of the item, just as we ask about the denomination of the bill. Why, in the social sciences and education, don't we care about the denomination of the item and its quality? Do we care whether our item is good or not? Do you care whether your item fits the model? It seems we just wish to add up bills (torn or not, it doesn't matter).
  In fact, we should not simply add items. We should not line up objects and think we have "more"; aligning is not the right procedure. Do you line up bills to find out how much money you have? If you don't like bills, try adding or aligning pebbles, fruits or any other thing just by counting, without taking into account their "value" or denomination and their quality. What if we tried to line up watches to get the time over several days? You cannot "add" the instrument itself, and therefore you cannot align or add items.
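  The bill example can be written as a small sketch. The names Bill, count_bills and wallet_value are only illustrative, and the wallets are the ones from the example (10 bills of 1 dollar against 8 bills of 10 dollars).

```python
from dataclasses import dataclass


@dataclass
class Bill:
    denomination: int    # face value in dollars
    usable: bool = True  # False for a torn or destroyed bill


def count_bills(wallet):
    """Raw count: every bill counts as 1, whatever its value or condition."""
    return len(wallet)


def wallet_value(wallet):
    """Amount of money: only usable bills count, weighted by denomination."""
    return sum(bill.denomination for bill in wallet if bill.usable)


yours = [Bill(1) for _ in range(10)]  # 10 bills of 1 dollar
mine = [Bill(10) for _ in range(8)]   # 8 bills of 10 dollars

print(count_bills(yours), count_bills(mine))    # the count favors you
print(wallet_value(yours), wallet_value(mine))  # the value favors me
```

  The raw count and the value order the two wallets in opposite ways, which is the point of the analogy with raw scores.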
  2) Logit. What does it mean?
  It depends on what you're measuring, and I agree with you that it is a difficult thing to imagine.
  One logit is a "BIG" unit. It is like the AMPERE, which is huge: many things have to be measured in milliamperes.
  The kilogram (a fundamental unit) is not so big with respect to our human scale (unless you get hit on the head by a one-kilo baseball...). If you weigh 70 kilograms today and 71 kilograms tomorrow, you will probably not notice the difference, and your wife or girlfriend will not notice it either. At 72 kilograms it is slightly noticeable. If you weigh 70 kilograms, you are probably near the mean for men of your height and age. As 1 kilogram is SMALL, you need 2 or more to identify differences at human scale. You may set 70 as the origin of the scale for men according to the specifications, and you may set 65 for adult women under similar circumstances.
  The degree Celsius or Fahrenheit (a unit of temperature) is SMALL for some industrial applications, but it is BIG with respect to our human scale. If you have 36.5 °C you may feel well, but at 37.5 °C you begin to feel sick, and 38.5 °C is very noticeable! If you have 36 °C you're on the expected mean according to health standards, and you may set this as the origin. As the °C is BIG, a difference of 0.5 °C, or even 0.25 °C, is not noticeable.
  The meter is very BIG at human scale. If you are 1 meter taller than me it is very noticeable! Centimeters or inches are probably better for human scale. A difference of 1 centimeter is not noticeable; one inch is slightly noticeable.
  One logit shows BIG differences in knowledge; the only problem is that the origin has to be set according to specifications. If you measure 0 logits you may be at the normal level of knowledge according to the standards for your level of schooling, experience, age, and so forth. If you measure 1 logit, you have a noticeable difference above the mean, and 2 logits is very noticeable; the same holds if you measure -1 or -2. As the logit is BIG at human scale, 0.5 logits is a slight difference, and 0.25 logits is small enough that you will not notice a difference.
  If you measure 0.25 and I measure 0.5, it is true that I have more knowledge than you, but the difference is slight. The "1/4 logit rule" (or "1/2 logit rule") is a name commonly used in some papers on DIF, indicating that differences below 1/4 logit are not treated as real differences.
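  To give a rough feeling for these sizes, here is a minimal sketch. It assumes the standard dichotomous Rasch model (the formula is not written out in this message), where the probability of a correct answer depends only on the difference, in logits, between the person measure and the item difficulty. The function name p_success is only illustrative.

```python
import math


def p_success(ability, difficulty):
    """Rasch probability of a correct answer for a person/item pair.

    Both arguments are in logits; only their difference matters.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# A person facing an item located at the origin (0 logits),
# for several person measures above the item difficulty.
for advantage in (0.0, 0.25, 0.5, 1.0, 2.0):
    print(f"{advantage:4.2f} logits -> P(correct) = {p_success(advantage, 0.0):.2f}")
```

  Under that assumption, an advantage of 0.25 logits moves the probability of success from 0.50 to about 0.56, 0.5 logits to about 0.62, 1 logit to about 0.73 and 2 logits to about 0.88, which agrees with calling 1/4 logit a slight difference and 1 logit a BIG one.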
  We must use logit units in several ways. We have to practice with them, and it may take several years before we can interpret them easily. For instance, what is the tangible interpretation of the VOLT? Of the AMPERE? Let's try to interpret the CANDELA, COULOMB, JOULE and other units... It is clear that we would like to have something as simple as the meter or the kilogram, but not all physical units are so simple. Even the NEWTON is difficult for many persons to interpret (one newton is roughly the weight of 1/10 of a kilogram, but do you know your weight in pounds, kilograms or newtons?). Do you know the pressure, in PASCALS, that you apply to your bed when you're lying on it? How big is a PASCAL?
  A scale in logits means that we're trying to represent a scale corresponding to knowledge, ability, insight, competency... a latent trait. The unit is the logit, which indicates a noticeable amount of knowledge that makes a clear difference between two persons, or two items, or a person before and after training. The ruler produced by a test may run from -3 to +3 (in a practical setting, although you may have measures below and above those figures in some special tests), and 0 is the origin, indicating the expected value of knowledge or performance for a "regular" person on the task under analysis. +2 logits indicates high performance for a person or high difficulty for an item.
  Sub-units of the scale should be no bigger than 1/4 logit. Less than 1/4 logit could be good, but it is probably very small at the human level. In some cases you may need steps smaller than 1/4 logit to improve the precision of your test, but certainly not larger than 1/4 logit.
  3) How can you build the scale of your test?
  It is clear that you must NOT produce a test with these characteristics:
  A) collection of any kind of items
  B) using non calibrated items
  C) items not fitted to the model
  D) items with defects
  E) not caring about the distribution...
  ...NONE of the above!!!
  We must be very careful about the construction of the test, just like company X producing thermometers or balances: they must pass rigorous quality control, because we are risking something with a bad instrument. The same holds in education: we're measuring the abilities or competencies of persons, and they are risking the rest of their lives on our poorly made test!
  We should produce a test with items running from easy to hard (in difficulty), for example from -2 to +2, uniformly spaced according to our 1/4 rule. For instance, you may choose this set of item difficulties:
  -2.00, -1.75, -1.50, -1.25, -1.00, -0.75, -0.50, -0.25, 0.00, 0.25, 0.50, 0.75, 1.00, 1.25, 1.50, 1.75, 2.00
  That is 17 items uniformly spaced.
  How can you choose this set of items? You must have an item bank with MANY calibrated items (at least 5 to 10 times the number of items in your test; in this case, at least 85 to 170 items in your bank). All the items must be produced according to the specifications of the test, and all the items must be "good" items (good content, good syntax, well produced and validated by experts in the field who also have good writing skills, well calibrated, with acceptable FIT, etc.). You must choose the best items closest to the distribution indicated above.
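  The selection step could be sketched as follows. This is only one simple way to do it (a greedy nearest-match, not an established procedure), and everything in it is hypothetical: the Item fields, the acceptable_fit flag (standing for a fit review done elsewhere) and the randomly generated bank of 170 items.

```python
import random
from dataclasses import dataclass


@dataclass
class Item:
    item_id: str
    difficulty: float     # calibrated difficulty in logits
    acceptable_fit: bool  # outcome of a fit review done elsewhere


def target_difficulties(low=-2.0, high=2.0, step=0.25):
    """Uniformly spaced target difficulties, e.g. 17 values from -2 to +2."""
    n = round((high - low) / step) + 1
    return [low + i * step for i in range(n)]


def select_items(bank, targets):
    """For each target, take the closest unused item with acceptable fit."""
    pool = [item for item in bank if item.acceptable_fit]
    chosen = []
    for target in targets:
        if not pool:
            raise ValueError("the bank has too few fitting items")
        best = min(pool, key=lambda item: abs(item.difficulty - target))
        chosen.append(best)
        pool.remove(best)  # an item can be used only once
    return chosen


# Hypothetical bank: 170 items (10 times the test length), random values.
rng = random.Random(0)
bank = [
    Item(f"I{k:03d}", rng.uniform(-2.5, 2.5), rng.random() > 0.1)
    for k in range(170)
]

targets = target_difficulties()
test_form = select_items(bank, targets)
for target, item in zip(targets, test_form):
    print(f"target {target:+.2f} -> {item.item_id} ({item.difficulty:+.2f})")
```

  With a real bank you would also check content specifications and expert review before accepting each item; the sketch only covers the difficulty spacing and the fit filter.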
  Hope this helps.
  Sorry for this long text, hope it is clear.

Anthony James <luckyantonio2003 at yahoo.com> wrote:
    Dear folks,
  Another dumb question...
  I'll be grateful for any comments.
  When we are talking about "units of measurement" in the physical sciences we are talking about some tangible things. (I avoid the word "concrete" because I know you don't like it in this context and argue that all measures are abstractions.) However, I mean that the kilo or the meter, for example, are understandable attributes that have a concrete existence. A "sample meter", i.e., a rod of 1 meter, can be aligned with any object and we can count the number of alignments. Or we can put some potatoes on the pan of a scale and put enough weights on the other pan until the beam is balanced.
  1. How does the Rasch model make this "sample meter" or the one-kilo weight to compare the performance of the students against?
  2. Whose performance  is considered as the unit and how is it constructed?
  3. What's the definition of a logit?
