I would argue, however, that density is a well-understood and useful
construct that is related to the constructs used to measure it in
clear ways. When we create an average score or sum scores on different
assessments, we are not making meaning in a similar way. I would argue
that we are losing meaning, much in the same way we would lose meaning
if we summed or averaged the measures used in calculating mass---just
because they all had something to do with the object, and multiple
observations told us they tended to be correlated with one another.

The deeper meaning of a measurement may not seem so important if all
we want to know is if someone has "passed", but it means everything if
we want the assessment to tell us something about what that student
can benefit from learning next.

> This is a great discussion, both from a philosophical view and from
> a practical view.
> Here is what I would say: Of course the Rasch Model is wrong, in
> that no data will ever fit it perfectly. So are other models. The
> point is, is Rasch useful? Does it help us in our understanding? The
> answer is yes, it helps me!
>
> Now here is an interesting paradox:
> In objectives-referenced testing (or standards-referenced testing)
> we compute how students do on each standard. We may then make some
> decision on whether they have "mastered" the skills described. We
> also look at the total test score, e.g. Mathematics. We report how
> well the students do on that score domain. We may then say that a
> student has or has not "passed" the test. For both these purposes we
> may use the Rasch model (or 3 param IRT, or Classical Test Theory, or
> even some Decision Theory model).
>
> We treat the total test score as something meaningful. We treat the
> score on each objective as something meaningful.
> Clearly the score on objective 1 means something different from the
> score on objective 2 -- otherwise why would we report them
> separately? Yet the scores on objectives 1 thru n add up to the
> total test score -- which we also report! Recall that each subscore
> can be shown to measure a more or less independent skill. So we are
> adding apples, oranges, bananas, grapes, plums, pears, etc. to get a
> "fruit" score. We do so with aplomb (please excuse the pun).
>
> I guess this is akin to the fact that we may measure the height,
> length, and width of an object to infer its density, given its mass.
> In this case, the unifying entity is the object itself. I suppose
> the underlying entity in the testing scenario is the person. It has
> been noted that the score on one reading test closely predicts the score
> on most other reading tests, for a given individual. In fact, the
> score on a reading test even predicts the score on a mathematics
> test rather well. Yet we don't really believe the skills underlying
> are all the same.
>
> In spite of clear paradoxes, we proceed.
>
