Hi all

Isn't it somewhat strange that one should have to buy a commercial book to access the details of the dimensionality assessment behind one of the largest psychometric attacks on the taxpayer's wallet in the history of mankind?

In general, the lack of publicly available data and documentation on the psychometric procedures followed is worrying. I'd suggest that if PISA wants to regain some street credibility in scientific circles, they simply have to make public the data, analysis code, and decision procedures used to arrive at their results. By this I don't mean the woolly stuff that's in the tech reports I've seen, but simply datasets with analysis code that reproduce the analyses used to assess e.g. dimensionality and DIF.

As long as PISA cannot offer such documentation, we should give Kreiner the benefit of the doubt, simply because, in contrast to PISA's, his work is in fact sufficiently detailed to pass the minimum scientific thresholds of reproducibility and transparency.


Hi Steve,

In the article "Linking PISA competencies over three cycles - results from Germany", Claus H. Carstensen (p. 204) explains how the uni-dimensionality of PISA items is assessed. His description may be of value to the conversation.

Carstensen, C. H. (2013). Linking PISA competencies over three cycles - results from Germany. In M. Prenzel, M. Kobarg, K. Schöps & S. Rönnebeck (Eds.), Research on PISA: research outcomes of the PISA research conference 2009 (pp. 199-214): Springer.


There is a lot of noise in the TES article, but one potentially legitimate
problem: Kreiner claims that the PISA questions violate the
Rasch uni-dimensionality assumption to such a degree that a Rasch model
can't be used, or at least can't be used with sufficient precision to rank
countries meaningfully.  He says he tested this by trying out legitimate
subsets of questions, and investigating whether the differing subsets
predicted rankings that were similar to one another, and they didn't.
Specifically, "Canada could have finished anywhere between second and 25th
and Japan between eighth and 40th."
Ray Adams responded by saying that PISA accounts for this problem, noting,
"We have always shown things like range of possible ranks, standard errors
and so on. We've also reported the effects of item selection."

In fact the 2012 country report on Japan looks nothing like "between 8th and
40th".  In 2012 Japan was ranked between 1st and 3rd in all subjects.
Kreiner may have been using data from a different year, but I can't imagine
that the RANGE of possible ranks shrank from 32 (40-8) down to 2 (3-1).  I
see only two possibilities:  either Kreiner's methodology was absolutely
lousy, or else the PISA "range of ranks" was computed ASSUMING
uni-dimensionality and did not adequately CHECK for uni-dimensionality.
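
To make the sub-setting idea concrete, here is a rough Python sketch of the kind
of check being discussed. It is only an illustration under stated assumptions,
not Kreiner's or PISA's actual procedure: the data are simulated, country scores
are plain mean proportion-correct rather than Rasch estimates, and all names
(simulate_proportions, ranks_for_subset, rank_ranges, dif_sd) are made up for
this example. With dif_sd = 0 every item behaves the same way in every country;
a larger dif_sd adds country-specific item difficulty shifts (DIF).

```python
import math
import random


def simulate_proportions(n_countries, n_items, n_students, dif_sd, rng):
    """Simulate proportion-correct per country and item from a Rasch-like model.

    dif_sd is the SD of a country-specific shift in item difficulty
    (an assumed, simplified form of DIF); 0 means no DIF at all.
    """
    country_mean = [rng.gauss(0.0, 0.5) for _ in range(n_countries)]
    difficulty = [rng.gauss(0.0, 1.0) for _ in range(n_items)]
    props = []
    for c in range(n_countries):
        row = []
        for i in range(n_items):
            shift = rng.gauss(0.0, dif_sd)
            correct = 0
            for _ in range(n_students):
                theta = rng.gauss(country_mean[c], 1.0)
                p = 1.0 / (1.0 + math.exp(-(theta - difficulty[i] - shift)))
                correct += rng.random() < p
            row.append(correct / n_students)
        props.append(row)
    return props


def ranks_for_subset(props, items):
    """Rank countries (1 = best) by mean proportion-correct on the given items."""
    scores = [sum(row[i] for i in items) / len(items) for row in props]
    order = sorted(range(len(scores)), key=lambda c: -scores[c])
    rank = [0] * len(scores)
    for position, c in enumerate(order, start=1):
        rank[c] = position
    return rank


def rank_ranges(props, n_subsets, subset_size, rng):
    """Return (best_rank, worst_rank) per country over random item subsets."""
    n_countries, n_items = len(props), len(props[0])
    best = [n_countries] * n_countries
    worst = [1] * n_countries
    for _ in range(n_subsets):
        items = rng.sample(range(n_items), subset_size)
        for c, r in enumerate(ranks_for_subset(props, items)):
            best[c] = min(best[c], r)
            worst[c] = max(worst[c], r)
    return list(zip(best, worst))


if __name__ == "__main__":
    for dif_sd in (0.0, 0.5):
        rng = random.Random(1)
        props = simulate_proportions(10, 20, 200, dif_sd, rng)
        print("dif_sd =", dif_sd, rank_ranges(props, 50, 10, rng))
```

Comparing the two printed lines gives a feel for how much of the spread in
ranks comes from item selection alone and how much from DIF, which is the
distinction at issue between Kreiner's claim and the PISA "range of ranks".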

Ray, do you have any technical articles you can reference explaining how
PISA either designed test items for uni-dimensionality or else checked that
uni-dimensionality was an adequate model for creating country ranks? Also,
are there any tech reports on how PISA determined each country's potential
range of ranks?  Finally, I'd appreciate any tech reports in which PISA
investigated how choosing differing subsets of items affected country
ranking, or else articles (not necessarily PISA) explaining why a procedure
like Kreiner's sub-setting is not a legitimate test.  I suspect that
Kreiner's claims are simply based on invalid methodology, but I'd like to be
able to verify that suspicion.

Yes, Jason.

Are PISA, TIMSS and similar studies really intended to advance education
worldwide or to advance political agendas? If the answer is "political
agendas" then the most politically-acceptable statistical methodologies are
the ones to choose. If the answer is "advance education" then everyone,
including the politicians, should be working towards discovering and using
the most effective statistical methodologies.

According to www.rasch.org/software.htm there are now seven Rasch-related R
modules. They are free. Wonderful! But there are 5,889 R modules. We will
need more Rasch R modules before we make a noticeable impact.

> Mike, this is a great idea. But can the policy makers and the
> politicians allow us (the academics) to spoil their new toys (the
> international studies)? The politicians use us (the academics) to
> produce data and reports which then the politicians use to carry out
> their little infighting and political debates.
> We cannot afford to anger them, because we need their money and
> support. Maybe we need to train them on how to use our data most
> appropriately and sensibly. PISA and TIMSS tables, for example, can be
> useful, but they are not the equivalent of The Bible.
> Having said all that, we need to thank Margaret and the other
> researchers for providing the methodological tools and packages (have
> you all had a glance at the TAM package on the R platform?). But we
> also need to thank Paul for sowing the seeds of doubt, because this is
> the only way for science to prosper.
> Jason

