Here is an added comment on the issue.

Looking at the old ('50s and '60s) psychophysics literature on perceptual thresholds and related topics, we find that there is virtually always a tradeoff that observers (subjects, etc.) have to make between accuracy and speed. More of one leads to less of the other. People judge how careful to be on the basis of the payoffs for speed (skill) and accuracy (fluency). There is inevitably a negative correlation between the two.

Look up the Theory of Signal Detection (TSD). Speed and accuracy are measured, but the hidden parameters are Sensitivity and Bias. My mathematical skills are not adequate to show the relationships between TSD and Rasch theory, but it is clear to me that there are some deep connections.
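
For readers unfamiliar with TSD, here is a minimal sketch of how Sensitivity and Bias are usually recovered from observed data. It assumes the classic yes/no detection design, where the observed quantities are a hit rate and a false-alarm rate, and the equal-variance Gaussian model. The function name and the example rates are illustrative only and are not taken from any reading-fluency data.

```python
from statistics import NormalDist


def sdt_indices(hit_rate, false_alarm_rate):
    """Return (d_prime, criterion) under the equal-variance Gaussian model.

    Both rates must lie strictly between 0 and 1.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    z_hit = z(hit_rate)
    z_fa = z(false_alarm_rate)
    d_prime = z_hit - z_fa              # Sensitivity
    criterion = -0.5 * (z_hit + z_fa)   # Bias; 0 means no lean either way
    return d_prime, criterion


# Illustrative rates only (assumed, not observed data).
d_prime, criterion = sdt_indices(0.8, 0.2)
```

With symmetric rates like these the criterion comes out at zero, so all of the observed difference is attributed to Sensitivity.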


I could have made this much clearer.  This is all in its very early stages.  The item was just a mock-up.  We have no real data at this point.  The two pieces of performance of interest are:
1.     A rate count with the parenthesized portions omitted.  This could probably be controlled via a computer interface.  The seconds taken to read the separate strings of non-parenthesized text could be summed.  The number of words divided by this sum would give us a rate (words per time unit).  There is no consideration, in this part, of whether the words were read "correctly", just how long it took to get to the next set of parenthesized words.
2.     The "contextualization"/"comprehension" anchor for the rate using the student's correct selected response from within the parenthesized word sets.  (Poisson count, right?)
Within a single item, there is no intention to try to combine the time it takes to read a single sentence or sentence fragment with the selected word from a following parenthesized word set - the noise anticipated from attempting this would likely obscure any signal.  We do believe, however, that a useful "fluency" signal can emerge across sentences and sentence fragments within an item and across several such items.
It seems like the solution you are suggesting would work under this structure.  And it seems as though the two-variable approach could be used, especially as we try to understand how these variables work together.
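
The rate count in point 1 can be sketched as follows. This is only an illustration of the arithmetic, assuming the computer interface records a word count and an elapsed time for each stretch of non-parenthesized text. The function name and the timings are hypothetical.

```python
def reading_rate(segments):
    """Words per second over the non-parenthesized stretches of an item.

    segments: list of (word_count, seconds) pairs, one per stretch of text
    between parenthesized word sets. Correctness of reading is ignored here.
    """
    total_words = sum(words for words, _ in segments)
    total_seconds = sum(seconds for _, seconds in segments)
    if total_seconds <= 0:
        raise ValueError("total reading time must be positive")
    return total_words / total_seconds


# Hypothetical timings for three stretches of text.
segments = [(7, 3.5), (6, 2.5), (7, 4.0)]
rate = reading_rate(segments)        # words per second
rate_per_minute = rate * 60          # words per minute
```

Summing the seconds before dividing, rather than averaging per-stretch rates, follows the description above and keeps short stretches from dominating the result.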
Is this what the data for one of your items looks like?
1) How many seconds were required?  (Poisson count or percent of maximum) 
2) How many words were attempted? (Poisson count or percent of maximum)
3) How many of those words (or of the maximum) were done correctly? (Percent or Poisson count)
Then the major challenge is deciding the relative weighting of speed and accuracy. The analysis for one of my clients failed at this point. They could not make the decision :-(
For the analysis, it is probably easier to measure two variables, "speed" and "accuracy" (paralleling "addition" ability and "subtraction" ability for an arithmetic test). Then combine the resulting pairs of measures using the chosen weighting.
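
As a rough sketch of that last step, assuming each student already has a "speed" measure and an "accuracy" measure on comparable scales (for example, logits from two separate analyses), the combination could be a simple weighted average. The weight, the student labels, and the measures below are hypothetical; choosing the weight is exactly the decision discussed above.

```python
def combine_measures(speed, accuracy, speed_weight):
    """Weighted average of a speed measure and an accuracy measure.

    speed_weight is the share given to speed (0 to 1); accuracy gets the rest.
    """
    if not 0.0 <= speed_weight <= 1.0:
        raise ValueError("speed_weight must be between 0 and 1")
    return speed_weight * speed + (1.0 - speed_weight) * accuracy


# Hypothetical (speed, accuracy) measure pairs for two students.
pairs = {"student_a": (1.2, -0.4), "student_b": (-0.3, 0.9)}

# Equal weighting is an assumption made only for this illustration.
combined = {
    name: combine_measures(speed, accuracy, speed_weight=0.5)
    for name, (speed, accuracy) in pairs.items()
}
```

Changing `speed_weight` can reverse the ordering of students like these two, which is why the weighting decision matters so much.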
This type of data is analyzable with Facets.
The task is for the student to read to each parenthesized word set and select the word that fits.
Hello All.
An item format used to assess reading fluency might look something like this:
Oscar was awakened by a loud (book / color / boom).  He jumped out of bed and (sang / ran / flower) to his window.  In the dim light he could (see / danger / shout) a figure that looked like something (garden / out of / bucket) a story he read.  This seemed strange to (glass / him / listen). 
The task is for the student to read to each parenthesized word set and select the word that fits.  A count of correctly selected fitting words is tallied.  Also, the time it takes to complete the reading is recorded.  For longer passages a fixed time to complete the passage is assigned.  Traditionally, reading fluency in early years of education is assessed through oral reading.  Computer capabilities may allow items such as the example above to be administered without the need to read orally to a human or by using voice capturing capabilities to record and score oral reading performance for speed and accuracy.  In addition, use of computers may help in de-confounding rate of reading from the comprehension aspect of the performance.  However, the item presents a measurement challenge, at least to me.
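
The tally of correctly selected words could be sketched like this. The answer key is my own reading of which word fits each set in the example passage, and the student's selections are made up for illustration.

```python
# Assumed key for the five word sets in the example passage, in order.
ANSWER_KEY = ["boom", "ran", "see", "out of", "him"]


def tally_correct(selections, key=ANSWER_KEY):
    """Count how many selected words match the key, position by position."""
    if len(selections) != len(key):
        raise ValueError("one selection is required per word set")
    return sum(1 for chosen, correct in zip(selections, key) if chosen == correct)


# Hypothetical responses: the third selection does not fit.
selections = ["boom", "ran", "shout", "out of", "him"]
score = tally_correct(selections)
```

The reading time would be recorded separately, so that the count and the rate can be analyzed as two variables.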
My question: Has a Rasch-type model been developed that could handle or be adapted to handle this type of dual/multi-faceted performance?     
From my current fragile understanding of Rasch’s early models for reading fluency (as they were presented and summarized in Lord and Novick, chapter 21), it seems as though neither model 1 (misreadings in an oral reading test) nor model 2 (number of words read during some fixed time period) can handle the example passage.  While rate (words per minute) can be handled, there is a comprehension component in the form of the parenthesized word sets.  Whether or not these two aspects of performance can be formed into a single measurable variable is an (perhaps the) empirical question.  Whether or not they should be formed into a single variable could be important, especially from a practical perspective.  Perhaps a Rasch mixed model is feasible(?).
Any comments, thoughts, suggestions, or leads are welcome.
