MathEd.net: Think Your Students Can Solve This Problem? (Hint: They Probably Can't)

In his book chapter "Aspects of the Art of Assessment Design," Jan de Lange gives some examples of tasks found on large-scale standardized assessments, such as the TIMSS, PISA, and NAEP. Here's a problem that was rejected from the PISA (it would have been given to 15-year-olds):

For health reasons people should limit their efforts, for instance during sports, in order not to exceed a certain heartbeat frequency. For years the relationship between a person's recommended maximum heart rate and the person's age was described by the following formula:

Recommended maximum heart rate = 220 - age

Recent research showed that this formula should be modified slightly. The new formula is as follows:

Recommended maximum heart rate = 208 - (0.7 x age)

Question 1: A newspaper article stated: "A result of using the new formula instead of the old one is that the recommended maximum number of heartbeats per minute for young people decreases slightly and for old people it increases slightly." From which age onwards does the recommended maximum heart rate increase as a result of the introduction of the new formula? Show your work.

Question 2: The formula Recommended maximum heart rate = 208 - (0.7 x age) also helps determine when physical training is most effective: this is assumed to be when the heartbeat is at 80% of the recommended maximum heart rate. Write down a formula for calculating the heart rate for most effective physical training, expressed in terms of age. (de Lange, 2007, pp. 103-104)

So why do you think this problem was rejected from the PISA? Is it not authentic? Is it not relevant to mathematical content? Is it culturally biased? Nope. It was rejected because when it was field tested, fewer than 10% of students got it correct. As a high school math teacher, I can't identify any mathematics in this problem that most of my 15-year-olds haven't "learned" (I'm using that term loosely, given the circumstances), but that doesn't mean they can do the problem.
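To see how little machinery the problem needs, here is one way to work both questions. This is my own sketch, not a scoring key from de Lange or the PISA, and the symbol a (age in years) is my notation rather than the item's.

```latex
% a = age in years (notation assumed for this sketch, not from the original item)
\begin{align*}
\text{Question 1:}\quad 208 - 0.7a &> 220 - a \\
0.3a &> 12 \\
a &> 40 \\[6pt]
\text{Question 2:}\quad 0.8\,(208 - 0.7a) &= 166.4 - 0.56a
\end{align*}
```

As a check, both formulas give 180 at age 40. At age 20 the old formula gives 200 and the new one 194, and at age 60 the old gives 160 and the new 166, which matches the newspaper's claim. That's one linear inequality and one use of the distributive property.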

de Lange's point is that we must be careful about excluding problems from our assessments (particularly the large-scale ones) simply because they are too difficult. There are political pressures from many countries to exclude such items because they make countries look bad, but that's not a good reason to leave them out. Having high-quality test items that we can use to track long-term trends is more important than saving face. Without problems like these, we risk never asking ourselves why our students have difficulties with them.

While de Lange makes a good argument, opinions may vary. How do you feel about item difficulty on assessments? Does it matter if it's a large-scale test like the PISA versus a classroom assessment? Would you give students a test item that had a known 10% success rate?

References
de Lange, J. (2007). Aspects of the Art of Assessment Design. In A. H. Schoenfeld (Ed.), Assessing Mathematical Proficiency, Mathematical Sciences Research Institute Publications (pp. 99-111). New York: Cambridge University Press.