Calling Colorado Teachers: Can I Analyze Your Assessment?

Sometime in the next week or so I need to observe a math or science teacher. I'm turning to you, my PLN, for a volunteer. I'm open to any grade level, and here's the official description of the assignment:
Analysis of the Assessment Practices of a Math/Science Teacher
For this assignment you will observe a school mathematics or science teacher to analyze opportunities to assess student understanding embedded within a classroom context. Determine the extent to which assessment is embedded in instruction, detailing the kinds of questions and tasks used during instruction.
  1. Consider what it means to assess understanding
  2. As part of your observation, be prepared to distinguish between questions that guide the discussion and questions that elicit student understanding
  3. To what extent does the teacher create opportunities to gauge student thinking and learning? When does the teacher "check in" with students?
  4. To what extent do students' responses influence instruction?
Additionally, as part of this assignment include a brief interview of the teacher’s beliefs about assessment. Some questions to consider…
  1. What is the purpose of assessment?
  2. When do you assess your students?
  3. [if appropriate] How do you use information from observations and listening to students to inform instruction?
These are just suggested questions. You are encouraged to adapt these and include other questions that you consider important. The report can be set up as a 2-part narrative that summarizes the observation and the teacher’s responses to the interview. Be sure to highlight specific aspects of what was observed and discussed, so that you can report your findings to the group and class on April 13th.
To be clear, I am not wishing to observe you giving a test or quiz, or to analyze the content and design of your test or quiz. I want to observe you during an instructional task and analyze what questions you ask, how you ask them, and how your students respond. As you can see from the assignment, I'm looking for the formative processes you use to determine if your students truly understand the content.

So if you don't mind an extra body in your classroom and can spare some minutes to discuss assessment, please email me or DM @MathEdnet on Twitter. If you don't feel you're right for the study but know a teacher who would be great to observe and can help arrange a visit, that would be great, too. I'm in Boulder, am willing to travel a reasonable distance, and am available any day of the week. Your help will be greatly appreciated and I look forward to working with you!

Think Your Students Can Solve This Problem? (Hint: They Probably Can't)

In his book chapter "Aspects of the Art of Assessment Design," Jan de Lange gives some examples of tasks found on large-scale standardized assessments, such as the TIMSS, PISA, and NAEP. Here's a problem that was rejected from the PISA (it would have been given to 15-year-olds):

For health reasons people should limit their efforts, for instance during sports, in order not to exceed a certain heartbeat frequency. For years the relationship between a person's recommended maximum heart rate and the person's age was described by the following formula:

Recommended maximum heart rate = 220 - age

Recent research showed that this formula should be modified slightly. The new formula is as follows:

Recommended maximum heart rate = 208 - (0.7 x age)

Question 1: A newspaper article stated: "A result of using the new formula instead of the old one is that the recommended maximum number of heartbeats per minute for young people decreases slightly and for old people it increases slightly." From which age onwards does the recommended maximum heart rate increase as a result of the introduction of the new formula? Show your work.

Question 2: The formula Recommended maximum heart rate = 208 - (0.7 x age) also helps determine when physical training is most effective: this is assumed to be when the heartbeat is at 80% of the recommended maximum heart rate. Write down a formula for calculating the heart rate for most effective physical training, expressed in terms of age. (de Lange, 2007, pp. 103-104)

So why do you think this problem was rejected from the PISA? Is it not authentic? Is it not relevant to mathematical content? Is it culturally biased? Nope. It was rejected because when it was field tested, fewer than 10% of students got it correct. As a high school math teacher, I can't identify any mathematics most of my 15-year-olds haven't "learned" (I'm using that term loosely, given the circumstances), but that doesn't mean they can do the problem.
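For anyone who wants to check their own answer, here is a minimal sketch of one way to work both questions in Python. The function names are my own, not de Lange's or PISA's, and I use exact fractions only to avoid floating-point rounding noise.

```python
from fractions import Fraction

def old_max_rate(age):
    """Old formula: 220 - age."""
    return 220 - age

def new_max_rate(age):
    """New formula: 208 - (0.7 x age), kept as an exact fraction."""
    return 208 - Fraction(7, 10) * age

def training_rate(age):
    """Question 2: 80% of the new recommended maximum heart rate."""
    return Fraction(8, 10) * new_max_rate(age)

# Question 1: the two formulas agree where 220 - age = 208 - 0.7 * age,
# which rearranges to 12 = 0.3 * age.
crossover_age = Fraction(220 - 208) / (1 - Fraction(7, 10))

print(crossover_age)             # age at which both formulas give the same value
print(float(training_rate(15)))  # training heart rate for a 15-year-old
```

The formulas cross at age 40, so the new formula recommends a higher maximum for anyone older than 40. Expanding Question 2 gives a training heart rate of 166.4 - (0.56 x age).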

de Lange's point is that we must be careful about excluding problems from our assessments (particularly the large-scale ones) because they are too difficult. There are political pressures from many countries to exclude such items because they make countries look bad, but that's not a great reason to not use them. Having high-quality test items that we can use to track long-term trends is more important than saving face. Without problems like these, we risk not asking ourselves why our students have difficulties with such problems.

While de Lange makes a good argument, opinions may vary. How do you feel about item difficulty on assessments? Does it matter if it's a large-scale test like the PISA versus a classroom assessment? Would you give students a test item that had a known 10% success rate?

de Lange, J. (2007). Aspects of the Art of Assessment Design. In A. H. Schoenfeld (Ed.), Assessing Mathematical Proficiency, Mathematical Sciences Research Institute Publications (pp. 99-111). New York: Cambridge University Press.

Trying to Understand Colorado's School Funding Problems

Yesterday Colorado's Senate Education Committee approved House Bill 10-1369, the school finance bill. If passed, it requires every school district to reduce the state share of its total program funding by 6.35 percent. (State Share = Total Program - Local Share) But here's the catch: seven Colorado districts (Clear Creek, West Grand, Gunnison, Estes Park, Park, Aspen, and Summit) collect less than 6.35 percent's worth of state aid. For example, Park only collects $81 of state aid. (This seems impossible, but that's what the state says.) Instead of losing just that $81, Park has to reduce local revenues (which means reducing its override mill levy) by more than a quarter million dollars so it takes the same 6.35 percent reduction as every other district. How cutting local revenues helps the state budget is beyond me, so it seems the legislature is doing this in the name of equity. I'd like to think I'm somehow reading the information incorrectly, but I'm afraid I'm not.
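To make the arithmetic concrete, here is a rough sketch in Python of how I read the rule: the required cut is 6.35 percent of Total Program, it comes out of state aid first, and whatever state aid can't cover comes out of local revenue. That reading is my assumption, and the $4,500,000 Total Program figure is hypothetical; only the $81 and the 6.35 percent come from the paragraph above. The function name is made up for illustration.

```python
CUT_BASIS_POINTS = 635  # the 6.35 percent reduction, in hundredths of a percent

def split_reduction(total_program, state_share):
    """Split a district's required cut between state aid and local revenue.

    Assumes the cut is 6.35 percent of Total Program and that state aid
    absorbs as much of it as it can before local revenue is touched.
    """
    required_cut = total_program * CUT_BASIS_POINTS / 10_000
    from_state = min(required_cut, state_share)
    from_local = required_cut - from_state
    return from_state, from_local

# Hypothetical district: $4,500,000 Total Program, but only $81 of state aid.
state_cut, local_cut = split_reduction(4_500_000, 81)
print(state_cut, local_cut)
```

With those made-up numbers the state saves $81 while the district gives up $285,669 in local revenue, which is the kind of lopsided result that caught my eye.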

(Update 3/31/2010: The Colorado State Senate approved the school finance bill, but thanks to an amendment proposed by Bob Bacon of Fort Collins, the seven districts named above will not suffer any loss of local funding.)
(Update 4/16/2010: After the House rejected Bacon's amendment, a committee agreed that the seven districts will lose money through state categorical grants. I'm not totally sure what that money is, but I hope to find out. The compromised bill still has to be approved by both houses.)
(Update to the update 4/16/2010: Previously I had confused the "factors" from the "categoricals," and the post now recognizes them properly.)

It may sound crazy, but it's just another example of Colorado's school funding problems. Colorado's school finance formula, established in 1994, determines base funding (a set per-pupil amount) and funding factors (extra money for cost of living, personnel costs, district size, at-risk students, and online students). Together, these define "Total Program" funding. In addition to Total Program funding, some schools get state "categorical" funding, which provides extra money for six categories: small attendance centers, English language proficiency, gifted and talented, special education, transportation, and vocational education. Colorado's tax revenue and education spending picture is defined by three sometimes contradictory laws: the Gallagher Amendment, TABOR, and Amendment 23.

Gallagher, passed in 1982, protects Coloradoans from increasing residential property taxes. Unfortunately, it might be doing that job too well -- Gallagher mandates that residential properties account for about 45 percent of all property tax (nonresidential property covers the other 55 percent), the same as in 1985. The housing boom of the 1990s and early 2000s added greatly to the value of Colorado's residential property, forcing residential property tax rates down significantly in order to keep revenues in line with that 45/55 split.

TABOR (the Taxpayer's Bill of Rights), passed in 1992, puts revenue and spending limitations on all levels of Colorado government, including schools. Government can't grow any faster than inflation plus population growth. In good years, excess revenue is returned to taxpayers. Unfortunately, that means Colorado has a difficult time building up a rainy-day fund in the event of hard times, such as those we have now. In 2005 Coloradoans passed Referendum C, a five-year "time-out" from TABOR, but during that time the state didn't have many opportunities to stockpile surpluses.

Together, Gallagher and TABOR were putting a pretty dire squeeze on education funding, so in 2000 voters passed Amendment 23. The amendment was to restore funding to pre-TABOR levels and mandate minimum increases (at least the rate of inflation) for education spending. You should be wondering: if Amendment 23 requires spending increases, how is the legislature decreasing school spending next year? Due to the tight budget, lawmakers have had to "reinterpret" Amendment 23 to apply it only to base per-pupil funding, and not categorical factors.

So what are the consequences? Gallagher lowered property taxes, reducing one of a school district's primary local revenue sources, and pushed greater funding responsibility onto the state's general fund. TABOR placed restrictions on the general fund, causing underfunding of schools. Amendment 23 promised better funding for schools, but it didn't solve the revenue problems and Colorado still ranks very low nationally in various measures of K-12 spending. There's a current movement to exempt education spending from TABOR, but doing so would allow the legislature to backfill spending in other areas with education dollars, then raise taxes to fill the education hole. Whether you like TABOR or not, creating a loophole is probably not the best way to untangle Colorado's education spending knot.

One effort to untie the Gallagher/TABOR knot is the Lobato v. State case. That case argues the state is failing its constitutional responsibility to maintain a "thorough and uniform system of free public schools." If successful, the lawsuit would force Colorado to re-examine its 1994 school finance formulas, which in turn could force changes in Gallagher and TABOR. Defining "adequate" school funding isn't easy, and much of what was done in 1994 involved the political negotiation of available funds, and not an analysis of how much it actually costs to educate a student. There are three common ways to estimate that cost:

  1. The "Successful School District" Approach (Empirical Identification) - Identify school districts that are performing at or above a set level, then examine how much it's costing them
  2. The "Professional Judgement" Approach (Resource Cost Model) - Ask experts to estimate the resources necessary for success
  3. The "Statistical Prediction" Approach (Cost Functions) - Use statistical models built on current data to predict the funding needed to reach a set level of performance

A good adequacy analysis will use multiple approaches and look for agreement between the models. Analyses recently completed for Colorado suggest adequacy levels near 150 percent of current state funding. Such increases could dramatically change education and school funding in Colorado for the long term. In the short term, however, it appears there is no fix to the current budget hole.

(If I've been incorrect in any of the above, by all means, suggest a correction in the comments!)

Boulder Valley Students "Earn" Zeros on Mis-administered CSAP

According to a story in Boulder's Daily Camera this afternoon:

"Sixty-seven students at Superior's Eldorado K-8 School will receive zeros on the writing portion of this year's CSAP test after an eighth-grade teacher broke the rules by having students practice on writing topics copied from a previous year's test."

Two years ago, the exact same situation happened at Runyon Elementary in Littleton. In both cases, the schools realized what happened soon after the test was administered and contacted CDE to report their concerns. In both cases, it appears CDE will give a score of zero to every affected student. This situation upset lawmakers enough in 2008 to include a provision for it in their CAP4K, more officially known as Senate Bill 08-212. It amended Colorado Revised Statute 22-7-604(3) to read:

"...the department shall identify and implement alterations in the calculation method, or other appropriate measures, to ensure that, to the fullest extent practicable, a public school is not penalized in the calculation of the school's CSAP-area standardized, weighted total score by inadvertent errors committed in the administration of an assessment."

So why doesn't that law apply to this new case in Boulder? Simple: it was repealed last year by Senate Bill 09-163, for reasons I never heard discussed. Assuming this is as honest a mistake as BVSD claims, it doesn't seem fair to give kids zeros. After all, I'm sure the kids know more than nothing. I hope we hear an explanation from CDE and the Colorado legislature about the repeal of the CAP4K provision, and I won't be surprised to find it back in the law soon.

Taking the Salary Schedule Beyond Two Dimensions

If you were to listen to the TV and movie industry, you'd be convinced that two-dimensional content belongs in a museum next to 35mm film and cassette tapes. Soon we'll all be wearing funny glasses in our living rooms, watching Avatar in 3D. Three is more than two, so 3D must be better than 2D, right?

The vast majority of teacher salary schedules are two-dimensional, with experience on the vertical axis and credit hours on the horizontal axis. It's a simple system, and there are some good reasons that teachers' unions have fought to keep it, namely that an employer using such a schedule can't discriminate against females and minorities doing the same work with similar qualifications. However, I wouldn't go so far as to say 2D salary schedules are fair; we hold on to them because they are familiar, easy to use, and teachers accept them as part of their employment. Merit pay might be no more or less fair than a 2D salary schedule, but salary schedules are an unfairness most teachers can agree upon.

Why 2D? Why not 1D? Why not 3D?
So why are our salary schedules two-dimensional, and why did we choose experience and education as the two dimensions? Well, a one-dimensional system is unnecessarily limited; it suggests that only one variable determines the quality of a teacher, and few of us like being told that only one thing about us matters. So why not three dimensions? I think we don't have 3D or higher-dimensional salary schedules because we can't print them in rows and columns on a piece of paper. You can't tell me that years of service and credits earned are the only two objective measures worthy of the salary schedule. I'm currently earning three credit hours for a graduate-level class in the assessment of math and science. If I went back to teaching mathematics now, those three hours would count the same as my three hours in cartography.

Hypothetically, let's ponder a third dimension: student load. The number of students a teacher sees daily impacts the sheer quantity of work most teachers have to do, so why not reward them for it? (Remember, I'm not looking for a perfect measure here - just something that makes at least as much sense as graduate credit hours.) We'd have to determine how and when the number of students in a teacher's class is counted, but as student numbers are tied to school funding, there shouldn't be any budget-busting surprises here. But now we're beyond 2D, and our rows and columns might best give way to a formula. Imagine a school where teacher salaries range between $30,000 and $50,000. Our formula might look like this:

Salary = $25,000 + ($750 * Years) + ($100 * Credits) + ($50 * Students)

Under this formula, a brand-new teacher with no graduate credits and 120 students would make $31,000. A veteran teacher with 20 years of experience, 45 graduate credits, and 120 students would make $50,500. How'd I pick the $750, $100, and $50 amounts? Pretty arbitrarily, just as districts now negotiate for the size of steps in their 2D salary schedules.
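If you'd like to plug in your own numbers, here is a minimal sketch of that formula in Python. The function name is mine, and the dollar amounts are the same arbitrary ones from above.

```python
def linear_salary(years, credits, students):
    """Salary = $25,000 + ($750 * Years) + ($100 * Credits) + ($50 * Students)."""
    return 25_000 + 750 * years + 100 * credits + 50 * students

print(linear_salary(0, 0, 120))    # brand-new teacher: 31000
print(linear_salary(20, 45, 120))  # 20-year veteran with 45 credits: 50500
```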

Should Salary Growth Be Linear Along All Axes?
Here's where things get controversial. Which teacher is bound to show more growth and improvement: the one moving from year 2 to year 3, or the one moving from year 16 to year 17? Should more growth mean a greater increase in pay? Should all credit hours be counted equally? Consider this formula:

Salary = $25,000 + (4000*sqrt(Years)) + ($20 * Credits + $20 * Credits Specific to Current Teaching Assignment + $60 * Credits Earned in Last 3 Years) + ($50 * Students)

Our brand-new teacher with no credits and 120 students still makes $31,000, and our veteran with 20 years of experience, 45 graduate credits (let's say 30 are specific to their current assignment and 3 were taken in the past 3 years), and 120 students makes $50,568.54. Like I've said, I'm picking these figures rather arbitrarily to provoke some thought, and further judgement of the formula is left as an exercise to the reader.
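Here is a minimal sketch of the square-root formula in Python, for anyone who wants to try other scenarios. The function and parameter names are my own, and I round to the nearest cent.

```python
import math

def nonlinear_salary(years, credits, assignment_credits, recent_credits, students):
    """Square-root growth for experience, weighted credits, flat student load."""
    experience = 4000 * math.sqrt(years)
    # Every credit earns $20; credits tied to the current teaching assignment
    # earn another $20, and credits from the last 3 years earn another $60.
    education = 20 * credits + 20 * assignment_credits + 60 * recent_credits
    load = 50 * students
    return round(25_000 + experience + education + load, 2)

print(nonlinear_salary(0, 0, 0, 0, 120))     # brand-new teacher: 31000.0
print(nonlinear_salary(20, 45, 30, 3, 120))  # veteran from the example: 50568.54

# Raise for one more year of experience, early versus late in a career
early_raise = nonlinear_salary(3, 0, 0, 0, 120) - nonlinear_salary(2, 0, 0, 0, 120)
late_raise = nonlinear_salary(17, 0, 0, 0, 120) - nonlinear_salary(16, 0, 0, 0, 120)
print(early_raise, late_raise)
```

With the square root, the step from year 2 to year 3 is worth roughly $1,271, while the step from year 16 to year 17 is worth a bit under $500.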

If you could pick a third (or fourth, or fifth) variable to take us beyond a two-dimensional salary schedule, what would it be?

Do Online Gradebooks Compromise Our Teaching?

Aaron Eyler recently raised the question of online gradebooks on his blog. While Aaron's concerns centered more on "what does a grade mean" and the easier-than-ever assumptions we can make by looking at a letter in a gradebook, I've been more concerned about the negative effects online gradebooks might be having on how we teach. I'm all for running an open classroom and I like knowing that parents and students are monitoring progress, but I believe online gradebooks have some negative consequences. For example:

1. Do online gradebooks discourage formative assessments? From my standpoint, once you assign a fixed grade to an assignment, the gradebook sees it as summative. (Even if the teacher doesn't.) Formative assessments are important tools in both assessment and instruction, and often can go on for days without deserving a grade in the gradebook. Unfortunately, we get external pressure to put everything in the gradebook, whether it be from parents or administrators who want to monitor progress or athletic directors needing grades to determine eligibility. Students can also lack motivation if they aren't seeing their grades change as they work.

2. Should online gradebooks show class assignment averages? Suppose you give an assignment to ten students and the scores are 90, 90, 90, 80, 80, 70, 70, 70, 0, and 0. (The use of zeros is another issue, but they were expected in my school if students didn't turn in assignments.) All the students who turned in the assignment passed, and half the class got a B or better (80% = B). But because of the zeros the class average is 64%, which was failing at my school. Sadly, more than once a parent or administrator would contact me and say I had failed to teach the students because "the class got an F on the assignment."
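To show how much those two zeros move the number, here is a quick sketch in Python using the ten scores above. Treating a zero as "not turned in" and 80 as the cutoff for a B are assumptions that match this example, not rules from any particular gradebook.

```python
from statistics import mean

scores = [90, 90, 90, 80, 80, 70, 70, 70, 0, 0]

class_average = mean(scores)                   # what the gradebook reports
submitted = [s for s in scores if s > 0]       # assume a zero means "not turned in"
submitted_average = mean(submitted)            # average of the work actually done
b_or_better = sum(1 for s in scores if s >= 80)

print(class_average)      # 64
print(submitted_average)  # 80
print(b_or_better)        # 5 of the 10 students
```

The same set of scores reads as an F or a B depending on whether the missing work is counted.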

3. Online gradebooks (at least the ones I've used) only accept numbers as input, significantly restricting the options teachers have for giving feedback. Butler (1987) found that showing students the grade they earned on an assignment had a negative impact on their performance. If you have a choice between grades, feedback, and grades plus feedback, going with feedback only can lead to the most improvement because students will focus on the feedback and use it to improve. With online gradebooks, the grade received on any assignment is a click away, potentially rendering the feedback less useful.

Poor grading practices can have negative effects on both assessment and instruction. I'd be surprised to find a school not using an online gradebook, whether it be popular systems like Infinite Campus and Powerschool, or systems from smaller players like Go.edustar, SME, MyGradeBook, Thinkwave, and Gradeconnect. (A Google search reveals many more!) Each gradebook has its own limitations, but my three concerns above likely will exist in all of them. What are your experiences with online gradebooks? Am I underestimating the positives? Are there negatives that I haven't listed? I'd love to hear your thoughts.


Butler, R. (1987). Task-involving and ego-involving properties of evaluation. Journal of Educational Psychology, 79(4), 474-482.