CSAP Item Treemaps

Colorado's NCLB standardized testing program, the Colorado Student Assessment Program (CSAP), is administered to all Colorado public school students in grades 3-10 during late March or early May. The math test is given to each grade as a series of three one-hour testing sessions.

In my experience, the CSAP is very comprehensive, but getting details about the content of the test is a bit tricky. Colorado's standards are written for K-12; its benchmarks detail the standards further for spans of grade levels (K-4, 5-8, and 9-12). These are in no way detailed enough to design a curriculum, so the next level of detail is the assessment framework - the (generally) specific list of grade-specific topics from which the CSAP is designed. Analysis at this level might bring accusations of "teaching to the test," but the Colorado Department of Education (CDE) doesn't provide schools with much further guidance.

The assessment frameworks are a prime example of standards being a mile wide and an inch deep. They suggest quality content, for sure, but such a massive amount of it that most teachers I've seen become a bit bewildered when trying to comprehend it all. With a finite amount of time to teach this content, should a teacher prioritize? If so, how?

I've known teachers who take what I consider to be an unethical shortcut: each spring they riffle through the test books and note the questions on the test. Those same teachers argue that because they aren't copying the problems or planning to use them verbatim, it isn't unethical. I argue that if CDE wanted teachers to see the test, they'd release the tests each year. CDE hasn't even released a single math test item since 2005.

CDE does, however, release item maps that indicate details about every single test item, including the assessment framework objective and difficulty. The data does not come without a disclaimer, though. From the CDE website:

Purpose of Item Maps
The item maps contain information that may be of some assistance examining a school or [sic] districts adopted curricular alignment to the state standards. They are not an instructional tool, and cannot be used to develop curriculum.

Please refer to the Standards for Educational and Psychological testing relative to the ethical and appropriate use of assessment data, including item maps.


Item maps must not be used to create yearly instructional targets. Please keep in mind that objectives are assessed on a cyclical basis, and item focused instruction based on item map information is not only ineffective, it is an unethical use of the information provided.

While I admire CDE for wanting their data to be used in an ethical manner, offering item maps under these conditions comes across as a bit of a tease. CDE has drawn a very fine line when they say the data can be used to evaluate the alignment of curriculum to the standards, but cannot be used to actually develop any curriculum. Does this mean that as long as you're modifying a curriculum and not building one from scratch, you're okay? Surely that's not what CDE meant, but it does come across that way.

Back to the prioritizing problem: how do teachers prioritize content in their curriculum to meet the demands of the standards in a finite school year? In my case, my school year has been particularly finite - I first taught in a block-scheduled, single-semester system (averaging 85 days), then moved to a 4-day-week school with a 144-day school year. Sure, our periods might have been a few minutes longer than a traditional 180-day school's, but in math that doesn't usually translate into more lessons being taught. We were 36 days shorter than 180-day schools, the equivalent of one whole quarter for us. Over all of K-12 that's 468 days, or an extra 3.25 of our 144-day school years. Prioritization is necessary for performance.
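As a quick back-of-the-envelope check, the day counts above work out exactly (all numbers come straight from the schedules described in the text):

```python
# Schedule arithmetic from the text: a 4-day-week, 144-day school year
# versus a traditional 180-day year, over the 13 grades of K-12.
traditional_year = 180   # days in a traditional school year
four_day_year = 144      # days in the 4-day-week school year
grades = 13              # K through 12

shortfall_per_year = traditional_year - four_day_year        # 36 days
quarters_short = shortfall_per_year / (four_day_year / 4)    # one whole quarter
total_shortfall = shortfall_per_year * grades                # 468 days over K-12
years_short = total_shortfall / four_day_year                # 3.25 short school years

print(shortfall_per_year, quarters_short, total_shortfall, years_short)
```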

After several years of struggle, I turned to the item maps in the hopes I could better align my curriculum (not just what was taught, but for how long) to the standards. As a mathematician, I knew the potential pitfalls of focusing too narrowly on a single year of testing, but when I started there were five years' worth of high school item map data. I combined, averaged, and summarized as best I could, and learned more about the structure and progression of the standards than I ever thought I would. (I'm embarrassed now to admit how little I knew when I started teaching.) My final product, a new set of spreadsheets with relative rankings of the objectives within each standard, was complete. I showed it to a few other teachers, but they didn't seem to find it as useful as I had. I had learned through the process of building the spreadsheets, not from the final product, and other teachers seemed to get the same overwhelming feeling from looking at them as they would have gotten from the original item maps.
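The combine-and-rank step is simple in principle. Here's a minimal sketch using made-up rows - the real CDE item maps carry more fields (item number, difficulty, benchmark) and different labels, so treat the data below as purely hypothetical:

```python
from collections import Counter

# Hypothetical item-map rows: (year, standard, objective).
# Real item maps have more columns; this only illustrates the
# combine-across-years and rank-within-standard idea.
items = [
    (2004, "2", "2.1"), (2004, "2", "2.1"), (2004, "2", "2.3"),
    (2005, "2", "2.1"), (2005, "2", "2.3"), (2005, "2", "2.3"),
    (2006, "2", "2.1"), (2006, "2", "2.2"),
]

# Count how often each objective was assessed across all years combined.
counts = Counter((std, obj) for _, std, obj in items)

# Rank objectives by total item count; more items suggests higher priority.
ranked = sorted(counts.items(), key=lambda kv: -kv[1])
for (std, obj), n in ranked:
    print(f"standard {std}, objective {obj}: {n} items")
```

Aggregating across years is what guards against the single-year pitfall: an objective skipped in one test cycle still shows up in the multi-year totals.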

In June of 2009 I decided to tackle the problem again, this time with a goal of visually representing the data. I originally wanted a circle chart where you could drill down into each slice to reveal details at the standard, benchmark, and assessment framework levels, but the coding proved too complicated. My search for a solution led me to treemaps and JavaScript that I could easily work with. Here, visually, are the treemaps for each grade level:

Objectives are easily prioritized by shape and color, and the aggregation of many years' worth of data helps avoid overlooking topics that might not be addressed on any one particular year's test. Is this "teaching to the test"? Perhaps. That's a common battle cry against standardized education. But if there are standards to meet, and a generally well-designed, comprehensive test that measures students' performance on those standards, and item map data that can help us summarize the content of that test so it can be better prioritized in a finite school year, why not use it? If you have a better set of data or a better argument against using it, I'd love to hear it.
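The actual charts above were built with an existing JavaScript treemap, but the core layout idea - each objective gets a rectangle whose area is proportional to its averaged item count - is easy to sketch. Here is a minimal one-level "slice-and-dice" layout (full treemap algorithms recurse and alternate split direction per level; this sketch handles a single level):

```python
def slice_and_dice(weights, x, y, w, h, vertical=True):
    """Split the rectangle (x, y, w, h) into strips whose areas are
    proportional to the weights. Returns (weight, (x, y, w, h)) pairs."""
    total = sum(weights)
    rects = []
    offset = 0.0
    for wt in weights:
        frac = wt / total
        if vertical:
            # Vertical strips: each gets a share of the width.
            rects.append((wt, (x + offset, y, w * frac, h)))
            offset += w * frac
        else:
            # Horizontal strips: each gets a share of the height.
            rects.append((wt, (x, y + offset, w, h * frac)))
            offset += h * frac
    return rects

# e.g. four objectives weighted by averaged item counts, in a 100x100 box
layout = slice_and_dice([4, 3, 2, 1], 0, 0, 100, 100)
```

With weights 4, 3, 2, 1, the heaviest objective claims 40% of the area, which is exactly the at-a-glance prioritization the treemaps provide.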