Tuesday, July 31, 2012

Settling slope and constructive Khan criticism

This was co-written with Frederick Peck, a fellow Ph.D. student in mathematics education at the University of Colorado at Boulder and the Freudenthal Institute US. We each have six years of experience teaching Algebra 1 and are engaged in research on how students understand slope and linear functions. Fred shares his research and curriculum at RMEintheClassroom.com.



Sal Khan (CC BY-NC-ND Elvin Wong)
The Answer Sheet has recently been the focus of a lively debate pitting teacher and guest blogger Karim Kai Ani against the Khan Academy's Salman Khan. While Karim's initial post focused mainly on Sal Khan's pedagogical approach, Karim also took issue with the accuracy of Khan Academy videos. As an example, he pointed to the video on slope. Specifically, Karim claimed Sal's definition of slope as "rise over run" was a way to calculate slope, but wasn't, itself, a definition of slope. Rather, Karim argued, slope should be defined as "a rate that describes how two variables change in relation to one another." Sal promptly responded, saying Karim was incorrect, and that "slope actually is defined as change in y over change in x (or rise over run)." To bolster his case Sal referenced Wolfram MathWorld, and he encouraged Valerie Strauss to "seek out an impartial math professor" to help settle the debate. We believe that a better way to settle this would be to consult the published work of experts on slope.

Working on her dissertation in the mid-1990s, Sheryl Stump (now the Department Chairperson and a Professor of Mathematical Sciences at Ball State University) did some of the best work to date about how we define and conceive of slope. Stump (1999) found seven ways to interpret slope, including: (1) Geometric ratio, such as "rise over run" on a graph; (2) Algebraic ratio, such as "change in y over change in x"; (3) Physical property, referring to steepness; (4) Functional property, referring to the rate of change between two variables; (5) Parametric coefficient, referring to the "m" in the common equation for a line y=mx+b; (6) Trigonometric, as in the tangent of the angle of inclination; and finally (7) a Calculus conception, as in a derivative.

If you compare Karim's and Sal's definitions to Stump's list, you'll likely judge that while both have been correct, neither has been complete. We could stop here and declare this duel a draw, but to do so would foolishly ignore that there is much more to teaching and learning mathematics than knowing what belongs in a textbook glossary. Indeed, research suggests that a robust understanding of slope requires (a) the versatility of knowing all seven interpretations (although only the first five would be appropriate for a beginning algebra student); (b) the flexibility that comes from understanding the logical connections between the interpretations; and (c) the adaptability of knowing which interpretation best applies to a particular problem.

All seven slope interpretations are closely related and together create a cohesive whole. The problem is, it's not immediately obvious why this should be so, especially to a student who is learning about slope. For example, if slope is steepness, then why would we multiply it by x and add the y-intercept to find a y-value (i.e., as in the equation y=mx+b)? And why does "rise over run" give us steepness anyway? Indeed, is "rise over run" even a number? Students with a robust understanding of slope can answer these questions. However, Stump and others have shown that many students -- even those who have memorized definitions and algorithms -- cannot.
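
As one sketch of how these questions can be answered, take a hypothetical line through the points (1, 5) and (3, 9). The numbers are ours, chosen only for illustration. The rise is 4 and the run is 2, so "rise over run" is a single number:

\[
m = \frac{\mbox{rise}}{\mbox{run}} = \frac{\Delta y}{\Delta x} = \frac{9 - 5}{3 - 1} = 2
\]

Read as a rate, that 2 says y grows by 2 for every 1 that x grows. Starting from the known point (1, 5), the accumulated change in y is the rate times the change in x:

\[
y = 5 + 2(x - 1) = 2x + 3
\]

Under this reading, multiplying the slope by x accumulates the change, and the y-intercept 3 is simply the value of y when x is 0.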

This returns us to Karim's original point: There exists better mathematics education than what we currently find in the Khan Academy. Such an education would teach slope through guided problem solving and be focused on the key concept of rate of change. These practices are recommended by researchers and organizations such as the NCTM, and lend credence to Karim's argument for conceptualizing slope primarily as a rate. However, even within this best practice, there is nuance. For instance, researchers have devoted considerable effort to understanding how students construct the concept of rate of change, and they have found, for example, that certain problem contexts elicit this understanding better than others.

Despite all we know from research, we should not be surprised that there's still no clear "right way" to teach slope. Mathematics is complicated. Teaching and learning are complicated. We should never expect a "one-size-fits-all" approach. Instead, educators should learn from research and adapt it to fit their own unique situations. In Karim's description of teachers on Twitter debating "whether slope should always have units," we see the kind of incremental learning and adapting that moves math education forward. These conversations become difficult when Sal declares in his rebuttal video that "it's actually ridiculous to say that slope always requires units*" and calls Karim's math "very, very, very wrong." We absolutely believe that being correct (when possible) is important, but we need to focus less on trying to win a mathematical debate and focus more on the kinds of thoughtful, challenging, and nuanced conversations that help educators understand a concept well enough to develop better curriculum and pedagogy for their students.

Khan Academy (CC BY-NC-ND Juan Tan Kwon)
This kind of hard work requires careful consideration and an open conversation, even for a seemingly simple concept like slope. We encourage Sal to foster this conversation and build upon what appears to be a growing effort to make Khan Academy better. Doing so will require more than rebuttal videos that re-focus on algorithms and definitions. It will require more than teachers' snarky critiques of such videos. Let's find and encourage more ways to include people with expertise in the practice and theories of teaching mathematics, including everyone from researchers who devote their lives to understanding the nuance in learning to the "Twitter teachers" from Karim's post who engage with this research and put it into practice. This is how good curriculum and pedagogy are developed, and it's the sort of work that we hope to see Sal Khan embrace in the future.



*Sal's point is that if two quantities are both measured in the same units, then the units "cancel" when the quantities are divided to find slope. As an example, he uses the case of vertical and horizontal distance, both measured in meters. The slope then has units of meters/meters, which "cancel". However, the situation is not so cut and dried, and indeed, has been considered by math educators before. For example, Judah Schwartz (1988) describes how units of lb/lb might still be a meaningful unit. Our point is not to say that one side is correct. Rather, we believe that the act of engaging in and understanding the debate is what is important, and that such a debate is cut short by declarative statements of "the right answer."
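
To make the two cases concrete with numbers of our own (not Sal's or Karim's), compare a hypothetical ramp that rises 3 meters over a run of 4 meters with a hypothetical trip of 150 miles in 3 hours:

\[
\frac{3\ \mbox{m}}{4\ \mbox{m}} = 0.75
\qquad \mbox{versus} \qquad
\frac{150\ \mbox{miles}}{3\ \mbox{hours}} = 50\ \frac{\mbox{miles}}{\mbox{hour}}
\]

In the second case the units clearly survive. In the first they cancel on paper, yet one can still read the result as 0.75 meters of rise per meter of run, which is one way a like-unit ratio can remain meaningful.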

References

Schwartz, J. (1988). Intensive quantity and referent transforming arithmetic operations. In J. Hiebert & M. J. Behr (Eds.), Number Concepts and Operations in the Middle Grades (Vol. 2, pp. 41–52). Reston, VA: National Council of Teachers of Mathematics.

Stump, S. L. (1999). Secondary mathematics teachers' knowledge of slope. Mathematics Education Research Journal, 11(2), 124–144. Retrieved from http://www.springerlink.com/index/R422558466765681.pdf

Friday, July 27, 2012

RYSK: Stump's Secondary Mathematics Teachers' Knowledge of Slope (1999)

This is the ninth in a series of posts describing "Research You Should Know" (RYSK).

I think just about every Algebra 1 student I ever taught came to me from Prealgebra knowing what slope was. At least they thought they knew what slope was. They could usually echo the words "rise over run," and I admit that very early in my career I probably would have found that somewhat satisfactory. But with each new Algebra 1 class (I taught 14 sections in 6 years), my students' limited understandings of slope became more frustrating. Honestly, it wasn't until my last year of teaching that I really felt I had the kinds of problems, activities, and explanations to help students construct an understanding of slope that I was happy with.

In discussions with my graduate school colleagues Fred Peck and Michael Matassa, I found that my experience wasn't unique. We were interested in exploring slope further, and that led us to an article by Sheryl Stump. Her name seemed familiar, and as soon as I saw her picture I realized that I'd had lunch with her at a conference just a few weeks before. I suppose if I'd found the article earlier I could have talked to her about slope instead of swapping stories about our common Midwestern roots, but she's been kind enough to reply to my emails when we've wanted to know more.

I think one of the reasons I liked Stump's article was that it focused on teachers instead of students. After all, the biggest reason I became dissatisfied with my students' understanding of slope was because my own understanding had grown more sophisticated with each trip through the curriculum. In her article, Secondary Mathematics Teachers' Knowledge of Slope (1999), Stump investigated the definitions, understandings, and pedagogical content knowledge of 18 preservice and 21 inservice teachers. Nearly all the inservice teachers had degrees in math or math education, including eight with math or math ed masters degrees, and their teaching experience ranged from 1 to 32 years.

Stump's review of previous literature on slope revealed a number of descriptions, including ratios, tangent, and -- because of its applications in physics, calculus, and other real-world applications -- the key concept of linear functions and rates of change. Stump also found everyday ways of thinking about slope, such as the downward slant of a hill from top to bottom. These different, yet related, descriptions of slope had led to misunderstandings in previous studies with students. No one had yet tackled this kind of research with teachers, so Stump designed and administered a survey and conducted interviews to understand what her study participants understood about slope. Her questions, "What is slope?" and "What does slope represent?" elicited responses that were sorted into seven categories:
  • The category of geometric ratio included representations such as "\(\frac{\mbox{rise}}{\mbox{run}}\)" and "vertical change over horizontal change" and focused on slope as a geometric property.
  • The category of algebraic ratio included representations such as "\(\frac{y_2 - y_1}{x_2 - x_1}\)" and "the change in y over the change in x," in which slope was defined by an algebraic formula.
  • The words "slant", "steepness", "incline", "pitch", and "angle" were categorized as involving a physical property.
  • Responses referring to slope as the rate of change between two variables were categorized as involving a functional property.
  • The parametric coefficient category included references to m in the equation y = mx + b.
  • A trigonometric conception of slope referred to the tangent of the angle of inclination.
  • A calculus conception included mention of the concept of derivative. (p. 129)
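
For readers who want to see the categories side by side, here is a minimal sketch that computes slope for one line under several of the conceptions above. The line y = 2x + 3, the sample points, and all variable names are our own assumptions for illustration; none of this comes from Stump's study.

```python
import math

# Hypothetical line y = 2x + 3 (an assumption, chosen only for illustration).
m_true, b_true = 2.0, 3.0


def f(x):
    """Evaluate the hypothetical line at x."""
    return m_true * x + b_true


x1, x2 = 1.0, 3.0
y1, y2 = f(x1), f(x2)

# Geometric ratio: rise over run, read as two lengths on a graph.
rise, run = y2 - y1, x2 - x1
geometric = rise / run

# Algebraic ratio: the formula (y2 - y1) / (x2 - x1).
# For a line this is the same arithmetic as rise over run.
algebraic = (y2 - y1) / (x2 - x1)

# Functional property: how much y changes when x increases by one unit.
functional = f(x1 + 1) - f(x1)

# Parametric coefficient: the m in y = mx + b.
parametric = m_true

# Trigonometric: tangent of the angle of inclination.
# This matches the other values only if both axes use the same scale.
angle = math.atan2(rise, run)
trigonometric = math.tan(angle)

# Calculus: the derivative, approximated here by a central difference.
h = 1e-6
calculus = (f(x1 + h) - f(x1 - h)) / (2 * h)

slopes = {
    "geometric ratio": geometric,
    "algebraic ratio": algebraic,
    "functional property": functional,
    "parametric coefficient": parametric,
    "trigonometric": trigonometric,
    "calculus": calculus,
}

for name, value in slopes.items():
    print(f"{name}: {value:.4f}")

# Physical property: steepness, reported here as an angle in degrees.
print(f"angle of inclination: {math.degrees(angle):.1f} degrees")
```

Every numeric conception lands on the same value for a straight line, which is the kind of connection among representations that the study asks about.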
Both the preservice and the inservice teachers in Stump's study averaged about 2.5 representations per teacher in their definitions. The geometric ratio representation of slope was easily the most common for both groups (83% of preservice, 86% of inservice), but preservice teachers most commonly (61%) used algebraic ratio as a second representation, while inservice teachers commonly (81%) described a physical property. Descriptions of slope using the parametric, trigonometric, and calculus conceptions were rare or nonexistent.

Stump then gave the two groups six math questions, each designed to test different understandings of slope. Both the first question, about rate of growth, and the second question, finding a linear equation given its parameters, were answered correctly by 100% of the teachers in both groups. Questions about slope as speed, read from a graph, were answered correctly by about two-thirds to three-fourths of teachers in each group. The most dissimilar performance came on a question about angle of inclination, answered correctly by 33% of preservice teachers and 67% of inservice teachers.

Next Stump asked the teachers, "What mathematical concepts must students have experience with before they can truly understand slope?" (p. 132). By a wide margin, both groups said a geometric representation was most important, but only three teachers in each group mentioned experiences with functional relationships. Similarly, when asked for real-world contexts for understanding slope, both groups tended to choose a physical property instead of a functional property. About a quarter of the teachers in each group didn't mention either, naming algebraic or geometric representations instead (p. 133).

Stump's teacher interviews allowed her to dig more deeply into teachers' understandings about how students learn about slope. When asked about student difficulties, almost all the inservice teachers referred to a calculation procedure, saying "they put the x's over the y's" or "the order in which they subtract them" (p. 139). Preservice teachers predicted similar difficulties with symbol manipulation. One preservice teacher said:
My guess is that some might be frightened off as soon as you introduce a mathematical definition or a formula for a line, like the slope-intercept of the equation. As soon as some people see equations, they just go nuts, especially with symbols instead of numbers. ... Not because they don't understand what slope is, but because they are not making the connection between the intuitive and even the not-so-intuitive idea of taking the ratio of this to this. Not making the connection between that and the symbolic abstract equation on paper. That's just a guess. I haven't had experience with that. (pp. 139-140)
In her discussion section, Stump acknowledges teachers' tendency to think of slope first as a geometric ratio, with a smaller majority commonly thinking of it as a physical property. Very few teachers -- fewer than 20% -- had a functional conception of slope. Stump continues:
Considering the importance of the study of functions for high school students, it is especially troubling that functional situations involving slope were missing from so many teachers' descriptions of their instructional practices. Their students may thus miss opportunities to make this important connection while forming their conceptions of slope. Rizzuti (1991) found that instruction that included multiple representations of functions allowed students to develop comprehensive and multi-faceted conceptions of functions. Based on the results of the present investigation, it is questionable whether the participating teachers could assist their students in developing such a rich conception of slope. (p. 141)
Finally, Stump asks some important questions for further study, such as, "When textbooks connect various representations of slope, do teachers emphasise those connections for their students? Can teachers learn to make connections even if textbooks do not emphasise them?" (p. 141). I don't think we really know the answers to those questions, but I do absolutely agree with Stump's closing recommendation: "Both preservice and inservice mathematics teachers need opportunities to examine the concept of slope, to reflect on its definition, to construct connections among its various representations, and to investigate functional situations involving physical slope situations" (p. 142). It's good to see that kind of work being done, such as the teaching experiment research and slope curriculum that Fred Peck and Michael Matassa shared at ICME-12.

References

Stump, S. L. (1999). Secondary mathematics teachers’ knowledge of slope. Mathematics Education Research Journal, 11(2), 124–144. Retrieved from http://www.springerlink.com/index/R422558466765681.pdf

Thursday, July 26, 2012

How the Race (to the Top) Was Won (Part 2 of 2)

Note to self: In the future, don't go on a multi-state road trip and become otherwise distracted for more than a month between Parts 1 and 2 of a two-part blog post. Sorry, readers!

In my previous post, I set out to answer the following questions about Phase 1 of Race to the Top (RTT):
  1. Where in the RTT rubric could states score the most points?
  2. For the portions of the RTT rubric identified in #1, which states scored highest?
  3. For the states identified in #2, what did their application propose and what were the judges' comments?
The answer to #1 was Section D of the RTT application, "Great Teachers and Leaders." If you break down that section, you'll find that the highest subsection scores belonged to Delaware, Tennessee, Georgia, South Carolina, Rhode Island, Kentucky, Louisiana, and Kansas. This post will dig into the actual applications to see what proposed reforms warranted those high scores, along with some of the comments made by the judges during the scoring.

(D)(1) – Providing High-Quality Pathways for Aspiring Teachers and Principals (21 points)

The RTT scoring rubric specifies that this criterion must be judged for both teachers and principals. High points are awarded for alternative certification routes that operate independently of institutions of higher education (IHEs) and include at least four of the following five definitional criteria: (a) programs are operated by a variety of providers, (b) candidates are admitted selectively, (c) candidates have school-based experiences and ongoing support, (d) coursework is limited, and (e) certifications are the same as those from traditional programs (Kentucky Department of Education, 2010, p. 7; U.S. Department of Education, 2010b, p. 10).

Kentucky has a 20-year history of alternative certification programs, and in 2003 the Kentucky Legislature allocated funds for the creation and growth of such programs (Kentucky Department of Education, 2010, p. 118). Kentucky now has seven alternative programs, including specific programs for people with extensive work experience, college faculty, and veterans of the Armed Forces, as well as district-based, university-based, and institute-based options. Most of the programs are selective; the work experience route requires at least ten years of work in the area of certification and several others require Bachelor's degrees in the relevant content area (Kentucky Department of Education, 2010, p. 120). Ten percent of Kentucky's current teachers and 17 percent of new teachers in 2009-2010 completed an alternative program.

There were only two aspects of this portion of Kentucky's application that received criticism from the reviewers, and only two of the five reviewers deducted any points for this section. The first deficiency was in the area of alternative principal licensing. Kentucky only has one program for alternative principal certification, and it does not meet all five of the definitional criteria (U.S. Department of Education, 2010c, p. 32). Only one new principal was alternatively licensed in 2009-2010, and that individual went through a university-based alternative program (Kentucky Department of Education, 2010, p. 122). The other deficiency was in Kentucky's process for identifying and acting upon teacher and principal shortages. Shortages are identified by Kentucky's LEAs, but current efforts to place teachers are limited both in number and geographic reach (U.S. Department of Education, 2010c, p. 32).

My take: Kentucky got perfect scores from 3 of 5 judges, and the top score overall in this area, despite only having one alternative program for administrators that produced only one new administrator the previous year. This was the best proposal in the country.

(D)(2) – Improving teacher and principal effectiveness based on performance (58 total points across four subsections)

This criterion, also applicable to both teachers and principals, is worth a maximum of 58 points. Rhode Island scored 94 percent of those points to lead all applicants. Unlike the previous criterion, this one is divided into four distinct subsections. Rhode Island had the top score or tied for top score in three of those subsections.
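
As a quick arithmetic check on how this criterion is weighted, the sketch below adds up the four subsection point values given in the subsection headings of this post and converts Rhode Island's reported 94 percent into points. The variable names are ours, and the fractional result is only a conversion of the percentage, not a figure quoted from the score sheets.

```python
# Point values for the four (D)(2) subsections, taken from the headings in this post.
subsection_points = {
    "(D)(2)(i) Measuring Student Growth": 5,
    "(D)(2)(ii) Developing Evaluation Systems": 15,
    "(D)(2)(iii) Conducting Annual Evaluations": 10,
    "(D)(2)(iv) Using Evaluations to Inform Key Decisions": 28,
}

# The subsections should add up to the 58 points available for (D)(2).
total = sum(subsection_points.values())

# Share of the criterion carried by each subsection.
shares = {name: pts / total for name, pts in subsection_points.items()}
for name, pts in subsection_points.items():
    print(f"{name}: {pts} points ({shares[name]:.0%} of (D)(2))")

# Rhode Island's reported 94 percent of (D)(2), expressed in points.
ri_points = 0.94 * total
print(f"Total: {total} points; 94% of that is about {ri_points:.1f} points")
```

The largest subsection, (D)(2)(iv), carries a little under half of the criterion's points on its own.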

(D)(2)(i) - Measuring Student Growth (5 points)

Even though this subsection is only worth five points, Tennessee was the only state to be awarded a perfect score. To earn the points, states must "establish clear approaches to measuring student growth and measure it for each individual student" (U.S. Department of Education, 2010b, p. 11).

Tennessee has been using the Tennessee Value-Added Assessment System (TVAAS) since 1992. Tennessee claims to track every grade 3-12 student in every subject, and the state claims the TVAAS is the largest student database in history (Tennessee Department of Education, 2010, p. 82). Despite the "every subject" claim, a brief look at the publicly-viewable data on the TVAAS website (https://tvaas.sas.com/evaas/welcome.jsp) indicates only core subjects are tested, and not at every grade. Tennessee has used the TVAAS to determine Adequate Yearly Progress, to support progressive districts, identify strengths and weaknesses in grades or subjects, and inform instruction, but prior to the 2009-2010 school year only 14 percent of the state's teachers had access to the database (Tennessee Department of Education, 2010, p. 82). Now that Tennessee has opened up access to all teachers, the state plans to train both current teachers and administrators, as well as pre-service teachers, in the use of the database. Not only will this value-added data be linked to teacher and principal compensation and evaluations, the state "will monitor and report access and usage of the system at the teacher, school, and district levels" (Tennessee Department of Education, 2010, p. 82). No explanation is given in Tennessee's application for this level of monitoring and reporting, but one might assume it is to apply pressure to ensure that the system is universally used.

None of the five reviewers of Tennessee's application made any mention of the criticisms that have been leveled against the TVAAS, even though such research is easily found (Amrein-Beardsley, 2008; Kupermintz, 2003). The statistical model employed in the TVAAS, the Education Value-Added Assessment System (EVAAS), may currently be the best value-added model available, but experts have had difficulty resolving its flaws because neither the statistical algorithms nor full value-added data sets have been disclosed for peer review. As one researcher stated, "My own and others' attempts to access the EVAAS value-added data have consistently gone without response or been refused with the justification that the value-added data, if released to external researchers, might be misrepresented" (Amrein-Beardsley, 2008, p. 68).

My take: Tennessee got a perfect score despite not collecting data for all students, at all grade levels, for all subjects, and they use a system that researchers aren't allowed to inspect because the state doesn't think they'll understand it. This was the best proposal in the country.

(D)(2)(ii) – Developing Evaluation Systems (15 points)

This subsection encourages the development of an evaluation system that "differentiate[s] effectiveness using multiple rating categories that take into account data on student growth ... as a significant factor" (U.S. Department of Education, 2010b, p. 11). The evaluation system should be "rigorous, transparent, and fair" and be "designed and developed with teacher and principal involvement" (U.S. Department of Education, 2010b, p. 11).

Rhode Island was the top-scoring state in this subcategory, but unlike Tennessee, Rhode Island does not have a value-added assessment system currently in place. Instead, it is rushing to implement one by the 2011-2012 school year so it can be "fully operational" by 2013-2014 (Rhode Island Department of Education, 2010, p. 95). Rhode Island plans to use this system liberally in educator evaluations:
Every decision made in regard to the professional educators in Rhode Island, whether made by an LEA or the state, will be based on evidence of the respective teacher's or principal's impact on student growth and academic achievement in addition to other measures of content knowledge, instructional quality, and professional responsibility. These new RI Standards ensure that no child in Rhode Island will be taught by a teacher who has received an "ineffective" evaluation for two consecutive years. (Rhode Island Department of Education, 2010, p. 97)
Instead of mandating a single statewide evaluation system, Rhode Island will allow individual LEAs to develop their own, provided they comply with the rigorous standards specified by the state. LEAs that choose not to develop their own systems may adopt a state-provided evaluation system, and LEAs that try but fail to develop a compliant system must adopt it.

Reviewers of Rhode Island's application awarded them 96 percent of the possible points for this subsection. In response to Rhode Island's "no child will be taught by an ineffective teacher" clause, one reviewer commented, "This is bold, it shows the seriousness of effort and it is an incredibly important foundation for RTT plans to get traction" (U.S. Department of Education, 2010d, p. 5). Only one reviewer seriously questioned Rhode Island's aggressive timeline for implementing their evaluation system. Even though the state forecasts a "fully operational" value-added system by 2013-2014, value-added data will account for 40 percent of a teacher's evaluation starting in 2011-2012 before rising to 45 percent in 2012-2013 and 51 percent in 2013 (Rhode Island Department of Education, 2010, p. 98; U.S. Department of Education, 2010d, p. 44).

My take: Rhode Island outscored every other state by mandating that districts transparently and fairly evaluate teachers based on data that didn't exist yet and a growth model the state didn't yet have.

(D)(2)(iii) – Conducting Annual Evaluations (10 points)

Two states, Tennessee and Rhode Island, scored the maximum ten points on this subsection. To earn the maximum points, the RTT scoring rubric requires that states have policies requiring "annual evaluations of teachers and principals that include timely and constructive feedback" (U.S. Department of Education, 2010b, p. 11) and that those evaluations include student growth data.

Tennessee gained favor in the scoring by having recently passed their "First to the Top Act," which establishes a 15-member Teacher Evaluation Advisory Committee tasked with developing a new evaluation system. All participating Tennessee LEAs will use the new evaluation system as described:
The evaluation system may be used to publicly report data that includes, but is not limited to, differentiation of teacher and principal performance (percentage in each rating category), the LEA's ability to increase the percentage of effective teachers and principals, and percentage of compensation based on instructional effectiveness. To ensure accountability on improving performance of teachers and principals, the state will encourage LEAs to set annual improvement goals, with a minimum of 15% improvement in terms of the number of educators moving up in each rating category. (Tennessee Department of Education, 2010, p. 86)
Much appears to be hinging on the application's use of "may be used" and "the state will encourage." One reviewer, despite awarding a perfect score, advised that "It would make sense to pilot some of these ideas in several districts and make any needed adjustments before adopting them statewide in July, 2011" (U.S. Department of Education, 2010e, p. 4). Meanwhile, another reviewer questioned, "With such heavy weighting on student achievement data, it is not clear what solutions the State has to evaluate teachers in non-tested subjects or grades" and "It is not clear if this new evaluation system will need to be collectively bargained, and if so, how the State intends to secure teacher buy-in" (U.S. Department of Education, 2010e, p. 12). None of the reviewers explicitly questioned whether it is realistic to expect a minimum 15 percent improvement every year in the number of teachers moving up the evaluation rating categories. Only time will tell if this is a sustainable goal.

Compared to Tennessee, Rhode Island's annual evaluation proposal looks decidedly unremarkable and received few comments from reviewers. Rhode Island called for annual evaluations at a minimum, and the state is responsible for providing teachers and principals the academic growth data that constitutes the bulk of their evaluation. The evaluations must also be based on the “quality of instruction (or, for principals, quality of instructional leadership and management), demonstration of professional responsibilities, and content knowledge” (Rhode Island Department of Education, 2010, pp. 101-102). LEAs are expected to review evaluations to guide their professional development programs.

My take: Tennessee planned to evaluate everyone but only had a system designed to measure teachers in just a few subjects. How they would negotiate an expansion of their system wasn't clear. How they expected endless annual 15% improvements wasn't clear. Still, this and Rhode Island's rather bland proposal were the best in the country.
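
To illustrate why an open-ended 15% annual goal raises questions, here is a minimal sketch under one possible reading of the application's language: that the share of educators in a given rating category or higher grows by 15% each year relative to the year before. That reading, the starting shares, and the function name are all our assumptions, since the application does not spell out the calculation.

```python
def years_goal_is_attainable(start_share, annual_growth=0.15):
    """Count how many consecutive years a share can grow by `annual_growth`
    (compounding) before it would have to exceed 100% of educators.

    `start_share` is a hypothetical starting fraction between 0 and 1.
    """
    if not 0 < start_share <= 1:
        raise ValueError("start_share must be in the interval (0, 1]")
    years = 0
    share = start_share
    # Keep compounding while the next year's target still fits under 100%.
    while share * (1 + annual_growth) <= 1.0:
        share *= 1 + annual_growth
        years += 1
    return years


# Hypothetical starting shares, not figures from Tennessee's application.
for start in (0.25, 0.50, 0.80):
    years = years_goal_is_attainable(start)
    print(f"Starting at {start:.0%}: goal can be met for {years} year(s)")
```

Under this reading, even a low starting share runs out of room within about a decade, so the goal cannot continue indefinitely. Other readings of the 15% figure would give different numbers.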

(D)(2)(iv) – Using Evaluations to Inform Key Decisions (28 points)

By far the largest subsection of criterion (D)(2), constituting nearly half its possible points, this subsection is targeted towards using evaluations to inform "key decisions." The RTT rubric specifies four such "key decisions":
(a) Developing teachers and principals, including by providing relevant coaching, induction support, and/or professional development;
(b) Compensating, promoting, and retaining teachers and principals, including by providing opportunities for highly effective teachers and principals ... to obtain additional compensation and be given additional responsibilities;
(c) [Granting] tenure and/or full certification (where applicable) to teachers and principals using rigorous standards and streamlined, transparent, and fair procedures; and
(d) Removing ineffective tenured and untenured teachers and principals after they have had ample opportunities to improve, and ensuring that such decisions are made using rigorous standards and streamlined, transparent, and fair procedures. (U.S. Department of Education, 2010b, p. 11)
South Carolina and Rhode Island tied for the top score on this subsection, each earning 93 percent of the possible points. South Carolina currently uses two data systems: the system for Assisting, Developing, and Evaluating Professional Teaching (ADEPT) and the Program for Assisting, Developing, and Evaluating Principal Performance (PADEPP). (Acronym-loving South Carolina's RTT application is named INSPIRE, short for “Innovation, Next Generation Learners, Standards & Assessments, Personalized Instruction, Input and Choice, Redesigned Schools, Effective Teachers & Leaders, and Data Systems.”) South Carolina plans to tie these systems into their state-controlled certification system (which determines contract and due process rights) and statewide salary schedule (U.S. Department of Education, 2010f, p. 29). With the state handling certifications, tenure, and salaries, it will be much easier for South Carolina to implement the reforms specified in the RTT scoring rubric.

One reviewer awarded only 18 of 28 points and had particularly critical comments for this part of South Carolina's proposal:
The state proposes to provide induction support for beginning teachers and principals. There is no mention of coaching services after the induction period. The state application explains various statutory issues related to tenure and insists that tenure will be related to performance. The explanation is inadequate and does not lay out a clear plan. (U.S. Department of Education, 2010f, p. 4)
A different reviewer gave South Carolina the maximum 28 points for this subsection, saying only that "all beginning teachers and principals [will] receive induction support and mentoring" and "Salary incentives are part of South Carolina's plan, teacher effectiveness, retention, full certification, and removal, if necessary" (U.S. Department of Education, 2010f, p. 20). This kind of variability is a problem with the design of the RTT rubric and will be discussed in the conclusion of this post.

South Carolina might have the top-scoring proposal for using evaluations to inform decision-making, but their assessment and data systems have some glaring overall problems. The statewide data systems, ADEPT and PADEPP, do not use a value-added model. Some LEAs are piloting a value-added "approach," and the state plans on developing or selecting a statewide model in the near future. The data used in that eventual model will first be from their current statewide assessment and the Measures of Academic Progress (MAP), but the state plans to abandon those assessments in favor of one aligned with the Common Core K-12 standards, whenever one becomes available (South Carolina Department of Education, 2010, p. 102).

Rhode Island equaled South Carolina's score, but did so while retaining a more traditional measure of local control. In most cases, the LEAs will be setting their policies to meet the proposed goals of Rhode Island's RTT application, and the State Department of Education will assume an enforcement role. For the compensation piece, Rhode Island proposes funding four pilot programs with RTT dollars. By 2015, LEAs will be able to choose one of the four compensation models or develop their own with the state providing guidance and support (Rhode Island Department of Education, 2010, p. 106).

As discussed previously, Rhode Island plans to use their evaluation system for promotion, retention, and certification of teachers. LEAs will have to prove to the state that they are using evaluation data in these decisions and report to the state those teachers who have earned promotions or leadership responsibilities, which will require at least an "effective" or "highly effective" rating on their annual evaluation (Rhode Island Department of Education, 2010, p. 107). LEAs will also have to certify that they have removed all non-tenured ineffective teachers and any teacher marked "ineffective" two years in a row (Rhode Island Department of Education, 2010, pp. 108-109). The state will continue to manage the certification system and current educators will be subject to the new rules as their current certificates renew.

My take: Despite the high point value of this subsection, the U.S. DoE seems unclear about whether it believes more strongly in local control or state control, in current tests or future tests, or in mentoring or induction support. These were the best proposals in the country.

(D)(3) – Ensuring equitable distribution of effective teachers and principals (25 points across two subsections)

Louisiana led all states by taking 90 percent of the maximum 25 points for this section, but did not have the high score in either of the two subsections. Instead of reviewing Louisiana's application, we will focus on the applications from Georgia and Kansas.

(D)(3)(i) – Ensuring equitable distribution in high-poverty or high-minority schools (15 points)

The RTT scoring rubric for this subsection requires policies that "ensure that students in high-poverty and/or high-minority schools ... have equitable access to highly effective teachers and principals ... and are not served by ineffective teachers and principals at higher rates than other students" (U.S. Department of Education, 2010b, p. 11). Georgia earned 93 percent of the 15 available points in this subsection to lead all states. Georgia's strategy divides cleanly into solving problems of supply and demand. On the demand side, Georgia plans to award bonuses to effective teachers and principals in high-need schools "tied to the degree of reduction made in the student achievement gap every year" (Georgia Department of Education, 2010, p. 121). To entice effective teachers to move to high-need rural areas, the state is proposing $50,000 tax-free bonuses that vest over three years and require the teacher to maintain a high rating on the state's Teacher Effectiveness Measure (TEM). Districts wanting to participate in this program must compete for the funds and prove that the teachers eligible for bonuses have an established record of high achievement. Georgia is being bold with this plan, despite their decision not to "[offer] these kinds of bonuses to principals, having experimented with significant bonuses for principals in the past and having found that these incentives were not effective in getting principals to relocate" (Georgia Department of Education, 2010, p. 121). To improve the supply side of equitable teacher distribution, Georgia will work with LEAs to improve professional development and partner with organizations like Teach for America and The New Teacher Project that have experience recruiting teachers for hard-to-fill positions.

Only one reviewer offered the most glaring criticism of Georgia's plan: "There is also detail missing ... on the systems to ensure distribution over time" (U.S. Department of Education, 2010g, p. 40). The RTT money allocated for the bonuses is temporary, and programs like Teach for America and The New Teacher Project are not well-known for placing teachers who remain in high-need areas for more than a few years.

My take-away: Georgia actually had a straightforward approach here -- fill difficult assignments by offering significantly more money to teachers who have shown an ability to raise scores and close achievement gaps. Will it work? No one's sure, but this proposal should be worth following up on. After all, it was the best proposal in the country.

(D)(3)(ii) – Ensuring equitable distribution in hard-to-staff subjects and specialty areas (10 points)

Kansas, whose application ranked 29th overall, makes a surprise appearance at the top of the scoreboard. They introduce this section of their application with some startling statistics:

The Teaching in Kansas Commission found that:
  • 42% of Kansas teachers leave the field after seven years,
  • 36% of Kansas teachers can retire within the next 5 years,
  • 25% fewer students entered the teaching profession over the past six years,
  • An 86% decrease in Kansas teacher biology licenses will occur within 6 years,
  • A 50% decrease in chemistry licenses will occur within 6 years, and
  • A 67% decrease in physics licenses will occur within 6 years. (Kansas Department of Education, 2010, p. 81)
Kansas's plan mostly consists of expanding their UKanTeach program at the University of Kansas, both at KU and to other institutions of higher education around the state. Kansas claims that “UKanTeach is dramatically increasing the number of math and science teachers graduating from KU, resulting in over 100 new, highly qualified science and math teachers each year” (Kansas Department of Education, 2010, p. 81). They claim this “dramatic increase” without citing the number of graduates before the UKanTeach program and fail to address non-STEM hard-to-staff subjects such as special education and language instruction. Neither of these criticisms was mentioned by any of the five reviewers of Kansas's application. Additionally, although Kansas has other plans for teacher preparation and retention in hard-to-serve areas, the reviewers almost universally fail to cite them in their comments (U.S. Department of Education, 2010h).

My take: Kansas's proposal sounds practical but lacks details. Can UKanTeach do anything for non-STEM teachers? Why did the judges find this to be the best proposal in the country without an answer to that question?

(D)(4) – Improving the effectiveness of teacher and principal preparation programs (14 points)

This criterion asks for a quality plan for linking student achievement and growth to the in-state teacher and principal preparation programs and expanding those programs identified as successful.

Tennessee, which earned 90 percent of the possible points, uses brief but strong language to sell this part of their application. They proudly boast "The cornerstones are competition and accountability," and "Our State Board of Education (SBE) has broken the monopoly on teacher preparation held by institutions of higher education" (Tennessee Department of Education, 2010, p. 110). Tennessee claims to publicly report their teacher preparation program quality data, but a search of their Department of Education website (http://tn.gov/education/) when the RTT results were announced revealed nothing. Tennessee planned in 2010 to gather stakeholders from across the state to examine how they link student achievement data to teacher preparation programs and develop a plan to "reward programs that are successful and support or decertify those that fail to produce effective teachers" (Tennessee Department of Education, 2010, p. 111). Most of the reviewers of Tennessee's application cited a lack of focus on principal preparation programs to match those for teachers (U.S. Department of Education, 2010e).

Rhode Island doesn't use Tennessee's tough language, but claims to "[act] aggressively to close programs that do not meet its rigorous current standards and has closed two programs, including a principal preparation program, in the last 5 years" (Rhode Island Department of Education, 2010, p. 125). Every educator preparation program in the state must be re-approved every five years and Rhode Island plans to include data from teacher and principal evaluations in the re-approval process. Specifically, Rhode Island wishes to track how many educators from each preparation program earn full Professional Certification and a disaggregation of preparation program graduates in high vs. low poverty and minority schools (Rhode Island Department of Education, 2010, p. 125).

My take: It's troubling to see Rhode Island acknowledge their closing of two teacher/principal preparation programs, and more troubling to see the judges view that as a positive achievement, without knowing in detail the specific failures of those institutions that led to the closures. How were the programs not meeting Rhode Island's "rigorous standards" and what efforts had been made to improve them? It would have been far more impressive for our country's best proposals to describe a successful rebuilding of those programs than their simple termination.

(D)(5) – Providing effective support to teachers and principals (20 points)

This criterion is based on two goals: provide ongoing, targeted professional development and supports while also monitoring and improving the quality of the professional development and supports. The supports could include “coaching, induction, and common planning and collaboration time to teachers and principals” (U.S. Department of Education, 2010b, p. 12).

Delaware earned 95% of the available points by requiring all participating LEAs to adopt a comprehensive professional development plan that contains all the supports specified in the rubric. Although this was the top-scoring plan, one reviewer commented:
The key weakness of this plan is the lack of specificity about how LEAs will know what is a good PD model and what is not – this section seems vague and not well thought through. Compared to other plans in the Delaware application, this area is not very creative nor clear. (U.S. Department of Education, 2010i, p. 15)
Delaware does specify plans for certifying effective professional development programs and requiring LEAs to adopt such high-quality programs by the 2010-2011 school year, but the eleven pages of description in the Delaware application didn't translate into rich commentary from the reviewers, despite the high scores.

My take: It's as if the reviewers are confident in Delaware's plan despite not being able to accurately describe what the plan contains. Somehow, this was still better than the proposals from all other states.

Discussion

Taken all together, we see a policy preference for: (a) many alternative routes to certification, (b) an extensive value-added assessment system, (c) teacher and principal evaluations based on student performance and growth data, (d) annual evaluations of all teachers and principals, (e) teacher and principal compensation, promotion, and retention policies tied to evaluations, (f) incentives for teachers and principals to serve in high-need areas, (g) programs to increase the supply of teachers for hard-to-fill subjects, (h) quality, accountable teacher preparation programs, and (i) effective professional development.

This should be no surprise, because this is precisely what the RTT rubric asked for. How did this encourage a large pool of innovative and creative reforms? Is Kentucky's 20-year-old alternative licensure program creative? Is Tennessee's value-added assessment system, in use since 1992, innovative? It's very possible the RTT rubric has stifled creativity and innovation as much as it encouraged them. Even worse, states may have abandoned the innovative ideas they developed in Phase 1 and instead chosen to copy the above high-scoring states in the hopes of winning funding.

A very troubling aspect of many proposed policies is the dependence of so many important decisions on a value-added student performance model that is not 100 percent transparent. Regardless of opinions concerning the use of value-added models, or beliefs that value-added models could achieve perfect accuracy and reliability, the use of a non-transparent model (such as the EVAAS) in so-called transparent evaluation systems is a significant flaw. Software is patentable and profitable, while the underlying mathematics is not, so the motivations for keeping at least some parts of these growth models secret are understandable, even if undesirable. Still, the RTT process could have been strengthened significantly if the scoring rubric had required 100 percent transparency for any and all statistical operations performed on educational data.

My final criticism of this process lies in the RTT rubric itself. Why have 500 total points? Why is "providing high-quality pathways for aspiring teachers and principals" worth 21 points and "ensuring equitable distribution of teachers" worth 25? Who decided that one category should be worth four points more than the other and why? If those four points had been allocated elsewhere, would the results have changed?

In a paper by Peterson and Rothstein (2010), the authors expose the arbitrary way in which points were allocated in the RTT rubric and show how changes in the weights of categories could have changed the outcome of the entire RTT competition. For example, had a mere 15 points been added to any of the four criteria (improving student outcomes, using data to improve instruction, using evaluations to inform key decisions, and ensuring equitable distribution), with the other criteria each decreased by less than half a point to keep the rubric's total score at 500, Georgia would have won the RTT competition (Peterson & Rothstein, 2010, p. 4). Similarly, the "demonstrating other significant reforms" criterion was allocated only one percent (5 points) of the total rubric. Given the innovation possible in this "other" category, including reforms called for in the DoE Blueprint and other federal education programs, it would have been reasonable to justify giving that category a larger weight. If that weight had been 25 percent of the application, then Pennsylvania would have been the winner (Peterson & Rothstein, 2010, p. 5).
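To make the reweighting argument concrete, here is a minimal Python sketch of the general idea, not Peterson and Rothstein's actual method or data. The 28- and 25-point values are the ones quoted in this post for the evaluations and equitable distribution criteria; the "everything_else" bucket, the two states, and the fractions of points they earn are all invented for illustration. The function names are hypothetical too. Because this toy rubric has only three criteria, each of the other two loses 7.5 points rather than the less-than-half-point reduction described above.

```python
def reweight(weights, boosted, extra):
    """Add `extra` points to one criterion and subtract an equal share
    from every other criterion, so the rubric total stays the same."""
    others = [name for name in weights if name != boosted]
    share = extra / len(others)
    new_weights = {name: weights[name] - share for name in others}
    new_weights[boosted] = weights[boosted] + extra
    return new_weights


def total_score(weights, earned):
    """Sum of each criterion's point value times the fraction earned."""
    return sum(weights[name] * earned[name] for name in weights)


def winner(weights, states):
    """Name of the state with the highest total under these weights."""
    return max(states, key=lambda name: total_score(weights, states[name]))


# 28 and 25 are point values quoted in this post; "everything_else" is an
# assumed bucket that lumps the remaining points together to reach 500.
weights = {"evaluations": 28, "distribution": 25, "everything_else": 447}

# Hypothetical fractions of available points earned. NOT real RTT results.
states = {
    "State X": {"evaluations": 0.60, "distribution": 0.90, "everything_else": 0.80},
    "State Y": {"evaluations": 0.95, "distribution": 0.70, "everything_else": 0.78},
}

boosted = reweight(weights, "evaluations", 15)
before = winner(weights, states)   # leader under the original weights
after = winner(boosted, states)    # leader after shifting 15 points
print(before, "->", after)
```

With these made-up numbers, the leader changes once 15 points move to the evaluations criterion, even though neither state's underlying performance changed. That is the kind of sensitivity to weights the paper describes.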

This design of the RTT rubric and its point allocation not only affected the outcome of Phase 1, but likely affected the following phases even more strongly. The elements of the proposals examined in this post were chosen regardless of the margin of victory. Not only are slim margins statistically insignificant in a 500-point rubric, but the scoring process itself leads to some arbitrary selections. Unfortunately, when trying to play catch-up with the winners, the simplest thing to do is copy, not create. In doing so, RTT reinforces a "don't just stand there, do something" atmosphere for reform, even if the choice and effectiveness of those "somethings" are uncertain and arbitrary.

References

Amrein-Beardsley, A. (2008). Methodological Concerns About the Education Value-Added Assessment System. Educational Researcher, 37(2), 65-75. doi:10.3102/0013189X08316420

Georgia Department of Education. (2010, January 19). Race to the Top: Application for Initial Funding. Retrieved from http://www2.ed.gov/programs/racetothetop/phase1-applications/georgia.pdf

Kansas Department of Education. (2010, January 14). Race to the Top: Application for Initial Funding. Retrieved from http://www2.ed.gov/programs/racetothetop/phase1-applications/kansas.pdf

Kentucky Department of Education. (2010, January 14). Race to the Top: Application for Initial Funding. Retrieved from http://www2.ed.gov/programs/racetothetop/phase1-applications/kentucky.pdf

Kupermintz, H. (2003). Teacher Effects and Teacher Effectiveness: A Validity Investigation of the Tennessee Value Added Assessment System. Educational Evaluation and Policy Analysis, 25(3), 287-298. doi:10.3102/01623737025003287

Peterson, W., & Rothstein, R. (2010). Let's do the Numbers: Department of Education's "Race to the Top" Program Offers Only a Muddled Path to the Finish Line (Briefing Paper No. 263). EPI Briefing Papers. Washington, D.C.: Economic Policy Institute. Retrieved from http://www.epi.org/page/-/BriefingPaper263.pdf

Rhode Island Department of Education. (2010, January 14). Race to the Top: Application for Initial Funding. Retrieved from http://www2.ed.gov/programs/racetothetop/phase1-applications/rhode-island.pdf

Tennessee Department of Education. (2010, January 18). Race to the Top: Application for Initial Funding. Retrieved from http://www2.ed.gov/programs/racetothetop/phase1-applications/tennessee.pdf

U.S. Department of Education. (2010b). Race to the Top Scoring Rubric Corrected. Washington, D.C.: U.S. Department of Education. Retrieved from http://www2.ed.gov/programs/racetothetop/scoringrubric.pdf

U.S. Department of Education. (2010c). Race to the Top: Technical Review Form - Kentucky. Retrieved from http://www2.ed.gov/programs/racetothetop/phase1-applications/comments/kentucky.pdf

U.S. Department of Education. (2010d). Race to the Top: Technical Review Form - Rhode Island. Retrieved from http://www2.ed.gov/programs/racetothetop/phase1-applications/comments/rhode-island.pdf

U.S. Department of Education. (2010e). Race to the Top: Technical Review Form - Tennessee. Retrieved from http://www2.ed.gov/programs/racetothetop/phase1-applications/comments/tennessee.pdf

U.S. Department of Education. (2010f). Race to the Top: Technical Review Form - South Carolina. Retrieved from http://www2.ed.gov/programs/racetothetop/phase1-applications/comments/south-carolina.pdf

U.S. Department of Education. (2010g). Race to the Top: Technical Review Form - Georgia. Retrieved from http://www2.ed.gov/programs/racetothetop/phase1-applications/comments/georgia.pdf

U.S. Department of Education. (2010h). Race to the Top: Technical Review Form - Kansas. Retrieved from http://www2.ed.gov/programs/racetothetop/phase1-applications/comments/kansas.pdf

U.S. Department of Education. (2010i). Race to the Top: Technical Review Form - Delaware. Retrieved from http://www2.ed.gov/programs/racetothetop/phase1-applications/comments/delaware.pdf