RYSK: Shulman's Those Who Understand: Knowledge Growth in Teaching (1986)

This is the 15th in a series describing "Research You Should Know" (RYSK) and part of my OpenComps. I also Storified this article as I read.

Lee Shulman. (CC BY-NC) Penn State
George Bernard Shaw once said, "He who can, does. He who cannot, teaches." To that, you could say Lee Shulman takes offense. Shulman, a long-time faculty member at Michigan State (1963-1982) and then Stanford, explained his position and a new way of thinking about teacher knowledge in his AERA Presidential Address and the accompanying paper, Those Who Understand: Knowledge Growth in Teaching. Shulman is now an emeritus professor but stays active traveling, speaking, and occasionally blogging.

Wondering why the public often has a low opinion of teachers' knowledge and skill, Shulman first looks at the history of teacher examinations. In the latter half of the 1800s, examinations for people wishing to teach were almost entirely content-based. In 1875, for example, the California State Board examination for elementary teachers was a day-long, 1000-point exam covering everything from mental arithmetic to geography to vocal music. Its section on the theory and practice of teaching, however, was worth only 50 of the 1000 points and included questions like, "How do you interest lazy and careless pupils?" (p. 5)

By the 1980s, when Shulman wrote this article, teacher examinations painted almost the opposite picture. Instead of focusing on content, they focused on topics such as lesson planning, cultural awareness, and other aspects of teacher behavior. While the topics usually had roots in research, they clearly did not represent the wide spectrum of skills and knowledge a teacher needs to be successful. More specifically, by the 1980s our teacher examinations seemed to care as little about content as the examinations a century prior had cared about pedagogy.

Looking back even further in history, Shulman recognized that we haven't always made this distinction between content and teaching knowledge. The names of our highest degrees, "master" and "doctor," both essentially mean "teacher" and reflect the belief that the highest form of knowing is teaching, an idea going back at least to Aristotle:

We regard master-craftsmen as superior not merely because they have a grasp of theory and know the reasons for acting as they do. Broadly speaking, what distinguishes the man who knows from the ignorant man is an ability to teach, and this is why we hold that art and not experience has the character of genuine knowledge (episteme) -- namely, that artists can teach and others (i.e., those who have not acquired an art by study but have merely picked up some skill empirically) cannot. (Wheelwright, 1951, as cited in Shulman, 1986, p. 7)

Shulman saw a blind spot in this dichotomy between content and teaching knowledge. What he saw was a special kind of knowledge that allows teachers to teach effectively. After studying secondary teachers across subject areas, Shulman and his fellow researchers looked to better understand the source of teachers' comprehension of their subject areas, how that knowledge grows, and how teachers understand and react to curriculum, reshaping it into something their students will understand.

Pedagogical Content Knowledge

To better understand this special knowledge of teaching, Shulman suggested we distinguish three different kinds of content knowledge: (a) subject matter knowledge, (b) pedagogical content knowledge, and (c) curricular knowledge. It is the second of these, pedagogical content knowledge (PCK), for which Shulman is best remembered. He describes the essence of PCK:

Within the category of pedagogical content knowledge I include, for the most regularly taught topics in one's subject area, the most useful forms of representation of those ideas, the most powerful analogies, illustrations, examples, explanations, and demonstrations -- in a word, the ways of representing and formulating the subject that make it comprehensible to others. Since there are no single most powerful forms of representation, the teacher must have at hand a veritable armamentarium of alternative forms of representation, some of which derive from research whereas others originate in the wisdom of practice. (p. 9)

In addition to these three kinds of teacher knowledge, Shulman also proposed we consider three forms of teacher knowledge: (a) propositional knowledge, (b) case knowledge, and (c) strategic knowledge. These are not separate from the three kinds of knowledge named above, but rather describe different forms each kind can take. Propositional knowledge consists of those things we propose teachers do, such as "planning five-step lesson plans, never smiling until Christmas, and organizing three reading groups" (p. 10). Shulman organized propositional knowledge into principles, maxims, and norms: the first usually emerge from research, the second come from practical experience (and are generally untestable, like the suggestion not to smile before Christmas), and the third concern things like equity and fairness. Propositions can be helpful, but they are difficult to remember and to implement as research intended.

Learning propositions out of context is difficult, so Shulman proposed case knowledge as the second form of teacher knowledge. By case, he means learning about teaching in much the same way a lawyer learns about the law: by studying prior cases. To truly understand a case, a learner starts with the factual information and works toward the theoretical aspects that explain why things happened. By studying well-documented cases of teaching and learning, teachers consider prototype cases (that exemplify the theoretical), precedents (that communicate maxims), and parables (that communicate norms and values). (If you're scoring at home, Shulman has now said there are three types of cases, which itself is one of three forms of knowledge, each of which is capable of describing three different kinds of content knowledge.)

The last form of knowledge, strategic knowledge, describes how a teacher reacts when faced with contradictions of other knowledge or wisdom. Knowing when to bend the rules or go against conventional wisdom takes more than luck -- it requires a teacher to be "not only a master of procedure but also of content and rationale, and capable of explaining why something is done" (p. 13).

The value of this article goes beyond its theoretical description of pedagogical content knowledge. It also serves as a strong reminder that when we judge a teacher, we must consider a broad spectrum of skills and abilities, and not limit ourselves to only those things we think can be easily measured. As Shulman explains:

Reinforcement and conditioning guarantee behavior, and training produces predictable outcomes; knowledge guarantees only freedom, only the flexibility to judge, to weigh alternatives, to reason about both ends and means, and then to act while reflecting upon one's actions. Knowledge guarantees only grounded unpredictability, the exercise of reasoned judgment rather than the display of correct behavior. If this vision constitutes a serious challenge to those who would evaluate teaching using fixed behavioral criteria (e.g., the five-step lesson plan), so much the worse for those evaluators. The vision I hold of teaching and teacher education is a vision of professionals who are capable not only of acting, but of enacting -- of acting in a manner that is self-conscious with respect to what their act is a case of, or to what their act entails. (p. 13)

In our current era of teacher evaluation and accountability, with all its observational protocols and test-score-driven value-added models, this larger view of teaching presented to us by Shulman is a gift. His recommendation that teacher evaluation and examination "be defined and controlled by members of the profession, not by legislators or laypersons" (p. 13) is a wise one, no matter how politically difficult. Shulman hoped for tests of pedagogical content knowledge that truly measure teachers' special skills, tests that non-teaching content experts would not pass. I don't think those measurement challenges have been overcome, but continuing towards that goal should strengthen teacher education programs while also improving the perception of teaching as a profession. As Shulman concludes (p. 14):

We reject Mr. Shaw and his calumny. With Aristotle we declare that the ultimate test of understanding rests on the ability to transform one's knowledge into teaching.

Those who can, do. Those who understand, teach.

References

Shulman, L. S. (1986). Those who understand: Knowledge growth in teaching. Educational Researcher, 15(2), 4–14. Retrieved from http://www.jstor.org/stable/3202180

How the Race (to the Top) Was Won (Part 2 of 2)

Note to self: In the future, don't go on a multi-state road trip and become otherwise distracted for more than a month between Parts 1 and 2 of a two-part blog post. Sorry, readers!

In my previous post, I set out to answer the following questions about Phase 1 of Race to the Top (RTT):
  1. Where in the RTT rubric could states score the most points?
  2. For the portions of the RTT rubric identified in #1, which states scored highest?
  3. For the states identified in #2, what did their application propose and what were the judges' comments?
The answer to #1 was Section D of the RTT application, "Great Teachers and Leaders." If you break down that section, you'll find that the highest subsection scores belonged to Delaware, Tennessee, Georgia, South Carolina, Rhode Island, Kentucky, Louisiana, and Kansas. This post will dig into the actual applications to see what proposed reforms warranted those high scores, along with some of the comments made by the judges during the scoring.

(D)(1) – Providing High-Quality Pathways for Aspiring Teachers and Principals (21 points)

The RTT scoring rubric specifies that this criterion must be judged for both teachers and principals. High points are awarded for alternative certification routes that operate independently of institutions of higher education (IHEs) and include at least four of the following five definitional criteria: (a) programs are operated by a variety of providers, (b) candidates are admitted selectively, (c) candidates have school-based experiences and ongoing support, (d) coursework is limited, and (e) certifications are the same as in traditional programs (Kentucky Department of Education, 2010, p. 7; U.S. Department of Education, 2010b, p. 10).

Kentucky has a 20-year history of alternative certification programs, and in 2003 the Kentucky Legislature allocated funds for the creation and growth of such programs (Kentucky Department of Education, 2010, p. 118). Kentucky now has seven alternative programs, including specific programs for people with extensive work experience, college faculty, and veterans of the Armed Forces, as well as district-based, university-based, and institute-based options. Most of the programs are selective; the work experience route requires at least ten years of work in the area of certification and several others require Bachelor's degrees in the relevant content area (Kentucky Department of Education, 2010, p. 120). Ten percent of Kentucky's current teachers and 17 percent of new teachers in 2009-2010 completed an alternative program.

There were only two aspects of this portion of Kentucky's application that received criticism from the reviewers, and only two of the five reviewers deducted any points for this section. The first deficiency was in the area of alternative principal licensing. Kentucky has only one program for alternative principal certification, and it does not meet all five of the definitional criteria (U.S. Department of Education, 2010c, p. 32). Only one new principal was alternatively licensed in 2009-2010, and that individual went through a university-based alternative program (Kentucky Department of Education, 2010, p. 122). The other deficiency was in Kentucky's process for identifying and acting upon teacher and principal shortages. Shortages are identified by Kentucky's LEAs, but current efforts to place teachers are limited in both number and geographic reach (U.S. Department of Education, 2010c, p. 32).

My take: Kentucky got perfect scores from 3 of 5 judges, and the top score overall in this area, despite having only one alternative program for administrators, a program that produced just one new administrator the previous year. This was the best proposal in the country.

(D)(2) – Improving Teacher and Principal Effectiveness Based on Performance (58 total points across four subsections)

This criterion, also applicable to both teachers and principals, is worth a maximum of 58 points. Rhode Island scored 94 percent of those points to lead all applicants. Unlike the previous criterion, this one is divided into four distinct subsections. Rhode Island had the top score or tied for top score in three of those subsections.

(D)(2)(i) – Measuring Student Growth (5 points)

Even though this subsection is only worth five points, Tennessee was the only state to be awarded a perfect score. To earn the points, states must "establish clear approaches to measuring student growth and measure it for each individual student" (U.S. Department of Education, 2010b, p. 11).

Tennessee has been using the Tennessee Value-Added Assessment System (TVAAS) since 1992. Tennessee claimed to track every grade 3-12 student in every subject, and the state claims the TVAAS is the largest student database in history (Tennessee Department of Education, 2010, p. 82). Despite the "every subject" claim, a brief look at the publicly-viewable data on the TVAAS website (https://tvaas.sas.com/evaas/welcome.jsp) indicates only core subjects are tested, and not at every grade. Tennessee has used the TVAAS to determine Adequate Yearly Progress, support progressive districts, identify strengths and weaknesses in grades or subjects, and inform instruction, but prior to the 2009-2010 school year only 14 percent of the state's teachers had access to the database (Tennessee Department of Education, 2010, p. 82). Now that Tennessee has opened up access to all teachers, the state plans to train current teachers and administrators, as well as pre-service teachers, in the use of the database. Not only will this value-added data be linked to teacher and principal compensation and evaluations, but the state also "will monitor and report access and usage of the system at the teacher, school, and district levels" (Tennessee Department of Education, 2010, p. 82). No explanation is given in Tennessee's application for this level of monitoring and reporting, but one might assume it is meant to apply pressure to ensure the system is universally used.

None of the five reviewers of Tennessee's application made any mention of the criticisms that have been levied against the TVAAS, even though such research is easily found (Amrein-Beardsley, 2008; Kupermintz, 2003). The statistical model employed in the TVAAS, the Education Value-Added Assessment System (EVAAS), may currently be the best value-added model available, but experts have had difficulty resolving its flaws because neither the statistical algorithms nor full value-added data sets have been disclosed for peer review. As one researcher stated, "My own and others' attempts to access the EVAAS value-added data have consistently gone without response or been refused with the justification that the value-added data, if released to external researchers, might be misrepresented" (Amrein-Beardsley, 2008, p. 68).

My take: Tennessee got a perfect score despite not collecting data for all students, at all grade levels, for all subjects, and they use a system that researchers aren't allowed to inspect because the state doesn't think they'll understand it. This was the best proposal in the country.

(D)(2)(ii) – Developing Evaluation Systems (15 points)

This subsection encourages the development of an evaluation system that "differentiate[s] effectiveness using multiple rating categories that take into account data on student growth ... as a significant factor" (U.S. Department of Education, 2010b, p. 11). The evaluation system should be "rigorous, transparent, and fair" and be "designed and developed with teacher and principal involvement" (U.S. Department of Education, 2010b, p. 11).

Rhode Island was the top-scoring state in this subcategory, but unlike Tennessee, Rhode Island does not have a value-added assessment system currently in place. Instead, it is rushing to implement one by the 2011-2012 school year so it can be "fully operational" by 2013-2014 (Rhode Island Department of Education, 2010, p. 95). Rhode Island plans to use this system liberally in educator evaluations:
Every decision made in regard to the professional educators in Rhode Island, whether made by an LEA or the state, will be based on evidence of the respective teacher's or principal's impact on student growth and academic achievement in addition to other measures of content knowledge, instructional quality, and professional responsibility. These new RI Standards ensure that no child in Rhode Island will be taught by a teacher who has received an "ineffective" evaluation for two consecutive years. (Rhode Island Department of Education, 2010, p. 97)
Instead of mandating a single statewide evaluation system, Rhode Island will allow individual LEAs to develop their own, provided they comply with the rigorous standards specified by the state. LEAs that choose not to develop their own systems (or whose systems fall short of the standards) must adopt a state-provided evaluation system.

Reviewers of Rhode Island's application awarded them 96 percent of the possible points for this subsection. In response to Rhode Island's "no child will be taught by an ineffective teacher" clause, one reviewer commented, "This is bold, it shows the seriousness of effort and it is an incredibly important foundation for RTT plans to get traction" (U.S. Department of Education, 2010d, p. 5). Only one reviewer seriously questioned Rhode Island's aggressive timeline for implementing their evaluation system. Even though the state forecasts a "fully operational" value-added system by 2013-2014, value-added data will account for 40 percent of a teacher's evaluation starting in 2011-2012 before rising to 45 percent in 2012-2013 and 51 percent in 2013-2014 (Rhode Island Department of Education, 2010, p. 98; U.S. Department of Education, 2010d, p. 44).

My take: Rhode Island outscored every other state by mandating that districts transparently and fairly evaluate teachers based on data that didn't exist yet and a growth model the state didn't yet have.

(D)(2)(iii) – Conducting Annual Evaluations (10 points)

Two states, Tennessee and Rhode Island, scored the maximum ten points on this subsection. To earn the maximum points, the RTT scoring rubric requires that states have policies requiring "annual evaluations of teachers and principals that include timely and constructive feedback" (U.S. Department of Education, 2010b, p. 11) and that those evaluations include student growth data.

Tennessee gained favor in the scoring by having recently passed their "First to the Top Act," which establishes a 15-member Teacher Evaluation Advisory Committee tasked with developing a new evaluation system. All participating Tennessee LEAs will use the new evaluation system as described:
The evaluation system may be used to publicly report data that includes, but is not limited to, differentiation of teacher and principal performance (percentage in each rating category), the LEA's ability to increase the percentage of effective teachers and principals, and percentage of compensation based on instructional effectiveness. To ensure accountability on improving performance of teachers and principals, the state will encourage LEAs to set annual improvement goals, with a minimum of 15% improvement in terms of the number of educators moving up in each rating category. (Tennessee Department of Education, 2010, p. 86)
Much appears to hinge on the application's use of "may be used" and "the state will encourage." One reviewer, despite awarding a perfect score, advised that "It would make sense to pilot some of these ideas in several districts and make any needed adjustments before adopting them statewide in July, 2011" (U.S. Department of Education, 2010e, p. 4). Meanwhile, another reviewer questioned, "With such heavy weighting on student achievement data, it is not clear what solutions the State has to evaluate teachers in non-tested subjects or grades" and "It is not clear if this new evaluation system will need to be collectively bargained, and if so, how the State intends to secure teacher buy-in" (U.S. Department of Education, 2010e, p. 12). None of the reviewers explicitly questioned whether a minimum 15 percent annual improvement in the number of teachers moving up the evaluation rating categories is even achievable. Only time will tell if this is a sustainable goal.
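Whether that 15% target is sustainable can be checked with back-of-the-envelope arithmetic: a fixed 15% annual increase in the number of educators moving up a rating category compounds exponentially, while the pool of educators who still have room to move up is finite. A minimal sketch, using invented figures (neither the pool size nor the year-one count appears in Tennessee's application):

```python
# Hypothetical numbers for illustration only -- not from Tennessee's application.
pool = 60_000   # assumed: educators not yet in the top rating category
movers = 5_000  # assumed: educators moving up a category in year 1

year = 1
while movers <= pool:
    pool -= movers   # simplification: those who move up leave the pool for good
    movers *= 1.15   # the state's minimum 15% annual improvement target
    year += 1

print(f"By year {year}, the target ({movers:,.0f} movers) "
      f"exceeds the remaining pool ({pool:,.0f}).")
```

Under these assumptions the target overtakes the pool within a decade; new hires and educators sliding back down a category would change the exact year, but not the basic mismatch between an exponential target and a finite pool.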

Compared to Tennessee, Rhode Island's annual evaluation proposal looks decidedly unremarkable and received few comments from reviewers. Rhode Island called for annual evaluations at a minimum, and the state is responsible for providing teachers and principals the academic growth data that constitutes the bulk of their evaluation. The evaluations must also be based on the "quality of instruction (or, for principals, quality of instructional leadership and management), demonstration of professional responsibilities, and content knowledge" (Rhode Island Department of Education, 2010, pp. 101-102). LEAs are expected to review evaluations to guide their professional development programs.

My take: Tennessee planned to evaluate everyone but had a system designed to measure teachers in just a few subjects. How they would negotiate an expansion of their system wasn't clear. How they expected endless annual 15% improvements wasn't clear. Still, this and Rhode Island's rather bland proposal were the best in the country.

(D)(2)(iv) – Using Evaluations to Inform Key Decisions (28 points)

By far the largest subsection of criteria (D)(2), constituting nearly half its possible points, this subsection is targeted towards using evaluations to inform "key decisions." The RTT rubric specifies four such "key decisions:"
(a) Developing teachers and principals, including by providing relevant coaching, induction support, and/or professional development;
(b) Compensating, promoting, and retaining teachers and principals, including by providing opportunities for highly effective teachers and principals ... to obtain additional compensation and be given additional responsibilities;
(c) [Granting] tenure and/or full certification (where applicable) to teachers and principals using rigorous standards and streamlined, transparent, and fair procedures; and
(d) Removing ineffective tenured and untenured teachers and principals after they have had ample opportunities to improve, and ensuring that such decisions are made using rigorous standards and streamlined, transparent, and fair procedures. (U.S. Department of Education, 2010b, p. 11)
South Carolina and Rhode Island tied for the top score on this subsection, each earning 93 percent of the possible points. South Carolina currently uses two data systems: the system for Assisting, Developing, and Evaluating Professional Teaching (ADEPT) and the Program for Assisting, Developing, and Evaluating Principal Performance (PADEPP). (Acronym-loving South Carolina's RTT application is named INSPIRE, short for "Innovation, Next Generation Learners, Standards & Assessments, Personalized Instruction, Input and Choice, Redesigned Schools, Effective Teachers & Leaders, and Data Systems.") South Carolina plans to tie these systems into their state-controlled certification system (which determines contract and due process rights) and statewide salary schedule (U.S. Department of Education, 2010f, p. 29). With the state handling certifications, tenure, and salaries, it will be much easier for South Carolina to implement the reforms specified in the RTT scoring rubric.

One reviewer only awarded 18 of 28 points and had particularly critical comments for this part of South Carolina's proposal:
The state proposes to provide induction support for beginning teachers and principals. There is no mention of coaching services after the induction period. The state application explains various statutory issues related to tenure and insists that tenure will be related to performance. The explanation is inadequate and does not lay out a clear plan. (U.S. Department of Education, 2010f, p. 4)
A different reviewer gave South Carolina the maximum 28 points for this subsection, saying only that "all beginning teachers and principals [will] receive induction support and mentoring" and "Salary incentives are part of South Carolina's plan, teacher effectiveness, retention, full certification, and removal, if necessary" (U.S. Department of Education, 2010f, p. 20). This kind of variability is a problem with the design of the RTT rubric and will be discussed in the conclusion of this post.

South Carolina might have the top-scoring proposal for using evaluations to inform decision-making, but their assessment and data systems have some glaring problems overall. The statewide data systems, ADEPT and PADEPP, do not use a value-added model. Some LEAs are piloting a value-added "approach," and the state plans on developing or selecting a statewide model in the near future. The data used in that eventual model will come first from their current statewide assessment and the Measures of Academic Progress (MAP), but the state plans to abandon those assessments in favor of one aligned with the Common Core K-12 standards, whenever one becomes available (South Carolina Department of Education, 2010, p. 102).

Rhode Island equaled South Carolina's score, but did so while retaining a more traditional measure of local control. In most cases, the LEAs will be setting their policies to meet the proposed goals of Rhode Island's RTT application, and the State Department of Education will assume an enforcement role. For the compensation piece, Rhode Island proposes funding four pilot programs with RTT dollars. By 2015, LEAs will be able to choose one of the four compensation models or develop their own with the state providing guidance and support (Rhode Island Department of Education, 2010, p. 106).

As discussed previously, Rhode Island plans to use their evaluation system for promotion, retention, and certification of teachers. LEAs will have to prove to the state that they are using evaluation data in these decisions and report to the state those teachers who have earned promotions or leadership responsibilities, which will require at least an "effective" or "highly effective" rating on their annual evaluation (Rhode Island Department of Education, 2010, p. 107). LEAs will also have to certify that they have removed all non-tenured ineffective teachers and any teacher marked "ineffective" two years in a row (Rhode Island Department of Education, 2010, pp. 108-109). The state will continue to manage the certification system and current educators will be subject to the new rules as their current certificates renew.

My take: Despite the high point value of this subsection, the U.S. DoE seems unclear if they believe more strongly in local control or state control, in current tests or future tests, or in mentoring or induction support. These were the best proposals in the country.

(D)(3) – Ensuring Equitable Distribution of Effective Teachers and Principals (25 points across two subsections)

Louisiana led all states by taking 90 percent of the maximum 25 points for this section, but did not have the high score in either of the two subsections. Instead of reviewing Louisiana's application, we will focus on the applications from Georgia and Kansas.

(D)(3)(i) – Ensuring Equitable Distribution in High-Poverty or High-Minority Schools (15 points)

The RTT scoring rubric for this subsection requires policies that "ensure that students in high-poverty and/or high-minority schools ... have equitable access to highly effective teachers and principals ... and are not served by ineffective teachers and principals at higher rates than other students" (U.S. Department of Education, 2010b, p. 11). Georgia earned 93 percent of the 15 available points in this subsection to lead all states. Georgia's strategy divides cleanly between problems of supply and demand. On the demand side, Georgia plans to award bonuses to effective teachers and principals in high-need schools "tied to the degree of reduction made in the student achievement gap every year" (Georgia Department of Education, 2010, p. 121). To entice effective teachers to move to high-need rural areas, the state is proposing $50,000 tax-free bonuses that vest over three years and require the teacher to maintain a high rating on the state's Teacher Effectiveness Measure (TEM). Districts wanting to participate in this program must compete for the funds and prove that the teachers eligible for bonuses have an established record of high achievement. Georgia is being bold with this plan, even though it has decided not to "[offer] these kinds of bonuses to principals, having experimented with significant bonuses for principals in the past and having found that these incentives were not effective in getting principals to relocate" (Georgia Department of Education, 2010, p. 121). To improve the supply side of equitable teacher distribution, Georgia will work with LEAs to improve professional development and partner with organizations like Teach for America and The New Teacher Project that have experience recruiting teachers for hard-to-fill positions.

Only one reviewer raised what may be the most glaring criticism of Georgia's plan: "There is also detail missing ... on the systems to ensure distribution over time" (U.S. Department of Education, 2010g, p. 40). The RTT money allocated for the bonuses is temporary, and programs like Teach for America and The New Teacher Project are not well-known for placing teachers who remain in high-need areas for more than a few years.

My take: Georgia actually had a straightforward approach here -- fill difficult assignments by offering significantly more money to teachers who have shown an ability to raise scores and close achievement gaps. Will it work? No one's sure, but this proposal should be worth following up on. After all, it was the best proposal in the country.

(D)(3)(ii) – Ensuring Equitable Distribution in Hard-to-Staff Subjects and Specialty Areas (10 points)

Kansas, whose application ranked 29th overall, makes a surprise appearance at the top of the scoreboard. The state introduces this section of its application with some startling statistics:

The Teaching in Kansas Commission found that:
  • 42% of Kansas teachers leave the field after seven years,
  • 36% of Kansas teachers can retire within the next 5 years,
  • 25% fewer students entered the teaching profession over the past six years,
  • An 86% decrease in Kansas teacher biology licenses will occur within 6 years,
  • A 50% decrease in chemistry licenses will occur within 6 years, and
  • A 67% decrease in physics licenses will occur within 6 years. (Kansas Department of Education, 2010, p. 81)
Kansas's plan mostly consists of expanding the UKanTeach program, both at the University of Kansas and to other institutions of higher education around the state. Kansas claims that "UKanTeach is dramatically increasing the number of math and science teachers graduating from KU, resulting in over 100 new, highly qualified science and math teachers each year" (Kansas Department of Education, 2010, p. 81). The state claims this "dramatic increase" without citing the number of graduates before the UKanTeach program, and it fails to address non-STEM hard-to-staff subjects such as special education and language instruction. Neither of these criticisms was mentioned by any of the five reviewers of Kansas's application. Additionally, although the application includes other plans for teacher preparation and retention in hard-to-serve areas, the reviewers almost universally fail to cite them in their comments (U.S. Department of Education, 2010h).

My take: Kansas's proposal sounds practical but lacks details. Can UKanTeach do anything for non-STEM teachers? Why did the judges find this to be the best proposal in the country without an answer to that question?

(D)(4) – Improving the effectiveness of teacher and principal preparation programs (14 points)

This criterion asks for a quality plan for linking student achievement and growth to the in-state teacher and principal preparation programs and expanding those programs identified as successful.

Tennessee, which earned 90 percent of the possible points, uses brief but strong language to sell this part of their application, proudly boasting that "The cornerstones are competition and accountability" and that "Our State Board of Education (SBE) has broken the monopoly on teacher preparation held by institutions of higher education" (Tennessee Department of Education, 2010, p. 110). Tennessee claims to publicly report their teacher preparation program quality data, but a search of their Department of Education website (http://tn.gov/education/) when the RTT results were announced revealed nothing. Tennessee planned in 2010 to gather stakeholders from across the state to examine how they link student achievement data to teacher preparation programs and to develop a plan to "reward programs that are successful and support or decertify those that fail to produce effective teachers" (Tennessee Department of Education, 2010, p. 111). Most of the reviewers of Tennessee's application cited the lack of a focus on principal preparation programs to match that on teacher programs (U.S. Department of Education, 2010e).

Rhode Island doesn't use Tennessee's tough language, but claims to "[act] aggressively to close programs that do not meet its rigorous current standards and has closed two programs, including a principal preparation program, in the last 5 years" (Rhode Island Department of Education, 2010, p. 125). Every educator preparation program in the state must be re-approved every five years, and Rhode Island plans to include data from teacher and principal evaluations in the re-approval process. Specifically, Rhode Island wishes to track how many educators from each preparation program earn full Professional Certification, and to disaggregate each program's graduates by placement in high- versus low-poverty and high- versus low-minority schools (Rhode Island Department of Education, 2010, p. 125).

My take: It's troubling to see Rhode Island tout its closing of two teacher/principal preparation programs, and more troubling to see the judges view that as a positive achievement, without any detail on the specific shortcomings that led to those closures. How were the programs not meeting Rhode Island's "rigorous standards," and what efforts had been made to improve them? It would have been far more impressive for one of our country's best proposals to describe a successful rebuilding of those programs rather than their simple termination.

(D)(5) – Providing effective support to teachers and principals (20 points)

This criterion is based on two goals: provide ongoing, targeted professional development and supports while also monitoring and improving the quality of the professional development and supports. The supports could include “coaching, induction, and common planning and collaboration time to teachers and principals” (U.S. Department of Education, 2010b, p. 12).

Delaware earned 95% of the available points by requiring all participating LEAs to adopt a comprehensive professional development plan that contains all the supports specified in the rubric. Despite being the top-scoring plan, one reviewer commented:
The key weakness of this plan is the lack of specificity about how LEAs will know what is a good PD model and what is not – this section seems vague and not well thought through. Compared to other plans in the Delaware application, this area is not very creative nor clear. (U.S. Department of Education, 2010i, p. 15)
Delaware does specify plans for certifying effective professional development programs and for requiring LEAs to adopt such high-quality programs by the 2010-2011 school year, but the eleven pages of description in the Delaware application didn't translate into rich commentary from the reviewers, despite the high scores.

My take: It's as if the reviewers are confident in Delaware's plan despite not being able to accurately describe what the plan contains. Somehow, this was still better than the proposals from all other states.

Discussion

Taken all together, we see a policy preference for: (a) many alternative routes to certification, (b) an extensive value-added assessment system, (c) teacher and principal evaluations based on student performance and growth data, (d) annual evaluations of all teachers and principals, (e) teacher and principal compensation, promotion, and retention policies tied to evaluations, (f) incentives for teachers and principals to serve in high-need areas, (g) programs to increase the supply of teachers for hard-to-fill subjects, (h) quality, accountable teacher preparation programs, and (i) effective professional development.

This should be no surprise, because this is precisely what the RTT rubric asked for. How did this encourage a large pool of innovative and creative reforms? Is Kentucky's 20-year-old alternative licensure program creative? Is Tennessee's value-added assessment system, in use since 1992, innovative? It's very possible the RTT rubric stifled creativity and innovation as much as it encouraged them. Even worse, states may have abandoned the innovative ideas they developed for Phase 1 and instead chosen to copy the high-scoring states above in the hopes of winning funding.

A very troubling aspect of many proposed policies is that so many important decisions depend on value-added student performance models that are not fully transparent. Regardless of one's opinion about the use of value-added models, or one's belief that such models could achieve perfect accuracy and reliability, using a non-transparent model (such as the EVAAS) inside a supposedly transparent evaluation system is a significant flaw. Software is patentable and profitable while the underlying mathematics is not, so the motivation for keeping at least some parts of these growth models secret is understandable, even if undesirable. Still, the RTT process could have been strengthened significantly if the scoring rubric had required full transparency for any and all statistical operations performed on educational data.

My final criticism of this process lies in the RTT rubric itself. Why 500 total points? Why is "providing high-quality pathways for aspiring teachers and principals" worth 21 points and "ensuring equitable distribution of teachers" worth 25? Who decided that one category should be worth four points more than the other, and why? If those four points had been allocated elsewhere, would the results have changed?

Peterson and Rothstein (2010) expose the arbitrary nature in which points were allocated in the RTT rubric and show how changes in the weights of categories could have changed the outcome of the entire RTT competition. For example, had a mere 15 points been added to any one of four criteria (improving student outcomes, using data to improve instruction, using evaluations to inform key decisions, or ensuring equitable distribution), with the other criteria decreased by less than half a point each to keep the rubric's total at 500, Georgia would have won the RTT competition (Peterson & Rothstein, 2010, p. 4). Similarly, the "demonstrating other significant reforms" criterion was allocated only one percent (5 points) of the total rubric. Given the innovation possible in this "other" category, including reforms called for in the DoE Blueprint and other federal education programs, a larger weight would have been easy to justify. Had that weight been 25 percent of the application, Pennsylvania would have been the winner (Peterson & Rothstein, 2010, p. 5).
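To see how sensitive a weighted rubric can be, here is a minimal sketch of the reweighting idea. The scores, weights, and state names below are entirely hypothetical illustrations, not figures from the RTT competition; the real Peterson and Rothstein analysis uses the published score detail for every state.

```python
# A sketch of the reweighting argument. All scores and weights below are
# HYPOTHETICAL; they only illustrate how boosting one category's weight
# (while shrinking the others to keep the same total) can flip a ranking.

def reweight(weights, boosted, extra):
    """Add `extra` points to one category's weight and shrink the other
    categories proportionally so the rubric still totals the same."""
    total = sum(weights.values())
    rest = total - weights[boosted]
    return {cat: (w + extra if cat == boosted else w - extra * w / rest)
            for cat, w in weights.items()}

def totals(scores, weights):
    """Each state's total: percent earned in a category times its weight."""
    return {state: sum(pct * weights[cat] for cat, pct in cats.items())
            for state, cats in scores.items()}

# Two hypothetical states with different strengths on a 500-point rubric.
scores = {"State X": {"A": 0.95, "B": 0.60},
          "State Y": {"A": 0.70, "B": 0.90}}
weights = {"A": 100, "B": 400}

before = totals(scores, weights)                      # State Y leads, 430 to 335
after = totals(scores, reweight(weights, "A", 300))   # State X leads, 440 to 370
```

The rubric still sums to 500 after reweighting, yet the winner changes; with the real applications separated by slim margins, far smaller shifts than this exaggerated example were enough to change the outcome.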

The design of the RTT rubric and its point allocation not only affected the outcome of Phase 1, but likely affected the following phases even more strongly. The proposal elements examined in this paper became models for other applicants regardless of the margin by which they scored highest. Not only are slim margins statistically insignificant on a 500-point rubric, but the scoring process itself leads to some arbitrary selections. Unfortunately, when trying to play catch-up with the winners, the simplest thing to do is copy, not create. In doing so, RTT reinforces a "don't just stand there, do something" atmosphere for reform, even when the choice and effectiveness of those "somethings" is uncertain and arbitrary.

References

Amrein-Beardsley, A. (2008). Methodological Concerns About the Education Value-Added Assessment System. Educational Researcher, 37(2), 65-75. doi:10.3102/0013189X08316420

Georgia Department of Education. (2010, January 19). Race to the Top: Application for Initial Funding. Retrieved from http://www2.ed.gov/programs/racetothetop/phase1-applications/georgia.pdf

Kansas Department of Education. (2010, January 14). Race to the Top: Application for Initial Funding. Retrieved from http://www2.ed.gov/programs/racetothetop/phase1-applications/kansas.pdf

Kentucky Department of Education. (2010, January 14). Race to the Top: Application for Initial Funding. Retrieved from http://www2.ed.gov/programs/racetothetop/phase1-applications/kentucky.pdf

Kupermintz, H. (2003). Teacher Effects and Teacher Effectiveness: A Validity Investigation of the Tennessee Value Added Assessment System. Educational Evaluation and Policy Analysis, 25(3), 287-298. doi:10.3102/01623737025003287

Peterson, W., & Rothstein, R. (2010). Let's do the Numbers: Department of Education's "Race to the Top" Program Offers Only a Muddled Path to the Finish Line (Briefing Paper No. 263). EPI Briefing Papers. Washington, D.C.: Economic Policy Institute. Retrieved from http://www.epi.org/page/-/BriefingPaper263.pdf

Rhode Island Department of Education. (2010, January 14). Race to the Top: Application for Initial Funding. Retrieved from http://www2.ed.gov/programs/racetothetop/phase1-applications/rhode-island.pdf

Tennessee Department of Education. (2010, January 18). Race to the Top: Application for Initial Funding. Retrieved from http://www2.ed.gov/programs/racetothetop/phase1-applications/tennessee.pdf

U.S. Department of Education. (2010b). Race to the Top Scoring Rubric Corrected. Washington, D.C.: U.S. Department of Education. Retrieved from http://www2.ed.gov/programs/racetothetop/scoringrubric.pdf

U.S. Department of Education. (2010c). Race to the Top: Technical Review Form - Kentucky. Retrieved from http://www2.ed.gov/programs/racetothetop/phase1-applications/comments/kentucky.pdf

U.S. Department of Education. (2010d). Race to the Top: Technical Review Form - Rhode Island. Retrieved from http://www2.ed.gov/programs/racetothetop/phase1-applications/comments/rhode-island.pdf

U.S. Department of Education. (2010e). Race to the Top: Technical Review Form - Tennessee. Retrieved from http://www2.ed.gov/programs/racetothetop/phase1-applications/comments/tennessee.pdf

U.S. Department of Education. (2010f). Race to the Top: Technical Review Form - South Carolina. Retrieved from http://www2.ed.gov/programs/racetothetop/phase1-applications/comments/south-carolina.pdf

U.S. Department of Education. (2010g). Race to the Top: Technical Review Form - Georgia. Retrieved from http://www2.ed.gov/programs/racetothetop/phase1-applications/comments/georgia.pdf

U.S. Department of Education. (2010h). Race to the Top: Technical Review Form - Kansas. Retrieved from http://www2.ed.gov/programs/racetothetop/phase1-applications/comments/kansas.pdf

U.S. Department of Education. (2010i). Race to the Top: Technical Review Form - Delaware. Retrieved from http://www2.ed.gov/programs/racetothetop/phase1-applications/comments/delaware.pdf

How the Race (to the Top) Was Won (Part 1 of 2)

Race to the Top (RTT), the foremost education policy instrument under the Obama administration, was introduced in 2009 as part of the American Recovery and Reinvestment Act. The first two winning states, Delaware and Tennessee, were announced in the spring of 2010. At that time I wrote a paper about RTT for a policy class but didn't blog about it. Since RTT has continued with more rounds of state competition and a new RTT program for school districts, I hope there's still some relevance in sharing some of what I learned about RTT and what states did to score well on their applications.

Using policy definitions developed by McDonnell and Elmore (1987), RTT is a near-perfect example of an inducement, where money is exchanged for action. Inducements are a simple model:


RTT adds a layer of policymaking to this simple process. In the model below, the state departments of education have all policy instruments at their disposal. The inducement exists between the federal and state policymaking bodies, and not necessarily between the state and the local education authority (LEA). (The new RTT for districts will obviously change this arrangement.) Regardless of the instrument(s) used by the states, the goal, as defined by the U.S. DoE, is for states to "[lead] the way with ambitious yet achievable plans for implementing coherent, compelling, and comprehensive education reform" (U.S. Department of Education, 2010d). Additionally, the DoE clearly states that "Creativity and innovation are rewarded in this competition" (U.S. Department of Education, 2010c, p. 15).


In the ideal RTT scenario, the competition would look like this:
  1. The U.S. DoE rewards the states with the most promising reforms.
  2. Winning states would enact and enforce new education policies.
  3. The effects of the new policies would be measured.
  4. Policies that prove to be successful would be replicated by other states.
There are two significant problems with emphasizing creativity and innovation in RTT. First, creativity and innovation necessitate deviation from proven reforms. You can't be creative by saying, "We're going to do what we know works." Delaware and Tennessee's "innovative" reforms (a label worth questioning) may have helped them win Phase 1 of RTT, but it may be some time before we know if the reforms perform as intended. Implicit in the RTT competition is an assumption that established, effective reforms are too few, too expensive, or too difficult to scale, so RTT challenges states to create new reforms that might be cheaper or more easily implemented. This creates a condition where RTT money gets awarded for the potential of a policy, not its past, proven effectiveness.

The second problem with emphasizing creativity and innovation is that RTT was not structured as a brainstorming, anything-goes kind of policymaking process. RTT comes with a detailed scoring rubric, and any state wishing to score well obviously wrote policies to satisfy the rubric. How does that encourage creativity or innovation? In addition, states or districts applying beyond the first round will likely replicate the highest-scoring applications from Phase 1. This means that policies developed in RTT Phase 2 and later are likely to be less diverse, less creative, and less innovative, but no more proven.

So how did states manage this balance of creativity versus scoring high on the rubric? Thankfully, the rubric, the applications, and the judges' scorecards are all publicly available, so we can see exactly what each state proposed in their application. The analysis in this post will answer the following questions:
  1. Where in the RTT rubric could states score the most points?
  2. For the portions of the RTT rubric identified in #1, which states scored highest?
  3. For the states identified in #2, what did their application propose and what were the judges' comments?

Race to the Top Scoring

Here is a summary of RTT Phase 1 scoring:

| Selection Criteria | Points Possible | Percent of Total | Average Score (Points) | Average Score (Percent) | Standard Deviation (Points) |
|---|---|---|---|---|---|
| A. State Success Factors | 125 | 25 | 90 | 72 | 18.03 |
| B. Standards and Assessments | 70 | 14 | 61 | 88 | 9.63 |
| C. Data Systems to Support Instruction | 47 | 9 | 33 | 70 | 7.04 |
| D. Great Teachers and Leaders | 138 | 28 | 91 | 66 | 21.5 |
| E. Turning Around the Lowest-Achieving Schools | 50 | 10 | 36 | 72 | 11.19 |
| F. General | 55 | 11 | 37 | 68 | 12.96 |
| Competitive Preference Priority 2: Emphasis on STEM | 15 | 3 | 11 | 73 | 6.73 |
| TOTAL | 500 | 100 | 359 | 72 | 67.22 |
(Source: U.S. Department of Education, 2010a, 2010b)

Looking at how the points on the rubric are allocated, it's clear that for any state to do well, it needed to score well on Section D, "Great Teachers and Leaders." The section was worth 28 percent of the 500 total points, and we can now see that, of all the sections, it awarded the fewest points as a percent of those possible. That means there was a lot of potential upside here, and we can dig into the details of "Great Teachers and Leaders" to see which states scored best. In the table below you'll find all the subsections of Section D, along with the scores of the eight states (DE, TN, GA, SC, RI, KY, LA, and KS) that had the top score in at least one of those subsections.
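The "potential upside" reasoning can be checked directly from the summary table: for each section, subtract the average points awarded from the points possible. A quick sketch, using the figures from the Phase 1 scoring summary above:

```python
# Room to gain per section: points possible minus average points awarded,
# taken from the Phase 1 scoring summary (U.S. Department of Education,
# 2010a, 2010b).
sections = {
    "A. State Success Factors": (125, 90),
    "B. Standards and Assessments": (70, 61),
    "C. Data Systems to Support Instruction": (47, 33),
    "D. Great Teachers and Leaders": (138, 91),
    "E. Turning Around the Lowest-Achieving Schools": (50, 36),
    "F. General": (55, 37),
    "Competitive Preference Priority 2: Emphasis on STEM": (15, 11),
}
gap = {name: possible - avg for name, (possible, avg) in sections.items()}
biggest = max(gap, key=gap.get)
print(biggest, gap[biggest])  # -> D. Great Teachers and Leaders 47
```

The average applicant left 47 of Section D's 138 points unclaimed, more than in any other section, which is why the rest of this analysis concentrates there.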

| Selection Criteria | DE | TN | GA | SC | RI | KY | LA | KS |
|---|---|---|---|---|---|---|---|---|
| Overall RTT Phase 1 Rank | 1 | 2 | 3 | 6 | 8 | 9 | 11 | 29 |
| (D) "Great Teachers and Leaders" Score | 86% | 83% | 81% | 82% | 88% | 80% | 89% | 63% |
| (1) Providing high-quality pathways for aspiring teachers and principals | 82% | 71% | 70% | 74% | 84% | 94% | 89% | 32% |
| (2) Improving teacher and principal effectiveness based on performance | 87% | 91% | 86% | 91% | 94% | 76% | 90% | 56% |
| (2)(i) Measuring student growth | 88% | 100% | 48% | 88% | 80% | 84% | 96% | 56% |
| (2)(ii) Developing evaluation systems | 85% | 91% | 81% | 89% | 96% | 83% | 91% | 68% |
| (2)(iii) Conducting annual evaluations | 92% | 100% | 96% | 90% | 100% | 58% | 92% | 68% |
| (2)(iv) Using evaluations to inform key decisions | 86% | 87% | 91% | 93% | 93% | 76% | 89% | 45% |
| (3) Ensuring equitable distribution of effective teachers and principals | 85% | 74% | 87% | 73% | 79% | 73% | 90% | 89% |
| (3)(i) Ensuring equitable distribution in high-poverty or high-minority schools | 83% | 68% | 93% | 68% | 92% | 72% | 88% | 85% |
| (3)(ii) Ensuring equitable distribution in hard-to-staff subjects and specialty areas | 88% | 82% | 78% | 80% | 60% | 74% | 92% | 94% |
| (4) Improving the effectiveness of teacher and principal preparation programs | 81% | 90% | 73% | 81% | 90% | 77% | 86% | 66% |
| (5) Providing effective support to teachers and principals | 95% | 75% | 75% | 79% | 84% | 92% | 84% | 80% |
(Source: U.S. Department of Education, 2010a)

Just by reading the titles of the subsections above, you can recognize some of the most contentious areas of education policy we've seen the past several years. So not only is this part of the RTT application about high points, it's high-stakes. If the greatest opportunity for improvement in education truly lies in this area, it will be critical to get these policies right. In Part 2 of this post, we'll look at each subsection and the application from the state with the highest score in that area. Some of the policies are sound, but some aren't, and sometimes the judges' comments indicate divergent interpretations of both the application and the rubric.

References

McDonnell, L. M., & Elmore, R. F. (1987). Getting the job done: Alternative policy instruments. Educational Evaluation and Policy Analysis, 9(2), 133-152. Retrieved from http://www.jstor.org/stable/1163726

U.S. Department of Education. (2010a). Detail chart of the Phase 1 scores for each State. Retrieved from http://www2.ed.gov/programs/racetothetop/phase1-applications/phase1-scores-detail.xls

U.S. Department of Education. (2010b). Race to the Top Scoring Rubric Corrected. Washington, D.C.: U.S. Department of Education. Retrieved from http://www2.ed.gov/programs/racetothetop/scoringrubric.pdf

U.S. Department of Education. (2010c, May 27). Race to the Top Program: Guidance and Frequently Asked Questions. Retrieved from http://www2.ed.gov/programs/racetothetop/faq.pdf

U.S. Department of Education. (2010d, April 16). Race to the Top Fund. ED.gov. Retrieved June 6, 2012, from http://www2.ed.gov/programs/racetothetop/index.html

Review: Race to Nowhere

Tonight I had the pleasure of viewing a screening of the documentary Race to Nowhere at the Shepherd Valley Waldorf School. Alongside other education films such as Waiting for Superman and The Lottery, Nowhere provides a very different, yet very important, perspective on American education. Check out the trailer:


People need to see this movie -- especially people who see Waiting for Superman. (Are you hearing me, Oprah?) Where Superman wants you to point fingers at a school, a teacher, or a union, Nowhere doesn't try to assign blame. Nowhere wants you to understand our educational culture and our roles in it, and use that understanding to change the way we as a society view school. Instead of over-scheduling, over-working, and over-stressing our students, Nowhere advocates for children, letting kids be kids and fostering their creativity and happiness to make them (or let them be) accomplished learners. Nowhere paints a powerful picture of accountability-based, high-stakes reform, and we begin to see how easily we fall into the traps of our system even while we understand its failings. Where Superman demands anger as the impetus for social change, Nowhere is a plea for compassion. Anger generally trumps compassion when it comes to getting people's attention, so I have doubts that Nowhere will be getting the attention it deserves alongside Superman. But we don't always make our best decisions when we're angry.

I'll admit it: for most of my life I scoffed at the idea that student self-esteem was a prerequisite for student achievement. I always thought it should be the other way around. After a year or so of grad school, with sufficient guidance and time for reflection and self-enlightenment, I now realize that my opinions were far too grounded in my own experience as a relatively stress-free, high-achieving student. This movie isn't about the warm and fuzzy student self-esteem I may have discredited in the past; the students in this movie are being harmed both psychologically and physically in ways that are hard to watch, and students who don't want that stress are giving up altogether. Nowhere seeks a balance, a reciprocity between student welfare and achievement that we must desire as an educational outcome. High standardized test scores always look good, but not if they come with higher rates of student sickness, headaches, sleepless nights, caffeine and stimulant abuse, eating disorders, and suicides. (The film contains a heartbreaking story of a 13-year-old girl who killed herself, essentially, because a bad test in 8th grade algebra was going to prevent her from getting straight As.)

The most powerful voices in Nowhere are the students. They are bright, well-spoken, driven students who want to do well as much as or more than any parent, teacher, administrator, or policymaker wants for them. They are the stars of this movie and they deserve not only our attention, but our action. Where should you start? Go see the movie and watch for the answers to that question at the end of the film.

Tenure and Union Contracts Are Two Separate Issues

One of many teachers to speak during NBC's "Teacher Town Hall" on Sunday was this young woman, featured again on Monday's NBC Nightly News with Brian Williams:



While I don't know all the details of her particular situation or what it's like at her school, her first words concern me: "I think we don't understand tenure." She then goes on with a passionate speech about how the union contract at her school interferes with her ability to deliver the kind of education she'd like to see her students get. I'm worried that she's failed to distinguish between union contracts and tenure, and it's clear from our current national discourse that she isn't the only one.

Union contracts, sometimes called professional agreements or master agreements, define fundamental working conditions between the school board (or administration) and the teachers, including contract hours, sick days, early retirement, grievance procedures, and many other practical, often common-sense things that are found in many employer-employee relationships. These are generally negotiated at a district level and quality varies. And, as with anything territorial, these negotiations can become very politicized and what started as common sense gets twisted into spiteful actions and reactions. If this teacher works in a district that prevents her from volunteering her non-contract time to help kids who want to be helped, then there is something wrong with both her district's union contract and how it's enforced. She has every right to be upset.

Tenure is something quite different. Tenure is usually defined by state lawmakers, and to avoid the "job for life" misconception many states don't actually use the word "tenure." Colorado, for example, uses "non-probationary teacher." After teaching successfully for several years (usually three), as shown by regular and multiple administrative evaluations, a teacher earns due process rights. While I am not a lawyer, my understanding of due process is this: a teacher with "tenure," when faced with dismissal from his/her job, has the right to be heard in front of a neutral judge. Unfortunately, given the high stakes of such decisions (for both sides) and the inefficiencies of the legal system, such proceedings are contentious, taxing, and expensive. Tenure protections were designed to protect teachers from the whims of political pressures, not poor performance. In today's heated climate, however, poor teacher performance couldn't be much more political. Few would disagree that reforms are needed, but the rights of due process are not easily negotiated or redefined.

In some ways I'm happy that this teacher doesn't think she needs tenure. Maybe we finally live in a world where parents and administrators are more supportive when their teachers try innovative teaching practices. Maybe teachers are free to assign grades to students without outside influence. Maybe communities are more tolerant of teachers who affiliate with a religion, political party, or sexual orientation that differs from their own. Maybe teachers, such as the one in the video, can now be outspoken without risking their jobs. But let's get serious -- this is not the world in which we live and the teacher in the video, whether she thinks she needs them or not, is a noble professional who deserves some protection against these kinds of pressures. I want her to be able to argue for what's best for her students and not fear for her career because her sentiments might be unpopular, whether it's with her administration or union officials. Again, this is not about her performance. The protection from pressures I've described above should and can be dealt with separately from concerns of poor performance.

I admire the spirit of the young woman in the video and hope she sees the differences between her union contract and tenure. I also hope she is admired for speaking out and is received in her school with understanding and respect. I also hope she and others learn to calm the discourse, understand the critical distinctions in these important issues, and work to do not only what's best for children, but what's right for their teachers as well. Reforms done "with" are destined to be more successful than reforms done "to," and that kind of cooperation is going to take a refined level of debate.