MathEd.net: RYSK: Butler's Effects on Intrinsic Motivation and Performance (1986) and Task-Involving and Ego-Involving Properties of Evaluation (1987)

This is the third in a series of posts describing "Research You Should Know" (RYSK).

As teachers, we care not only about what students learn, but why students learn. In a perfect world, we would all agree on what's important to learn and do and be self-motivated to learn and do those things. But our world isn't perfect, and students are motivated to learn and do things for many reasons. Understanding those reasons is important if we want students to be properly motivated and to perform well with the right attitude.

Ruth Butler earned her Ph.D. in developmental psychology from the Hebrew University of Jerusalem in 1982 and was a relatively new professor there when she teamed with veteran educational psychologist Mordecai Nisan, whose career includes time spent at the University of Chicago, Harvard University, the Max Planck Institute for Human Development, and Oxford University. Together, they sought to build upon studies that compared extrinsic vs. intrinsic motivation and positive vs. negative feedback, looking specifically at how different feedback conditions -- ones that can be manipulated by teachers -- affect students' intrinsic motivation.

For their 1986 paper, Effects of No Feedback, Task-Related Comments, and Grades on Intrinsic Motivation and Performance, Butler and Nisan expected that students who received feedback in the form of simple positive and negative comments (without elements of praise or grading/ranking) would remain motivated, while students who received grades or no feedback would generally become less motivated. To test this hypothesis, Butler and Nisan randomly assigned 261 sixth-grade students to one of three groups. They gave the students two types of tasks: Task A was a quantitative "speed" task where students created words from the letters of a longer word, while Task B was a qualitative "power" task that encouraged problem solving and divergent thinking.

Butler and Nisan conducted three sessions with the groups:

Session 1: Students performed the tasks.
Session 2: Two days after Session 1, the tasks were returned.
- Students in the first group got comments in the form of simple phrases such as, "Your answers were correct, but you did not write many answers," or "You wrote many answers, but not all were correct."
- Students in the second group got numerical grades that were computed to reflect a normal distribution of scores from 30 to 100.
- Students in the third group got their work returned with no feedback.
After students reviewed their previous work, they were given new tasks and told to expect the same type of feedback when they returned for Session 3.
Session 3: Two hours after Session 2, students again reviewed their work and feedback from Session 2 (except for the third group, who got no feedback) and then got a third set of tasks. Students were asked to complete the tasks and were told that they would not get them back. The session ended with a survey of students' attitudes towards the tasks.

When Butler and Nisan compared the students' average performance on the tasks in Session 1, all three groups scored approximately the same. That changed in Session 3. On Task A, students receiving comments and grades scored about the same in Session 3 (with an edge to the comments group for the creation of long words), but students receiving no feedback did far worse. For Task B, students receiving comments did significantly better than students who received grades or no feedback, who performed about the same. The only students doing well in Session 3 -- in fact, the only students consistently scoring higher, on average, in Session 3 than in Session 1 -- were the students who received comments.

The survey also showed attitudinal benefits for the comments group, who indicated they found the tasks more interesting and were most willing to do more tasks. Furthermore, 70.5% of students who received comments attributed their effort to their interest in the tasks, compared to only 34.4% of those graded and 43.4% of those receiving no feedback. Only 9% of students receiving comments said their effort was due to a desire to avoid poor achievement, compared to 26.7% of students receiving grades and 9.6% of the no feedback group. Lastly, 86.3% of students receiving comments wanted to keep receiving comments, while only 21% of the graded group wanted to keep receiving grades. The vast majority of graded students, 78.9%, wanted comments. The no feedback group was roughly split 50/50 on wanting comments or grades. None wanted to keep receiving no feedback.

Butler modified this study for her 1987 paper Task-Involving and Ego-Involving Properties of Evaluation: Effects of Different Feedback Conditions on Motivational Perceptions, Interest, and Performance. In it, Butler adapted a theory of task motivation used by Nicholls (1979, 1983, as cited in Butler, 1987):

Task involvement: Activities are inherently satisfying and individuals are concerned with developing mastery in relation to the task or prior performance.
Ego involvement: Attention is focused on ability compared to the performance of others.
Extrinsic motivation: Activities are undertaken as a means to some other end, and the focus is that goal, not mastery or ability.

Butler believed comments would promote task involvement, while grades would promote ego involvement. While both of these can be seen as intrinsic motivation, a third type of feedback needed to be considered: praise. Previous research on praise had gotten mixed results, possibly because researchers hadn't considered whether the praise was task- or ego-involving. Butler's study would include ego-involving praise, using comments designed to focus a student's attention on their self-worth and not on the task. Butler therefore hypothesized that praise and grades would generate similar results, and that those results would be less desirable than the results of task-involving comments.

The study was similar to the 1986 study, with 200 fifth and sixth graders split into four groups (comments, grades, praise, and no feedback) with subgroups in each for high- and low-achieving students. Tasks were administered in three sessions, with no feedback given after the third session. The tasks this time were divergent thinking tasks, used as Task B in the 1986 study. Praise would come in the form of a single phrase: "Very good." An attitude survey was given after Session 3.

As Butler expected, comments promoted task-involved attitudes while grades and praise promoted ego-involved attitudes. Students' interest in the tasks after Session 3 was higher for the comments group than for the grades, praise, and no feedback groups combined. Students who received praise showed more interest than those who received grades. As for performance, the comments group easily performed the best in Session 3, with both high and low groups improving their scores over Session 1, while all other groups performed about the same or worse compared to their Session 1 performance.

So what does this mean?

As a teacher who struggled with assessment and grading, I was inspired to start this RYSK blog series by Butler's work more than by any other. Despite these results being 25 years old, there's not much evidence that Butler's findings have had a serious impact on the practice of most teachers. I suspect that few teachers know about Butler's work -- I certainly didn't. I was wrapped up in the scores and grades game, not fully aware of the impact those scores were having on my students. I knew it wasn't working, but I didn't have this kind of theoretical knowledge to support a significant change in my practice.

I'm not suggesting that we should suddenly demand a grade-free world. That's just not a realistic thing to expect given where we are now. What I would like to suggest is that teachers become more aware of how the feedback they give affects student motivation, and be careful to focus on task-involved comments whenever possible. Because students aren't likely to get this kind of feedback from standardized tests or computer-based learning systems (e.g., Khan Academy), it takes a teacher's touch to carefully craft the kind of feedback a student needs to sustain their motivation.

References

Butler, R., & Nisan, M. (1986). Effects of no feedback, task-related comments, and grades on intrinsic motivation and performance. Journal of Educational Psychology, 78(3), 210-216. doi:10.1037/0022-0663.78.3.210

Butler, R. (1987). Task-involving and ego-involving properties of evaluation: Effects of different feedback conditions on motivational perceptions, interest, and performance. Journal of Educational Psychology, 79(4), 474-482.