It’s a hot topic because of efforts to improve classroom learning by using gains in student test scores as one of several measures for evaluating teachers, and then using those evaluations to inform decisions about retention, promotion, and pay. In the past, most school districts have made these decisions based almost entirely on seniority and whether a teacher had earned graduate degrees. Results in the classroom weren't considered at all. President Obama’s Race to the Top Initiative began encouraging states to use improvements in test scores as part of evaluating teachers, and today some 30 states do so.
Harris is a proponent of using value-added measures that get at how effective a teacher is at helping students progress from whatever level they're at when the year begins. By analyzing how a teacher’s students improve during their time in his or her class, rather than looking at absolute test scores, you don’t unfairly reward teachers who happen to have a lot of top students. And you don't unfairly punish teachers who take on the challenge of teaching kids who may not arrive well-prepared or very engaged with school for whatever reason, often because they come from disadvantaged backgrounds.
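To make the difference between absolute scores and growth concrete, here is a rough sketch in Python. It is not Harris's model: real value-added measures use statistical adjustments that go well beyond a simple difference, and the teacher names and scores below are invented purely for illustration. It only shows how ranking by average final score and ranking by average gain can point in opposite directions.

```python
from statistics import mean

# Hypothetical (start_of_year, end_of_year) scores, invented for illustration.
classes = {
    "teacher_a": [(85, 88), (90, 92), (88, 90)],  # students who arrive well prepared
    "teacher_b": [(50, 62), (55, 65), (45, 59)],  # students who arrive behind
}

def mean_final_score(pairs):
    """Average end-of-year score: the 'absolute' view."""
    return mean(end for _, end in pairs)

def mean_gain(pairs):
    """Average end-minus-start score: a crude stand-in for value added."""
    return mean(end - start for start, end in pairs)

summary = {
    name: {"final": mean_final_score(pairs), "gain": mean_gain(pairs)}
    for name, pairs in classes.items()
}

for name, stats in summary.items():
    print(f"{name}: final={stats['final']:.1f}, gain={stats['gain']:.1f}")
```

In this made-up data, teacher_a looks stronger on final scores while teacher_b's students improved far more, which is exactly the distinction a growth-based measure is meant to surface.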
Some people argue that standardized tests distort the learning experience—that teachers will “just teach to the test”—and that tests don't measure creativity. That’s an interesting point in subjects like art and music. In these and some other subjects, knowing what to test is complicated. You need to be very careful.
But it seems to me that well-designed tests in science and math are useful in determining proficiency. Teaching students to pass such tests is a good thing. Creativity is important too, but in fields such as, say, economics or nursing, first you need to be able to do the math. No one is so creative that they can be a good nurse without being able to do the division needed to work out the right dosage of medication for a patient.
Harris is careful to say that test scores should be just one of several kinds of data used in evaluating teachers. Scores alone can be misleading, although field experience in the states has shown how to minimize random variance, such as by looking at two years of a teacher's student scores. For the most part, Harris does not acknowledge this field experience, although he discusses the statistical reasons to be cautious with test results.
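The statistical intuition behind looking at two years of scores can be sketched with a small simulation. This is only an illustration under stated assumptions: the "true effect" and noise level are hypothetical numbers, and the noise in each year is assumed to be independent and normally distributed, which real classroom data may not satisfy.

```python
import random
from statistics import pstdev

random.seed(0)        # fixed seed so the sketch is repeatable

TRUE_EFFECT = 5.0     # hypothetical true average gain for one teacher
NOISE_SD = 4.0        # hypothetical random year-to-year variation
TRIALS = 10_000

def one_year_estimate():
    """One year's measured gain: the true effect plus random noise."""
    return random.gauss(TRUE_EFFECT, NOISE_SD)

# Estimates based on a single year versus the average of two years.
single = [one_year_estimate() for _ in range(TRIALS)]
two_year = [(one_year_estimate() + one_year_estimate()) / 2 for _ in range(TRIALS)]

spread_single = pstdev(single)
spread_two_year = pstdev(two_year)

print(f"spread of one-year estimates: {spread_single:.2f}")
print(f"spread of two-year averages:  {spread_two_year:.2f}")
```

Under these assumptions, averaging two years shrinks the random spread by a factor of about the square root of two, so a single unusually good or bad year is less likely to dominate the estimate.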
Still, I think he’s right in emphasizing that while value-added measures can help principals focus on working with teachers who may be struggling, the principal and peer-teachers should sit in on classes and provide feedback the teacher can use to improve. This is the big benefit. Only if all else fails, over a reasonable amount of time, should a teacher who's weak (based on experts' observations as well as scores) be counseled to find another line of work.
Including test data as one component is the key to creating that feedback loop. Research from the foundation’s Measures of Effective Teaching (MET) project has found that by using a balanced combination of measures—including value-added test scores—it is possible to identify aspects of teaching that ultimately lead to what we want most—better student learning. Among these measures, classroom observations and even student surveys can offer teachers real guidance for improvement tailored to their particular needs. When evaluation systems have professional development aims too, teachers can improve, the system can target supports that are actually working, and students benefit.
Testing alone is not enough. Harris provides a theoretical explanation of the statistical reasons why testing is vital but should be used carefully and as part of a system of classroom observations and professional development. His overall conclusions are validated by several years of field experience with teacher evaluation in states such as Tennessee. His book is a solid introduction to how value-added measures can work, although he could have made better use of actual field data to show how pitfalls can be avoided. That would have done more to ease some fears of these measures and of teacher accountability overall.
Through the use of multiple measures, evaluation systems can serve professional development as well as accountability goals, supporting teachers as they work to improve, building trust among teachers, and ultimately benefiting students.