Fear, Mistrust, Frustration: A Look into Michigan’s Punishing Teacher Evaluation Experiment
Stories by Brenda Ortega
MEA Voice Editor
MEA member Corey O’Bryan never planned to be a teacher.
A self-proclaimed nerd in high school, he entered Western Michigan University to study engineering—until discovering he wanted to share his love of math and science more than he wished to pursue his childhood dream of designing robotics for military applications.
What he didn’t realize then was how much he would come to love working with young people in his role as a math teacher at Loy Norrix High School in Kalamazoo.
“I really fell for the idea of getting to know these kids and learning about their lives, learning what they enjoy, talking about their sports,” the 11‑year veteran now says. “When kids stop by my classroom at lunch and they have something tough going on at home, they need to talk. That’s something I never expected to enjoy, but it’s great to be a sounding board for them.”
A few years ago, however, everything changed in a series of events that plunged O’Bryan into what he describes as “a very challenging and dark time.”
After eight years of positive job performance evaluations, he received a rating of “minimally effective” in 2016 from a new school administrator using a new scoring tool, which was implemented in the wake of a sweeping state law on appraising teacher performance.
“It was devastating,” he said. “I was told I’m not a good teacher, I’m failing my kids, and it broke me. People I know might not say this to my face, but a lot of them would probably say I’m not the same person I was five years ago—not as positive, not as happy or cheerful.”
O’Bryan submitted a written response to the negative evaluation with evidence to dispute several aspects of his rating. For example, he received a score of zero in “content knowledge,” despite his work as the lead teacher on a team that rewrote the district’s Algebra I curriculum.
That zero, and other scores he questioned with supporting documentation, came two months before he was granted a master’s degree in mathematics education from WMU with a 3.5 grade-point average.
“I presented the information to my administrator, and I received an email saying it was reviewed by Human Resources as well, but no scores were changed,” he said.
The next year he was labeled “ineffective,” the lowest score in the evaluation tool’s range. The year after that, he moved back up one step to “minimally effective,” and the district asked him to resign in exchange for a neutral letter of recommendation, he said.
He thought about leaving the teaching profession but hung on with the support of union leaders and staff, colleagues, and students. “I declined their offer, stuck it out and—well, I’m still here.”
At the time of his low ratings, O’Bryan was teaching the most at-risk freshmen taking the lowest-level math courses, including repeat Algebra I classes. Entering students tested below grade level, with some several years behind.
He worked hard, often with the help of a special education co-teacher in the room, he said. Together they sorted students into variable small groupings based on individual needs. He spent hours grading students’ work on a daily basis to assess understanding and engagement.
He developed a spreadsheet to track student grades and shared it with other teachers who wanted to use it. And he saw some kids increase two grade levels in math in one 12-week semester, including some who reached grade level.
He believes standardized test scores played an outsized role in his evaluations. “Because of the low pass rate in our most at-risk student population for math, I assume that is the administration’s primary cause for focusing on my evaluation and my teaching.”
He doesn’t claim to be perfect. Like most educators, O’Bryan wants to develop new and better strategies for engaging students who are uninterested in or discouraged by his subject and who are often dealing with other struggles in their lives.
He’s made changes in behavior management at his supervisor’s recommendation, which haven’t moved the needle on his score. “I feel like our current system is all about identifying failings versus building on successes.”
Now he’s sharing his story—despite fears about doing so—to try to right a wrong. “I think my anxiety is not as big of a concern as fixing this problem that we have with this current evaluation system,” he said.
DEEP DISSATISFACTION with the four-year-old statewide teacher evaluation system in Michigan crosses geographic and socioeconomic lines.
Interviews with two dozen educators who agreed to talk for this story, plus opinion surveys and conversations among MEA leaders and members over three years, reveal an evaluation system that has lowered morale and raised fears without improving teaching and learning.
School districts span a continuum in their approaches to the system’s mandates and in their willingness or capacity to address issues arising from its implementation in 2015. Still, educators across the state repeatedly describe the changes in the same terms.
Unfair. Arbitrary. Subjective. Demoralizing. Destructive.
MEA member Claire Reid, a first-year special education teacher, raised the issue with Gov. Gretchen Whitmer at a meet-and-greet event at her elementary school in Grand Rapids last month, and she received backup from co-workers in the room.
Reid teaches in a self-contained K-3 classroom for students with emotional impairments, but she’s judged on the same rubric used to evaluate someone teaching high school physics or middle school language arts.
“Evaluations are so important, and I want to grow and become better, but I feel frustrated that it might not be accurate because I’m evaluated against standards that I can’t achieve,” Reid told the governor. “That hurts my standing in the district for things I have no control over.”
Seated nearby, 20-year elementary teacher Jennifer Thayer agreed. “We wouldn’t label our students ineffective or minimally effective to help them learn and grow. It breaks your spirit, and yet we come in every day and give 110 percent to our children.”
Reid added that other new educators who recently graduated from university with her are voicing the same complaints about evaluations. “I think it deters a lot of new teachers from coming into the profession, and it also contributes to the high rate of younger teachers quitting.”
“Older ones, too,” a colleague remarked.
A more experienced special education teacher joined in to say she received a “minimally effective” rating in her 24th year of teaching after building-wide student test scores were included in her evaluation for youngsters she doesn’t even teach. “That was quite a shock,” she said.
Another long-time teacher echoed the others: “This evaluation system is not accurately reflecting the very hard work we are doing. I think it’s such a significant piece of what’s happening in this state right now with people leaving the classroom. It’s scary, and we need to fix it.”
Whitmer listened to the concerns and responded, “It’s clear that for a number of years now the philosophy in Lansing has been punitive and undermining and not supportive, for political reasons and not necessarily for what’s in the best interest of our kids.”
We have a lot of work to do on a number of fronts, Whitmer said, adding it will be a challenge to solve every problem in divided government. “But there’s no doubt that building up morale is critically important among the ranks of our educators.”
As Whitmer prepares to appoint a council of educators to offer expert insight on policy matters, Republicans control the state House and Senate. However, the party split narrowed to three votes in both chambers, and several educators won seats in last November’s general election.
BACK IN 2011, a GOP-dominated Legislature passed and Gov. Rick Snyder signed into law PA 173, which made it easier to dismiss teachers and barred school districts from making seniority or tenure status a main factor in layoff decisions.
PA 173 mandated that school districts base personnel decisions on retaining effective teachers and required annual evaluations for all educators, although it was not entirely clear how the new system would work.
A commission was established to make recommendations on a system that would incorporate student growth as a measure. That report led to an evaluation law passed in 2015, which mandated the requirements and penalties of a statewide system for rating teachers’ effectiveness.
Under the 2015 evaluation law, starting in 2015-16, school administrators were required to choose one of four state-approved observation tools and to make student growth measures account for 25 percent of a teacher’s score, a figure that rose to 40 percent this year.
By 2016, 44 states across the country had implemented similar high-stakes teacher evaluation reforms—many mixing complicated calculations of student test score data, along with ratings from observation rubrics, into evaluation scores.
Since then, several states have moved away from using student test scores to rank educator effectiveness out of concern over the data’s reliability. In January, new Democratic majorities in the New York Legislature made the use of state test scores in evaluations optional, following several years of public backlash.
On a much larger scale, widespread research in recent years has questioned the efficacy of the new systems overall. In fact, last June the foundation driving the changes since 2009 released findings from a six-year study showing little evidence of improvements to teaching and learning.
By last October, the Bill and Melinda Gates Foundation—which funded nearly 40 percent of that $575 million teacher evaluation experiment in three large school districts and a charter school consortium—announced it was moving on to different education-related priorities.
Matt Kraft, a Brown University researcher who has studied teacher evaluation, told Chalkbeat the districts involved in the Gates Foundation study “were very well poised to have high-quality implementation. That speaks to the actual package of reforms being limited in its potential.”
Kraft was lead researcher on a study released in 2018 that found the supply of new teachers was reduced in states that eliminated tenure protections and adopted high-stakes evaluation systems, aggravating shortages in hard-to-staff subject areas and in urban and rural schools.
“In our effort to move towards a better direction, were the costs larger than the benefits? That’s quite possible,” Kraft told Chalkbeat.
FEAR REIGNS as the overriding effect of Michigan’s changes to teacher evaluation, especially because PA 173 removed educator voices from the process by making procedures for hiring, evaluation, layoff, recall, and discipline prohibited subjects of bargaining, educators say.
Many of those interviewed said the resulting imbalance of power has made teachers subject to the whims of administrators free to demand extra unpaid duties and to define the particulars of what good teaching practice looks like.
From his conversations with MEA leaders across the state, Rick Vincent—president of the Reeths-Puffer Education Association in Muskegon—considers the changes to teacher tenure and evaluation a major cause of Michigan’s higher-than-average teacher attrition rate.
“I cannot tell you the feeling of fear that people have when the handle of their classroom door rattles, and they think, Oh, no—somebody is coming in to do that evaluation thing to me,” Vincent said.
It’s not surprising, he added, given that a 25-year career in education can be ended by an evaluation score less than one point lower than a colleague’s rating—even if both teachers are labeled “effective.”
“If that happens to me, I lose my house,” Vincent said. “It’s crushing to people.”
In addition to educators leaving the field, the number of new teaching credentials issued in Michigan dropped by 62 percent from 2004 to 2016, and enrollment in teacher preparation programs dropped 40 percent from 2011 to 2015.
Vincent attributes the problems to “insidious job creep,” which he defines as the loss of educator autonomy, increased workloads, and dismissal of educators’ high level of knowledge and skill. For that reason, he and others say they’re compelled to speak out in defense of their profession.
“The number-one question teachers are asking other teachers across the state is ‘When can you go?’ As in—When can you retire and hopefully still have some of yourself left?” Vincent said.
This year’s increase in the weight of student growth data in teacher evaluations, from 25 percent to 40 percent, has further ratcheted up the tension among educators.
MEA member Ellissa Lauer laments the time she must spend teaching her Wyoming kindergarteners how to use a computer and type on a keyboard to complete online math testing three times per year.
The youngsters are just learning numbers and counting, she said. Many speak English as a second language. She has found her students score markedly better taking the same test with paper and pencil.
“They’re going to tell me I’m a bad teacher because this data that I have no control over is ineffective? It doesn’t guide my instruction. It’s strictly for evaluation purposes.”
Issues around testing do not disappear as students get older, many teachers point out. Students in grades 3-11 are required to undergo M-STEP testing that many educators consider flawed and grade-inappropriate. And the consequences for results only apply to teachers—not test takers.
One MEA leader from an economically disadvantaged area said eighth-grade teachers in his district collected their own data over the past two years—marking the time taken by each student on various sections of the M-STEP test, which spreads over many hours across several days.
Their findings, along with a survey of student attitudes, documented the significant number of students who completed sections of the state assessments in a fraction of the time allotted or admitted skipping the five-paragraph essay.
“We have so little control over external factors,” said the teacher, who asked not to be identified. “My job is about more than a test score.”
OBSERVATION TOOLS approved for use by the state have created their own concerns among educators, especially their tendency to reduce complex pedagogy to a checklist removed from meaningful context and open to subjectivity in ratings from one administrator to another.
The art of teaching does not lend itself to one-size-fits-all measurements, said MEA member Bill Julian, who teaches business and social studies and serves as a Google-certified technology consultant for Traverse City Area Public Schools.
“The evaluators are looking at their computer and whether they can check a box or not, and if those things aren’t apparent at the exact time someone’s evaluating, you’re not going to get credit for it,” Julian said.
“It’s frustrating, because you’re doing other good things in the classroom which may not be on the checklist, and we also don’t get credit for that.”
Time spent planning to meet dozens of criteria on a checklist in two 20-minute observations and various unannounced visits per year shifts teachers’ thinking away from improvement goals rooted in content and students’ needs, said Farmington High School English teacher Megan Ake.
Additional hours are required to gather student data and fill out paperwork to document work, she said. “We go from ‘Here’s what I need to do to drive instruction,’ to ‘How is this going to look on my evaluation?’ It doesn’t feel organic.”
Plymouth High School chemistry teacher Scott Milam agrees. The MEA member was named 2018 Michigan Science Teacher of the Year, but he has not yet achieved “Highly Effective” status in his district.
The inflexibility of the system’s goal-setting and observation tools does not nurture true reflection, he said. Teaching a high school advanced science class requires different approaches than teaching a required middle school core class, or a music, physical education, or special education class.
“There’s so much discrepancy when I look at the rubric, I often think, This isn’t actually appropriate for my class. This doesn’t fit.”
The observation tool turns teachers into “point chasers,” Milam said, a quality most educators discourage in students because it takes focus away from learning.
“I don’t know any of my colleagues who say this has helped them become a better teacher. It’s just really frustrating.”
Many of those interviewed said a big problem lies in the gap between the stated intent of evaluations, which is to coach teachers in the challenges of developing a highly complex craft, and their use to sort and rank employees for disciplinary and job placement purposes.
For that reason, the system ultimately rewards compliance and discourages risk-taking to the detriment of innovation and growth, educators say.
“Staff is reluctant to try new things, disagree with the principal, ask a question or make a move that might count against them,” said Lisa Sutton, an instructional coach and Kalkaska Education Association President. “How do you improve when you’re running scared?”
Absent tenure protections and bargaining and grievance rights, the system also discourages teacher collaboration—which research shows actually does help educators to grow in their practice, along with high-quality, targeted professional development.
Even Charlotte Danielson, whose research on teacher effectiveness evolved into an eponymous evaluation tool used in Michigan and other states, has questioned the distilling of professional craft to “numbers, ratings, and rankings” as evaluation reform has done nationwide.
“I’m deeply troubled by the transformation of teaching from a complex profession requiring nuanced judgment to the performance of certain behaviors that can be ticked off on a checklist,” the author of Framework for Teaching wrote in Education Week.
“In fact, I (and many others in the academic and policy communities) believe it’s time for a major rethinking of how we structure teacher evaluation to ensure that teachers, as professionals, can benefit from numerous opportunities to continually refine their craft.”
Danielson’s critical commentary appeared in April 2016—a few weeks before Kalamazoo’s Corey O’Bryan received his first negative evaluation, which was formulated using the Charlotte Danielson Framework for Teaching. Three years later, he’s still rebuilding his confidence.
“I hear from some of the union reps that my story is empowering and helping out other people dealing with similar circumstances,” he said. “If I can help someone else stick with teaching because that’s what they love to do, then I made the right choice by pushing through.”
MEA never supported me when I went through the same things. After 28.5 years of teaching I was deemed ineffective. I got forced out and had to use what I had in saved retirement to pay off my last 5 months to get to 30 years. I had to refinance my house and car. It affects teachers deeply, especially their self-esteem. Prayers to others going through this.