Every few years someone does a study that proves student evaluations in higher education are biased against people of color, white women, and women of color. The latest study also shows that untenured faculty are likely to receive lower scores for the same work. And, also no surprise, student perception of inflated grades (i.e., that they are earning an A no matter what work they do or do not do) will raise the evaluation scores.

The interesting thing about these studies is that no matter how “objective” the agency administering or financing them is, schools continue to rely on evaluations unweighted by considerations of diversity. Evaluations are also consistently used to prove “lack of fit” during the tenure process. Though some of us, myself included, have managed to find ways to make even the most contentious classroom fill out the forms in a positive light, the biases involved are no surprise to any of us. So the question then becomes: why are they still in use?

This is from the Chronicle, including a link to pay for a copy of the most recent study:

Student evaluations of their instructors can often reflect certain biases that professors themselves have no control over, says a study by Michael A. McPherson and R. Todd Jewell, both associate professors of economics at the University of North Texas. Given the weight that such evaluations receive in decisions regarding tenure and promotion, they ask, should evaluation scores be adjusted?

According to the study — which is based on student evaluations from 280 master’s-level courses in economics at North Texas from 1994 to 2005 — faculty members are more likely to receive lower scores from students if their course meets just once a week, or if they are teaching a theory course. Faculty members who are nonwhite also earn lower average scores, says the study, as do professors who are nontenured.

In addition, say the authors, instructors are able to increase their average evaluation scores by inflating the grade expectations of their students.

Adjusting evaluations in a way that controls for factors beyond the control of faculty members — such as their race — could make the ranking process fairer, as well as more accurate, the authors argue. After adjusting for such factors in their own study, the authors report that there are “clear differences between raw and adjusted rankings and that these differences are substantial for some instructors.” In one of their adjustments, a nontenured female instructor went from being ranked 10th out of 17 tenure-track instructors, to being the top-rated instructor.

“The issue of bias in student-evaluation-of-teaching scores is one that each department should discuss, and one that each department may resolve in its own fashion,” conclude the authors. They add that, in general, “departments may find that adjustments based on some set of criteria are a valuable exercise.”

The article — “Leveling the Playing Field: Should Student Evaluation Scores Be Adjusted?” — is available to subscribers or for purchase through Blackwell Synergy.

6 thoughts on “Blackwell Study Proves Student Evals are racist”

  1. I’m so glad you brought this up. As a nontenured white woman I can’t stand end-of-year evaluations (though I do use short evaluations all semester long that I keep for myself to track course progress). Students are stressed, annoyed and ready to take it out on whoever they feel has the least authority. So people of color, women, and people without tenure get the most dumped on (EVEN IF during the course student perceptions were very different, which I’ve found by doing evaluations during the semester).

  2. Kate – thank you for putting out there that you do in-semester evaluations. I encourage all of my grad students to do this when they first start teaching, so they will be in the habit of doing it when they are faculty. It is a great way to track progress, reflect on what is working and what you can tweak, and to give students some accountability and say in their education. And you are right to note that when you are in one of these targeted groups, the semester evals can often differ from the end-of-the-year ones for the exact reasons you note (in addition to the isms). An added benefit, then, is that you have these additional evals on file to counter discussion about teaching effectiveness should you be called out for “a bad semester” (which everyone has sooner or later). I would add that if you have graduating seniors in your classes whom you’ve been working with, it is nice to ask them for written evaluations to be turned in at the end of the year for your “tenure file.” If you are on tenure track, these evals can be put in the file and then you do not have to scramble for them later, and if you are on the job market at any time you have them as an additional testament to the work that you do as an advisor, mentor, and educator.

  3. Thanks for pointing out this new study. There was one many years ago that I lost the reference for. Men who were demanding were respected, women were mean harpies, etc. etc. There’s also the question of the difference between courses where the issue is mastery of concrete material and those where the activities are more interpretive. I find they don’t like the interpretive ones, or the skill-building ones, but the acquire-info ones … these are the ones that (although they require work) require the least investment of self, either in terms of practice time (knowing you won’t be perfect yet) or navigating the unknown (knowing you won’t get to the one final answer).

  4. even earlier one by (eliza) beth hartung in nwsa journal! how many surveys do the gatekeepers need!!! and what have been the effects of all our diversity classes on our students, colleagues, and T & P committee? kbw

  5. thanks for the author on that one KBW – for whatever reason I was thinking the AAUW and the NWSA one were the same but kept hearing a voice in my head saying otherwise. thanks for letting me take that confusion off my plate. 😀

