Thursday, June 02, 2005

The Unbearable Weirdness of Student Evaluations

I spent a good few hours last night going through my student evaluations for the past semester. An important ritual of the pre-tenure years was the annual review, in which I was expected to discuss and explain the evaluations. I no longer have to do this in a formal way, but I think it's a good idea to spend some time trying to understand the evaluations and figure out what went wrong (and what went right) with each course. It puts a nice cap on the year and helps a teacher consolidate gains and make plans to avoid repeating errors. This is particularly useful at a place like Wheaton, where we don't regularly teach exactly the same courses in consecutive years: if you make notes about texts, assignments, and lectures in your annual review, you'll have them on hand when planning the next version of the class.

This year I had a relatively controlled experiment in the accuracy of student evaluations: for the first time ever, I taught exactly the same syllabus in two different classes, back to back, in the same classroom. Both were sections of my English 101 course, Writing About a Wonderful Life, in which, among other things, I have students struggle through Stephen Jay Gould's book Wonderful Life.

The evaluations are bizarre in how much they differ. Both classes met for an hour and a half, the first from 9:30 to 11, the second from 11 to 12:30. The first class gave me my lowest numeric evaluation ever for the course itself: a 3.9 out of 5. That same class rated me a 4.6 out of 5 as an instructor. Compare that to the second class, which rated the course itself a 4.3 and me as instructor a 4.7 (and that course number for the second class is a little skewed, because one student rated the course a 2, only the second 2 I've ever received out of about 500 evaluations at Wheaton and Loyola). A better numeric comparison might be the modes: in the morning class the mode for the course is 4 (10 out of 14 ratings), while in the later class it is 5 (8 out of 14). The mode for instructor in both classes is 5 (9 out of 14 and 11 out of 15).

The difference in instructor rating between the two courses is not significant, but the difference in course ratings is. Yet the one thing that was exactly the same in the two classes was the course itself. I used the same syllabus, same books, same assignments -- precisely the same. True, by the time I got to the second course, I had worked out the kinks in lecture and discussion, recognizing what had worked and what hadn't. But that should have changed the instructor rating, not the course rating.

I think what this little experiment shows is that individual classroom dynamics get projected unconsciously onto things like the syllabus and the choice of reading assignments. The second class, with its wider range of personalities, was much more animated (the 11 to 12:30 time-slot is the best one of the day: students are awake, they aren't logy from having just eaten lunch, and they're not rushing off to anything). That energy, which had nothing to do with course design, ended up influencing the students' impressions of the ostensibly objective elements of the course.

It's also strange to realize how different one's own impression of a semester can be from the impressions the students have. I felt very guilty about this semester because I wasn't on campus as many hours as I would have liked to be (child-care issues). But students raved about how accessible I was, because I answered email right away and scheduled meetings within 24 hours (always during the lunch hour, which students like because it doesn't conflict with their classes). I was not in my office nearly as much as I usually am, and yet I apparently met their needs more effectively (I suspect because of the scheduled meetings rather than drop-ins).

There was, however, one anomalous comment in the Chaucer class, where a student said he/she had emailed me twice and never received a reply. This is very odd, since I just did a search of all my email from students this semester, and there is not a single message without a "replied" icon next to it. Strange. I wonder if this could be the same student who claimed to have emailed me all the papers for the semester (I never received any of them, and he never did turn in hard copies...). And it does make me wonder if the wonderful world of spam has started making email much less reliable than it used to be.

Overall the evaluations were useful and encouraging. Students seem to have got what I was trying to do in the courses, even if many of them hated the Stephen Jay Gould reading. And in Chaucer there was pretty unanimous agreement that it's a good idea to violate the integrity of the A-fragment.

3 comments:

Fred said...

Dear Ed - I recommend Schuster's book, "Breaking the Rules" - it has lots of practical advice for comp teachers.

Frank said...

I have nothing but love and affection for almost all of my English professors, but I really wish I'd had you as a professor! Any chance Auburn could magically transport itself to the Delaware Valley and start granting Ph.D.s so I can have you as my advisor when I go to get my doctorate? No? Damn!

What Now? said...

I often teach two identical sections of a course back to back in the same room, so I regularly get to experience the student-evaluation phenomenon you discuss here. The oddity I found most interesting came last fall, when my "how organized is the professor" number dropped dramatically between the first and second sections, even though, if anything, I was more organized in the second section because I'd worked out the little kinks from the first. Very odd.

I always read my evaluations very carefully and try to improve my courses based on this feedback (and my numbers are usually very good), but I have to say that experiences such as the one I noted above make me skeptical about taking the numbers too seriously. I get far more out of the written comments from students.