Ideas for Enriching the Curriculum
Sharpening Subjective Evaluation Skills

Judith L. Gersting
Department of Computer Science
University of Hawaii at Hilo
Hilo, Hawaii 96720 USA
gersting@hawaii.edu

Frank H. Young
Department of Computer Science
Rose-Hulman Institute of Technology
Terre Haute, Indiana 47803 USA
young@cs.rose-hulman.edu

The ability to evaluate one's own work, or that of colleagues, is a valuable skill. In a programming class, students rely on the compiler as a mechanical means to "evaluate" the quality of their programs, feeling that a program that compiles must be good. As computer science educators, we look for much more than merely successful compilation when we evaluate student programs. While we can espouse certain guidelines, there is still subjective judgment involved. Even more subjective judgment is involved in evaluating other written work such as reports and proposals. Skill in self-evaluation, as well as the ability to evaluate the written work of others, is something we should help our students cultivate.

Unfortunately, many computer science students have little confidence in their ability to make subjective judgments. Even more of a problem is the tendency of science and engineering students to consider any subjective rating to be unscientific, and therefore invalid and useless. How can we help our students overcome these prejudices? One possibility is to create an environment in the classroom that enables them to realize the validity and usefulness of their own subjective ratings. This article describes one successful attempt to create an environment where this learning is facilitated.

The Initial Assignment: Students were asked to prepare a preliminary report of their system analysis activity in a software engineering course.

The Second Assignment: After the initial assignment was turned in, the instructor distributed seven sample papers (with author names suppressed) to every member of the class.
Every student received the same seven sample papers. After some initial class discussion about what was good and bad in each paper, the students in the class were asked to assign grades to each paper. No further instructions were given.

Some Interesting Information: One of the "papers" that was distributed was blank. This paper represented the paper of a student who had not turned in the assignment. After the initial reaction of the students to this "paper," the instructor asked what the reaction of the students would be if they were this person's supervisor and this was what was turned in as a work product. The resulting discussion showed that the students had never thought about a late paper in this light.

The Initial Results: The grades that were turned in showed major inconsistencies. Some students graded on an A to A- scale while others graded on an A to F scale. This lack of uniformity made simple grade comparisons meaningless.

Massaging the Results: The professor converted all grades into rankings. One was assigned to the highest graded paper(s) and seven was assigned to the lowest graded paper (unless two papers tied for last, in which case each received a six). Tied papers always received identical rankings. The ranks received for each paper were totaled.

Reporting the Results: The totals were reported to the class along with the professor's letter grades for each sample paper. Low totals were desirable. The paper that received an A was ranked first by more than 90% of the students. The paper that received an F was ranked last by 100% of the students. Most importantly, the rankings of individual students agreed with the consensus ranking in more than 85% of the cases when the grades were a letter grade apart, although, as expected, there was less agreement (about 70%) when grades were half a letter grade apart.
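The grade-to-rank conversion described under "Massaging the Results" can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the professor's actual procedure: the numeric grade scale, the paper labels P1 to P7, and the two sample graders are invented, and ties are resolved by giving each paper a rank of one plus the number of papers graded strictly higher. That rule matches the cases the text mentions (a sole last-place paper gets seven, two papers tied for last each get six), but the article does not say how other ties were handled.

```python
from collections import defaultdict

# Hypothetical numeric scale (an assumption): higher value means a better grade.
GRADE_POINTS = {
    "A": 4.0, "A-": 3.7, "B+": 3.3, "B": 3.0, "B-": 2.7,
    "C+": 2.3, "C": 2.0, "C-": 1.7, "D+": 1.3, "D": 1.0, "F": 0.0,
}


def grades_to_ranks(grades):
    """Convert one student's {paper: letter grade} into {paper: rank}.

    Rank 1 is best. A paper's rank is 1 plus the number of papers that
    this student graded strictly higher, so tied papers share a rank.
    """
    points = {paper: GRADE_POINTS[grade] for paper, grade in grades.items()}
    return {
        paper: 1 + sum(1 for other in points.values() if other > value)
        for paper, value in points.items()
    }


def total_ranks(all_grades):
    """Sum each paper's rank over every student; low totals are desirable."""
    totals = defaultdict(int)
    for grades in all_grades:
        for paper, rank in grades_to_ranks(grades).items():
            totals[paper] += rank
    return dict(totals)


# Invented sample data: one grader using an A to F scale, one using A to A-.
harsh_grader = {"P1": "A", "P2": "B", "P3": "B", "P4": "C",
                "P5": "D", "P6": "F", "P7": "F"}
lenient_grader = {"P1": "A", "P2": "A", "P3": "A-", "P4": "A-",
                  "P5": "A-", "P6": "A-", "P7": "A-"}

if __name__ == "__main__":
    print(grades_to_ranks(harsh_grader))
    print(total_ranks([harsh_grader, lenient_grader]))
```

Because only the ordering within each student's own grades is used, a student who grades from A to A- and one who grades from A to F contribute comparable numbers to the totals, which is the point of converting grades to ranks before summing.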
Student Reactions: Students were amazed at the amount of consensus and at how close their rankings were to the rankings of the group (and to those of the professor). They were surprised that this consensus was obtained even though there were no uniform standards being applied and the individuals doing the ratings were often applying different standards.

Educational Opportunities: This exercise provided the instructor many opportunities to help students develop their professional competence. Students were reminded that they had no particular expertise in grading. Despite this, they demonstrated remarkable consistency in their rankings. The instructor took care to point this out to the students and also indicated that rankings assigned after the students had some additional experience might result in even more consistency. The instructor was also able to point out that the students in the class had demonstrated their ability to evaluate the work of others in a fairly reliable fashion. Students were thus given an important ego boost. Finally, the instructor encouraged the students to apply similar evaluation techniques to their own work. In particular, the instructor encouraged the students to rely less on instructor evaluation, placing more emphasis on their own evaluation of their own work. The explanation for this was simple: employers do not give letter grades.

Did this exercise work out as the instructor intended? No, there was more agreement than was expected and there was more learning than expected. Was this exercise worth one class period? Emphatically yes! Will the instructor do this exercise again? Emphatically yes! It was a worthwhile experience for all of the participants. It helped them grow as well as learn.

SIGCSE Bulletin, June 1999, Vol. 31, No. 2