Peer Assessments


Peer assessment has become a talking point in many MOOC courses, and has sparked bitter and/or extensive arguments in several discussion forums. Seeing the impact of these debates on students' learning experience led me to look for prior research in this area. This article presents my preliminary findings on peer assessment.

Note: This work is not an extensive or detailed one, but the preliminary findings certainly suggest that the process needs to be refined to improve students' learning. At least two of these articles explicitly mention including clearly defined rubrics.





The references reviewed are listed below, along with the references cited therein.


Reference 1 - Harvard link:

From the summary section:

1. The success of peer assessment relied heavily on training students and on instructor monitoring.

Note: Training in a large-scale MOOC system, though not easily attainable, can be indirectly implemented through clearly defined rubrics (as suggested in a couple of the above-mentioned references).

2. This article also suggested that lower-performing students tend to inflate their own grades, whereas higher-performing students' grades can suffer from others' grading.

However, this study does not appear to include a blind peer-grading system of the kind prevalent in MOOCs, and the article also addressed a different level of education.


Reference 2 - NIH link:

Several questions were brought up in the article, and I have shortlisted some of its key elements -

------------------------------- “Is student grading good enough to use?”

Based on their research findings, the researchers noted that peer grading can be subjective and context-dependent.

Their study also suggested that peer grading does not appear to inflate or deflate grades. However, this study involved a relatively homogeneous group of students, likely with less diversity than is seen in MOOC courses.

This study also cited various other references to address the following question:

------------------------------- “Are students almost always easier graders than professionals?”

Research findings suggested the following:

a. Medical students grade their counterparts more harshly than expert tutors do.

b. Dental students' grading is indistinguishable from that of professionals.

c. Undergraduate students graded more leniently than graduate students in an introductory psychology class.

In this reference, the researchers also suggested that the student-professional grading gap increased with the complexity of the questions (higher levels of Bloom's taxonomy, where students' performance tends to decline). Since most case studies and scenario analyses tend to be of higher complexity, there is potential for bias concerns with peer grading. (Note: Rubrics were clearly defined in this study.)


Reference 3:

In this article, the researchers found student peer assessment (SPA) to be quite reliable, but added the caveat that the level of significance was a concern. The article also presented some of the study's limitations, which are reproduced below to save readers' time.

The design and context for this study present certain limitations. These first-year medical students, who were still within their first four months of school, had no prior experience with SPA within the medical school curriculum. A number of researchers who have studied SPA recommend that students be trained on SPA prior to practicing it [13], whereas this study accelerated training and application because of the crowded curriculum.

Second, practical considerations revolving around medical students' frequent interactions across their class cohort in medical school prevented extending this experiment over a time period longer than a week to measure sustained learning of searching skills. Students in the intervention group during an experiment of longer duration could possibly have taught their hypothetically superior EBM PubMed searching skills to students in the control group in subsequent weeks, thereby contaminating individual members of the control group for the study. A randomized controlled trial in medical education at Oxford University acknowledged the possibility of such a contamination effect [61]. Thus, in this type of course-long design, it was not possible to truly distinguish between groups. Yet, a prospective cohort study could probably determine whether students using SPA retain high levels of competence beyond a single week of a block compared to a similar group of students lacking training in and application of SPA.

Third, no test data are available to evaluate how test scores might fluctuate over time for other students participating in a conventional learning and assessment arrangement.

Fourth, this study could not determine if the observed differences resulted from the fact that students in the intervention group simply spent more time using the skills set or using the rubrics. SPA takes longer not only to learn, but also to practice. The argument might be made that more time invested in any meaningful activity related to learning, particularly when coupled with timely and consistent rubric-based feedback, will result in better scores. The SPA methodology furthermore inherently involves students spending incrementally more time interacting with the skills set. In this version of SPA, however, most of the activity occurred outside class or lab contact hours, a key advantage of SPA in a crowded curriculum.

Fifth, the presence of one of the authors (Perea), who was known to all students as a proctor for major medical school exams, might have caused some degree of inadvertent anxiety for all students. This anxiety might have prompted students to take the formative test more seriously as a mild form of the Hawthorne effect [62]. If all students were anxious, this might have motivated them all to perform better than they would have in a more typical, formative SPA setting. This, in turn, might have elevated the grades for both groups.

Finally, this study only included first-year medical students in one block at a single medical school, and so any generalization will remain limited until others elsewhere replicate this study.


With unknowns still surrounding peer grading in a MOOC setup, students should at least be given the option to opt out of peer grading until clear evidence supports it either way (taking variables such as subject material, group diversity, Bloom's taxonomy level, etc. into consideration), especially if we factor in the diversity that comes with MOOC courses.


At the same time, peer assessment appears to work for more homogeneous groups, and can also support logistical, pedagogical, meta-cognitive and affective learning processes.

To help those students who are interested in participating in peer assessment, I would like to suggest that peer-assessment providers:

i. Include clearly explained rubrics, combined with a feedback section for students to share their comments.

ii. Give the submitter a chance to clarify her/his stance through a re-submission process if they disagree with peer-graded scores. A re-submission process would also task the course instructor with constructing a clearly defined peer-grading process, and could make peer graders more objective and flexible while grading.