A female medical student reviews narrative feedback from the senior resident she worked with on her third-year clerkship rotations: “MF was lovely to have on this rotation. She was warm and friendly, and patients liked having her involved in their care. I enjoyed having her on the team. She’ll be a great house officer someday.” After reviewing, the student struggles to understand what specifically she did well during her rotation and how to identify areas for continued growth or improvement. She wonders about her performance in clinical documentation, patient interactions, or procedural technique.

While this student received positive feedback from her supervising resident, it focused heavily on personality traits and failed to comment on her actions, specific contributions to patient care, or interactions with the medical team. Unfortunately, this is not an uncommon experience for female medical students, who are more likely to be described with terms illustrating personal attributes (e.g., kind, lovely, delightful), whereas their male counterparts are more likely to be described by abilities or skills (e.g., scientific, relevant, quick learners).1 Research has shown that, compared to men, women receiving narrative feedback are more likely to be penalized for not meeting stereotypical expectations of interpersonal warmth and benefit less from meeting standards of technical competence.2 One study of third-year medical students on an internal medicine rotation found that, while there was no significant difference in final grades and women scored higher than men on a variety of clinical performance metrics, the content of narrative evaluations differed dramatically by gender.2

Similarly, there are well-documented differences in feedback given to students who come from groups historically minoritized and who are thus underrepresented in medicine (URiM) compared to white colleagues. One study found that Black students received lower clerkship grades overall and were more likely to be described as “competent” while white counterparts were more often described with standout words like “best” or “exceptional.”1

As narrative feedback is integral at all performance levels, identity-based differences in the character and quality of feedback may have far-reaching implications as women and URiM trainees advance in their careers. Women are less likely to be promoted to the highest ranks of academic medicine, to receive departmental and national recognition awards, and to be elected to national societies. Similarly, Black and Asian students are less likely to be elected to the Alpha Omega Alpha (AOA) honor society, even after controlling for several educational and demographic factors.3 Honors early in a medical career, such as AOA membership, have been linked to upward mobility through academic medicine, including matching to a desired specialty/location, increased potential financial earnings, and higher rates of promotion.3 Past work has highlighted an amplification cascade, wherein small subjective differences in assessment can lead to larger differences in grades and awards, which are ultimately used as objective measures of success for promotion.3

While the problem of bias in evaluations is pervasive, it can also be mitigated through active, persistent self-reflection and intentional correction. Interventions to address bias in student and trainee evaluations thus far have predominantly been aimed at faculty who are tasked with evaluating medical students after their clerkship. However, senior residents also frequently evaluate medical students, and their comments are also eligible for inclusion on the Medical Student Performance Evaluation (MSPE). The Alliance for Academic Internal Medicine posited that “medical schools…should prioritize teaching faculty and residents the skills and strategies needed to mitigate bias when they assess students.”4 However, formalized education on this topic for residents remains sparse.

We argue that education and training in unintentional bias in evaluations should be a core component of internal medicine residency training programs. At our institution, we have incorporated this training into our “Preparing to be a Senior Resident” retreat at the end of intern year, into our inpatient morning report offerings during the year, and into the residency prep course for graduating medical students. Similar training could also be incorporated into the “Residents as Teachers” curricula that are offered at many IM residency training programs.

As a starting point, it is important to teach residents to proactively reflect on their own biases and consider how these biases could influence written evaluations. When writing an evaluation, consider whether the words chosen would be used for a learner of another gender or race/ethnicity. Best practice is to focus on accomplishments, abilities, and skills, rather than personality traits. One simple approach that can be shared with supervising residents to minimize unintended bias in feedback is shown in the figure; this offers checkpoints for evaluators before, during, and after interactions with junior trainees.5 At the start of the rotation, set clear expectations with students and outline your evaluation and feedback process so they know what to expect. Throughout the rotation, make notes of laudable actions, directly observed behaviors or accomplishments by the student, and areas in which to offer specific constructive criticism. When writing narrative feedback, draw focus away from personality traits (no matter how positive) and emphasize what the student did during the rotation and how they grew as a trainee. Consider using a gender bias calculator or having a trusted colleague read your feedback to ensure it is not gendered. Finally, ensure written feedback matches what you have shared with the student verbally.

Figure: Schema for Addressing Bias

Based on these concepts, the following is rewritten feedback from the senior resident for the female medical student:

“She was a valued member of the team and asked insightful questions on rounds. She carried more patients than would be expected at this level of training, developed well-rounded and comprehensive plans for patients, and carried out those plans efficiently. She integrated well into the team and anticipated team needs, taking initiative to obtain outside records and facilitate transitions of care for patients. She is ready to be a sub-intern and have more responsibility and independence in the care of her patients.”

In summary, written narrative feedback is vulnerable to implicit bias. Given the importance of feedback for promotion and advancement in academic medicine, even early in a trainee’s career, it is important to consider the language we use to evaluate medical students, and to consider a structured framework to mitigate our own biases. Formal training to recognize and avoid unintended bias in evaluations should be a core element of our internal medicine residency curriculum, considering the vital role senior residents play in the professional development of students and the weight placed on their narrative evaluations, both by learners, who value near-peer feedback, and through inclusion in formal assessment tools like the MSPE.


  1. Rojek AE, Khanna R, Yim JWL, et al. Differences in narrative language in evaluations of medical students by gender and under-represented minority status. J Gen Intern Med. 2019;34(5):684-691. doi:10.1007/s11606-019-04889-9.
  2. Gorth DJ, Magee RG, Rosenberg SE, et al. Gender disparity in evaluation of internal medicine clerkship performance. JAMA Network Open. 2021;4(7):e2115661. doi:10.1001/jamanetworkopen.2021.15661.
  3. Teherani A, Hauer KE, Fernandez A, et al. How small differences in assessed clinical performance amplify to large differences in grades and awards: A cascade with serious consequences for students underrepresented in medicine. Acad Med. 2018;93(9):1286-1292. doi:10.1097/ACM.0000000000002323.
  4. Onumah CM, Lai CJ, Levine D, et al. Aiming for equity in clerkship grading: Recommendations for reducing the effects of structural and individual bias. Am J Med. 2021;134(9):1175-1183.e4. doi:10.1016/j.amjmed.2021.06.001.
  5. Gulbas L, Guerin W, Ryder HF. Does what we write matter? Determining the features of high- and low-quality summative written comments of students on the internal medicine clerkship using pile-sort and consensus analysis: a mixed-methods study. BMC Med Educ. 2016;16(1):145. doi:10.1186/s12909-016-0660-y.




Author Descriptions

Dr. Finta (fintama@med.umich.edu) is a chief medical resident in the University of Michigan Internal Medicine Residency Program and serves as a resident lead for Equal Medicine, a novel curriculum for women in academic internal medicine in the Internal Medicine Residency Program at the University of Michigan. Dr. Sheffield (vmmorris@med.umich.edu) is a clinical assistant professor in the Division of Hospital Medicine at the University of Michigan and attends on the general medicine service at the VA Ann Arbor Healthcare System. Dr. Lukela (jlreilly@med.umich.edu) serves as the faculty advisor for Equal Medicine, a career development program for women trainees in Internal Medicine at the University of Michigan, and as the Vice Chief for Clinical Strategy and Community Engagement in the Division of General Medicine.