Measuring Teaching Effectiveness: Correspondence Between Students’ Evaluations of Teaching and Different Measures of Student Learning

Research in Higher Education

Abstract

Relating students’ evaluations of teaching (SETs) to student learning as an approach to validate SETs has produced inconsistent results. The present study tested the hypothesis that the strength of association between SETs and student learning varies with the criteria used to indicate student learning. A multisection validity approach was employed to investigate the association between SETs and two different criteria of student learning, a multiple-choice test and a practical examination. Participants were N = 883 medical students, enrolled in k = 32 sections of the same course. As expected, results showed a strong positive association between SETs and the practical examination but no significant correlation between SETs and multiple-choice test scores. Furthermore, students’ subjective perception of learning significantly correlated with the practical examination score, whereas no relation was found between subjective learning and the multiple-choice test. It is discussed whether these results might be due to different measures of student learning varying in the degree to which they reflect teaching effectiveness.
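As a rough illustration of the multisection validity approach named in the abstract, the sketch below averages SET ratings and learning-criterion scores within each section and then correlates the section means. It assumes an unweighted Pearson correlation across sections; the analysis actually used in the study is not described on this page, and the function names and the record layout are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def section_means(records):
    """Average SET rating and exam score within each section.

    records: iterable of (section_id, set_rating, exam_score), one per student.
    Returns {section_id: (mean_set_rating, mean_exam_score)}.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for section, rating, score in records:
        acc = sums[section]
        acc[0] += rating
        acc[1] += score
        acc[2] += 1
    return {s: (a / n, b / n) for s, (a, b, n) in sums.items()}


def pearson_r(xs, ys):
    """Pearson correlation of two equally long sequences."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def multisection_correlation(records):
    """Correlate section-mean SET ratings with section-mean exam scores."""
    means = section_means(records)
    set_means = [m[0] for m in means.values()]
    exam_means = [m[1] for m in means.values()]
    return pearson_r(set_means, exam_means)
```

The unit of analysis here is the section, not the individual student, so with k = 32 sections the correlation would rest on 32 pairs of means. Running the same function once per learning criterion (multiple-choice score, practical examination score) gives the two coefficients that the abstract compares.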

Notes

  1. For the complete SET inventory used for the present study, please contact the authors.

Acknowledgments

Preparation of the manuscript was supported by a doctoral fellowship through the Landesgraduiertenförderung-LGFG (Funding program of the German Federal State of Baden-Württemberg) awarded to Sebastian Stehle. We thank Gerald Wibbecke and Dr. Monika Porsche for their assistance in the data collection and Dr. Anna Ropeter and Janine Kahman for reviewing the manuscript.

Author information

Corresponding author

Correspondence to Sebastian Stehle.

Appendices

Appendix A: Exemplary item of the MC test, type A+

A 67-year-old male patient presents with a histologically verified adenocarcinoma of the rectum. Which of the following examinations yields the important cue for defining the therapeutic concept (neoadjuvant radiotherapy versus primary operation)?

  • Endosonography

  • Proctoscopy

  • Colonoscopy

  • Sphincter manometry

  • Barium enema

Appendix B: Exemplary checklist for assessment of OSCE performance

  • For each item, the student can achieve a maximum of 5 points.

  • 5 points: all items fulfilled without help

  • 3 points: all items fulfilled with defined help of the examiner

  • 1 point: items fulfilled incompletely despite defined help of the examiner

Task

Information to the student: Please perform an abdominal examination on this young patient and comment on what you are doing! The patient has had pain in the right lower abdominal quadrant since yesterday. He has no diarrhoea, no vomiting and no dysuria. There is no relevant medical history. You do not need to take a history (anamnesis)!

Information to the examiner: Please instruct the simulated patient to lie on the stretcher with legs bent and arms entwined behind the head! The simulated patient has been trained to react strongly to palpation of the right lower abdominal quadrant, simulate signs of peritonitis there and give an impression of feeling ill.

Item 1: Auscultation

Information to the examiner: Use a global rating between 0 and 5 points for this item, taking into account the student’s confidence in performing and commenting!

The student…

  • introduces himself to the patient and explains to him the task he has to perform

  • asks the patient to lie relaxed with legs sprawled out or slightly bent and arms next to the body

  • starts with the auscultation in order not to mask pathological findings by prior palpation

  • comments on the auscultation in all 4 quadrants

Item 2: Palpation

Information to the examiner: Use a global rating between 0 and 5 points for this item, taking into account the student’s confidence in performing and commenting. If the student does not spontaneously examine Lanz’s point, McBurney’s point and the psoas sign, ask him/her to do so and reduce the rating accordingly!

The student…

  • starts palpation in the left lower quadrant distant to the point of maximal pain

  • palpates all quadrants first superficially so as not to miss abnormalities of the abdominal wall

  • performs a second deep palpation in all 4 quadrants

  • palpates the area of maximal pain last

Item 3: Signs of peritonitis

Information to the examiner: Use a global rating between 0 and 5 points for this item, taking into account the student’s confidence in performing and commenting. If the student does not spontaneously examine signs of peritonitis, ask him/her to do so and reduce the rating accordingly!

The student examines and comments on the following signs of peritonitis…

  • abdominal tenderness

  • ipsilateral and contralateral rebound tenderness

  • and interpretation as local or generalized peritonitis

Item 4: Rectal examination

Information to the examiner: If the student does not spontaneously mention the need for a rectal examination in this case, ask him/her whether and why he/she would perform a rectal exam and what he/she would specifically pay attention to. Rate single answers on this item (see below)!

The student pays attention to the following aspects:

  • Stool in the rectal ampulla as a sign of constipation (1 point)

  • pathological lump, e.g. tumour, polyp (1 point)

  • sphincter tonus; size and consistency of the prostate in males (1 point)

  • palpation of Douglas’s space, where pain is a sign of Douglas’ abscess (2 points)

Item 5: Interpretation of results and further

Information to the student: You have now gathered clinical findings by examining the abdomen. The rectal examination is painful on palpation of Douglas’s space. How do you interpret these results and what further medical information will you gather?

Information to the examiner: Use rating of single answers on this item (see below)!

The student…

  • interprets the results as strong signs of acute appendicitis, possibly already perforated (2 points)

  • requests laboratory tests for signs of inflammation and those needed prior to an operation (e.g. blood count for leucocytes, CRP, electrolytes, coagulation tests) (1 point)

  • either asks for an additional ultrasound or argues for the decision of an operation on the basis of the clinical findings (2 points)

Reason for rating below 17 points (minimal competence expected on the basis of the standard setting):
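The point scheme of this checklist can be tallied as in the following sketch. It assumes that Items 1 to 3 each receive a global rating of 0 to 5 points, that Items 4 and 5 are summed from the single-answer points listed above (5 points each), and that a total of at least 17 points counts as meeting minimal competence. The dictionary keys and function names are hypothetical labels chosen for illustration, not part of the original checklist.

```python
# Threshold stated in the checklist; treating ">= 17" as passing is an assumption.
MINIMAL_COMPETENCE = 17

# Single-answer points for Item 4 (rectal examination); keys are hypothetical labels.
ITEM_4_POINTS = {
    "stool_in_ampulla": 1,
    "pathological_lump": 1,
    "sphincter_and_prostate": 1,
    "douglas_space": 2,
}

# Single-answer points for Item 5 (interpretation); keys are hypothetical labels.
ITEM_5_POINTS = {
    "interprets_as_appendicitis": 2,
    "requests_lab_tests": 1,
    "ultrasound_or_operation": 2,
}


def single_answer_score(points, answers_given):
    """Sum the points for every listed answer the student gave."""
    answers = set(answers_given)
    unknown = answers - set(points)
    if unknown:
        raise ValueError(f"unknown answers: {sorted(unknown)}")
    return sum(points[a] for a in answers)


def total_score(global_ratings, item4_answers, item5_answers):
    """Total over Items 1-3 (global ratings) and Items 4-5 (single answers)."""
    if len(global_ratings) != 3:
        raise ValueError("expected one global rating each for Items 1 to 3")
    if any(not 0 <= r <= 5 for r in global_ratings):
        raise ValueError("global ratings must lie between 0 and 5")
    return (
        sum(global_ratings)
        + single_answer_score(ITEM_4_POINTS, item4_answers)
        + single_answer_score(ITEM_5_POINTS, item5_answers)
    )


def meets_minimal_competence(total):
    """True if the total reaches the 17-point threshold."""
    return total >= MINIMAL_COMPETENCE
```

Under these assumptions the maximum is 25 points, and any total below 17 is the case for which the checklist asks the examiner to record a reason.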


About this article

Cite this article

Stehle, S., Spinath, B. & Kadmon, M. Measuring Teaching Effectiveness: Correspondence Between Students’ Evaluations of Teaching and Different Measures of Student Learning. Res High Educ 53, 888–904 (2012). https://doi.org/10.1007/s11162-012-9260-9

  • DOI: https://doi.org/10.1007/s11162-012-9260-9
