Simulated patients’ role-portrayal in the clinical skills part of the Swiss federal licensing exam is of high quality and improves further over time as measured with the FAIR OSCE instrument

Kai P. Schnabel; Daniel Bauer; Felix M. Schmitz; Tanja Hitzblech; Beate G. Brem

doi:10.3205/zma001736

This is the English version of the article. The German version can be found here.

research article

simulated patients

Simulated patients’ role-portrayal in the clinical skills part of the Swiss federal licensing exam is of high quality and improves further over time as measured with the FAIR OSCE instrument

Kai P. Schnabel ¹
Daniel Bauer ¹
Felix M. Schmitz ¹
Tanja Hitzblech ¹
Beate G. Brem ¹

¹ University of Bern, Institute for Medical Education, Bern, Switzerland

Abstract

Introduction: Simulation-based teaching and assessment are integral to education in the health professions, with simulated patients (SP) being a widely accepted strategy. Ensuring high-quality SP role-portrayal is crucial for the authenticity and standardization of assessments, particularly in high-stakes exams like the Swiss Federal Licensing Examination (FLE).

The study assesses the quality of SP role-portrayal over consecutive instances of the Swiss FLE. We hypothesized that the quality of role-portrayal improves over time.

Methods: The study employed the FAIR OSCE instrument to assess SP role-portrayal in five consecutive FLE exams from 2016 to 2021. The instrument, developed between 2011 and 2014, includes four categories: introduction, delivery of information, portrayal, and others. Data analysis involved retrospective examination of FAIR OSCE ratings, calculating item scores, and overall mean scores for each exam year.

Results: The study involved 37 SP educators observing 1803 SP-candidate interactions across five exam sites. Results demonstrated a continuous improvement in SP role-portrayal over the five-year period, with significant differences between 2016 and subsequent years. The overall mean scores of SP role-play ratings increased steadily, indicating a positive trend in SP performance.

Discussion: The findings supported the hypothesis of continuous improvement of SP role-portrayal within the Swiss FLE. The quality of role portrayal not only improved consistently but also maintained a very high level, with no items on the FAIR OSCE instrument rated “do not agree” more than 5% of the time. This suggests that SPs role play aligned well with case scripts, reflecting the intended authenticity and standardization of assessments.

Limitations were acknowledged, including potential bias in local SP educators rating their own SPs and the study’s sole focus on SP role-portrayal. While the findings contribute to understanding SP effectiveness in standardized, high-stakes clinical exams, the study did not scrutinize other potential sources of variance.

Conclusion: In conclusion, the research demonstrated a continuous improvement and high quality of SP role-portrayal in the Swiss FLE over five years. Well-trained SPs, assessed using the FAIR OSCE instrument, play a crucial role in maintaining the standardized and high-quality nature of clinical skills exams in a high-stakes context. Further research could explore additional factors influencing overall exam quality and address potential biases in SP educator ratings.

Keywords

OSCE, objective structured clinical examination, standardization, standardized patien, simulated patient, high-stakes examinations, licensing exam, patient simulation

Introduction

Simulation-based teaching and assessment are crucial in health professions education and assessment [1], [2]. Among different simulation-related options such as the use of mannequins, full-scale simulators and computer-based applications, employing simulated patients (aka simulated participants, SP) is a proven strategy for delivering simulation scenarios [2], [3]. First introduced in the 1960s [4], [5], working with SPs in learning and assessment contexts is nowadays widely accepted and applied by many medical schools all over the world [6]. Objective Structured Clinical Examinations (OSCEs) [7] regularly employ SPs, in which medical students interact with SPs portraying pre-defined patient cases, while examiners rate the students’ performance. The quality of the SPs’ role-portrayal is of critical importance to ensure fair and valid exams, above all in terms of authenticity and standardization [8]: while an insufficiently authentic role-portrayal limits the representativeness of the task put before the student, insufficiently standardized role-portrayal limits the comparability of their tasks and thus exam fairness, both are threats to the exam’s validity [2], [9], [10]. It is the role of SP educators to ensure their role-portrayal has a high level and good balance of authenticity and standardization [11], [12].

Switzerland introduced the new Federal Licensing Examination in (FLE) in 2011 [13], with a written exam and an OSCE that graduates of the medical Masters programs take in order to achieve licensing as physicians. An increasing number of candidates (medical students and candidates from non-European countries, seeking licensure in Switzerland) have to be assessed in the FLE with over a thousand taking the test in 2021 [14]. Several measures were put in place to ensure the highest quality in the development and delivery of this assessment [15], [16], [17]. The quality of the overall FLE is mirrored, e.g., with Cronbach’s alpha of above 0.84 as indicator of measurement reliability) [17].

To ensure each and every scenario’s SPs with their diverse individual backgrounds and acting experience are amalgamated into homogenous groups that portray their roles well over the whole exam day, a protocol for training and quality assurance was put in place. This includes 5 days of standardization meetings on a federal level to establish a common understanding of the scripts and scenarios and how to put them into practice and generating videographic material depicting the prototypical realization of every scenario (e.g., standardization of pain reactions, application of moulage [18], [19]). SP educators and SPs from their local program then use that material in two spaced trainings session of 2 hrs each to ensure authentic and standardized role portrayal during the exam [17]. During the exam, quality of SPs’ role portrayal (and other aspects of the exam) is controlled via live observation, e.g., via one-way mirrors or via audio-visual feed into a control room.

Relying on an assessment format as complex and resource-intensive as the OSCE to make decisions that can impact both patients’ safety (candidates that pass as false-positive) but also test-takers’ careers (candidates that fail as false-negative) warrants an analysis to see if SP role-portrayal in high-stakes OSCE can be performed at high levels and continuously. Thus, the aim of the present study was to assess the quality of simulated patients’ (SP) role-portrayal in Swiss high-stakes clinical exams over time. We hypothesized that the quality of SP role-portrayal in the Swiss FLE continues to improve over time.

Methods

In order to determine if SP role-portrayal in high-stakes exams is performed at high levels in general and over time, we conducted a cohort study with five Swiss SP programs involved in the delivery of the OSCE within the FLE.

Setting

The OSCEs were conducted at the time of this analysis at 5 sites (Basel, Bern, Geneva, Lausanne and Zurich), simultaneously in French and German, over three days, each run covering 12 stations, and almost all of which employed SPs (comprehensive data about the exam is available in attachment 1 [Att. 1]).

Procedure

To determine if OSCE-related SP role-portrayal is performed at high levels and consistently so, SP role-portrayal was assessed during the OSCEs within the FLE from 2016 to 2021, representing five consecutive exams as the 2020 clinical skills exam was not administered due to the COVID-19 pandemic. These five exams assessed over 5000 candidates’ clinical competencies, for which SPs were employed to deliver 70 parallel runs in total at 12 stations each. Data on these SPs’ role-portrayal were collected at all five sites over the five exams in random observations, via direct observation or live-stream as measured with the FAIR OSCE instrument. Raters were the local SP trainer (1-5 at each site) and one federal observer. The SP trainers were either physicians or experienced health professionals or artists with 2-20 years’ experience in the field and members of the federal SP trainer team.

Instrument

The FAIR OSCE instrument was developed between 2011 and 2014 by an interfaculty board of Swiss SP educators attempting to assess SP role-portrayal using case-independent observable criteria. It is based on the available literature and underwent a consensus-building process among the interfaculty board, supporting its content validity [15]. The instrument was additionally used in SP trainings as well as delivery of local formative and summative OSCEs at the faculty of Bern and has proven its feasibility and usability [15], [20].

The instrument encompasses four categories:

introduction (3 items),
delivery of information (7 items),
portrayal (7 items), and
others (3 items).

The items are assessed on a scale with the points -1=do not agree, +1=agree, +2=completely agree, and not applicable. (cf. attachment 2 [Att. 2])

Data analysis

FAIR OSCE ratings from 2016 to 2021 were analysed retrospectively. Item scores were calculated to highlight the quality of SP role-portrayal within each year’s exam, and overall mean scores were calculated for each of the 5 exams (2016 to 2021, items rated as “not applicable” were excluded), i.e. all ratings derive from all sites administering the examination in a given year.

Statistical differences between examination years were calculated with one-way ANOVA. Effect sizes were calculated using eta squared with η²~0.01 for small, η²~0.06 for medium and η²>=0.14 for large effects. Post hoc tests were calculated using Bonferroni tests for each possible group comparison.

Ethics approval

Ethics approval for this study was not necessary in accordance with Swiss legislation (chapter 1, article 2 of the 2011 Swiss Human Research Act). Ethics principles according to the WMA Helsinki declaration and good scientific practices were followed throughout.

Results

During the 5 OSCEs between 2016 and 2021, totalling N=37 different SP educators (approximately 2-4 per site and year, mostly the same SP educators over the years) observed a total of N=1803 SP-candidate interactions (Basel n=247, Bern n=419, Geneva n=357, Lausanne n=467, and Zurich n=313) and rated the SP role-portrayal.

The number of observations increased from an initial 168 observations in 2016 to a maximum of 499 observations in 2021.

To exemplify, table 1 [Tab. 1] shows the rating results for the OSCE 2021, based on 499 independent observations.

**Table 1: FAIR OSCE response rates as derived from the 2021 exam**

The overall mean scores of the SP role-play ratings increased continuously over the years (see table 2 [Tab. 2]). The ANOVA showed that at least one increasement across the years was of statistical significance (F=9.2, p<0.001; η²=0.02). The appropriate Bonferroni post hoc tests revealed that the results from 2018-2021 were significantly better compared to 2016 (cf. table 2 [Tab. 2]).

**Table 2: Statistics related to the FAIR OSCE scores per examination year**

Discussion

In terms of our hypothesis, we were able to demonstrate that the quality of the SP role-portrayal increased continuously over the 5 years analyzed, significantly so when comparing data from 2021 versus 2016 and 2017, 2018 and 2017 versus 2016.

This means we were able to confirm the initial hypothesis of increasing quality.

The quality of the role portrayal not only improved continuously but also was at a very high level with no items on the FAIR OSCE instrument rated “do not agree” more often than in 5% of random observations in all four categories, implying role-portrayal was as intended by the case scripts. Both these results can be seen as indicators both the exam and SPs employed are of the highest quality. It is also an indicator that the FAIR OSCE instrument is able to measure the increasing quality of the exam even though the measurement already started at a very high level considering the starting point of the FLE in 2011 (when one would expect the steepest learning curve), adding to the instrument’s validity argument.

The findings are incidentally not a measure of the overall quality of the FLE but a mosaic, as other potential sources of variance like the examiners, checklists and further contextual factors were not scrutinized and remained beyond the scope of this study. A potential source of bias in the study at hand lies in how local SP educators rated the quality of their own SPs’ role-portrayal, who were trained by themselves or their colleagues. This however constitutes a systematic error that would apply to all sites. Inter-rater agreement using the FAIR OSCE has so far not been reported and was beyond the scope of this study but should be addressed in future studies. Parallel assessment with the authenticity items of the MaSP [21] could contribute more to the validity argument of the FAIR OSCE. While the study was conducted in only one country in the context of only one exam, the fact it was performed in the context of a highly-standardized high-stakes federal licensing exam performed at five sites in two languages gives it a certain merit.

Conclusions

SPs’ role-portrayal in a series of highly standardized high-stakes exams was shown to be of highest quality and increasing consecutively over all exam cohorts analyzed. Well-trained SPs observed with a valid instrument thus contribute to a standardized, high-quality clinical skills exam.

Authors’ ORCIDs

Kai P. Schnabel: [0000-0002-6977-2717]
Daniel Bauer: [0000-0002-3337-3327]
Tanja Hitzblech: [0000-0003-0876-0373]
Beate G. Brem: [0000-0002-0551-9587]

Acknowledgements

The authors would like to extend their gratitude towards all SP educators applying the instrument and collating the data during the exams that served as basis for the analyses.

Competing interests

The authors declare that they have no competing interests.

References

[1] Bradley P. The history of simulation in medical education and possible future directions. Med Educ. 2006;40(3):254-62. DOI: 10.1111/j.1365-2929.2006.02394.x
[2] Bauer D, Lahner FM, Huwendiek S, Schmitz FM, Guttormsen S. An overview of and approach to selecting appropriate patient representations in teaching and summative assessment in medical education. Swiss Med Wkly. 2020;150:w20382. DOI: 10.4414/smw.2020.20382
[3] Nestel D, Kneebone R. Perspective: authentic patient perspectives in simulations for procedural and surgical skills. Acad Med. 2010;85(5):889-893. DOI: 10.1097/ACM.0b013e3181d749ac
[4] Barrows HS, Abrahamson S. The Programmed Patient: A Technique for Appraising Student Performance in Clinical Neurology. J Med Educ. 1964;39:802-5.
[5] Barrows HS. Simulated patients in medical teaching. Can Med Assoc J. 1968;98(14):674-676.
[6] Nestel D, Bearman M. Simulated Patient Methodology. Malden (MA); Oxford (UK): West Sussex UK: John Wiley & Sons Ltd.; 2015. DOI: 10.1002/9781118760673
[7] Harden RM, Stevenson M, Downie WW, Wilson GM. Assessment of clinical competence using objective structured examination. Br Med J. 1975;1(5955):447-451. DOI: 10.1136/bmj.1.5955.447
[8] Smith CM, Gliva-Mcconvey G, Chapin A. Training SPs for Authentic Role Portrayal. In: Gliva-McConvey G, Nicholas CF, Clark L, editors. Comprehensive Healthcare Simulation: Implementing Best Practices in Standardized Patient Methodology Comprehensive Healthcare Simulation. Cham: Springer; 2020. DOI: 10.1007/978-3-030-43826-5
[9] Haladyna TM, Downing SM. Construct-Irrelevant Variance in High-Stakes Testing. Educ Measurement. 2004;23(1):17-27. DOI: 10.1111/j.1745-3992.2004.tb00149.x
[10] Smith CM, O`Byrne CO, Nestel D. Simulated patient methodology and assessment. In: Nestel D, Bearman M, editors. Simulated Patient Methodology. Malden (MA); Oxford (UK): West Sussex UK: John Wiley & Sons Ltd.; 2015. p.85-92. DOI: 10.1002/9781118760673.ch12
[11] Lewis KL, Bohnert CA, Gammon WL, Hölzer H, Lyman L, Smith C, Thompson TM, Wallace A, Gliva-Mcconvey G. The Association of Standardized Patient Educators (ASPE) Standards of Best Practice (SOBP). Adv Simul (Lond). 2017;2:10. DOI: 10.1186/s41077-017-0043-4
[12] Brem B, Steinweg B. Prüfungen mit SPs. In: Peters T, Thrien C, editor. Simulationspatienten Handbuch für die Aus- und Weiterbildung in medizinischen und Gesundheitsberufen. 1. Auflage. Göttingen: Hogrefe Verlag; 2018.
[13] Guttormsen S, Beyeler C, Bonvin R, Feller S, Schirlo C, Schnabel K, Schurter T, Berendonk C. The new licencing examination for human medicine: from concept to implementation. Swiss Med Wkly. 2013;143:w13897. DOI: 10.4414/smw.2013.13897
[14] Bundesamt für Gesundheit. Statistiken Ärztinnen/Ärzte. Bern: Bundesamt für Gesundheit; 2022. Zugänglich unter/available from: https://www.bag.admin.ch/dam/bag/de/dokumente/berufe-gesundheitswesen/medizinalberufe/statistiken/med/aerztinnen-aerzte-2021.pdf.download.pdf/Aerztinnen_und_Aerzte_2021_DE.pdf
[15] Brem B, Richter S, Schnabel KP. FAIR_OSCE - Feedbackstructure for assessment of interactive roleplay in Objective Structured Clinical Exams. In: 4th Swiss Conference on Standardized Patients and Simulation in Health Care. Bern, 10-12 September 2014. SC2-5. Zugänglich unter/available from: https://with-simulation.ch/programm-2014/
[16] Bachmann C, Kropf R, Biller S, Schnabel KP, Junod Perron N, Monti M, Berendonk C, Huwendiek S, Breckwoldt J. Development and national consensus finding on patient-centred high stakes communication skills assessments for the Swiss Federal Licensing Examination in Medicine. Patient Educ Couns. 2021;104(7):1765-1772. DOI: 10.1016/j.pec.2020.12.003
[17] Berendonk C, Schirlo C, Balestra G, Bonvin R, Feller S, Huber P, Junger E, Monti M, Schnabel K, Beyeler C, Guttormsen S, Huwendiek S. The new final Clinical Skills examination in human medicine in Switzerland: Essential steps of exam development, implementation and evaluation, and central insights from the perspective of the national Working Group. GMS Z Med Ausbild. 2015;32(4):Doc40. DOI: 10.3205/zma000982
[18] Schnabel KP, Lorwald AC, Beltraminelli H, Germano M, Brem BG, Wust S, Bauer D. Development and evaluation of three-dimensional transfers to depict skin conditions in simulation-based education. GMS J Med Educ. 2024;41(2):Doc14. DOI: 10.3205/zma001669
[19] Bauer D, Lorwald AC, Wust S, Beltraminelli H, Germano M, Michel A, Schnabel KP. Development, production and evaluation of 2-dimensional transfer tattoos to simulate skin conditions in health professions education. BMC Med Educ. 2021;21(1):350. DOI: 10.1186/s12909-021-02763-z
[20] Brem BG, Richter S, Schnabel KP. Quality Control for SP Performance Regarding Patient Portrayal in a High-Stakes OSCE. In: Association of Standardised Participants Educators (ASPE) annunal Conference; June 22rd – 25th, 2014; Indianapolis (IN); 2014. p.F-P24.
[21] Wind LA, Van Dalen J, Muijtjens AM, Rethans JJ. Assessing simulated patients in an educational setting: the MaSP (Maastricht Assessment of Simulated Patients). Med Educ. 2004;38(1):39-44. DOI: 10.1111/j.1365-2923.2004.01686.x

Attachments

Attachment 1	Comprehensive description of the clinical skills part of the Swiss Federal Licensing Exam in Medicine (Attachment_1.pdf, application/pdf, 134.9 KBytes)
Attachment 2	English version of the FAIR OSCE instrument in its current form (Attachment_2.pdf, application/pdf, 240.31 KBytes)

Get in touch.

GMS Journal for Medical Education

Simulated patients’ role-portrayal in the clinical skills part of the Swiss federal licensing exam is of high quality and improves further over time as measured with the FAIR OSCE instrument

Abstract

Keywords

Introduction

Methods

Setting

Procedure

Instrument

Data analysis

Ethics approval

Results

Discussion

Conclusions

Authors’ ORCIDs

Acknowledgements

Competing interests

References

Attachments

ZB MED is a member of DataCite

ZB MED advocates gender equality

Award for German Medical Science

ZB MED advocates Open Access