RESEARCH LIBRARY

View the latest publications from members of the NBME research team


Does Incorporating a Measure of Clinical Workload Improve Workplace-Based Assessment Scores? Insights for Measurement Precision and Longitudinal Score Growth From Ten Pediatrics Residency Programs

Posted: October 30, 2018 | Y.S. Park, P.J. Hicks, C. Carraccio, M. Margolis, A. Schwartz

Academic Medicine: November 2018 - Volume 93 - Issue 11S - p S21-S29

This study investigates the impact of incorporating observer-reported workload into workplace-based assessment (WBA) scores on (1) psychometric characteristics of WBA scores and (2) measuring changes in performance over time using workload-unadjusted versus workload-adjusted scores.

Category: Assessment-Oriented Research, Scoring

Commentary: On the Importance of the Speed-Ability Trade-Off When Dealing with Not Reached Items

Posted: October 30, 2018 | S. Pohl, M. von Davier

Front. Psychol. 9:1988

In their 2018 article, Tijmstra and Bolsinova (T&B) discuss how to deal with not reached items due to low working speed in ability tests (Tijmstra and Bolsinova, 2018). An important contribution of the paper is focusing on the question of how to define the targeted ability measure. This note aims to add further aspects to this discussion and to propose alternative approaches.

Category: Assessment-Oriented Research, Reliability/Validity, General Measurement

The Optimal Number of Options for Multiple-Choice Questions on High-Stakes Tests: Application of a Revised Index for Detecting Nonfunctional Distractors

Posted: October 25, 2018 | M.R. Raymond, C. Stevens, S.D. Bucak

Adv in Health Sci Educ 24, 141–150 (2019)

Research suggests that the three-option format is optimal for multiple choice questions (MCQs). This conclusion is supported by numerous studies showing that most distractors (i.e., incorrect answers) are selected by so few examinees that they are essentially nonfunctional. However, nearly all studies have defined a distractor as nonfunctional if it is selected by fewer than 5% of examinees.

Category: Assessment-Oriented Research, General Measurement

Evaluation of a New Method for Providing Full Review Opportunities in Computerized Adaptive Testing — Computerized Adaptive Testing with Salt

Posted: October 1, 2018 | Z. Cui, C. Liu, Y. He, H. Chen

Journal of Educational Measurement: Volume 55, Issue 4, Pages 582-594

This article proposes and evaluates a new method that implements computerized adaptive testing (CAT) without any restriction on item review. In particular, it evaluates the new method in terms of the accuracy of ability estimates and the robustness against test‐manipulation strategies. This study shows that the newly proposed method is promising in a win‐win situation: examinees have full freedom to review and change answers, and the impacts of test‐manipulation strategies are undermined.

Category: Assessment-Oriented Research, General Measurement, Applications of Technology

Palliative Care Competencies and Readiness for Independent Practice: A Report on the American Academy of Hospice and Palliative Medicine Review of the U.S. Medical Licensing Step Examinations

Posted: September 1, 2018 | E. C. Carey, M. Paniagua, L. J. Morrison, S. K. Levine, J. C. Klick, G. T. Buckholz, J. Rotella, J. Bruno, S. Liao, R. M. Arnold

Journal of Pain and Symptom Management: Volume 56, Issue 3, p371-378

This article reviews the USMLE step examinations to determine whether they test the palliative care (PC) knowledge necessary for graduating medical students and residents applying for licensure.

Category: Assessment-Oriented Research, Reliability/Validity, Product-Oriented Research, USMLE, Health Professions

Crowdsourcing for Assessment Items to Support Adaptive Learning

Posted: August 10, 2018 | S. Tackett, M. Raymond, R. Desai, S. A. Haist, A. Morales, S. Gaglani, S. G. Clyman

Medical Teacher: Volume 40 - Issue 8 - p 838-841

Adaptive learning requires frequent and valid assessments for learners to track progress against their goals. This study determined if multiple-choice questions (MCQs) “crowdsourced” from medical learners could meet the standards of many large-scale testing programs.

Category: Assessment-Oriented Research, Applications of Technology

ALS Specific Quality of Life Short Form (ALSSQOL-SF): A Brief, Reliable and Valid Version of the ALSSQOL-R

Posted: July 20, 2018 | S. H. Felgoise, R. A. Feinberg, H. B. Stephens, P. Barkhaus, K. Boylan, J. Caress, Z. Simmons

Muscle Nerve, 58: 646-654

The Amyotrophic Lateral Sclerosis (ALS)‐Specific Quality of Life instrument and its revised version (ALSSQOL and ALSSQOL‐R) have strong psychometric properties, and have demonstrated research and clinical utility. This study aimed to develop a short form (ALSSQOL‐SF) suitable for limited clinic time and patient stamina.

Category: Assessment-Oriented Research, General Measurement, Health Professions

Providing Utility, Not Scores: Visualizations to Support Subscore Inferences

Posted: June 26, 2018 | R. A. Feinberg, D. P. Jurich

Educational Measurement: Issues and Practice, 37: 5-8

This article spotlights the winners of the 2018 EM:IP Cover Graphic/Data Visualization Competition.

Category: Assessment-Oriented Research, Scoring

Trusting Your Test Results: Building and Revising Multiple-Choice Examinations

Posted: June 1, 2018 | D. Franzen, M. Cuddy, J. S. Ilgen

Journal of Graduate Medical Education: June 2018, Vol. 10, No. 3, pp. 337-338

To create examinations with scores that accurately support their intended interpretation and use in a particular setting, examination writers must clearly define what the test is intended to measure (the construct). Writers must also pay careful attention to how content is sampled, how questions are constructed, and how questions perform in their unique testing contexts. This Rip Out provides guidance for test developers to ensure that scores from MCQ examinations fit their intended purpose.

Category: Assessment-Oriented Research, General Measurement

A Comparison of Experimental and Observational Approaches to Assessing the Effects of Time Constraints in a Medical Licensing Examination

Posted: June 1, 2018 | P. Harik, B. E. Clauser, I. Grabovsky, P. Baldwin, M. Margolis, D. Bucak, M. Jodoin, W. Walsh, S. Haist

Journal of Educational Measurement: Volume 55, Issue 2, Pages 308-327

The widespread move to computerized test delivery has led to the development of new approaches to evaluating how examinees use testing time and to new metrics designed to provide evidence about the extent to which time limits impact performance. Much of the existing research is based on these types of observational metrics; relatively few studies use randomized experiments to evaluate the impact of time limits on scores. Of those studies that do report on randomized experiments, none directly compare the experimental results to evidence from observational metrics to evaluate the extent to which these metrics are able to sensitively identify conditions in which time constraints actually impact scores. The present study provides such evidence based on data from a medical licensing examination.

Category: Assessment-Oriented Research, Reliability/Validity, Scoring, Product-Oriented Research, USMLE
