Crowd-Sourced Assessment of Technical Skills for Validation of Basic Laparoscopic Urologic Skills Tasks

Timothy M. Kowalewski, Bryan Comstock, Robert Sweet, Cory Schaffhausen, Ashleigh Menhadji, Timothy Averch, Geoffrey Box, Timothy Brand, Michael Ferrandino, Jihad Kaouk, Bodo Knudsen, Jaime Landman, Benjamin Lee, Bradley F. Schwartz, Elspeth McDougall, Thomas S. Lendvay

Research output: Contribution to journal › Article › peer-review

35 Scopus citations


Purpose: The BLUS (Basic Laparoscopic Urologic Skills) consortium sought to address the construct validity of BLUS tasks and the wider problem of accurate, scalable and affordable skill evaluation by investigating the concordance of 2 novel candidate methods with faculty panel scores, those of automated motion metrics and crowdsourcing.

Materials and Methods: A faculty panel of surgeons (5) and anonymous crowdworkers blindly reviewed a randomized sequence of a representative sample of 24 videos (12 pegboard and 12 suturing) extracted from the BLUS validation study (454) using the GOALS (Global Objective Assessment of Laparoscopic Skills) survey tool with appended pass-fail anchors via the same web based user interface. Pre-recorded motion metrics (tool path length, jerk cost etc) were available for each video. Cronbach's alpha, Pearson's R and ROC with AUC statistics were used to evaluate concordance between continuous scores, and as pass-fail criteria among the 3 groups of faculty, crowds and motion metrics.

Results: Crowdworkers provided 1,840 ratings in approximately 48 hours, 60 times faster than the faculty panel. The inter-rater reliability of mean expert and crowd ratings was good (α=0.826). Crowd score derived pass-fail resulted in 96.9% AUC (95% CI 90.3-100; positive predictive value 100%, negative predictive value 89%). Motion metrics and crowd scores provided similar or nearly identical concordance with faculty panel ratings and pass-fail decisions.

Conclusions: The concordance of crowdsourcing with faculty panels and speed of reviews is sufficiently high to merit its further investigation alongside automated motion metrics. The overall agreement among faculty, motion metrics and crowdworkers provides evidence in support of the construct validity for 2 of the 4 BLUS tasks.
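For readers unfamiliar with the three concordance statistics named in the methods (Cronbach's alpha, Pearson's R and ROC AUC), the following is a minimal sketch of how each can be computed for two rater groups scoring the same videos. It is not the authors' analysis code. The function names and all score values are hypothetical and chosen only for illustration, and the AUC uses the rank-based (Mann-Whitney) formulation, which is an assumption about the method rather than something the abstract states.

```python
from statistics import mean, variance


def cronbach_alpha(columns):
    """Cronbach's alpha for k score lists (one per rater group) over the same videos."""
    k = len(columns)
    n = len(columns[0])
    # Sum of the per-group sample variances.
    item_var = sum(variance(col) for col in columns)
    # Sample variance of the per-video total across groups.
    totals = [sum(col[i] for col in columns) for i in range(n)]
    return k / (k - 1) * (1 - item_var / variance(totals))


def pearson_r(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def auc(scores, labels):
    """ROC AUC as the share of (pass, fail) pairs ranked correctly; ties count half."""
    pos = [s for s, passed in zip(scores, labels) if passed]
    neg = [s for s, passed in zip(scores, labels) if not passed]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical mean scores for six videos; NOT data from the study.
faculty_scores = [10, 14, 18, 22, 12, 20]
crowd_scores = [11, 13, 19, 21, 14, 18]
# Hypothetical faculty pass-fail decisions for the same six videos.
faculty_pass = [False, False, True, True, False, True]

print("alpha:", cronbach_alpha([faculty_scores, crowd_scores]))
print("r:", pearson_r(faculty_scores, crowd_scores))
print("AUC:", auc(crowd_scores, faculty_pass))
```

In this sketch the faculty pass-fail decision is treated as the reference label and the crowd score as the continuous predictor, which mirrors the "crowd score derived pass-fail" comparison described in the results.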

Original language: English (US)
Pages (from-to): 1859-1865
Number of pages: 7
Journal: Journal of Urology
Issue number: 6
State: Published - Jun 1 2016


Keywords

  • clinical competence
  • crowdsourcing
  • laparoscopy
  • urologic surgical procedures
  • validation studies

ASJC Scopus subject areas

  • Urology

