
Table 1 Performance of GPT-4o, GPT-4, GPT-3.5 and Google Bard in USMLE, PLAB, HKMLE and NMLE.

From: Performance of ChatGPT and Bard on the medical licensing examinations varies across different cultures: a comparison study

  

| Examination (no. of questions) | GPT-4o (n/N, %) | GPT-4 (n/N, %) | GPT-3.5 (n/N, %) | Google Bard (n/N, %) |
| --- | --- | --- | --- | --- |
| Overall | 538/592 (90.9%) | 515/591 (87.1%) | 364/542 (67.2%) | 314/516 (60.9%) |
| USMLE Step 1 (119) | 108/118 (91.5%) | 109/117 (93.2%) | 61/93 (65.6%) | 73/114 (64.3%) |
| USMLE Step 2CK (120) | 113/120 (94.2%) | 114/120 (95.0%) | 78/109 (71.6%) | 50/90 (55.6%) |
| USMLE Step 3 (137) | 127/137 (92.7%) | 126/137 (92.0%) | 85/124 (68.5%) | 61/105 (58.1%) |
| PLAB (30) | 28/30 (93.3%) | 26/30 (86.7%) | 24/30 (80.0%) | 13/24 (54.2%) |
| HKMLE (48) | 44/48 (91.7%) | 43/48 (89.6%) | 32/47 (68.1%) | 33/46 (71.7%) |
| NMLE (139) | 118/139 (84.9%) | 97/139 (69.8%) | 84/139 (60.4%) | 84/137 (61.3%) |

USMLE = United States Medical Licensing Examination; PLAB = Professional and Linguistic Assessments Board; HKMLE = Hong Kong Medical Licensing Examination; NMLE = National Medical Licensing Examination.