Table 2 Accuracy of the AI models by question difficulty

From: The role of artificial intelligence in medical education: an evaluation of Large Language Models (LLMs) on the Turkish Medical Specialty Training Entrance Exam

| Question difficulty (1–5) | Number of questions | ChatGPT 4 correct | Llama 3 70B correct | Gemini 1.5 Pro correct | Command R+ correct |
|---|---|---|---|---|---|
| 1 | 15 | 15 | 12 | 11 | 5 |
| 2 | 58 | 55 | 54 | 54 | 37 |
| 3 | 78 | 71 | 65 | 59 | 36 |
| 4 | 54 | 47 | 39 | 39 | 27 |
| 5 | 35 | 25 | 20 | 24 | 15 |
| Total | 240 | 213 | 190 | 187 | 120 |
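
The table reports raw counts of correct answers. To compare models across difficulty levels, it can help to convert those counts into percentages. The following is a minimal sketch of that arithmetic, using the counts transcribed from the table above. The names `QUESTIONS`, `CORRECT`, `accuracy_by_difficulty`, and `overall_accuracy` are illustrative and do not come from the paper.

```python
# Counts transcribed from Table 2. Keys are difficulty levels (1-5).
QUESTIONS = {1: 15, 2: 58, 3: 78, 4: 54, 5: 35}

# Number of correct answers per model and difficulty level.
CORRECT = {
    "ChatGPT 4": {1: 15, 2: 55, 3: 71, 4: 47, 5: 25},
    "Llama 3 70B": {1: 12, 2: 54, 3: 65, 4: 39, 5: 20},
    "Gemini 1.5 Pro": {1: 11, 2: 54, 3: 59, 4: 39, 5: 24},
    "Command R+": {1: 5, 2: 37, 3: 36, 4: 27, 5: 15},
}


def accuracy_by_difficulty(correct, questions):
    """Return the percentage of correct answers at each difficulty level."""
    return {
        level: 100.0 * correct[level] / questions[level]
        for level in questions
    }


def overall_accuracy(correct, questions):
    """Return the percentage of correct answers over all questions."""
    return 100.0 * sum(correct.values()) / sum(questions.values())


if __name__ == "__main__":
    for model, counts in CORRECT.items():
        per_level = accuracy_by_difficulty(counts, QUESTIONS)
        levels = ", ".join(
            f"{level}: {pct:.1f}%" for level, pct in per_level.items()
        )
        total = overall_accuracy(counts, QUESTIONS)
        print(f"{model}: {levels} | overall {total:.1f}%")
```

Because the number of questions differs by difficulty level (from 15 to 78), the per-level percentages rest on samples of different sizes and should be read with that in mind.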