The future of AI clinicians: assessing the modern standard of chatbots and their approach to diagnostic uncertainty

Table 2 Comparison of response characteristics between GPT-4o and Claude-3

		GPT-4o	Claude-3	P Value
Characteristics
	Response time (sec), mean (95% CI)	12.4 (9.3–15.3)	24.0 (21.0–32.5)	< 0.01
	Response length (characters), mean (95% CI)	1,596 (1,395.0–1,705.0)	2,001 (1,845–2,212)	< 0.01
	Rationale for other answer options, N (%)	71 (78.9)	78 (86.7)	0.17
Reason for error, N (%)		n = 48	n = 38
	Logical error	30 (62.5)	17 (44.7)	0.02
	Statistical error	9 (18.8)	6 (15.8)	0.72
	Information error	9 (18.8)	15 (39.5)	0.04
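
The N (%) cells in Table 2 use different denominators: the "Rationale for other answer options" row is consistent with a denominator of 90 responses per model (71/90 = 78.9%), while the error rows are expressed over each model's own error total (n = 48 for GPT-4o, n = 38 for Claude-3). The following is a minimal sketch of that arithmetic, assuming percentages are rounded to one decimal place; the helper name `pct` is hypothetical and is not taken from the study.

```python
def pct(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * count / total, 1)


# Logical error row: counts over the per-model error totals (n = 48, n = 38).
gpt4o_logical = pct(30, 48)
claude3_logical = pct(17, 38)
print(gpt4o_logical, claude3_logical)  # prints: 62.5 44.7
```

The same helper can be applied to any other N (%) cell by passing that row's count and the matching column denominator.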

ISSN: 1472-6920