Can artificial intelligence pass the written European Board of Hand Surgery exam?

Posted on 30/05/2025

by Sinan Mert

Hand Surg Rehabil. 2025 May 28:102197. doi: 10.1016/j.hansur.2025.102197. Online ahead of print.

ABSTRACT

Artificial intelligence-based applications have emerged as transformative tools across numerous domains. Among these, ChatGPT has earned global recognition for its capacity for dynamic user interaction and holds significant potential in the medical sector. However, the subject-specific accuracy of ChatGPT remains a matter of debate. This study assesses the capabilities and knowledge of different artificial intelligence chatbots (ChatGPT, Google Gemini, and Claude) in the domain of hand surgery. Each chatbot completed a full written European Board of Hand Surgery (EBHS) exam. The test results were analyzed according to the EBHS guidelines, focusing on the total score and the ratio of correct to incorrect responses for each artificial intelligence model. Three out of the four chatbots achieved passing scores on the exam. Notably, ChatGPT-4o1 demonstrated significantly superior performance. This study highlights the subject-specific expertise of different artificial intelligence programs within the specialized field of hand surgery while also underscoring their variability and limitations.

PMID:40447102 | DOI:10.1016/j.hansur.2025.102197