Since the release of the artificial intelligence (AI) chatbot ChatGPT in November 2022, it seems that the ‘rise of the machines’ is becoming a reality, with the machine-learning software demonstrating wide-ranging abilities, often with uncanny accuracy.
Now a recent report from the Netherlands, based on 30 Emergency Department (ED) cases from 2022, has highlighted how AI has been able to diagnose some medical conditions as well as, if not better than, some doctors, leading some to comment that AI ‘could revolutionise the medical field’.
So how has this been achieved, and could it indeed pose an existential threat to emergency doctors?
The 30 cases, all treated in a Dutch ED, were examined by feeding information including patient histories, clinicians’ observations and laboratory tests into ChatGPT, which was then asked to suggest five possible diagnoses. The AI chatbot was reported to have included the correct diagnosis in 97% of the cases, compared with the doctors’ success rate of 87%.
The conclusion was that AI is able to make a successful medical diagnosis in much the same way as any ED clinician would. However, the study authors were at pains to stress that the idea of computers running an ED is not yet on the table; rather, AI could potentially take a supporting role to under-pressure medics.
At the present time, it must be pointed out that, since the AI was given information already gleaned by physicians, it is hardly surprising that its conclusions were as good as those of clinicians. It would probably be a different story altogether if the AI first had to formulate a series of questions in order to gather enough information to make an independent diagnosis.
In this respect, it can be argued that the technology is not that much further forward than the work of F. T. de Dombal in the 1980s, when studies were performed into the computer-aided diagnosis of acute abdominal pain. It was shown that, by filling out a proforma, both clinicians and computer were able to improve the accuracy of their diagnosis of a single condition – in this case appendicitis. The computer’s accuracy was reported to be 91.8% against a senior clinician’s 79.6%, leading to the conclusion that computer-aided diagnostics could potentially be of practical value in a small percentage of cases. It was also reported, though, that when there were other factors and conditions to consider, the computer’s accuracy fell.
Fast-forward to 2022, and the Dutch study also reported that the AI had ‘encountered limitations’. These related to ‘medically implausible or inconsistent’ reasoning, which could, in a real clinical situation, lead to significant or even disastrous consequences, such as misinformation or misdiagnosis. The ChatGPT bot failed to provide a correct diagnosis in 5 of the 30 cases, including one involving a life-threatening abdominal aortic aneurysm. Its other mistakes included diagnosing anaemia (low haemoglobin) in a patient whose haemoglobin level was in fact normal.
The Dutch researchers pointed out that the small sample size of their 2022 study, together with the fact that the cases examined were relatively simple, single-issue complaints, means it is probably too early to conclude that ChatGPT is ready to perform a full role in medical diagnostics; the bot’s effectiveness in more complex cases has yet to be tested.
In conclusion, then, we must assume that at this stage the AI chatbot can be useful in supporting diagnosis – for example, by making suggestions that a clinician hasn’t considered. However, great care must be taken when it comes to the sharing and protection of sensitive medical data.