Dealing with uncertainty is central to medicine – medical practitioners are trained in differential diagnosis and how to manage myriad possibilities. For AI systems in medical imaging, this represents both an opportunity and a challenge. On the one hand, AI systems excel at precisely quantifying statistical probabilities, a task at which humans are notoriously bad. On the other hand, displaying the magnitude and sources of uncertainty to the user in an intelligible manner is a user-interface design challenge that should not be underestimated.
Fundamentally, when determining whether a finding is present or absent, AI systems produce a continuous probability score, where a higher score indicates that the finding is more likely to be present. To convert this into a useful prediction, most systems define a threshold above which the finding is taken to be ‘present’. AI systems can convey their degree of confidence by displaying the score alongside the binary call. Cases that barely exceed the threshold are less likely to be truly positive than those with higher scores, which should help users decide whether to trust the AI system’s prediction.
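The score-plus-threshold scheme described above can be sketched in a few lines. This is a minimal illustration, not any vendor's implementation; the 0.5 threshold and the example scores are assumptions for demonstration only (real systems tune the operating point per finding).

```python
# Hypothetical sketch: converting a continuous AI probability score into a
# present/absent call while still surfacing the raw score to the user.
THRESHOLD = 0.5  # assumed operating point; chosen per finding in practice

def classify(score: float, threshold: float = THRESHOLD) -> str:
    """Return the binary call plus the raw score so users can judge confidence."""
    label = "present" if score >= threshold else "absent"
    return f"Finding {label} (score = {score:.2f})"

print(classify(0.52))  # barely exceeds the threshold: interpret with caution
print(classify(0.97))  # well above the threshold: higher confidence
```

Showing the score next to the label is the point of the design: two cases can receive the same ‘present’ call while warranting very different levels of trust.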
A key design feature of many AI systems is that they are ensembles: the final prediction is tallied from the votes of several independently trained models. This reflects the stochastic nature of deep learning, as retraining a model on a slightly different dataset, or even on the same dataset, can yield different predictions for the same image. This source of uncertainty is analogous to inter- and intra-observer variability in clinical practice, whereby giving the same image to different readers can result in different reports. AI systems can capture this information by reporting not just a single numerical score for a case but also the uncertainty range around that score.
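A rough sketch of how an ensemble turns individual model votes into a score with an uncertainty range, assuming a hypothetical ensemble of five models (the per-model scores below are invented for illustration):

```python
from statistics import mean, stdev

# Assumed outputs of five independently trained models on the same image.
model_scores = [0.62, 0.58, 0.71, 0.66, 0.60]

ensemble_score = mean(model_scores)                   # the tallied prediction
uncertainty = stdev(model_scores)                     # disagreement between models
score_range = (min(model_scores), max(model_scores))  # displayable range

print(f"score = {ensemble_score:.2f}, "
      f"range = {score_range[0]:.2f}-{score_range[1]:.2f}")
```

Here the spread of the individual scores plays the role of inter-observer variability: the wider the range, the less the models within the ensemble agree.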
Doing so helps users troubleshoot predictions with which they may disagree. A score near the threshold with a narrow uncertainty range suggests the case is inherently ambiguous, even though the AI system has been trained on many similar images. By contrast, a wide uncertainty range – even with a high score – suggests the AI system may not have seen enough similar cases, leading to disagreement among the models within the ensemble.
Managing and displaying confidence and uncertainty correctly can deliver significant benefits: it helps users decide when to trust an AI prediction and understand why a particular case may be problematic for the AI, ultimately improving their diagnostic accuracy. However, this challenge is overlooked by many current AI vendors and poorly understood by many end users.
As this emerging field matures, decision-makers must consider how AI systems handle uncertainty, and proactively educate users on the topic, to realise the full potential of AI assistance and obtain the maximum improvement in diagnostic performance.
About Dr Jarrel Seah | Associate Clinical AI Director
Dr. Jarrel Seah is a radiology registrar at the Alfred Hospital, Melbourne, and an AI engineer at usa.annalise.ai. He has a keen interest in developing and applying novel deep learning algorithms in radiology, particularly in explainable AI. He has published and presented at both medical and technical venues such as SPIE and Radiology, and is a coordinator of the inaugural RANZCR Catheter and Line Position Kaggle challenge.
Named in Forbes 30 Under 30, he co-developed Eyenaemia in 2014 while studying medicine at Monash University, an app that allows people to screen for anaemia with their mobile phone, which won the World Championship and World Citizenship competitions at the Imagine Cup, an annual competition run by Microsoft.
Jarrel provides clinical and technical guidance in the development and validation of AI models at usa.annalise.ai and Harrison.ai, and undertakes cutting-edge research within the field of deep learning in radiology.