Automated radiology report generation has the potential to improve patient care and reduce the workload of radiologists. However, the path toward real-world adoption has been stymied by the challenge of evaluating the clinical quality of artificial intelligence (AI)-generated reports. We build a state-of-the-art report generation system for chest radiographs, called Flamingo-CXR, and perform an expert evaluation of AI-generated reports by engaging a panel of board-certified radiologists. We observe a wide distribution of preferences across the panel and across clinical settings: 56.1% of Flamingo-CXR intensive care reports were judged preferable or equivalent to clinician reports by half or more of the panel, rising to 77.7% for in/outpatient X-rays overall and to 94% for the subset of cases with no pertinent abnormal findings. Errors were observed in both human-written and Flamingo-CXR reports: 24.8% of in/outpatient cases contained clinically significant errors in both report types, 22.8% in Flamingo-CXR reports only and 14.0% in human reports only. For reports that contain errors, we develop an assistive setting, a demonstration of clinician–AI collaboration for radiology report composition, indicating new possibilities for potential clinical utility.
Radiology plays an integral and increasingly important role in modern medicine by informing diagnosis, treatment and management of patients through medical imaging. However, the current global shortage of radiologists restricts access to expert care and causes heavy workloads for radiologists, resulting in undesirable delays and errors in clinical decisions [1,2]. In the past decade, we have witnessed the remarkable promise of AI algorithms as assistive technology for improving the access, efficiency and quality of radiological care, with more than 200 US Food and Drug Administration-approved commercial products developed by companies based in more than 20 countries [3] and approximately one in every three radiologists in the United States already benefiting from AI as part of their clinical workflow [4].