Update on ChatGPT and Radiology Readings

ChatGPT-4 Vision and Radiology Exam Questions

ChatGPT-4 Vision is an enhanced version of OpenAI’s GPT-4 that can interpret both text and images. This multimodal capability allows it to analyze visual content, such as photos, diagrams, and medical images, in addition to understanding text.

Applications of ChatGPT-4 Vision include assisting with medical imaging analysis, enhancing accessibility by describing images, extracting data from visual documents, and supporting creative tasks. However, it has limitations, such as occasional inaccuracies when interpreting complex images, especially in specialized fields like radiology.

A study published in Radiology evaluated the performance of ChatGPT-4 Vision, the first version of the model able to accept images as input, on radiology exam questions. It found that while the model performed well on text-based questions, it struggled with questions that included images.

The study, led by Dr. Chad Klochko, used 377 retired questions from the American College of Radiology’s Diagnostic Radiology In-Training Examinations. The model answered 65.3% of all questions correctly, achieving 81.5% accuracy on text-only questions but only 47.8% on questions with images. The model performed best on image-based questions in chest and genitourinary radiology, and worst in nuclear medicine.


The study also explored different prompting techniques and found that the model declined to answer 120 questions, most of them image-based. It also produced hallucinatory responses when interpreting images, at times offering incorrect image interpretations that nonetheless led to correct diagnoses. Dr. Klochko emphasized the need for more specialized evaluation methods, noting that the model's current limitations in accurately interpreting radiologic images restrict its applicability in clinical settings.
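
To make the setup concrete, here is a minimal Python sketch of posing an image-based, multiple-choice question to a vision-capable model through the OpenAI chat completions API. The model name, prompt wording, and file name are illustrative assumptions and are not taken from the study.

```python
# Minimal sketch: one image-based multiple-choice question sent to a
# vision-capable model. Model name, prompt wording, and file name are
# illustrative assumptions, not the study's protocol.
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def ask_image_question(image_path: str, question: str, choices: list[str]) -> str:
    """Return the model's answer to an exam-style question about an image."""
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")

    prompt = (
        question
        + "\n"
        + "\n".join(f"{letter}. {text}" for letter, text in zip("ABCD", choices))
        + "\nAnswer with a single letter."
    )

    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder for a vision-capable model
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
            ],
        }],
    )
    return response.choices[0].message.content


# Hypothetical usage:
# ask_image_question("chest_film.png", "What is the most likely diagnosis?",
#                    ["Pneumothorax", "Pleural effusion", "Lobar pneumonia", "Normal"])
```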


ChatGPT-4 for Summarizing Radiology Reports for Pancreatic Ductal Adenocarcinoma

A study published in Radiology found that ChatGPT-4 outperforms GPT-3.5 in creating structured, summarized radiology reports for pancreatic ductal adenocarcinoma (PDAC), potentially improving surgical decision-making. Led by Dr. Rajesh Bhayana from the University of Toronto, the study demonstrated that GPT-4 generated near-perfect PDAC synoptic reports and, using a chain-of-thought prompting strategy, achieved high accuracy in categorizing resectability.
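
As an illustration of the chain-of-thought idea only (the wording below is ours, not the prompt used by Bhayana et al.), such a prompt can ask the model to work through vascular involvement and metastases before committing to a resectability category:

```python
# Sketch of a chain-of-thought style prompt for resectability categorization.
# Prompt wording, model name, and the simplified category list are assumptions
# for illustration; they do not reproduce the study's prompts.
from openai import OpenAI

client = OpenAI()

COT_PROMPT = """You are summarizing a pancreatic cancer (PDAC) staging CT report.
Reason step by step:
1. Note arterial involvement (celiac axis, SMA, common hepatic artery) and degree of contact.
2. Note venous involvement (SMV, portal vein) and whether reconstruction is feasible.
3. Note any distant metastases.
Then state a single category: resectable, borderline resectable, or unresectable.

Report:
{report}
"""


def categorize_resectability(report_text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[{"role": "user", "content": COT_PROMPT.format(report=report_text)}],
        temperature=0,  # keep the categorization deterministic
    )
    return response.choices[0].message.content
```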


The study included 180 PDAC staging CT reports from Princess Margaret Cancer Centre in 2018. Two radiologists set a reference standard for 14 key features and the National Comprehensive Cancer Network (NCCN) resectability category. ChatGPT-4 was found to have equal or higher F1 scores than GPT-3.5 for all features and outperformed GPT-3.5 in categorizing resectability. Surgeons using AI-generated reports reduced their review time by 58%.
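
The evaluation idea itself is simple to sketch: compare the features the model extracts from each report with the radiologists' reference standard and compute a per-feature F1 score. The feature names and labels below are invented for illustration; they are not the study's data.

```python
# Toy evaluation sketch: per-feature F1 of model-extracted findings versus a
# radiologist reference standard. All values below are invented examples.
from sklearn.metrics import f1_score

# 1 = feature reported present, 0 = absent; one entry per report
reference = {"arterial_involvement": [1, 0, 1, 1, 0],
             "venous_involvement":   [0, 0, 1, 1, 1]}
model_out = {"arterial_involvement": [1, 0, 1, 0, 0],
             "venous_involvement":   [0, 1, 1, 1, 1]}

for feature, truth in reference.items():
    print(f"{feature}: F1 = {f1_score(truth, model_out[feature]):.2f}")
```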


The findings suggest that ChatGPT-4 can improve standardization, communication, and efficiency in pancreatic cancer care. However, Paul Chang, MD, from the University of Chicago, emphasized the need to integrate these AI capabilities into scalable and comprehensive workflows, acknowledging the gap between feasibility and operational solutions.


Sources:

auntminnie.com
medicalexpress.com
openai.com


A Look at 2023 and ChatGPT in Radiology

ChatGPT has quickly moved beyond its niche beginnings and become an integral part of everyday life. Its reach extends well past casual conversation, now penetrating various industries, notably the intricate world of radiology. As we close out 2023, we take a look at some headlines that show how far ChatGPT has advanced in the realm of diagnostic imaging.

Smart Enough to Pass Exam Questions

In two recent studies published in Radiology, researchers evaluated ChatGPT’s performance in answering radiology board exam questions. While the AI showed potential, it also demonstrated limitations affecting its reliability. ChatGPT, based on GPT-3.5, answered 69% of questions correctly, struggling more with higher-order thinking questions due to its lack of radiology-specific training.

A subsequent study with GPT-4 showcased improvement, answering 81% correctly and excelling in higher-order thinking questions. However, it still faced reliability concerns, answering some questions incorrectly and exhibiting occasional inaccuracies termed “hallucinations.”

In both studies, ChatGPT used consistently confident language, even in incorrect responses, a particular risk for novices who might not recognize the inaccuracies.


Decision Making in Cancer Screening: Bard vs. ChatGPT

A study recently published in American Radiology compared ChatGPT-4 and Bard, two large language models, on their ability to aid radiology decision-making for breast, ovarian, colorectal, and lung cancer screening. The researchers tested various prompts and found that both models performed well overall. ChatGPT-4 showed higher accuracy in certain scenarios, especially ovarian cancer screening, while Bard performed better with specific prompts for breast and colorectal cancer. Open-ended prompts improved both models' performance, suggesting potential use in unique clinical scenarios. The study acknowledged limitations, including scoring subjectivity, a limited number of scorers, and its focus on specific cancer screenings based on ACR guidelines.
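
As a rough illustration of the difference between an open-ended prompt and a more directed one (the wording here is ours, not the study's), the two styles might look like this:

```python
# Illustration only: open-ended versus directed prompt variants for a screening
# question. Scenario text, prompt wording, and model name are our assumptions.
from openai import OpenAI

client = OpenAI()
SCENARIO = "45-year-old woman, average risk, no family history of breast cancer."

prompts = {
    "open_ended": f"{SCENARIO}\nWhat cancer screening, if any, would you recommend, and why?",
    "directed": f"{SCENARIO}\nPer ACR guidance, is screening mammography indicated this year? "
                "Answer yes or no, then justify briefly.",
}

for style, prompt in prompts.items():
    reply = client.chat.completions.create(
        model="gpt-4o",  # placeholder
        messages=[{"role": "user", "content": prompt}],
    )
    print(style, "->", reply.choices[0].message.content)
```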

Can AI assist in diagnostic imaging?

Simplifying Readability of Reports

A study published in European Radiology explored using ChatGPT and similar large language models to simplify radiology reports so that patients can understand them more easily. Researchers had ChatGPT translate complex reports into simpler language, and fifteen radiologists then evaluated the simplified versions, finding them generally accurate and complete but also identifying factual errors and potentially misleading information in a significant portion of them. Despite these issues, the study highlights the potential of large language models to enhance patient-centered care in radiology and other medical fields, while emphasizing the need for further adaptation and oversight to ensure accuracy and patient safety.
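
A minimal sketch of the simplification step, assuming an OpenAI-style chat API; the system prompt, reading-level target, and model name are illustrative and not the prompt used in the study:

```python
# Sketch: rewrite a radiology impression in plain language. System prompt and
# model name are illustrative assumptions, not the study's actual prompt.
from openai import OpenAI

client = OpenAI()


def simplify_report(impression: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder
        messages=[
            {"role": "system",
             "content": "Rewrite radiology findings in plain language a patient can "
                        "understand, at roughly an eighth-grade reading level. "
                        "Do not add, remove, or reinterpret findings."},
            {"role": "user", "content": impression},
        ],
        temperature=0,
    )
    return response.choices[0].message.content


# Hypothetical impression:
# simplify_report("1. No acute intracranial hemorrhage. "
#                 "2. Chronic microvascular ischemic changes.")
```

As the study's radiologist reviewers found, output like this still needs expert review before it reaches patients.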


Sources:

rsna.org
diagnosticimaging.com
radiologybusiness.com
openai.com


ChatGPT in Radiology: Is it a Pro or Con?

The emergence of ChatGPT in the medical field, particularly in radiology, has generated a mix of excitement and concern about its role. But is it accurate enough to put into use? Can we trust artificial intelligence (AI) with the health of our patients?

How Could ChatGPT be Used?

An article in Diagnostic and Interventional Imaging discusses various ways in which radiologists can leverage ChatGPT. For clinical radiologists, it highlights applications such as deploying ChatGPT as a chatbot for patient inquiries, supporting clinical decision-making with information and analysis assistance, and enhancing patient communication and follow-up care by simplifying radiology reports and crafting tailored recommendations. Academic radiologists can benefit from suggestions for impactful research article titles, help structuring and formatting academic papers, and assistance formatting citations for bibliographies. The article emphasizes that the best use of ChatGPT in radiology depends on individual needs and goals, potentially paving the way for a more intelligent future in the field, while noting that its answers must be fact-checked and its output reviewed to ensure accuracy and relevance.
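
To make the patient-inquiry chatbot idea concrete, here is a minimal sketch assuming an OpenAI-style chat API; the system prompt, scope restrictions, and model name are our own assumptions rather than anything prescribed in the article:

```python
# Sketch of a narrowly scoped patient-inquiry chatbot. The scope limits in the
# system prompt and the model name are illustrative assumptions.
from openai import OpenAI

client = OpenAI()
history = [{"role": "system",
            "content": "You answer general questions about imaging exams (preparation, "
                       "duration, what to expect). You do not give diagnoses or "
                       "interpret results; refer clinical questions to the care team."}]

while True:
    question = input("Patient: ").strip()
    if not question:
        break
    history.append({"role": "user", "content": question})
    reply = client.chat.completions.create(model="gpt-4o", messages=history)  # placeholder model
    answer = reply.choices[0].message.content
    history.append({"role": "assistant", "content": answer})
    print("Bot:", answer)
```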

What Radiologists Have to Say

In RSNA’s article, The Good, the Bad and the Ugly of Using ChatGPT, various radiologists give their opinions on the use of this AI. Dr. Som Biswas, who published an article in Radiology entirely written by ChatGPT, believes that its potential benefits in reducing the workload and improving efficiency in radiology outweigh its limitations, which could be especially valuable in addressing the growing demand for medical imaging and reports in the face of a radiologist shortage.

Yiqiu Shen, MS, a researcher at New York University’s Center for Data Science, remarked, “In general, it’s ok to use ChatGPT as a language aid or to provide a template, but it’s dangerous to rely on ChatGPT to make a clinical decision.”


Urologic Imaging and AI: A Study

A study published in Current Problems in Diagnostic Radiology compared OpenAI's ChatGPT and Google Bard in suggesting appropriate urologic imaging methods based on American College of Radiology (ACR) criteria. Both chatbots selected an appropriate imaging modality in more than 60% of scenarios, with no significant difference between them. However, the researchers noted that neither chatbot was consistently accurate and that further development is needed before clinical implementation. Even so, the chatbots show promise in helping healthcare providers determine the best imaging modality, potentially improving clinical workflows in the future. ChatGPT gave shorter responses and had a slightly longer response time than Bard, which was faster but struggled to determine the appropriate modality in a few scenarios.
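
To illustrate how such a comparison can be scored, here is a toy sketch that checks whether a chatbot's suggestion matches an ACR-preferred modality for each scenario; the scenarios, reference answers, and string-matching rule are invented for illustration and do not reproduce the study's methodology.

```python
# Toy scoring sketch: does each chatbot answer contain the ACR-preferred
# modality? Scenarios and reference answers below are invented examples.
scenarios = {
    "Painless gross hematuria, age 60": "CT urography",
    "Suspected renal colic, first episode": "CT abdomen and pelvis without contrast",
}


def appropriate_rate(chatbot_answers: dict[str, str]) -> float:
    """Fraction of scenarios where the answer mentions the preferred modality."""
    hits = sum(
        preferred.lower() in chatbot_answers.get(case, "").lower()
        for case, preferred in scenarios.items()
    )
    return hits / len(scenarios)


# Hypothetical chatbot outputs:
answers = {
    "Painless gross hematuria, age 60": "I would suggest CT urography to evaluate ...",
    "Suspected renal colic, first episode": "Ultrasound of the kidneys and bladder ...",
}
print(f"Appropriate modality rate: {appropriate_rate(answers):.0%}")
```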


Vesta: A Tech-Forward Company

Vesta Teleradiology looks forward to a future that integrates AI with medicine. Click here to read more about how Vesta Teleradiology partners with MIT for AI research.


Sources:

radiologybusiness.com
rsna.org
auntminnie.com
openai.com