Update on ChatGPT and Radiology Readings

ChatGPT-4 Vision and Radiology Exam Questions

ChatGPT-4 Vision is an enhanced version of OpenAI’s GPT-4 that can interpret both text and images. This multimodal capability allows it to analyze visual content, such as photos, diagrams, and medical images, in addition to understanding text.
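
As a rough illustration of what multimodal input looks like in practice, the sketch below sends a text question together with an image URL through OpenAI’s chat completions API. The model name, image URL, and prompt are placeholders for illustration, not details drawn from the studies discussed in this article.

```python
# Minimal sketch: sending text plus an image to a vision-capable GPT-4 model.
# Assumes the openai Python SDK (v1.x) is installed and OPENAI_API_KEY is set;
# "gpt-4o" is a placeholder for whichever vision-capable model is available.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Describe the key findings in this image."},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/sample-image.png"}},  # placeholder URL
            ],
        }
    ],
    max_tokens=300,
)

print(response.choices[0].message.content)
```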

Applications of ChatGPT-4 Vision include assisting with medical imaging analysis, enhancing accessibility by describing images, extracting data from visual documents, and supporting creative tasks. However, it has limitations, such as occasional inaccuracies when interpreting complex images, especially in specialized fields like radiology.

A study published in Radiology evaluated the performance of ChatGPT-4 Vision, the first version of the model capable of interpreting both text and images, on radiology exam questions. It found that while the model performed well on text-based questions, it struggled with questions that included images.

The study, led by Dr. Chad Klochko, used 377 retired questions from the American College of Radiology’s Diagnostic Radiology In-Training Examinations. The model answered 65.3% of all questions correctly, achieving 81.5% accuracy on text-only questions but only 47.8% on questions with images. The model performed best on image-based questions in chest and genitourinary radiology, and worst in nuclear medicine.
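
The article does not state how many of the 377 questions included images, but the split implied by the reported accuracies can be back-calculated with simple arithmetic, as the short sketch below shows. The resulting counts are estimates derived from the rounded percentages, not figures reported by the study.

```python
# Back-calculate the implied text/image question split from the reported accuracies.
# Only the percentages quoted above are used; the counts below are estimates.
total = 377
overall_acc = 0.653
text_acc = 0.815
image_acc = 0.478

# Let f be the fraction of image-based questions:
#   text_acc * (1 - f) + image_acc * f = overall_acc
f_image = (text_acc - overall_acc) / (text_acc - image_acc)

image_questions = round(total * f_image)
text_questions = total - image_questions

# With the rounded percentages above this prints roughly 196 text-only
# and 181 image-based questions.
print(f"Implied split: ~{text_questions} text-only vs ~{image_questions} image-based")
```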

The study also explored different prompting techniques and found that the model declined to answer 120 questions, primarily image-based ones, and that it produced hallucinatory responses when interpreting images, in some cases arriving at a correct diagnosis despite an incorrect interpretation of the image. Dr. Klochko emphasized the need for more specialized evaluation methods, as the model’s current limitations in accurately interpreting radiologic images restrict its applicability in clinical settings.

ChatGPT-4 for Summarizing Radiology Reports for Pancreatic Ductal Adenocarcinoma

A study published in Radiology found that ChatGPT-4 outperforms GPT-3.5 in creating structured, summarized radiology reports for pancreatic ductal adenocarcinoma (PDAC), potentially improving surgical decision-making. Led by Dr. Rajesh Bhayana from the University of Toronto, the study demonstrated that GPT-4 generated near-perfect PDAC synoptic reports and, with a chain-of-thought prompting strategy, categorized resectability with high accuracy, supporting more accurate and efficient surgical decision-making.
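
The article does not reproduce the study’s prompt, but the sketch below illustrates the general shape of a chain-of-thought prompt for this task: the model is asked to reason through the relevant findings step by step before assigning an NCCN resectability category. The prompt wording, feature list, and model name are assumptions made for illustration, not the actual prompt or configuration used in the study.

```python
# Illustrative chain-of-thought prompt for NCCN resectability categorization.
# The prompt text and model name are assumptions for illustration only.
from openai import OpenAI

client = OpenAI()

report_text = "..."  # free-text PDAC staging CT report would go here

system_prompt = (
    "You are assisting with pancreatic cancer staging. Read the CT report and "
    "reason step by step: first summarize tumor contact with the SMA, celiac axis, "
    "common hepatic artery, SMV, and portal vein; then note any distant metastases; "
    "and only after that reasoning assign an NCCN resectability category of "
    "resectable, borderline resectable, or unresectable."
)

response = client.chat.completions.create(
    model="gpt-4-turbo",  # placeholder model name
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": report_text},
    ],
    temperature=0,  # deterministic output is preferable for structured reporting
)

print(response.choices[0].message.content)
```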

The study included 180 PDAC staging CT reports from Princess Margaret Cancer Centre in 2018. Two radiologists set a reference standard for 14 key features and the National Comprehensive Cancer Network (NCCN) resectability category. ChatGPT-4 was found to have equal or higher F1 scores than GPT-3.5 for all features and outperformed GPT-3.5 in categorizing resectability. Surgeons using AI-generated reports reduced their review time by 58%.
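
As a rough sketch of how such an evaluation can be scored, the snippet below computes an F1 score for a single extracted feature by comparing model output against a radiologist reference standard; the feature labels shown are toy values for illustration, not study data.

```python
# Per-feature F1 scoring of extracted report features against a reference standard.
# The toy labels below are illustrative assumptions, not data from the study.

def f1(reference, predicted):
    """F1 = 2 * precision * recall / (precision + recall) for binary labels."""
    tp = sum(1 for r, p in zip(reference, predicted) if r == 1 and p == 1)
    fp = sum(1 for r, p in zip(reference, predicted) if r == 0 and p == 1)
    fn = sum(1 for r, p in zip(reference, predicted) if r == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# 1 = feature reported present, 0 = absent, across a set of reports.
reference_standard = [1, 0, 1, 1, 0, 0, 1, 0]  # radiologist consensus
model_extraction   = [1, 0, 1, 0, 0, 0, 1, 1]  # model output

print("F1 for one key feature:", round(f1(reference_standard, model_extraction), 3))
```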

The findings suggest that ChatGPT-4 can improve standardization, communication, and efficiency in pancreatic cancer care. However, Paul Chang, MD, from the University of Chicago, emphasized the need to integrate these AI capabilities into scalable and comprehensive workflows, acknowledging the gap between feasibility and operational solutions.

Sources:

AuntMinnie.com
medicalexpress.com
openai.com