
Using large language models to identify cutaneous immune-related adverse events in response to immunotherapy.


Presented at: Society for Investigative Dermatology 2025

Date: 2025-05-07


Summary: Cutaneous immune-related adverse events (cirAEs) are the most frequent adverse reactions among patients receiving immune checkpoint inhibitors (ICIs) and have emerged as predictors of therapeutic response. However, identifying these events is challenging because of their varied presentation and the absence of specific diagnostic codes to capture their occurrence, so it requires time-consuming manual chart review by trained personnel. This study explores using large language models (LLMs) to identify cirAEs from unstructured outpatient and inpatient clinical notes. Unlike previous work focused on structured inpatient notes, we address the novel challenge of analyzing the far more heterogeneous outpatient documentation, where most cirAEs are documented. We developed and validated our approach using a multi-institutional cohort of ICI patients from the Mass General Brigham Healthcare System and Dana-Farber Cancer Institute. We evaluated three state-of-the-art open-source language models (LLaMa-3.1, Gemma-2, and Qwen-2.5) and GPT-4o, implementing a retrieval-augmented generation (RAG) system that uses a vector database and hybrid search re-ranking. We further performed example-based fine-tuning to improve the models' performance. On a random sample of 100 patients' notes, GPT-4o achieved the highest performance, with a sensitivity of 0.82 and a specificity of 0.88, while LLaMa-3.1, Gemma-2, and Qwen-2.5 achieved sensitivities of 0.55, 0.68, and 0.23 and specificities of 0.63, 0.64, and 0.94, respectively. The LLMs vectorized and analyzed the notes of 100 patients in 45 minutes (27 seconds per patient), a substantial improvement over the 15 minutes typically required per patient for manual review. With further optimization, LLMs may replace manual annotation entirely, requiring a human in the loop only for final verification. This approach provides substantial time savings that could unlock valuable data from unstructured clinical notes, enabling further study that could advance understanding of cirAEs, their clinical implications, and their biological mechanisms.

Charles Lu<sup>1</sup>, Guihong Wan<sup>1</sup>, Sara Khattab<sup>1</sup>, Yevgeniy Semenov<sup>1</sup>

1. Harvard Medical School, Boston, MA, United States.

Bioinformatics, Computational Biology, and Imaging
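The sensitivity, specificity, and per-patient timing figures in the summary follow from standard definitions. The sketch below shows how such figures could be computed from per-patient labels. It is a minimal illustration, not the authors' evaluation code: the function names, the `ConfusionCounts` type, and the example labels are all hypothetical, and only the 45-minute, 100-patient, and 15-minute values come from the abstract.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ConfusionCounts:
    tp: int = 0  # cirAE present, model says present
    fp: int = 0  # cirAE absent, model says present
    tn: int = 0  # cirAE absent, model says absent
    fn: int = 0  # cirAE present, model says absent


def tally(labels: Sequence[int], predictions: Sequence[int]) -> ConfusionCounts:
    """Count outcomes for binary labels (1 = cirAE present, 0 = absent)."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    counts = ConfusionCounts()
    for truth, pred in zip(labels, predictions):
        if truth == 1 and pred == 1:
            counts.tp += 1
        elif truth == 0 and pred == 1:
            counts.fp += 1
        elif truth == 0 and pred == 0:
            counts.tn += 1
        else:
            counts.fn += 1
    return counts


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of true cirAE cases that the model flagged."""
    positives = c.tp + c.fn
    return c.tp / positives if positives else float("nan")


def specificity(c: ConfusionCounts) -> float:
    """Fraction of cirAE-free cases that the model correctly cleared."""
    negatives = c.tn + c.fp
    return c.tn / negatives if negatives else float("nan")


def seconds_per_patient(total_minutes: float, n_patients: int) -> float:
    """Average processing time per patient, in seconds."""
    return total_minutes * 60 / n_patients


# Hypothetical chart-review labels and model outputs (made up for illustration).
manual_labels = [1, 1, 1, 0, 0, 0, 0, 1]
model_outputs = [1, 1, 0, 0, 0, 1, 0, 1]
counts = tally(manual_labels, model_outputs)

# Timing values taken from the abstract: 45 minutes for 100 patients,
# versus roughly 15 minutes per patient for manual review.
llm_seconds = seconds_per_patient(45, 100)
speedup = (15 * 60) / llm_seconds
```

With the abstract's timing values, `llm_seconds` is 27 seconds per patient, which puts manual review at roughly 33 times slower per patient.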