Leveraging Pre-Trained Models Reduces Expert Annotation Effort for Improving AI Models
Bradley Wheeler
Presented at: Department of Pathology 2025 Research Day and Retreat
Date: 2025-05-28
Summary: Background: Annotating images for machine learning is time-consuming and costly, often requiring detailed analysis of thousands of images by domain experts. Poor annotations negatively impact model performance, making it crucial to balance annotation efficiency and quality. Machine-generated “pseudo-annotations” can be efficiently created by leveraging pre-trained models from similar tasks to imperfectly predict annotations for a new task. Applying customized filtering criteria to these pseudo-annotations enables evaluation of their quality and facilitates the selection of high-confidence examples for training next-generation models. This approach can automate much of the annotation and selection process, mitigating the trade-off between efficiency and quality.
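For illustration, a minimal Python sketch of confidence-based filtering of pseudo-annotations is shown below; the annotation structure and thresholds are assumptions for the sketch, not the customized filtering criteria used in this work.

```python
# Minimal sketch of confidence-based pseudo-annotation filtering.
# The annotation structure (a dict with a "scores" list per tile) and the
# thresholds below are illustrative assumptions, not the actual criteria.

def filter_pseudo_annotations(tiles, min_score=0.9, min_boxes=1):
    """Keep only tiles whose machine-generated annotations look trustworthy."""
    high_quality = []
    for tile_id, annotation in tiles.items():
        scores = annotation["scores"]
        # Require every predicted box on the tile to clear the confidence bar.
        if len(scores) >= min_boxes and all(s >= min_score for s in scores):
            high_quality.append(tile_id)
    return high_quality
```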
Objective: We aimed to evaluate whether a Faster R-CNN model can be trained to detect amyloid-beta plaques with cored and diffuse subtypes in brain tissue using pseudo-annotations generated by a published YOLO model trained to detect cored plaques and cerebral amyloid angiopathy.
Methods: We developed a two-step pipeline to generate and use pseudo-annotations for training and evaluation. A total of 29,328 image tiles from 21 whole-slide images were collected. The YOLO model was used to generate initial pseudo-annotations. Filtering criteria identified 553 tiles with high-quality annotations, which were used to train and test the Faster R-CNN model. An additional 50 tiles with low-quality YOLO-generated annotations were used to assess whether our model could improve upon these suboptimal predictions. Performance was measured using mean average precision (mAP), sensitivity, and precision.
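A minimal sketch of the two-step pipeline, assuming the ultralytics and torchvision packages; the weights file name, tile path, and the single hand-written training target are illustrative placeholders, not the actual data or artifacts from this study.

```python
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from ultralytics import YOLO

# Step 1: pseudo-annotate a tile with the published pre-trained YOLO model,
# then keep it only if it passes the filtering criteria (see sketch above).
yolo = YOLO("plaque_yolo.pt")                      # hypothetical weights file
detections = yolo("tiles/tile_0001.png")[0].boxes  # hypothetical tile path

# Step 2: train a Faster R-CNN on the retained high-confidence tiles.
# Two plaque subtypes (cored, diffuse) plus torchvision's implicit background class.
model = fasterrcnn_resnet50_fpn(num_classes=3)
optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)

# One illustrative training step on a dummy tile with one pseudo-annotated box.
images = [torch.rand(3, 512, 512)]
targets = [{"boxes": torch.tensor([[50.0, 60.0, 120.0, 140.0]]),
            "labels": torch.tensor([1], dtype=torch.int64)}]

model.train()
loss_dict = model(images, targets)   # torchvision returns a dict of component losses
loss = sum(loss_dict.values())
optimizer.zero_grad()
loss.backward()
optimizer.step()
```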
Results: On the high-quality test set, our model achieved a sensitivity and precision of 1.0 and a mAP of 0.9998, indicating high concordance with the YOLO model. Expert review showed subtle improvements over the YOLO predictions even on this image set. On the low-quality test set, our model identified 86 additional plaques missed by the YOLO model and missed one diffuse plaque the YOLO model had identified. Localization precision also improved in this set.
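For reference, a short sketch of how detection sensitivity and precision are computed from matched detections; the counts in the example are illustrative, not the evaluation code or data from this work.

```python
# Sensitivity (recall) and precision from matched detection counts.
def detection_metrics(true_positives: int, false_positives: int, false_negatives: int):
    sensitivity = true_positives / (true_positives + false_negatives)
    precision = true_positives / (true_positives + false_positives)
    return sensitivity, precision

# Example: every reference plaque matched and no spurious detections
# yields sensitivity = precision = 1.0, as on the high-quality test set.
print(detection_metrics(100, 0, 0))
```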
Conclusion: Our approach demonstrates the viability of training improved plaque detection models with minimal domain expert intervention. Leveraging pre-trained models enables transfer of domain knowledge without redundant manual effort. While some oversight is needed to fine-tune filtering thresholds and expand beyond baseline capability, the resulting model can accelerate annotation and model development. Future investigations could evaluate this approach in an active learning setting.
Thomas M. Pearce