Overview of the study design: internal validation (stage 1), external validation (stage 2), and reader study (stage 3). In the first stage, we used 27 362 images, allocating 70% for training the model (19 156 images), 20% for validation (5474 images), and 10% for internal testing (2732 images). In the second stage, we selected 1885 images for external validation to assess the model’s generalisability across diverse datasets. In the third stage, we did a reader study using 400 images randomly selected from the external validation dataset. Both expert otolaryngologists and primary care otolaryngologists took part, allowing us to evaluate whether the AI model could improve their diagnostic performance in detecting abnormalities and identifying malignancies.
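
The allocation arithmetic can be checked with a short sketch. It uses only the image counts reported above. The simple random shuffle, the fixed seed, and the names `split_indices`, `COUNTS`, and `reader_sample` are assumptions made for illustration; the caption does not state how images were assigned to subsets (for example, whether the split was made per patient), so this should not be read as the study's actual procedure.

```python
import random

# Counts reported in the caption.
TOTAL = 27_362
COUNTS = {"train": 19_156, "validation": 5_474, "internal_test": 2_732}
EXTERNAL_TOTAL = 1_885
READER_STUDY_SIZE = 400


def split_indices(total, counts, seed=0):
    """Shuffle image indices and cut them into consecutive, non-overlapping subsets.

    Hypothetical helper: a plain random split, assumed for illustration only.
    """
    if sum(counts.values()) != total:
        raise ValueError("subset sizes must add up to the total number of images")
    indices = list(range(total))
    random.Random(seed).shuffle(indices)
    subsets, start = {}, 0
    for name, size in counts.items():
        subsets[name] = indices[start:start + size]
        start += size
    return subsets


subsets = split_indices(TOTAL, COUNTS)

# Share of the full dataset in each subset, as a percentage to one decimal place.
shares = {name: round(100 * len(idx) / TOTAL, 1) for name, idx in subsets.items()}

# Stage 3: draw the reader-study images from the external validation set
# without replacement (seed chosen arbitrarily).
reader_sample = random.Random(1).sample(range(EXTERNAL_TOTAL), READER_STUDY_SIZE)
```

The three subsets add up exactly to 27 362 images, and the reported counts correspond to the stated 70%, 20%, and 10% shares after rounding.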