Flow diagram of the proposed pipeline for LGA-driven semi-supervised (LGA-SS) training of CNNs. Once a dataset has been gathered, latent representations of its images are generated using the LGA [9] (section 3.1), after which hierarchical $k$-means clustering (section 3.2) is used to identify a prioritised subset of images for human annotation. These annotations are combined with algorithmically generated pseudo-labels for the remaining unannotated images to train a CNN for downstream classification tasks. Because the CNN is trained and applied on a per-dataset basis, LGA-SS remains effective in domains where learning transfers poorly between datasets.
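The selection and pseudo-labelling stages of the pipeline can be illustrated with a minimal sketch. It is not the implementation described in sections 3.1 and 3.2: the latent vectors are hand-written toy values standing in for LGA outputs, the function names (`kmeans`, `hierarchical_kmeans`, `pick_representatives`, `propagate_labels`) are hypothetical, and two rules are assumptions made only for illustration, namely that the image nearest each leaf-cluster centroid is the one prioritised for annotation, and that its label is copied to the rest of its cluster as a pseudo-label. CNN training on the resulting labels is omitted.

```python
import math

def kmeans(points, k, iters=20):
    """Lloyd's k-means with deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from all centroids chosen so far.
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return assign

def hierarchical_kmeans(points, k=2, depth=2):
    """Recursively split the data with k-means; returns leaf clusters as index lists."""
    def split(indices, level):
        if level == depth or len(indices) <= k:
            return [indices]
        assign = kmeans([points[i] for i in indices], k)
        leaves = []
        for c in range(k):
            child = [i for i, a in zip(indices, assign) if a == c]
            if child:
                leaves.extend(split(child, level + 1))
        return leaves
    return split(list(range(len(points))), 0)

def pick_representatives(points, leaves):
    """ASSUMPTION: prioritise the image closest to each leaf-cluster centroid."""
    reps = []
    for leaf in leaves:
        centroid = tuple(sum(x) / len(leaf) for x in zip(*(points[i] for i in leaf)))
        reps.append(min(leaf, key=lambda i: math.dist(points[i], centroid)))
    return reps

def propagate_labels(n, leaves, reps, human_labels):
    """ASSUMPTION: each cluster inherits the human label of its representative."""
    labels = [None] * n
    for leaf, rep in zip(leaves, reps):
        for i in leaf:
            labels[i] = human_labels[rep]
    return labels

# Toy 2-D latent vectors (stand-ins for LGA representations of six images).
latents = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.2, 5.1), (5.1, 4.9)]
true_classes = ["class_a"] * 3 + ["class_b"] * 3  # what an annotator would say

leaves = hierarchical_kmeans(latents, k=2, depth=1)
reps = pick_representatives(latents, leaves)
human_labels = {i: true_classes[i] for i in reps}  # only the representatives are annotated
labels = propagate_labels(len(latents), leaves, reps, human_labels)
```

In this toy case two human annotations are enough to label all six images, and `labels` together with the images would then form the training set for the CNN. With real data the pseudo-labels are noisy, so the cluster depth and the number of annotated images trade annotation effort against label quality.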