Ayush Rane
Rare tumor types present a challenge for machine learning models because of class imbalance and limited sample sizes. In this study, I explore synthetic image generation with both Generative Adversarial Networks (GANs) [1,2] and pre-trained Stable Diffusion (SD) models [3,4] to augment a histopathologic dataset of tumor and normal tissue images. Metastatic lymph node images serve as an example of a rare tumor type, illustrating the clinical challenge of underrepresented tumor classes in histopathology datasets. I evaluate the impact of synthetic data on a binary classification task using convolutional neural networks (CNNs). My results show that although the synthetic images exhibit low structural similarity (SSIM) to real tumor images, classifier performance improves when the training set is augmented with GAN- or SD-generated images, compared with a baseline CNN trained only on the original dataset. This highlights the potential benefit of synthetic data even when visual similarity is not ideal [5,6].
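To make the SSIM comparison concrete, the following is a minimal sketch of a single-window (global) SSIM between two grayscale images. It is an illustration only and not necessarily how the scores in this study were computed: the function name `global_ssim`, the 8-bit `data_range` default, and the random test images are assumptions introduced here. The standard SSIM averages the same statistic over local windows, as library implementations such as scikit-image's `structural_similarity` do.

```python
import numpy as np


def global_ssim(x, y, data_range=255.0):
    """Single-window SSIM over whole images (hypothetical helper).

    Simplification: the standard metric averages this quantity over
    local sliding windows; here the statistics are computed once over
    the full image.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    if x.shape != y.shape:
        raise ValueError("images must have the same shape")

    # Stabilising constants with the conventional K1 = 0.01, K2 = 0.03.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2

    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Illustrative inputs only: random arrays standing in for image patches.
rng = np.random.default_rng(0)
real_patch = rng.integers(0, 256, size=(64, 64))
synthetic_patch = rng.integers(0, 256, size=(64, 64))

print("real vs. itself:   ", global_ssim(real_patch, real_patch))
print("real vs. synthetic:", global_ssim(real_patch, synthetic_patch))
```

An image compared with itself scores 1, while structurally unrelated images score much lower. A low score therefore indicates that synthetic images differ in pixel-level structure from real ones, which does not by itself rule out their usefulness as training data.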