Generative Models for Synthetic Data Generation in Biomedical Applications: An Introduction and Tutorial

Thu, Mar 06, 2025, 12:00 pm - 1:00 pm

Abstract:

Generative AI has become ubiquitous since the advent of ChatGPT, though the underlying mechanisms driving synthetic data generation are far from novel. Goodfellow and colleagues’ 2014 introduction of generative adversarial networks (GANs) was a watershed moment for generative modeling, and GANs have since been used in a wide variety of contexts. This presentation will introduce participants to the structure and implementation of GANs before delving into one of their lesser-used applications: generating synthetic patient records that closely mirror data obtained from real patients. The presentation works through an example using data from patients admitted to UK hospitals for either Acute Coronary Syndrome (ACS) or Takotsubo Syndrome (TTS), along with the resulting prediction model from the original study (published as Ahmed et al., 2024). The tutorial will focus on the rationale for this procedure, preparing the data, fitting the GAN, and evaluating the generated data. A discussion of practical and ethical implications and responsible use will follow.

Speaker Bio:

Tony Mangino, PhD, is an Assistant Professor in the Biostatistics Consulting and Interdisciplinary Research Collaboration Lab (Biostat CIRCL) in the Department of Biostatistics, University of Kentucky. His work as a team scientist focuses on statistical and research design consultation for a variety of projects, including program evaluations, biomedical intervention studies, survey research projects, and student, faculty, and community endeavors. He also studies multilevel/mixed effects models, generative modeling, and methods for improving collaboration within research teams.

Relevant Publications:

Ahmed, T., Mangino, A.A., Lodhi, S.H., Gupta, V., Leung, S.W., & Sorrell, V.L. (2024). Simplified echocardiographic assessment of regional left ventricular wall motion pattern in patients with takotsubo and acute coronary syndrome: The Randomized Blinded Two-chamber Apical Kinesis Observation (TAKO) Study. Current Problems in Cardiology, 102731. DOI: 10.1016/j.cpcardiol.2024.102731

Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., … & Bengio, Y. (2020). Generative adversarial networks. Communications of the ACM, 63(11), 139-144.

Neunhoeffer, M. (2022). RGAN: Generative Adversarial Nets (GAN) in R. R package version 0.1.1. https://CRAN.R-project.org/package=RGAN