Speaker: Vincent Pauline
Title: An Introduction to Diffusion Model Theory Beyond Euclidean Data: Discrete Extensions for Biology
Abstract: We see the outputs of diffusion models everywhere, from hyper-realistic video generation to protein design, yet understanding the mathematics behind them remains a significant hurdle. While diffusion models are typically introduced in Euclidean spaces, many problems of interest involve discrete or multimodal data. This talk offers a theoretical introduction to diffusion models beyond the standard continuous setting, with a focus on discrete state spaces. We present a unified view of forward noising and reverse denoising in both discrete and continuous time, show how standard training objectives arise from a common variational perspective, and highlight how masking-based forward processes connect diffusion to masked language modeling. We also briefly discuss recent work on control and alignment in discrete generative models. Biology serves as a motivating context throughout, but the emphasis is on general principles and theory, making the talk broadly relevant to researchers across a range of fields.
This meeting will take place remotely on Teams.