Title: Advancing Latent Diffusion Models for Language Generation

Bio: Justin Lovelace is a Ph.D. student in Computer Science at Cornell University advised by Prof. Kilian Q. Weinberger. His research focuses on the application of diffusion models to language and speech generation. His work aims to advance the capabilities and controllability of generative models within these domains.

Abstract: Diffusion models have revolutionized image synthesis, providing unprecedented quality and control. Despite this success, their application to discrete domains like language remains challenging. This talk presents a series of works bridging this gap, demonstrating the potential of diffusion for language generation. First, we explore text-to-speech synthesis. By optimizing diffusion training for speech, we outperform the state-of-the-art autoregressive system while using significantly less training data. Next, we introduce Latent Diffusion for Language Generation. We develop language autoencoders with continuous latent spaces suitable for diffusion modeling, enabling the generation of fluent text through latent diffusion. Finally, we present Diffusion Guided Language Models (DGLMs), which use diffusion to generate a semantic proposal that guides an autoregressive decoder. DGLMs combine the fluency of autoregression with the plug-and-play control of diffusion. Through these works, we demonstrate how diffusion models can be adapted to language, opening new avenues for flexible and controllable language generation systems.