Training Natively Interactive Models
Anne Wu, Cornell University
Large language models are interactive artifacts, yet interaction is treated almost as an afterthought in both their deployment and their learning, and this has far-reaching consequences: rather than supporting the naturalness of human conversation, interactions with LLMs are stilted turn-taking experiences, a far cry from the dynamic full-duplex (i.e., concurrent input/output) nature of human interaction. While reinforcement learning methods do learn from interaction, they remain largely confined to narrow domains because they depend on simplified rewards or are vulnerable to reward hacking.
This talk covers two projects. In the first and larger part, I will describe full-duplex speech LLMs and the methods I designed to post-train them, showing consistent improvements in conversational engagement, safety, and factuality. In the second part, I will discuss an RL benchmark I created for visuo-linguistic reasoning, focused on designing robust reward signals that neither invite reward hacking nor require significant task simplification.
