
Back to the Future – Data Efficient Language Modeling

Tatsunori Hashimoto

Date: Friday, October 3, 2025
Time: 2:30 - 4:00pm
Talk Recording

Join us for a talk by Tatsunori Hashimoto, Assistant Professor of Computer Science at Stanford University. This talk is part of the Kempner Seminar Series, a research-level seminar series that covers topics related to the basis of intelligence in natural and artificial systems.

Compute scaling has dominated the conversation with modern language models, leading to an impressive array of algorithms that optimize performance for a given training (and sometimes inference) compute budget. But as compute has grown cheaper and more abundant, data is starting to become a bottleneck, and our ability to exchange compute for data efficiency may be crucial to future model scaling. In this talk, I’ll discuss synthetic data and algorithmic approaches to data efficiency, and show that in both cases, classical statistical perspectives based on nonparametric modeling and ensembling bring new insights and significant empirical benefits to modern questions of scaling and data efficiency.