
Kempner Institute Collaborators Win Outstanding Paper Award at ICML 2025

By Yohan J. John | July 15, 2025

One of six papers, out of roughly 3,200 accepted to the conference, recognized for technical depth, novelty, and potential for impact in the field of machine learning

The paper's authors include (left to right) Jaeyeon Kim, a computer science Ph.D. student at Harvard, Sitan Chen, an assistant professor of computer science at Harvard SEAS, and Sham Kakade, co-director of the Kempner Institute and Gordon McKay Professor of Computer Science and Statistics.

A paper authored by Kempner Institute researchers and collaborators has won an outstanding paper award at ICML 2025, the 42nd International Conference on Machine Learning, held July 13–19, 2025, in Vancouver, Canada.

The paper’s authors include co-first authors Jaeyeon Kim, a computer science Ph.D. student at Harvard SEAS, and Kulin Shah, a computer science Ph.D. student at the University of Texas at Austin, as well as co-authors Vasilis Kontonis, a postdoctoral fellow at the University of Texas at Austin, Sitan Chen, an assistant professor of computer science at Harvard SEAS, and Kempner Co-director Sham Kakade.

The award, given to the research team for their paper “Train for the Worst, Plan for the Best: Understanding Token Ordering in Masked Diffusions,” was one of six outstanding paper awards at the conference, which received roughly 12,000 submissions and accepted about 3,200 papers. The award recognizes technical depth, novelty, and potential for impact in the field.

Research offers insights into the power of masked diffusion models (MDMs)

The award-winning paper compares two types of models used for generating “tokens,” which are the building blocks used by models to assemble text and other types of sequential data. The first type, called autoregressive models (ARMs), generates one token at a time in a fixed order, always building on what it has already generated. The second type, masked diffusion models (MDMs), generates tokens of a sequence in any order, gradually replacing random placeholders with more meaningful content over the course of multiple steps. In this paper, the researchers introduce a new strategy that unlocks the potential of MDMs to perform better than ARMs.
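To make the contrast concrete, here is a minimal, illustrative sketch in Python. It is not the authors’ code: the toy_model function is a hypothetical stand-in for a trained network, and the point is only the difference in decoding order between the two model families.

```python
import random

VOCAB = ["a", "b", "c"]
MASK = "_"

def toy_model(seq, pos):
    # Hypothetical stand-in for a trained network: guesses a token
    # for the masked position `pos` given the partial sequence `seq`.
    return random.choice(VOCAB)

def generate_arm(length):
    # Autoregressive generation: fill positions strictly left to right,
    # always conditioning on what has already been generated.
    seq = [MASK] * length
    for pos in range(length):
        seq[pos] = toy_model(seq, pos)
    return seq

def generate_mdm(length):
    # Masked-diffusion-style generation: start from all placeholders
    # and unmask positions in an arbitrary order over multiple steps.
    seq = [MASK] * length
    for pos in random.sample(range(length), length):
        seq[pos] = toy_model(seq, pos)
    return seq

print(generate_arm(8))
print(generate_mdm(8))
```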

For both ARMs and MDMs, training involves solving problems that are like “fill-in-the-blank” exercises. The difference between the two lies in the locations of the “masked” tokens, which are the “blanks” that the models are trained to fill. For ARM training, the masked token is always the next one at the end of the visible text, whereas for MDM training, the masked tokens can be anywhere. Training MDMs is computationally challenging because a sequence of n tokens admits exponentially many possible mask patterns, each a different set of blanks to be filled. By contrast, ARMs only learn to fill one blank at the end of a given training sequence.
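A rough sketch of the two kinds of training examples, again purely illustrative rather than the paper’s implementation, shows why the MDM training problem is so much larger:

```python
import random

MASK = "_"

def arm_training_example(tokens):
    # ARM-style fill-in-the-blank: the blank is always the next token
    # at the end of a visible prefix.
    cut = random.randrange(1, len(tokens))
    return tokens[:cut], tokens[cut]  # (context, token to predict)

def mdm_training_example(tokens):
    # MDM-style fill-in-the-blank: any subset of positions may be
    # masked, so the model must learn to fill blanks anywhere.
    masked = {i for i in range(len(tokens)) if random.random() < 0.5}
    corrupted = [MASK if i in masked else t for i, t in enumerate(tokens)]
    targets = {i: tokens[i] for i in masked}
    return corrupted, targets

tokens = list("sudoku")
print(arm_training_example(tokens))
print(mdm_training_example(tokens))
# A length-n sequence admits 2**n distinct mask patterns, which is
# why MDM training must cover vastly more "blanks" than ARM training.
print(2 ** len(tokens), "possible MDM mask patterns for length", len(tokens))
```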

The researchers showed that if an MDM adaptively chooses the order in which it generates tokens, it can solve problems much more effectively. To demonstrate this, they tested the approach on logical tasks such as Sudoku puzzles and found that the adaptive strategy helped MDMs perform far better than ARMs. The takeaway: smart, adaptive strategies during generation reveal just how powerful MDMs can be.
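One plausible adaptive rule of this kind, sketched below in Python, is to commit at each step to the masked position the model is most confident about, then re-score the remaining blanks. This is an assumption-laden toy (toy_guesses stands in for a trained MDM), not the paper’s exact procedure, but it captures the idea of choosing the generation order at inference time.

```python
import random

VOCAB = "123456789"
MASK = "_"

def toy_guesses(seq):
    # Hypothetical stand-in for a trained MDM: for each still-masked
    # position, return a (best_token, confidence) pair.
    return {i: (random.choice(VOCAB), random.random())
            for i, tok in enumerate(seq) if tok == MASK}

def generate_adaptive(length):
    # Adaptive-order decoding: at every step, commit the masked position
    # the model is most confident about, then re-score the rest.
    seq = [MASK] * length
    while MASK in seq:
        guesses = toy_guesses(seq)
        pos = max(guesses, key=lambda i: guesses[i][1])  # easiest blank first
        seq[pos] = guesses[pos][0]
    return seq

print("".join(generate_adaptive(9)))
```

For a puzzle like Sudoku, this “easiest blank first” style of ordering matters: some cells are forced by the constraints and can be filled with near certainty, and committing those first makes the remaining cells easier in turn.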

About the Kempner

The Kempner Institute seeks to understand the basis of intelligence in natural and artificial systems by recruiting and training future generations of researchers to study intelligence from biological, cognitive, engineering, and computational perspectives. Its bold premise is that the fields of natural and artificial intelligence are intimately interconnected: the next generation of artificial intelligence (AI) will require the same principles that our brains use for fast, flexible natural reasoning, and theories developed for AI can help elucidate how our brains compute and reason. Join the Kempner mailing list to learn more, and to receive updates and news.


PRESS CONTACT:

Deborah Apsel Lang | (617) 495-7993 

kempnercommunications@harvard.edu