22 July 2025
Flow Equivariant Recurrent Neural Networks
By: T. Anderson Keller
The author introduces the first flow equivariant models that respect motion symmetries, leading to significantly improved generalization and sequence modeling.
18 July 2025
The Hidden Linear Structure in Diffusion Models and its Application in Analytical Teleportation
By: Binxu Wang and John J. Vastola
Diffusion models are powerful generative frameworks that iteratively denoise white noise into structured data via learned score functions. Through theory and experiments, the authors demonstrate that these score functions are dominated by a linear Gaussian component.
14 July 2025
Scaling Offline Reinforcement Learning at Test Time
By: Nicolas Espinosa-Dice
Kempner researchers present a new algorithm for offline reinforcement learning that features a key property: self-consistency.
27 June 2025
Characterization and Mitigation of Training Instabilities in Microscaling Formats
By: Nikhil Anand and Chloe Huangyuan Su
The authors uncover consistent training instabilities when using new, highly efficient low-precision formats, a finding with implications for the development of next-generation AI. By pinpointing the root causes of these failures and demonstrating effective mitigation strategies, this work offers crucial insights into enabling more cost-effective and scalable model training on future hardware.
8 May 2025
What Data Assumptions Come With Your SAE?
By: Sai Sumedh R. Hindupur*, Ekdeep Singh Lubana*, Thomas Fel*, Demba Ba
The authors show that SAEs are inherently biased toward detecting only a subset of the concepts in model activations, a subset shaped by the SAEs' internal assumptions. This highlights the need for concept geometry-aware design of novel SAE architectures.
30 April 2025
ATOMICA: Learning Universal Representations of Molecular Interactions
By: Ada Fang and Marinka Zitnik
The authors present ATOMICA, a representation learning model that captures intermolecular interactions across all molecular modalities—proteins, nucleic acids, small molecules, and ions—at atomic resolution.
28 April 2025
Interpreting the Linear Structure of Vision-Language Model Embedding Spaces
By: Isabel Papadimitriou*, Chloe Huangyuan Su*, Thomas Fel*, Stephanie Gil, Sham Kakade
The authors find a sparse linear structure within VLM embedding spaces that is shaped by modality, yet stitched together through latent bridges—offering new insight into how multimodal meaning is constructed.
25 April 2025
To Backtrack or Not to Backtrack: When Sequential Search Limits Model Reasoning
By: Tian Qin, David Alvarez-Melis, Samy Jelassi, and Eran Malach
Backtracking is widely believed to improve reasoning by letting LLMs “fix their mistakes.” But is it always the best way to use test-time compute? The authors show that backtracking is not a one-size-fits-all solution: its benefit depends on the task structure, model scale, and training paradigm.
14 April 2025
Mechanistic Interpretability: A Challenge Common to Both Artificial and Biological Intelligence
By: Demba Ba, with contributions from Sara Matias and Bahareh Tolooshams
The researchers describe DUNL (Deconvolutional Unrolled Neural Learning), a novel framework for understanding how neurons encode information when they respond to multiple factors simultaneously.
24 March 2025
TxAgent: an AI Agent for Therapeutic Reasoning Across a Universe of 211 Tools
A smarter way to navigate complex drug decisions
By: Shanghua Gao and Marinka Zitnik
The authors introduce TxAgent, a first-of-its-kind AI agent for therapeutic reasoning across a universe of 211 tools, and include a comparison against DeepSeek-R1 671B.