6 February 2026
Forecasting the Brain: Scalable Neural Prediction with POCO
By: Yu Duan and Kanaka Rajan
Predicting future neural activity is a critical step toward achieving real-time, closed-loop neurotechnologies. To this end, we introduce POCO, a unified forecasting model trained on…
4 February 2026
Anytime Pretraining: Horizon-Free Learning-Rate Schedules with Weight Averaging
By: Alexandru Meterez*, Pranav Ajit Nair*, Depen Morwani*, Cengiz Pehlevan, Sham Kakade
The authors provide a theoretical analysis demonstrating the existence of anytime learning schedules for overparameterized linear regression, and highlight the central role of weight averaging—also known as model merging—in achieving the optimal convergence rates of stochastic gradient descent.
3 February 2026
Measuring and Controlling Solution Degeneracy Across Task-Trained Recurrent Neural Networks
By: Ann Huang and Kanaka Rajan
Despite reaching equal performance when trained on the same task, artificial neural networks can develop dramatically different internal solutions, much like different students solving the same math problem using completely different approaches. Our study introduces a unified framework to quantify this variability across Recurrent Neural Network (RNN) solutions, which we term solution degeneracy, and analyzes what factors shape it across thousands of recurrent networks trained on memory and decision-making tasks.
26 January 2026
PROTON: A Relational Foundation Model for Neurological Discovery
By: Ayush Noori and Marinka Zitnik
This work introduces a relational foundation model for neurological discovery and evaluates it through discovery loops that connect AI predictions to experiments in Parkinson’s disease, bipolar disorder, and Alzheimer’s disease.
5 January 2026
Large Video Planner: A New Foundation Model for General-Purpose Robots
By: Yilun Du
This work explores using video as the primary modality for robot foundation models. Unlike static images, videos naturally encode physical dynamics and semantics of the world, providing a rich prior for physical decision-making.
24 November 2025
Into the Rabbit Hull – Part II: From Linear Directions to Convex Geometry
By: Thomas Fel*, Binxu Wang*, Michael A. Lepori, Matthew Kowal, Andrew Lee, Randall Balestriero, Sonia Joseph, Ekdeep S. Lubana, Talia Konkle, Demba Ba, Martin Wattenberg
The authors ask a fundamental question: is the linear view of DINOv2 under the Linear Representation Hypothesis (LRH) sufficient to describe how deep vision models organize information? They examine the geometry and statistics of the learned concepts themselves, and the results suggest that representations are organized beyond linear sparsity alone.
12 November 2025
Into the Rabbit Hull – Part I
By: Thomas Fel*, Binxu Wang*, Michael A. Lepori, Matthew Kowal, Andrew Lee, Randall Balestriero, Sonia Joseph, Ekdeep S. Lubana, Talia Konkle, Demba Ba, Martin Wattenberg
The authors offer an interpretability deep dive, examining the most important concepts emerging in one of today’s central vision foundation models, DINOv2.
29 October 2025
Boomerang Distillation Enables Zero-Shot Model Size Interpolation
By: Sara Kangaslahti, Nihal Nayak, Jonathan Geuter
The authors identify a novel phenomenon, Boomerang Distillation, which occurs when distilling a large language model into a smaller one. In this blog post, they describe how Boomerang Distillation can be used to create, from a single student-teacher pair and without any additional training, entire families of LLMs at fine-grained sizes.
15 October 2025
LOTION: Smoothing the Optimization Landscape for Quantized Training
By: Mujin Kwun, Nikhil Anand, Depen Morwani
The authors introduce LOTION, a framework that optimizes a continuous variant of the quantized loss surface while provably preserving all global minima of the original problem.
9 October 2025
From Models to Scientists: Building AI Agents for Scientific Discovery
By: Shanghua Gao, Richard Zhu, Marinka Zitnik
ToolUniverse is a framework for developing AI agents for science, often referred to as “AI scientists.” It provides an environment where LLMs interact with more than six hundred scientific tools, including machine learning models, databases, and simulators. ToolUniverse standardizes how AI models access and combine these tools, allowing researchers to develop, test, and evaluate AI agents for science.