29 October 2025

Boomerang Distillation Enables Zero-Shot Model Size Interpolation

By: Sara Kangaslahti, Nihal Nayak, Jonathan Geuter

The authors identify a novel phenomenon, Boomerang Distillation, which occurs when distilling a large language model into a smaller one. In this blog post, they describe how Boomerang Distillation can be used to create entire families of LLMs of fine-grained sizes, without any training, from a single student-teacher pair.
        
                    
             
    
15 October 2025

LOTION: Smoothing the Optimization Landscape for Quantized Training

By: Mujin Kwun, Nikhil Anand, Depen Morwani

The authors introduce LOTION, a framework that optimizes a continuous variant of the quantized loss surface while provably preserving all global minima of the original problem.
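As a toy illustration of the underlying difficulty (this is a generic example, not LOTION itself): round-to-nearest quantization makes the loss piecewise constant in the weights, so its gradient vanishes almost everywhere, which is the obstacle that motivates optimizing a smoothed, continuous surrogate. The step size and target below are arbitrary choices for illustration.

```python
import numpy as np

def quantize(w, step=0.25):
    """Round-to-nearest quantization onto a fixed grid with the given step."""
    return np.round(w / step) * step

# A 1-D "quantized loss": because quantize() is piecewise constant in w,
# so is the loss, and its gradient is zero almost everywhere.
def loss(w):
    return (quantize(w) - 0.6) ** 2

# Nearby weights that round to the same grid point give identical loss,
# so gradient-based training gets no signal between grid points.
print(loss(0.10), loss(0.11))  # same value: both round to 0.0
```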
        
                    
             
    
9 October 2025

From Models to Scientists: Building AI Agents for Scientific Discovery

By: Shanghua Gao, Richard Zhu, Marinka Zitnik

ToolUniverse is a framework for developing AI agents for science, often referred to as “AI scientists.” It provides an environment where LLMs interact with more than six hundred scientific tools, including machine learning models, databases, and simulators. ToolUniverse standardizes how AI models access and combine these tools, allowing researchers to develop, test, and evaluate AI agents for science.
        
                    
             
    
7 October 2025

Using Cognitive Models to Reveal Value Trade-offs in Language Models

By: Sonia Murthy and Peng Qian

The authors use a leading cognitive model of value trade-off in polite speech to systematically examine how post-training choices, such as reasoning budget and alignment recipes, affect value trade-offs in language models.
        
                    
             
    
4 August 2025

ANN-like Synapses in the Brain Mediate Online Reinforcement Learning

By: Shun Li

The authors show that a type of synapse in the brain challenges a long-held assumption about synaptic plasticity rules. These synapses switch between more excitatory and more inhibitory states in an experience-dependent manner and contribute to online dopamine updates during reinforcement learning.
        
                    
             
    
30 July 2025

Accelerating RL for LLM Reasoning with Optimal Advantage Regression

By: Zhaolin Gao

Gao and collaborators propose a new RL algorithm that estimates the optimal value function offline from the reference policy and performs on-policy updates using only one generation per prompt.
        
                    
             
    
28 July 2025

Solvable Model of In-Context Learning Using Linear Attention

By: Mary Letey

This work provides a sharp characterization of in-context learning (ICL) in an analytically solvable model, offering insights into the sample complexity and data quality requirements for ICL to emerge. These insights can be applied to more complex, realistic architectures.
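For readers unfamiliar with the setup: linear attention drops the softmax from standard attention, so the output is linear in the values, which is what makes models built from it analytically tractable. A minimal NumPy sketch of the contrast (shapes and names are illustrative, not the authors' code):

```python
import numpy as np

def linear_attention(Q, K, V):
    """Attention without the softmax: scores are used directly as weights,
    so the output is linear in V -- the analytically tractable case."""
    scores = Q @ K.T          # (T, T) similarity scores
    return scores @ V         # (T, d_v)

def softmax_attention(Q, K, V):
    """Standard scaled dot-product attention, for comparison."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
T, d = 8, 4
Q, K, V = rng.normal(size=(3, T, d))
print(linear_attention(Q, K, V).shape)  # (8, 4)
```

The key structural difference: doubling V exactly doubles the linear-attention output, while softmax attention has no such homogeneity in Q and K.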
        
                    
             
    
22 July 2025

Flow Equivariant Recurrent Neural Networks

By: T. Anderson Keller

The author introduces the first flow equivariant models that respect motion symmetries, leading to significantly improved generalization and sequence-modeling performance.
        
                    
             
    
18 July 2025

The Hidden Linear Structure in Diffusion Models and its Application in Analytical Teleportation

By: Binxu Wang and John J. Vastola

Diffusion models are powerful generative frameworks that iteratively denoise white noise into structured data via learned score functions. Through theory and experiments, the authors demonstrate that these score functions are dominated by a linear Gaussian component.
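As background on what a "linear Gaussian component" of a score function means: the score of a Gaussian density N(mu, cov) has the closed form grad_x log p(x) = -cov^{-1} (x - mu), which is linear in x. A minimal sketch (illustrative, not the authors' code) verifying this closed form against a finite-difference gradient of the log-density:

```python
import numpy as np

def gaussian_score(x, mu, cov):
    """Score of a multivariate Gaussian: grad_x log N(x; mu, cov) = -cov^{-1}(x - mu).
    Note it is linear in x -- the structure the post highlights in learned scores."""
    return -np.linalg.solve(cov, x - mu)

def log_density(x, mu, cov):
    """Log-density of N(mu, cov), used only to check the score numerically."""
    d = x - mu
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d @ np.linalg.solve(cov, d) + logdet + len(x) * np.log(2 * np.pi))

rng = np.random.default_rng(1)
mu = rng.normal(size=3)
A = rng.normal(size=(3, 3))
cov = A @ A.T + 3 * np.eye(3)  # symmetric positive definite
x = rng.normal(size=3)

analytic = gaussian_score(x, mu, cov)
eps = 1e-6
numeric = np.array([
    (log_density(x + eps * e, mu, cov) - log_density(x - eps * e, mu, cov)) / (2 * eps)
    for e in np.eye(3)
])
print(np.allclose(analytic, numeric, atol=1e-4))  # True
```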
        
                    
             
    
14 July 2025

Scaling Offline Reinforcement Learning at Test Time

By: Nicolas Espinosa-Dice

Kempner researchers present a new algorithm for offline reinforcement learning that features a key property: self-consistency.