Deeper Learning

13 July 2026

Training Language Models That Can Continue to Learn

By: Tessa Han, Sebastian Bordt, Hanlin Zhang, and Sham Kakade

The usefulness of a pretrained model depends not only on what it has learned, but also on its plasticity, i.e., its ability to keep learning. However, despite its importance for downstream adaptation, model plasticity remains understudied in pretraining. The authors find that using stronger weight decay during pretraining improves model plasticity. Contrary to common practice, minimizing pretraining validation loss does not necessarily produce the most plastic model, i.e., the model that performs best after further training. The authors also find potential explanations for why weight decay improves plasticity, including more structured internal representations, simpler attention patterns, and reduced overfitting.
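For readers unfamiliar with the term, weight decay shrinks every parameter toward zero by a small factor at each update. The following is a minimal sketch of one common (decoupled) form, written on plain Python lists. The function name, learning rate, and decay values are illustrative assumptions and are not taken from the paper, which may use a different optimizer and parameterization.

```python
def sgd_step_with_weight_decay(weights, grads, lr=0.1, weight_decay=0.01):
    """One SGD step with decoupled weight decay (hypothetical helper).

    Each weight moves along the negative gradient and is also pulled
    toward zero by lr * weight_decay * weight.
    """
    return [w - lr * g - lr * weight_decay * w for w, g in zip(weights, grads)]


# With zero gradients, only the decay term acts: each weight is scaled by
# (1 - lr * weight_decay), so a larger weight_decay shrinks weights faster.
weak = sgd_step_with_weight_decay([1.0, -2.0], [0.0, 0.0], weight_decay=0.01)
strong = sgd_step_with_weight_decay([1.0, -2.0], [0.0, 0.0], weight_decay=0.5)
```

In this toy setting, "stronger weight decay" simply means a larger `weight_decay` value, which keeps parameter norms smaller over the course of training.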

Paper

23 June 2026

Jailbreak Scaling Laws for Large Language Models: Polynomial–Exponential Crossover

By: Indranil Halder, Annesya Banerjee and Cengiz Pehlevan

The authors report that adversarial prompt injection changes how the attack success rate on large language models scales with the number of inference-time samples: growth is slow and polynomial without injection, and exponential with it.

27 April 2026

Deep RL Needs Deep Behavior Analysis: Exploring Implicit Planning by Model-Free Agents in Open-Ended Environments

By: Riley Simmons-Edler, Ryan Badman and Kanaka Rajan

Model-free RL agents trained on simple tasks can be evaluated by whether or not they achieve their simple rewards, but complex open-ended environments demand richer analysis. ForageWorld introduces a naturalistic foraging benchmark where agents must explore, remember food patches, and survive across procedurally generated arenas. In our experiments, naive RNN agents develop structured exploration strategies, multi-factor patch memory, and implicit future planning. These behaviors mirror those documented in foraging insects and rodents and are recoverable only using neuroscience-inspired analysis tools. The approach generalizes across arena sizes, transfers readily to other recurrent architectures, and opens a path toward behavioral and neural transparency in increasingly complex AI systems.

Code
Paper

16 March 2026

Energy-Based Fine-Tuning: Beyond Next-Token Prediction

By: Samy Jelassi*, Mujin Kwun*, Rosie Zhao*, Yuanzhi Li, Nicolo Fusi, Yilun Du, Sham M. Kakade and Carles Domingo-Enrich*

The authors introduce Energy-Based Fine-Tuning (EBFT), a method that matches the long-range statistics of model generations to ground-truth sequences in high-dimensional feature spaces. EBFT corrects the error amplification caused by standard teacher-forced next-token training while improving downstream task performance. Experiments show that EBFT matches the accuracy improvements of RLVR while requiring no external reward signal or verifier. In contrast to RLVR, EBFT simultaneously improves the validation cross-entropy.

9 March 2026

InputDSA: Demixing then comparing recurrent and externally driven dynamics in complex systems

By: Ann Huang and Kanaka Rajan

The authors introduce InputDSA, a new method for measuring the similarity between two complex systems that are driven by external inputs, such as biological neural circuits or reinforcement learning agents. The method disentangles each system’s intrinsic dynamics from its input-driven effects, enabling highly accurate, robust, and efficient comparisons of those components.

4 March 2026

Structure, Disorder, and Dynamics in Task-Trained Recurrent Neural Circuits

By: David Clark*, Blake Bordelon*, Jacob Zavatone-Veth*, and Cengiz Pehlevan

The authors develop a mean-field theory of task-trained recurrent networks that continuously interpolates between these regimes, and find evidence that macaque motor cortex is best captured by an intermediate level of task-specific recurrent restructuring.

6 February 2026

Forecasting the Brain: Scalable Neural Prediction with POCO

By: Yu Duan and Kanaka Rajan

Predicting future neural activity is a critical step toward achieving real-time, closed-loop neurotechnologies. To this end, we introduce POCO, a unified forecasting model trained on…

Paper
Code

4 February 2026

Anytime Pretraining: Horizon-Free Learning-Rate Schedules with Weight Averaging

By: Alexandru Meterez*, Pranav Ajit Nair*, Depen Morwani*, Cengiz Pehlevan, Sham Kakade

The authors provide a theoretical analysis demonstrating the existence of anytime learning schedules for overparameterized linear regression, and highlight the central role of weight averaging—also known as model merging—in achieving the optimal convergence rates of stochastic gradient descent.
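As a rough illustration of why weight averaging pairs naturally with a horizon-free schedule, the sketch below runs constant-learning-rate SGD on a small synthetic linear regression and keeps a running mean of the iterates. It is a toy under stated assumptions, not the authors' schedule or their overparameterized setting: the function names, the problem size, the learning rate, and the plain uniform average are all choices made here for illustration.

```python
import random


def sgd_with_running_average(num_steps=5000, dim=5, lr=0.05, noise_std=1.0, seed=0):
    """Constant-learning-rate SGD on toy linear regression with a running iterate average.

    Returns (w_true, w_last, w_avg) as lists of floats.
    """
    rng = random.Random(seed)
    w_true = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    w = [0.0] * dim      # current SGD iterate
    w_avg = [0.0] * dim  # running mean of all iterates so far

    for step in range(1, num_steps + 1):
        # Draw one fresh sample (x, y) with y = <w_true, x> + noise.
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        y = sum(wt * xi for wt, xi in zip(w_true, x)) + rng.gauss(0.0, noise_std)

        # The gradient of 0.5 * (<w, x> - y)^2 with respect to w is residual * x.
        residual = sum(wi * xi for wi, xi in zip(w, x)) - y
        w = [wi - lr * residual * xi for wi, xi in zip(w, x)]

        # Incremental mean: it can be read off at any step, so no training
        # horizon has to be fixed in advance.
        w_avg = [a + (wi - a) / step for a, wi in zip(w_avg, w)]

    return w_true, w, w_avg


def distance(u, v):
    """Euclidean distance between two equal-length lists."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5


w_true, w_last, w_avg = sgd_with_running_average()
print("last-iterate error:", distance(w_true, w_last))
print("averaged error:    ", distance(w_true, w_avg))
```

Because the learning rate never decays, the last iterate keeps fluctuating around the solution, while the average smooths out that noise. Stopping at any step yields a usable averaged model without having planned the run length ahead of time.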

3 February 2026

Measuring and Controlling Solution Degeneracy Across Task-Trained Recurrent Neural Networks

By: Ann Huang and Kanaka Rajan

Despite reaching equal performance when trained on the same task, artificial neural networks can develop dramatically different internal solutions, much like different students solving the same math problem using completely different approaches. Our study introduces a unified framework to quantify this variability across recurrent neural network (RNN) solutions, which we term solution degeneracy, and analyzes what factors shape it across thousands of recurrent networks trained on memory and decision-making tasks.

Paper

26 January 2026

PROTON: A Relational Foundation Model for Neurological Discovery

By: Ayush Noori and Marinka Zitnik

This work introduces a relational foundation model for neurological discovery and evaluates it through discovery loops that connect AI predictions to experiments in Parkinson’s disease, bipolar disorder, and Alzheimer’s disease.
