8 May 2025
What Data Assumptions Come With Your SAE?
By: Sai Sumedh R. Hindupur*, Ekdeep Singh Lubana*, Thomas Fel*, Demba Ba
The authors show that SAEs are inherently biased toward detecting only a subset of the concepts in model activations, a subset shaped by each SAE's internal assumptions, highlighting the need for concept-geometry-aware design of novel SAE architectures.
30 April 2025
ATOMICA: Learning Universal Representations of Molecular Interactions
By: Ada Fang and Marinka Zitnik
The authors present ATOMICA, a representation learning model that captures intermolecular interactions across all molecular modalities—proteins, nucleic acids, small molecules, and ions—at atomic resolution.
28 April 2025
Interpreting the Linear Structure of Vision-Language Model Embedding Spaces
By: Isabel Papadimitriou*, Chloe Huangyuan Su*, Thomas Fel*, Stephanie Gil, Sham Kakade
The authors find a sparse linear structure within VLM embedding spaces that is shaped by modality, yet stitched together through latent bridges—offering new insight into how multimodal meaning is constructed.
25 April 2025
To Backtrack or Not to Backtrack: When Sequential Search Limits Model Reasoning
By: Tian Qin, David Alvarez-Melis, Samy Jelassi, and Eran Malach
Backtracking is widely believed to improve reasoning by letting LLMs “fix their mistakes.” But is it always the best way to use test-time compute? The authors show that backtracking is not a one-size-fits-all solution: its benefit depends on the task structure, model scale, and training paradigm.
14 April 2025
Mechanistic Interpretability: A Challenge Common to Both Artificial and Biological Intelligence
By: Demba Ba, with contributions from Sara Matias and Bahareh Tolooshams
The researchers describe DUNL (Deconvolutional Unrolled Neural Learning), a novel framework for understanding how neurons encode information when they respond to multiple factors simultaneously.
24 March 2025
TxAgent: an AI Agent for Therapeutic Reasoning Across a Universe of 211 Tools
A smarter way to navigate complex drug decisions
By: Shanghua Gao and Marinka Zitnik
The authors introduce TxAgent, a first-of-its-kind AI agent for therapeutic reasoning across a universe of 211 tools, including a comparison against DeepSeek-R1 671B.
20 March 2025
Archetypal SAEs: Adaptive and Stable Dictionary Learning for Concept Extraction in Large Vision Models
By: Thomas Fel*, Ekdeep Singh Lubana*, Jacob S. Prince, Matthew Kowal, Victor Boutin, Isabel Papadimitriou, Binxu Wang, Martin Wattenberg, Demba Ba, Talia Konkle (*denotes equal contribution)
The authors find that Archetypal SAEs anchor concepts in the convex hull of the real data, delivering consistent and stable dictionaries.
10 March 2025
Traveling Waves Integrate Spatial Information Through Time
By: Mozes Jacobs, Roberto Budzinski, Lyle Muller, Demba Ba, and T. Anderson Keller
Using recurrent neural networks trained to solve tasks that require integrating global information under constrained local connectivity, the authors find that neurons learn to encode and transmit information to spatially distant neurons through traveling waves.
10 February 2025
Alignment Reduces Conceptual Diversity of Language Models
By: Sonia Murthy, Tomer Ullman, and Jennifer Hu
The authors use a new way of measuring the conceptual diversity of synthetically generated LLM “populations” to investigate whether LLMs capture the conceptual diversity of human populations.
19 December 2024
ProCyon: A Multimodal Foundation Model for Protein Phenotypes
By: Owen Queen, Robert Calef, and Marinka Zitnik
The authors introduce ProCyon, a multimodal foundation model for modeling, generating, and predicting protein phenotypes.