Kempner Institute for the Study of Natural and Artificial Intelligence at Harvard University

Deeper Learning

A research blog from the Kempner Institute for the Study of Natural and Artificial Intelligence at Harvard University.

Blog List

2025

27 June 2025

Characterization and Mitigation of Training Instabilities in Microscaling Formats

By: Nikhil Anand and Chloe Huangyuan Su

The authors uncover consistent training instabilities when using new, highly efficient low-precision formats, which has implications for the development of next-generation AI. By pinpointing the root causes of these failures and demonstrating effective mitigation strategies, this work offers crucial insights into enabling more cost-effective and scalable model training on future hardware.

  • Preprint
  • Code

8 May 2025

What Data Assumptions Come With Your SAE?

By: Sai Sumedh R. Hindupur*, Ekdeep Singh Lubana*, Thomas Fel*, Demba Ba (*denotes equal contribution)

The authors show that sparse autoencoders (SAEs) are inherently biased: shaped by their internal assumptions, they detect only a subset of the concepts present in model activations. This highlights the need for concept geometry-aware design of novel SAE architectures.

  • Preprint
  • Code (synthetic expt.)
  • Code (formal lang. expt.)
  • Code (vision expt.)

30 April 2025

ATOMICA: Learning Universal Representations of Molecular Interactions

By: Ada Fang and Marinka Zitnik

The authors present ATOMICA, a representation learning model that captures intermolecular interactions across all molecular modalities—proteins, nucleic acids, small molecules, and ions—at atomic resolution.

  • Paper
  • GitHub
  • Huggingface

28 April 2025

Interpreting the Linear Structure of Vision-Language Model Embedding Spaces

By: Isabel Papadimitriou*, Chloe Huangyuan Su*, Thomas Fel*, Stephanie Gil, Sham Kakade (*denotes equal contribution)

The authors find a sparse linear structure within VLM embedding spaces that is shaped by modality, yet stitched together through latent bridges—offering new insight into how multimodal meaning is constructed.

  • Paper
  • Demo

25 April 2025

To Backtrack or Not to Backtrack: When Sequential Search Limits Model Reasoning

By: Tian Qin, David Alvarez-Melis, Samy Jelassi, and Eran Malach

Backtracking is widely believed to improve reasoning by letting LLMs “fix their mistakes.” But is it always the best way to use test-time compute? The authors show that backtracking is not a one-size-fits-all solution: It depends on the task structure, model scale, and training paradigm.

  • Paper
  • Code

14 April 2025

Mechanistic Interpretability: A Challenge Common to Both Artificial and Biological Intelligence

By: Demba Ba, with contributions from Sara Matias and Bahareh Tolooshams

The researchers describe DUNL (Deconvolutional Unrolled Neural Learning), a novel framework for understanding how neurons encode information when they respond to multiple factors simultaneously.

  • Paper
  • Code

24 March 2025

TxAgent: an AI Agent for Therapeutic Reasoning Across a Universe of 211 Tools

A smarter way to navigate complex drug decisions

By: Shanghua Gao and Marinka Zitnik

The authors introduce TxAgent, a first-of-its-kind AI agent for therapeutic reasoning across a universe of 211 tools, and compare it against DeepSeek-R1 671B.

  • Preprint
  • TxAgent code
  • ToolUniverse code
  • Project website
  • Huggingface

20 March 2025

Archetypal SAEs: Adaptive and Stable Dictionary Learning for Concept Extraction in Large Vision Models

By: Thomas Fel*, Ekdeep Singh Lubana*, Jacob S. Prince, Matthew Kowal, Victor Boutin, Isabel Papadimitriou, Binxu Wang, Martin Wattenberg, Demba Ba, Talia Konkle (*denotes equal contribution)

The authors find that Archetypal SAE anchors concepts in the real data’s convex hull and delivers consistent and stable dictionaries.

  • Preprint
  • Code

10 March 2025

Traveling Waves Integrate Spatial Information Through Time

By: Mozes Jacobs, Roberto Budzinski, Lyle Muller, Demba Ba, and T. Anderson Keller

Using recurrent neural networks with constrained local connectivity, trained to solve tasks that require integrating global information, the authors find that neurons learn to encode and transmit information to spatially distant neurons through traveling waves.

  • Preprint
  • Repository

10 February 2025

Alignment Reduces Conceptual Diversity of Language Models

By: Sonia Murthy, Tomer Ullman, and Jennifer Hu

The authors use a new way of measuring the conceptual diversity of synthetically generated LLM “populations” to investigate whether LLMs capture the conceptual diversity of human populations.

Science and Engineering Complex (SEC)

150 Western Avenue
Allston, MA 02134
kempnerinstitute@harvard.edu

© 2025 The President and Fellows of Harvard College