5 February 2024
Repeat After Me: Transformers are Better than State Space Models at Copying
By: Samy Jelassi, David Brandfonbrener, Sham Kakade, and Eran Malach
The improved efficiency of state space models comes at the cost of core capabilities for modern LLMs.
7 December 2023
A Next-Generation Architecture for Elastic and Conditional Computation
The Matryoshka Way
By: Aditya Kusupati, Sneha Kudugunta, Devvrit, and Tim Dettmers
Introducing an algorithmic method to elastically deploy large models: the #MatFormer.
15 November 2023
Where Do Features Come From?
A story of sinusoids and inductive biases
By: Ben Edelman, Depen Morwani, Costin Oncescu, and Rosie Zhao
Mechanistic interpretability results explained using known inductive biases.
9 November 2023
Watermarking in the Sand
By: Ben Edelman, Hanlin Zhang, and Boaz Barak
Robust watermarking in AI is impossible under natural assumptions.