16 April 2024
Distinguishing the Knowable from the Unknowable with Language Models
By: Gustaf Ahdritz, Tian Qin, Nikhil Vyas, Boaz Barak, and Ben Edelman
A new way to label different types of uncertainty in unconstrained text and simple methods to predict those labels, including a completely unsupervised approach.
5 February 2024
Repeat After Me: Transformers are Better than State Space Models at Copying
By: Samy Jelassi, David Brandfonbrener, Sham Kakade, and Eran Malach
The improved efficiency of state space models comes at the cost of core capabilities of modern LLMs, such as copying from context.
7 December 2023
A Next-Generation Architecture for Elastic and Conditional Computation
The Matryoshka Way
By: Aditya Kusupati, Sneha Kudugunta, Devvrit, and Tim Dettmers
Introducing an algorithmic method to elastically deploy large models: the #MatFormer.
15 November 2023
Where Do Features Come From?
A story of sinusoids and inductive biases
By: Ben Edelman, Depen Morwani, Costin Oncescu, and Rosie Zhao
Mechanistic interpretability results explained using known inductive biases.
9 November 2023
Watermarking in the Sand
By: Ben Edelman, Hanlin Zhang, and Boaz Barak
Robust watermarking in AI is impossible under natural assumptions.