Projects

08 Apr, 2026 iTTT (WIP)
Implicit Test-Time Training enables sequence modelling with true O(1) training memory, and may beat dense attention at long context. Currently a work in progress.
18 Feb, 2026 ZEBRA (WIP)
Train in parallel and test serially for scalable latent reasoning. Currently a work in progress.
19 Aug, 2025 CWIC
Compute Where It Counts. A new state-of-the-art method for creating sparse transformers that automatically decide when to use more or less compute.
14 Dec, 2024 pBit
Enabling both sparsity and low-bit quantization in neural networks using stochastic weights and the local reparameterization trick.
10 Jun, 2024 MonArc
Token-level residual energy-based models enable parallelizable pretraining with 2x better data efficiency than standard autoregressive models.
20 Apr, 2024 NoiseSearch
Metaheuristic search over diffusion model noise achieves better performance than best-of-N sampling (early work on test-time scaling).