mechanistic interpretability, HMMs, toy models
Jun 2025 – Present
Trying to turn the transformer inside out: studying transformers trained on Markov processes, and detecting non-linear truth representations in low-dimensional subspaces of the residual stream.