AI Brain Surgery on a Small Model: Can One SAE Feature Control Behavior?
A wiki-only SAE intervention lab on Pythia-70M: feature ranking, residualized feature knobs, and activation-gated minimal-pair behavior tests.
Local-First ML
I build reproducible ML experiments and the tooling around them—mostly interpretability work on open models—with clear methodology, measurable results, and artifacts you can run locally.
Experiment reports, methodology notes, and engineering write-ups from an active interpretability lab.
Layer sweeps, top-k sparsity sweeps, and an interactive explorer for SAE feature discovery on a small open-weight model.
A local-only SAE feature probing experiment on Pythia-70M: activation extraction, sparse autoencoder training, cross-domain generalization, and blog-ready artifacts.
Pairing with GPT-5-Codex to turn the SSA baby-name dataset into a polished Go CLI with weighted sampling, charts, and automated releases.
Current work focuses on sparse autoencoders, feature steering, and practical interpretability protocols that stay reproducible on local hardware.
I prioritize measurable experiments over abstract claims: setup details, reported metrics, failure cases, and artifacts that are runnable without cloud lock-in.
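For a concrete flavor of what "feature steering" means here, below is a minimal sketch of nudging Pythia-70M's residual stream along one SAE decoder direction via a forward hook. It assumes a trained SAE decoder matrix is already on hand; `W_dec`, `feature_idx`, `alpha`, and `layer_idx` are illustrative placeholders, not values or code from any post on this site.

```python
# Minimal sketch of single-feature steering on Pythia-70M.
# Assumes a trained SAE decoder matrix W_dec of shape
# [n_features, d_model]; the random placeholder and knob
# settings below are hypothetical, for illustration only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/pythia-70m"
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

d_model = model.config.hidden_size            # 512 for Pythia-70M
W_dec = torch.randn(16384, d_model)           # placeholder for a real SAE decoder
feature_idx, alpha, layer_idx = 1234, 4.0, 3  # hypothetical knob settings

# Steer along a unit-norm feature direction from the decoder.
direction = W_dec[feature_idx]
direction = direction / direction.norm()

def steer(module, inputs, output):
    # A GPTNeoX layer returns a tuple; output[0] is the residual-stream
    # hidden state [batch, seq, d_model]. Add the scaled feature
    # direction at every token position.
    hidden = output[0] + alpha * direction.to(output[0].dtype)
    return (hidden,) + output[1:]

handle = model.gpt_neox.layers[layer_idx].register_forward_hook(steer)
try:
    ids = tok("The capital of France is", return_tensors="pt")
    out = model.generate(**ids, max_new_tokens=10, do_sample=False)
    print(tok.decode(out[0], skip_special_tokens=True))
finally:
    handle.remove()  # always detach the hook so later runs are clean
```

Comparing generations with the hook attached versus removed, across minimal-pair prompts, is the basic shape of the behavior tests described above.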
Curtis Covington is a software engineer at Oracle Cloud Infrastructure (OCI), where he works on large-scale distributed systems and cloud control-plane infrastructure. He holds a Master’s degree from the University of Colorado Boulder and has a background spanning systems engineering, infrastructure tooling, and applied machine learning.
His current research focuses on AI interpretability, representation learning, and mechanistic analysis of transformer models. Through hands-on experimentation with sparse autoencoders and feature-level interventions, he explores how internal model representations form and how they can be measured, steered, and better understood.
This site serves as a technical research log — documenting experiments, implementation details, and lessons learned while building practical interpretability tooling.