# Jake Ewen

> ML engineer working at the intersection of machine learning, philosophy of science, epistemology, and game theory.

## About

Jake Ewen is an ML engineer who researches how systems organize at unexpected scales. His work spans machine learning research (grokking dynamics, training dynamics), philosophy of mind (the bearer problem, methodology for studying LLM cognition), formal verification (Lean 4 proofs for detection theory), and fiction. He is based in the US.

Website: https://jacobewen.com
Email: jacobt.ewen@gmail.com

## Work

### Grokking Dynamics

https://jacobewen.com/work/grokking

Systematic experimental decomposition of the grokking delay on modular arithmetic. Key findings:

1. The grokking delay is amplified by optimizer momentum but not caused by it.
2. Weight decay is the primary mechanism: it erodes memorization until compact generalizing circuits take over.
3. The "grokked" state is a dynamical equilibrium, not a fixed point.
4. Curvature increases during memorization and then declines, contradicting the flat-basin intuition.

~900 experimental jobs across parameter sweeps. A sketch of the standard experimental setup appears in the appendix at the end of this page.

### The Bearer Problem

https://jacobewen.com/work/bearer-problem

When people argue about whether LLMs understand, they skip a prior question: what thing are we attributing understanding to? Maps six philosophical frameworks across four deployment pressure tests (multiplicity, topology, dormancy, copying) that biological organisms never face. Shows that behavioral disagreements often trace back to silent commitments about different bearers of cognition.

### Hard Substrates, Soft Evidence

https://jacobewen.com/work/hard-substrates

Methodological diagnosis of the LLM cognition debate. Identifies four technical errors: skeptics misdescribe both computation and training, optimists over-infer from behavioral evidence, and careful work stays architecture-bound. Proposes a four-source methodology: behavioral evidence, internal probing, causal intervention, and cross-architectural replication.

## Writing

### Training as Selection

https://jacobewen.com/writing/training-as-selection

Examines whether the Price equation, a substrate-neutral accounting identity for directed change, organizes neural network training dynamics (its standard form is given in the appendix). Uses grokking experimental data to compare biological selection and gradient descent: both produce environment-relative fitness, dynamic equilibria, and pressure-dependent regime taxonomies. Introduces quasispecies theory as a way to dissolve the individuation problem for neural circuits.

## Key Research Interests

- Grokking and phase transitions in neural network training
- Philosophy of LLM cognition and methodology for studying it
- The Price equation and selection dynamics in non-biological systems
- Emergence, computation, and multi-scale organization
- Epistemology and game theory
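
## Appendix

A minimal sketch of the standard modular-arithmetic setup in which the grokking findings above are observed. This is an illustrative reconstruction, not the project's experimental code: the modulus, the architecture (a small MLP standing in for the small transformer more common in the grokking literature), the 50% train split, and the optimizer settings are all assumptions chosen to make delayed generalization easy to observe.

```python
# Sketch of the standard grokking setup: learn (a + b) mod P from half the
# pairs, with full-batch AdamW and strong weight decay. All hyperparameters
# here are illustrative assumptions, not values from the ~900-job sweep.
import torch
import torch.nn as nn

P = 97  # modulus; task is (a + b) mod P
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

# Enumerate all (a, b) pairs and split them into train/validation halves.
pairs = torch.cartesian_prod(torch.arange(P), torch.arange(P))
labels = (pairs[:, 0] + pairs[:, 1]) % P
perm = torch.randperm(len(pairs))
n_train = len(pairs) // 2  # grokking is sensitive to this fraction
train_idx, val_idx = perm[:n_train].to(DEVICE), perm[n_train:].to(DEVICE)
X, y = pairs.to(DEVICE), labels.to(DEVICE)

class ModAddMLP(nn.Module):
    """Embed both operands, concatenate, and classify the sum mod P."""
    def __init__(self, dim=128):
        super().__init__()
        self.embed = nn.Embedding(P, dim)
        self.net = nn.Sequential(nn.Linear(2 * dim, 256), nn.ReLU(), nn.Linear(256, P))

    def forward(self, ab):
        return self.net(self.embed(ab).flatten(1))  # (batch, 2) -> (batch, P)

model = ModAddMLP().to(DEVICE)
# Strong weight decay is the ingredient finding (2) above identifies as the
# primary driver of eventual generalization.
opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1.0)
loss_fn = nn.CrossEntropyLoss()

for step in range(50_000):
    model.train()
    opt.zero_grad()
    loss = loss_fn(model(X[train_idx]), y[train_idx])
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        model.eval()
        with torch.no_grad():
            tr = (model(X[train_idx]).argmax(-1) == y[train_idx]).float().mean()
            va = (model(X[val_idx]).argmax(-1) == y[val_idx]).float().mean()
        print(f"step {step:6d}  loss {loss.item():.4f}  train {tr:.3f}  val {va:.3f}")
```

In this regime, train accuracy typically saturates early while validation accuracy stays near chance for many thousands of steps before jumping: that gap is the grokking delay the experiments above decompose.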
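For reference, the Price equation discussed in Training as Selection, in its standard form, where $w_i$ is the fitness of entity $i$, $z_i$ its trait value, $\Delta z_i$ the trait change from parent to offspring, and overbars denote population means:

$$
\bar{w}\,\Delta\bar{z} \;=\; \operatorname{Cov}(w_i, z_i) \;+\; \mathbb{E}\!\left[w_i\,\Delta z_i\right]
$$

The covariance term accounts for change due to selection, and the expectation term for change during transmission. The identity holds for any population of entities with fitness and trait values, which is what makes it substrate-neutral.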