Hard Substrates, Soft Evidence

Philosophy / methodology · Revision in progress

Written February 2026 · Draft — may change substantially · Medium confidence — some uncertainty

Read the full paper on PhilArchive →

The computational substrate fully determines behavioral output, but behavior radically underdetermines the substrate. You can't read emergence backward from outputs.

The argument

The LLM cognition debate is methodologically stuck. This paper diagnoses four technical errors sustaining the impasse:

Skeptics misdescribe computation. The "stochastic parrot" metaphor is wrong about what happens at inference: the computation is geometric transformation in high-dimensional space, not text stitching. Interpretability findings (induction heads, arithmetic circuits) demonstrate discoverable internal structure that doesn't reduce to surface statistics.
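The "geometric transformation" point can be made concrete with a minimal numpy sketch of one transformer-style residual update. The dimensions and random weights here are purely illustrative, not any real model: each step moves token vectors through the embedding space; no stored text is retrieved anywhere.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 8                                  # toy model width (illustrative only)
x = rng.normal(size=(4, d))            # 4 token embeddings as points in R^d

# Small random weights standing in for learned parameters (hypothetical).
W_v = rng.normal(size=(d, d)) * 0.1
W_o = rng.normal(size=(d, d)) * 0.1
W_mlp = rng.normal(size=(d, d)) * 0.1

# Attention-like mixing: each token becomes a weighted sum of transformed
# positions, added back to the residual stream -- a geometric move, not lookup.
scores = x @ x.T / np.sqrt(d)
attn = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
x = x + attn @ x @ W_v @ W_o

# Pointwise MLP stub: another nonlinear transformation of the same points.
x = x + np.maximum(x @ W_mlp, 0) @ W_mlp.T

# The "output" is a trajectory of vectors through high-dimensional space;
# text only appears after a final projection to token logits (omitted here).
```

The design point is that every operation above is a map on vectors in R^d; "stitching text" is not a primitive the computation has access to.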

Skeptics misdescribe training. "Statistical learning from text" describes the GPT-2 era, not current systems. Post-training (RLHF, DPO, tool use, environment feedback) shifts optimization from descriptive to normative.

Optimists over-infer from behavioral evidence. Broad behavioral competence is consistent with many different internal organizations. Behavioral evidence can't establish what optimists claim because it's the wrong kind of evidence.

Aside: This is a dimensionality problem: behavior is a low-dimensional projection of a high-dimensional internal state, so many distinct internal organizations produce the same behavioral output.
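The projection argument can be shown in a few lines of numpy. The setup is a hypothetical toy (a random 100-dimensional "substrate" read out through a fixed 3-dimensional "behavioral" projection, with dimensions chosen only for illustration): any displacement in the projection's null space yields a substantially different internal state with identical behavior.

```python
import numpy as np

rng = np.random.default_rng(0)

# Behavior as a low-dimensional readout of a high-dimensional internal state.
proj = rng.normal(size=(3, 100))   # 100-dim substrate -> 3-dim behavior
base = rng.normal(size=100)        # one possible internal state

# Directions invisible to the readout: the null space of the projection,
# obtained from the rows of V^T beyond the projection's rank.
_, _, vt = np.linalg.svd(proj)
null_dirs = vt[3:]                 # 97 orthonormal null-space directions

alt = base + 5.0 * null_dirs[0]    # a very different internal state...
assert np.allclose(proj @ base, proj @ alt)   # ...with identical behavior
```

Here a 97-dimensional family of distinct substrates is behaviorally indistinguishable, which is exactly why behavioral evidence underdetermines internal organization.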

Careful work is architecture-bound. Most generalizations about "LLMs" are actually claims about transformers. The reversal curse (solved by diffusion architectures) proves some limitations are architecture-specific, not facts about learning or cognition in general.

The proposal

A four-source methodology for studying LLM cognition: behavioral evidence (necessary but insufficient), internal probing (direct access to the substrate), causal intervention (shows that internal structure is causally implicated in behavior), and cross-architectural replication (distinguishes architecture-specific findings from general ones). Convergent findings across all four constrain the space of tenable positions.

Aside: The four-source methodology is essentially Lakatos's research-programme methodology adapted for empirical AI: behavioral evidence supplies the "novel predictions," internal probing gives access to the theoretical core, causal intervention is the experimental test, and cross-architectural replication distinguishes the hard core from the protective belt.
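The contrast between probing and causal intervention can be sketched with a hypothetical two-layer toy network (all names and sizes invented for illustration): probing reads an internal activation, while intervention overwrites it and checks whether behavior changes.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy network; hidden unit 0 is conjectured to carry some feature.
W1, W2 = rng.normal(size=(5, 8)), rng.normal(size=(8, 2))

def forward(x, patch=None):
    h = np.maximum(x @ W1, 0)      # internal state: what probing reads out
    if patch is not None:
        h = h.copy()
        h[0] = patch               # intervention: overwrite the unit's activation
    return h @ W2                  # behavior: what benchmarks measure

x = np.abs(rng.normal(size=5))
clean = forward(x)
patched = forward(x, patch=5.0)    # force the unit to a fixed value

# A behavioral change under patching shows the unit is causally implicated,
# not merely correlated with the output; probing alone cannot tell these apart.
assert not np.allclose(clean, patched)
```

This is the sense in which the third evidence source adds something the second cannot: a probe that decodes a feature from `h` is compatible with that feature being epiphenomenal, while a patch that alters behavior is not.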

Connections

The Bearer Problem: Companion paper. This paper diagnoses methodology; Bearer diagnoses ontology. Both block the same premature conclusions.
Grokking Dynamics: An instance of the four-source methodology in practice. Behavioral evidence (accuracy curves) was misleading until internal probing (Hessian, gradient rank) revealed the mechanism.