TL;DR: In theory, RNNs with infinite-precision arithmetic can be Turing-complete. In practice, with finite precision (float32), RNNs are limited to recognizing regular languages: they cannot reliably count or simulate a stack. The paper proves this gap formally and demonstrates it experimentally.
The Problem
Theoretical results show that recurrent neural networks are computationally powerful — under idealized assumptions, they can simulate Turing machines. But these proofs require infinite precision arithmetic: the ability to store and manipulate real numbers with unbounded accuracy.
Real implementations use float32 — 32 bits of floating-point precision. This is a far cry from infinite. So the question becomes: what can RNNs actually compute in practice?
The Key Idea
With finite precision, each component of the hidden state can take on only a finite number of distinguishable values. A 32-bit float has at most 2^32 distinct values. If the hidden state is a vector of d floats, there are at most 2^(32d) possible states.
This means the RNN is effectively a finite automaton: it cannot maintain unbounded counters or stacks. A language like aⁿbⁿ (which requires checking that exactly n b's follow n a's) can only be recognized up to some bounded n, because the counter eventually saturates or wraps around.
Core result: Finite precision constrains any RNN to a finite number of reachable configurations. No matter how cleverly the network is trained, it cannot escape the computational class of regular languages when precision is bounded.
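The saturation effect is easy to reproduce. A minimal sketch (using float16 rather than float32 so the limit appears at a small, visible threshold; the mechanism is identical):

```python
import numpy as np

# Increment a counter stored in 16-bit floats. float16 represents
# integers exactly only up to 2048; beyond that, adding 1 falls below
# the float's resolution and rounds away. float32 behaves the same
# way, just with a larger threshold (2**24).
c = np.float16(0.0)
for _ in range(3000):
    c = c + np.float16(1.0)

print(c)  # 2048.0 -- the counter silently stops increasing
```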
Interactive Demo: Precision vs. Power
See how precision limits what an RNN can recognize. Choose a language, adjust the precision, pick a string length, and watch the RNN process the string step by step.
RNN Finite Precision Simulator
At low precision the hidden state saturates — the RNN can no longer distinguish different counts and misclassifies long strings.
Precision vs. Recognizable Language
| Precision | Distinct States | Max n for aⁿbⁿ | Effective Language Class |
|---|---|---|---|
| p bits per component, d components | at most 2^(pd) | bounded: the counter saturates | Regular |
How It Works
Finite Precision Implies Finite States
An RNN updates its hidden state at each time step: h_t = f(h_{t-1}, x_t). With infinite precision, h ranges over all reals and the RNN can encode arbitrary information. With finite precision, h is quantized to a finite set of representable values.
A system with finitely many states that reads input one symbol at a time is, by definition, a finite automaton. Finite automata recognize exactly the regular languages.
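This can be made concrete by exhaustively enumerating the reachable quantized states of a toy one-unit RNN. Everything here (the update rule, the quantization grid, the weights) is illustrative, not taken from the paper:

```python
import math

LEVELS = 16  # quantization grid: 16 steps across [-1, 1]

def quantize(h):
    # Snap the hidden state to the nearest representable value.
    return round(h * (LEVELS / 2)) / (LEVELS / 2)

def step(h, x, w=0.5, u=0.9):
    # One Elman-style update, h_t = f(h_{t-1}, x_t), then quantization.
    return quantize(math.tanh(w * x + u * h))

# Breadth-first search over all (state, symbol) transitions.
reachable = {0.0}
frontier = [0.0]
while frontier:
    h = frontier.pop()
    for x in (0.0, 1.0):  # binary input alphabet
        nxt = step(h, x)
        if nxt not in reachable:
            reachable.add(nxt)
            frontier.append(nxt)

print(len(reachable))  # finite: the quantized RNN is literally a DFA
```

The reachable set here can never exceed LEVELS + 1 states; reading off the transitions between them yields an ordinary DFA transition table.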
What Gets Lost
- Counting: To recognize aⁿbⁿ, the RNN must count the number of a's and then verify that the same number of b's follows. But a finite-precision counter can only count up to some maximum before it saturates or wraps around.
- Stack simulation: Context-free languages like balanced parentheses require a stack. A stack of unbounded depth needs unbounded memory, which finite precision cannot provide.
- Turing completeness: Simulating a Turing machine tape requires storing arbitrary amounts of information in the hidden state — impossible with finitely many distinct values.
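The counting failure in particular can be sketched directly. Below, a toy aⁿbⁿ checker keeps its count in a float16 cell (a stand-in for one finite-precision hidden unit); for brevity it checks only the counts, not the symbol order, which is enough to show the failure mode:

```python
import numpy as np

def accepts(s):
    # Count +1 per 'a' and -1 per 'b' in a float16 cell; accept at zero.
    c = np.float16(0.0)
    for ch in s:
        c = c + np.float16(1.0) if ch == "a" else c - np.float16(1.0)
    return bool(c == 0)

print(accepts("a" * 100 + "b" * 100))    # True:  short strings are fine
print(accepts("a" * 3000 + "b" * 3000))  # False: the count saturated at 2048
print(accepts("a" * 3000 + "b" * 2048))  # True:  a wrong string is accepted
```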
The Hierarchy Collapses
In the Chomsky hierarchy, regular languages are the weakest class (Type 3). Context-free (Type 2), context-sensitive (Type 1), and recursively enumerable (Type 0) languages all require more computational power. Theoretical results place infinite-precision RNNs at Type 0. Finite precision collapses them all the way down to Type 3.
Key insight: The gap between theory and practice is not incremental — it spans the entire Chomsky hierarchy. Practical RNNs are fundamentally weaker than their theoretical idealization.
Results
The paper provides both theoretical proofs and empirical validation:
- Theoretical: A formal proof that any fixed-precision RNN recognizes a regular language, regardless of architecture (Elman, GRU, LSTM) or training.
- Empirical: Experiments show that trained RNNs generalize well on regular languages (e.g., a*b*) but fail to generalize on context-free languages (e.g., aⁿbⁿ) beyond the lengths seen during training.
- LSTM advantage: LSTMs, with their additive gating, can maintain counters more effectively than Elman RNNs for moderate lengths — but ultimately still fail for sufficiently long strings.
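The gating difference can be illustrated with a toy comparison (assumed update rules, not the paper's trained networks): an Elman-style unit squashes its state through tanh at every step, so counts blur together almost immediately, while an additive LSTM-style cell counts exactly until float resolution runs out.

```python
import math

def elman_count(n):
    # tanh squashing at each step: the state converges to a fixed point.
    h = 0.0
    for _ in range(n):
        h = math.tanh(h + 1.0)
    return h

def additive_count(n):
    # LSTM-style additive cell update (gate held open): an exact counter.
    c = 0.0
    for _ in range(n):
        c += 1.0
    return c

# The Elman state cannot tell 30 symbols from 60 -- it has converged:
print(abs(elman_count(30) - elman_count(60)))  # effectively zero

# The additive cell still distinguishes them exactly:
print(additive_count(30), additive_count(60))  # 30.0 60.0
```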
These results help explain why RNNs struggle with tasks requiring precise counting or hierarchical structure — it is not just a training problem but a fundamental capacity limitation imposed by finite precision.