ACL 2018

On the Practical Computational Power of Finite Precision RNNs for Language Recognition

Gail Weiss, Yoav Goldberg, Eran Yahav

TL;DR — In theory, RNNs with infinite precision can be Turing-complete. But in practice, with finite precision (float32), RNNs are limited to recognizing regular languages — they can't reliably count or use a stack. This paper proves this gap and shows experimental evidence.

The Problem

Theoretical results show that recurrent neural networks are computationally powerful — under idealized assumptions, they can simulate Turing machines. But these proofs require infinite precision arithmetic: the ability to store and manipulate real numbers with unbounded accuracy.

Real implementations use float32 — 32 bits of floating-point precision. This is a far cry from infinite. So the question becomes: what can RNNs actually compute in practice?
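This limit is easy to observe directly. The sketch below (pure standard library; `struct` is used to round Python's 64-bit floats to float32) shows that a float32 counter silently stops incrementing at 2^24, because above that point the gap between consecutive representable values exceeds 1:

```python
import struct

def to_float32(x):
    """Round a Python float (64-bit) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

# A float32 counter increments correctly only while adjacent integers
# are representable. Above 2^24, consecutive float32 values are more
# than 1 apart, so "+1" rounds back to the same value.
n = to_float32(2.0 ** 24)              # 16777216.0: exactly representable
assert to_float32(n + 1.0) == n        # the counter silently stops advancing

m = to_float32(2.0 ** 24 - 1.0)        # just below the threshold...
assert to_float32(m + 1.0) == m + 1.0  # ...increments still work
```

This is standard IEEE 754 binary32 behavior (a 24-bit significand), not an artifact of the simulation.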

The Key Idea

With finite precision, each component of the hidden state can only take on a finite number of distinguishable values. A 32-bit float has at most 2^32 distinct values. If the hidden state is a vector of d floats, there are at most (2^32)^d = 2^(32d) possible states.

This means the RNN is effectively a finite automaton. It cannot maintain unbounded counters or stacks. Languages like a^n b^n (which require counting that n a's are matched by n b's) can only be recognized up to some bounded n — the counter eventually saturates or wraps around.

Core result: Finite precision constrains any RNN to a finite number of reachable configurations. No matter how cleverly the network is trained, it cannot escape the computational class of regular languages when precision is bounded.
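The saturation failure mode can be made concrete with a toy recognizer for a^n b^n whose counter is clamped to a fixed bit width (a simplified stand-in for a bounded hidden state; the function name and 8-bit default are illustrative, not from the paper):

```python
def recognize_anbn(s, precision_bits=8):
    """Recognize a^n b^n with a counter clamped to finite precision.

    The counter saturates at 2**precision_bits - 1, mimicking a hidden
    state that can only take finitely many distinguishable values.
    """
    cap = 2 ** precision_bits - 1
    count, seen_b = 0, False
    for ch in s:
        if ch == "a":
            if seen_b:
                return False             # 'a' after 'b': reject
            count = min(count + 1, cap)  # saturating increment
        elif ch == "b":
            seen_b = True
            if count == 0:
                return False             # more b's than (saturated) a's
            count -= 1
        else:
            return False
    return count == 0

# Within capacity, counting works...
assert recognize_anbn("a" * 100 + "b" * 100)
assert not recognize_anbn("a" * 100 + "b" * 99)

# ...but once the counter saturates at 255, distinct counts collapse:
assert recognize_anbn("a" * 300 + "b" * 255)      # false positive
assert not recognize_anbn("a" * 300 + "b" * 300)  # false negative
```

Past the saturation point, 300 a's and 255 a's land in the same state, so the recognizer can no longer tell matched from mismatched strings — exactly the bounded-n behavior described above.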

Interactive Demo: Precision vs. Power

See how precision limits what an RNN can recognize. Choose a language, adjust the precision, pick a string length, and watch the RNN process the string step by step.

RNN Finite Precision Simulator

[Interactive widget: a precision selector (e.g., 8-bit, giving at most 256 distinguishable states), a playback speed control, the input string, and a live view of the hidden-state counter as each symbol is read.]

At low precision the hidden state saturates — the RNN can no longer distinguish different counts and misclassifies long strings.

Precision vs. Recognizable Language

[Table: for each precision level, the number of distinct states, the maximum n for which a^n b^n is still recognized, and the effective language class.]

How It Works

Finite Precision Implies Finite States

An RNN updates its hidden state at each time step: h_t = f(h_{t-1}, x_t). With infinite precision, h ranges over all reals and the RNN can encode arbitrary information. With finite precision, h is quantized to a finite set of representable values.

A system with finitely many states that reads input one symbol at a time is, by definition, a finite automaton. Finite automata recognize exactly the regular languages.
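A minimal sketch of this argument, using a toy 1-dimensional tanh RNN whose state is snapped to a fixed grid (the weights, quantization level, and two-symbol alphabet are illustrative assumptions): a breadth-first search over all input sequences visits only finitely many states, so the quantized network behaves as a finite automaton over those states.

```python
import math
from collections import deque

def quantize(h, levels=64):
    """Snap h in [-1, 1] to one of finitely many representable values."""
    return round(h * levels) / levels

def step(h, x, w_h=0.9, w_x=0.5):
    """One update of a toy 1-d tanh RNN, followed by quantization."""
    return quantize(math.tanh(w_h * h + w_x * x))

# Breadth-first search over all reachable states: quantization confines
# the state to a finite grid, so the search must terminate, and the
# transition table it discovers is exactly a finite automaton.
alphabet = [-1.0, 1.0]
start = quantize(0.0)
seen, frontier = {start}, deque([start])
while frontier:
    h = frontier.popleft()
    for x in alphabet:
        nxt = step(h, x)
        if nxt not in seen:
            seen.add(nxt)
            frontier.append(nxt)

print(f"reachable states: {len(seen)}")  # finite: at most 2*levels + 1
assert len(seen) <= 2 * 64 + 1
```

With infinite precision the same search would never terminate, since tanh updates generically produce fresh real values forever; quantization is what makes the state set — and hence the language class — finite.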

What Gets Lost

The Hierarchy Collapses

In the Chomsky hierarchy, regular languages are the weakest class (Type 3). Context-free (Type 2), context-sensitive (Type 1), and recursively enumerable (Type 0) languages all require more computational power. Theoretical results place infinite-precision RNNs at Type 0. Finite precision collapses them all the way down to Type 3.

Key insight: The gap between theory and practice is not incremental — it spans the entire Chomsky hierarchy. Practical RNNs are fundamentally weaker than their theoretical idealization.

Results

The paper provides both theoretical proofs and empirical validation: formally, bounded precision yields finitely many reachable hidden states, limiting the network to regular languages; empirically, trained networks recognize counting languages such as a^n b^n only up to a bounded n, beyond which the saturated hidden state causes misclassification.

These results help explain why RNNs struggle with tasks requiring precise counting or hierarchical structure — it is not just a training problem but a fundamental capacity limitation imposed by finite precision.

@inproceedings{weiss2018practical,
  title     = {On the Practical Computational Power of Finite Precision {RNNs} for Language Recognition},
  author    = {Weiss, Gail and Goldberg, Yoav and Yahav, Eran},
  booktitle = {Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL)},
  year      = {2018}
}