In this post, we examine the claim that natural languages are ε-ambiguity languages in the sense defined by the probabilistic theories of language and latent-intention inference in (Jiang, 2023). We survey linguistic, psycholinguistic, and computational evidence demonstrating that natural languages exhibit precisely this structure.
Natural languages support reliable communication despite variability, noise, and structural underspecification. Unlike programming languages, they allow metaphor, ellipsis, ambiguity, deixis, and context-dependent meaning. Yet humans typically recover the intended meaning with high accuracy.
Recent theoretical work formalizes this intuition: the intended meaning dominates interpretation, but alternative interpretations occur with small but non-zero probability. In this framework, although many meanings are technically compatible with a linguistic expression, one meaning dominates the posterior probability, and ambiguity occurs only with small probability ε(x). As in (Jiang, 2023):
A language is an ε-ambiguity language if for every meaningful expression x:
\[\Pr(\theta_0 \mid x) \ge 1 - \varepsilon(x),\]where θ₀ is the intended meaning, and ε(x) quantifies residual ambiguity.
This describes a sparse posterior over meanings: a dominant intention and a long but very small tail of alternatives.
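To make the shape of such a posterior concrete, here is a tiny numeric illustration; the probabilities below are invented for the example, not estimated from any corpus.

```python
# Hypothetical posterior over candidate meanings for a single utterance x.
# The numbers are illustrative only, chosen to show one dominant intention
# plus a small tail of alternatives.
posterior = {
    "theta_0 (intended)": 0.970,
    "theta_1": 0.012,
    "theta_2": 0.010,
    "theta_3": 0.005,
    "theta_4": 0.003,
}

theta_0 = max(posterior, key=posterior.get)   # dominant intention
epsilon = 1.0 - posterior[theta_0]            # residual ambiguity eps(x)
print(theta_0, f"eps(x) = {epsilon:.3f}")     # -> eps(x) = 0.030
```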
This model makes concrete predictions about how ambiguity is distributed and resolved in natural languages; the remainder of this post examines the evidence for them.
The fundamental principle behind ε-ambiguity languages is that linguistic expressions exhibit partial but not perfect semantic determinacy. Messages tend to convey one meaning with high probability, but never with absolute certainty. Real-world natural languages possess exactly these properties: they support efficient communication despite intrinsic ambiguity, and ambiguity is controlled by contextual, semantic, and pragmatic mechanisms.
The ε-ambiguity framework formalizes this intuition within a probabilistic generative model of communication, where meanings (latent intentions θ) are drawn from a space Θ, and surface messages are generated by noisy processes with intention-specific distributions q(x ∣ θ). The framework provides a mathematical explanation for why LLMs can infer hidden meanings from text and why phenomena such as chain-of-thought reasoning reduce uncertainty.
The central question of this article is:
What empirical and theoretical evidence supports the view that natural languages satisfy the definition of ε-ambiguity languages?
We show below that supporting evidence comes from multiple research domains.
Natural languages contain extensive lexical ambiguity: most common words carry multiple senses (for example, “bank” as a financial institution versus the side of a river), yet in context one sense overwhelmingly dominates interpretation.
This illustrates exactly the condition:
\[\Pr(\theta_0 \mid x) \approx 1 - \varepsilon(x), \quad \varepsilon(x) \text{ small but nonzero},\]where θ₀ is the dominant sense.
Classic syntactic ambiguities (e.g., “I saw the man with the telescope”) allow multiple parses, yet listeners overwhelmingly adopt one interpretation when context is provided. Probabilistic grammars assign steeply skewed probability distributions to the competing parses, with one parse receiving nearly all of the probability mass.
Pragmatics often shifts literal meanings to intended meanings: an indirect request such as “Can you pass the salt?” is understood as a request rather than a question about ability.
Research on speech-act recognition shows that listeners reliably identify the intended force of an utterance despite such surface ambiguity.
Humans use contextual probabilities to resolve ambiguity almost instantaneously. This supports the claim that the posterior over meanings is sharply peaked, i.e., Pr(θ₀ ∣ x) ≈ 1 − ε(x) with ε(x) small. Work on contextual integration shows that listeners combine lexical, syntactic, and discourse cues to converge rapidly on a single interpretation.
The ε-ambiguity model predicts, and under independence assumptions proves, that when a listener receives multiple messages \((x_1, x_2, \dots, x_m)\) generated from the same θ:
\[\varepsilon_{\text{combined}} \approx \varepsilon(x_1)\varepsilon(x_2)\cdots\varepsilon(x_m).\]This multiplicative decay of ε aligns with psychological evidence that humans aggregate cues across words, sentences, and discourse, and it explains why interpretations become dramatically more confident as context accumulates. Thus ε plays a central role in controlling inference quality.
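The following minimal sketch illustrates this combination rule; it is a toy Bayesian listener written for this post (not code from Jiang, 2023), with an invented intention space and likelihood table.

```python
# Toy Bayesian listener over a small discrete intention space, illustrating
# that residual ambiguity eps = 1 - Pr(theta_0 | x_1..x_m) shrinks roughly
# multiplicatively as conditionally independent messages accumulate.
import numpy as np

n_intents, vocab = 5, 20
prior = np.full(n_intents, 1.0 / n_intents)

# q(x | theta): each intention strongly prefers four surface forms but leaks
# a little probability onto every other form -- the source of ambiguity.
likelihood = np.full((n_intents, vocab), 0.01)
for theta in range(n_intents):
    likelihood[theta, 4 * theta : 4 * (theta + 1)] = 0.25
likelihood /= likelihood.sum(axis=1, keepdims=True)

def posterior(messages):
    """Pr(theta | x_1..x_m), combining independent messages in log space."""
    log_post = np.log(prior) + sum(np.log(likelihood[:, x]) for x in messages)
    post = np.exp(log_post - log_post.max())
    return post / post.sum()

true_theta = 2
msgs = [8, 9, 10, 11]        # four surface forms compatible with theta = 2
for m in range(1, len(msgs) + 1):
    eps = 1.0 - posterior(msgs[:m])[true_theta]
    print(f"m={m}: eps_combined = {eps:.2e}")  # drops roughly 25x per message
```

Each additional message multiplies the evidence for θ₀, so the tail of probability mass on alternatives shrinks geometrically, mirroring the ε_combined ≈ ε(x₁)ε(x₂)⋯ε(x_m) behaviour described above.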
Conversation analysis shows that interlocutors detect and repair misunderstandings quickly, keeping the effective ambiguity of a dialogue low. Grice’s cooperative principle likewise explains why listeners default to the most plausible intended meaning, even when multiple interpretations are technically possible.
Probabilistic models such as topic models, PCFGs, and neural parsers show extreme sparsity in the joint distribution of meanings and linguistic forms. This sparsity is exactly the structure assumed for ε-ambiguity languages.
LLMs themselves reveal ε-like behavior:
When prompts are under-specified, LLM outputs diverge across samples, demonstrating non-zero ε(x). When prompts are clarified or expanded (e.g., chain-of-thought prompting), the model’s output variance collapses, which can be read as the effective ε(x) decreasing multiplicatively with each additional piece of linguistic evidence.
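One way to probe this empirically (a sketch under strong assumptions: `sample_completion` below is a hypothetical placeholder for whatever model-sampling interface you have, not a real library call) is to estimate an effective ε(x) as the fraction of samples that disagree with the modal answer.

```python
# Sketch only: estimate an effective eps(x) for a prompt by sampling the
# model repeatedly and measuring how much mass falls outside the modal answer.
from collections import Counter

def sample_completion(prompt: str) -> str:
    """Hypothetical placeholder: return one sampled answer for `prompt`."""
    raise NotImplementedError("plug in your own model call here")

def empirical_epsilon(prompt: str, n_samples: int = 50) -> float:
    """1 - frequency of the modal answer: a crude proxy for eps(x)."""
    answers = Counter(sample_completion(prompt) for _ in range(n_samples))
    modal_count = answers.most_common(1)[0][1]
    return 1.0 - modal_count / n_samples

# Under the eps-ambiguity reading, a clarified prompt should yield a smaller
# empirical epsilon than an underspecified one, e.g.:
#   empirical_epsilon("Summarize it.")
#   empirical_epsilon("Summarize the attached meeting notes in two sentences.")
```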
The theoretical models show that concatenating independent messages reduces ambiguity roughly like:
\[\varepsilon_{\text{combined}} \approx \prod_i \varepsilon(x_i).\]This aligns with empirical improvements when LLMs receive additional context, such as clarified instructions, worked examples, or intermediate reasoning steps.
Work on instruction tuning points in the same direction: making the intended task more explicit reduces the effective ambiguity of a prompt.
The empirical observations in previous sections strongly suggest that natural languages align with the ε-ambiguity formalism. In this section, we present a more formal theoretical justification supporting the necessity of ε-ambiguity for any human language capable of large-scale communication, inference, and compositional generalization.
Let θ denote a latent intention and x a surface linguistic signal. Human communication is characterized by:
- Inherent variability in production: speakers do not produce perfectly deterministic signals for their intentions.
- Redundancy and recoverability in comprehension: listeners consistently recover the intended meaning despite this variability.
This requires that the posterior distribution over intentions, Pr(θ ∣ x), be sharply peaked but not a delta distribution. Formally, this implies:
\[\Pr(\theta_0 \mid x) = 1 - \varepsilon(x),\]where ε(x) captures the intrinsic noise or ambiguity.
- If ε(x) were zero, every expression would pin down its meaning exactly; language would need the rigid, exhaustively specified codes of programming languages rather than the flexible, underspecified forms humans actually produce.
- If ε(x) were large, listeners could not reliably recover the intended meaning and communication would break down.

Thus, human language must live in the intermediate regime:
\[0 < \varepsilon(x) \ll 1.\]This is precisely the definition of an ε-ambiguity language.
Given Shannon’s channel coding theorem, any efficient communication system must satisfy:
\[H(\theta \mid x) > 0,\]unless compressed messages are allowed to carry arbitrarily large descriptive complexity.
Natural languages are highly compressed representations of latent intentions. For compression to be efficient:
\[H(\theta \mid x) > 0\]must hold, and it is small precisely when \(\mathbb{E}[\varepsilon(x)]\) is small (the two quantities are related by Fano’s inequality). But for communication to function at all:
\[H(\theta \mid x) \ll H(\theta)\]must also hold.
This yields:
\[0 < \varepsilon(x) \ll 1.\]Thus ε is not merely an empirical observation; it is forced by fundamental information-theoretic constraints on communication between bounded agents.
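A small numerical check of this picture (the joint distribution below is invented for illustration) shows a conditional entropy H(θ ∣ x) that is positive but far below H(θ), alongside a small expected ε(x).

```python
# Toy joint distribution p(theta, x) whose conditionals Pr(theta | x) are
# sharply peaked: H(theta | x) comes out positive but much smaller than
# H(theta), and E[eps(x)] is small -- the regime 0 < eps << 1.
import numpy as np

# Rows: intentions theta; columns: messages x.  Each message almost always
# signals one intention, with a little leakage onto the others.
p_joint = np.array([
    [0.235, 0.005, 0.005, 0.005],
    [0.005, 0.235, 0.005, 0.005],
    [0.005, 0.005, 0.235, 0.005],
    [0.005, 0.005, 0.005, 0.235],
])

p_x = p_joint.sum(axis=0)                       # marginal p(x)
p_theta = p_joint.sum(axis=1)                   # marginal p(theta)
post = p_joint / p_x                            # columns give Pr(theta | x)

def entropy(p):
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

H_theta = entropy(p_theta)
H_theta_given_x = sum(p_x[j] * entropy(post[:, j]) for j in range(len(p_x)))
eps = 1.0 - post.max(axis=0)                    # eps(x) for each message

print(f"H(theta)     = {H_theta:.3f} bits")
print(f"H(theta | x) = {H_theta_given_x:.3f} bits  (> 0 but << H(theta))")
print(f"E[eps(x)]    = {float((p_x * eps).sum()):.3f}")
```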
A classic result from information theory states that, under bounded channel capacity, a communication code must balance compression (short, efficient messages) against redundancy (robustness to noise and ambiguity).
Natural languages accomplish this by encoding intentions in probabilistic distributions over many correlated cues (syntax, semantics, prosody, discourse), with none individually deterministic. This “multi-cue redundancy” structure implies that:
\[q(x \mid \theta) \text{ is broad, but structured},\]leading again to:
\[\Pr(\theta_0 \mid x) \approx 1 - \varepsilon(x)\]for some small ε(x).
The ε term mathematically captures the trade-off between expressive flexibility and reliable, learnable interpretation.
Thus ε-ambiguity is not an accident but a structural necessity for language to be both expressive and learnable.
A mapping in which every expression x pinned down its intention exactly (ε = 0) would break compositionality in languages such as English: all linguistic constructions would require exhaustive specification of intentions, leading to an explosion of distinct forms and a grammar too rigid to learn or use.
Conversely, if ε is small but non-zero, compositional structures can afford underspecification, because the listener’s inferential machinery resolves them with high probability.
Thus ε > 0 is a prerequisite for efficient and human-like generative grammar.
Probabilistic pragmatics (e.g., Rational Speech Act models) treats interpretation as Bayesian inference over speaker intentions.
Empirically, these models consistently find that the posterior over intended meanings is sharply peaked, with nearly all probability mass on a single interpretation.
Thus:
\[\Pr(\theta_0 \mid x) = 1 - \varepsilon(x)\]emerges naturally as a mathematical property of Bayesian interpretation under realistic priors and likelihoods.
This shows that ε-ambiguity is a mathematically inevitable property of any communicative system interpreted via Bayesian reasoning, which includes both humans and LLMs.
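As a concrete instance, here is a minimal Rational Speech Act (RSA) style listener, a standard probabilistic-pragmatics construction rather than a model from (Jiang, 2023); the meanings, utterances, and the rationality parameter α are illustrative choices.

```python
# Minimal RSA sketch: Bayesian interpretation yields a sharply peaked but
# non-degenerate posterior over meanings, i.e. Pr(theta_0 | x) = 1 - eps(x)
# with small eps(x) > 0.
import numpy as np

meanings = ["some-but-not-all", "all"]
utterances = ["some", "all"]
# Literal semantics: "some" is true of both meanings, "all" only of "all".
truth = np.array([[1, 1],    # rows: utterances, cols: meanings
                  [0, 1]], dtype=float)
prior = np.array([0.5, 0.5])
alpha = 4.0                  # speaker rationality (illustrative choice)

lit_listener = truth * prior
lit_listener /= lit_listener.sum(axis=1, keepdims=True)      # L0(theta | x)

speaker = lit_listener ** alpha
speaker /= speaker.sum(axis=0, keepdims=True)                 # S1(x | theta)

prag_listener = speaker * prior
prag_listener /= prag_listener.sum(axis=1, keepdims=True)     # L1(theta | x)

post = prag_listener[utterances.index("some")]
theta_0 = meanings[int(post.argmax())]
print(f'"some" -> {theta_0}, eps = {1 - post.max():.3f}')     # small but > 0
```

Even in this maximally simple scalar-implicature example, the pragmatic listener’s posterior is sharply peaked but never degenerate: ε stays strictly positive.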
Across linguistic, psycholinguistic, computational, and conversational evidence, natural languages display the following features:
| Property | Observed in Natural Languages | Matches ε-Ambiguous Definition |
|---|---|---|
| Multiple interpretations possible | ✔ | ε(x) > 0 |
| One interpretation strongly dominant | ✔ | Pr(θ₀ ∣ x) ≈ 1 − ε(x) |
| Ambiguity decreases with context | ✔ | ε(x₁x₂) ≈ ε(x₁)ε(x₂) |
| Communication is highly reliable | ✔ | ε(x) generally small |
| Occasional misinterpretation occurs | ✔ | ε(x) nonzero |
Natural languages therefore satisfy the fundamental requirements of ε-ambiguity languages: they encode meaning with probabilistic stability but not absolute determinacy.
The ε-ambiguity model elegantly captures the essential structure of natural language semantics. Evidence from multiple disciplines demonstrates that natural languages pair a single dominant interpretation with a small but non-zero residual ambiguity that shrinks as contextual evidence accumulates. Finally, the empirical observations reviewed above, from near-instantaneous human disambiguation to the collapse of LLM output variance under clarified prompts, align closely with the ε-ambiguity framework and serve as further evidence for it.
Together, these points provide a rigorous foundation for treating natural languages as ε-ambiguity languages.