You are here
HomeParikh's theorem
Primary tabs
Parikh’s theorem
Let $\Sigma$ be an alphabet. Fix an order on elements of $\Sigma$, so that they are $a_{1},\ldots,a_{n}$. For each word $w$ over $\Sigma$, we can form an $n$tuple $(w_{1},\ldots,w_{n})$ so that each $w_{i}$ is the number of occurrences of $a_{i}$ in $w$. The function $\Psi:\Sigma^{*}\to\mathbb{N}^{n}$ such that
$\Psi(w)=(w_{1},\ldots,w_{n})$ 
is called the Parikh mapping (over the alphabet $\Sigma$). Here, $\mathbb{N}$ is the set of nonnegative integers.
Two languages $L_{1}$ and $L_{2}$ over some $\Sigma$ are called letterequivalent if $\Psi(L_{1})=\Psi(L_{2})$.
For example, $L_{1}:=\{a^{n}b^{n}\mid n\geq 0\}$ is letterequivalent to the $L_{2}:=\{(ab)^{n}\mid n\geq 0\}$, as both are mapped to the set $\{(n,n)\mid n\geq 0\}$ by $\Psi$. Note that $L_{1}$ is contextfree, while $L_{2}$ is regular. In fact, we have the following:
Theorem 1 (Parikh).
Every contextfree language is letterequivalent to a regular language.
Like the pumping lemma, Parikh’s theorem can be used to show that certain languages are not contextfree. But in order to apply Parikh’s theorem above, one needs to know more about the structure of the set $\Psi(L)$ for a given contextfree language $L$.
First, note that $\mathbb{N}^{n}$ is a commutative monoid under addition. For any element $x$ and subset $S$ of $\mathbb{N}^{n}$, define $x+S$ in the obvious manner. Next, let $x_{1},\ldots,x_{m}$ be elements of $\mathbb{N}^{n}$. Denote
$\langle x_{1},\ldots,x_{m}\rangle$ 
the submonoid generated by $x_{1},\ldots,x_{m}$.
A subset $S$ of $\mathbb{N}^{n}$ is said to be linear if it has the form
$x_{0}+\langle x_{1},\ldots,x_{m}\rangle,$ 
for some $x_{0},\ldots,x_{m}\in\mathbb{N}^{n}$. It is not hard to see that every linear set is the image of some regular language under $\Psi$. For example,
$(1,0,0)+\langle(2,1,0),(0,3,2)\rangle=\Psi(\{a(a^{2}b)^{m}(b^{3}c^{2})^{n}\mid m% ,n\geq 0\}).$ 
The example can be easily generalized.
A subset of $\mathbb{N}^{n}$ is said to be semilinear if it is the union of finitely many linear sets. Since regular languages are closed under union, every semilinear set is the image of a regular language under $\Psi$. As a result, Parikh’s theorem can be restated as
Theorem 2 (Theorem 1 restated).
If $L$ is contextfree, then $\Psi(L)$ is semilinear.
Using this theorem, one easily sees that the language $\{a^{p}\mid p\text{ is a prime number }\}$ is not contextfree, and neither is the language
$\{a^{m}b^{n}\mid\mbox{ either }m\mbox{ is prime, and }n\geq m,\mbox{ or }n% \mbox{ is prime, and }m\geq n\},$ 
which can not be proved by the pumping lemma in a straightforward manner.
Remarks.

The converse of Theorem 2 above is false. For example, let $L=\{a^{n}b^{n}c^{n}\mid n\geq 0\}$. Then $L$ is not contextfree by the pumping lemma. However, $\Psi(L)=\langle(1,1,1)\rangle$, which is clearly linear.

In the literature, a lanugage $L$ such that $\Psi(L)$ is semilinear is often called a sliplanguage. It can be shown that for a sliplanguage $L$ over an alphabet consisting of two symbols, the language $\Psi^{{1}}\circ\Psi(L)$ is contextfree.
References
 1 H.R. Lewis, C.H. Papadimitriou, Elements of the Theory of Computation. PrenticeHall, Englewood Cliffs, New Jersey (1981).
 2 D. C. Kozen, Automata and Computability, Springer, New York (1997).
 3 R.J. Parikh, On ContextFree Languages, Journal of the Association for Computing Machinery, 4 (1966), 570581.
 4 M. Latteux, Cônes rationnels commutatifs, J. Comput. Systems Sci. 18 (1979), 307333.
Mathematics Subject Classification
68Q42 no label found68Q45 no label found Forums
 Planetary Bugs
 HS/Secondary
 University/Tertiary
 Graduate/Advanced
 Industry/Practice
 Research Topics
 LaTeX help
 Math Comptetitions
 Math History
 Math Humor
 PlanetMath Comments
 PlanetMath System Updates and News
 PlanetMath help
 PlanetMath.ORG
 Strategic Communications Development
 The Math Pub
 Testing messages (ignore)
 Other useful stuff
 Corrections