Constructions of regular languages

Question

Fix a finite alphabet $A$. The set of regular languages is the smallest set of languages on $A$ (i.e. subsets $L$ of the free monoid $A^*$) containing $\varnothing$ and the singletons $a$ for $a\in A$ (it is customary to suppress the curly braces and write $a$ instead of $\{a\}$) which is stable under the following operations: union (written $+$), concatenation (written $\cdot$) and Kleene star. Regular languages admit several independent characterizations e.g. languages recognized by finite automata (deterministic or not, with or without $\varepsilon$-transitions), languages $L \subset A^*$ with a finite number of derived languages (i.e. languages of the form $u^{-1}L$, $u\in A^*$), order ideals of regular well quasi-orders on $A^*$.

Regular languages are stable under many operations. I would like to collect as many as possible in this thread. An ideal answer should provide a construction of a new regular language and a short justification, if possible, as to why this defines a regular language. I am interested both in regular languages constructed from other regular languages as well as regular languages that emerge through completely different means.

I let your read the article How to prove that a language is regular or star-free?, Proc. LATA 2020, LNCS 12038 (2020) 68-88 and this answer where you will find further references. — J.-E. Pin
– J.-E. Pin, Commented Jul 14 at 10:49

Olivier Bégassat · Accepted Answer · 2025-07-10 09:48:22Z

Standard operations

By definition of the class of regular languages (on $A$) we can always form the following regular languages from regular languages $K$ and $L$:

$K + L$ (their union)
$K\cdot L$ (their concatenation)
$K^*$ (the Kleene star)

Olivier Bégassat · Accepted Answer · 2025-07-10 09:50:52Z

Complement and intersection

Every regular language $L$ is recognized by some deterministic automaton $\mathcal{A}$. The complement $L^c$, too, is then recognized by a deterministic automaton. Indeed consider the automaton $\mathcal{A}'$ which is identical to $\mathcal{A}$ save for having as its accepting states $F' := Q \setminus F$ equal to the complement of the accepting states $F$ of $\mathcal{A}$.

It follows that for regular languages $K$ and $L$

$K \cap L$ (their intersection)

is regular, too.

Olivier Bégassat · Accepted Answer · 2025-07-10 10:04:45Z

Prefixes, suffixes and factors

Let $L$ be a regular language. Consider $$ \DeclareMathOperator\prefixes{Prefixes} \DeclareMathOperator\suffixes{Suffixes} \DeclareMathOperator\factors{Factors} \left\{ \begin{array}{lcl} \prefixes(L) & \equiv & \lbrace u \in A^* \mid \exists v \in A^* \text{ s.t. } uv \in L \} \\ \suffixes(L) & \equiv & \lbrace u \in A^* \mid \exists v \in A^* \text{ s.t. } uv \in L \} \\ \factors(L) & \equiv & \lbrace v \in A^* \mid \exists u, w \in A^* \text{ s.t. } uvw \in L \} \\ \end{array}\right. $$ Suppose that $L$ is recognized by an automaton $\mathcal{A}$. Then we produce automata for the above like so

for $\prefixes(L)$ consider the automaton obtained from $\mathcal{A}$ by adjoining a new state $\iota$ and introducing $\varepsilon$-transitions from $\iota$ to any state $q \in Q$ which is on at least one accepting path of $\mathcal{A}$; define the initial states of this new automaton to be $\lbrace \iota \rbrace$ and the accepting states to be those of $\mathcal{A}$; this automaton recognizes $\prefixes(L)$;
for $\suffixes(L)$ the same idea applies by applying a new state $f$ and adjoining $\varepsilon$-transitions $q \xrightarrow{\varepsilon} f$ from every state $q \in Q$ which lies on at least one accepting path of $\mathcal{A}$

And $\factors = \prefixes \circ \suffixes = \suffixes \circ \prefixes$ thus takes regular languages to regular languages.

Olivier Bégassat · Accepted Answer · 2025-07-10 10:15:36Z

Substrings

If $L$ is a regular language on $A$ define $$ \DeclareMathOperator\substrings{Substrings} \substrings(L) \equiv \lbrace w \in A^* \text{ substring of some } W \in L\} $$ where a word $w = a_1\cdots a_n \in A^*$, for some $n \in \Bbb{N}$, is a substring of another word $W = b_1 \cdots b_N$ if there exists indices $1 \leq i_1 < i_2 < \cdots < i_n \leq N $ with $a_k = b_{i_k}$ for all $k$.

If an automaton $\mathcal{A}$ recognizes $L$ then you construct a new automaton $\mathcal{A}'$ which has the same states, initial states and final states and for every pair of nodes $q, q'$ which admits some transition $q \xrightarrow{a} q'$ we add an $\varepsilon$-transition $q \xrightarrow{\varepsilon} q'$. Then $\mathcal{A}'$ recognizes $\substrings(L)$.

Olivier Bégassat · Accepted Answer · 2025-07-10 10:32:57Z

Root

Let $L$ be a regular language and define $$ \sqrt{\phantom{\big|}L\phantom{|}} \equiv \big \{ w \in A^* \mid w\cdot w \in L \big \} $$ Then $\sqrt{L}$ is regular. If $\mathcal{A}$ recognizes $L$ then define a nondeterministic automaton $$ \sqrt{\phantom{\big|}\mathcal{A}\phantom{|}} = \bigsqcup_{q \in Q} \mathcal{A}^q \times \mathcal{A}_q $$ where $\mathcal{A}^q$ and $\mathcal{A}_q$ coincide with $\mathcal{A}$ save for $\mathcal{A}^q$ having accepting set $\{q\}$ and $\mathcal{A}_q$ having intial state $q$. Then $\sqrt{\mathcal{A}}$ recognizes $\sqrt{L}$.

Olivier Bégassat · Accepted Answer · 2025-07-10 10:43:36Z

Interleaving

Suppose $K$ and $L$ are regular languages. Then the set of interleavings of two words of $K$ and $L$ $$ \mathrm{III}(K,L) \equiv \big\{ u_1v_1\cdots u_nv_n \mid u_1 \cdots u_n \in K \wedge v_1 \cdots v_n \in L \big\} $$ is again regular. Indeed, if $\mathcal{A}$ and $\mathcal{B}$ are automata that recognize $K$ and $L$ respectively then define a nondeterministic automaton structure on the product $\mathcal{A} \times \mathcal{B}$ with the following transitions $$ \begin{cases} (q,r) \xrightarrow{a} (q',r) & \text{if } q \xrightarrow{a} q' \text{ in } \mathcal{A} \\ (q,r) \xrightarrow{a} (q,r') & \text{if } r \xrightarrow{a} r' \text{ in } \mathcal{B} \\ \end{cases} $$ This automaton, with initial states $I_\mathcal{A} \times I_\mathcal{B}$ and final states $F_\mathcal{A} \times F_\mathcal{B}$ recognizes $\mathrm{III}(K,L)$.

Olivier Bégassat · Accepted Answer · 2025-07-10 11:03:34Z

Splicing

Suppose $K$ and $L$ are regular languages. Then the set of splicings of two words of $K$ and $L$ $$ \DeclareMathOperator\splicings{Splicings} \splicings(K,L) \equiv \big\{ uv \mid \exists \gamma \in A^* \text{ s.t. } u\gamma \in K \text{ and } \gamma v \in L \big\} $$ is again regular.

Indeed, suppose $\mathcal{A}$ and $\mathcal{B}$ recognize $K$ and $L$ respectively. Suppose furthermore that all states in their underlying state spaces, $Q_\mathcal{A}$ and $Q_\mathcal{B}$, are traversed by an accepting path. Define the relation on $\mathcal{R} \subset Q_\mathcal{A} \times Q_\mathcal{B}$ whereby $q \mathcal{R} q'$ if and only if $L(\mathcal{A}_q) \cap L(\mathcal{B}^q) \neq \emptyset$. The interpretation is that $q\mathcal{R}q'$ if and only if there is a path $\gamma$ that takes $q \in Q_\mathcal{A}$ to an accepting state of $\mathcal{A}$ and takes an initial state of $\mathcal{B}$ to $q'$. Then consider the automaton $\mathcal{A} \sqcup \mathcal{B}$ with underlying state space $Q_\mathcal{A} \sqcup Q_\mathcal{B}$, all transitions from $\mathcal{A}$ and $\mathcal{B}$ as well as $\varepsilon$-transitions $q \xrightarrow{\varepsilon} q'$ whenever $q\mathcal{R}q'$.

Then this automaton recognizes $\splicings(K,L)$.

Olivier Bégassat · Accepted Answer · 2025-07-10 11:19:23Z

Left and right quotients

If $L$ is a regular language and $X,Y \subset A^*$ are an arbitrary languages then $$ \left\{\begin{array}{lcl} X^{-1} L \equiv \big\{ u \mid \exists x \in X \text{ s.t. } xu \in L \big\} \\ L Y^{-1} \equiv \big\{ u \mid \exists y \in Y \text{ s.t. } uy \in L \big\} \end{array}\right. $$ Are both regular as is the language $(X^{-1} L) Y^{-1} = X^{-1} (L Y^{-1}) =: X^{-1} L Y^{-1}$. This is explained in this answer. The long and short of it is that $$ X^{-1} L = \bigcup_{x \in X} x^{-1} L $$ is necessarily a finite union of regular languages (since regular languages have a finit number of derivatives $u^{-1} L$, $u \in A^*$). The same argument applies to $L Y^{-1}$.

cat-sofia · Accepted Answer · 2025-09-16 13:15:42Z

Substitution

If $L$ is a regular language over $\Sigma$, and we have a function $f : \Sigma \rightarrow\mathbf{Regular}(\Xi)$, then we can define a regular language by substituting each $\sigma \in \Sigma$ with $f(\sigma)$. The procedure is easiest to define over regular expressions, but of course extends to every equivalent characterisation.

Stack Exchange Network

Constructions of regular languages

9 Answers 9

Standard operations

Complement and intersection

Prefixes, suffixes and factors

Substrings

Root

Interleaving

Splicing

Left and right quotients

Substitution

You must log in to answer this question.

Linked

Hot Network Questions

Constructions of regular languages

9 Answers 9

Standard operations

Complement and intersection

Prefixes, suffixes and factors

Substrings

Root

Interleaving

Splicing

Left and right quotients

Substitution

You must log in to answer this question.

Linked

Related

Hot Network Questions