Basic Definitions

TLDR

Symbol

A symbol is the smallest building block. A symbol can be anything.

Alphabet

An alphabet is a finite, non-empty set of symbols.

Some examples of alphabets include:

Alphabets are typically denoted using the Greek letter Σ (Sigma).

Word

A word is a finite sequence of symbols with respect to some alphabet.

Some examples of words include:

There also exists a null word, which is typically denoted using the Greek letter ε (Epsilon).

Language

A language is a set of words with respect to some alphabet. Languages may be finite or infinite.

Some examples of languages include:

The null language is denoted ∅.
(This must not be mistaken for { ε }, the language containing the null word.)
The language of all words with respect to some alphabet Σ is denoted Σ*.


Basic Operations

Concatenation

Concatenation is defined for words and languages.

Let w1, w2 be words.
The concatenation w1 ⋅ w2 is a new word containing the symbols of w1 followed by the symbols of w2.

Concatenation is non-commutative. That is, w1 ⋅ w2 may not necessarily equal w2 ⋅ w1.
For example, let w1 = LIST and w2 = EN.
w1 ⋅ w2 = LISTEN
w2 ⋅ w1 = ENLIST

The "⋅" operator may be omitted when concatenating words.
These are equivalent notations: a ⋅ b = ab

Let L1 and L2 be languages.
The concatenation L1 ⋅ L2 is a new language containing the pairwise concatenation of all words in L1 with all words in L2.

For example, let L1 = { a, b, c } and L2 = { 1, 2, 3 }.
L1 ⋅ L2 = { a1, a2, a3, b1, b2, b3, c1, c2, c3 }
L2 ⋅ L1 = { 1a, 1b, 1c, 2a, 2b, 2c, 3a, 3b, 3c }

Exponentiation

Exponentiation is defined for words and languages.

Let w be a word.
wn is a concatenation of n copies of w.
wn = w ⋅ w ⋅ … ⋅ w (n times)

For example, let w = HO
w3 = HOHOHO

Let L be a language.
Ln is a language where each word is a concatenation of n words in L.
Ln = { w1 ⋅ w2 ⋅ … ⋅ wn | w1, w2, … wn ∈ L }

For example, let L = { left, right }
L3 = { leftleftleft, leftleftright, leftrightleft, leftrightright, rightleftleft, rightleftright, rightrightleft, rightrightright }

L0 is defined to be { ε }.

Kleene Closure

Kleene closure is defined for alphabets and languages.

Let Σ be an alphabet.
Σ* is the language of all words with respect to Σ, including ε.

Let L be a language.
The Kleene closure of L is the language of all arbitrarily sized concatenations of words in L.
L* = L0 ∪ L1 ∪ L2 ∪ …

Kleene Plus

Kleene plus is defined for alphabets and languages.

Let Σ be an alphabet.
Σ+ is the language of all words with respect to Σ, excluding ε

Let L be a language.
The Kleene plus of L is the language of all arbitrarily sized concatenations of words in L, excluding ε.
L+ = L1 ∪ L2 ∪ L3 ∪ …


Formal Language Theory and Automata Theory

The theory of computation is a collection of several different branches, including:

Formal language theory and automata theory are so closely related that it is best to study them concurrently.