A symbol is the smallest building block. A symbol can be anything.
An alphabet is a finite, non-empty set of symbols.
Some examples of alphabets include:
Alphabets are typically denoted using the Greek letter Σ (Sigma).
A word is a finite sequence of symbols with respect to some alphabet.
Some examples of words include:
There also exists a null word, which is typically denoted using the Greek letter ε (Epsilon).
A language is a set of words with respect to some alphabet. Languages may be finite or infinite.
Some examples of languages include:
The null language is denoted ∅.
(This must not be mistaken for { ε }, the language containing the null word.)
The language of all words with respect to some alphabet Σ is denoted Σ*.
Concatenation is defined for words and languages.
Let w1, w2 be words.
The concatenation w1 ⋅ w2 is a new word containing the symbols of w1 followed by the symbols of w2.
Concatenation is non-commutative. That is, w1 ⋅ w2 may not necessarily equal w2 ⋅ w1.
For example, let w1 = LIST and w2 = EN.
w1 ⋅ w2 = LISTEN
w2 ⋅ w1 = ENLIST
The "⋅" operator may be omitted when concatenating words.
These are equivalent notations: a ⋅ b = ab
Let L1 and L2 be languages.
The concatenation L1 ⋅ L2 is a new language containing the pairwise concatenation of all words in L1 with all words in L2.
For example, let L1 = { a, b, c } and L2 = { 1, 2, 3 }.
L1 ⋅ L2 = { a1, a2, a3, b1, b2, b3, c1, c2, c3 }
L2 ⋅ L1 = { 1a, 1b, 1c, 2a, 2b, 2c, 3a, 3b, 3c }
Exponentiation is defined for words and languages.
Let w be a word.
wn is a concatenation of n copies of w.
wn = w ⋅ w ⋅ … ⋅ w (n times)
For example, let w = HO
w3 = HOHOHO
Let L be a language.
Ln is a language where each word is a concatenation of n words in L.
Ln = { w1 ⋅ w2 ⋅ … ⋅ wn | w1, w2, … wn ∈ L }
For example, let L = { left, right }
L3 = { leftleftleft, leftleftright, leftrightleft, leftrightright, rightleftleft, rightleftright, rightrightleft, rightrightright }
L0 is defined to be { ε }.
Kleene closure is defined for alphabets and languages.
Let Σ be an alphabet.
Σ* is the language of all words with respect to Σ, including ε.
Let L be a language.
The Kleene closure of L is the language of all arbitrarily sized concatenations of words in L.
L* = L0 ∪ L1 ∪ L2 ∪ …
Kleene plus is defined for alphabets and languages.
Let Σ be an alphabet.
Σ+ is the language of all words with respect to Σ, excluding ε
Let L be a language.
The Kleene plus of L is the language of all arbitrarily sized concatenations of words in L, excluding ε.
L+ = L1 ∪ L2 ∪ L3 ∪ …
The theory of computation is a collection of several different branches, including: