The set of regular languages is closed under the following operations:
What is the complement of L1?
What is L3 ∩ L4?
What is L3 o L4 (concatenation)?
What is L3*
What is ∅*
How does one prove that regular languages are closed under each of these operations?
NFAs are useful to show regular languages are closed under the last three operations (union, concatenation, star).
An NFA to recognize L1 ∪ L2
s1 is the old start state of L1.
s2 is the old start state of L2.
f1, f2 are old final states of L1.
f is the old final state of L2.
An NFA to recognize L1 o L2.
s1 is the old start state of L1.
s2 is the old start state of L2.
f1, f2 are old final states of L1.
f is the old final state of L2.
An NFA to recognize L*
s1 is the old start state of L.
f1, f2 are old final states of L.
Regular expressions over an alphabet ∑ represents sets of strings in the alphabet; that is, a formal language over the alphabet.
So a regular expression describes a formal language.
In the notation for regular expressions, an element a ∈ ∑ is also used to represent the regular expression that denotes the set of strings consisting of 'a' alone: { a }.
The formal definition of a regular expression over an alphabet ∑ is (page 64):
R is a regular expression if R is
∅ represents the empty language
ε represents the language { ε }
a ∈ ∑ represents the language { a }
The operations on regular expressions are
∑ = {0,1}
A Generalized Nondeterministic Finite Automaton is similar to an NFA but the transition function takes a state and a regular expression in the alphabet instead of a state and an alphabet element.
A GNFA for strings in {1,0} that contain the substring "1101".
The idea is that in state q0
the
transition to state q1 can be taken if the next input
matches the regular expression 1101
.
The formal definition is given (on page 73) by:
A generalized nondeterministic finite automaton is a 5-tuple, (Q,∑, δ, qstart, qaccept), where
Note that there is only one accept state. However, this is no real restriction for a nondetrministic automaton. (Why?)
On the other hand, the transition function is defined on a different arguments than is the case for an ordinary NFA.
The transition function, δ, for the example GNFA:
is
q1 | q2 | q3 | |
---|---|---|---|
q0 | ε | ⌀ | ⌀ |
q1 | (0 | 1) | 1101 | ⌀ |
q2 | ⌀ | (0 | 1) | ε |
The definition of the language of a GNFA is technically different than that of an NFA because the transition function is defined differently. However, the idea is really similar, but extended to allow regular expressions on the transitions.
The formal definition is given by (page 73):
A GNFA accepts a string w
in ∑* if w =
w1w2...wk, where each wi is in ∑* and a sequence of
states q0, q1, ..., qk exists such that
The string 0011011 is accepted by
If 0011011 is partitioned into blocks w1=ε w2=0 w3=0 w4=1101 w5=1 w6=ε; that is,
0011011 = ε 0 0 1101 1 ε
then
State | Input | Next | |
w1=ε ∊ L(δ(q0, q1)) | q0 | w1 | q1 |
w2=0 ∊ L(δ(q1, q1)) | q1 | w2 | q1 |
w3=0 ∊ L(δ(q1, q1)) | q1 | w3 | q1 |
w4=1101 ∊ L(δ(q1, q2)) | q1 | w4 | q2 |
w5=1 ∊ L(δ(q2, q2)) | q2 | w5 | q2 |
w6=ε ∊ L(δ(q2, q3)) | q2 | w6 | q3 |
For example, w1=ε ∊ L(δ(q0, q1)) means for state q0 and input w1, next state is q1
A DFA can be converted to an equivalent GNFA by
δ'(qi, qj) = a if and only if δ(qi, a) = qj
Steps 1 and 2 require adding a new start state and a new accepting state. Step 3 require no changes in the diagram except labeling the transitions out of the new start state and into the new accepting state.
Any state except the start state and the accept state of a GNFA can be eliminated to obtain a new equivalent GNFA with one fewer states.
For example, to eliminate state q1 in
There are 3 arrows coming into state q1 from other states: q0, q2, and q4.
If state q1 is removed, paths must be replaced by new transitions:
Eliminating q1, what regular expressions are needed?
Suppose qa is to be removed and
Then the path from qi to qj should be labeled
RiaRa*Raj ∪ Rij
Original DFA (converted to a GNFA):
After eliminating state q1:
To show the first part, if we are given an DFA, we need to show that there is a regular expression that describes exactly the language of the DFA.
The construction of the regular expression begins with the DFA and each step will eliminate one state of the DFA until the only state(s) remaining are the start state and a final state.
Exercise 1.21
Convert this DFA to a regular expression that describes the same language.
To show that for any regular expression there is an NFA that recognizes the same language described by the regular expression, the proof describes a procedure for constructing the NFA from the regular expression.
See Lemma 1.55 in the text.
Example regular expression
a(ba)*b ∪ bab
L = {anbm | n >= 0, m >= 0 and n ≠ m }
Pumping Lemma for Regular Languages: If A is a regular
language, then there is a number p
such that if
s
is any string in A of length at least
p
, then s
may be divided into three
pieces, s = xyz
with the properties:
i >= 0, xyiz ∈
A,
|y| > 0
, and|xy| <= p
Strategies for deciding whether a formal language is regular.
Example 1: { anbn | n >= 0 } is not regular.
Example 2: {anbm | n >= 0, m >= 0 and n ≠ m } is not regular.