Use of NFA's for Closure Properties of Regular Languages

The set of regular languages is closed under the following operations:

What is the complement of L1?

What is L3 ∩ L4?

What is L3 o L4 (concatenation)?

What is L3*

What is ∅*

How does one prove that regular languages are closed under each of these operations?

NFAs are useful to show regular languages are closed under the last three operations (union, concatenation, star).

Union

An NFA to recognize L1 ∪ L2

s1 is the old start state of L1.

s2 is the old start state of L2.

f1, f2 are old final states of L1.

f is the old final state of L2.

Concatenation

An NFA to recognize L1 o L2.

s1 is the old start state of L1.

s2 is the old start state of L2.

f1, f2 are old final states of L1.

f is the old final state of L2.

Star (Kleene Closure)

An NFA to recognize L*

s1 is the old start state of L.

f1, f2 are old final states of L.

Equivalences

  1. NFAs
  2. DFAs
  3. regular expressions
  4. regular grammars

Regular Expressions (1.3)

Regular expressions over an alphabet ∑ represents sets of strings in the alphabet; that is, a formal language over the alphabet.

So a regular expression describes a formal language.

In the notation for regular expressions, an element a ∈ ∑ is also used to represent the regular expression that denotes the set of strings consisting of 'a' alone: { a }.

Formal Definition

The formal definition of a regular expression over an alphabet ∑ is (page 64):

R is a regular expression if R is

  1. a for some a in the alphabet ∑,
  2. ε,
  3. ∅,
  4. (R1 ∪ R2), where R1 and R2 are regular expressions,
  5. (R1 o R2), where R1 and R2 are regular expressions, or
  6. (R1*), where R1 is a regular expression

∅ represents the empty language

ε represents the language { ε }

a ∈ ∑ represents the language { a }

Operations and Examples

The operations on regular expressions are

  1. union
  2. concatenation
  3. star (closure)

∑ = {0,1}

  1. Union: 0 ∪ 1 or 0 | 1 represents the set {0} ∪ {1} = {0,1}
  2. Concatenation: (0 ∪ 1)1 represents the set {01, 11}
  3. Star: (01|11)* represents the set {ε, 01, 11, 0111, 1101, ... }
  4. * represents the set { ε }
  5. (0 ∪ 1)*1101(0 ∪ 1)* What language does this describe?

Theorem

A language is regular if and only if some regular expression describes it.
Proof requires two parts.
First Part: If a language is regular, then it is described by some regular expression.
Second Part: If a language is described by some regular expression, then it is a regular language.

GNFAs

A Generalized Nondeterministic Finite Automaton is similar to an NFA but the transition function takes a state and a regular expression in the alphabet instead of a state and an alphabet element.

A generalized NFA for strings
	containing 1101

A GNFA for strings in {1,0} that contain the substring "1101".

The idea is that in state q0 the transition to state q1 can be taken if the next input matches the regular expression 1101.

The formal definition is given (on page 73) by:

A generalized nondeterministic finite automaton is a 5-tuple, (Q,∑, δ, qstart, qaccept), where

  1. Q is the finite set of states,
  2. ∑ is the input alphabet
  3. δ : (Q - {qaccept} x (Q - { qstart }) → R is the transition function, where R is the set of all regular expressions over ∑
  4. qstart is the start state
  5. qaccept is the accept state

Note that there is only one accept state. However, this is no real restriction for a nondetrministic automaton. (Why?)

On the other hand, the transition function is defined on a different arguments than is the case for an ordinary NFA.

GNFA Transition Function Example

The transition function, δ, for the example GNFA:

The example GNFA

is

   q1 q2    q3   
q0 ε
q1 (0 | 1) 1101
q2 (0 | 1) ε

Language of a GNFA

The definition of the language of a GNFA is technically different than that of an NFA because the transition function is defined differently. However, the idea is really similar, but extended to allow regular expressions on the transitions.

The formal definition is given by (page 73):

A GNFA accepts a string w in ∑* if w = w1w2...wk, where each wi is in ∑* and a sequence of states q0, q1, ..., qk exists such that

  1. q0 is the start state
  2. qk is the accept state
  3. for each i, wi ∊ L(Ri) where Ri = δ(qi-1, qi); that is, where Ri is the expression on the arrow from qi-1 to qi.

Example String Accepted by GNFA

The string 0011011 is accepted by

Example GNFA

If 0011011 is partitioned into blocks w1=ε w2=0 w3=0 w4=1101 w5=1 w6=ε; that is,

0011011 = ε 0 0 1101 1 ε

then

  State Input Next
w1=ε ∊ L(δ(q0, q1)) q0 w1 q1
w2=0 ∊ L(δ(q1, q1)) q1 w2 q1
w3=0 ∊ L(δ(q1, q1)) q1 w3 q1
w4=1101 ∊ L(δ(q1, q2)) q1 w4 q2
w5=1 ∊ L(δ(q2, q2)) q2 w5 q2
w6=ε ∊ L(δ(q2, q3)) q2 w6 q3

For example, w1=ε ∊ L(δ(q0, q1)) means for state q0 and input w1, next state is q1

Converting a DFA to an GNFA

A DFA can be converted to an equivalent GNFA by

  1. Adding a new start state with an ε transition to the old start state.
  2. Adding a new accepting state and adding ε transitions from all old accepting states to the new one.
  3. Definining the transition function δ' for the GNFA in terms of the transition function δ for the DFA by

    δ'(qi, qj) = a if and only if δ(qi, a) = qj

Example conversion of DFA to GNFA

DFA

Steps 1 and 2 require adding a new start state and a new accepting state. Step 3 require no changes in the diagram except labeling the transitions out of the new start state and into the new accepting state.

DFA to GNFA

Eliminating a state in an GNFA

Any state except the start state and the accept state of a GNFA can be eliminated to obtain a new equivalent GNFA with one fewer states.

For example, to eliminate state q1 in

Eliminate a state in a gnfa
  1. replace each existing path q to q1 to q' by a path from q to q'
  2. Label the new path with a regular expression that describes the strings that would cause a transition from state q to q1 to q'

There are 3 arrows coming into state q1 from other states: q0, q2, and q4.

If state q1 is removed, paths must be replaced by new transitions:

  1. q0 to q2
  2. q2 to q2
  3. q4 to q2

Determining the Regular Expression Labels

step2

Eliminating q1, what regular expressions are needed?

step2

GNFA Regular Expression Labels

Suppose qa is to be removed and

  1. the path being considered goes from qi to qa to qj,
  2. transition from qi to qa is labeled by the regular expression Ria
  3. transition from qa to qj is labeled by the regular expression Raj
  4. transition from qi to qj is labeled by the regular expression Rij (Note this may be the empty regular expression, ࣼ)

Then the path from qi to qj should be labeled

    RiaRa*Raj ∪ Rij

Result

Original DFA (converted to a GNFA):

After eliminating state q1:

Now the Theorem Again

A language is regular if and only if some regular expression describes it.
Proof requires two parts.
First Part: If a language is regular, then it is described by some regular expression.
Second Part: If a language is described by some regular expression, then it is a regular language.

DFA to Regular Expression

To show the first part, if we are given an DFA, we need to show that there is a regular expression that describes exactly the language of the DFA.

The construction of the regular expression begins with the DFA and each step will eliminate one state of the DFA until the only state(s) remaining are the start state and a final state.

Example

Exercise 1.21

Convert this DFA to a regular expression that describes the same language.

Regular Expresssion to NFA

To show that for any regular expression there is an NFA that recognizes the same language described by the regular expression, the proof describes a procedure for constructing the NFA from the regular expression.

  1. Parse the regular expression into its parts based on the regular expression operator precedence and parentheses used if any to determine the operands of each operator.
  2. star is highest, then concatenation, and union is lowest precedence.
  3. For each of the operators use the construction described in showing the closure properties of regular languages to construct an NFA for each operator and its operands.

See Lemma 1.55 in the text.

Example regular expression

   a(ba)*b ∪ bab
  1. Construct the NFA for (ba): concatenation of b and a
  2. Construct the NFA for (ba)*: star of (1)
  3. Construct the NFA for a(ba)*: concatenation of a and (2)
  4. Construct the NFA for a(ba)*b: concatenation of (3) and b
  5. Construct the NFA for ba: concatenation of b and a
  6. Construct the NFA for bab: concatenation of 5 and b
  7. Construct the NFA for a(ba)*b ∪ bab: union of (4) and (6)

More Text Exercises

  1. 1.19
  2. 1.20
  3. Is this language L over the alphabet {a, b} regular?

    L = {anbm | n >= 0, m >= 0 and n ≠ m }

Pumping Lemma

Pumping Lemma for Regular Languages: If A is a regular language, then there is a number p such that if s is any string in A of length at least p, then s may be divided into three pieces, s = xyz with the properties:

  1. for each i >= 0, xyiz ∈ A,
  2. |y| > 0, and
  3. |xy| <= p

Strategies

Strategies for deciding whether a formal language is regular.

  1. It may be easier to show whether the complement is regular.
  2. Use the pumping lemma to show the language is not regular.

Example 1: { anbn | n >= 0 } is not regular.

Example 2: {anbm | n >= 0, m >= 0 and n ≠ m } is not regular.