You are here

Regular Languages and Finite Automata

Author(s): 
Hing Leung (New Mexico State University)

 

Stephen Cole Kleene (Source: Wikimedia Commons)

Introduction

In 1943, Warren McCulloch and Walter Pitts published a pioneering work on a logical model for studying the behavior of nervous systems in the Bulletin of Mathematical Biophysics [2]. Following on their ideas, Stephen Cole Kleene (1909–1994) wrote the first paper on finite automata and regular expressions in 1956 [1]. A finite automaton can be considered as the simplest machine model in that the machine has a finite memory; that is, the memory size is independent of the input length.

In a paper published in 1959, Michael Rabin and Dana Scott presented finite automata in the simplest mathematical model [3]. However, Kleene’s more complex model of finite automata actually allows the user to express ideas more easily, even though formally both models are equivalent in their expressiveness. A similar situation happens with programming languages. While a lower level assembly-styled language with simpler syntax is sufficient to program the computer to do whatever a high level language can do, the programming community embraces higher level programming languages like Java, C++, and Python for higher productivity.

In this project for computer science students in a theory of computation course, we study Kleene’s concept of finite automata and regular expressions. In particular, we learn Kleene’s own proof of the theorem, now called Kleene’s theorem, that shows finite automata and regular expressions are equivalent in their expressiveness for denoting languages.

Our project, Regular Languages and Finite Automata, is ready for students, and the Latex source is also available for instructors who may wish to modify the project for students. The comprehensive “Notes to the Instructor” presented next are also appended to the project itself. Our project is part of a larger collection published in Convergence. For more projects, see Primary Historical Sources in the Classroom: Discrete Mathematics and Computer Science.

Notes to the Instructor

The project can be used in a senior undergraduate or beginning graduate level course on the theory of computation. Before working on the project, students are expected to have studied the concepts of finite automata and regular languages from a modern textbook. Students will learn that Kleene’s original definition of a finite automaton is more liberal than the textbook’s definition given by Rabin and Scott. Also presented in the project is Kleene’s proof of the equivalence between the expressiveness of regular expressions and finite automata.

The completion of the project requires two weeks. If the students are strong in mathematical thinking and proofs, an instructor may wish to assign the project as homework. Otherwise, the instructor can use some class time to go over materials in the project that the students find challenging. Students can work on the project individually or in groups of two or three. Exercises 2.1-2.5 can be completed in one week, while the rest of the exercises are completed in the second week.

Given that the exercises are open-ended questions, it is advised that the instructor should carefully work through the project before assigning it to the students.

Notes on Exercise 2.2: The usual logic for addition requires the carry bit to be passed sequentially from the lower ordered digits to the higher ordered digits as the addition process is performed. Recall that, with Kleene’s model, the logic for determining the i-th bit (an inner cell) at time t is said to be “determined by the states of all the cells [that include the input cells and inner cells] at time t – 1.” However, Kleene’s model leaves it wide open how the states of all the cells at time t − 1 decide the state of an inner cell at time t. So, the mechanism for determining the state of an inner cell can indeed be summarized by a sequential procedure (as employed in the addition logic).

Notes on Exercise 3.3: Some textbooks use the state elimination method in constructing a regular expression from a given finite automaton. The construction requires the introduction of the concept of generalized nondeterministic finite automata. Another approach is to use dynamic programming that resembles Floyd-Warshall’s algorithm for computing transitive closure of a graph. In contrast, Kleene’s approach seems to be more in line with the usual mathematical induction technique in that the assertion is proved for a small number of states, and an induction step is used to prove the claim as the number of states increases. So, it is quite possible that the students will find Kleene’s approach more natural, assuming that they have had good training in proofs by mathematical induction.

Download the project Regular Languages and Finite Automata as a pdf file ready for classroom use.

Download the modifiable Latex source file for this project.

For more projects, see Primary Historical Sources in the Classroom: Discrete Mathematics and Computer Science.

Bibliography

[1]  S. C. Kleene, “Representation of events in nerve nets and finite automata,” in Automata Studies (C. Shannon and J. McCarthy, eds.), Princeton University Press, NJ, 1956, 3–41.

[2]  W. S. McCulloch and W. H. Pitts, “A logical calculus of the ideas immanent in nervous activity,” Bull. Math. Biophys., 5 (1943), 115–133.

[3]  M. O. Rabin and D. Scott, “Finite automata and their decision problems,” IBM Journal of Research and Development, 3 (1959), 114–125.

Acknowledgment



The development of curricular materials for discrete mathematics has been partially supported by the National Science Foundation's Course, Curriculum and Laboratory Improvement Program under grants DUE-0717752 and DUE-0715392 for which the authors are most appreciative. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

Hing Leung (New Mexico State University), "Regular Languages and Finite Automata," Convergence (July 2013), DOI:10.4169/loci003993