Revised Nov 1, 2016

CS 462: Formal Languages and Parsing


Watch a video introduction to this course on YouTube.

General description

Building on CS 360, this course discusses more advanced topics in formal languages and automata theory, including applications, compiler writing, and parsing methods.

Logistics

Audience

  • CS major students. Usually taken in fourth year.

Normally available

  • Winter Term

Related courses

  • Pre-requisites: CS 360 or 365 or equivalent course
  • Successors: Graduate courses in compiler writing or formal languages

For official details, see the UW calendar.

Software/hardware used

  • --

Typical reference(s)

  • J. Shallit, A Second Course in Formal Languages & Automata Theory, Cambridge University Press

Required preparation

At the start of the course, students should be able to

  • Prove statements by induction
  • Explain basic computing models, such as finite automata, pushdown automata, and Turing machines
  • Describe grammars and machine models

Learning objectives

At the end of the course, students should be able to

  • Explain more about basic computer models
  • Describe the theory behind modern parsing methods
  • Solve problems in the theory of combinatorics on words

Typical syllabus

Properties of strings (3 hours)

  • Outline of formal language theory
  • Strings, machines, proofs by induction
  • Combinatorics on words
  • Thue's problem

Regular sets (10 hours)

  • Closure of regular sets under quotient, substitution, and inverse homomorphism
  • Decision algorithms for regular sets
  • Myhill-Nerode theorem
  • Minimization of finite automata
  • Finite-state transducers

Context-free languages (6 hours)

  • Coping with ambiguity
  • Inherently ambiguous CFL's
  • Closure of context-free languages under substitution, inverse homomorphism, and intersection with regular sets
  • Decision algorithms for context-free languages
  • Parsing arbitrary context-free grammars
  • Decidability results for CFG's

Parsing (6 hours)

  • Phases of compilation
  • Top-down parsing
  • LL(1) grammars
  • Bottom-up parsing
  • LR(0) grammars and LR(k) grammars

Chomsky hierarchy (3 hours)

  • Left- and right-regular grammars
  • Unrestricted (Type 0) grammars
  • Equivalence of Type 0 grammars and Turing machines
  • Context-sensitive languages
  • Linear bounded automata

Deterministic context-free languages (3 hours)

  • Normal forms
  • Closure under complementation
  • Relationship to LR(0) grammars

Other language classes (5 hours)

  • L-systems
  • Applications to computer graphics
  • Rational series