CS 240 Data Structures and Data Management


Revised April 28, 2010

General Description

An introduction to widely used and effective methods of data organization, focusing on data structures, their algorithms, and the performance of these algorithms. Specific topics include priority queues, sorting, dictionaries, and data structures for text processing.

Students will learn to analyze, implement and use various data structures and data-management techniques in a variety of applications.

Logistics

Intended for 2B students in Computer Science. Normally available Fall, Winter and Spring.

Related courses (see calendar for official details)

Predecessors: CS 245 (logic) or SE 212; CS 246 or 247 (programming); any of STAT 206, 230, 240 (probability).
Successors: Most third-year CS major courses.
Conflicts: Other courses that seriously consider efficiency and correctness of fundamental data structures and their algorithms.

Hardware/Software used: UNIX, C++

Typical Reference(s)

R. Sedgewick, Algorithms in C, 3rd ed, Parts 1–4. Addison Wesley.

Required preparation

To begin the course, students will need the ability to do the following.

Analytical:

Computational and algorithmic:

Learning objectives

By the end of the course, students should have the ability to In addition to the above language-independent skills, students should have the ability to implement (code, debug, and test) any of the above structures and algorithms in C++, using appropriate design methodologies and language features. They should have preparation to transfer these abilities to other languages (once learned).

Typical syllabus

Introduction and review (3 hours)

The basic computer model: the random-access machine. The runtime of an algorithm: worst-case, best-case, and average-case. Asymptotic analysis, order notation, growth rates, and complexity.

Stacks, queues, and priority queues (3 hours)

Review of stacks and queues. The priority queue ADT and simple implementations. Heaps and Heapsort. Using heaps to solve the selection problem.

Sorting and analysis of randomized algorithms (5 hours)

Quicksort (non-randomized): worst-case, best-case and average-case complexity. Randomized quicksort and its analysis; application to selection and its analysis. Lower bound on comparison-based sorting. Non-comparison-based sorting algorithms (e.g., Counting Sort and Radix Sort).

Search trees (5 hours)

The Dictionary ADT and simple implementations. Binary search trees (insert and delete operations and analysis) Balanced search trees (insert and delete operations and analysis; instructors will normally choose two or more of AVL trees, 2-3 trees, red-black trees, etc.). 2-3-4 trees and B-trees (search, insert and delete operations and analysis).

Hashing (5 hours)

Key-indexed search, simple hash functions. Collision resolution: chaining and open addressing. Complexity of search, insertion and deletion. Extendible hashing. Self-organizing search.

Range search and multidimensional dictionaries (5 hours)

Range search in a binary search tree. Data structures for orthogonal range search: quad trees, Kd-trees, range trees.

Algorithms and data structures for text processing (8 hours)

Dictionaries for text strings: radix trees, tries, compressed tries, suffix tries. String matching algorithms: brute force, finite automata, the Knuth-Morris-Pratt algorithm. Text compression: Huffman codes, Lempel-Ziv.


Campaign Waterloo

David R. Cheriton School of Computer Science
University of Waterloo
Waterloo, Ontario, Canada N2L 3G1

Tel: 519-888-4567 x33293
Fax: 519-885-1208

Contact | Feedback: cs-uops@cs.uwaterloo.ca | David R. Cheriton School of Computer Science | Faculty of Mathematics


Valid HTML 4.01!Valid CSS! Last modified: Thursday, 08-Mar-2012 15:42:29 EST


Menu:ShowHide