Algorithms: Union-Find (Part I)

1. Algorithms

Algorithms + Data Structures = Programs

The notion of algorithm was first formalized by Church and Turing in the 1930s (http://en.wikipedia.org/wiki/Church%E2%80%93Turing_thesis).

Algorithms are computational models that are replacing mathematical models.

2. Dynamic Connectivity

Union-Find

Given a set of N objects, we can define two operations:

  • Union Command - connect two objects p and q (create a path between p and q)
  • Connected Query - are two objects p and q connected (is there a path from p to q)

Properties of the Union-Find algorithm:

  • Reflexive - p is connected to p (itself)
  • Symmetric - if p is connected to q, then q is connected to p
  • Transitive - if p is connected to an object q and q is connected to r, then p is connected to r

CONNECTED COMPONENTS - a maximal set of objects that are mutually connected.

Connected Components

We can implement our data type and the set of operations as a Python class:

3. Quick-Find

An eager approach to implement the Quick-Find algorithm would be to use an array of ids. The operations we can define on our data structure are:

  • FIND - p and q are connected if they have the same id
  • UNION - merge components by changing all entities that equal id[p] to id[q]

Here we have an example on how our data structure looks after performing UNION commands:

0 1 2 3 4 5 6 7 8 9
id 0 1 1 8 8 0 0 1 8 8

Quick-Find algorithm implementation (eager approach):

Resulting Cost Model:

algorithm initialize union find
Quick-Find N N 1

Defect: takes quadratic time (N^2) to process N union commands on a set of N objects

Other approaches and/or improvements to Quick-Find will follow in Part II.