Union-Find

COS 265 - Data Structures & Algorithms

Union-Find

dynamic-connectivity problem

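Given a set of \(N\) elements, support two operations:

  • connect(p, q): connect elements p and q
  • isConnected(p, q): is there a path connecting elements p and q?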

dynamic-connectivity problem

connect(4, 3)
connect(3, 8)
connect(6, 5)
connect(9, 4)
connect(2, 1)
isConnected(8, 9) // ?
isConnected(5, 7) // ?
connect(5, 0)
connect(7, 2)
connect(6, 1)
connect(1, 0)
isConnected(5, 7) // ?

determining isConnected this way can be tricky, especially as the number of elements and connect calls increases...

connect(4, 3)     // <--
connect(3, 8)
connect(6, 5)
connect(9, 4)
connect(2, 1)
isConnected(8, 9) // ?   
isConnected(5, 7) // ?    
connect(5, 0)
connect(7, 2)
connect(6, 1)
connect(1, 0)
isConnected(5, 7) // ?   



instead, working and thinking visually can make the problem a little easier

connect(4, 3)
connect(3, 8)     // <--
connect(6, 5)
connect(9, 4)
connect(2, 1)
isConnected(8, 9) // ?
isConnected(5, 7) // ?    
connect(5, 0)
connect(7, 2)
connect(6, 1)
connect(1, 0)
isConnected(5, 7) // ?



connect(4, 3)
connect(3, 8)
connect(6, 5)     // <--
connect(9, 4)
connect(2, 1)
isConnected(8, 9) // ?
isConnected(5, 7) // ?    
connect(5, 0)
connect(7, 2)
connect(6, 1)
connect(1, 0)
isConnected(5, 7) // ?



connect(4, 3)
connect(3, 8)
connect(6, 5)
connect(9, 4)     // <--
connect(2, 1)
isConnected(8, 9) // ?
isConnected(5, 7) // ?    
connect(5, 0)
connect(7, 2)
connect(6, 1)
connect(1, 0)
isConnected(5, 7) // ?



connect(4, 3)
connect(3, 8)
connect(6, 5)
connect(9, 4)
connect(2, 1)     // <--
isConnected(8, 9) // ?
isConnected(5, 7) // ?    
connect(5, 0)
connect(7, 2)
connect(6, 1)
connect(1, 0)
isConnected(5, 7) // ?



connect(4, 3)
connect(3, 8)
connect(6, 5)
connect(9, 4)
connect(2, 1)
isConnected(8, 9) // ? <--
isConnected(5, 7) // ? <--
connect(5, 0)
connect(7, 2)
connect(6, 1)
connect(1, 0)
isConnected(5, 7) // ?



visually (intuitively), this is much easier to answer

connect(4, 3)
connect(3, 8)
connect(6, 5)
connect(9, 4)
connect(2, 1)
isConnected(8, 9) // true
isConnected(5, 7) // false
connect(5, 0)
connect(7, 2)
connect(6, 1)
connect(1, 0)
isConnected(5, 7) // ?



connect(4, 3)
connect(3, 8)
connect(6, 5)
connect(9, 4)
connect(2, 1)
isConnected(8, 9) // true
isConnected(5, 7) // false
connect(5, 0)     // <--
connect(7, 2)     // <--
connect(6, 1)     // <--
connect(1, 0)     // <--
isConnected(5, 7) // ?



connect(4, 3)
connect(3, 8)
connect(6, 5)
connect(9, 4)
connect(2, 1)
isConnected(8, 9) // true
isConnected(5, 7) // false
connect(5, 0)
connect(7, 2)
connect(6, 1)
connect(1, 0)
isConnected(5, 7) // ? <--



connect(4, 3)
connect(3, 8)
connect(6, 5)
connect(9, 4)
connect(2, 1)
isConnected(8, 9) // true
isConnected(5, 7) // false
connect(5, 0)
connect(7, 2)
connect(6, 1)
connect(1, 0)
isConnected(5, 7) // true



A larger connectivity example

Is there a path connecting cyan and pink elements?

Yes.

Note: finding the path explicitly is a harder problem

Note: as the problem size gets larger, the problem becomes harder

modeling the elements

Applications involve manipulating elements of all types

modeling the elements

When programming, it is convenient to name the elements 0 to N-1.

modeling the elements

We model "is connected to" as an equivalence relation, which is reflexive, symmetric, and transitive.


Reflexive
p is connected to p

Symmetric
if p is connected to q, then q is connected to p

Transitive
if p is connected to q and q is connected to r, then p is connected to r

quiz: equivalence relation

Which is not a property of an equivalence relation?

  1. Associative
  2. Reflexive
  3. Transitive
  4. Symmetric

modeling the elements

Connected component
maximal set of elements that are mutually connected

Example:

3 disjoint sets / connected components

\[ \{0\}\ \{1,4,5\}\ \{2,3,6,7\} \]

two core operations on disjoint sets

union(p, q)
replace sets containing elements p and q with their union
find(p)
in which set is element p?
isConnected(p, q)
can be defined as find(p) == find(q)

\[\{0\}\ \{1,4,5\}\ \{2,3,6,7\}\quad\underset{\textrm{union}(2,5)}{\Rightarrow}\quad\{0\}\ \{1,2,3,4,5,6,7\}\]

find(5) != find(6)
union(2, 5)         // 3 disjoint sets -> 2 disjoint sets
find(5) == find(6)
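
The three lines above can be tried out with a deliberately naive model. This is only a sketch: the class name `DisjointSetsSketch`, the `label` array, and the `count` helper are illustrative assumptions, not part of the lecture's API. Each element stores the label of its set, and `isConnected` is defined exactly as `find(p) == find(q)`.

```java
import java.util.Arrays;

// Illustrative only: a naive disjoint-set model where label[p] names p's set.
class DisjointSetsSketch {
    private final int[] label;

    DisjointSetsSketch(int[] initialLabels) {
        label = Arrays.copyOf(initialLabels, initialLabels.length);
    }

    // find(p): in which set is element p?
    int find(int p) {
        return label[p];
    }

    // union(p, q): replace the sets containing p and q with their union
    void union(int p, int q) {
        int from = label[p];
        int to = label[q];
        for (int i = 0; i < label.length; i++)
            if (label[i] == from) label[i] = to;
    }

    // isConnected(p, q) is defined as find(p) == find(q)
    boolean isConnected(int p, int q) {
        return find(p) == find(q);
    }

    // number of disjoint sets currently represented
    int count() {
        return (int) Arrays.stream(label).distinct().count();
    }

    public static void main(String[] args) {
        // {0} {1,4,5} {2,3,6,7}, each set labelled by its smallest element
        DisjointSetsSketch s =
            new DisjointSetsSketch(new int[] {0, 1, 2, 2, 1, 1, 2, 2});
        System.out.println(s.isConnected(5, 6)); // false
        s.union(2, 5);                           // 3 disjoint sets -> 2 disjoint sets
        System.out.println(s.isConnected(5, 6)); // true
        System.out.println(s.count());           // 2
    }
}
```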

modeling dynamic-connectivity using u-f

How to model the dynamic-connectivity problem using union-find?

Maintain disjoint sets that correspond to connected components

union(2, 5)

union-find data type (api)

Goal: design an efficient union-find data type

public class UF {
    // initialize union-find data structure with N singleton sets (0 to N-1)
    UF(int N) { ... }

    // merge sets containing elements p and q
    void union(int p, int q) { ... }

    // identifier for set containing element p (0 to N-1)
    int find(int p) { ... }
}

dynamic-connectivity client


public static void main(String[] args) {
    int N = StdIn.readInt();
    UF uf = new UF(N);
    while(!StdIn.isEmpty()) {
        int p = StdIn.readInt();
        int q = StdIn.readInt();
        if(uf.find(p) != uf.find(q)) {
            uf.union(p, q);
            StdOut.println(p + " " + q);
        }
    }
}

dynamic-connectivity client

Note: with the input below, the pairs on lines 7, 11, and 12 (8 9, 1 0, and 6 7) are already connected when they are read, so they are not printed.

Input:

10
4 3
3 8
6 5
9 4
2 1
8 9
5 0
7 2
6 1
1 0
6 7

Output:

4 3
3 8
6 5
9 4
2 1
5 0
7 2
6 1
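
To check the output above without the `StdIn`/`StdOut` helpers, here is a sketch of the same client loop reading pairs from an in-memory array. The class name `ConnectivityClientSketch`, the method `newConnections`, and the nested stand-in `UF` are assumptions made for illustration; any implementation of the API would do in its place.

```java
import java.util.ArrayList;
import java.util.List;

class ConnectivityClientSketch {
    // Stand-in UF (assumption): a simple set-label array behind the same API.
    static class UF {
        private final int[] id;

        UF(int n) {
            id = new int[n];
            for (int i = 0; i < n; i++) id[i] = i;
        }

        int find(int p) {
            return id[p];
        }

        void union(int p, int q) {
            int pid = id[p];
            int qid = id[q];
            for (int i = 0; i < id.length; i++)
                if (id[i] == pid) id[i] = qid;
        }
    }

    // Returns the pairs that were not yet connected, in input order.
    static List<String> newConnections(int n, int[][] pairs) {
        UF uf = new UF(n);
        List<String> printed = new ArrayList<>();
        for (int[] pair : pairs) {
            int p = pair[0];
            int q = pair[1];
            if (uf.find(p) != uf.find(q)) {
                uf.union(p, q);
                printed.add(p + " " + q);
            }
        }
        return printed;
    }

    public static void main(String[] args) {
        int[][] pairs = {
            {4, 3}, {3, 8}, {6, 5}, {9, 4}, {2, 1}, {8, 9},
            {5, 0}, {7, 2}, {6, 1}, {1, 0}, {6, 7}
        };
        for (String line : newConnections(10, pairs))
            System.out.println(line);
    }
}
```

Running `main` prints eight pairs; 8 9, 1 0, and 6 7 are skipped because they are already connected.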

Union-Find

quick find implementation

quick-find (eager approach)

Data Structure


\[ \{0,5,6\}\ \{1,2,7\}\ \{3,4,8,9\} \]

//           0 1 2 3 4 5 6 7 8 9   index
int [] id = {0,1,1,8,8,0,0,1,8,8};
// find(5) == 0

Q: How to implement find(p)?
A: Easy, just return id[p]

quick-find (eager approach)

Data Structure

//           0 1 2 3 4 5 6 7 8 9   index
int [] id = {0,1,1,8,8,0,0,1,8,8};
union(6,1);
//     id = ??

Q: How to implement union(p,q)?
A: Change all entries whose identifier equals id[p] to id[q].
id = {1,1,1,8,8,1,1,1,8,8}

quick-find java implementation

public class QuickFindUF {
    private int[] id;

    public QuickFindUF(int N) {
        // set id of each element to itself.  N array accesses
        id = new int[N];
        for(int i = 0; i < N; i++)
            id[i] = i;
    }

    public int find(int p) {
        // return the id of p.  1 array access
        return id[p];
    }

    public void union(int p, int q) {
        // change all entries with id[p] to id[q]
        // N+2 to 2N+2 array accesses
        int pid = id[p];
        int qid = id[q];
        for(int i = 0; i < id.length; i++) {
            if(id[i] == pid) id[i] = qid;
        }
    }
}

quick-find is too slow

Cost model
Number of array accesses (for read or write)

algorithm    initialize   union   find
quick-find   \(N\)        \(N\)   \(1\)

Note: ignoring leading constant


Union is too expensive! Processing a sequence of \(N\) union operations on \(N\) elements takes more than \(N^2\) (quadratic) array accesses.
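
One way to see the quadratic growth is to count array accesses directly. The sketch below is an instrumented copy of quick-find; the class name `QuickFindCostSketch`, the `accesses` counter, and the chain of unions (0,1), (1,2), ... used as the workload are assumptions chosen for illustration, not measurements from the lecture.

```java
// Quick-find instrumented to count array accesses (reads and writes).
class QuickFindCostSketch {
    private final int[] id;
    long accesses = 0;

    QuickFindCostSketch(int n) {
        id = new int[n];
        for (int i = 0; i < n; i++) {
            id[i] = i;
            accesses++;
        }
    }

    void union(int p, int q) {
        int pid = id[p];
        accesses++;
        int qid = id[q];
        accesses++;
        for (int i = 0; i < id.length; i++) {
            accesses++;              // read id[i] for the comparison
            if (id[i] == pid) {
                id[i] = qid;
                accesses++;          // write id[i]
            }
        }
    }

    // Array accesses spent on the unions (0,1), (1,2), ..., (n-2,n-1).
    static long costOfChain(int n) {
        QuickFindCostSketch uf = new QuickFindCostSketch(n);
        long before = uf.accesses;   // exclude initialization
        for (int i = 0; i + 1 < n; i++)
            uf.union(i, i + 1);
        return uf.accesses - before;
    }

    public static void main(String[] args) {
        for (int n = 1000; n <= 8000; n *= 2)
            System.out.println(n + " elements: " + costOfChain(n) + " array accesses");
    }
}
```

Each doubling of the number of elements roughly quadruples the count, which is the signature of a quadratic algorithm.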

quadratic algorithms do not scale

Rough standard (for now)


Ex. Huge problem for quick-find

quadratic algorithms do not scale

Quadratic algorithms don't scale with technology

Union-Find

quick union implementation

quick-union (lazy approach)

Data Structure

quick-union (lazy approach)

\[ \{0\}\ \{1\}\ \{2,3,4,9\}\ \{5,6\}\ \{7\}\ \{8\} \]

//               0 1 2 3 4 5 6 7 8 9   index
int [] parent = {0,1,9,4,9,6,6,7,8,9};
// parent of 3 is 4, parent of 4 is 9, parent of 9 is 9
//   root of 3 is 9
// parent and root of 5 is 6

Q: How to implement find(p)?
A: Return root of tree containing p

quick-union (lazy approach)

\[ \ldots \{2,3,4,9\} \{5,6\} \ldots \Rightarrow \ldots \{2,3,4,5,6,9\} \ldots \]

//               0 1 2 3 4 5 6 7 8 9   index
int [] parent = {0,1,9,4,9,6,6,7,8,9};
union(3, 5)
//     parent = ???

Q: How to implement union(p,q)?
A: Set the parent of p's root to q's root.

quick-union (lazy approach)

\[ \ldots \{2,3,4,9\} \{5,6\} \ldots \Rightarrow \ldots \{2,3,4,5,6,9\} \ldots \]

//               0 1 2 3 4 5 6 7 8 9   index
int [] parent = {0,1,9,4,9,6,6,7,8,9};
union(3, 5)
//               0 1 2 3 4 5 6 7 8 9   index
//     parent = {0,1,9,4,9,6,6,7,8,6}
//                                 ^ only one value changes!

quick-union demo

union(4,3)
union(3,8)
union(6,5)
union(9,4)
union(2,1)
isConnected(8,9)
!isConnected(5,4)
union(5,0)
union(7,2)
union(6,1)
union(7,3)

quick-union demo

int [] parent = {0, ..., 9};

union(4,3);  // <== action

index:    0 1 2 3 4 5 6 7 8 9
parent: { 0 1 2 3 4 5 6 7 8 9 }
parent: { ??? }


union(4,3);

union(3,8);  // <== action
index:    0 1 2 3 4 5 6 7 8 9
parent: { 0 1 2 3 4 5 6 7 8 9 }
parent: { 0 1 2 3 3 5 6 7 8 9 }
parent: { ??? }

union(3,8);

union(6,5);  // <== action
index:    0 1 2 3 4 5 6 7 8 9
parent: { 0 1 2 3 3 5 6 7 8 9 }
parent: { 0 1 2 8 3 5 6 7 8 9 }
parent: { ??? }

union(6,5);

union(9,4);  // <== action
index:    0 1 2 3 4 5 6 7 8 9
parent: { 0 1 2 8 3 5 6 7 8 9 }
parent: { 0 1 2 8 3 5 5 7 8 9 }
parent: { ??? }

union(9,4);

union(2,1);  // <== action
index:    0 1 2 3 4 5 6 7 8 9
parent: { 0 1 2 8 3 5 5 7 8 9 }
parent: { 0 1 2 8 3 5 5 7 8 8 }
parent: { ??? }

union(2,1);

union(5,0);  // <== action
index:    0 1 2 3 4 5 6 7 8 9
parent: { 0 1 2 8 3 5 5 7 8 8 }
parent: { 0 1 1 8 3 5 5 7 8 8 }
parent: { ??? }

union(5,0);

union(7,2);  // <== action
index:    0 1 2 3 4 5 6 7 8 9
parent: { 0 1 1 8 3 5 5 7 8 8 }
parent: { 0 1 1 8 3 0 5 7 8 8 }
parent: { ??? }

union(7,2);

union(6,1);  // <== action
index:    0 1 2 3 4 5 6 7 8 9
parent: { 0 1 1 8 3 0 5 7 8 8 }
parent: { 0 1 1 8 3 0 5 1 8 8 }
parent: { ??? }

union(6,1);

union(7,3);  // <== action
index:    0 1 2 3 4 5 6 7 8 9
parent: { 0 1 1 8 3 0 5 1 8 8 }
parent: { 1 1 1 8 3 0 5 1 8 8 }
parent: { ??? }

union(7,3);

// all done!
index:    0 1 2 3 4 5 6 7 8 9
parent: { 1 1 1 8 3 0 5 1 8 8 }
parent: { 1 8 1 8 3 0 5 1 8 8 }

quick-union java implementation

public class QuickUnionUF {
    private int[] parent;

    public QuickUnionUF(int N) {
        // set parent of each element to itself.  N array accesses
        parent = new int[N];
        for(int i = 0; i < N; i++)
            parent[i] = i;
    }

    public int find(int p) {
        // chase parent pointers until root.  depth of p array accesses
        while(p != parent[p])
            p = parent[p];
        return p;
    }

    public void union(int p, int q) {
        // change root of p to point to root of q
        // depth of p and q array accesses
        int i = find(p);
        int j = find(q);
        if(i == j) return; // already unioned
        parent[i] = j;
    }
}
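
As a check on the demo trace, the following sketch replays the same nine union calls and prints the resulting parent[] array. The class name `QuickUnionDemoSketch` and the helpers `snapshot` and `replayDemo` are illustrative additions; `find` and `union` follow the implementation above.

```java
import java.util.Arrays;

class QuickUnionDemoSketch {
    private final int[] parent;

    QuickUnionDemoSketch(int n) {
        parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i;
    }

    // chase parent pointers until reaching a root
    int find(int p) {
        while (p != parent[p]) p = parent[p];
        return p;
    }

    // link the root of p to the root of q
    void union(int p, int q) {
        int i = find(p);
        int j = find(q);
        if (i != j) parent[i] = j;
    }

    // copy of the parent[] array, for inspection only
    int[] snapshot() {
        return Arrays.copyOf(parent, parent.length);
    }

    // Replays the union sequence from the quick-union demo.
    static int[] replayDemo() {
        QuickUnionDemoSketch uf = new QuickUnionDemoSketch(10);
        int[][] unions = {
            {4, 3}, {3, 8}, {6, 5}, {9, 4}, {2, 1}, {5, 0}, {7, 2}, {6, 1}, {7, 3}
        };
        for (int[] u : unions) uf.union(u[0], u[1]);
        return uf.snapshot();
    }

    public static void main(String[] args) {
        // prints [1, 8, 1, 8, 3, 0, 5, 1, 8, 8]
        System.out.println(Arrays.toString(replayDemo()));
    }
}
```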

quick-union is also too slow

Cost model
Number of array accesses (for read or write)

algorithm     initialize   union           find
quick-find    \(N\)        \(N\)           \(1\)
quick-union   \(N\)        \(N^\dagger\)   \(N\)

\(\dagger\) includes cost of finding two roots

Note: analyzed quick-union for worst case

quick-union is also too slow

Quick-find defect

  • Union too expensive (more than \(N\) array accesses)
  • Trees are flat, but too expensive to keep them flat

Quick-union defect

  • Trees can get tall
  • Find too expensive (could be more than \(N\) array accesses)
// worst-case input
union(0,1);
union(0,2);
union(0,3);
union(0,4);
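
To see how tall the tree gets, the sketch below generalizes the worst-case input to union(0,1), union(0,2), ..., union(0,n-1) and measures the depth of element 0. The class name `TallTreeSketch` and its helper methods are hypothetical names used only for this illustration; the linking rule is the plain quick-union rule.

```java
class TallTreeSketch {
    // follow parent links from p to its root
    static int root(int[] parent, int p) {
        while (p != parent[p]) p = parent[p];
        return p;
    }

    // Runs union(0,1), union(0,2), ..., union(0,n-1) with plain quick-union
    // and returns the number of links between element 0 and its root.
    static int depthOfZeroAfterChain(int n) {
        int[] parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i;
        for (int q = 1; q < n; q++) {
            int i = root(parent, 0);
            int j = root(parent, q);
            if (i != j) parent[i] = j;   // root of 0's tree goes under q
        }
        int depth = 0;
        for (int p = 0; p != parent[p]; p = parent[p]) depth++;
        return depth;
    }

    public static void main(String[] args) {
        System.out.println(depthOfZeroAfterChain(5)); // 4
    }
}
```

With this input every union puts the whole existing tree under a single new root, so the tree degenerates into a chain and the depth of element 0 grows linearly with the number of elements.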

Union-find

improvements

improvement 1: weighting

Weighted quick-union

quiz: weighted quick-union

Suppose that the parent[] array during weighted quick union is

//               0 1 2 3 4 5 6 7 8 9
int [] parent = {0,0,0,0,0,0,7,8,8,8};

Which parent[] entry changes during union(2,6)?

A. parent[0]
B. parent[2]
C. parent[6]
D. parent[8]

weighted quick-union demo

union(4,3)
union(3,8)
union(6,5)
union(9,4)
union(2,1)
union(5,0)
union(7,2)
union(6,1)
union(7,3)

weighted quick-union demo

int [] parent = {0,1,2,3,4,5,6,7,8,9};
union(4,3);     // <- next step

union(4,3);     // 0 1 2 3 4 5 6 7 8 9 => 0 1 2 4 4 5 6 7 8 9
union(3,8);     // <- next step

union(3,8);     // 0 1 2 4 4 5 6 7 8 9 => 0 1 2 4 4 5 6 7 4 9
union(6,5);     // <- next step

union(6,5);     // 0 1 2 4 4 5 6 7 4 9 => 0 1 2 4 4 6 6 7 4 9
union(9,4);     // <- next step

union(9,4);     // 0 1 2 4 4 6 6 7 4 9 => 0 1 2 4 4 6 6 7 4 4
union(2,1);     // <- next step

union(2,1);     // 0 1 2 4 4 6 6 7 4 4 => 0 2 2 4 4 6 6 7 4 4
union(5,0);     // <- next step

union(5,0);     // 0 2 2 4 4 6 6 7 4 4 => 6 2 2 4 4 6 6 7 4 4
union(7,2);     // <- next step

union(7,2);     // 6 2 2 4 4 6 6 7 4 4 => 6 2 2 4 4 6 6 2 4 4
union(6,1);     // <- next step

union(6,1);     // 6 2 2 4 4 6 6 2 4 4 => 6 2 6 4 4 6 6 2 4 4
union(7,3);     // <- next step

union(7,3);     // 6 2 6 4 4 6 6 2 4 4 => 6 2 6 4 6 6 6 2 4 4
// all done!

weighted quick-union demo

quick-union

weighted quick-union

quick-union vs. weighted quick-union

A larger example: 100 sites, 88 union() operations

quick-union, average distance to root: 5.11

weighted quick-union, average distance to root: 1.52

weighted quick-union java implementation

Data structure: same as quick-union, but maintain extra array size[i] to count number of elements in the tree rooted at i, initially set to 1.

Find: identical to quick-union

Union: modify quick-union to:

int i = find(p);
int j = find(q);
if(i == j) return;
if(size[i] < size[j]) { parent[i] = j; size[j] += size[i]; }
else                  { parent[j] = i; size[i] += size[j]; }
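
Putting the pieces together, a complete class might look like the sketch below. The class name `WeightedQuickUnionSketch` and the helpers `depth`, `snapshot`, and `replayDemo` are illustrative additions; the `union` body is the one shown above. Replaying the weighted quick-union demo sequence gives a way to check the trace.

```java
import java.util.Arrays;

class WeightedQuickUnionSketch {
    private final int[] parent;
    private final int[] size;   // size[i] = number of elements in the tree rooted at i

    WeightedQuickUnionSketch(int n) {
        parent = new int[n];
        size = new int[n];
        for (int i = 0; i < n; i++) {
            parent[i] = i;
            size[i] = 1;
        }
    }

    // identical to quick-union
    int find(int p) {
        while (p != parent[p]) p = parent[p];
        return p;
    }

    // link the root of the smaller tree to the root of the larger tree
    void union(int p, int q) {
        int i = find(p);
        int j = find(q);
        if (i == j) return;
        if (size[i] < size[j]) { parent[i] = j; size[j] += size[i]; }
        else                   { parent[j] = i; size[i] += size[j]; }
    }

    // number of links between p and its root
    int depth(int p) {
        int d = 0;
        while (p != parent[p]) { p = parent[p]; d++; }
        return d;
    }

    int[] snapshot() {
        return Arrays.copyOf(parent, parent.length);
    }

    // Replays the union sequence from the weighted quick-union demo.
    static WeightedQuickUnionSketch replayDemo() {
        WeightedQuickUnionSketch uf = new WeightedQuickUnionSketch(10);
        int[][] unions = {
            {4, 3}, {3, 8}, {6, 5}, {9, 4}, {2, 1}, {5, 0}, {7, 2}, {6, 1}, {7, 3}
        };
        for (int[] u : unions) uf.union(u[0], u[1]);
        return uf;
    }

    public static void main(String[] args) {
        // prints [6, 2, 6, 4, 6, 6, 6, 2, 4, 4]
        System.out.println(Arrays.toString(replayDemo().snapshot()));
    }
}
```

In the resulting forest no element is more than 2 links from the root, compared with a chain of 4 links (6, 5, 0, 1, 8) produced by plain quick-union on the same sequence.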

weighted quick-union analysis

Running time

Proposition: depth of any node \(\textsf{x}\) is at most \(\lg N\)

\[N = 10\] \[\text{depth}(\textsf{x}) \leq \lg N \approx 3.32\]

Note: in computer science, \(\lg\) means base-2 logarithm

weighted quick-union analysis

Proposition: depth of any node \(\textsf{x}\) is at most \(\lg N\)

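Proof: What causes the depth of element \(\textsf{x}\) to increase? It increases by 1 when the root of the tree \(T_1\) containing \(\textsf{x}\) is linked to the root of another tree \(T_2\). Weighting always links the smaller tree below the larger one, so \(|T_2| \geq |T_1|\) and the size of the tree containing \(\textsf{x}\) at least doubles. A tree holds at most \(N\) elements, so its size can double at most \(\lg N\) times. Hence the depth of \(\textsf{x}\) is at most \(\lg N\).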

weighted quick-union analysis


algorithm     initialize   union               find
quick-find    \(N\)        \(N\)               \(1\)
quick-union   \(N\)        \(N^\dagger\)       \(N\)
weighted QU   \(N\)        \(\lg N^\dagger\)   \(\lg N\)

\(\dagger\) includes cost of finding two roots

Note: analyzed quick-union for worst case

summary

Key point: weighted quick-union makes it possible to solve problems that could not otherwise be addressed.

algorithm                         worst-case time
quick-find                        \(M N\)
quick-union                       \(M N\)
weighted QU                       \(N + M \log N\)
QU + path compression*            \(N + M \log N\)
weighted QU + path compression*   \(N + M \invackermann(N) \approx N+M\)

Order of growth for \(M\) union-find ops on a set of \(N\) elements

Example: \(10^9\) unions and finds with \(10^9\) elements

[ \(\invackermann\): inverse Ackermann function
*path compression analysis is amortized ]
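
The table names path compression without showing it. As a hedged sketch of one common variant (two-pass compression, where every node visited by find is re-pointed directly at the root), weighted quick-union could be extended as follows. The class name `PathCompressionSketch` and the accessor `parentOf` are assumptions for illustration and not the lecture's code.

```java
class PathCompressionSketch {
    private final int[] parent;
    private final int[] size;

    PathCompressionSketch(int n) {
        parent = new int[n];
        size = new int[n];
        for (int i = 0; i < n; i++) {
            parent[i] = i;
            size[i] = 1;
        }
    }

    // First pass finds the root; second pass points every visited node at it.
    int find(int p) {
        int root = p;
        while (root != parent[root]) root = parent[root];
        while (p != root) {
            int next = parent[p];
            parent[p] = root;
            p = next;
        }
        return root;
    }

    // weighted union, as in weighted quick-union
    void union(int p, int q) {
        int i = find(p);
        int j = find(q);
        if (i == j) return;
        if (size[i] < size[j]) { parent[i] = j; size[j] += size[i]; }
        else                   { parent[j] = i; size[i] += size[j]; }
    }

    // exposes a parent link for inspection only
    int parentOf(int p) {
        return parent[p];
    }

    public static void main(String[] args) {
        PathCompressionSketch uf = new PathCompressionSketch(4);
        uf.union(0, 1);
        uf.union(2, 3);
        uf.union(0, 2);                      // 3 -> 2 -> 0
        System.out.println(uf.parentOf(3));  // 2
        uf.find(3);                          // compresses the path from 3
        System.out.println(uf.parentOf(3));  // 0
    }
}
```

Later finds on a compressed path take a single step, which is why the cost is stated as amortized over a sequence of operations rather than per call.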

Union-Find

applications

Union-find applications

hex, the game

The game of Hex is played on a diamond-shaped board of hexagons. Two players alternate turns by placing their colored stones (red/blue, white/black, etc.) on the board, attempting to make a connection between their respective opposite sides.

[ Hex board, photo by David J. Bush, link ]

dynamic-connectivity solution ⇒ winner

Q: How to determine if a player has won?
A: Model as a dynamic-connectivity problem and use union-find

dynamic-connectivity solution ⇒ winner

Create a node for each hexagon tile, named \(0\) to \(N^2-1\)

Color the node of the player to represent placing a stone

Add edge between two adjacent nodes if they are similarly colored
Note: could add up to 6 edges

A player wins when there is a path between their opposite sides of the board from top–bottom or left–right

Example: check each node at top against each node at bottom

How can we check this more efficiently?

Clever trick: introduce 4 virtual nodes, edges where appropriate

A player wins when there is a path between opposite virtual nodes
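
The virtual-node trick can be sketched as below. Several details are assumptions made for this illustration and are not specified in the lecture: the class name `HexWinSketch`, RED joining top to bottom and BLUE joining left to right, row-major tile numbering, the six neighbor offsets for a rhombus-shaped board stored as a square grid, and the use of plain quick-union for brevity. The sketch also assumes each stone is placed on an empty tile.

```java
class HexWinSketch {
    static final int EMPTY = 0, RED = 1, BLUE = 2;

    private final int n;          // board is n-by-n
    private final int[] color;    // color of each tile, row-major
    private final int[] parent;   // n*n tiles followed by 4 virtual nodes
    private final int top, bottom, left, right;

    HexWinSketch(int n) {
        this.n = n;
        color = new int[n * n];
        parent = new int[n * n + 4];
        for (int i = 0; i < parent.length; i++) parent[i] = i;
        top = n * n;
        bottom = n * n + 1;
        left = n * n + 2;
        right = n * n + 3;
    }

    private int find(int p) {
        while (p != parent[p]) p = parent[p];
        return p;
    }

    private void union(int p, int q) {
        parent[find(p)] = find(q);
    }

    // Places a stone and reports whether that player has now connected their sides.
    boolean place(int player, int row, int col) {
        int tile = row * n + col;
        color[tile] = player;
        // up to 6 edges: one per adjacent tile holding the same color
        int[][] delta = { {-1, 0}, {-1, 1}, {0, -1}, {0, 1}, {1, -1}, {1, 0} };
        for (int[] d : delta) {
            int r = row + d[0];
            int c = col + d[1];
            if (r >= 0 && r < n && c >= 0 && c < n && color[r * n + c] == player)
                union(tile, r * n + c);
        }
        if (player == RED) {
            if (row == 0) union(tile, top);
            if (row == n - 1) union(tile, bottom);
            return find(top) == find(bottom);
        } else {
            if (col == 0) union(tile, left);
            if (col == n - 1) union(tile, right);
            return find(left) == find(right);
        }
    }

    public static void main(String[] args) {
        HexWinSketch game = new HexWinSketch(3);
        game.place(RED, 0, 1);
        game.place(BLUE, 1, 0);
        game.place(RED, 1, 1);
        game.place(BLUE, 2, 2);
        System.out.println(game.place(RED, 2, 0)); // true: RED joins top to bottom
    }
}
```

Because only the two virtual nodes are compared, the win check after each move costs two find calls instead of comparing every tile on one side against every tile on the other.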

subtext of today's lecture (and this course)

Steps to developing a usable algorithm to solve a computational problem

  1. Model the problem
  2. Find an algorithm to solve it
  3. Fast enough? Fits in memory?
  4. If not, figure out why
  5. Find a way to address the problem
  6. Iterate until satisfied

This is the scientific method

Mathematical analysis
