# Disjoint-set Data Structure

·

## Disjoint-set data structure

According to wiki:

In computer science, a disjoint-set data structure, also called a union–find data structure or merge–find set, is a data structure that stores a collection of disjoint (non-overlapping) sets.

The key is every subsets use a special data structure, to make it easy to do two operations:

• merge two subsets: `Union`
• find which subsets the element belongs to: `Find`

For example, we have an array `[s1, s2, s3, s4, s5]`. First, we make every element as a unique subset. Every subsets have a concept called subset leader. For initialization, we make every element as the leader itself. We can use a map to express this data structure, with map key as the element, map value as subset leader.

``````{
s1: s1,
s2: s2,
s3: s3,
s4: s4,
s5: s5,
}
``````

So now if we call function `Find` for element `s2`, we should its leader `s2` too.

Now let call function `Union` for subset `s1` and `s2` to merge the two subsets. So how to express this merged state? We can make `s1` as `s2`'s leader. So later, we can find `s1` and `s2` have the same leader so we know they are in the same subset. So now the data structure becomes:

``````{
s1: s1,
s2: s1,  // here, s2's leader is s1
s3: s3,
s4: s4,
s5: s5,
}
``````

So now if we want to find the leader of `s2`, we will find `s1` first, then call `Find` recursively. When we find the element's leader is itself, then we find the top leader.

So this is the basic process of disjoint set. Let's implement it in code.

``````class DisjointSet {
private m = new Map<number, number>();

constructor(universe: number[]) {
universe.forEach(e => {
this.m.set(e, e);
});
}

public find(k: number): number {
const v = this.m.get(k) as number;
if (v === k) {
return v;
} else {
// call find recursively to get the top leader
return this.find(v);
}
}

public union(a: number, b: number): boolean {
const x = this.find(a);
const y = this.find(b);
if (x === y) return false;

this.m.set(x, y);
// can also set(y, x);
}
}
``````

## Unbalanced Problem

From the above process, we can see that, as we do many merge operations, this map tree may become very deep. We may need to call `Find` recursively many times to find the top leader.

For example, the structure may becomes:

``````{
s1: s1,
s2: s1,
s3: s2,
s4: s3,
s5: s4,
}
``````

In this case, we need to call `Find` many times to find `s5`'s top leader.

There are two ways to solve this problem.

The first solution is called union by rank. The idea is, for each subset, we store a value called `rank`, which stands for the tree's deep degree. Then when we call `Union`, we make the leader with lower rank points to the leader with higher rank. This way, the tree should become more balanced.

The second solution is called path compression. The idea is, when we call `Find` to find element `s5`'s leader, say we find the result is `s1`. Before returns the result, we can set `s5`'s leader as `s1` directly. So next time when we want to find `s5`'s leader, we should get the result instantly.

Now let see how to code these two solutions.

``````class DisjointSet {
private m = new Map<number, number>();

// every leader have its rank
private rank = new Map<number, number>();

constructor(universe: number[]) {
universe.forEach(e => {
this.m.set(e, e);
this.rank.set(e, 0); // rank init with 0
});
}

public find(k: number): number {
// path compression
const v = this.m.get(k) as number;
if (v !== k) {
this.m.set(k, this.find(v));
}
return this.m.get(k) as number;
}

public union(a: number, b: number): boolean {
const x = this.find(a);
const y = this.find(b);
if (x === y) return false;

const rx = this.rank.get(x) as number;
const ry = this.rank.get(y) as number;

// higher rank value as leader
if (rx > ry) {
this.m.set(y, x);
} else if (rx < ry) {
this.m.set(x, y);
} else {
// same rank, either can be leader
this.m.set(x, y);

// now tree becomes deeper
this.rank.set(y, ry + 1);
}
return true;
}
}
``````

## redundant-connection

Now let's see how to use this data structure to solve problems.

The problem is to find the reduncdant connection. We could make every node as a subset of the disjoint set. Then we can iterate all the edges. For every iteration, we check if the two nodes are in the same subsets. If they are, then this edge is redundant. If they are not, then we call union to make them as one subsets.

``````function findRedundantConnection(edges: number[][]): number[] {
const universe = Array.from({ length: edges.length }, (_, i) => i + 1);
const ds = new DisjointSet(universe);

for (let i = 0; i < edges.length; i++) {
const [u, v] = edges[i];
// union returns false means they were already connected
if (!ds.union(u, v)) {
return edges[i];
}
}

return [];
};

// console.log(findRedundantConnection([[1, 2], [1, 3], [2, 3]]));
``````