Boyer–Moore majority vote algorithm

Let's solve the leetcode problem majority element.

The problem is to find the majority element which is the element that appears more than n/2 times.

The most intuitive solution is to use a map to store the count of all unique elements. With this method, we could achieve time complexity O(N) but also space complexity O(N).

There exists a few other solutions that make use of the feature of majority element. But this is not the point of this article.

The hard part of this problem is to find a solution with O(N) time complexity and also O(1) space complexity. There is an algorithm called Boyer–Moore majority vote algorithm which could achive this.

This algorithm is pretty easy to implement but not that easy to understand. I found that it is easier to understand if I check the code first then find intuition from it. So let's see the code first.

class Solution {
 public:
  int majorityElement(vector<int>& nums) {
    int count = 0;
    int v = 0;
    for (auto& num : nums) {
      if (count == 0) {
        v = num;
      }
      count += v == num ? 1 : -1;
    }
    return v;
  }
};

As you can see, the flow is pretty simple: if current count is 0, then we set v as current num. If v is equal to current num, then increment count by 1, else decrement count by 1.

Because count is initiazed as 0, so v should be set to the first num at the start. And later on v will be updated only when count is 0.

I found that the key to understand this algorithm is to understand that the process of decrement and increment of count is like canceling elements with each other.

Let's see a few examples.

Say we have array as [1,1,1,2,2,2,...], ... here means we have a bunch of element later that we don't care what. As you can see, we have first 3 elements 1, then v will be set to 1 and count will be incremented to 3 first. Then we have 3 elements 2, this means count will be decremented to 0 and v will be set to the value next. So this is like [1,1,1] is cancelling [2,2,2]. When this cancelling is done, what would happen then? The Answer is nothing. If we have a majority element in the array, then after cancelling we will still have the same element as majority element in the remaining array. So in other words, this cancelling is like making the problem smaller without affecting the final answer.

Say we have array as [1,1,1,2,3,4,...]. If we walk through the process like above, we would see that this is the same. The final answer does not change.

Say we have array as [1,2,1,2,1,2,1]. As you can see, the 6 elements would be cancelled with each other, and the last element would be the answer. Same thing happens if the last element is 2.

So we have the majority element in the array, because it occures more than n/2 times, so no matter how other elements calcel this majority element, the num stays in variable v should be the answer.

But if we don't have the majority element, this may not work. Since we may have multiple elements with equal occurrence, so we would not know which num will be stored after cancelling. In this case, we can simply make a second pass to check that is the v is the majority element. The time complexity still stays as O(N).

yaox023's blog

yaox023's blog

Boyer–Moore majority vote algorithm