Skip to main content

Hashing

First Missing Positive

Problem Description

Given an unsorted integer array nums. Return the smallest positive integer that is not present in nums.
For instance, in the array [1, 3, 5, 0, -1, 9], 1 is present, making 2 the smallest positive integer not present.

Example 1:

Input: nums = [3,4,-1,1]
Output: 2
Explanation: 1 is in the array but 2 is missing.

Example 2:

Input: nums = [7,8,9,11,12]
Output: 1
Explanation: The smallest positive integer 1 is missing.

Constraints:

1 <= nums.length <= 10⁵
-2³¹ <= nums[i] <= 2³¹ - 1

Brute Force Approach:

We need to find the smallest positive integer that isn't in the array, so we start by checking if 1 is present. If it is, we move on to check 2 and continue this process until we find the first missing positive number.

Now, Let’s Walk Through the Examples:

Let's dry-run the approach with the list [3, 4, -1, 1]:

  1. Check for 1: Is 1 in the list? Yes, it is.
  2. Check for 2: Is 2 in the list? No, it’s not.

Since 2 is the first missing positive number, that's the answer.

So, the smallest missing positive integer for [3, 4, -1, 1] is 2.

Here's another Example:

Here is another dry-run of the approach with the list [1, 2, 3, 4, 5]:

  1. Check for 1: Is 1 in the list? Yes, it is.
  2. Check for 2: Is 2 in the list? Yes, it is.
  3. Check for 3: Is 3 in the list? Yes, it is.
  4. Check for 4: Is 4 in the list? Yes, it is.
  5. Check for 5: Is 5 in the list? Yes, it is.
  6. Check for 6: Is 6 in the list? No, it’s not.

Since 6 is the first missing positive number, that's the answer.

So, the smallest missing positive integer for [1, 2, 3, 4, 5] is 6.

How to Code It Up?

Start by looping through positive integers, beginning from i=1.

For each positive integer, use a second loop to go through the entire array. 

Check if the current positive integer i is present in the array. 

If it's found, break the loop and move on to the next integer. 

However, if you traverse the entire array and don't find the integer, return i as the smallest positive number that isn't in the array.

Code Implementation
1. C++ Try on Compiler
class Solution {
public:
    int firstMissingPositive(vector<int>& nums) {

        // Start with 1 and increment
        for (int i = 1;; ++i) {

            bool found = false;
            for (int j = 0; j < nums.size(); ++j) {

                // Check if i is in the array
                if (nums[j] == i) {

                    // If found, break to check the next number
                    found = true;
                    break;
                }
            }
            if (!found) {

                // If i is not found, return it
                return i;
            }
        }
    }
};

2. Java Try on Compiler
class Solution {
    public int firstMissingPositive(int[] nums) {
        
        // Start with 1 and increment
        for (int i = 1;; ++i) {

            boolean found = false;
            for (int j = 0; j < nums.length; ++j) {

                // Check if i is in the array
                if (nums[j] == i) {

                    // If found, break to check the next number
                    found = true;
                    break;
                }
            }

            // If i is not found, return it
            if (!found) {
                return i;
            }
        }
    }
}

3. Python Try on Compiler
class Solution(object):
    def firstMissingPositive(self, nums):
        
        # Start with 1 and increment
        i = 1
        while True:

            found = False
            for num in nums:

                # Check if i is in the array
                if num == i:

                    # If found, break to check the next number
                    found = True
                    break


            # If i is not found, return it
            if not found:
                return i

            i += 1
        

4. Javascript Try on Compiler
var firstMissingPositive = function (nums) {
    
    // Start with 1 and increment
    for (let i = 1; ; ++i) {

        let found = false;
        for (let j = 0; j < nums.length; ++j) {

            // Check if i is in the array
            if (nums[j] === i) {

                // If found, break to check the next number
                found = true;
                break;
            }
        }

        // If i is not found, return it
        if (!found) {
            return i;
        }
    }
};

Time Complexity: O(n²) 

The inner loop always runs from j = 0 to j = n-1, which gives it a fixed time complexity of O(n). However, the outer loop’s iterations depend on the range of numbers present in the array. If all elements from 1 to n are present, the outer loop will run up to n + 1 because the smallest missing positive integer in this case would be n + 1. Otherwise, the loop will stop at the first missing positive number, which could be anywhere between 1 and n + 1.

In the best case, if the smallest missing number is near the start (like 1), the outer loop can terminate quickly (after 1 iteration only), making the total time complexity closer to O(n).

In the worst-case scenario where all numbers from 1 to n are present, the outer loop will iterate n + 1 times. Combining this with the inner loop's time complexity, the total time complexity becomes O(n) * O(n + 1), which simplifies to O(n²).

Space Complexity: O(n)

We are using only an input array of size n. 

Auxiliary Space: O(1)
Explanation: The auxiliary space is O(1) since the algorithm does not use any variables or additional data structures.

Total Space Complexity: O(n)
Explanation: Thus, the total space complexity is dominated by the input array, making it O(n).

Will Brute Force Work Against the Given Constraints? 

The time complexity of O(n²) becomes a problem, especially when n can go up to 10⁵. In this scenario, the number of operations would be approximately 10⁵ × 10⁵ = 10¹⁰, which is a very large number of iterations because most competitive programming environments can handle around 10⁷ to 10⁸ operations within the given time limits.

This can cause significant delays, especially in competitive programming, where solutions are expected to run within strict time limits.

In competitive programming environments, efficiency is critical. A solution with 10¹⁰ operations is likely to exceed the time limits set by the problem. 

As a result, even though the brute force approach will eventually give the correct answer, it isn't fast enough to handle the larger input sizes effectively. It’s too slow for the upper limits of the problem's constraints.

However, if the constraint for n were smaller, say n ≤ 10³, then the number of operations would be 10³ × 10³ = 10⁶. This is far more manageable, as 10⁶ operations are generally acceptable within the typical time limits of competitive programming (often around 1-2 seconds).

So while the brute force method works, its inefficiency makes it unsuitable for larger inputs. It can handle small constraints well, but as n increases, the need for a more optimized solution becomes clear in order to avoid time limit exceed (TLE) errors.

Can we optimize it?

In the brute force approach, we iterate through the entire array for each integer to check if it's present. 

However, what if we could determine the presence of an integer more efficiently?

Imagine a teacher in a classroom calling out each student's name during roll call. As the teacher reads out each name, they look through the class list and mark the students who are present. The students who remain unmarked by the end of the roll call are considered absent.

Let’s say the class list contains students numbered from 1 to 5:
Class List: [1, 2, 3, 4, 5]

Now, imagine that only students 1, 3, and 4 show up for class. The teacher starts calling out names:

"1" - Present! (marks student 1)

"2" - Not present. (does not mark)

"3" - Present! (marks student 3)

"4" - Present! (marks student 4)

"5" - Not present. (does not mark)

At the end of the roll call, the teacher checks their attendance register and finds that students 2 and 5 are absent. This method is efficient because the teacher only checks each student once. 

Can you think of something now?

Now, if we relate this back to finding the smallest missing positive integer from an array, we can think of using a hash table to track the presence of integers.

What is hash table?

A hash table can be thought of as an auxiliary array where each index stores data linked to a specific key. The hash function computes the index in this auxiliary array for a given key, allowing direct access to the stored value. This structure ensures fast operations since accessing an index in an array is an O(1) operation.

Since, we only need to know if a number exists, not associate it with any other value. Moreover, given the constraints of the problem, where integers can range from -2³¹ <= nums[i] <= 2³¹ - 1 meaning the numbers can be both negative and spread across a huge range. Building a hash table with such a large range would lead to inefficient memory usage, as we'd need an enormous table to account for all possible values.

So, how do we tackle this? 

Yes! we can think of using a hash set to track the presence of integers.

What is a hash set?

A hash set is a data structure that stores unique elements in an unordered manner, providing fast access, insertion, and deletion for an average time complexity of O(1). Hash sets are ideal for checking the presence of an element efficiently without worrying about duplicates or order. This is much faster than the brute force approach of scanning the entire array each time.

We create a hash set to keep track of which integers are present in the array.

As we traverse the array of integers, we add each positive integer to the hash set.

Once we have processed the entire array, we can then check from 1 onward to find the smallest positive integer that is not in the hash set.

This method is more efficient and straightforward than using a hash table, and it is well-suited for handling larger inputs while ensuring we find the smallest missing positive integer effectively.

Let's walk through this example:

Let's dry-run the approach of finding the smallest missing positive integer using the array [1, 8, 2, 11, 3].

Step 1: Store nums in a hash set:

set = {1, 8, 2, 11, 3}

Step 2: For each positive integer (1 to n+1), check if it exists in set:

  • 1: present in set
  • 2: present in set
  • 3: present in set
  • 4: not present in the set

return 4 as the answer as it is not present in the set

Thus 4 is the smallest positive integer not present in the array.

How to Code It Up?

Create a hash set to store the presence of integers.

As you iterate through the array, add each positive integer to the hash set.

After processing the array, iterate through the integers starting from 1 and check the hash set for the first positive integer that is not marked.

Code Implementation
1. C++ Try on Compiler
class Solution {
public:
    int firstMissingPositive(vector<int>& nums) {
        // Create a hash set to store the presence of numbers
        unordered_set<int> numSet;

        // Mark the presence of each positive integer in the hash set
        for (int num : nums) {
            if (num > 0) {
                numSet.insert(num);
            }
        }

        // Check for the smallest missing positive integer
        for (int i = 1;; ++i) {

            // If i is not found in the hash set
            if (numSet.find(i) == numSet.end()) {

                // Return the first missing positive integer
                return i;
            }
        }
    }
};

2. Java Try on Compiler
class Solution {
    public int firstMissingPositive(int[] nums) {
    
        // Create a hash set to store the presence of numbers
        HashSet<Integer> numSet = new HashSet<>();

        // Mark the presence of each positive integer in the hash set
        for (int num : nums) {
            if (num > 0) {
                numSet.add(num);
            }
        }

        // Check for the smallest missing positive integer
        for (int i = 1;; ++i) {

            // If i is not found in the hash set
            if (!numSet.contains(i)) {

                // Return the first missing positive integer
                return i;
            }
        }
    }
}

3. Python Try on Compiler
class Solution(object):
    def firstMissingPositive(self, nums):
        # Create a hash set to store the presence of numbers
        num_set = set()  

        # Mark the presence of each positive integer in the hash set
        for num in nums:
            if num > 0:
                num_set.add(num)

        # Check for the smallest missing positive integer
        i = 1
        while True:

            # If i is not found in the hash set
            if i not in num_set:  

               # Return the first missing positive integer
                return i  

            i += 1

4. Javascript Try on Compiler
var firstMissingPositive = function (nums) {

    // Create a hash set to store the presence of numbers
    const numSet = new Set();

    // Mark the presence of each positive integer in the hash set
    for (const num of nums) {
        if (num > 0) {
            numSet.add(num);
        }
    }

    // Check for the smallest missing positive integer
    let i = 1;
    while (true) {

        // If i is not found in the hash set
        if (!numSet.has(i)) {

            // Return the first missing positive integer
            return i;
        }
        i++;
    }
};

Time Complexity: O(n) 

As we iterate through the array from index i = 0 to i = n - 1, we add each positive integer to the hash set. This step has a time complexity of O(n) since we are processing each element of the array exactly once.

After this, we need to find the smallest missing positive integer by checking the hash set. In the worst case, when all elements are present in the array, the hash set can contain up to n elements. Therefore, iterating through the hash set results in another O(n) operation.

Since both of these loops operate independently of each other, we can combine their time complexities: O(n)+ O(n) = O(2n) which is equivalent to O(n).

Thus, the overall time complexity of our approach remains O(n). This efficiency allows us to handle larger arrays effectively without significant delays.

Space Complexity: O(n)

We use a hash set to store the count of each number in the array. In the worst case, if all elements are present, the hash set will contain n entries, where n is the size of the array. Therefore, the space required to hold these counts leads to a space complexity of O(n)

Consequently, the overall space complexity of our solution is O(n).

Can there be a better approach?

Yes, there’s definitely a smarter way to solve this problem—one that works in linear time and uses constant space

In the previous approach, we saw how using a hash set helps us find the first missing positive integer by marking whether an element is present. 

But what if we could improve space efficiency while keeping the time complexity manageable?

A common way to check if an element is present is by sorting the array and looking for the first mismatch in order (of element and index, like in hash table). However, sorting takes O(n*log⁡ n) time, which is slower than the hash set approach. But what if we could sort the array in O(n) time? 

That seems impossible, right? Actually, it’s not!

Here’s the trick: instead of sorting the entire array, we can place each element in its correct position based on its value. 

For example, the number 1 should be at index 1, 2 should be at index 2, and so on. If a number isn’t in its correct position, we swap it with the number currently in that position. 

We’ll keep swapping until the element is in its correct spot. But what about the number we swapped? It may also be in the wrong position, so we’ll swap it too, continuing this process until every number is either in the right place or cannot be placed.

Now, you might be thinking: What if an element doesn't belong in the array, like if it's less than 1 or greater than the array size? In that case, we just skip it because it can’t be part of the valid range of numbers we’re concerned with.

By the end of this process, all the valid numbers will be in their correct positions, and the ones that don’t belong will be ignored. 

At this point, we can easily find the smallest missing positive number by scanning the array once more. If an element isn’t where it should be, that’s the missing number.

But what if all elements would be in place?Then we would simply return n+1, because that would be our smallest positive element not present in the array.

This approach is efficient because we’re processing each element exactly once—either swapping it into the correct position or skipping it. 

So, the time complexity remains O(n), just like the previous method. And the best part? We’re doing all this in place, so the space complexity is O(1). [We have talked about this in detail in the Time Complexity section after the code.]

To sum it up, we traverse the array, and for every element in the range of 1 to n, we place it in its corresponding index. If it’s out of range, we skip it. Once the array is traversed, we can quickly find the smallest missing positive number by checking the positions. 

This method is both time-efficient and space-efficient!

This method is commonly referred to as Cyclic Sort!

In Cyclic Sort, we aim to place each number at its correct index (i.e., the number i should be at index i if the numbers range from 1 to n). If a number isn’t in the right place, we keep swapping it with the number at its target index until it either lands in the correct spot or is out of range. 

This approach ensures that all the elements are in their correct positions by the end of the process, with each element being moved at most once, making the time complexity O(n).

0:00
/2:08

Let's understand with visuals

Now, let’s see this in action with an example!

Let’s take an Array: [1, 3, 5, 0, -1, 9]

Firstly append 0 at the end of the array because indexing is 0-based, so the element n will not have a corresponding valid index because in 0-based indexing, indexes are from 0 to n-1.

Appending 0 will add one more index so we can put all elements in its place with ease.

Array: [1, 3, 5, 0, -1, 9, 0]

  1. Index 0 (value 1):1 should be at index 1, so we swap 1 with the element at index 1 (which is 3).
    Array after swap: [3, 1, 5, 0, -1, 9, 0]
    Now, 3 is at index 0. Which should be at index 3.
    Array after swap: [0, 1, 5, 3, -1, 9, 0]
    Now 0 is less than 1, so skip.
  2. Index 1 (value 1):1 is at the correct index, so skip.
    Array: [0, 1, 5, 3, -1, 9, 0]
  3. Index 2 (value 5):5 should be at index 5, so we swap 5 with the element at index 5 (which is 9).
    Array after swap: [0, 1, 9, 3, -1, 5, 0]
    Now 9, is greater than 6 (not between 1 and n), so skip.
  4. Index 3 (value 0):0 is out of range (not between 1 and n), so we skip and move on.
    Array: [0, 1, 9, 3, -1, 5, 0]
  5. Index 4 (value -1):-1 is out of range (not between 1 and n), so we skip and move on.
    Array: [3, 1, 9, 0, -1, 5, 0]
  6. Index 5 (value 5):5 is at correct index, so skip.
    Array: [0, 1, 9, 3, -1, 5, 0]

Final Array: [0, 1, 9, 3, -1, 5, 0]

Now, we traverse the array to find the first missing positive integer:

  • Index 1: 1 is correct.
  • Index 2: 9 is incorrect (should be 2), so the smallest missing positive integer is 2.

Result: The smallest missing positive integer is 2.

How to code it up?

Firstly,  insert 0 into the array to adjust 0-based indexing.

Traverse the array using a for-loop. For each element, check if it's in the correct position (i.e., if the element at index i should be i).

If it's not, and the element is within the valid range (between 1 and n), swap it with the element in its correct place.

Repeat this swapping process until the element is either in its correct position or out of range.

After the traversal, every element that belongs to the range from 1 to n will be in its correct place, or it will be out of range and ignored.

Now, traverse the array again. The first index that does not contain the correct value (i at index i) is the smallest missing positive integer.

If no missing integer is found in the second traversal, return n + 1, because that would be the smallest missing number.

Code Implementation
1. C++ Try on Compiler
class Solution {
public:
    int firstMissingPositive(vector<int>& nums) {
        int n = nums.size();

        // Add 0 to handle 0-based indexing and prevent out-of-bound issues
        nums.push_back(0);

        // First pass: Traverse the array and place each number in its correct
        // index
        for (int i = 0; i < n; i++) {

            // While the current number is within the valid range (1 to n) and
            // not already in its correct position, swap it with the number at
            // its correct index.
            while (nums[i] > 0 && nums[i] <= n && nums[i] != nums[nums[i]])
                swap(nums[i], nums[nums[i]]);
        }

        // Second pass: Traverse again to find the first index i where nums[i]
        // != i
        for (int i = 1; i <= n; i++) {
            if (nums[i] != i)

                // Return the first missing positive
                return i;
        }

        // If all numbers from 1 to n are in their correct places, return n+1
        return n + 1;
    }
};

2. Java Try on Compiler
class Solution {
    public int firstMissingPositive(int[] nums) {
        int n = nums.length;

        // Add 0 to handle 0-based indexing and prevent out-of-bound issues
        nums = Arrays.copyOf(nums, n + 1);
        nums[n] = 0;

        // First pass: Traverse the array and place each number in its correct index
        for (int i = 0; i < n; i++) {

            // While the current number is within the valid range (1 to n) and
            // not already in its correct position, swap it with the number at its correct
            // index.
            while (nums[i] > 0 && nums[i] <= n && nums[i] != nums[nums[i]]) {
                int temp = nums[i];
                nums[i] = nums[temp];
                nums[temp] = temp;
            }
        }

        // Second pass: Traverse again to find the first index i where nums[i] != i
        for (int i = 1; i <= n; i++) {
            if (nums[i] != i)

                // Return the first missing positive
                return i;
        }

        // If all numbers from 1 to n are in their correct places, return n+1
        return n + 1;
    }
}

3. Python Try on Compiler
class Solution(object):
    def firstMissingPositive(self, nums):
        n = len(nums)
        
        # Add 0 to handle 0-based indexing and prevent out-of-bound issues
        nums.append(0)
        
        # First pass: Traverse the array and place each number in its correct index
        for i in range(n):

            # While the current number is within the valid range (1 to n) and
            # not already in its correct position, swap it with the number at its correct index.

            while 0 < nums[i] <= n and nums[i] != nums[nums[i]]:
                nums[nums[i]], nums[i] = nums[i], nums[nums[i]]

        # Second pass: Traverse again to find the first index i where nums[i] != i
        for i in range(1, n + 1):
            if nums[i] != i:

                # Return the first missing positive
                return i
        
        # If all numbers from 1 to n are in their correct places, return n+1
        return n + 1

4. Javascript Try on Compiler
var firstMissingPositive = function (nums) {
    let n = nums.length;

    // Add 0 to handle 0-based indexing and prevent out-of-bound issues
    nums.push(0);

    // First pass: Traverse the array and place each number in its correct index
    for (let i = 0; i < n; i++) {

        // While the current number is within the valid range (1 to n) and
        // not already in its correct position, swap it with the number at its correct index.

        while (nums[i] > 0 && nums[i] <= n && nums[i] !== nums[nums[i]]) {
            let temp = nums[i];
            nums[i] = nums[temp];
            nums[temp] = temp;
        }
    }

    // Second pass: Traverse again to find the first index i where nums[i] != i
    for (let i = 1; i <= n; i++) {
        if (nums[i] !== i) {

            // Return the first missing positive
            return i;
        }
    }

    // If all numbers from 1 to n are in their correct places, return n+1
    return n + 1;
};

Time complexity: O(n)

As we go through the array, if an element is already in its correct spot or if it’s not within the valid range (like numbers that are too large or negative), we simply skip over it.

When an element is not in its correct position, we swap it with the element that should be there. Importantly, the inner while loop ensures that every swap puts an element into its correct spot. Because of this, each element only needs to be moved at most once to its correct place. So, across all the iterations of the outer loop, the inner loop will run a maximum of n times(since we have n elements) in total (irrespective of the outer loop, because in n iterations all elements would be in place or out of range, and outer loop will just skip the inner loop). It’s not doing extra work for every element individually.

The outer loop runs n times, and the inner loop, combined across all iterations, also runs a maximum of n times. So, the total work for swapping elements is O(n) + O(n) = O(2n).

After placing all elements in their correct positions, we do one more pass through the array to find the smallest missing positive number. This adds another O(n) to the time complexity.

In total, we have O(2n) for the swapping process and O(n) for finding the missing element. This gives us O(2n) + O(n) = O(3n). But since we ignore constants in Big-O notation, the final time complexity is simply O(n).

Space Complexity: O(n)

We are using only an input array of size n. 

Auxiliary Space: O(1)
Explanation: The auxiliary space is O(1) since the algorithm does not use any variables or additional data structures.

Total Space Complexity: O(n)
Explanation: Thus, the total space complexity is dominated by the input array, making it O(n).

Learning Tip

Now we have successfully tackled this problem, let's try these similar problems.

Given an array nums containing n distinct numbers in the range [0, n], return the only number in the range that is missing from the array.

Given an array of integers nums containing n + 1 integers where each integer is in the range [1, n] inclusive.

There is only one repeated number in nums, return this repeated number.

You must solve the problem without modifying the array nums and using only constant extra space.