Contains Duplicate
Problem Description
Given an integer array nums, return true if any value appears at least twice in the array, and return false if every element is distinct.
Examples:
Input: nums = [1, 2, 4, 3, 1].
Output: true
Explanation: Here element 1 occurs two times at index 0 and 4.
Input: nums = [4, 2, 6]
Output: false
Explanation: all elements are distinct.
Input: nums = [3, 2, 3, 4, 2, 1, 5]
Output: true
Explanation: here all elements are not distinct because 3 and 2 occur more than 1.
Constraints:
1 ≤ nums.length ≤ 105
-109 ≤ nums[i] ≤ 109
Brute force approach:
In the brute-force method, we compare each element with every other element in the array. If any element is found to appear more than once, we return true. If no duplicates are found, we return false.
Let Understand with an example
Consider the array nums = [3, 1, 2, 1].
Start with the first element (index i = 0), which is 3.
- Compare nums[0] (which is 3) with all elements that come after it:
- Compare nums[0] with nums[1] (which is 1). They are not the same.
- Compare nums[0] with nums[2] (which is 2). They are not the same.
- Compare nums[0] with nums[3] (which is 1). They are not the same.
- Compare nums[1] (which is 1) with all elements that come after it:
- Compare nums[1] with nums[2] (which is 2). They are not the same.
- Compare nums[1] with nums[3] (which is 1). They are the same!
Since we found a match (value 1 at index 1 and index 3), we return true immediately because at least one value appears more than once in the array.
How to Code it up?
- Step 1: Loop through each element of the array using a loop (let's call it i).
- Step 2: For each element i, start another loop (let's call it j) that checks the elements after i (i.e., from i+1 to the n where n is the size of the nums array).
- Step 3: Compare the element at index i with the element at index j.
- Step 4: If any element at index j matches the element at i, return true because we found a duplicate.
- Step 5: If the loops finish without finding any duplicates, return false because all elements are distinct.
Code Implementation
C++
class Solution
{
public:
bool containsDuplicate(vector<int> &nums)
{
// Loop through each element of the array
for (int i = 0; i < nums.size(); i++)
{
// Compare the current element with every element after it
for (int j = i + 1; j < nums.size(); j++)
{
// If a duplicate is found, return true
if (nums[i] == nums[j])
{
return true;
}
}
}
// If no duplicates are found, return false
return false;
}
};
Java
class Solution {
public boolean containsDuplicate(int[] nums) {
// Loop through each element of the array
for (int i = 0; i < nums.length; i++) {
// Compare the current element with every element after it
for (int j = i + 1; j < nums.length; j++) {
// If a duplicate is found, return true
if (nums[i] == nums[j]) {
return true;
}
}
}
// If no duplicates are found, return false
return false;
}
}
Python
class Solution:
def containsDuplicate(self, nums):
# Loop through each element of the array
for i in range(len(nums)):
# Compare the current element with every element after it
for j in range(i + 1, len(nums)):
# If a duplicate is found, return True
if nums[i] == nums[j]:
return True
# If no duplicates are found, return False
return False
Javascript
var containsDuplicate = function(nums) {
// Loop through each element of the array
for (let i = 0; i < nums.length; i++) {
// Compare the current element with every element after it
for (let j = i + 1; j < nums.length; j++) {
// If a duplicate is found, return true
if (nums[i] === nums[j]) {
return true;
}
}
}
// If no duplicates are found, return false
return false;
};
Time Complexity : O(n²)
Reason: This approach runs two nested loops to check whether any element has occurred at least twice where the outer loop runs n times(ie. i ranges from 0 to n-1) and the inner loop(ie j ranges from i+1 to n-1) also run n-i-1 items.
Analyzing the Loops:
Outer Loop (indexed by i): Runs from 0 to n-1. So, it iterates n times.
Inner Loop (indexed by j): For each fixed i, j runs from i+1 to n-1.When i = 0, j runs n-1 times.When i = 1, j runs n-2 times.When i = 2, j runs n-3 times.…When i = n-1, j runs 0 times.
Total Iterations: To find the total number of iterations of the inner loop across all iterations of the outer loop, we sum up the iterations for each value of i: ∑ (n−i) i=0 to n−1.
This is the sum of an arithmetic series: n-1+(n−2)+(n−3)+…+0.
If we closely observe and reverse the series we get 0 + 1 + 2 + 3 + … + (n-2) + (n-1).
That is nothing but the sum of (n-1) terms. So the sum of n numbers from 1 to n is (n*(n-1))/2.
Thus, the total number of operations is (n*(n-1))/2.
Simplifying to Big-O Notation:
In Big-O notation, we focus on the term that grows the fastest as n increases, and we ignore constant factors and lower-order terms. Therefore: O((n*(n-1))/2) ≈ O(n²)
Conclusion: The time complexity of the given code is O(n²). This is because the nested loops result in a quadratic number of total operations, which dominate the runtime as the input size n becomes large.
Space Complexity: O(1)
Auxiliary Space: We need some local i and j pointers to traverse the array, which is constant. Therefore, the space complexity is constant O(1).
Total Space Complexity: O(n)
Explanation: While the algorithm itself uses only a few variables (like i and j), we still need to store the input array of size n. This means the total space complexity is at least O(n) due to the input.
So, the overall space complexity is dominated by the input array, making it O(n).
Will Brute Force Work Against the Given Constraints?
The time complexity of O(n²) becomes a problem, especially when n can go up to 10⁵. In this scenario, the number of operations would be approximately 10⁵ × 10⁵ = 10¹⁰, which is a very large number of iterations because most competitive programming environments can handle around 10⁷ to 10⁸ operations within the given time limits.
This can cause significant delays, especially in competitive programming, where solutions are expected to run within strict time limits.
In competitive programming environments, efficiency is critical. A solution with 10¹⁰ operations is likely to exceed the time limits set by the problem.
As a result, even though the brute force approach will eventually give the correct answer, it isn't fast enough to handle the larger input sizes effectively. It’s too slow for the upper limits of the problem's constraints.
However, if the constraint for n were smaller, say n ≤ 10³, then the number of operations would be 10³ × 10³ = 10⁶. This is far more manageable, as 10⁶ operations are generally acceptable within the typical time limits of competitive programming (often around 1-2 seconds).
So while the brute force method works, its inefficiency makes it unsuitable for larger inputs. It can handle small constraints well, but as n increases, the need for a more optimized solution becomes clear in order to avoid time limit exceed (TLE) errors.
If your interviewer is not satisfied with this brute-force approach and tells you to optimize it.
Can we optimize it?
So in the question, we are finding whether any element is occurring at least twice or not. What if we sort the array, then any duplicate elements will appear next to each other.
This means that instead of comparing each element with every other element (as in the brute force method), we can just compare each element with the one next to it after sorting.
Sorting organizes the array in ascending order, placing elements in a specific sequence. Therefore, if two elements are the same, they will be neighbors in the sorted array.
Let understand with an example
Consider the array nums = [4, 2, 3, 1, 3].
Sort the array: After sorting, the array becomes nums = [1, 2, 3, 3, 4]. Loop through the sorted array and compare adjacent elements:
Start with i = 0: Compare nums[0] (which is 1) with nums[1] (which is 2) they are not the same.
Move to i = 1: Compare nums[1] (which is 2) with nums[2] (which is 3) they are not the same.
Move to i = 2: Compare nums[2] (which is 3) with nums[4] (which is 3) they are the same, so we found a duplicate. We return true here
How to code it up?
- Step 1: Sort the array
- Step 2: Loop through 0 to n-2(inclusive) and compare each element i to the next one (i+1) of it.
- Step 3: If two adjacent elements are the same nums[i]==nums[i+1], we immediately know there is a duplicate, so return true. If we can’t find any duplicate after iterating through the array, then we return false.
Code Implementation
C++
// Class Solution with the contains duplicate function
class Solution
{
public:
bool containsDuplicate(vector<int> &nums)
{
// Sort the array
sort(nums.begin(), nums.end());
// Loop through the array from 0 to n-2 and check adjacent elements
for (int i = 0; i <= nums.size() - 2; i++)
{
if (nums[i] == nums[i + 1])
{
// If two adjacent elements are equal, return true
return true;
}
}
// If no duplicates are found, return false
return false;
}
};
Java
class Solution {
public boolean containsDuplicate(int[] nums) {
// Sort the array
Arrays.sort(nums);
// Loop through the array and check adjacent elements
for (int i = 0; i <= nums.length - 2; i++) {
if (nums[i] == nums[i + 1]) {
// If two adjacent elements are equal, return true
return true;
}
}
// If no duplicates are found, return false
return false;
}
}
Python
class Solution:
def containsDuplicate(self, nums):
# Sort the list
nums.sort()
# Loop through the list and check adjacent elements
for i in range(len(nums) - 1):
if nums[i] == nums[i + 1]:
# If two adjacent elements are equal, return True
return True
# If no duplicates are found, return False
return False
Javascript
var containsDuplicate = function (nums) {
// Sort the array
nums.sort((a, b) => a - b);
// Loop through the array and check adjacent elements
for (let i = 0; i <= nums.length - 2; i++) {
if (nums[i] === nums[i + 1]) {
// If two adjacent elements are equal, return true
return true;
}
}
// If no duplicates are found, return false
return false;
};
Time Complexity: O(n*log(n))
In this approach, we are sorting the array with the STL sort function which will take O(n*log(n)) time to sort the array also we are checking for whether an adjacent element is the same or not which will take O(n) time so overall time complexity will be O(n*log(n)) + O(n) after simplifying it is same as O(n*log(n)).
Space Complexity: O(1)
Auxiliary Space: Here We don’t use any extra space to store anything we just use i pointer to iterate the array which is constant so space complexity will be O(1).
Total Space Complexity: O(n)
Explanation: While the algorithm itself uses only a few variables (like i and j), we still need to store the input array of size n. This means the total space complexity is at least O(n) due to the input.
So, the overall space complexity is dominated by the input array, making it O(n).
Does this work for the given constraints?
Here n can go up to 105 so time complexity is O(n*log(n)). So in worst case it will be O(105 * log(105)) = O(105 * 16) = O(106). In most competitive programming environments can handle at most 10⁷ to 10⁸ operations so it works here with O(n*log(n)) time complexity.
What if the interviewer again asked to optimize it? now what comes to your mind?
Final Optimization
In the problem, we are trying to check if any value in the array appears more than once. In other words, we want to identify if any number occurs at least twice.
What if we have some mechanism or data structure that will store the frequency of each element then we can simply check that if any element's frequency is greater than 1 then return true else false.
Yeah, to store the frequency of each element which Data structure can we use?
A map (also known as a hash map or unordered_map) data structure.
Why map and not vector to store frequency?
Because in the question we have elements from -109 to 109 and the vector wouldn't be practical because we'd need to allocate a huge amount of memory which is not possible because at max vector can allocate memory for numbers till 106 to 107 and we have negative numbers also.
What is a map?
A map (or hash map) is a data structure that stores data in pairs, where each pair consists of a {key, value}. You can think of it like a dictionary: for each unique key, there is an associated value. It helps us quickly look up, add, or update values based on keys.
A map can come in two types: unordered map and map.
An unordered map doesn't keep its keys in any specific order(ie. Sorted based on key value) and uses a hash table to store key-value pairs. This allows it to perform operations like inserting, looking up, or deleting elements very quickly in constant average Time complexity O(1). although in rare cases, it can be slower due to hash collisions.
A map(ordered map) keeps the keys in sorted order based on the key and uses a balanced binary search tree (like a Red-Black tree) to store the pairs. This means it takes O(log n), for operations like insertion, lookup, or deletion (where n is the size of the inserted data), but it ensures the keys are always in a sorted sequence.
In the context of this question, a map allows us to track how many times each element appears in the array and we don’t use unorderd_map here because when your constraints are fit with map log(n) time complexity it's always recommended to use a map.
We don’t use unordered_map because sometimes in unordered_map can be slower due to hash collisions and take O(n) Time complexity to look up and insert.
Approach
First, create a map to store the frequency of each element in the array. The keys of the map will be the elements from the array, and the values will be how many times each element has appeared like this {element, frequency}.
Now, go through each element of the array one by one and increase its count. After it iterates through the map pair {key, value} and check if any key (element) has a value (frequency) greater than 1.
If any element has a frequency greater than 1 then we can simply return true because we found a duplicate element if we can't find any element that has a count greater than 1 then we can return false.
Let Understand with an example
Example: nums = [0, 3, 1, 4, 2, 3]
Step 1: Create a Map
We start by creating an empty map (dictionary) to store the count of each element in the form {element: frequency}.
Step 2: Update the Map for Each Element
We loop through each element in the array and update the map as follows:
- First element (0): Add to the map → {0: 1}
- Second element (3): Add to the map → {0: 1, 3: 1}
- Third element (1): Add to the map → {0: 1, 3: 1, 1: 1}
- Fourth element (4): Add to the map → {0: 1, 3: 1, 1: 1, 4: 1}
- Fifth element (2): Add to the map → {0: 1, 3: 1, 1: 1, 4: 1, 2: 1}
- Sixth element (3): Already in the map, so increment its count → {0: 1, 3: 2, 1: 1, 4: 1, 2: 1}
Step 3: Check for Duplicates in the Map
Now, we check the counts in the map:
- {0: 1} → Frequency is 1, no duplicate.
- {3: 2} → Frequency is 2, indicating a duplicate!
At this point, we can stop and return true because element 3 appears more than once.
Result:
Element 3 is the duplicate, and the method returns true.
Code Implementation
C++
class Solution
{
public:
bool containsDuplicate(vector<int> &nums)
{
//Create a map
map<int, int> count;
// Loop through the array from 0 to n-1
for (int i = 0; i < nums.size(); i++)
{
//Increase the frequency of each element
count[nums[i]]++;
}
//Check if any pair {element , frequency} has frequency greater than 1.
for (auto it : count)
{
//If its frequency is greater than 1 so we found a duplicate
if (it.second > 1)
{
return true;
}
}
//Every element has a frequency of 1. No element is a duplicate here.
return false;
}
};
Java
class Solution {
public boolean containsDuplicate(int[] nums) {
// Create a map to store the frequency of elements
Map<Integer, Integer> count = new HashMap<>();
// Loop through the array
for (int i = 0; i < nums.length; i++) {
// Increase the frequency of each element
count.put(nums[i], count.getOrDefault(nums[i], 0) + 1);
}
// Check if any element has a frequency greater than 1
for (Map.Entry<Integer, Integer> entry : count.entrySet()) {
if (entry.getValue() > 1) {
return true; // Duplicate found
}
}
// No duplicates found
return false;
}
}
Python
class Solution:
def containsDuplicate(self, nums):
# Create a dictionary to store the frequency of elements
count = {}
# Loop through the list
for num in nums:
# Increase the frequency of each element
count[num] = count.get(num, 0) + 1
# Check if any element has a frequency greater than 1
for value in count.values():
if value > 1:
return True # Duplicate found
# No duplicates found
return False
Javascript
var containsDuplicate = function (nums) {
// Create a map to store the frequency of elements
const count = new Map();
// Loop through the array
for (let num of nums) {
// Increase the frequency of each element
count.set(num, (count.get(num) || 0) + 1);
}
// Check if any element has a frequency greater than 1
for (let [key, value] of count) {
if (value > 1) {
return true; // Duplicate found
}
}
// No duplicates found
return false;
};
Time Complexity: O(n)
Here we iterator through the nums array to increase its frequency which will take O(n) time and depend on the map if we use an unordered map it will take O(1) to insert and look up so overall time complexity will be O(n)*O(1)= O(n).
Space Complexity : O(n)
Auxiliary Space: We use a frequency map to store the count of each element in the array. In the worst-case scenario, where all elements in the array are unique, the map will store all n elements. Therefore, the auxiliary space required to store the frequency map is O(n).
Total Space Complexity: The total space includes the space for the input array and the auxiliary space for the frequency map. Since the input array already takes O(n) space, and the frequency map also takes an additional O(n) space, the total space complexity is O(n) + O(n) = O(n).
Thus, the auxiliary space complexity is O(n), and the total space complexity is also O(n).
In the previous optimal solution, we used a hashmap but we can use unorderd_set or set to see if any element is occurring twice or not which has the same Time Complexity and Space Complexity as the hashmap solution.
Learning Tip
Since we have solved the Contains Duplicate problem, let’s now tackle the following related problems:
Given an integer array nums and an integer k, return true if there are two distinct indices i and j in the array such that nums[i] == nums[j] and abs(i - j) <= k.
You are given a non-negative integer array nums. In one operation, you must:
- Choose a positive integer x such that x is less than or equal to the smallest non-zero element in nums.
- Subtract x from every positive element in nums.
Return the minimum number of operations to make every element in nums equal to 0.