Remove Duplicates From an Array in JavaScript
If you already understand the basics of JavaScript arrays, it's time to take your skills to the next level with more advanced topics. In this series of tutorials, you'll explore intermediate-level topics for programming with arrays in JavaScript.
In this post, we're going to deal with a commonly needed functionality for arrays in JavaScript: how to remove duplicates. There are multiple ways of removing array duplicates, and I'll show you each in this tutorial.
You'll learn how to remove duplicates from an array of primitives, and how to remove duplicates from an array of objects the right way.
Here's a summary of the methods we'll cover:
| Method | Benefits | Drawbacks |
|---|---|---|
| Set | O(n) performance | only works with primitive arrays |
| Array.reduce | simple implementation | O(n²) performance |
| Array.filter | simple implementation | O(n²) performance |
| Nested loops | doesn't use modern JavaScript features | O(n²) performance, more error-prone |
| Hashing | O(n) performance | only works with primitive arrays |
| Hashing with custom keys | O(n) performance, works with complex types | more complicated implementation |
Using Set to Remove Duplicates From an Array
This is one of our most-loved methods for removing duplicates from an array. First, you should know what a Set is. A Set is a data structure introduced in ES6 that can only store unique values: the moment you convert an array into a Set, all duplicates are removed. Using a Set to remove duplicates involves two stages:

- Create a new Set from the values of the array.
- Convert the Set back into an array. For this, you can use the spread operator or the Array.from function.
```javascript
const array = [1, 1, 2, 3, 2, 2, 1];

// Step 1: create a Set from the array; duplicates are dropped automatically
const newSet = new Set(array);

// Step 2: convert the Set back into an array with the spread operator...
const uniqueArray = [...newSet];
// ...or do both steps at once with Array.from
const uniqueArrayFrom = Array.from(new Set(array));

console.log(uniqueArray); // [1, 2, 3]
```
Time Complexity of Using Set
Creating a Set from an existing array has time complexity O(n), i.e. it is proportional to the length of the array. That's because real-world JavaScript engines implement Set with an efficient data structure such as a hash table. This is much faster than many other methods for removing duplicates from an array.
A Drawback of Using Set
You can use Set only on primitive values. It does not work when you want to remove duplicates from an array of objects, because two objects with the same contents are still distinct references. If you want an efficient method to remove duplicates from an array of objects, scroll to the bottom of this post.
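To see why, here's a quick sketch (the variable names are just for illustration): a Set compares objects by reference, so two structurally identical objects are both kept.

```javascript
// Two objects with identical contents are still distinct references,
// so the Set keeps both of them.
const points = [{ x: 1 }, { x: 1 }];
const deduped = [...new Set(points)];

console.log(deduped.length); // 2 — nothing was removed
```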
Using filter to Remove Duplicates From an Array
filter is one of the oldest methods for removing duplicates from an array. Before we look at how the filter works, you need to be aware of the indexOf method, which returns the first index at which a given item appears in an array. We will use this logic inside the filter callback.
- Iterate over every element in the array using the filter method. filter returns a new array containing only the elements for which a helper function returns true.
- filter calls the helper function with each item and its index. We use indexOf to check whether the item occurs earlier in the array: if indexOf returns a value different from the index passed to the helper function, the item appeared earlier.
- If indexOf finds an earlier instance of the item, return false from the helper function. This tells filter not to include the current item in the filtered array.
- Otherwise, the item is not a duplicate, and the helper function returns true.
```javascript
const array = [1, 2, 1, 4, 2, 3];

// Keep an item only if this index is its first occurrence in the array
const newArray = array.filter((item, index) => {
  return array.indexOf(item) === index;
});

console.log(newArray); // [1, 2, 4, 3]
```
Here is a table to help you understand how the logic works:

| item | index | indexOf(item) | Result |
|---|---|---|---|
| 1 | 0 | 0 | true |
| 2 | 1 | 1 | true |
| 1 | 2 | 0 | false |
| 4 | 3 | 3 | true |
| 2 | 4 | 1 | false |
| 3 | 5 | 5 | true |
Time Complexity of Using Filter
The time complexity of removing duplicates with filter is O(n²), because indexOf scans the array for every element. For short arrays, this is an easy way to remove duplicates, but as the array grows, it becomes much more expensive than using Set.
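A small variation worth knowing (not part of the original examples above): swapping indexOf for lastIndexOf keeps the last occurrence of each value instead of the first.

```javascript
const array = [1, 2, 1, 4, 2, 3];

// Keep an item only if this index is its *last* occurrence in the array
const lastUnique = array.filter((item, index) => {
  return array.lastIndexOf(item) === index;
});

console.log(lastUnique); // [1, 4, 2, 3]
```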
Using reduce to Remove Duplicates From an Array
Another interesting method for removing duplicates from an array is reduce. Here, we could use the indexOf function too, but instead we'll try another method: includes.
reduce can be rather tricky if you are unaware of how it works, so let me break down the process. Every reduce helper function receives an accumulator and the current item. At each step, we can choose to add the current item to the accumulator or skip it. In our case, the pseudo-code is as follows.
- Iterate through every item in the array using reduce.
- The helper function receives two parameters: the accumulator (the array of unique items) and the current item.
- If the accumulator already contains the current item, return the accumulator unmodified. We use the includes function to perform this check.
- Otherwise, add the item to the accumulator and return it.
```javascript
const array = [1, 1, 2, 4, 2, 3];

// Skip the item if the accumulator already contains it,
// otherwise return a copy of the accumulator with the item appended
const newArray = array.reduce((unique, item) => {
  return unique.includes(item) ? unique : [...unique, item];
}, []);

console.log(newArray); // [1, 2, 4, 3]
```
Time Complexity of Using Reduce
The time complexity of removing duplicates with reduce is O(n²). In general, running reduce over an array is O(n). However, we execute Array.includes, itself an O(n) operation, for each element inside the reduce helper function, which raises the overall time complexity to O(n²).
Using Nested Loops to Remove Duplicates From an Array
Now we are going to look at a classic approach that uses a forEach loop and the find function to remove duplicates.
Pseudo-Code for Using Nested Loops
- Create a new array called unique to store values without duplicates.
- Iterate through the input array using the forEach method.
- Inside each forEach iteration, use the find method to check whether the current item is already present in unique. If it is not, push it.
```javascript
const array = [1, 3, 4, 1, 2, 3, 4, 1];

function removeDuplicates(inputArray) {
  const unique = [];
  inputArray.forEach(item => {
    // find returns undefined when no matching item has been collected yet
    const isFound = unique.find(inputItem => item === inputItem);
    if (!isFound) {
      unique.push(item);
    }
  });
  return unique;
}

console.log(removeDuplicates(array)); // [1, 3, 4, 2]
```
Time Complexity of Using Nested Loops
The removeDuplicates function calls unique.find() for each element in the input array. Again, this makes the overall time complexity O(n²).
Using Hashes to Remove Duplicates From an Array
As you've seen, other than the Set technique, all the ways of removing duplicates from an array have O(n²) time complexity. Not great for larger data sets.
Now, we're going to do something interesting. Let's write a custom function that removes duplicates with a time complexity of O(n). We'll take advantage of the fact that JavaScript objects are effectively hash maps: the time complexity of accessing any value on an object is O(1). Here is the pseudo-code:
- Create a temporary object to track seen items and an array to collect unique items.
- Iterate through the array using forEach.
- Check whether the current item has been seen before by testing whether it exists as a key in the temporary object.
- If not, add it as a key and push the item into the unique array.
```javascript
function removeDuplicates(inputArray) {
  const unique = [];
  const obj = {};
  inputArray.forEach(item => {
    // Object property lookup is O(1); the item itself becomes the key
    if (!obj[item]) {
      unique.push(item);
      obj[item] = item;
    }
  });
  return unique;
}
```
Time Complexity of Using Hashes
The time complexity of removing duplicates this way is O(n), much better than the techniques that use only the array methods. Of course, just like with Set, this method only works on primitive values, not objects.
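One caveat with the truthiness check above: `!obj[item]` treats falsy items such as 0 or the empty string as "not seen yet" every time, so their duplicates slip through. A sketch of a safer variant (the function name is mine) checks key existence with the `in` operator instead:

```javascript
// Check whether the key exists at all, rather than whether its value
// is truthy, so falsy items like 0 and '' are deduplicated correctly.
// Note that object keys are strings, so 0 and '0' would still collide.
function removeDuplicatesSafe(inputArray) {
  const unique = [];
  const obj = {};
  inputArray.forEach(item => {
    if (!(item in obj)) {
      unique.push(item);
      obj[item] = true;
    }
  });
  return unique;
}

console.log(removeDuplicatesSafe([0, 0, 1, '', ''])); // [0, 1, '']
```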
Remove Duplicates From an Array of Objects
What if you need to remove duplicates from an array of objects? This is a common use case not addressed well by the other methods.
It's not hard to do, though. We'll use the efficient hash-based function above, with one small change:
```javascript
function removeDuplicatesByKey(inputArray, keyFunction) {
  const unique = [];
  const obj = {};
  inputArray.forEach(item => {
    // Derive a key for the item; items with the same key are duplicates
    const key = keyFunction(item);
    if (!obj[key]) {
      unique.push(item);
      obj[key] = item;
    }
  });
  return unique;
}
```
This uses the same hash-based test to determine whether each item is unique. But this time, we use a custom key to track the items: if two items have the same key, they are considered the same. The keyFunction parameter is what makes that possible. It's a helper function that takes an item and returns a key for that item.
But what should the key be? That depends on the specific data you're working with, but there are a couple of common options.
Using an Id Field as the Key
This is probably the most common solution. If the objects have a uniquely identifying field, such as an id, we can just use that as the key.
```javascript
const array = [
  { id: 1, name: 'a' },
  { id: 2, name: 'b' },
  { id: 1, name: 'a2' },
];

const idKey = (item) => item.id;
const uniqueArray = removeDuplicatesByKey(array, idKey);

console.log(uniqueArray); // [{ id: 1, name: 'a' }, { id: 2, name: 'b' }]
```
Using JSON.stringify as the Key
You can also use JSON.stringify(). This is a more generic solution, and it doesn't require the data items to have unique ids. The idea is to use the JSON representation of each item as its key. This effectively performs a deep comparison of each item's structure, and it will only recognize items as equal if they have the same structure and values.
```javascript
const array = [{ x: 1, y: 1 }, { x: 2, y: 1 }, { x: 1, y: 2 }, { x: 1, y: 1 }];

const jsonKey = (item) => JSON.stringify(item);
const uniqueArray = removeDuplicatesByKey(array, jsonKey);

console.log(uniqueArray); // [{ x: 1, y: 1 }, { x: 2, y: 1 }, { x: 1, y: 2 }]
```
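One limitation of this approach worth keeping in mind: JSON.stringify serializes properties in insertion order, so two objects with the same data but a different key order produce different JSON strings and would not be treated as duplicates.

```javascript
// Same data, different property order, different JSON keys —
// the JSON-key approach would treat these as distinct items.
console.log(JSON.stringify({ x: 1, y: 2 })); // '{"x":1,"y":2}'
console.log(JSON.stringify({ y: 2, x: 1 })); // '{"y":2,"x":1}'
```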
Conclusion
In this post, we saw a number of ways to remove duplicates from a JavaScript array. The methods that use hashing have much better performance when arrays get large. However, it's also possible to remove duplicates using only the built-in Array methods.
Finally, if you need to remove duplicates from an array of objects, you can define a key for each item and use that to filter duplicates out of the list.