How to Check If a List Has Duplicate Elements in Java

Introduction

In this lab, you will learn how to efficiently check if a Java List contains duplicate elements. We will explore a common and effective technique using the HashSet data structure.

You will first implement a method that relies on the fact that a HashSet stores each element at most once: it iterates through the list, adds each element to the set, and reports a duplicate as soon as it meets an element the set already contains. You will then learn an alternative approach that compares the size of the original list with the size of a HashSet built from the list's elements. Finally, you will test your implementation against edge cases, including null and empty lists, to make it more robust.

Use HashSet for Duplicate Detection

In this step, we will explore how to use a HashSet in Java to efficiently detect duplicate elements within a collection. HashSet is part of the Java Collections Framework and is particularly useful for storing unique elements.

First, let's create a new Java file named DuplicateDetector.java in your ~/project directory. You can do this using the WebIDE's File Explorer on the left. Right-click in the ~/project area, select "New File", and type DuplicateDetector.java.

Now, open the DuplicateDetector.java file in the Code Editor and add the following code:

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DuplicateDetector {

    public static boolean containsDuplicates(List<String> list) {
        // Create a HashSet to store unique elements
        Set<String> uniqueElements = new HashSet<>();

        // Iterate through the list
        for (String element : list) {
            // If the element is already in the HashSet, it's a duplicate
            if (uniqueElements.contains(element)) {
                return true; // Found a duplicate
            }
            // Otherwise, add the element to the HashSet
            uniqueElements.add(element);
        }

        // If the loop finishes without finding duplicates, return false
        return false;
    }

    public static void main(String[] args) {
        // Example usage
        List<String> myListWithDuplicates = new ArrayList<>();
        myListWithDuplicates.add("apple");
        myListWithDuplicates.add("banana");
        myListWithDuplicates.add("apple"); // Duplicate
        myListWithDuplicates.add("orange");

        List<String> myListWithoutDuplicates = new ArrayList<>();
        myListWithoutDuplicates.add("grape");
        myListWithoutDuplicates.add("mango");
        myListWithoutDuplicates.add("kiwi");

        System.out.println("List with duplicates: " + myListWithDuplicates);
        System.out.println("Contains duplicates? " + containsDuplicates(myListWithDuplicates)); // Expected: true

        System.out.println("\nList without duplicates: " + myListWithoutDuplicates);
        System.out.println("Contains duplicates? " + containsDuplicates(myListWithoutDuplicates)); // Expected: false
    }
}

Let's understand the key parts of this code:

import java.util.ArrayList;, import java.util.HashSet;, import java.util.List;, import java.util.Set;: These lines import the necessary classes from the Java Collections Framework.
public static boolean containsDuplicates(List<String> list): This is a method that takes a List of String objects as input and returns true if it contains duplicates, and false otherwise.
Set<String> uniqueElements = new HashSet<>();: This creates an empty HashSet called uniqueElements. HashSet is designed to store only unique elements.
for (String element : list): This loop iterates through each element in the input list.
if (uniqueElements.contains(element)): This checks if the current element is already present in the uniqueElements HashSet. If it is, it means we've found a duplicate, and the method returns true.
uniqueElements.add(element);: If the element is not already in the HashSet, it's added. Because HashSet only stores unique elements, adding an element that is already present has no effect.
return false;: If the loop completes without finding any duplicates, the method returns false.
The main method demonstrates how to use the containsDuplicates method with example lists.
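As noted above, adding an element that is already present leaves the set unchanged. Set.add also reports this through its boolean return value, so the separate contains() call can be folded into the add() call. The following is a minimal sketch of that variant, not a replacement for the lab's code; the class name AddReturnValueDetector and the variable name seen are hypothetical and chosen only for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical class name; a variant of the lab's containsDuplicates method
class AddReturnValueDetector {

    static boolean containsDuplicates(List<String> list) {
        Set<String> seen = new HashSet<>();
        for (String element : list) {
            // Set.add returns false when an equal element is already in the set
            if (!seen.add(element)) {
                return true; // Stop at the first duplicate
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsDuplicates(Arrays.asList("apple", "banana", "apple"))); // true
        System.out.println(containsDuplicates(Arrays.asList("grape", "mango", "kiwi")));   // false
    }
}
```

The behavior is the same as the contains-then-add version; the only difference is that each element costs one set operation instead of two.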

Save the DuplicateDetector.java file (Ctrl+S or Cmd+S).

Now, let's compile and run this program in the Terminal. Make sure you are in the ~/project directory.

Compile the code:

javac DuplicateDetector.java

If there are no compilation errors, you will see no output.

Now, run the compiled code:

java DuplicateDetector

You should see output similar to this:

List with duplicates: [apple, banana, apple, orange]
Contains duplicates? true

List without duplicates: [grape, mango, kiwi]
Contains duplicates? false

This output confirms that our containsDuplicates method correctly distinguishes the list with duplicates from the list without them. Using a HashSet is efficient because both contains() and add() run in constant time on average, so the whole check takes time roughly proportional to the length of the list, and the loop stops as soon as the first duplicate is found.
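For contrast, here is a sketch of the brute-force alternative that the HashSet avoids. It compares every pair of elements, which in the worst case means about n(n-1)/2 comparisons for a list of n elements, whereas the HashSet version performs about n set operations. The class name NestedLoopDetector and the method name containsDuplicatesQuadratic are hypothetical and are not part of the lab's code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical baseline for comparison; not part of DuplicateDetector.java
class NestedLoopDetector {

    static boolean containsDuplicatesQuadratic(List<String> list) {
        for (int i = 0; i < list.size(); i++) {
            // Compare element i with every later element
            for (int j = i + 1; j < list.size(); j++) {
                // Objects.equals is null-safe, unlike calling equals on a possibly null element
                if (Objects.equals(list.get(i), list.get(j))) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsDuplicatesQuadratic(Arrays.asList("apple", "banana", "apple", "orange"))); // true
        System.out.println(containsDuplicatesQuadratic(Arrays.asList("grape", "mango", "kiwi")));             // false
    }
}
```

For a handful of elements the difference is negligible, but the pairwise version slows down quickly as the list grows.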

Compare List Size with Set Size

In the previous step, we used a HashSet to check for duplicates by iterating through the list and adding elements to the set. A more concise way to detect duplicates is to compare the size of the original list with the size of a HashSet created from that list. Note that this version always processes the entire list, whereas the loop in the previous step can stop at the first duplicate it finds.

Remember that a HashSet only stores unique elements. If a list contains duplicates, the size of a HashSet created from that list will be smaller than the size of the original list. If there are no duplicates, the sizes will be the same.
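To see the size difference directly, the following small sketch prints both sizes for the fruit list used earlier. The class name SizeComparisonDemo and the helper uniqueCount are hypothetical names introduced only for this illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

// Hypothetical demo class; not part of DuplicateDetector.java
class SizeComparisonDemo {

    // Number of distinct elements, as counted by a HashSet built from the list
    static int uniqueCount(List<String> list) {
        return new HashSet<>(list).size();
    }

    public static void main(String[] args) {
        List<String> fruits = Arrays.asList("apple", "banana", "apple", "orange");

        System.out.println("List size: " + fruits.size());       // List size: 4
        System.out.println("Set size:  " + uniqueCount(fruits)); // Set size:  3
    }
}
```

The list holds four elements but the set holds only three, because the second "apple" is dropped when the set is built.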

Let's modify our DuplicateDetector.java file to implement this approach. Open ~/project/DuplicateDetector.java in the Code Editor.

Replace the containsDuplicates method with the following code:

    public static boolean containsDuplicates(List<String> list) {
        // Create a HashSet from the list
        Set<String> uniqueElements = new HashSet<>(list);

        // Compare the size of the list with the size of the HashSet
        return list.size() != uniqueElements.size();
    }

Here's what's happening in the new code:

Set<String> uniqueElements = new HashSet<>(list);: This line directly creates a HashSet and initializes it with all the elements from the input list. The HashSet automatically handles the uniqueness, so any duplicate elements from the list will not be added to the set.
return list.size() != uniqueElements.size();: This line compares the number of elements in the original list (list.size()) with the number of unique elements in the HashSet (uniqueElements.size()). If the sizes are different (!=), it means there were duplicates in the list, and the method returns true. If the sizes are the same, there were no duplicates, and the method returns false.

The main method can remain the same as it already calls the containsDuplicates method.

Save the DuplicateDetector.java file (Ctrl+S or Cmd+S).

Now, let's compile and run the modified program. Make sure you are in the ~/project directory in the Terminal.

Compile the code:

javac DuplicateDetector.java

Run the compiled code:

java DuplicateDetector

You should see the same output as before:

List with duplicates: [apple, banana, apple, orange]
Contains duplicates? true

List without duplicates: [grape, mango, kiwi]
Contains duplicates? false

This confirms that our new, simpler method for detecting duplicates using the size comparison works correctly. This approach is more concise than iterating and checking for containment one by one. It is not necessarily faster, though: it always copies every element into the set, while the loop-based version can return as soon as it finds the first duplicate.
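The size comparison does not depend on the elements being strings. As a sketch under that observation, the method can be made generic so that it accepts a collection of any element type. The class name GenericDuplicateDetector is hypothetical, and the approach assumes the element type implements equals() and hashCode() consistently, as String and Integer do.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;

// Hypothetical generic variant; not part of the lab's DuplicateDetector.java
class GenericDuplicateDetector {

    // Works for any element type with consistent equals() and hashCode()
    static <T> boolean containsDuplicates(Collection<T> items) {
        return new HashSet<>(items).size() != items.size();
    }

    public static void main(String[] args) {
        System.out.println(containsDuplicates(Arrays.asList(1, 2, 3, 2)));        // true
        System.out.println(containsDuplicates(Arrays.asList("Apple", "apple")));  // false
    }
}
```

The second call returns false because String equality is case-sensitive, so "Apple" and "apple" count as two distinct elements.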

Test with Null and Empty Lists

In real-world programming, it's important to consider edge cases, such as when a list might be empty or even null. Our current containsDuplicates method works well for lists with elements, but what happens if we pass an empty list or a null list?

Let's test this by adding more examples to our main method in ~/project/DuplicateDetector.java. Open the file in the Code Editor and add the following lines to the main method, after the existing code:

        // Use one list instance for both the printout and the check
        List<String> emptyList = new ArrayList<>();
        System.out.println("\nEmpty list: " + emptyList);
        System.out.println("Contains duplicates? " + containsDuplicates(emptyList)); // Expected: false

        List<String> nullList = null;
        System.out.println("\nNull list: " + nullList);
        // The following line would throw a NullPointerException at run time until we add a null check
        // System.out.println("Contains duplicates? " + containsDuplicates(nullList));

Save the file (Ctrl+S or Cmd+S).

Now, compile and run the program again.

Compile:

javac DuplicateDetector.java

Run:

java DuplicateDetector

You should see the output for the empty list, followed by the line that prints the null reference itself:

List with duplicates: [apple, banana, apple, orange]
Contains duplicates? true

List without duplicates: [grape, mango, kiwi]
Contains duplicates? false

Empty list: []
Contains duplicates? false

Null list: null

The output for the empty list is correct; an empty list does not contain duplicates.

However, if you uncomment the line System.out.println("Contains duplicates? " + containsDuplicates(nullList)); the program still compiles, but running it throws a NullPointerException. This happens because the HashSet constructor that takes a collection does not accept null, and here we are passing it a null list.
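The failure can be reproduced in isolation. The sketch below passes a null list to the HashSet constructor inside a try block and catches the resulting exception; the class name NullListDemo and the method name constructorRejectsNull are hypothetical names used only here.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class; not part of DuplicateDetector.java
class NullListDemo {

    // Returns true if building a HashSet from a null list throws NullPointerException
    static boolean constructorRejectsNull() {
        List<String> nullList = null;
        try {
            Set<String> unused = new HashSet<>(nullList);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("Constructor rejects null: " + constructorRejectsNull()); // Constructor rejects null: true
    }
}
```

Catching NullPointerException is done here only to demonstrate the behavior; in the lab's method, checking for null before building the set is the cleaner fix.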

To make our containsDuplicates method more robust, we should handle the case where the input list is null. We can add a check at the beginning of the method.

Modify the containsDuplicates method in ~/project/DuplicateDetector.java to include a null check:

    public static boolean containsDuplicates(List<String> list) {
        // Handle null input
        if (list == null) {
            return false; // A null list does not contain duplicates
        }

        // Create a HashSet from the list
        Set<String> uniqueElements = new HashSet<>(list);

        // Compare the size of the list with the size of the HashSet
        return list.size() != uniqueElements.size();
    }

Now, uncomment the line that tests the null list in the main method:

        List<String> nullList = null;
        System.out.println("\nNull list: " + nullList);
        System.out.println("Contains duplicates? " + containsDuplicates(nullList)); // Expected: false

Save the file (Ctrl+S or Cmd+S).

Compile and run the program one last time.

Compile:

javac DuplicateDetector.java

Run:

java DuplicateDetector

The output should now include the result for the null list without crashing:

List with duplicates: [apple, banana, apple, orange]
Contains duplicates? true

List without duplicates: [grape, mango, kiwi]
Contains duplicates? false

Empty list: []
Contains duplicates? false

Null list: null
Contains duplicates? false

By adding the null check, our containsDuplicates method is now more robust and can handle null input gracefully. This is an important practice in programming to prevent unexpected errors.
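Returning false for a null list is one possible policy. Another is to treat a null list as a programming error and fail immediately with a clear message. The sketch below illustrates that alternative with Objects.requireNonNull, and also shows that null elements inside a non-null list are handled, since a HashSet can hold one null element. The class name StrictDuplicateDetector and the error message are hypothetical; which policy fits depends on how the calling code is expected to behave.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;

// Hypothetical stricter variant; not part of the lab's DuplicateDetector.java
class StrictDuplicateDetector {

    static boolean containsDuplicates(List<String> list) {
        // Fail fast with a descriptive message instead of returning false
        Objects.requireNonNull(list, "list must not be null");
        return new HashSet<>(list).size() != list.size();
    }

    public static void main(String[] args) {
        // Two null elements collapse into one in the set, so they count as duplicates
        System.out.println(containsDuplicates(Arrays.asList("a", null, null))); // true
        System.out.println(containsDuplicates(Arrays.asList("a", null)));       // false

        try {
            containsDuplicates(null);
        } catch (NullPointerException e) {
            System.out.println("Rejected: " + e.getMessage()); // Rejected: list must not be null
        }
    }
}
```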

Summary

In this lab, we learned how to check if a Java List contains duplicate elements. We explored the use of a HashSet for efficient duplicate detection. By iterating through the list and attempting to add each element to a HashSet, we can quickly determine if an element is already present, indicating a duplicate.

We also learned an alternative method by comparing the size of the original list with the size of a HashSet created from the list. If the sizes are different, it signifies the presence of duplicates. Finally, we considered edge cases by testing the methods with null and empty lists to ensure robustness.