How to Check If an Array Has Duplicate Elements in Java

Introduction

In this lab, you will learn how to check if an array has duplicate elements in Java using different approaches. We will start with a fundamental method using nested loops, which provides a clear understanding of the comparison process.

Next, we will explore a more efficient technique utilizing the HashSet data structure, demonstrating how to leverage Java collections for faster duplicate detection. Finally, we will examine how sorting the array can simplify the process of identifying duplicates. By the end of this lab, you will be equipped with multiple strategies for handling duplicate elements in Java arrays.

Use Nested Loops for Duplicates

In this step, we will explore a fundamental approach to finding duplicate elements within an array using nested loops in Java. This method is straightforward and easy to understand, making it a good starting point for learning about array manipulation and basic algorithm design.

First, let's create a new Java file named FindDuplicatesNested.java in your ~/project directory. You can do this directly in the WebIDE File Explorer by right-clicking in the project folder and selecting "New File", then typing the name.

Now, open the FindDuplicatesNested.java file in the Code Editor and add the following Java code:

public class FindDuplicatesNested {

    public static void main(String[] args) {
        int[] numbers = {1, 2, 3, 4, 2, 7, 8, 8, 3};

        System.out.println("Finding duplicate elements using nested loops:");

        // Use nested loops to compare each element with every other element
        for (int i = 0; i < numbers.length; i++) {
            for (int j = i + 1; j < numbers.length; j++) {
                // j starts at i + 1, so numbers[i] and numbers[j] are always two different positions
                if (numbers[i] == numbers[j]) {
                    System.out.println("Duplicate found: " + numbers[j]);
                }
            }
        }
    }
}

Let's break down this code:

int[] numbers = {1, 2, 3, 4, 2, 7, 8, 8, 3};: This line declares an integer array named numbers and initializes it with some values, including duplicates.
for (int i = 0; i < numbers.length; i++): This is the outer loop. It iterates through each element of the array using an index i.
for (int j = i + 1; j < numbers.length; j++): This is the inner loop. For each element at index i, it iterates through the remaining elements of the array starting from the element after index i. This is important to avoid comparing an element with itself and to avoid finding the same pair of duplicates twice (e.g., comparing index 1 with index 4 and then index 4 with index 1).
if (numbers[i] == numbers[j]): This condition checks if the element at index i is equal to the element at index j. If they are equal, it means we've found a duplicate.
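System.out.println("Duplicate found: " + numbers[j]);: If a duplicate is found, this line prints the repeated value. Because every matching pair is reported, a value that appears three times would be printed three times, once for each pair of positions.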

Save the file by pressing Ctrl + S (or Cmd + S on macOS).

Now, open the Terminal at the bottom of the WebIDE. Make sure you are in the ~/project directory. You can confirm this by typing pwd and pressing Enter. The output should be /home/labex/project.

Compile the Java code using the javac command:

javac FindDuplicatesNested.java

If there are no errors, the compilation will be successful, and a FindDuplicatesNested.class file will be created in the ~/project directory. You can verify this by typing ls and pressing Enter.

Finally, run the compiled Java program using the java command:

java FindDuplicatesNested

You should see the header line followed by three reports, one per matching pair: Duplicate found: 2, Duplicate found: 3, and Duplicate found: 8.
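The program above prints every matching pair. If only a yes/no answer is needed, as the lab title suggests, the same nested loops can return as soon as the first match is found. The following is a minimal sketch; the class name HasDuplicateNested and the helper hasDuplicate are hypothetical names chosen for illustration and are not part of the lab files.

```java
public class HasDuplicateNested {

    // Hypothetical helper: returns true as soon as any two positions hold the same value.
    public static boolean hasDuplicate(int[] values) {
        for (int i = 0; i < values.length; i++) {
            for (int j = i + 1; j < values.length; j++) {
                if (values[i] == values[j]) {
                    return true; // one match is enough to answer the question
                }
            }
        }
        return false; // every pair was compared and none matched
    }

    public static void main(String[] args) {
        int[] numbers = {1, 2, 3, 4, 2, 7, 8, 8, 3};
        int[] distinct = {1, 2, 3, 4};
        System.out.println(hasDuplicate(numbers));  // true
        System.out.println(hasDuplicate(distinct)); // false
    }
}
```

Returning early does not change the worst case (an array with no duplicates still compares every pair), but it avoids unnecessary work when a duplicate appears near the start.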

This nested loop approach works by comparing every possible pair of elements in the array. An array of n elements has n(n - 1)/2 such pairs, so the running time grows as O(n^2): simple to understand, but slow for very large arrays. In the next steps, we will explore more efficient ways to find duplicates.
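To make the growth concrete, the sketch below counts how many times the inner comparison would run for a few array lengths and checks the count against the closed form n(n - 1)/2. The class name ComparisonCount and the chosen sizes are assumptions made for illustration only.

```java
public class ComparisonCount {

    // Counts how many times the inner comparison runs for an array of length n.
    public static long countComparisons(int n) {
        long comparisons = 0;
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                comparisons++;
            }
        }
        return comparisons;
    }

    public static void main(String[] args) {
        int[] sizes = {9, 1_000, 10_000}; // illustrative sizes, not from the lab
        for (int n : sizes) {
            long expected = (long) n * (n - 1) / 2; // closed form for the number of pairs
            System.out.println("n = " + n + ": " + countComparisons(n)
                    + " comparisons (formula: " + expected + ")");
        }
    }
}
```

For the nine-element array used in this lab the count is 36, while 10,000 elements already require 49,995,000 comparisons.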

Use HashSet for Efficient Duplicate Check

In the previous step, we used nested loops to find duplicates, which is simple but can be slow for large arrays. In this step, we will learn a more efficient way to find duplicates using a HashSet.

A HashSet is a collection in Java that stores only unique elements. If you try to add an element that is already in the HashSet, the set is left unchanged and add() returns false. We can use this return value to detect duplicates efficiently.

Here's the idea: we iterate through the array, and for each element, we try to add it to a HashSet. If the add() method returns false, it means the element is already in the set, and therefore, it's a duplicate.
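The return value of add() can be observed in isolation before it is used for duplicate detection. This is a small sketch; the class name HashSetAddDemo, the helper addResults, and the sample values are hypothetical and only serve to show what add() reports.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class HashSetAddDemo {

    // Hypothetical helper: adds each value to a fresh set and records what add() returned.
    public static boolean[] addResults(int... values) {
        Set<Integer> seen = new HashSet<>();
        boolean[] results = new boolean[values.length];
        for (int i = 0; i < values.length; i++) {
            results[i] = seen.add(values[i]); // false means the value was already present
        }
        return results;
    }

    public static void main(String[] args) {
        boolean[] results = addResults(5, 7, 5);
        System.out.println(Arrays.toString(results)); // [true, true, false]
    }
}
```

The third call returns false because 5 was already added by the first call.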

Let's create a new Java file named FindDuplicatesHashSet.java in your ~/project directory.

Open the FindDuplicatesHashSet.java file in the Code Editor and add the following Java code:

import java.util.HashSet;
import java.util.Set;

public class FindDuplicatesHashSet {

    public static void main(String[] args) {
        int[] numbers = {1, 2, 3, 4, 2, 7, 8, 8, 3};

        // Create a HashSet to store unique elements
        Set<Integer> uniqueElements = new HashSet<>();

        System.out.println("Finding duplicate elements using HashSet:");

        // Iterate through the array
        for (int number : numbers) {
            // Try to add the element to the HashSet
            // If add() returns false, the element is a duplicate
            if (!uniqueElements.add(number)) {
                System.out.println("Duplicate found: " + number);
            }
        }
    }
}

Let's look at the new parts of this code:

import java.util.HashSet; and import java.util.Set;: These lines import the Set interface and the HashSet class that implements it.
Set<Integer> uniqueElements = new HashSet<>();: This line creates an empty HashSet that will store Integer objects. We declare the variable as Set because HashSet implements the Set interface. Each int from the array is automatically boxed into an Integer when it is added.
for (int number : numbers): This is an enhanced for loop (also known as a for-each loop) which is a convenient way to iterate through each element of the numbers array.
!uniqueElements.add(number): This is the core logic. uniqueElements.add(number) attempts to add the current number to the HashSet. If the number is already present, add() returns false. The ! operator negates this result, so the if condition is true only when add() returns false, indicating a duplicate.
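As an aside to the walkthrough above: if a value occurs three or more times, the add() check reports it on every repeat. One possible way to list each repeated value only once is to collect the repeats in a second set. The sketch below assumes a hypothetical class DuplicateValues and helper findDuplicates; it is separate from FindDuplicatesHashSet.java, which stays as written.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

public class DuplicateValues {

    // Hypothetical helper: returns each repeated value once, in the order its first repeat is seen.
    public static Set<Integer> findDuplicates(int[] values) {
        Set<Integer> seen = new HashSet<>();
        Set<Integer> duplicates = new LinkedHashSet<>(); // LinkedHashSet keeps insertion order
        for (int value : values) {
            if (!seen.add(value)) {
                duplicates.add(value); // adding the same value again changes nothing
            }
        }
        return duplicates;
    }

    public static void main(String[] args) {
        int[] numbers = {1, 2, 2, 2, 3, 3}; // sample data chosen for this sketch
        System.out.println(findDuplicates(numbers)); // [2, 3]
    }
}
```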

Save the file (Ctrl + S or Cmd + S).

Now, compile the Java code in the Terminal:

javac FindDuplicatesHashSet.java

If the compilation is successful, run the program:

java FindDuplicatesHashSet

You should see the output listing the duplicates 2, 8, and 3, in the order in which each repeated value is encountered while scanning the array. This method is generally faster than the nested loop approach, especially for larger arrays: add() takes constant time on average, so the whole scan runs in O(n) time on average, at the cost of extra memory for the set.
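When only a true/false answer is required, the HashSet scan can stop at the first failed add(). The following sketch assumes a hypothetical class HasDuplicateHashSet with a helper hasDuplicate; neither name comes from the lab.

```java
import java.util.HashSet;
import java.util.Set;

public class HasDuplicateHashSet {

    // Hypothetical helper: returns true at the first value that is already in the set.
    public static boolean hasDuplicate(int[] values) {
        Set<Integer> seen = new HashSet<>();
        for (int value : values) {
            if (!seen.add(value)) {
                return true; // add() returned false, so value was seen before
            }
        }
        return false; // every value was added successfully
    }

    public static void main(String[] args) {
        int[] numbers = {1, 2, 3, 4, 2, 7, 8, 8, 3};
        int[] distinct = {1, 2, 3, 4};
        System.out.println(hasDuplicate(numbers));  // true
        System.out.println(hasDuplicate(distinct)); // false
    }
}
```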

Test with Sorted Array

In this final step, we will explore another approach to finding duplicates, specifically when the array is sorted. If an array is sorted, duplicate elements will always be adjacent to each other. This allows for a very simple and efficient way to find duplicates by just comparing adjacent elements.

First, let's create a new Java file named FindDuplicatesSorted.java in your ~/project directory.

Open the FindDuplicatesSorted.java file in the Code Editor and add the following Java code:

import java.util.Arrays;

public class FindDuplicatesSorted {

    public static void main(String[] args) {
        int[] numbers = {1, 2, 3, 4, 2, 7, 8, 8, 3};

        // First, sort the array
        Arrays.sort(numbers);

        System.out.println("Finding duplicate elements in a sorted array:");

        // Iterate through the sorted array and compare adjacent elements
        for (int i = 0; i < numbers.length - 1; i++) {
            // If the current element is equal to the next element, it's a duplicate
            if (numbers[i] == numbers[i + 1]) {
                System.out.println("Duplicate found: " + numbers[i]);
            }
        }
    }
}

Let's examine the key parts of this code:

import java.util.Arrays;: This line imports the Arrays class, which provides utility methods for arrays, including sorting.
Arrays.sort(numbers);: This line sorts the numbers array in ascending order. The sort happens in place, so the original order of the elements is lost.
for (int i = 0; i < numbers.length - 1; i++): This loop iterates through the sorted array. We loop up to numbers.length - 1 because we are comparing the current element (numbers[i]) with the next element (numbers[i + 1]).
if (numbers[i] == numbers[i + 1]): This condition checks if the current element is equal to the next element. If they are the same, it means we have found a duplicate.

Save the file (Ctrl + S or Cmd + S).

Now, compile the Java code in the Terminal:

javac FindDuplicatesSorted.java

If the compilation is successful, run the program:

java FindDuplicatesSorted

You should see the output listing the duplicates 2, 3, and 8. Because the array is sorted before it is scanned, the duplicates are reported in ascending order. Note that a value appearing three times would be reported twice, once for each adjacent equal pair.

Once the array is sorted, finding duplicates takes only a single O(n) pass. However, the sorting step itself has a time cost, which depends on the sorting algorithm used by Arrays.sort(). For primitive types like int, Java's Arrays.sort() uses a dual-pivot quicksort, which typically runs in O(n log n) time, so sorting dominates the total cost of this approach.
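Because Arrays.sort() reorders the array it is given, callers that still need the original order may prefer to sort a copy. The sketch below shows one way to do that; the class name HasDuplicateSorted and the helper hasDuplicate are hypothetical, and copying adds O(n) extra memory.

```java
import java.util.Arrays;

public class HasDuplicateSorted {

    // Hypothetical helper: sorts a copy so the caller's array keeps its original order.
    public static boolean hasDuplicate(int[] values) {
        int[] sorted = Arrays.copyOf(values, values.length);
        Arrays.sort(sorted);
        for (int i = 0; i < sorted.length - 1; i++) {
            if (sorted[i] == sorted[i + 1]) {
                return true; // equal neighbours in sorted order mean a duplicate
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] numbers = {1, 2, 3, 4, 2, 7, 8, 8, 3};
        System.out.println(hasDuplicate(numbers));    // true
        System.out.println(Arrays.toString(numbers)); // [1, 2, 3, 4, 2, 7, 8, 8, 3]
    }
}
```

The second line of output confirms that the input array is left untouched.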

You have now explored three different ways to find duplicates in an array in Java: using nested loops, using a HashSet, and using a sorted array. Each method has its own trade-offs in terms of simplicity, efficiency, and requirements (like the array being sorted). Understanding these different approaches is valuable for choosing the most suitable method for a given problem.
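To get a rough feel for these trade-offs, the three approaches can be run side by side on the same input. The sketch below is an informal comparison, not a proper benchmark: a single timed run is affected by JIT warm-up and machine load, and the class name CompareApproaches, the method names, the array size, and the way the test data is generated are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class CompareApproaches {

    public static boolean nested(int[] a) {
        for (int i = 0; i < a.length; i++) {
            for (int j = i + 1; j < a.length; j++) {
                if (a[i] == a[j]) return true;
            }
        }
        return false;
    }

    public static boolean hashSet(int[] a) {
        Set<Integer> seen = new HashSet<>();
        for (int value : a) {
            if (!seen.add(value)) return true;
        }
        return false;
    }

    public static boolean sorted(int[] a) {
        int[] copy = Arrays.copyOf(a, a.length); // keep the input unchanged
        Arrays.sort(copy);
        for (int i = 0; i < copy.length - 1; i++) {
            if (copy[i] == copy[i + 1]) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        int n = 20_000; // arbitrary size chosen for illustration
        int[] values = new int[n];
        for (int i = 0; i < n; i++) {
            // 7919 shares no factor with 20000, so this yields n distinct values
            // in scrambled order: the worst case, since no approach can stop early.
            values[i] = (int) ((i * 7919L) % n);
        }

        long start = System.nanoTime();
        boolean r1 = nested(values);
        long t1 = System.nanoTime() - start;

        start = System.nanoTime();
        boolean r2 = hashSet(values);
        long t2 = System.nanoTime() - start;

        start = System.nanoTime();
        boolean r3 = sorted(values);
        long t3 = System.nanoTime() - start;

        System.out.println("nested loops: " + r1 + " in " + t1 / 1_000_000 + " ms");
        System.out.println("HashSet:      " + r2 + " in " + t2 / 1_000_000 + " ms");
        System.out.println("sorted copy:  " + r3 + " in " + t3 / 1_000_000 + " ms");
    }
}
```

All three methods should report false for this input. The timings vary from machine to machine, so treat them only as a hint of how the approaches scale.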

Summary

In this lab, we explored different methods for checking if an array contains duplicate elements in Java. We began by implementing a straightforward approach using nested loops, which involves comparing each element with every other element in the array. This method, while easy to understand, has a time complexity of O(n^2), making it less efficient for large arrays.

Next, we learned how to leverage the HashSet data structure for a more efficient duplicate check. By iterating through the array and attempting to add each element to a HashSet, we can quickly determine if an element is a duplicate because the add() method of HashSet returns false if the element already exists. This approach runs in O(n) time on average, at the cost of extra memory for the set. Finally, we considered how sorting the array first can also be used to find duplicates: after sorting, duplicate elements are adjacent, so a single pass finds them, and the O(n log n) sort dominates the total cost.