Introduction
In this lab, you will learn how to efficiently check if a Java List contains duplicate elements. We will explore a common and effective technique using the HashSet data structure.
You will first implement a method that leverages the unique element property of HashSet to detect duplicates by iterating through the list and adding elements to the set. If an element is already present in the set, a duplicate is found. Subsequently, you will learn an alternative approach by comparing the size of the original list with the size of a HashSet populated with the list's elements. Finally, you will test your implementation with various scenarios, including null and empty lists, to ensure robustness.
Use HashSet for Duplicate Detection
In this step, we will explore how to use a HashSet in Java to efficiently detect duplicate elements within a collection. HashSet is part of the Java Collections Framework and is particularly useful for storing unique elements.
First, let's create a new Java file named DuplicateDetector.java in your ~/project directory. You can do this using the WebIDE's File Explorer on the left. Right-click in the ~/project area, select "New File", and type DuplicateDetector.java.
Now, open the DuplicateDetector.java file in the Code Editor and add the following code:
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DuplicateDetector {

    public static boolean containsDuplicates(List<String> list) {
        // Create a HashSet to store unique elements
        Set<String> uniqueElements = new HashSet<>();

        // Iterate through the list
        for (String element : list) {
            // If the element is already in the HashSet, it's a duplicate
            if (uniqueElements.contains(element)) {
                return true; // Found a duplicate
            }
            // Otherwise, add the element to the HashSet
            uniqueElements.add(element);
        }

        // If the loop finishes without finding duplicates, return false
        return false;
    }

    public static void main(String[] args) {
        // Example usage
        List<String> myListWithDuplicates = new ArrayList<>();
        myListWithDuplicates.add("apple");
        myListWithDuplicates.add("banana");
        myListWithDuplicates.add("apple"); // Duplicate
        myListWithDuplicates.add("orange");

        List<String> myListWithoutDuplicates = new ArrayList<>();
        myListWithoutDuplicates.add("grape");
        myListWithoutDuplicates.add("mango");
        myListWithoutDuplicates.add("kiwi");

        System.out.println("List with duplicates: " + myListWithDuplicates);
        System.out.println("Contains duplicates? " + containsDuplicates(myListWithDuplicates)); // Expected: true

        System.out.println("\nList without duplicates: " + myListWithoutDuplicates);
        System.out.println("Contains duplicates? " + containsDuplicates(myListWithoutDuplicates)); // Expected: false
    }
}
Let's understand the key parts of this code:
- import java.util.ArrayList;, import java.util.HashSet;, import java.util.List;, import java.util.Set;: These lines import the necessary classes from the Java Collections Framework.
- public static boolean containsDuplicates(List<String> list): This method takes a List of String objects as input and returns true if it contains duplicates, and false otherwise.
- Set<String> uniqueElements = new HashSet<>();: This creates an empty HashSet called uniqueElements. HashSet is designed to store only unique elements.
- for (String element : list): This loop iterates through each element in the input list.
- if (uniqueElements.contains(element)): This checks if the current element is already present in the uniqueElements HashSet. If it is, we have found a duplicate, and the method returns true.
- uniqueElements.add(element);: If the element is not already in the HashSet, it is added. Because HashSet only stores unique elements, adding an element that is already present has no effect.
- return false;: If the loop completes without finding any duplicates, the method returns false.
- The main method demonstrates how to use the containsDuplicates method with example lists.
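The contains-then-add pair described above performs two hash lookups per element. As a hedged variation, Set.add itself returns false when the element is already present, so the two calls can be folded into one. The class name AddReturnSketch is hypothetical and not part of the lab files.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch class; not part of the lab's DuplicateDetector.java.
public class AddReturnSketch {

    // Variant of containsDuplicates that relies on Set.add returning
    // false when the element is already in the set.
    public static boolean containsDuplicates(List<String> list) {
        Set<String> seen = new HashSet<>();
        for (String element : list) {
            if (!seen.add(element)) {
                return true; // add() reported the element as already present
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsDuplicates(Arrays.asList("apple", "banana", "apple"))); // true
        System.out.println(containsDuplicates(Arrays.asList("grape", "mango", "kiwi")));   // false
    }
}
```

The behavior is the same as the version in this step; the only difference is that each element is hashed once instead of twice.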
Save the DuplicateDetector.java file (Ctrl+S or Cmd+S).
Now, let's compile and run this program in the Terminal. Make sure you are in the ~/project directory.
Compile the code:
javac DuplicateDetector.java
If there are no compilation errors, you will see no output.
Now, run the compiled code:
java DuplicateDetector
You should see output similar to this:
List with duplicates: [apple, banana, apple, orange]
Contains duplicates? true

List without duplicates: [grape, mango, kiwi]
Contains duplicates? false
This output confirms that our containsDuplicates method correctly identified the list with duplicates. Using a HashSet is an efficient way to check for duplicates because contains() and add() on a HashSet run in constant time on average, so the whole check takes time roughly proportional to the length of the list.
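For contrast, a check that does not use a HashSet has to compare each element against every later one, which takes time proportional to the square of the list length. The sketch below is a hypothetical baseline (the class NestedLoopSketch and its method name are illustrative only) that shows what the HashSet lookup replaces.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical baseline for comparison; not part of the lab's code.
public class NestedLoopSketch {

    // Compares every pair of elements: roughly n * n / 2 comparisons
    // for a list of n elements with no duplicates.
    public static boolean containsDuplicatesQuadratic(List<String> list) {
        for (int i = 0; i < list.size(); i++) {
            for (int j = i + 1; j < list.size(); j++) {
                // Objects.equals also copes with null elements
                if (Objects.equals(list.get(i), list.get(j))) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsDuplicatesQuadratic(
                Arrays.asList("apple", "banana", "apple", "orange"))); // true
        System.out.println(containsDuplicatesQuadratic(
                Arrays.asList("grape", "mango", "kiwi")));             // false
    }
}
```

Both versions give the same answers; the difference only becomes noticeable as lists grow.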
Compare List Size with Set Size
In the previous step, we used a HashSet to check for duplicates by iterating through the list and adding elements to the set. A more concise way to detect duplicates is to compare the size of the original list with the size of a HashSet created from that list.
Remember that a HashSet only stores unique elements. If a list contains duplicates, the size of a HashSet created from that list will be smaller than the size of the original list. If there are no duplicates, the sizes will be the same.
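To make the size relationship concrete, here is a minimal sketch using the same fruit list as in the previous step. The class SizeComparisonSketch and the helper uniqueCount are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;

// Hypothetical sketch class; not part of the lab's DuplicateDetector.java.
public class SizeComparisonSketch {

    // Number of distinct elements: the size of a HashSet built from the list.
    public static int uniqueCount(List<String> list) {
        return new HashSet<>(list).size();
    }

    public static void main(String[] args) {
        List<String> fruits = Arrays.asList("apple", "banana", "apple", "orange");
        System.out.println("List size: " + fruits.size());      // 4
        System.out.println("Set size: " + uniqueCount(fruits)); // 3, "apple" is stored once
    }
}
```

The list holds four entries but only three distinct values, so the two sizes differ.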
Let's modify our DuplicateDetector.java file to implement this approach. Open ~/project/DuplicateDetector.java in the Code Editor.
Replace the containsDuplicates method with the following code:
    public static boolean containsDuplicates(List<String> list) {
        // Create a HashSet from the list
        Set<String> uniqueElements = new HashSet<>(list);

        // Compare the size of the list with the size of the HashSet
        return list.size() != uniqueElements.size();
    }
Here's what's happening in the new code:
- Set<String> uniqueElements = new HashSet<>(list);: This line creates a HashSet and initializes it with all the elements from the input list. The HashSet handles uniqueness automatically, so duplicate elements from the list are stored only once.
- return list.size() != uniqueElements.size();: This line compares the number of elements in the original list (list.size()) with the number of unique elements in the HashSet (uniqueElements.size()). If the sizes are different (!=), there were duplicates in the list, and the method returns true. If the sizes are the same, there were no duplicates, and the method returns false.
The main method can remain the same as it already calls the containsDuplicates method.
Save the DuplicateDetector.java file (Ctrl+S or Cmd+S).
Now, let's compile and run the modified program. Make sure you are in the ~/project directory in the Terminal.
Compile the code:
javac DuplicateDetector.java
Run the compiled code:
java DuplicateDetector
You should see the same output as before:
List with duplicates: [apple, banana, apple, orange]
Contains duplicates? true

List without duplicates: [grape, mango, kiwi]
Contains duplicates? false
This confirms that the size-comparison version works correctly. It is more concise than the explicit loop, but it always copies every element into the set, whereas the loop from the previous step can return as soon as it meets the first duplicate. Both approaches take time roughly proportional to the length of the list on average.
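How the two approaches compare in practice depends on where the duplicates sit. The following sketch counts how many elements a loop-based check examines before it returns; the size-comparison version always processes the whole list. The class EarlyExitSketch and the method elementsExaminedByLoop are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch class; not part of the lab's DuplicateDetector.java.
public class EarlyExitSketch {

    // Returns how many elements a loop-based check looks at before it
    // either finds a duplicate or reaches the end of the list.
    public static int elementsExaminedByLoop(List<String> list) {
        Set<String> seen = new HashSet<>();
        int examined = 0;
        for (String element : list) {
            examined++;
            if (!seen.add(element)) {
                return examined; // stop at the first duplicate
            }
        }
        return examined;
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>();
        list.add("apple");
        list.add("apple"); // duplicate right at the start
        for (int i = 0; i < 1000; i++) {
            list.add("item" + i);
        }
        System.out.println("Loop examined: " + elementsExaminedByLoop(list)); // 2
        System.out.println("Size comparison copies: " + list.size());         // 1002
    }
}
```

When duplicates are rare or absent, both approaches end up visiting every element, and the shorter size-comparison code is a reasonable choice.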
Test with Null and Empty Lists
In real-world programming, it's important to consider edge cases, such as when a list might be empty or even null. Our current containsDuplicates method works well for lists with elements, but what happens if we pass an empty list or a null list?
Let's test this by adding more examples to our main method in ~/project/DuplicateDetector.java. Open the file in the Code Editor and add the following lines to the main method, after the existing code:
        System.out.println("\nEmpty list: " + new ArrayList<>());
        System.out.println("Contains duplicates? " + containsDuplicates(new ArrayList<>())); // Expected: false

        List<String> nullList = null;
        System.out.println("\nNull list: " + nullList);
        // The following line will cause a NullPointerException if not handled
        // System.out.println("Contains duplicates? " + containsDuplicates(nullList));
Save the file (Ctrl+S or Cmd+S).
Now, compile and run the program again.
Compile:
javac DuplicateDetector.java
Run:
java DuplicateDetector
You should see the output for the empty list, followed by the line that prints the null reference itself:
List with duplicates: [apple, banana, apple, orange]
Contains duplicates? true

List without duplicates: [grape, mango, kiwi]
Contains duplicates? false

Empty list: []
Contains duplicates? false

Null list: null
The output for the empty list is correct; an empty list does not contain duplicates.
However, if you uncomment the line System.out.println("Contains duplicates? " + containsDuplicates(nullList)); and run the program again, it compiles without errors but throws a NullPointerException at runtime. This happens because the HashSet constructor that copies a collection does not accept null.
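The failure can be reproduced in isolation. This minimal sketch passes a null list straight to the HashSet copy constructor and catches the resulting exception; the class NullListSketch and the method throwsOnNull are hypothetical names used only here.

```java
import java.util.HashSet;
import java.util.List;

// Hypothetical sketch class; not part of the lab's DuplicateDetector.java.
public class NullListSketch {

    // Returns true if building a HashSet from a null list throws
    // a NullPointerException.
    public static boolean throwsOnNull() {
        List<String> nullList = null;
        try {
            new HashSet<>(nullList); // the copy constructor rejects null
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("Constructor threw NullPointerException? " + throwsOnNull()); // true
    }
}
```

In real code you would normally avoid the exception with a check rather than catch it; the try/catch here only makes the behavior visible.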
To make our containsDuplicates method more robust, we should handle the case where the input list is null. We can add a check at the beginning of the method.
Modify the containsDuplicates method in ~/project/DuplicateDetector.java to include a null check:
    public static boolean containsDuplicates(List<String> list) {
        // Handle null input
        if (list == null) {
            return false; // A null list does not contain duplicates
        }

        // Create a HashSet from the list
        Set<String> uniqueElements = new HashSet<>(list);

        // Compare the size of the list with the size of the HashSet
        return list.size() != uniqueElements.size();
    }
Now, uncomment the line that tests the null list in the main method:
        List<String> nullList = null;
        System.out.println("\nNull list: " + nullList);
        System.out.println("Contains duplicates? " + containsDuplicates(nullList)); // Expected: false
Save the file (Ctrl+S or Cmd+S).
Compile and run the program one last time.
Compile:
javac DuplicateDetector.java
Run:
java DuplicateDetector
The output should now include the result for the null list without crashing:
List with duplicates: [apple, banana, apple, orange]
Contains duplicates? true

List without duplicates: [grape, mango, kiwi]
Contains duplicates? false

Empty list: []
Contains duplicates? false

Null list: null
Contains duplicates? false
By adding the null check, our containsDuplicates method is now more robust and can handle null input gracefully. This is an important practice in programming to prevent unexpected errors.
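Returning false for a null list is one possible policy, not the only one. The sketch below, under the assumption that a caller might prefer to fail fast, contrasts the lab's lenient check with a stricter variant built on Objects.requireNonNull. It also shows that a list containing null elements is handled, since a HashSet permits a single null element. The class NullPolicySketch and the method names lenient and strict are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;

// Hypothetical sketch class; not part of the lab's DuplicateDetector.java.
public class NullPolicySketch {

    // Policy used in this lab: a null list is reported as duplicate-free.
    public static boolean lenient(List<String> list) {
        if (list == null) {
            return false;
        }
        return list.size() != new HashSet<>(list).size();
    }

    // Assumed alternative policy: reject a null list with a clear message.
    public static boolean strict(List<String> list) {
        Objects.requireNonNull(list, "list must not be null");
        return list.size() != new HashSet<>(list).size();
    }

    public static void main(String[] args) {
        System.out.println(lenient(null));                          // false
        System.out.println(lenient(Arrays.asList("a", null, null))); // true, null appears twice
        try {
            strict(null);
        } catch (NullPointerException e) {
            System.out.println("strict rejected null: " + e.getMessage());
        }
    }
}
```

Which policy fits depends on whether a null list is an expected input or a programming error in the calling code.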
Summary
In this lab, we learned how to check if a Java List contains duplicate elements. We explored the use of a HashSet for efficient duplicate detection. By iterating through the list and attempting to add each element to a HashSet, we can quickly determine if an element is already present, indicating a duplicate.
We also learned an alternative method by comparing the size of the original list with the size of a HashSet created from the list. If the sizes are different, it signifies the presence of duplicates. Finally, we considered edge cases by testing the methods with null and empty lists to ensure robustness.



