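Using and Applying Python Sets

Introduction

In this lab, you will gain hands-on experience with sets in Python. A set is a fundamental data structure that stores unique, unordered elements. Sets are very efficient for tasks such as checking whether an element exists in a collection and performing mathematical set operations.

You will learn how to create sets, add and remove elements, and perform common operations such as union, intersection, and difference. Finally, you will look at a practical application: using a set to easily remove duplicates from a list.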
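
As a quick illustration of the membership check mentioned above, here is a minimal sketch. The set name allowed_users and its contents are made up for this example and are not part of the lab files.

```python
# Hypothetical data for illustration only.
allowed_users = {'alice', 'bob', 'carol'}

# The `in` operator tests membership. For a set this is a hash lookup
# rather than a scan over every element.
print('alice' in allowed_users)      # True
print('dave' in allowed_users)       # False
print('dave' not in allowed_users)   # True
```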

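Creating a Set and Adding Elements

In this first step, you will learn how to create a set and add new elements to it. Because a set is a collection of unique items, duplicate elements are removed automatically.

Your environment contains an empty file named set_basics.py. Use the file explorer on the left side of the editor to find and open ~/project/set_basics.py.

Add the following Python code to the file. It shows several ways to create a set.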

## Method 1: Using curly braces {}
## This creates a set with initial elements.
my_set = {'apple', 'banana', 'cherry'}
print("Set created with braces:", my_set)
print("Type of my_set:", type(my_set))

## Note: Sets automatically remove duplicate elements.
duplicate_set = {'apple', 'banana', 'apple'}
print("Set with duplicates:", duplicate_set)

## Method 2: Using the set() constructor on an iterable (like a string)
## This creates a set from the unique characters in the string.
char_set = set('hello world')
print("Set from a string:", char_set)

## Method 3: Creating an empty set
## You must use set() to create an empty set. {} creates an empty dictionary.
empty_set = set()
print("An empty set:", empty_set)
print("Type of empty_set:", type(empty_set))

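Save the file. Then open a terminal in the editor (you can use the menu: Terminal -> New Terminal) and run the script with the following command.

python ~/project/set_basics.py

You should see output similar to the following. Note that the order of elements in a set is not guaranteed, and check that the duplicates have been removed.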

Set created with braces: {'cherry', 'apple', 'banana'}
Type of my_set: <class 'set'>
Set with duplicates: {'banana', 'apple'}
Set from a string: {'d', 'l', 'o', 'r', 'w', ' ', 'h', 'e'}
An empty set: set()
Type of empty_set: <class 'set'>
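
The difference between {} and set() noted in the script is easy to verify directly. The short sketch below is separate from set_basics.py; the variable names empty_braces and letters are chosen only for this example.

```python
# {} is an empty dictionary, not an empty set.
empty_braces = {}
empty_set = set()
print(type(empty_braces))  # <class 'dict'>
print(type(empty_set))     # <class 'set'>

# len() counts unique elements, so the repeated 'l' is counted once.
letters = set('hello')
print(len(letters))                      # 4
print(letters == {'h', 'e', 'l', 'o'})   # True
```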

Next, let's add new elements to an existing set. You can add a single element with the add() method, or add multiple elements with the update() method.

Add the following code to the bottom of set_basics.py.

## --- Adding elements ---
fruits = {'apple', 'banana'}
print("\nOriginal fruits set:", fruits)

## Use add() to add a single element
fruits.add('orange')
print("After adding 'orange':", fruits)

## add() has no effect if the element is already present
fruits.add('apple')
print("After adding 'apple' again:", fruits)

## Use update() to add multiple elements from an iterable (like a list)
fruits.update(['mango', 'grape'])
print("After updating with a list:", fruits)

Save the file again and run the updated script in the terminal.

python ~/project/set_basics.py

The output now includes the results of adding elements.

Set created with braces: {'cherry', 'apple', 'banana'}
Type of my_set: <class 'set'>
Set with duplicates: {'banana', 'apple'}
Set from a string: {'d', 'l', 'o', 'r', 'w', ' ', 'h', 'e'}
An empty set: set()
Type of empty_set: <class 'set'>

Original fruits set: {'banana', 'apple'}
After adding 'orange': {'banana', 'orange', 'apple'}
After adding 'apple' again: {'banana', 'orange', 'apple'}
After updating with a list: {'grape', 'mango', 'banana', 'orange', 'apple'}

You have now learned how to create a set and modify it by adding new elements.
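
One detail worth trying yourself: update() iterates over its argument, so passing a plain string adds its individual characters rather than the whole string. The following sketch is not part of the lab script, and the values are made up for illustration.

```python
fruits = {'apple', 'banana'}

# A string is an iterable of characters, so update() adds each character.
letters = set()
letters.update('kiwi')
print(letters == {'k', 'i', 'w'})  # True

# To add the whole string, use add() or wrap it in a list.
fruits.add('kiwi')

# update() also accepts several iterables at once.
fruits.update(['melon'], ('peach', 'plum'))
print(fruits == {'apple', 'banana', 'kiwi', 'melon', 'peach', 'plum'})  # True
```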

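Removing Elements from a Set

In this step, you will learn several ways to remove elements from a set. Because sets are unordered, you cannot remove an item by index. Instead, Python provides dedicated methods for this purpose.

Find and open the set_removal.py file in the ~/project directory.

Add the following code to the file. It demonstrates the remove(), discard(), pop(), and clear() methods.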

## --- Removing elements ---
my_set = {'a', 'b', 'c', 'd', 'e'}
print("Original set:", my_set)

## Method 1: remove()
## This removes a specified element. It raises a KeyError if the element is not found.
my_set.remove('b')
print("After removing 'b':", my_set)
## The following line would cause an error: my_set.remove('z')

## Method 2: discard()
## This also removes a specified element, but it does NOT raise an error if the element is not found.
print("\nStarting set for discard:", my_set)
my_set.discard('c')
print("After discarding 'c':", my_set)
my_set.discard('z') ## 'z' is not in the set, but no error occurs.
print("After discarding 'z' (non-existent):", my_set)

## Method 3: pop()
## This removes and returns an arbitrary element from the set.
## Since sets are unordered, you don't know which item will be popped.
print("\nStarting set for pop:", my_set)
popped_item = my_set.pop()
print("Popped item:", popped_item)
print("Set after pop():", my_set)

## Method 4: clear()
## This removes all elements from the set, leaving an empty set.
print("\nStarting set for clear:", my_set)
my_set.clear()
print("Set after clear():", my_set)

Save the file. Then run the script in the terminal.

python ~/project/set_removal.py

The output should look similar to the following. Because sets are unordered, the element removed by pop() may differ each time you run the script.

Original set: {'d', 'c', 'e', 'a', 'b'}
After removing 'b': {'d', 'c', 'e', 'a'}

Starting set for discard: {'d', 'c', 'e', 'a'}
After discarding 'c': {'d', 'e', 'a'}
After discarding 'z' (non-existent): {'d', 'e', 'a'}

Starting set for pop: {'d', 'e', 'a'}
Popped item: d
Set after pop(): {'e', 'a'}

Starting set for clear: {'e', 'a'}
Set after clear(): set()

You now understand the main methods for removing elements from a set, as well as the important difference between remove() and discard().
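
To see the remove() error without stopping the script, you can catch the KeyError. This is a minimal sketch that is separate from set_removal.py; it also shows that pop() raises the same exception on an empty set.

```python
my_set = {'a', 'b'}

# remove() raises KeyError for a missing element, so guard it if needed.
try:
    my_set.remove('z')
except KeyError as err:
    print("remove() failed, missing element:", err)

# discard() is silent and returns None whether or not the element existed.
result = my_set.discard('z')
print(result)  # None

# pop() on an empty set also raises KeyError.
empty = set()
try:
    empty.pop()
except KeyError:
    print("pop() failed: the set is empty")
```

Choosing between the two is mostly a question of intent: remove() suits cases where a missing element indicates a bug, while discard() suits cases where absence is acceptable.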

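Performing Set Operations

Sets are particularly powerful for performing mathematical operations such as union, intersection, and difference. In this step, you will learn how to perform these operations in Python.

Find and open the set_operations.py file in the ~/project directory.

Add the following code to the file. It defines two sets and then performs the three main set operations.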

set_a = {'a', 'b', 'c', 'd'}
set_b = {'c', 'd', 'e', 'f'}

print("Set A:", set_a)
print("Set B:", set_b)

## --- Union ---
## The union contains all unique elements from both sets.
## You can use the | operator or the .union() method.
union_set_op = set_a | set_b
union_set_method = set_a.union(set_b)
print("\nUnion with | operator:", union_set_op)
print("Union with .union() method:", union_set_method)

## --- Intersection ---
## The intersection contains only the elements that are common to both sets.
## You can use the & operator or the .intersection() method.
intersection_set_op = set_a & set_b
intersection_set_method = set_a.intersection(set_b)
print("\nIntersection with & operator:", intersection_set_op)
print("Intersection with .intersection() method:", intersection_set_method)

## --- Difference ---
## The difference contains elements that are in the first set but NOT in the second set.
## You can use the - operator or the .difference() method.
difference_set_op = set_a - set_b
difference_set_method = set_a.difference(set_b)
print("\nDifference (A - B) with - operator:", difference_set_op)
print("Difference (A - B) with .difference() method:", difference_set_method)

## Note that the order matters for difference
difference_b_a = set_b - set_a
print("Difference (B - A):", difference_b_a)

Save the file and run it in the terminal.

python ~/project/set_operations.py

The output shows the result of each operation.

Set A: {'d', 'c', 'a', 'b'}
Set B: {'d', 'c', 'f', 'e'}

Union with | operator: {'d', 'c', 'f', 'e', 'a', 'b'}
Union with .union() method: {'d', 'c', 'f', 'e', 'a', 'b'}

Intersection with & operator: {'d', 'c'}
Intersection with .intersection() method: {'d', 'c'}

Difference (A - B) with - operator: {'a', 'b'}
Difference (A - B) with .difference() method: {'a', 'b'}
Difference (B - A): {'f', 'e'}

You have successfully performed union, intersection, and difference operations on sets using both operators and methods.
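
The operator and method forms are not completely interchangeable. As a small sketch outside the lab script: the methods accept any iterable, while the operators require both operands to be sets. The name union_set is used only for this example.

```python
set_a = {'a', 'b', 'c', 'd'}
set_b = {'c', 'd', 'e', 'f'}

# Methods accept any iterable, such as a list.
print(set_a.intersection(['c', 'd', 'e']) == {'c', 'd'})  # True

# Operators require both operands to be sets.
try:
    set_a & ['c', 'd', 'e']
except TypeError:
    print("& does not accept a list")

# Neither form modifies the original sets; each returns a new set.
union_set = set_a | set_b
print(set_a == {'a', 'b', 'c', 'd'})  # True
```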

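Removing Duplicates from a List Using a Set

One of the most common and practical uses of a set is to quickly remove duplicate elements from a list. Because a set can contain only unique elements, converting a list to a set and back to a list is a simple and efficient way to do this.

Find and open remove_duplicates.py, the last file in this lab, in the ~/project directory.

Add the following code to the file.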

## A list containing several duplicate numbers
numbers_list = [1, 5, 2, 3, 5, 1, 4, 2, 2, 5]
print("Original list with duplicates:", numbers_list)

## Step 1: Convert the list to a set.
## This automatically removes all duplicate elements.
unique_numbers_set = set(numbers_list)
print("Set created from list (duplicates gone):", unique_numbers_set)

## Step 2: Convert the set back to a list.
## The new list will only contain the unique elements.
unique_numbers_list = list(unique_numbers_set)
print("Final list with duplicates removed:", unique_numbers_list)

## Note: This process does not preserve the original order of the elements
## because sets are an unordered data structure.

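Save the file and run it in the terminal.

python ~/project/remove_duplicates.py

The output shows the whole process: the original list, the intermediate set, and finally the list with duplicates removed. The result may happen to look sorted for small integers like these, but that is an implementation detail and should not be relied on.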

Original list with duplicates: [1, 5, 2, 3, 5, 1, 4, 2, 2, 5]
Set created from list (duplicates gone): {1, 2, 3, 4, 5}
Final list with duplicates removed: [1, 2, 3, 4, 5]

You have successfully applied your knowledge of sets to solve a common programming problem: removing duplicates from a list.
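
As the note in the script says, the set round trip does not preserve the original order. If order matters, one alternative (not covered in this lab, and shown here only as a sketch) is dict.fromkeys(), which relies on dictionaries keeping insertion order in Python 3.7 and later. The name ordered_unique is chosen for this example.

```python
numbers_list = [1, 5, 2, 3, 5, 1, 4, 2, 2, 5]

# dict keys are unique and, in Python 3.7+, keep insertion order,
# so each number appears once at the position of its first occurrence.
ordered_unique = list(dict.fromkeys(numbers_list))
print(ordered_unique)  # [1, 5, 2, 3, 4]

# sorted() is an alternative when a sorted result is wanted.
print(sorted(set(numbers_list)))  # [1, 2, 3, 4, 5]
```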

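Summary

In this lab, you learned the essential skills for working with sets in Python. You started by creating sets with several different syntaxes and saw how sets inherently enforce uniqueness. You also practiced adding elements with add() and update(), and removing elements with remove(), discard(), pop(), and clear(), noting the key differences between these methods.

You then explored the core mathematical set operations that underpin data analysis and algorithm design: union (|), intersection (&), and difference (-). Finally, you put this knowledge into practice by implementing an elegant technique for removing duplicates from a list, a task that comes up often in data cleaning and preparation. You are now equipped to use sets effectively in your Python programs.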
