How to Wait for a Python Thread to Finish


Introduction

Knowing how to wait for Python threads to finish is essential for building reliable applications. In multi-threaded programs, proper synchronization ensures that operations complete in the correct order and that resources are used efficiently.

In this lab, you will learn how to create Python threads, wait for them to complete, and handle multiple threads. These skills are fundamental for developing concurrent applications that can perform multiple tasks simultaneously while maintaining proper synchronization.



Creating Your First Python Thread

Python's threading module provides a simple way to create and manage threads. In this step, you will learn how to create a basic thread and observe its behavior.

Understanding Threads

A thread is a separate flow of execution in a program. When you run a Python script, it starts with a single thread called the main thread. By creating additional threads, your program can perform multiple tasks concurrently.
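To make the idea of a main thread and an additional thread concrete, here is a minimal sketch. The function name report, the list names, and the thread name "worker-1" are assumptions chosen for illustration; threading.current_thread() and threading.main_thread() are standard library calls.

```python
import threading

names = []  # filled in by the worker thread

def report():
    # Runs inside the worker; current_thread() is the Thread executing this call.
    names.append(threading.current_thread().name)

worker = threading.Thread(target=report, name="worker-1")
worker.start()
worker.join()  # wait for the worker so the append has definitely happened

print("Initial thread:", threading.main_thread().name)
print("Worker reported:", names)
```

The join() call used here is covered in detail later in this lab.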

Threads are useful for:

  • Running time-consuming operations without blocking the main program
  • Overlapping tasks that spend most of their time waiting (such as network or disk I/O) to improve throughput
  • Handling multiple client connections in a server application

Creating a Simple Thread

Let's start by creating a simple Python script that demonstrates how to create and start a thread.

  1. Open a new file in the editor by clicking on the "File" menu, selecting "New File", and then saving it as simple_thread.py in the /home/labex/project directory.

  2. Add the following code to the file:

import threading
import time

def print_numbers():
    """Function that prints numbers from 1 to 5 with a delay."""
    for i in range(1, 6):
        print(f"Number {i} from thread")
        time.sleep(1)  ## Sleep for 1 second

## Create a thread that targets the print_numbers function
number_thread = threading.Thread(target=print_numbers)

## Start the thread
print("Starting the thread...")
number_thread.start()

## Main thread continues execution
print("Main thread continues to run...")
print("Main thread is doing other work...")

## Sleep for 2 seconds to demonstrate both threads running concurrently
time.sleep(2)
print("Main thread finished its work!")

  3. Save the file by pressing Ctrl+S or clicking on "File" > "Save".

  4. Run the script by opening a terminal (if not already open) and executing:

python3 /home/labex/project/simple_thread.py

You should see output similar to this:

Starting the thread...
Main thread continues to run...
Main thread is doing other work...
Number 1 from thread
Number 2 from thread
Main thread finished its work!
Number 3 from thread
Number 4 from thread
Number 5 from thread

Analyzing What Happened

In this example:

  1. We imported the threading and time modules.
  2. We defined a function print_numbers() that prints numbers from 1 to 5 with a 1-second delay between each.
  3. We created a thread object, specifying the function to run using the target parameter.
  4. We started the thread using the start() method.
  5. The main thread continued its execution, printing messages and sleeping for 2 seconds.
  6. Both the main thread and our number thread ran concurrently, which is why the output is interleaved.

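Notice that the main thread finished its work before the number thread printed all its numbers, because the two threads run independently. The program itself did not exit at that point: by default, Python waits for all non-daemon threads to finish before the interpreter shuts down, which is why numbers 3 through 5 still appear after the main thread's last message. What the main thread cannot do yet is tell when the number thread is done.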

In the next step, you will learn how to wait for a thread to complete using the join() method.

Waiting for a Thread to Complete with join()

In the previous step, you created a thread that ran independently of the main thread. However, there are many situations where you need to wait for a thread to finish its work before proceeding with the rest of your program. This is where the join() method becomes useful.

Understanding the join() Method

The join() method of a thread object blocks the calling thread (usually the main thread) until the thread whose join() method is called terminates. This is essential when:

  • The main thread needs results from a worker thread
  • You need to ensure all threads complete before exiting the program
  • The order of operations matters for your application logic

Creating a Thread and Waiting for it to Complete

Let's modify our previous example to demonstrate how to wait for a thread to complete using the join() method.

  1. Create a new file named join_thread.py in the /home/labex/project directory.

  2. Add the following code to the file:

import threading
import time

def calculate_sum(numbers):
    """Function that calculates the sum of numbers with a delay."""
    print("Starting the calculation...")
    time.sleep(3)  ## Simulate a time-consuming calculation
    result = sum(numbers)
    print(f"Calculation result: {result}")
    return result  ## Note: the Thread object discards this return value

## Create a list of numbers
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

## Create a thread that targets the calculate_sum function
calculation_thread = threading.Thread(target=calculate_sum, args=(numbers,))

## Start the thread
print("Main thread: Starting the calculation thread...")
calculation_thread.start()

## Do some other work in the main thread
print("Main thread: Doing some other work while waiting...")
for i in range(5):
    print(f"Main thread: Working... ({i+1}/5)")
    time.sleep(0.5)

## Wait for the calculation thread to complete
print("Main thread: Waiting for the calculation thread to finish...")
calculation_thread.join()
print("Main thread: Calculation thread has finished!")

## Continue with the main thread
print("Main thread: Continuing with the rest of the program...")

  3. Save the file and run it with the following command:

python3 /home/labex/project/join_thread.py

You should see output similar to this:

Main thread: Starting the calculation thread...
Starting the calculation...
Main thread: Doing some other work while waiting...
Main thread: Working... (1/5)
Main thread: Working... (2/5)
Main thread: Working... (3/5)
Main thread: Working... (4/5)
Main thread: Working... (5/5)
Main thread: Waiting for the calculation thread to finish...
Calculation result: 55
Main thread: Calculation thread has finished!
Main thread: Continuing with the rest of the program...

The Importance of join()

In this example:

  1. We created a thread that performs a calculation (summing numbers).
  2. The main thread did some other work concurrently.
  3. When the main thread needed to ensure the calculation was complete, it called calculation_thread.join().
  4. The join() method caused the main thread to wait until the calculation thread finished.
  5. After the calculation thread completed, the main thread continued its execution.

This pattern is very useful when you need to ensure that all threaded tasks are completed before proceeding with the rest of your program. Without join(), the main thread would carry on immediately and might rely on work that the worker threads have not finished yet.
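One case where join() matters is when the main thread needs a result from a worker. A Thread ignores the return value of its target, so the result has to be handed back some other way. The following is a minimal sketch, assuming a shared dictionary named results (a hypothetical name) that the worker writes into:

```python
import threading

def calculate_sum(numbers, results):
    # The return value of a thread target is discarded,
    # so store the result in a container the caller can read.
    results["sum"] = sum(numbers)

results = {}
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

calculation_thread = threading.Thread(target=calculate_sum, args=(numbers, results))
calculation_thread.start()
calculation_thread.join()  # after this line the worker has finished writing

print(f"Sum computed by the worker: {results['sum']}")  # prints 55
```

Reading results before join() returns would risk a KeyError, because the worker might not have stored the value yet.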

Using join() with a Timeout

Sometimes, you might want to wait for a thread but not indefinitely. The join() method accepts an optional timeout parameter that specifies the maximum number of seconds to wait.

Let's modify our code to demonstrate this:

  1. Create a new file named join_timeout.py in the /home/labex/project directory.

  2. Add the following code:

import threading
import time

def long_running_task():
    """A function that simulates a very long-running task."""
    print("Long-running task started...")
    time.sleep(10)  ## Simulate a 10-second task
    print("Long-running task completed!")

## Create and start the thread
task_thread = threading.Thread(target=long_running_task)
task_thread.start()

## Wait for the thread to complete, but only for up to 3 seconds
print("Main thread: Waiting for up to 3 seconds...")
task_thread.join(timeout=3)

## Check if the thread is still running
if task_thread.is_alive():
    print("Main thread: The task is still running, but I'm continuing anyway!")
else:
    print("Main thread: The task has completed within the timeout period.")

## Continue with the main thread
print("Main thread: Continuing with other operations...")
## Let's sleep a bit to see the long-running task complete
time.sleep(8)
print("Main thread: Finished.")

  3. Save the file and run it:

python3 /home/labex/project/join_timeout.py

The output should look like this:

Long-running task started...
Main thread: Waiting for up to 3 seconds...
Main thread: The task is still running, but I'm continuing anyway!
Main thread: Continuing with other operations...
Long-running task completed!
Main thread: Finished.

In this example, the main thread waits for up to 3 seconds for the task thread to complete. Since the task takes 10 seconds, the main thread continues after the timeout, while the task thread keeps running in the background.

This approach is useful when you want to give threads a chance to complete, but need to continue regardless after a certain amount of time.
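Because join() always returns None, is_alive() is the only way to tell whether a timed join ended because the thread finished or because the timeout expired. One possible use of this is a loop that waits in short slices and reports progress in between. This is a sketch with shortened delays; the counter name timed_out_waits is an assumption for illustration.

```python
import threading
import time

def long_running_task():
    # Shortened stand-in for the 10-second task above.
    time.sleep(0.5)

task_thread = threading.Thread(target=long_running_task)
task_thread.start()

timed_out_waits = 0
while task_thread.is_alive():
    task_thread.join(timeout=0.1)  # wait at most 0.1 s per slice
    if task_thread.is_alive():
        # join() returned because the timeout expired, not because the thread ended
        timed_out_waits += 1
        print(f"Main thread: still waiting ({timed_out_waits})...")

print("Main thread: the task has finished.")
```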

Working with Multiple Threads

In real-world applications, you often need to work with multiple threads simultaneously. This step will teach you how to create, manage, and synchronize multiple threads in Python.

Creating Multiple Threads

When dealing with multiple similar tasks, it's common to create multiple threads to process them concurrently. This can significantly improve performance, especially for I/O-bound operations like downloading files or making network requests.

Let's create an example that uses multiple threads to process a list of tasks:

  1. Create a new file named multiple_threads.py in the /home/labex/project directory.

  2. Add the following code:

import threading
import time
import random

def process_task(task_id):
    """Function to process a single task."""
    print(f"Starting task {task_id}...")
    ## Simulate variable processing time
    processing_time = random.uniform(1, 3)
    time.sleep(processing_time)
    print(f"Task {task_id} completed in {processing_time:.2f} seconds.")
    return task_id

## List of tasks to process
tasks = list(range(1, 6))  ## Tasks with IDs 1 through 5

## Create a list to store our threads
threads = []

## Create and start a thread for each task
for task_id in tasks:
    thread = threading.Thread(target=process_task, args=(task_id,))
    threads.append(thread)
    print(f"Created thread for task {task_id}")
    thread.start()

print(f"All {len(threads)} threads have been started")

## Wait for all threads to complete
for thread in threads:
    thread.join()

print("All tasks have been completed!")

  3. Save the file and run it:

python3 /home/labex/project/multiple_threads.py

The output will vary each time due to the random processing times, but should look similar to this:

Created thread for task 1
Starting task 1...
Created thread for task 2
Starting task 2...
Created thread for task 3
Starting task 3...
Created thread for task 4
Starting task 4...
Created thread for task 5
Starting task 5...
All 5 threads have been started
Task 1 completed in 1.23 seconds.
Task 5 completed in 1.35 seconds.
Task 3 completed in 1.45 seconds.
Task 2 completed in 1.97 seconds.
Task 4 completed in 2.12 seconds.
All tasks have been completed!

Understanding the Execution Flow

In this example:

  1. We defined a function process_task() that simulates processing a task with a random duration.
  2. We created a list of task IDs (1 to 5).
  3. For each task, we created a thread, stored it in a list, and started it.
  4. After starting all threads, we used a second loop with join() to wait for each thread to complete.
  5. Only after all threads completed did we print the final message.

This pattern is very useful when you have a batch of independent tasks that can be processed in parallel.
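To see why this helps for tasks that mostly wait, you can time the whole batch. The sketch below assumes each task is a plain time.sleep() of 0.2 seconds (a stand-in for I/O) and compares the elapsed time with what five tasks would take one after another; the names delay, elapsed, and sequential_estimate are illustrative.

```python
import threading
import time

def process_task(task_id, delay):
    # Stand-in for an I/O-bound task: the thread just waits.
    time.sleep(delay)

delay = 0.2
start = time.perf_counter()

threads = [threading.Thread(target=process_task, args=(task_id, delay))
           for task_id in range(1, 6)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()  # returns immediately for threads that already finished

elapsed = time.perf_counter() - start
sequential_estimate = delay * len(threads)
print(f"Elapsed: {elapsed:.2f}s (one after another: about {sequential_estimate:.2f}s)")
```

The order in which you join the threads does not change the total wait: the loop ends once the slowest thread has finished.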

Thread Pool Executors

For more advanced thread management, Python's concurrent.futures module provides the ThreadPoolExecutor class. This creates a pool of worker threads that can be reused, which is more efficient than creating and destroying threads for each task.

Let's rewrite our example using a thread pool:

  1. Create a new file named thread_pool.py in the /home/labex/project directory.

  2. Add the following code:

import concurrent.futures
import time
import random

def process_task(task_id):
    """Function to process a single task."""
    print(f"Starting task {task_id}...")
    ## Simulate variable processing time
    processing_time = random.uniform(1, 3)
    time.sleep(processing_time)
    print(f"Task {task_id} completed in {processing_time:.2f} seconds.")
    return f"Result of task {task_id}"

## List of tasks to process
tasks = list(range(1, 6))  ## Tasks with IDs 1 through 5

## Create a ThreadPoolExecutor
with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    ## Submit all tasks and store the Future objects
    print(f"Submitting {len(tasks)} tasks to the thread pool with 3 workers...")
    future_to_task = {executor.submit(process_task, task_id): task_id for task_id in tasks}

    ## As each task completes, get its result
    for future in concurrent.futures.as_completed(future_to_task):
        task_id = future_to_task[future]
        try:
            result = future.result()
            print(f"Got result from task {task_id}: {result}")
        except Exception as e:
            print(f"Task {task_id} generated an exception: {e}")

print("All tasks have been processed!")

  3. Save the file and run it:

python3 /home/labex/project/thread_pool.py

The output will again vary due to random processing times, but should look similar to this:

Submitting 5 tasks to the thread pool with 3 workers...
Starting task 1...
Starting task 2...
Starting task 3...
Task 2 completed in 1.15 seconds.
Starting task 4...
Got result from task 2: Result of task 2
Task 1 completed in 1.82 seconds.
Starting task 5...
Got result from task 1: Result of task 1
Task 3 completed in 2.25 seconds.
Got result from task 3: Result of task 3
Task 4 completed in 1.45 seconds.
Got result from task 4: Result of task 4
Task 5 completed in 1.67 seconds.
Got result from task 5: Result of task 5
All tasks have been processed!

Benefits of Thread Pools

The thread pool approach offers several advantages:

  1. Resource Management: It limits the number of threads that can run simultaneously, preventing system resource exhaustion.
  2. Task Scheduling: It handles the scheduling of tasks automatically, starting new tasks as threads become available.
  3. Result Collection: It provides convenient ways to collect results from completed tasks.
  4. Exception Handling: It makes handling exceptions in threads more straightforward.

In our example, we set max_workers=3, meaning only 3 threads will run at once, even though we have 5 tasks. As threads complete their tasks, they are reused for the remaining tasks.

Thread pools are particularly useful when you have many more tasks than you want threads running simultaneously, or when tasks are being continuously generated.
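If you want results in the order the tasks were submitted rather than the order they finish, executor.map() is an alternative to as_completed(). The following is a minimal sketch; the delays are made up so that later tasks finish first, which shows that the result order still follows the input order.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def process_task(task_id):
    # Illustrative delay: higher task IDs sleep for less time.
    time.sleep(0.05 * (6 - task_id))
    return f"Result of task {task_id}"

with ThreadPoolExecutor(max_workers=3) as executor:
    # map() yields results in input order, waiting for each one as needed.
    results = list(executor.map(process_task, range(1, 6)))

for line in results:
    print(line)
```

Leaving the with block waits for all submitted tasks, so no explicit join() is needed.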

Thread Timeouts and Daemon Threads

In this final step, you will learn about two important concepts in thread management: setting timeouts and using daemon threads. These techniques give you more control over how threads behave and interact with the main program.

Working with Thread Timeouts

As you learned in Step 2, the join() method accepts a timeout parameter. This is useful when you want to wait for a thread to complete, but only up to a certain point.

Let's create a more practical example where we implement a function that attempts to fetch data with a timeout:

  1. Create a new file named thread_with_timeout.py in the /home/labex/project directory.

  2. Add the following code:

import threading
import time
import random

def fetch_data(data_id):
    """Simulate fetching data that might take varying amounts of time."""
    print(f"Fetching data #{data_id}...")

    ## Simulate different fetch times, occasionally very long
    fetch_time = random.choices([1, 8], weights=[0.8, 0.2])[0]
    time.sleep(fetch_time)

    if fetch_time > 5:  ## Simulate a slow fetch
        print(f"Data #{data_id}: Fetch took too long!")
        return None
    else:
        print(f"Data #{data_id}: Fetch completed in {fetch_time} seconds!")
        return f"Data content for #{data_id}"

def fetch_with_timeout(data_id, timeout=3):
    """Fetch data with a timeout."""
    result = [None]  ## Using a list to store result from the thread

    def target_func():
        result[0] = fetch_data(data_id)

    ## Create and start the thread
    thread = threading.Thread(target=target_func)
    thread.start()

    ## Wait for the thread with a timeout
    thread.join(timeout=timeout)

    if thread.is_alive():
        print(f"Data #{data_id}: Fetch timed out after {timeout} seconds!")
        return "TIMEOUT"
    else:
        return result[0]

## Try to fetch several pieces of data
for i in range(1, 6):
    print(f"\nAttempting to fetch data #{i}")
    result = fetch_with_timeout(i, timeout=3)
    if result == "TIMEOUT":
        print(f"Main thread: Fetch for data #{i} timed out, moving on...")
    elif result is None:
        print(f"Main thread: Fetch for data #{i} completed but returned no data.")
    else:
        print(f"Main thread: Successfully fetched: {result}")

print("\nAll fetch attempts completed!")

  3. Save the file and run it:

python3 /home/labex/project/thread_with_timeout.py

The output will vary, but should look similar to this:

Attempting to fetch data #1
Fetching data #1...
Data #1: Fetch completed in 1 seconds!
Main thread: Successfully fetched: Data content for #1

Attempting to fetch data #2
Fetching data #2...
Data #2: Fetch completed in 1 seconds!
Main thread: Successfully fetched: Data content for #2

Attempting to fetch data #3
Fetching data #3...
Data #3: Fetch timed out after 3 seconds!
Main thread: Fetch for data #3 timed out, moving on...

Attempting to fetch data #4
Fetching data #4...
Data #4: Fetch completed in 1 seconds!
Main thread: Successfully fetched: Data content for #4

Attempting to fetch data #5
Fetching data #5...
Data #5: Fetch completed in 1 seconds!
Main thread: Successfully fetched: Data content for #5

All fetch attempts completed!
Data #3: Fetch took too long!

This example demonstrates:

  1. A function that attempts to fetch data and might be slow
  2. A wrapper function that uses threading with a timeout
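  3. How to handle timeouts gracefully and continue with other operations. Note that join(timeout=...) only stops waiting; it does not cancel the thread, which keeps running in the background until fetch_data() returns.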
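The same idea can be expressed with a thread pool, where future.result(timeout=...) raises a TimeoutError instead of requiring an is_alive() check. This is a sketch under stated assumptions: fetch_data here takes an explicit delay argument (unlike the version above) so the behavior is deterministic, and the delays are shortened.

```python
import concurrent.futures
import time

def fetch_data(data_id, delay):
    # Hypothetical fetch: sleeps for the given delay, then returns a string.
    time.sleep(delay)
    return f"Data content for #{data_id}"

def fetch_with_timeout(executor, data_id, delay, timeout):
    future = executor.submit(fetch_data, data_id, delay)
    try:
        return future.result(timeout=timeout)
    except concurrent.futures.TimeoutError:
        # The worker is not cancelled; we only stop waiting for it.
        return "TIMEOUT"

with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
    fast = fetch_with_timeout(executor, 1, delay=0.05, timeout=1.0)
    slow = fetch_with_timeout(executor, 2, delay=0.5, timeout=0.1)
# Leaving the with block waits for the slow worker to finish.

print(fast)
print(slow)
```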

Understanding Daemon Threads

In Python, daemon threads are threads that run in the background. The key difference between daemon and non-daemon threads is that Python will not wait for daemon threads to complete before exiting. This is useful for threads that perform background tasks that should not prevent the program from exiting.

Let's create an example to demonstrate daemon threads:

  1. Create a new file named daemon_threads.py in the /home/labex/project directory.

  2. Add the following code:

import threading
import time

def background_task(name, interval):
    """A task that runs in the background at regular intervals."""
    count = 0
    while True:
        count += 1
        print(f"{name}: Iteration {count} at {time.strftime('%H:%M:%S')}")
        time.sleep(interval)

def main_task():
    """The main task that runs for a set amount of time."""
    print("Main task: Starting...")
    time.sleep(5)
    print("Main task: Completed!")

## Create two background threads
print("Creating background monitoring threads...")
monitor1 = threading.Thread(target=background_task, args=("Monitor-1", 1), daemon=True)
monitor2 = threading.Thread(target=background_task, args=("Monitor-2", 2), daemon=True)

## Start the background threads
monitor1.start()
monitor2.start()

print("Background monitors started, now starting main task...")

## Run the main task
main_task()

print("Main task completed, program will exit without waiting for daemon threads.")
print("Daemon threads will be terminated when the program exits.")

  3. Save the file and run it:

python3 /home/labex/project/daemon_threads.py

The output should look similar to this:

Creating background monitoring threads...
Background monitors started, now starting main task...
Main task: Starting...
Monitor-1: Iteration 1 at 14:25:10
Monitor-2: Iteration 1 at 14:25:10
Monitor-1: Iteration 2 at 14:25:11
Monitor-1: Iteration 3 at 14:25:12
Monitor-2: Iteration 2 at 14:25:12
Monitor-1: Iteration 4 at 14:25:13
Monitor-1: Iteration 5 at 14:25:14
Monitor-2: Iteration 3 at 14:25:14
Main task: Completed!
Main task completed, program will exit without waiting for daemon threads.
Daemon threads will be terminated when the program exits.

In this example:

  1. We created two daemon threads that run continuously, printing messages at regular intervals.
  2. We set daemon=True when creating the threads, which marks them as daemon threads.
  3. The main thread runs for 5 seconds and then exits.
  4. When the main thread exits, the program terminates, and the daemon threads are automatically terminated as well.
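Besides the daemon=True constructor argument, the flag can also be set through the thread's daemon attribute, but only before start() is called. The sketch below illustrates this with a short, hypothetical background_task; the variable name error is also just for illustration.

```python
import threading
import time

def background_task():
    # Hypothetical stand-in for a monitoring loop; it just sleeps briefly.
    time.sleep(0.2)

monitor = threading.Thread(target=background_task)
monitor.daemon = True       # allowed: the thread has not been started yet
monitor.start()

error = None
try:
    monitor.daemon = False  # too late: the flag is fixed once start() has been called
except RuntimeError as exc:
    error = exc

monitor.join()              # a daemon thread can still be joined explicitly
print(f"daemon={monitor.daemon}, error={error!r}")
```

Joining a daemon thread is a reasonable choice when you usually want its work to finish but do not want it to block program exit if something goes wrong.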

Non-Daemon vs. Daemon Threads

To understand the difference better, let's create one more example that compares daemon and non-daemon threads:

  1. Create a new file named daemon_comparison.py in the /home/labex/project directory.

  2. Add the following code:

import threading
import time

def task(name, seconds, daemon=False):
    """A task that runs for a specified amount of time."""
    print(f"{name} starting {'(daemon)' if daemon else '(non-daemon)'}")
    time.sleep(seconds)
    print(f"{name} finished after {seconds} seconds")

## Create a non-daemon thread that runs for 8 seconds
non_daemon_thread = threading.Thread(
    target=task,
    args=("Non-daemon thread", 8, False),
    daemon=False  ## This is the default, so it's not actually needed
)

## Create a daemon thread that runs for 8 seconds
daemon_thread = threading.Thread(
    target=task,
    args=("Daemon thread", 8, True),
    daemon=True
)

## Start both threads
non_daemon_thread.start()
daemon_thread.start()

## Let the main thread run for 3 seconds
print("Main thread will run for 3 seconds...")
time.sleep(3)

## Check which threads are still running
print("\nAfter 3 seconds:")
print(f"Daemon thread is alive: {daemon_thread.is_alive()}")
print(f"Non-daemon thread is alive: {non_daemon_thread.is_alive()}")

print("\nMain thread is finishing. Here's what will happen:")
print("1. The program will wait for all non-daemon threads to complete")
print("2. Daemon threads will be terminated when the program exits")

print("\nWaiting for non-daemon threads to finish...")
## We don't need to join the non-daemon thread, Python will wait for it
## But we'll explicitly join it for clarity
non_daemon_thread.join()
print("All non-daemon threads have finished, program will exit now.")

  3. Save the file and run it:

python3 /home/labex/project/daemon_comparison.py

The output should look like this:

Non-daemon thread starting (non-daemon)
Daemon thread starting (daemon)
Main thread will run for 3 seconds...

After 3 seconds:
Daemon thread is alive: True
Non-daemon thread is alive: True

Main thread is finishing. Here's what will happen:
1. The program will wait for all non-daemon threads to complete
2. Daemon threads will be terminated when the program exits

Waiting for non-daemon threads to finish...
Non-daemon thread finished after 8 seconds
All non-daemon threads have finished, program will exit now.

Key observations:

  1. Both threads start and run concurrently.
  2. After 3 seconds, both threads are still running.
  3. The program waits for the non-daemon thread to finish (after 8 seconds).
  4. The daemon thread is still running when the program exits, but it gets terminated.
  5. The daemon thread never gets to print its completion message because it's terminated when the program exits.

When to Use Daemon Threads

Daemon threads are useful for:

  • Background monitoring tasks
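  • Periodic housekeeping that is safe to interrupt (daemon threads are stopped abruptly at exit, so they should not hold cleanup work that must finish)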
  • Services that should run for the duration of the program but not prevent it from exiting
  • Timer threads that trigger events at regular intervals

Non-daemon threads are appropriate for:

  • Critical operations that must complete
  • Tasks that should not be interrupted
  • Operations that must finish cleanly before the program exits

Understanding when to use each type is an important part of designing robust multi-threaded applications.

Summary

In this lab, you have learned the essential techniques for working with Python threads and how to wait for them to complete. Here is a summary of the key concepts covered:

  1. Creating and Starting Threads: You learned how to create a thread object, specify the target function, and start its execution with the start() method.

  2. Waiting for Threads with join(): You discovered how to use the join() method to wait for a thread to complete before continuing with the main program, ensuring proper synchronization.

  3. Working with Multiple Threads: You practiced creating and managing multiple threads, both manually and using the ThreadPoolExecutor class for more efficient thread management.

  4. Thread Timeouts and Daemon Threads: You explored advanced topics including setting timeouts for thread operations and using daemon threads for background tasks.

These skills provide a foundation for developing multi-threaded applications in Python. Multi-threading enables your programs to perform multiple tasks concurrently, improving performance and responsiveness, especially for I/O-bound operations.

As you continue to work with threads, remember these best practices:

  • Use threads for I/O-bound tasks, not CPU-bound tasks (consider using multiprocessing for the latter)
  • Be mindful of shared resources and use appropriate synchronization mechanisms
  • Consider using higher-level abstractions like ThreadPoolExecutor for managing multiple threads
  • Use daemon threads for background tasks that should not prevent the program from exiting
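As an illustration of the point about shared resources, here is a minimal sketch that protects a shared counter with threading.Lock. The names totals, totals_lock, and add_many are assumptions made for this example.

```python
import threading

totals = {"count": 0}           # state shared by all threads
totals_lock = threading.Lock()  # guards every update to totals

def add_many(n):
    for _ in range(n):
        # Only one thread at a time may run the read-modify-write below.
        with totals_lock:
            totals["count"] += 1

threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()  # wait for every worker before reading the total

print(totals["count"])  # prints 40000
```

Joining all threads before reading the counter is what guarantees the final value is complete; the lock guarantees that no increment is lost along the way.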

With these skills and practices, you are now equipped to build more efficient and responsive Python applications using multi-threading techniques.