How to Resolve the 'ModuleNotFoundError' When Building a Docker Image

Introduction

When containerizing Python applications with Docker, developers often encounter a ModuleNotFoundError. This error occurs when Python cannot locate a module or package that your application requires, and it usually surfaces when the container starts rather than during the build. For Docker beginners, this can be particularly challenging to troubleshoot.

In this hands-on lab, you will create a simple Python application, containerize it with Docker, encounter the ModuleNotFoundError, and learn practical ways to resolve it. By the end, you will understand how to properly manage Python dependencies in Docker images and avoid this common issue in your projects.


Creating a Simple Python Application

Let's create a basic Python application and set up Docker to run it. This will help us understand how the ModuleNotFoundError occurs in a Docker environment.

Understanding the Python Application Structure

First, let's create a project directory and navigate to it:

mkdir -p ~/project/docker-python-app
cd ~/project/docker-python-app

Now, let's create a simple Python application that imports a third-party module. We'll create two files:

  1. A main application file
  2. A requirements file to list dependencies

Create the main application file:

nano app.py

Add the following code to app.py:

import requests

def main():
    response = requests.get("https://www.example.com")
    print(f"Status code: {response.status_code}")
    print(f"Content length: {len(response.text)} characters")

if __name__ == "__main__":
    main()

This simple script uses the requests library to make an HTTP request to example.com and print some basic information about the response.

Now, let's create a requirements file:

nano requirements.txt

Add the following line to requirements.txt:

requests==2.28.1

Creating a Basic Dockerfile

Now, let's create a simple Dockerfile that will demonstrate the ModuleNotFoundError:

nano Dockerfile

Add the following content to the Dockerfile:

FROM python:3.9-slim

WORKDIR /app

COPY app.py .

## We're intentionally NOT copying or installing requirements
## to demonstrate the ModuleNotFoundError

CMD ["python", "app.py"]

This Dockerfile:

  • Uses the Python 3.9 slim image as a base
  • Sets the working directory to /app
  • Copies our application file
  • Specifies the command to run our application

Notice that we deliberately didn't copy the requirements.txt file or install any dependencies. This will cause the ModuleNotFoundError when we try to run the container.

Building and Running the Docker Image

Let's build the Docker image:

docker build -t python-app-error .

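You should see output similar to this (shown in the legacy builder's format; Docker versions that use BuildKit by default print the same steps in a different layout):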

Sending build context to Docker daemon  3.072kB
Step 1/4 : FROM python:3.9-slim
 ---> 3a4bac80b3ea
Step 2/4 : WORKDIR /app
 ---> Using cache
 ---> a8a4f574dbf5
Step 3/4 : COPY app.py .
 ---> Using cache
 ---> 7d5ae315f84b
Step 4/4 : CMD ["python", "app.py"]
 ---> Using cache
 ---> f5a9b09d7d8e
Successfully built f5a9b09d7d8e
Successfully tagged python-app-error:latest

Now, let's run the Docker container:

docker run python-app-error

You should see an error message similar to this:

Traceback (most recent call last):
  File "/app/app.py", line 1, in <module>
    import requests
ModuleNotFoundError: No module named 'requests'

This is the ModuleNotFoundError we're focusing on in this lab. Note that the build itself succeeded: nothing in the Dockerfile runs app.py, so the missing requests module only surfaces when the container starts and Python executes the import.
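
The same failure can be reproduced and inspected in plain Python, without Docker. The following is a minimal sketch, not part of the lab: try_import is a hypothetical helper name. It relies on the name attribute of ModuleNotFoundError, which holds the name of the module Python could not locate.

```python
import importlib


def try_import(module_name):
    """Return (True, None) if the module imports, else (False, missing_name)."""
    try:
        importlib.import_module(module_name)
        return True, None
    except ModuleNotFoundError as exc:
        # exc.name is the module Python failed to locate
        return False, exc.name


if __name__ == "__main__":
    # "json" ships with Python; "requests" is only present if it was installed
    for name in ("json", "requests"):
        ok, missing = try_import(name)
        print(f"{name}: {'ok' if ok else 'missing ' + missing}")
```

Run inside the python-app-error image, a check like this would report requests as missing, matching the traceback above.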

Understanding and Fixing the ModuleNotFoundError

Now that we've encountered the ModuleNotFoundError, let's understand why it happened and how to fix it.

Why Does ModuleNotFoundError Occur in Docker?

The ModuleNotFoundError occurs in Docker for several common reasons:

  1. Missing dependency installation: We didn't install the required Python packages in the Docker image.
  2. Incorrect PYTHONPATH: The Python interpreter can't find the modules in the expected locations.
  3. File structure issues: The application code structure doesn't match how imports are being done.

In our case, the error occurred because we didn't install the requests package in our Docker image. Unlike our local development environment where we might have this package installed globally, Docker containers are isolated environments.
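
To tell these causes apart, it helps to see whether Python can find a module at all and which directories it searched. The sketch below assumes a hypothetical helper named diagnose and only handles top-level module names; it uses importlib.util.find_spec, which returns None when a top-level module cannot be found.

```python
import importlib.util
import sys


def diagnose(module_name):
    """Report whether a top-level module is findable and where Python looked."""
    spec = importlib.util.find_spec(module_name)
    return {
        "module": module_name,
        "found": spec is not None,
        # origin is the file the module would be loaded from, if any
        "origin": getattr(spec, "origin", None),
        # the directories Python searches, in order
        "search_path": list(sys.path),
    }


if __name__ == "__main__":
    report = diagnose("requests")
    print(f"found: {report['found']}, origin: {report['origin']}")
    for entry in report["search_path"]:
        print(f"  searched: {entry}")
```

If found is False and the search path looks correct, the package is most likely not installed (cause 1). If the package is installed somewhere that does not appear in the search path, causes 2 or 3 are more likely.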

Method 1: Installing Dependencies Using pip in the Dockerfile

Let's modify our Dockerfile to install the required dependencies:

nano Dockerfile

Update the Dockerfile with the following content:

FROM python:3.9-slim

WORKDIR /app

COPY app.py .

## Fix Method 1: Directly install the required package
RUN pip install requests==2.28.1

CMD ["python", "app.py"]

Let's build and run this updated image:

docker build -t python-app-fixed-1 .

You should see output that includes the package installation:

Sending build context to Docker daemon  3.072kB
Step 1/5 : FROM python:3.9-slim
 ---> 3a4bac80b3ea
Step 2/5 : WORKDIR /app
 ---> Using cache
 ---> a8a4f574dbf5
Step 3/5 : COPY app.py .
 ---> Using cache
 ---> 7d5ae315f84b
Step 4/5 : RUN pip install requests==2.28.1
 ---> Running in 5a6d7e8f9b0c
Collecting requests==2.28.1
  Downloading requests-2.28.1-py3-none-any.whl (62 kB)
Collecting charset-normalizer<3,>=2
  Downloading charset_normalizer-2.1.1-py3-none-any.whl (39 kB)
Collecting certifi>=2017.4.17
  Downloading certifi-2022.9.24-py3-none-any.whl (161 kB)
Collecting idna<4,>=2.5
  Downloading idna-3.4-py3-none-any.whl (61 kB)
Collecting urllib3<1.27,>=1.21.1
  Downloading urllib3-1.26.12-py2.py3-none-any.whl (140 kB)
Installing collected packages: urllib3, idna, charset-normalizer, certifi, requests
Successfully installed certifi-2022.9.24 charset-normalizer-2.1.1 idna-3.4 requests-2.28.1 urllib3-1.26.12
 ---> 2b3c4d5e6f7g
Removing intermediate container 5a6d7e8f9b0c
Step 5/5 : CMD ["python", "app.py"]
 ---> Running in 8h9i0j1k2l3m
 ---> 3n4o5p6q7r8s
Removing intermediate container 8h9i0j1k2l3m
Successfully built 3n4o5p6q7r8s
Successfully tagged python-app-fixed-1:latest

Now let's run the fixed container:

docker run python-app-fixed-1

You should see output similar to this:

Status code: 200
Content length: 1256 characters

Great! The application now runs successfully because we installed the required dependency.

Method 2: Using requirements.txt for Dependency Management

While directly installing packages works, it's better practice to use a requirements.txt file for more organized dependency management. Let's update our Dockerfile:

nano Dockerfile

Update the Dockerfile with the following content:

FROM python:3.9-slim

WORKDIR /app

## Copy requirements first to leverage Docker cache
COPY requirements.txt .

## Fix Method 2: Use requirements.txt
RUN pip install -r requirements.txt

## Copy the rest of the application
COPY app.py .

CMD ["python", "app.py"]

This approach has several advantages:

  • It separates dependency management from code
  • It makes it easier to update dependencies
  • It follows best practices for Docker image layer caching: the dependency layer is rebuilt only when requirements.txt changes, not on every code edit
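
One way to confirm that an environment actually matches requirements.txt is to compare the pinned versions against what is installed. This is a rough sketch under stated assumptions: check_requirements is a hypothetical helper, and it only understands simple name==version lines (no extras, markers, or -r includes). It uses importlib.metadata from the standard library (Python 3.8+).

```python
from importlib import metadata


def check_requirements(path):
    """Compare 'name==version' lines in a requirements file with installed packages."""
    problems = []
    with open(path, encoding="utf-8") as fh:
        for raw in fh:
            line = raw.split("#", 1)[0].strip()  # drop comments
            if not line or "==" not in line:
                continue  # this sketch only handles simple pinned lines
            name, wanted = (part.strip() for part in line.split("==", 1))
            try:
                installed = metadata.version(name)
            except metadata.PackageNotFoundError:
                problems.append(f"{name}: not installed")
                continue
            if installed != wanted:
                problems.append(f"{name}: installed {installed}, wanted {wanted}")
    return problems
```

An empty list means every pinned package was found at the requested version.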

Let's build and run this updated image:

docker build -t python-app-fixed-2 .

You should see output similar to the previous build, but this time it's using requirements.txt:

Sending build context to Docker daemon  4.096kB
Step 1/6 : FROM python:3.9-slim
 ---> 3a4bac80b3ea
Step 2/6 : WORKDIR /app
 ---> Using cache
 ---> a8a4f574dbf5
Step 3/6 : COPY requirements.txt .
 ---> Using cache
 ---> b2c3d4e5f6g7
Step 4/6 : RUN pip install -r requirements.txt
 ---> Running in h8i9j0k1l2m3
Collecting requests==2.28.1
  Using cached requests-2.28.1-py3-none-any.whl (62 kB)
Collecting charset-normalizer<3,>=2
  Using cached charset_normalizer-2.1.1-py3-none-any.whl (39 kB)
Collecting idna<4,>=2.5
  Using cached idna-3.4-py3-none-any.whl (61 kB)
Collecting certifi>=2017.4.17
  Using cached certifi-2022.9.24-py3-none-any.whl (161 kB)
Collecting urllib3<1.27,>=1.21.1
  Using cached urllib3-1.26.12-py2.py3-none-any.whl (140 kB)
Installing collected packages: urllib3, idna, charset-normalizer, certifi, requests
Successfully installed certifi-2022.9.24 charset-normalizer-2.1.1 idna-3.4 requests-2.28.1 urllib3-1.26.12
 ---> n4o5p6q7r8s9
Removing intermediate container h8i9j0k1l2m3
Step 5/6 : COPY app.py .
 ---> t0u1v2w3x4y5
Step 6/6 : CMD ["python", "app.py"]
 ---> Running in z5a6b7c8d9e0
 ---> f1g2h3i4j5k6
Removing intermediate container z5a6b7c8d9e0
Successfully built f1g2h3i4j5k6
Successfully tagged python-app-fixed-2:latest

Now let's run the container:

docker run python-app-fixed-2

You should see the same successful output:

Status code: 200
Content length: 1256 characters

You've successfully fixed the ModuleNotFoundError using two different methods!
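
A missing dependency can also be caught before the container ever starts by scanning a source file for imports that the current environment cannot resolve. The sketch below is one possible approach, not the lab's method: top_level_imports and missing_imports are hypothetical helpers built on the standard ast and importlib.util modules, and they ignore relative imports.

```python
import ast
import importlib.util


def top_level_imports(source):
    """Return the set of top-level module names imported by the source code."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 means an absolute import; relative imports are skipped
            names.add(node.module.split(".")[0])
    return names


def missing_imports(source):
    """Return imported top-level modules that cannot be found in this environment."""
    return sorted(
        name for name in top_level_imports(source)
        if importlib.util.find_spec(name) is None
    )
```

Calling missing_imports on the text of app.py in an environment without requests would return ['requests']. A script built around this idea could be executed in a RUN step so that the build fails early, though that wiring is left as an assumption here.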

Best Practices for Avoiding ModuleNotFoundError

Now that we've fixed the immediate issue, let's look at some best practices to avoid ModuleNotFoundError in Docker images.

Understanding Docker Caching for Efficient Builds

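Docker builds images in layers. Instructions that change the filesystem, such as RUN, COPY, and ADD, each produce a new layer, and when you rebuild an image Docker reuses a cached layer as long as the instruction and its inputs are unchanged. Once one layer is invalidated, every layer after it is rebuilt, so the order of instructions has a large effect on build time.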

For Python applications, you can optimize caching by:

  1. Copying and installing requirements before copying the application code
  2. Keeping frequently changing files (like application code) in the later layers
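
The effect of instruction order can be illustrated with a toy model. This is a deliberately simplified sketch and not Docker's actual cache algorithm: it assumes each layer's cache key depends on the previous layer's key, the instruction text, and the contents of any copied files. The names layer_keys, reused_layers, DEPS_FIRST, and CODE_FIRST are all hypothetical.

```python
import hashlib


def layer_keys(steps, files):
    """Toy model: each layer's key depends on the previous key, the
    instruction text, and the content of any files the instruction copies."""
    keys = []
    parent = ""
    for instruction, copied in steps:
        h = hashlib.sha256()
        h.update(parent.encode())
        h.update(instruction.encode())
        for name in copied:
            h.update(files[name].encode())
        parent = h.hexdigest()
        keys.append(parent)
    return keys


def reused_layers(steps, before, after):
    """Count leading layers whose keys are unchanged between two builds."""
    count = 0
    for old, new in zip(layer_keys(steps, before), layer_keys(steps, after)):
        if old != new:
            break  # every later layer depends on this one, so stop here
        count += 1
    return count


DEPS_FIRST = [
    ("COPY requirements.txt .", ["requirements.txt"]),
    ("RUN pip install -r requirements.txt", []),
    ("COPY app.py .", ["app.py"]),
]
CODE_FIRST = [
    ("COPY app.py .", ["app.py"]),
    ("COPY requirements.txt .", ["requirements.txt"]),
    ("RUN pip install -r requirements.txt", []),
]

# Only app.py changes between the two builds
before = {"requirements.txt": "requests==2.28.1", "app.py": "print('v1')"}
after = {"requirements.txt": "requests==2.28.1", "app.py": "print('v2')"}

print(reused_layers(DEPS_FIRST, before, after))  # 2: pip install layer is reused
print(reused_layers(CODE_FIRST, before, after))  # 0: everything is rebuilt
```

In this model, editing only app.py leaves the dependency layers untouched when requirements are copied first, but invalidates the pip install step when the code is copied first.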

Let's update our Dockerfile to follow these best practices:

nano Dockerfile

Update the Dockerfile with the following optimized content:

FROM python:3.9-slim

WORKDIR /app

## Copy requirements first for better caching
COPY requirements.txt .

## Install dependencies
RUN pip install --no-cache-dir -r requirements.txt

## Copy application code (changes more frequently)
COPY . .

## Make sure we run the application with Python's unbuffered mode for better logging
CMD ["python", "-u", "app.py"]

Let's build this optimized image:

docker build -t python-app-optimized .

And run it to verify it works:

docker run python-app-optimized

You should see the same successful output:

Status code: 200
Content length: 1256 characters

Using a .dockerignore File

To make your Docker builds more efficient, it's a good practice to use a .dockerignore file to exclude files and directories that aren't needed in the Docker image. This reduces the build context size and improves build performance.

Let's create a .dockerignore file:

nano .dockerignore

Add the following content:

__pycache__
*.pyc
*.pyo
*.pyd
.Python
.git
.gitignore
*.log
*.pot
*.env
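
To get a feel for what these patterns exclude, they can be approximated in Python. This is a rough sketch only and not Docker's real matcher: it assumes slash-free patterns are compared against the top-level component of each path, and it does not model ** wildcards, ! exceptions, or any other rule. The helper name is_ignored is hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

PATTERNS = ["__pycache__", "*.pyc", "*.pyo", "*.pyd", ".Python",
            ".git", ".gitignore", "*.log", "*.pot", "*.env"]


def is_ignored(path, patterns=PATTERNS):
    """Rough approximation for slash-free patterns only: the path is treated
    as excluded when its top-level component matches one of the patterns."""
    top = PurePosixPath(path).parts[0]
    return any(fnmatchcase(top, pattern) for pattern in patterns)


if __name__ == "__main__":
    for candidate in ("app.py", "requirements.txt", ".git/config",
                      "__pycache__/app.cpython-39.pyc", "build.log"):
        print(f"{candidate}: {'ignored' if is_ignored(candidate) else 'kept'}")
```

Under this approximation, app.py and requirements.txt are kept while Git metadata, bytecode caches, and log files at the top of the project are left out of the build context.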

Creating a More Complex Application Structure

For larger applications with multiple modules, it's important to structure your project correctly. Let's create a slightly more complex example:

mkdir -p myapp

Create the package's __init__.py file:

nano myapp/__init__.py

Leave this file empty (it just marks the directory as a Python package).

Now create a module file with some functionality:

nano myapp/utils.py

Add the following code:

def get_message():
    return "Hello from myapp.utils module!"

Now update our main application to use this module:

nano app.py

Replace the content with:

import requests
from myapp.utils import get_message

def main():
    response = requests.get("https://www.example.com")
    print(f"Status code: {response.status_code}")
    print(f"Content length: {len(response.text)} characters")
    print(get_message())

if __name__ == "__main__":
    main()

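Build and run the updated application. Because the Dockerfile now ends with COPY . ., the myapp directory is copied into /app alongside app.py, so no Dockerfile change is needed: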

docker build -t python-app-modules .
docker run python-app-modules

You should see output that includes our custom message:

Status code: 200
Content length: 1256 characters
Hello from myapp.utils module!
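
The import of myapp.utils works because python app.py places the script's directory (/app in the container) at the front of sys.path, and the myapp package sits directly inside it. The sketch below reproduces that relationship in a temporary directory. It is illustrative only: demo_pkg, build_demo_package, and import_from_dir are hypothetical names, not part of the lab.

```python
import importlib
import os
import sys
import tempfile


def import_from_dir(root, dotted_name):
    """Import dotted_name with root temporarily placed first on sys.path."""
    sys.path.insert(0, root)
    try:
        importlib.invalidate_caches()  # make sure newly created files are seen
        return importlib.import_module(dotted_name)
    finally:
        sys.path.remove(root)


def build_demo_package(root, package="demo_pkg"):
    """Create <root>/<package>/__init__.py and utils.py, mirroring myapp/."""
    pkg_dir = os.path.join(root, package)
    os.makedirs(pkg_dir)
    open(os.path.join(pkg_dir, "__init__.py"), "w").close()
    with open(os.path.join(pkg_dir, "utils.py"), "w", encoding="utf-8") as fh:
        fh.write('def get_message():\n    return "Hello from the demo package!"\n')
    return package


with tempfile.TemporaryDirectory() as root:
    name = build_demo_package(root)
    utils = import_from_dir(root, f"{name}.utils")
    print(utils.get_message())
```

If the package directory were not copied into the image, or were copied somewhere that is not on sys.path, the same import would raise ModuleNotFoundError.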

Additional Best Practices

Here are some additional best practices to avoid ModuleNotFoundError in Docker:

  1. Virtual environments: While not strictly necessary in Docker (since containers are isolated), using virtual environments can help ensure consistency between development and production.

  2. Pinned dependencies: Always specify exact versions of dependencies to ensure consistency across different environments.

  3. Multi-stage builds: For production images, consider using multi-stage builds to create smaller images with only the necessary dependencies.

  4. Regular dependency updates: Regularly update your dependencies to get security fixes and improvements.

By following these best practices, you'll minimize the chances of encountering ModuleNotFoundError in your Docker containers and create more efficient, maintainable Docker images.
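
The pinned-dependencies practice is easy to check mechanically. The following sketch assumes a hypothetical helper, unpinned_lines, that flags any requirement line not written as name==version. It uses a deliberately simple regular expression and is not a full parser for the requirements file format.

```python
import re

# name, optional [extras], then exactly '==' followed by a version
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?==[^=\s]+$")


def unpinned_lines(text):
    """Return requirement lines that are not of the form 'name==version'."""
    result = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if line and not PINNED.match(line):
            result.append(line)
    return result


if __name__ == "__main__":
    sample = "requests==2.28.1\nflask>=2.0\nurllib3\n"
    print(unpinned_lines(sample))
```

For the lab's requirements.txt, which contains only requests==2.28.1, the function returns an empty list.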

Summary

In this lab, you learned how to identify, troubleshoot, and fix the ModuleNotFoundError when working with Docker images for Python applications. You've gained practical experience in:

  • Creating a basic Python application and containerizing it with Docker
  • Understanding why ModuleNotFoundError occurs in Docker environments
  • Fixing dependency issues using direct package installation and requirements.txt
  • Implementing Docker best practices like proper layer caching and file structure
  • Creating a more complex application structure with multiple modules
  • Using .dockerignore to optimize Docker builds

These skills will help you create more reliable and maintainable Docker images for your Python applications. By following the best practices covered in this lab, you can avoid common pitfalls like ModuleNotFoundError and optimize your Docker development workflow.

Remember that proper dependency management is crucial when working with containerized applications. Always ensure your Docker images include all necessary dependencies, properly structured code, and follow best practices for efficiency and maintainability.