Implementing Best Practices for Container Reliability
Now that we understand how to troubleshoot container exits, let's implement best practices to make our containers more reliable and robust. We'll enhance our example application with proper error handling, health checks, and logging.
1. Improving Error Handling
Let's modify our Python application to handle missing environment variables gracefully:
cd ~/project/docker-exit-test
nano app.py
Update the content to:
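import os
import signal
import sys
import time

## Flush each line so `docker logs` shows output right away
sys.stdout.reconfigure(line_buffering=True)

## Get environment variables with defaults
database_url = os.environ.get('DATABASE_URL', 'sqlite:///default.db')
print(f"Connecting to database: {database_url}")

## `docker stop` sends SIGTERM, so treat it the same way as Ctrl+C
def handle_sigterm(signum, frame):
    raise KeyboardInterrupt

signal.signal(signal.SIGTERM, handle_sigterm)

## Simulate a long-running process
try:
    print("Application running... Press Ctrl+C to exit")
    counter = 0
    while True:
        counter += 1
        print(f"Application heartbeat: {counter}")
        time.sleep(10)
except KeyboardInterrupt:
    print("Application shutting down gracefully...")
    sys.exit(0)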
Save and exit the file.
This improved version:
- Uses os.environ.get() with a default value instead of raising an exception
- Implements a long-running loop to keep the container alive
- Handles graceful shutdown when terminated
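A default works well for optional settings, but some values have no sensible fallback. The sketch below shows one possible way to treat the two cases differently. The names get_setting, MissingSettingError, and API_TOKEN are hypothetical and are not part of the example application.

```python
import os


class MissingSettingError(RuntimeError):
    """Raised when a required environment variable is not set."""


def get_setting(name, default=None, required=False, environ=None):
    """Return an environment variable, falling back to a default.

    `environ` exists only to make the helper easy to test; it
    defaults to the real process environment.
    """
    env = os.environ if environ is None else environ
    value = env.get(name, default)
    if required and value is None:
        # Fail with a clear message instead of a bare KeyError
        raise MissingSettingError(
            f"Required environment variable {name} is not set"
        )
    return value
```

In app.py, a call such as get_setting('API_TOKEN', required=True) could be wrapped in a try/except at startup that prints the message and calls sys.exit(1), so the failure appears as a non-zero exit code in docker ps -a.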
Let's rebuild the image:
docker build -t exit-test-app:v2 .
And run the improved container:
docker run -d --name improved-app exit-test-app:v2
Check that the container is running:
docker ps
You should see your container running:
CONTAINER ID   IMAGE              COMMAND           CREATED          STATUS          PORTS     NAMES
c1d2e3f4g5h6   exit-test-app:v2   "python app.py"   10 seconds ago   Up 10 seconds             improved-app
View the logs to confirm it's working:
docker logs improved-app
You should see output like:
Connecting to database: sqlite:///default.db
Application running... Press Ctrl+C to exit
Application heartbeat: 1
Application heartbeat: 2
2. Implementing a Health Check in Dockerfile
Health checks let Docker run a command inside the container on a schedule and mark the container healthy or unhealthy based on that command's exit code. Let's update our Dockerfile to include a health check:
nano Dockerfile
Update it to:
FROM python:3.9-slim
WORKDIR /app
## Install curl for health checks
RUN apt-get update && apt-get install -y curl && rm -rf /var/lib/apt/lists/*
COPY app.py .
## Add a health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:8080/health || exit 1
## Expose port for the health check endpoint
EXPOSE 8080
CMD ["python", "app.py"]
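To see how --retries shapes the reported status, here is a simplified model in Python. It is an illustration rather than Docker's implementation: the function name health_status is made up, and the model ignores --start-period and --timeout, treating each probe as a plain pass or fail.

```python
def health_status(probe_results, retries=3):
    """Return the status after a sequence of probe results.

    probe_results: iterable of booleans, True meaning the
    health check command exited with code 0.
    """
    status = "starting"
    failing_streak = 0
    for passed in probe_results:
        if passed:
            # Any success resets the streak and marks the container healthy
            failing_streak = 0
            status = "healthy"
        else:
            failing_streak += 1
            # Only `retries` consecutive failures flip the status
            if failing_streak >= retries:
                status = "unhealthy"
    return status
```

With retries=3, two failed probes followed by a success leave the container healthy, while three failures in a row mark it unhealthy.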
Now we need to add a health check endpoint to our application:
nano app.py
Replace the content with:
import os
import signal
import sys
import threading
import time
from http.server import HTTPServer, BaseHTTPRequestHandler

## Flush each line so `docker logs` shows output right away
sys.stdout.reconfigure(line_buffering=True)

## Get environment variables with defaults
database_url = os.environ.get('DATABASE_URL', 'sqlite:///default.db')

## Simple HTTP server for health checks
class HealthRequestHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == '/health':
            self.send_response(200)
            self.send_header('Content-type', 'text/plain')
            self.end_headers()
            self.wfile.write(b'OK')
        else:
            self.send_response(404)
            self.end_headers()

def run_health_server():
    server = HTTPServer(('0.0.0.0', 8080), HealthRequestHandler)
    print("Starting health check server on port 8080")
    server.serve_forever()

## Start health check server in a separate thread
health_thread = threading.Thread(target=run_health_server, daemon=True)
health_thread.start()

## `docker stop` sends SIGTERM, so treat it the same way as Ctrl+C
def handle_sigterm(signum, frame):
    raise KeyboardInterrupt

signal.signal(signal.SIGTERM, handle_sigterm)

print(f"Connecting to database: {database_url}")

## Main application loop
try:
    print("Application running... Press Ctrl+C to exit")
    counter = 0
    while True:
        counter += 1
        print(f"Application heartbeat: {counter}")
        time.sleep(10)
except KeyboardInterrupt:
    print("Application shutting down gracefully...")
    sys.exit(0)
Save and exit the file.
Rebuild the image with our health check:
docker build -t exit-test-app:v3 .
Run the container with the new version:
docker run -d --name healthcheck-app -p 8080:8080 exit-test-app:v3
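The first check runs one interval (30 seconds) after the container starts, and the status reads starting until then. After about 30 seconds, check the health status: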
docker inspect --format='{{.State.Health.Status}}' healthcheck-app
You should see:
healthy
You can also test the health endpoint directly:
curl http://localhost:8080/health
This should return OK.
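The curl-based check requires installing curl in the image. One possible alternative, sketched here on the assumption that you add a separate script to the image (the file name healthcheck.py and both function names are hypothetical), is to probe the endpoint with Python's standard library and translate the result into an exit code.

```python
import urllib.error
import urllib.request


def check_health(url="http://localhost:8080/health", timeout=5.0):
    """Return True if the endpoint answers HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and 4xx/5xx responses
        return False


def exit_code(healthy):
    """Map a probe result to the exit code HEALTHCHECK expects."""
    return 0 if healthy else 1
```

A script built on this would end with sys.exit(exit_code(check_health())). After a matching COPY line, the Dockerfile's HEALTHCHECK could then run python healthcheck.py instead of curl.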
3. Using Docker Restart Policies
Docker provides restart policies to automatically restart containers when they exit or encounter errors:
docker run -d --restart=on-failure:5 --name restart-app exit-test-app:v3
This policy will restart the container up to 5 times if it exits with a non-zero code.
Available restart policies:
- no: Never restart (default)
- always: Always restart regardless of exit status
- unless-stopped: Always restart unless manually stopped
- on-failure[:max-retries]: Restart only on non-zero exit
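The policies above can be summarized as a small decision function. This is a simplified model for illustration, not Docker's code: should_restart and its parameters are hypothetical names, and the sketch does not model daemon restarts, which is where always and unless-stopped actually differ.

```python
def should_restart(policy, exit_code, restart_count=0, manually_stopped=False):
    """Decide whether a container that just exited would be restarted.

    policy: "no", "always", "unless-stopped", "on-failure" or "on-failure:N".
    """
    if manually_stopped or policy == "no":
        # A manual `docker stop` suppresses the immediate restart
        return False
    if policy in ("always", "unless-stopped"):
        return True
    name, _, limit = policy.partition(":")
    if name != "on-failure":
        raise ValueError(f"Unknown restart policy: {policy}")
    if exit_code == 0:
        # A clean exit is not a failure
        return False
    # Without a limit, keep restarting; otherwise stop after N attempts
    return not limit or restart_count < int(limit)
```

For example, should_restart("on-failure:5", exit_code=1, restart_count=5) returns False because the retry budget is used up.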
4. Setting Resource Limits
To keep a runaway container from exhausting the host's memory or CPU, set appropriate resource limits:
docker run -d --name resource-limited-app \
--memory=256m \
--cpus=0.5 \
exit-test-app:v3
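This limits the container to 256 MB of memory and half of a CPU core. A container that exceeds its memory limit is killed by the kernel's out-of-memory killer and typically exits with code 137.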
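An application can read the limit it was given and size its caches or worker pools accordingly. The sketch below assumes the usual cgroup file locations inside a Linux container, which can differ between hosts and cgroup versions. The function name read_memory_limit is hypothetical.

```python
from pathlib import Path

# Typical cgroup v2 and v1 locations; assumed, and may differ per host
DEFAULT_LIMIT_FILES = (
    "/sys/fs/cgroup/memory.max",
    "/sys/fs/cgroup/memory/memory.limit_in_bytes",
)


def read_memory_limit(paths=DEFAULT_LIMIT_FILES):
    """Return the memory limit in bytes, or None if none is found."""
    for candidate in paths:
        try:
            raw = Path(candidate).read_text().strip()
        except OSError:
            continue  # File missing or unreadable; try the next location
        if raw.isdigit():
            return int(raw)
        return None  # cgroup v2 reports "max" when no limit is set
    return None
```

With --memory=256m the value read should be 268435456 (256 × 1024 × 1024). On cgroup v1 an unlimited container reports a very large number rather than "max", so treat huge values as "no limit".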
5. Proper Logging Configuration
For easier debugging, configure the container's log driver and enable log rotation so that log files cannot grow without bound:
docker run -d --name logging-app \
--log-driver=json-file \
--log-opt max-size=10m \
--log-opt max-file=3 \
exit-test-app:v3
This configures the container to use the json-file log driver (Docker's default) with log rotation, keeping at most 3 files of 10 MB each.
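Rotation bounds the size of the logs. Making each line machine-parseable is a separate step on the application side. As a minimal sketch, assuming you want one JSON object per line on stdout, the print() calls could be replaced with a logger like the one below. JsonFormatter, build_logger, and the chosen field names are illustrative, not part of the original application.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Format each log record as a single JSON object."""

    def format(self, record):
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(name="exit-test-app", stream=None):
    """Return a logger that writes JSON lines to stdout (or `stream`)."""
    handler = logging.StreamHandler(sys.stdout if stream is None else stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers.clear()   # Avoid duplicate lines on repeated calls
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # Keep the root logger from printing again
    return logger
```

A call such as build_logger().info("Application heartbeat: %d", counter) then produces one JSON line per heartbeat, which docker logs shows as-is and log collectors can parse without custom patterns.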
Summary of Best Practices
By implementing these best practices, you've made your Docker containers more reliable and easier to diagnose:
- Graceful error handling with default values
- Container health checks for monitoring
- Appropriate restart policies for automatic recovery
- Resource limits to prevent resource exhaustion
- Proper logging configuration for easier troubleshooting
These techniques will help you create resilient containers that can recover from failures and provide better observability when issues occur.