How to configure the gather_facts option in an Ansible playbook

Introduction

Ansible is a powerful IT automation tool that helps system administrators and developers manage infrastructure efficiently. One of its key features is the ability to gather information about target systems, known as "facts." The gather_facts option in Ansible determines whether and how this information is collected during playbook execution.

In this hands-on lab, you will learn how to configure the gather_facts option in Ansible playbooks. You will explore different settings, understand when to enable or disable fact gathering, and discover how to use gathered facts to make your playbooks more dynamic and efficient. By the end of this lab, you will be able to optimize your Ansible workflows by controlling the fact gathering process according to your specific needs.

Installing Ansible and exploring the gather_facts option

Let's start by installing Ansible and exploring what the gather_facts option does. In this step, we'll install Ansible, create a simple inventory, and run a command to see what facts are gathered.

Installing Ansible

First, let's install Ansible on our system:

sudo apt update
sudo apt install -y ansible

After installation completes, verify that Ansible is installed correctly:

ansible --version

You should see output similar to this:

ansible [core 2.12.x]
  config file = /etc/ansible/ansible.cfg
  configured module search path = ['/home/labex/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python3/dist-packages/ansible
  ansible collection location = /home/labex/.ansible/collections:/usr/share/ansible/collections
  executable location = /usr/bin/ansible
  python version = 3.10.x (default, Aug 14 2022, 00:00:00) [GCC 11.2.0]
  jinja version = 3.0.3
  libyaml = True

Creating a simple inventory

Now, let's create a simple inventory file to work with. The inventory file defines the hosts that Ansible will manage. For this lab, we'll create a local inventory:

mkdir -p ~/project/ansible
cd ~/project/ansible

Create an inventory file named hosts using the editor:

Click on the Explorer icon in the WebIDE
Navigate to the /home/labex/project/ansible directory
Right-click and select "New File"
Name the file hosts
Add the following content:

[local]
localhost ansible_connection=local

This inventory defines a group called local that contains a single host, localhost. The ansible_connection=local parameter tells Ansible to execute commands directly on the local machine without using SSH.
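Ansible also accepts inventories written in YAML. As a rough sketch, the same group could be expressed as follows. The file name hosts.yml is an assumption made for illustration; the rest of this lab keeps using the INI-style hosts file.

```yaml
# hosts.yml (hypothetical file name): YAML equivalent of the INI inventory above
all:
  children:
    local:                            # group name, same as [local] in the INI file
      hosts:
        localhost:
          ansible_connection: local   # run on this machine without SSH
```

Such a file would be passed to Ansible the same way, for example with -i hosts.yml.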

Exploring gather_facts

Let's run a simple Ansible command to see what facts are gathered by default:

cd ~/project/ansible
ansible local -i hosts -m setup

The command above uses:

local: the group from our inventory
-i hosts: specifies our inventory file
-m setup: runs the setup module, which gathers facts

You'll see a large JSON output with detailed information about your system, including:

Hardware information (CPU, memory)
Network configuration
Operating system details
Environment variables
And much more

This information is what Ansible collects when gather_facts is enabled (which is the default behavior). These facts can be used in playbooks to make decisions or customize tasks based on the target system's characteristics.

Creating a basic playbook with default fact gathering

In this step, we'll create a basic Ansible playbook that uses the default fact gathering behavior and displays some of the gathered information.

Understanding Ansible playbooks

An Ansible playbook is a YAML file containing a list of tasks to be executed on managed hosts. Playbooks provide a way to define configuration, deployment, and orchestration steps in a simple, human-readable format.

Creating your first playbook

Let's create a simple playbook that will display some of the facts Ansible gathers by default:

In the WebIDE, navigate to the /home/labex/project/ansible directory
Create a new file named facts_playbook.yml
Add the following content:

---
- name: Show System Facts
  hosts: local
  ## By default, gather_facts is set to 'true'

  tasks:
    - name: Display operating system
      debug:
        msg: "Operating System: {{ ansible_distribution }} {{ ansible_distribution_version }}"

    - name: Display CPU information
      debug:
        ## On x86 Linux, index 2 of ansible_processor holds the first CPU's model name
        msg: "CPU: {{ ansible_processor[2] }} with {{ ansible_processor_cores }} cores"

    - name: Display memory information
      debug:
        msg: "Total Memory: {{ ansible_memtotal_mb }} MB"

    - name: Display Python version
      debug:
        msg: "Python version: {{ ansible_python_version }}"

This playbook:

Targets the local group defined in our inventory
Implicitly enables fact gathering (default behavior)
Contains four tasks that display different pieces of information gathered by Ansible

Running the playbook

Now let's run the playbook to see the gathered facts in action:

cd ~/project/ansible
ansible-playbook -i hosts facts_playbook.yml

You should see output similar to this:

PLAY [Show System Facts] *****************************************************

TASK [Gathering Facts] *******************************************************
ok: [localhost]

TASK [Display operating system] **********************************************
ok: [localhost] => {
    "msg": "Operating System: Ubuntu 22.04"
}

TASK [Display CPU information] ***********************************************
ok: [localhost] => {
    "msg": "CPU: Intel(R) Xeon(R) CPU with 2 cores"
}

TASK [Display memory information] ********************************************
ok: [localhost] => {
    "msg": "Total Memory: 3907 MB"
}

TASK [Display Python version] ************************************************
ok: [localhost] => {
    "msg": "Python version: 3.10.6"
}

PLAY RECAP *******************************************************************
localhost                  : ok=5    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

Notice the first task in the output: TASK [Gathering Facts]. This is Ansible automatically gathering facts before running any of our defined tasks, because the default value of gather_facts is true.

The playbook then successfully displays information about your system using the gathered facts. Each fact is referenced using a variable with the ansible_ prefix.
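Gathered facts are also available through the ansible_facts dictionary, where the keys drop the ansible_ prefix. The following is a minimal sketch of the same idea using that dictionary; the file name and task names are assumptions made for illustration.

```yaml
---
# facts_dict_playbook.yml (hypothetical file name)
- name: Read facts through the ansible_facts dictionary
  hosts: local
  ## gather_facts is left at its default, so facts are collected

  tasks:
    - name: Display operating system via ansible_facts
      debug:
        msg: "Operating System: {{ ansible_facts['distribution'] }} {{ ansible_facts['distribution_version'] }}"

    - name: Display memory via ansible_facts
      debug:
        msg: "Total Memory: {{ ansible_facts['memtotal_mb'] }} MB"
```

Both styles read the same data. The dictionary form makes it explicit that a value comes from fact gathering rather than from a user-defined variable.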

Disabling fact gathering for improved performance

In this step, we'll learn how to disable fact gathering to improve playbook performance in situations where the facts aren't needed.

Understanding when to disable fact gathering

While gathering facts is useful, it can add unnecessary overhead in certain scenarios:

When you're running simple tasks that don't require system information
When you're executing playbooks frequently and the facts don't change
When you want to optimize playbook execution time

Disabling fact gathering can significantly improve playbook execution speed, especially when managing many hosts.

Creating a playbook with disabled fact gathering

Let's create a new playbook that has fact gathering disabled:

In the WebIDE, navigate to the /home/labex/project/ansible directory
Create a new file named no_facts_playbook.yml
Add the following content:

---
- name: Playbook with Disabled Fact Gathering
  hosts: local
  gather_facts: false

  tasks:
    - name: Display current time
      command: date
      register: current_time

    - name: Show the current time
      debug:
        msg: "Current time is: {{ current_time.stdout }}"

    - name: List files in the project directory
      command: ls -la ~/project
      register: file_list

    - name: Show file list
      debug:
        msg: "Project directory contents:\n{{ file_list.stdout }}"

This playbook:

Explicitly disables fact gathering with gather_facts: false
Runs commands that don't depend on system facts
Uses the register keyword to capture command outputs
Displays the captured information using the debug module

Running the playbook with disabled fact gathering

Let's run the playbook and observe the differences:

cd ~/project/ansible
ansible-playbook -i hosts no_facts_playbook.yml

You should see output similar to this:

PLAY [Playbook with Disabled Fact Gathering] *********************************

TASK [Display current time] **************************************************
changed: [localhost]

TASK [Show the current time] *************************************************
ok: [localhost] => {
    "msg": "Current time is: Wed May 17 15:30:45 UTC 2023"
}

TASK [List files in the project directory] ***********************************
changed: [localhost]

TASK [Show file list] ********************************************************
ok: [localhost] => {
    "msg": "Project directory contents:\ntotal 20\ndrwxr-xr-x 3 labex labex 4096 May 17 15:25 .\ndrwxr-xr-x 4 labex labex 4096 May 17 15:20 ..\ndrwxr-xr-x 2 labex labex 4096 May 17 15:25 ansible\n"
}

PLAY RECAP *******************************************************************
localhost                  : ok=4    changed=2    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

Notice that there's no Gathering Facts task in the output this time. The playbook starts directly with our first defined task.
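Disabling fact gathering at the play level does not rule out using facts later. As a sketch, a play can call the setup module explicitly at the point where facts are first needed; the file name and task names here are assumptions made for illustration.

```yaml
---
# on_demand_facts_playbook.yml (hypothetical file name)
- name: Gather facts only when a task needs them
  hosts: local
  gather_facts: false

  tasks:
    - name: Run a task that needs no facts
      command: date
      register: current_time

    - name: Gather facts on demand
      setup:

    - name: Use a fact collected by the explicit setup task
      debug:
        msg: "{{ current_time.stdout }} on {{ ansible_distribution }}"
```

With this layout the output would show a setup task in place of the automatic Gathering Facts task, and any task placed before it runs without waiting for fact collection.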

Comparing execution times

To see the performance difference, let's create a simple timing script:

In the WebIDE, navigate to the /home/labex/project/ansible directory
Create a new file named compare_timing.sh
Add the following content:

#!/bin/bash

echo "Running playbook with fact gathering enabled..."
time ansible-playbook -i hosts facts_playbook.yml > /dev/null

echo -e "\nRunning playbook with fact gathering disabled..."
time ansible-playbook -i hosts no_facts_playbook.yml > /dev/null

Make the script executable:

chmod +x compare_timing.sh

Run the comparison script:

./compare_timing.sh

Compare the real (elapsed) times printed for the two runs. The playbook with fact gathering disabled should finish faster than the one with fact gathering enabled. The difference might be small in our simple example, but it can be significant when running complex playbooks on multiple remote hosts.

Using selective fact gathering

In some cases, you might need only specific facts rather than all system information. Ansible allows selective fact gathering to optimize performance while still collecting the information you need.

Understanding fact subsets

Ansible organizes facts into subsets, such as:

all: All facts (default)
min: A minimal set of facts (still collected when !all is used, unless !min is also given)
hardware: CPU, memory, and device information
network: Network interface and routing information
virtual: Virtualization details
ohai: Facts from Ohai (if available)
facter: Facts from Facter (if available)

By selecting only the facts you need, you can improve playbook performance while still having access to necessary information.

Creating a playbook with selective fact gathering

Let's create a playbook that gathers only hardware-related facts:

In the WebIDE, navigate to the /home/labex/project/ansible directory
Create a new file named selective_facts_playbook.yml
Add the following content:

---
- name: Selective Fact Gathering
  hosts: local
  gather_facts: true
  gather_subset:
    - "!all" ## Exclude all facts except the minimal (min) subset
    - "hardware" ## Then include only hardware facts

  tasks:
    - name: Display CPU information
      debug:
        ## On x86 Linux, index 2 of ansible_processor holds the first CPU's model name
        msg: "CPU: {{ ansible_processor[2] }} with {{ ansible_processor_cores }} cores"

    - name: Display memory information
      debug:
        msg: "Total Memory: {{ ansible_memtotal_mb }} MB"

    - name: Try to access network facts (should fail)
      debug:
        msg: "Default IPv4 Address: {{ ansible_default_ipv4.address }}"
      ignore_errors: true

This playbook:

Enables fact gathering with gather_facts: true
Uses gather_subset to restrict which facts are collected
First excludes all facts except the minimal subset with !all
Then includes only hardware facts with hardware
Tries to access network facts (which weren't gathered) to demonstrate the limitation

Running the playbook with selective fact gathering

Let's run the playbook to see selective fact gathering in action:

cd ~/project/ansible
ansible-playbook -i hosts selective_facts_playbook.yml

You should see output similar to this:

PLAY [Selective Fact Gathering] **********************************************

TASK [Gathering Facts] *******************************************************
ok: [localhost]

TASK [Display CPU information] ***********************************************
ok: [localhost] => {
    "msg": "CPU: Intel(R) Xeon(R) CPU with 2 cores"
}

TASK [Display memory information] ********************************************
ok: [localhost] => {
    "msg": "Total Memory: 3907 MB"
}

TASK [Try to access network facts (should fail)] *****************************
fatal: [localhost]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'address'..."}
...ignoring

PLAY RECAP *******************************************************************
localhost                  : ok=4    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=1

Notice that the first two tasks succeed because they access hardware facts that were gathered, but the third task fails because network facts weren't collected. We used ignore_errors: true to continue playbook execution despite this error.
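Instead of letting a task fail and ignoring the error, a playbook can check whether a fact exists before using it. The sketch below shows two common ways to do that, a when guard and the default filter; the file name and task names are assumptions made for illustration.

```yaml
---
# guarded_facts_playbook.yml (hypothetical file name)
- name: Guard against facts that were not gathered
  hosts: local
  gather_facts: true
  gather_subset:
    - "!all"
    - "hardware"

  tasks:
    - name: Show the default IPv4 address only if network facts exist
      debug:
        msg: "Default IPv4 Address: {{ ansible_default_ipv4.address }}"
      when: ansible_default_ipv4 is defined   # skipped when the fact is missing

    - name: Fall back to a placeholder when the fact is missing
      debug:
        msg: "Default IPv4 Address: {{ (ansible_default_ipv4 | default({})).address | default('not gathered') }}"
```

The guarded task is reported as skipped rather than failed, which keeps the play recap free of ignored errors.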

Creating a playbook with multiple fact subsets

Now let's create a playbook that gathers both hardware and network facts:

In the WebIDE, create a new file named multiple_subsets_playbook.yml
Add the following content:

---
- name: Multiple Fact Subsets
  hosts: local
  gather_facts: true
  gather_subset:
    - "!all" ## Exclude all facts except the minimal (min) subset
    - "hardware" ## Include hardware facts
    - "network" ## Include network facts

  tasks:
    - name: Display CPU information
      debug:
        ## On x86 Linux, index 2 of ansible_processor holds the first CPU's model name
        msg: "CPU: {{ ansible_processor[2] }} with {{ ansible_processor_cores }} cores"

    - name: Display memory information
      debug:
        msg: "Total Memory: {{ ansible_memtotal_mb }} MB"

    - name: Display network information
      debug:
        msg: "Default IPv4 Address: {{ ansible_default_ipv4.address }}"

Run this playbook:

ansible-playbook -i hosts multiple_subsets_playbook.yml

This time, all tasks should succeed because we've gathered both hardware and network facts.
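Note that !all on its own still leaves the minimal subset in place. To collect only the named subsets, the setup module documentation describes adding !min as well. The following sketch gathers network facts this way; the file name and task names are assumptions made for illustration.

```yaml
---
# network_only_playbook.yml (hypothetical file name)
- name: Gather network facts without the minimal subset
  hosts: local
  gather_facts: true
  gather_subset:
    - "!all" ## Drop everything except the minimal subset
    - "!min" ## Drop the minimal subset as well
    - "network" ## Then include network facts

  tasks:
    - name: Display network information
      debug:
        msg: "Default IPv4 Address: {{ ansible_default_ipv4.address }}"

    - name: Count how many top-level facts were collected
      debug:
        msg: "Facts collected: {{ ansible_facts | length }}"
```

Running the same play with and without the "!min" line and comparing the reported count is one way to see how much the minimal subset contributes.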

Using gathered facts in conditional tasks

One of the most powerful uses of gathered facts is implementing conditional logic in your playbooks. In this step, we'll create a playbook that uses facts to make decisions about which tasks to run.

Understanding conditional tasks in Ansible

Ansible allows you to use the when keyword to conditionally execute tasks based on variables, facts, or task results. This enables you to create more dynamic and adaptable playbooks.

Creating a playbook with conditional tasks

Let's create a playbook that performs different actions based on the operating system:

In the WebIDE, navigate to the /home/labex/project/ansible directory
Create a new file named conditional_facts_playbook.yml
Add the following content:

---
- name: Conditional Tasks Based on Facts
  hosts: local
  gather_facts: true

  tasks:
    - name: Display OS information
      debug:
        msg: "Running on {{ ansible_distribution }} {{ ansible_distribution_version }}"

    - name: Task for Ubuntu systems
      debug:
        msg: "This is an Ubuntu system. Would run apt commands here."
      when: ansible_distribution == "Ubuntu"

    - name: Task for CentOS systems
      debug:
        msg: "This is a CentOS system. Would run yum commands here."
      when: ansible_distribution == "CentOS"

    - name: Task for systems with at least 2GB RAM
      debug:
        msg: "This system has {{ ansible_memtotal_mb }} MB RAM, which is sufficient for our application."
      when: ansible_memtotal_mb >= 2048

    - name: Task for systems with less than 2GB RAM
      debug:
        msg: "This system has only {{ ansible_memtotal_mb }} MB RAM, which may not be sufficient."
      when: ansible_memtotal_mb < 2048

This playbook:

Gathers all facts about the system
Displays the operating system information
Conditionally executes tasks based on the operating system type
Conditionally executes tasks based on the amount of RAM

Running the conditional playbook

Let's run the playbook to see conditional tasks in action:

cd ~/project/ansible
ansible-playbook -i hosts conditional_facts_playbook.yml

Since we're running on Ubuntu, you should see output similar to this:

PLAY [Conditional Tasks Based on Facts] **************************************

TASK [Gathering Facts] *******************************************************
ok: [localhost]

TASK [Display OS information] ************************************************
ok: [localhost] => {
    "msg": "Running on Ubuntu 22.04"
}

TASK [Task for Ubuntu systems] ***********************************************
ok: [localhost] => {
    "msg": "This is an Ubuntu system. Would run apt commands here."
}

TASK [Task for CentOS systems] ***********************************************
skipping: [localhost]

TASK [Task for systems with at least 2GB RAM] ********************************
ok: [localhost] => {
    "msg": "This system has 3907 MB RAM, which is sufficient for our application."
}

TASK [Task for systems with less than 2GB RAM] *******************************
skipping: [localhost]

PLAY RECAP *******************************************************************
localhost                  : ok=4    changed=0    unreachable=0    failed=0    skipped=2    rescued=0    ignored=0

Notice how some tasks are executed while others are skipped based on the conditions. The CentOS task is skipped because we're running on Ubuntu, and the "less than 2GB RAM" task is skipped because our system has more than 2GB RAM.
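Conditions can also be combined. As a sketch, a when keyword given as a list requires every item to be true, and the in operator matches against several values at once. The file name, task names, and the choice of distributions are assumptions made for illustration.

```yaml
---
# combined_conditions_playbook.yml (hypothetical file name)
- name: Combine several fact-based conditions
  hosts: local
  gather_facts: true

  tasks:
    - name: Task for Debian-family systems with at least 2GB RAM
      debug:
        msg: "Debian-family host with {{ ansible_memtotal_mb }} MB RAM."
      when: ## List items are combined with a logical AND
        - ansible_os_family == "Debian"
        - ansible_memtotal_mb >= 2048

    - name: Task for either CentOS or Rocky systems
      debug:
        msg: "RPM-based host: {{ ansible_distribution }}"
      when: ansible_distribution in ["CentOS", "Rocky"]
```

Using ansible_os_family rather than ansible_distribution lets one task cover related distributions such as Ubuntu and Debian.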

Creating a more practical example

Now let's create a more practical example that could be used in a real environment:

In the WebIDE, create a new file named practical_conditional_playbook.yml
Add the following content:

---
- name: Practical Conditional Playbook
  hosts: local
  gather_facts: true

  vars:
    app_dir: "/home/labex/project/app"

  tasks:
    - name: Create application directory
      file:
        path: "{{ app_dir }}"
        state: directory
        mode: "0755"

    - name: Configure for production environment
      copy:
        dest: "{{ app_dir }}/config.yml"
        content: |
          environment: production
          memory_limit: high
          debug: false
      when: ansible_memtotal_mb >= 4096

    - name: Configure for development environment
      copy:
        dest: "{{ app_dir }}/config.yml"
        content: |
          environment: development
          memory_limit: low
          debug: true
      when: ansible_memtotal_mb < 4096

    - name: Display configuration
      command: cat {{ app_dir }}/config.yml
      register: config_content

    - name: Show configuration
      debug:
        msg: "{{ config_content.stdout_lines }}"

This playbook:

Creates a directory for an application
Writes a different configuration file based on the available system memory
Displays the resulting configuration

Run the practical playbook:

ansible-playbook -i hosts practical_conditional_playbook.yml

This example demonstrates how you can use gathered facts to automatically adapt configurations based on system characteristics.
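The two copy tasks above differ only in the values they write, so the same result can be approached with a single task that chooses each value inline. This is a sketch of that alternative using the ternary filter, not a replacement for the lab's playbook; the file name and task names are assumptions made for illustration.

```yaml
---
# single_task_config_playbook.yml (hypothetical file name)
- name: Write the configuration with a single task
  hosts: local
  gather_facts: true

  vars:
    app_dir: "/home/labex/project/app"

  tasks:
    - name: Create application directory
      file:
        path: "{{ app_dir }}"
        state: directory
        mode: "0755"

    - name: Write configuration based on available memory
      copy:
        dest: "{{ app_dir }}/config.yml"
        ## Each value is chosen by comparing the memory fact with the 4096 MB threshold
        content: |
          environment: {{ (ansible_memtotal_mb >= 4096) | ternary('production', 'development') }}
          memory_limit: {{ (ansible_memtotal_mb >= 4096) | ternary('high', 'low') }}
          debug: {{ (ansible_memtotal_mb >= 4096) | ternary('false', 'true') }}
```

Keeping the threshold in one task avoids the risk of the two when conditions drifting apart, at the cost of a slightly denser template.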

Summary

In this lab, you have learned how to effectively configure and use the gather_facts option in Ansible playbooks. Here's a recap of what you've accomplished:

Basic fact gathering: You've installed Ansible and explored the default fact gathering behavior, seeing the wide range of system information that Ansible collects.
Disabling fact gathering: You've learned how to disable fact gathering to improve playbook performance when the facts aren't needed.
Selective fact gathering: You've discovered how to gather only specific subsets of facts to balance between performance and having necessary information.
Conditional tasks: You've implemented conditional logic in your playbooks based on gathered facts, allowing for dynamic behavior depending on system characteristics.
Practical applications: You've created practical examples that demonstrate how to use gathered facts in real-world scenarios.

By mastering the gather_facts option, you can optimize your Ansible playbooks for better performance while still having access to the system information you need. This knowledge will help you create more efficient, flexible, and powerful automation workflows.

Some best practices to remember:

Enable fact gathering only when necessary
Use selective fact gathering when you need only specific information
Leverage gathered facts for conditional tasks to make your playbooks more adaptable
Consider caching facts when running playbooks frequently on the same hosts

With these skills, you're well-equipped to create more sophisticated and efficient Ansible automation for your infrastructure management needs.