Introduction
Ansible is a powerful IT automation tool that helps system administrators and developers manage infrastructure efficiently. One of its key features is the ability to gather information about target systems, known as "facts." The gather_facts option in Ansible determines whether and how this information is collected during playbook execution.
In this hands-on lab, you will learn how to configure the gather_facts option in Ansible playbooks. You will explore different settings, understand when to enable or disable fact gathering, and discover how to use gathered facts to make your playbooks more dynamic and efficient. By the end of this lab, you will be able to optimize your Ansible workflows by controlling the fact gathering process according to your specific needs.
Installing Ansible and exploring the gather_facts option
Let's start by installing Ansible and exploring what the gather_facts option does. In this step, we'll install Ansible, create a simple inventory, and run a command to see what facts are gathered.
Installing Ansible
First, let's install Ansible on our system:
sudo apt update
sudo apt install -y ansible
After installation completes, verify that Ansible is installed correctly:
ansible --version
You should see output similar to this:
ansible [core 2.12.x]
  config file = /etc/ansible/ansible.cfg
  configured module search path = ['/home/labex/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python3/dist-packages/ansible
  ansible collection location = /home/labex/.ansible/collections:/usr/share/ansible/collections
  executable location = /usr/bin/ansible
  python version = 3.10.x (default, Aug 14 2022, 00:00:00) [GCC 11.2.0]
  jinja version = 3.0.3
  libyaml = True
Creating a simple inventory
Now, let's create a simple inventory file to work with. The inventory file defines the hosts that Ansible will manage. For this lab, we'll create a local inventory:
mkdir -p ~/project/ansible
cd ~/project/ansible
Create an inventory file named hosts using the editor:
- Click on the Explorer icon in the WebIDE
- Navigate to the /home/labex/project/ansible directory
- Right-click and select "New File"
- Name the file hosts
- Add the following content:
[local]
localhost ansible_connection=local
This inventory defines a group called local that contains a single host, localhost. The ansible_connection=local parameter tells Ansible to execute commands directly on the local machine without using SSH.
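The same inventory can also be expressed in Ansible's YAML inventory format. The following is only a sketch for comparison: the file name hosts.yml is a hypothetical choice, and the rest of this lab keeps using the INI-style hosts file.

```yaml
# hosts.yml (hypothetical file name; not used elsewhere in this lab)
all:
  children:
    local:                          # group name, same as [local] above
      hosts:
        localhost:
          ansible_connection: local # run on this machine, no SSH
```

If you try it, it would be passed to Ansible the same way, for example with -i hosts.yml.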
Exploring gather_facts
Let's run a simple Ansible command to see what facts are gathered by default:
cd ~/project/ansible
ansible local -i hosts -m setup
The command above uses:
- local: the group from our inventory
- -i hosts: specifies our inventory file
- -m setup: runs the setup module, which gathers facts
You'll see a large JSON output with detailed information about your system, including:
- Hardware information (CPU, memory)
- Network configuration
- Operating system details
- Environment variables
- And much more
This information is what Ansible collects when gather_facts is enabled (which is the default behavior). These facts can be used in playbooks to make decisions or customize tasks based on the target system's characteristics.
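Because the full output is large, it can help to look at only a slice of it. The sketch below calls the setup module explicitly with its filter option and prints what comes back. The play name and the registered variable name distro_facts are illustrative choices, not part of the lab.

```yaml
---
- name: Gather a filtered set of facts (illustrative)
  hosts: local
  gather_facts: false   # skip the automatic run; setup is called explicitly below

  tasks:
    - name: Return only facts whose names match the pattern
      setup:
        filter: "ansible_distribution*"
      register: distro_facts   # hypothetical variable name

    - name: Show what the filtered run returned
      debug:
        var: distro_facts.ansible_facts
```

The filter only limits which facts are returned, so this is mainly a convenience for inspection rather than a performance setting.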
Creating a basic playbook with default fact gathering
In this step, we'll create a basic Ansible playbook that uses the default fact gathering behavior and displays some of the gathered information.
Understanding Ansible playbooks
An Ansible playbook is a YAML file containing a list of tasks to be executed on managed hosts. Playbooks provide a way to define configuration, deployment, and orchestration steps in a simple, human-readable format.
Creating your first playbook
Let's create a simple playbook that will display some of the facts Ansible gathers by default:
- In the WebIDE, navigate to the /home/labex/project/ansible directory
- Create a new file named facts_playbook.yml
- Add the following content:
---
- name: Show System Facts
  hosts: local
  # By default, gather_facts is set to 'true'

  tasks:
    - name: Display operating system
      debug:
        msg: "Operating System: {{ ansible_distribution }} {{ ansible_distribution_version }}"

    # On x86 Linux, ansible_processor is a flat list of
    # [index, vendor, model, ...], so index 2 is typically the model name.
    - name: Display CPU information
      debug:
        msg: "CPU: {{ ansible_processor[2] }} with {{ ansible_processor_cores }} cores"

    - name: Display memory information
      debug:
        msg: "Total Memory: {{ ansible_memtotal_mb }} MB"

    - name: Display Python version
      debug:
        msg: "Python version: {{ ansible_python_version }}"
This playbook:
- Targets the local group defined in our inventory
- Implicitly enables fact gathering (default behavior)
- Contains four tasks that display different pieces of information gathered by Ansible
Running the playbook
Now let's run the playbook to see the gathered facts in action:
cd ~/project/ansible
ansible-playbook -i hosts facts_playbook.yml
You should see output similar to this:
PLAY [Show System Facts] *****************************************************

TASK [Gathering Facts] *******************************************************
ok: [localhost]

TASK [Display operating system] **********************************************
ok: [localhost] => {
    "msg": "Operating System: Ubuntu 22.04"
}

TASK [Display CPU information] ***********************************************
ok: [localhost] => {
    "msg": "CPU: Intel(R) Xeon(R) CPU with 2 cores"
}

TASK [Display memory information] ********************************************
ok: [localhost] => {
    "msg": "Total Memory: 3907 MB"
}

TASK [Display Python version] ************************************************
ok: [localhost] => {
    "msg": "Python version: 3.10.6"
}

PLAY RECAP *******************************************************************
localhost : ok=5 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
Notice the first task in the output: TASK [Gathering Facts]. This is Ansible automatically gathering facts before running any of our defined tasks, because the default value of gather_facts is true.
The playbook then successfully displays information about your system using the gathered facts. Each fact is referenced using a variable with the ansible_ prefix.
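Gathered facts are also available through the ansible_facts dictionary, where the keys drop the ansible_ prefix. The sketch below shows the same kind of lookups in that style. The play name is illustrative, and the playbooks in this lab continue to use the prefixed variables.

```yaml
---
- name: Read facts through the ansible_facts dictionary (illustrative)
  hosts: local

  tasks:
    - name: Display operating system via ansible_facts
      debug:
        msg: "Operating System: {{ ansible_facts['distribution'] }} {{ ansible_facts['distribution_version'] }}"

    - name: Display memory via ansible_facts
      debug:
        msg: "Total Memory: {{ ansible_facts['memtotal_mb'] }} MB"
```

One reason to prefer this form is that it makes clear at a glance that a value comes from fact gathering rather than from a user-defined variable.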
Disabling fact gathering for improved performance
In this step, we'll learn how to disable fact gathering to improve playbook performance in situations where the facts aren't needed.
Understanding when to disable fact gathering
While gathering facts is useful, it can add unnecessary overhead in certain scenarios:
- When you're running simple tasks that don't require system information
- When you're executing playbooks frequently and the facts don't change
- When you want to optimize playbook execution time
Fact gathering runs the setup module on every targeted host at the start of each play, so disabling it removes that per-host, per-play cost. The savings add up when you manage many hosts.
Creating a playbook with disabled fact gathering
Let's create a new playbook that has fact gathering disabled:
- In the WebIDE, navigate to the /home/labex/project/ansible directory
- Create a new file named no_facts_playbook.yml
- Add the following content:
---
- name: Playbook with Disabled Fact Gathering
  hosts: local
  gather_facts: false

  tasks:
    - name: Display current time
      command: date
      register: current_time

    - name: Show the current time
      debug:
        msg: "Current time is: {{ current_time.stdout }}"

    - name: List files in the project directory
      command: ls -la ~/project
      register: file_list

    - name: Show file list
      debug:
        msg: "Project directory contents:\n{{ file_list.stdout }}"
This playbook:
- Explicitly disables fact gathering with gather_facts: false
- Runs commands that don't depend on system facts
- Uses the register keyword to capture command outputs
- Displays the captured information using the debug module
Running the playbook with disabled fact gathering
Let's run the playbook and observe the differences:
cd ~/project/ansible
ansible-playbook -i hosts no_facts_playbook.yml
You should see output similar to this:
PLAY [Playbook with Disabled Fact Gathering] *********************************

TASK [Display current time] **************************************************
changed: [localhost]

TASK [Show the current time] *************************************************
ok: [localhost] => {
    "msg": "Current time is: Wed May 17 15:30:45 UTC 2023"
}

TASK [List files in the project directory] ***********************************
changed: [localhost]

TASK [Show file list] ********************************************************
ok: [localhost] => {
    "msg": "Project directory contents:\ntotal 20\ndrwxr-xr-x 3 labex labex 4096 May 17 15:25 .\ndrwxr-xr-x 4 labex labex 4096 May 17 15:20 ..\ndrwxr-xr-x 2 labex labex 4096 May 17 15:25 ansible\n"
}

PLAY RECAP *******************************************************************
localhost : ok=4 changed=2 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
Notice that there's no Gathering Facts task in the output this time. The playbook starts directly with our first defined task.
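Disabling automatic gathering does not rule out using facts later in the same play: the setup module can be run as an ordinary task at the point where facts are needed. The following is a minimal sketch of that pattern. The play and task names are illustrative, and it assumes that ansible_distribution is part of the minimal subset that remains after "!all".

```yaml
---
- name: Gather facts on demand (illustrative)
  hosts: local
  gather_facts: false

  tasks:
    - name: Run a task that needs no facts
      command: date
      register: current_time
      changed_when: false   # reading the clock changes nothing on the host

    - name: Gather a small set of facts only where they are needed
      setup:
        gather_subset:
          - "!all"          # leaves only the minimal subset

    - name: Use a fact collected by the explicit setup task
      debug:
        msg: "{{ ansible_distribution }} at {{ current_time.stdout }}"
```

In this variant the output would contain a setup task with the name given above instead of the automatic Gathering Facts task.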
Comparing execution times
To see the performance difference, let's create a simple timing script:
- In the WebIDE, navigate to the /home/labex/project/ansible directory
- Create a new file named compare_timing.sh
- Add the following content:
#!/bin/bash
echo "Running playbook with fact gathering enabled..."
time ansible-playbook -i hosts facts_playbook.yml > /dev/null
echo -e "\nRunning playbook with fact gathering disabled..."
time ansible-playbook -i hosts no_facts_playbook.yml > /dev/null
- Make the script executable:
chmod +x compare_timing.sh
- Run the comparison script:
./compare_timing.sh
You should see output showing that the playbook with fact gathering disabled runs faster than the one with fact gathering enabled. The difference might be small in our simple example, but it can be significant when running complex playbooks on multiple remote hosts.
Using selective fact gathering
In some cases, you might need only specific facts rather than all system information. Ansible allows selective fact gathering to optimize performance while still collecting the information you need.
Understanding fact subsets
Ansible organizes facts into subsets, such as:
- all: All facts (default)
- min: A minimal set of facts, which Ansible still collects under "!all" unless "!min" is also given
- hardware: CPU, memory, and device information
- network: Network interface and routing information
- virtual: Virtualization details
- ohai: Facts from Ohai (if available)
- facter: Facts from Facter (if available)
By selecting only the facts you need, you can improve playbook performance while still having access to necessary information.
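As a rough illustration of how subsets narrow the result, the sketch below gathers with "!all" only and then checks whether one hardware fact is present. It assumes the documented behavior that "!all" still leaves the minimal subset in place. The play name is illustrative.

```yaml
---
- name: Minimal fact gathering (illustrative)
  hosts: local
  gather_facts: true
  gather_subset:
    - "!all"   # drop every optional subset; the minimal subset remains

  tasks:
    - name: Use a fact from the minimal subset
      debug:
        msg: "Distribution: {{ ansible_distribution }}"

    - name: Check whether a hardware fact was collected
      debug:
        msg: "memtotal_mb gathered: {{ 'memtotal_mb' in ansible_facts }}"
```

Checking membership in ansible_facts is a safe way to probe for a fact, since it never raises an undefined-variable error.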
Creating a playbook with selective fact gathering
Let's create a playbook that gathers only hardware-related facts:
- In the WebIDE, navigate to the /home/labex/project/ansible directory
- Create a new file named selective_facts_playbook.yml
- Add the following content:
---
- name: Selective Fact Gathering
  hosts: local
  gather_facts: true
  gather_subset:
    - "!all"      # Exclude everything except the minimal subset
    - "hardware"  # Then add the hardware facts

  tasks:
    # On x86 Linux, index 2 of ansible_processor is typically the model name.
    - name: Display CPU information
      debug:
        msg: "CPU: {{ ansible_processor[2] }} with {{ ansible_processor_cores }} cores"

    - name: Display memory information
      debug:
        msg: "Total Memory: {{ ansible_memtotal_mb }} MB"

    - name: Try to access network facts (should fail)
      debug:
        msg: "Default IPv4 Address: {{ ansible_default_ipv4.address }}"
      ignore_errors: true
This playbook:
- Enables fact gathering with gather_facts: true
- Uses gather_subset to restrict which facts are collected
- First excludes everything beyond the minimal subset with !all
- Then adds the hardware facts with hardware
- Tries to access network facts (which weren't gathered) to demonstrate the limitation
Running the playbook with selective fact gathering
Let's run the playbook to see selective fact gathering in action:
cd ~/project/ansible
ansible-playbook -i hosts selective_facts_playbook.yml
You should see output similar to this:
PLAY [Selective Fact Gathering] **********************************************

TASK [Gathering Facts] *******************************************************
ok: [localhost]

TASK [Display CPU information] ***********************************************
ok: [localhost] => {
    "msg": "CPU: Intel(R) Xeon(R) CPU with 2 cores"
}

TASK [Display memory information] ********************************************
ok: [localhost] => {
    "msg": "Total Memory: 3907 MB"
}

TASK [Try to access network facts (should fail)] *****************************
fatal: [localhost]: FAILED! => {"msg": "The task includes an option with an undefined variable. The error was: 'dict object' has no attribute 'address'..."}
...ignoring

PLAY RECAP *******************************************************************
localhost : ok=4 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=1
Notice that the first two tasks succeed because they access hardware facts that were gathered, but the third task fails because network facts weren't collected. We used ignore_errors: true to continue playbook execution despite this error.
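Outside of a demonstration, ignore_errors is a blunt tool for a missing fact, because it also hides unrelated failures. One alternative, sketched below under the same gather_subset settings, is to fall back to a default value and branch on it. The variable name ipv4_address and the play name are hypothetical.

```yaml
---
- name: Tolerate facts that were not gathered (illustrative)
  hosts: local
  gather_facts: true
  gather_subset:
    - "!all"
    - "hardware"
  vars:
    # Hypothetical helper: the address if it was gathered, else an empty string
    ipv4_address: "{{ (ansible_facts['default_ipv4'] | default({}))['address'] | default('') }}"

  tasks:
    - name: Show the address only when it was gathered
      debug:
        msg: "Default IPv4 Address: {{ ipv4_address }}"
      when: ipv4_address | length > 0

    - name: Report that network facts are missing
      debug:
        msg: "Network facts were not gathered for this host"
      when: ipv4_address | length == 0
```

With this approach no task fails, so genuine errors elsewhere in the play remain visible.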
Creating a playbook with multiple fact subsets
Now let's create a playbook that gathers both hardware and network facts:
- In the WebIDE, create a new file named multiple_subsets_playbook.yml
- Add the following content:
---
- name: Multiple Fact Subsets
  hosts: local
  gather_facts: true
  gather_subset:
    - "!all"      # Exclude everything except the minimal subset
    - "hardware"  # Include hardware facts
    - "network"   # Include network facts

  tasks:
    # On x86 Linux, index 2 of ansible_processor is typically the model name.
    - name: Display CPU information
      debug:
        msg: "CPU: {{ ansible_processor[2] }} with {{ ansible_processor_cores }} cores"

    - name: Display memory information
      debug:
        msg: "Total Memory: {{ ansible_memtotal_mb }} MB"

    - name: Display network information
      debug:
        msg: "Default IPv4 Address: {{ ansible_default_ipv4.address }}"
Run this playbook:
ansible-playbook -i hosts multiple_subsets_playbook.yml
This time, all tasks should succeed because we've gathered both hardware and network facts.
Using gathered facts in conditional tasks
One of the most powerful uses of gathered facts is implementing conditional logic in your playbooks. In this step, we'll create a playbook that uses facts to make decisions about which tasks to run.
Understanding conditional tasks in Ansible
Ansible allows you to use the when keyword to conditionally execute tasks based on variables, facts, or task results. This enables you to create more dynamic and adaptable playbooks.
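Conditions can also be combined. As a small sketch (the task names and thresholds are illustrative), a when given as a list requires every item to be true, and Jinja2's in operator covers the "any of these values" case:

```yaml
---
- name: Combine several fact-based conditions (illustrative)
  hosts: local
  gather_facts: true

  tasks:
    - name: Run only on Debian-family hosts with at least 2 GB of RAM
      debug:
        msg: "Debian-family host with {{ ansible_memtotal_mb }} MB RAM"
      when:
        - ansible_os_family == "Debian"   # list items are combined with a logical AND
        - ansible_memtotal_mb >= 2048

    - name: Run on either of two distributions
      debug:
        msg: "Host runs {{ ansible_distribution }}"
      when: ansible_distribution in ["Ubuntu", "Debian"]
```

Note that when expressions are written without the surrounding {{ }} braces, since Ansible already evaluates them as Jinja2 expressions.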
Creating a playbook with conditional tasks
Let's create a playbook that performs different actions based on the operating system:
- In the WebIDE, navigate to the /home/labex/project/ansible directory
- Create a new file named conditional_facts_playbook.yml
- Add the following content:
---
- name: Conditional Tasks Based on Facts
  hosts: local
  gather_facts: true

  tasks:
    - name: Display OS information
      debug:
        msg: "Running on {{ ansible_distribution }} {{ ansible_distribution_version }}"

    - name: Task for Ubuntu systems
      debug:
        msg: "This is an Ubuntu system. Would run apt commands here."
      when: ansible_distribution == "Ubuntu"

    - name: Task for CentOS systems
      debug:
        msg: "This is a CentOS system. Would run yum commands here."
      when: ansible_distribution == "CentOS"

    - name: Task for systems with at least 2GB RAM
      debug:
        msg: "This system has {{ ansible_memtotal_mb }} MB RAM, which is sufficient for our application."
      when: ansible_memtotal_mb >= 2048

    - name: Task for systems with less than 2GB RAM
      debug:
        msg: "This system has only {{ ansible_memtotal_mb }} MB RAM, which may not be sufficient."
      when: ansible_memtotal_mb < 2048
This playbook:
- Gathers all facts about the system
- Displays the operating system information
- Conditionally executes tasks based on the operating system type
- Conditionally executes tasks based on the amount of RAM
Running the conditional playbook
Let's run the playbook to see conditional tasks in action:
cd ~/project/ansible
ansible-playbook -i hosts conditional_facts_playbook.yml
Since we're running on Ubuntu, you should see output similar to this:
PLAY [Conditional Tasks Based on Facts] **************************************

TASK [Gathering Facts] *******************************************************
ok: [localhost]

TASK [Display OS information] ************************************************
ok: [localhost] => {
    "msg": "Running on Ubuntu 22.04"
}

TASK [Task for Ubuntu systems] ***********************************************
ok: [localhost] => {
    "msg": "This is an Ubuntu system. Would run apt commands here."
}

TASK [Task for CentOS systems] ***********************************************
skipping: [localhost]

TASK [Task for systems with at least 2GB RAM] ********************************
ok: [localhost] => {
    "msg": "This system has 3907 MB RAM, which is sufficient for our application."
}

TASK [Task for systems with less than 2GB RAM] *******************************
skipping: [localhost]

PLAY RECAP *******************************************************************
localhost : ok=4 changed=0 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
Notice how some tasks are executed while others are skipped based on the conditions. The CentOS task is skipped because we're running on Ubuntu, and the "less than 2GB RAM" task is skipped because our system has more than 2GB RAM.
Creating a more practical example
Now let's create a more practical example that could be used in a real environment:
- In the WebIDE, create a new file named practical_conditional_playbook.yml
- Add the following content:
---
- name: Practical Conditional Playbook
  hosts: local
  gather_facts: true
  vars:
    app_dir: "/home/labex/project/app"

  tasks:
    - name: Create application directory
      file:
        path: "{{ app_dir }}"
        state: directory
        mode: "0755"

    - name: Configure for production environment
      copy:
        dest: "{{ app_dir }}/config.yml"
        content: |
          environment: production
          memory_limit: high
          debug: false
      when: ansible_memtotal_mb >= 4096

    - name: Configure for development environment
      copy:
        dest: "{{ app_dir }}/config.yml"
        content: |
          environment: development
          memory_limit: low
          debug: true
      when: ansible_memtotal_mb < 4096

    - name: Display configuration
      command: cat {{ app_dir }}/config.yml
      register: config_content

    - name: Show configuration
      debug:
        msg: "{{ config_content.stdout_lines }}"
This playbook:
- Creates a directory for an application
- Writes a different configuration file based on the available system memory
- Displays the resulting configuration
Run the practical playbook:
ansible-playbook -i hosts practical_conditional_playbook.yml
This example demonstrates how you can use gathered facts to automatically adapt configurations based on system characteristics.
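The two mutually exclusive copy tasks can also be folded into one by computing the choice once and templating the content. This is a sketch of that variant rather than a replacement for the lab's playbook. The variable is_production is a hypothetical helper, and gathering is narrowed to the hardware subset on the assumption that ansible_memtotal_mb is the only fact needed.

```yaml
---
- name: Single-task variant of the memory-based configuration (illustrative)
  hosts: local
  gather_facts: true
  gather_subset:
    - "!all"
    - "hardware"   # ansible_memtotal_mb comes from the hardware subset
  vars:
    app_dir: "/home/labex/project/app"
    # Hypothetical helper variable, evaluated lazily after facts are gathered
    is_production: "{{ ansible_memtotal_mb >= 4096 }}"

  tasks:
    - name: Create application directory
      file:
        path: "{{ app_dir }}"
        state: directory
        mode: "0755"

    - name: Write configuration chosen from the memory fact
      copy:
        dest: "{{ app_dir }}/config.yml"
        content: |
          environment: {{ 'production' if is_production | bool else 'development' }}
          memory_limit: {{ 'high' if is_production | bool else 'low' }}
          debug: {{ 'false' if is_production | bool else 'true' }}
```

Keeping the threshold in a single variable means it only has to be changed in one place if the memory requirement shifts.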
Summary
In this lab, you have learned how to effectively configure and use the gather_facts option in Ansible playbooks. Here's a recap of what you've accomplished:
- Basic fact gathering: You've installed Ansible and explored the default fact gathering behavior, seeing the wide range of system information that Ansible collects.
- Disabling fact gathering: You've learned how to disable fact gathering to improve playbook performance when the facts aren't needed.
- Selective fact gathering: You've discovered how to gather only specific subsets of facts to balance performance against having the information you need.
- Conditional tasks: You've implemented conditional logic in your playbooks based on gathered facts, allowing for dynamic behavior depending on system characteristics.
- Practical applications: You've created practical examples that demonstrate how to use gathered facts in real-world scenarios.
By mastering the gather_facts option, you can optimize your Ansible playbooks for better performance while still having access to the system information you need. This knowledge will help you create more efficient, flexible, and powerful automation workflows.
Some best practices to remember:
- Enable fact gathering only when necessary
- Use selective fact gathering when you need only specific information
- Leverage gathered facts for conditional tasks to make your playbooks more adaptable
- Consider caching facts when running playbooks frequently on the same hosts
With these skills, you're well-equipped to create more sophisticated and efficient Ansible automation for your infrastructure management needs.


