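Introduction
Ansible is a powerful open-source automation tool widely used by system administrators and DevOps professionals. One of its key capabilities is running shell commands on remote hosts and processing their output. In this hands-on tutorial, you will learn how to capture, display, and process shell command output in Ansible playbooks. This skill is essential for building robust automation workflows that adapt to different system conditions and provide useful feedback.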
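In this step, we will set up a basic Ansible playbook that runs shell commands and captures their output. This lays the groundwork for the more advanced techniques in later steps.
First, install Ansible on the system: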
sudo apt update
sudo apt install -y ansible
Now verify that Ansible is installed correctly:
ansible --version
You should see output similar to this:
ansible [core 2.12.x]
config file = /etc/ansible/ansible.cfg
configured module search path = ['/home/labex/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /usr/lib/python3/dist-packages/ansible
ansible collection location = /home/labex/.ansible/collections:/usr/share/ansible/collections
executable location = /usr/bin/ansible
python version = 3.10.x (default, Mar 15 2022, 12:22:08) [GCC 11.2.0]
jinja version = 3.0.3
libyaml = True
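Ansible uses an inventory file to know which hosts to manage. For this lab, we will create a simple inventory that contains only the local machine:
Create a new file named inventory.ini in the /home/labex/project directory with the following content:
[local]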
localhost ansible_connection=local
This inventory defines a group named local that contains only localhost, and tells Ansible to connect directly, without SSH.
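Ansible also accepts inventories written in YAML. As a rough sketch, the same single-host inventory could look like the following, assuming it is saved under a hypothetical name such as inventory.yml and passed with -i in place of inventory.ini:

```yaml
# inventory.yml (hypothetical file name): YAML form of the INI inventory above
all:
  children:
    local:                            # group name, matches [local]
      hosts:
        localhost:
          ansible_connection: local   # run tasks directly, without SSH
```

The rest of this lab keeps using the INI file, so this variant is optional.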
Now, let's create a simple playbook that runs a shell command:
Create a new file named first_playbook.yml in the /home/labex/project directory with the following content:
---
- name: Shell Command Example
  hosts: local
  gather_facts: no
  tasks:
    - name: Run a simple shell command
      shell: echo "Hello from Ansible shell command"
      register: hello_output

    - name: Display the output
      debug:
        msg: "{{ hello_output.stdout }}"
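This playbook does the following:
- Targets the local group
- Uses the register keyword to store the output in a variable
- Uses the debug module to display the output
Now let's run the playbook: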
ansible-playbook -i inventory.ini first_playbook.yml
You should see output similar to this:
PLAY [Shell Command Example] **************************************************
TASK [Run a simple shell command] *********************************************
changed: [localhost]
TASK [Display the output] *****************************************************
ok: [localhost] => {
"msg": "Hello from Ansible shell command"
}
PLAY RECAP ********************************************************************
localhost : ok=2 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
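The output shows that the playbook ran successfully, executing the shell command and displaying its output.
Note these important concepts from this exercise:
- The shell module lets you run shell commands
- The register directive captures a task's output in a variable
- The debug module helps display variable values
- The registered result includes stdout (standard output) and stderr (standard error)
In the next step, we will explore how to process and format shell command output more effectively.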
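The split between stdout and stderr can be checked with a small playbook. This is a minimal sketch; the play, task, and variable names (such as both_streams) are illustrative and not part of the lab files:

```yaml
---
# Sketch: a command that writes one line to each output stream
- name: Compare stdout and stderr
  hosts: local
  gather_facts: no
  tasks:
    - name: Write one line to each stream
      shell: echo "to stdout"; echo "to stderr" >&2
      register: both_streams

    - name: Show what each attribute captured
      debug:
        msg:
          - "stdout: {{ both_streams.stdout }}"
          - "stderr: {{ both_streams.stderr }}"
```

Each stream ends up in its own attribute, so error text does not get mixed into the output you parse.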
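Now that you understand the basics of running shell commands in Ansible, let's look at how to handle structured command output and display it in different formats.
Let's create a more practical playbook that collects system information and presents it in a structured way:
Create a new file named system_info.yml in the /home/labex/project directory with the following content:
---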
- name: Gather and Display System Information
  hosts: local
  gather_facts: no
  tasks:
    - name: Gather system information
      shell: |
        echo "OS Information: $(cat /etc/os-release | grep PRETTY_NAME | cut -d= -f2)"
        echo "Kernel Version: $(uname -r)"
        echo "CPU Information: $(grep "model name" /proc/cpuinfo | head -1 | cut -d: -f2 | xargs)"
        echo "Memory Information: $(free -h | grep Mem | awk '{print $2}')"
      register: system_info

    - name: Display raw system information
      debug:
        msg: "{{ system_info.stdout }}"

    - name: Display information as a list
      debug:
        msg: "{{ system_info.stdout_lines }}"
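This playbook runs several commands in one task, stores their combined output in the system_info variable, and displays it first as a raw string and then as a list of lines.
Run the playbook: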
ansible-playbook -i inventory.ini system_info.yml
You should see output similar to this:
PLAY [Gather and Display System Information] **********************************
TASK [Gather system information] **********************************************
changed: [localhost]
TASK [Display raw system information] *****************************************
ok: [localhost] => {
"msg": "OS Information: \"Ubuntu 22.04.1 LTS\"\nKernel Version: 5.15.0-1023-azure\nCPU Information: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz\nMemory Information: 4.0Gi"
}
TASK [Display information as a list] *****************************************
ok: [localhost] => {
"msg": [
"OS Information: \"Ubuntu 22.04.1 LTS\"",
"Kernel Version: 5.15.0-1023-azure",
"CPU Information: Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz",
"Memory Information: 4.0Gi"
]
}
PLAY RECAP ********************************************************************
localhost : ok=3 changed=1 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
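Notice how the second display task shows the output as a list rather than a single string with embedded newlines. The list form (stdout_lines) makes multi-line output easier to process.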
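One way to take advantage of the list form is to loop over it. The sketch below is self-contained and uses a printf command in place of the system information task; the names (letters and the task titles) are illustrative:

```yaml
---
# Sketch: handle each output line as its own loop item
- name: Process output line by line
  hosts: local
  gather_facts: no
  tasks:
    - name: Produce three lines of output
      shell: printf 'alpha\nbeta\ngamma\n'
      register: letters

    - name: Report how many lines were captured
      debug:
        msg: "Captured {{ letters.stdout_lines | length }} lines"

    - name: Handle each line as a separate loop item
      debug:
        msg: "Line: {{ item }}"
      loop: "{{ letters.stdout_lines }}"
```

The same loop would work on system_info.stdout_lines from the playbook above.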
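Shell commands on Linux return an exit code to indicate success (0) or failure (non-zero). Ansible captures this in the rc (return code) attribute of the registered variable.
Let's create a playbook that demonstrates how to use return codes:
Create a new file named command_results.yml in the /home/labex/project directory with the following content:
---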
- name: Working with Command Results
  hosts: local
  gather_facts: no
  tasks:
    - name: Check if a file exists
      shell: test -f /etc/hosts
      register: file_check
      ignore_errors: yes

    - name: Show command result details
      debug:
        msg: |
          Return code: {{ file_check.rc }}
          Succeeded: {{ file_check.rc == 0 }}
          Failed: {{ file_check.rc != 0 }}

    - name: Check if a non-existent file exists
      shell: test -f /file/does/not/exist
      register: missing_file
      ignore_errors: yes

    - name: Show command result for missing file
      debug:
        msg: |
          Return code: {{ missing_file.rc }}
          Succeeded: {{ missing_file.rc == 0 }}
          Failed: {{ missing_file.rc != 0 }}
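This playbook runs test -f against an existing file and a missing one, and uses ignore_errors: yes to keep the playbook from stopping when a command fails.
Run the playbook: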
ansible-playbook -i inventory.ini command_results.yml
You should see output similar to this:
PLAY [Working with Command Results] *******************************************
TASK [Check if a file exists] *************************************************
changed: [localhost]
TASK [Show command result details] ********************************************
ok: [localhost] => {
"msg": "Return code: 0\nSucceeded: True\nFailed: False\n"
}
TASK [Check if a non-existent file exists] ************************************
fatal: [localhost]: FAILED! => {"changed": true, "cmd": "test -f /file/does/not/exist", "delta": "0:00:00.003183", "end": "2023-07-14 15:24:33.931406", "msg": "non-zero return code", "rc": 1, "start": "2023-07-14 15:24:33.928223", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
...ignoring
TASK [Show command result for missing file] ***********************************
ok: [localhost] => {
"msg": "Return code: 1\nSucceeded: False\nFailed: True\n"
}
PLAY RECAP ********************************************************************
localhost : ok=4 changed=2 unreachable=0 failed=0 skipped=0 rescued=0 ignored=1
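Notice that the return code is 0 for the existing file and 1 for the file that does not exist. This shows how you can use return codes to make decisions in your playbooks.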
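A decision based on rc can be written with a when condition. The following is a minimal sketch rather than part of the lab files; it uses failed_when: false so that a non-zero return code is treated as an answer instead of an error, which is one possible alternative to ignore_errors:

```yaml
---
# Sketch: choose between two tasks based on a return code
- name: Branch on a return code
  hosts: local
  gather_facts: no
  tasks:
    - name: Check whether /etc/hosts exists
      shell: test -f /etc/hosts
      register: file_check
      failed_when: false    # a non-zero rc is an answer here, not an error
      changed_when: false   # the check does not modify the system

    - name: Run only when the file exists
      debug:
        msg: "/etc/hosts is present"
      when: file_check.rc == 0

    - name: Run only when the file is missing
      debug:
        msg: "/etc/hosts is missing"
      when: file_check.rc != 0
```

With this variant the check task is reported as ok rather than as an ignored failure.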
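Variables registered from shell commands contain several useful attributes:
- stdout: standard output as a single string
- stdout_lines: standard output split into a list of lines
- stderr: standard error as a single string
- stderr_lines: standard error split into a list of lines
- rc: return code (0 for success, non-zero for failure)
- cmd: the command that was executed
- start and end: timestamps for when the command started and finished
- delta: how long the command took to run
Understanding this structure is essential for working effectively with shell command output in Ansible.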
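To see several of these attributes side by side, a short playbook like the one below can help. It is only a sketch; the variable name timed and the sleep command are illustrative choices:

```yaml
---
# Sketch: print selected attributes of a registered shell result
- name: Inspect a registered result
  hosts: local
  gather_facts: no
  tasks:
    - name: Run a command that takes about a second
      shell: sleep 1; echo done
      register: timed

    - name: Show selected attributes
      debug:
        msg:
          - "cmd: {{ timed.cmd }}"
          - "rc: {{ timed.rc }}"
          - "delta: {{ timed.delta }}"
          - "stdout_lines: {{ timed.stdout_lines }}"
```

Replacing the second task with debug: var: timed would print the whole structure at once.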
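One of the most powerful features of Ansible is the ability to make decisions based on the output of shell commands. In this step, we will learn how to use conditionals and filters to process shell command output and make playbooks more dynamic.
Let's create a playbook that makes decisions based on the output of a shell command:
Create a new file named conditional_playbook.yml in the /home/labex/project directory with the following content:
---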
- name: Conditional Tasks Based on Command Output
  hosts: local
  gather_facts: no
  tasks:
    - name: Check disk space
      shell: df -h / | grep -v Filesystem | awk '{print $5}' | sed 's/%//'
      register: disk_usage

    - name: Display disk usage
      debug:
        msg: "Current disk usage: {{ disk_usage.stdout }}%"

    - name: Disk usage warning
      debug:
        msg: "WARNING: Disk usage is high"
      when: disk_usage.stdout|int > 50

    - name: Disk usage normal
      debug:
        msg: "Disk usage is normal"
      when: disk_usage.stdout|int <= 50
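This playbook:
- Uses when conditions based on the command output
- Uses the int filter to convert the string output to an integer for comparison
Run the playbook: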
ansible-playbook -i inventory.ini conditional_playbook.yml
The output will vary with your actual disk usage, but it will look something like this:
PLAY [Conditional Tasks Based on Command Output] ******************************
TASK [Check disk space] *******************************************************
changed: [localhost]
TASK [Display disk usage] *****************************************************
ok: [localhost] => {
"msg": "Current disk usage: 38%"
}
TASK [Disk usage warning] *****************************************************
skipped: [localhost]
TASK [Disk usage normal] ******************************************************
ok: [localhost] => {
"msg": "Disk usage is normal"
}
PLAY RECAP ********************************************************************
localhost : ok=3 changed=1 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
Notice how Ansible runs only one of the two conditional tasks, depending on the actual disk usage value.
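The same pattern can stop a play instead of printing a message. The sketch below makes two assumptions that are not in the lab: a hypothetical variable named disk_threshold, and GNU coreutils df, whose --output option prints only the requested column:

```yaml
---
# Sketch: abort the play when disk usage crosses a configurable threshold
- name: Stop the play when disk usage is too high
  hosts: local
  gather_facts: no
  vars:
    disk_threshold: 50   # hypothetical variable, in percent
  tasks:
    - name: Read root filesystem usage as a bare number
      # Assumes GNU coreutils df, which supports --output
      shell: df --output=pcent / | tail -n 1 | tr -dc '0-9'
      register: disk_usage
      changed_when: false

    - name: Fail when usage exceeds the threshold
      fail:
        msg: "Disk usage is {{ disk_usage.stdout }}%, above {{ disk_threshold }}%"
      when: disk_usage.stdout | int > disk_threshold
```

Keeping the threshold in vars means it can be overridden per host or on the command line without editing the tasks.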
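Many modern CLI tools return data in JSON format. Ansible has built-in support for handling JSON output:
Create a new file named json_output.yml in the /home/labex/project directory with the following content:
---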
- name: Handling JSON Output
  hosts: local
  gather_facts: no
  tasks:
    - name: Create a JSON file for testing
      copy:
        dest: /tmp/services.json
        content: |
          {
            "services": [
              {
                "name": "web",
                "status": "running",
                "port": 80
              },
              {
                "name": "database",
                "status": "stopped",
                "port": 5432
              },
              {
                "name": "cache",
                "status": "running",
                "port": 6379
              }
            ]
          }

    - name: Read JSON file with shell
      shell: cat /tmp/services.json
      register: json_output

    - name: Parse and display JSON content
      debug:
        msg: "{{ json_output.stdout | from_json }}"

    - name: Extract and display service information
      debug:
        msg: "Service: {{ item.name }}, Status: {{ item.status }}, Port: {{ item.port }}"
      loop: "{{ (json_output.stdout | from_json).services }}"

    - name: Show only running services
      debug:
        msg: "Running service: {{ item.name }} on port {{ item.port }}"
      loop: "{{ (json_output.stdout | from_json).services }}"
      when: item.status == "running"
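This playbook uses the from_json filter to parse a JSON string into a data structure, then loops over the parsed services and filters them by status.
Run the playbook: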
ansible-playbook -i inventory.ini json_output.yml
You should see output similar to this:
PLAY [Handling JSON Output] ***************************************************
TASK [Create a JSON file for testing] *****************************************
changed: [localhost]
TASK [Read JSON file with shell] **********************************************
changed: [localhost]
TASK [Parse and display JSON content] *****************************************
ok: [localhost] => {
"msg": {
"services": [
{
"name": "web",
"port": 80,
"status": "running"
},
{
"name": "database",
"port": 5432,
"status": "stopped"
},
{
"name": "cache",
"port": 6379,
"status": "running"
}
]
}
}
TASK [Extract and display service information] ********************************
ok: [localhost] => (item={'name': 'web', 'status': 'running', 'port': 80}) => {
"msg": "Service: web, Status: running, Port: 80"
}
ok: [localhost] => (item={'name': 'database', 'status': 'stopped', 'port': 5432}) => {
"msg": "Service: database, Status: stopped, Port: 5432"
}
ok: [localhost] => (item={'name': 'cache', 'status': 'running', 'port': 6379}) => {
"msg": "Service: cache, Status: running, Port: 6379"
}
TASK [Show only running services] *********************************************
ok: [localhost] => (item={'name': 'web', 'status': 'running', 'port': 80}) => {
"msg": "Running service: web on port 80"
}
skipped: [localhost] => (item={'name': 'database', 'status': 'stopped', 'port': 5432})
ok: [localhost] => (item={'name': 'cache', 'status': 'running', 'port': 6379}) => {
"msg": "Running service: cache on port 6379"
}
PLAY RECAP ********************************************************************
localhost : ok=5 changed=2 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
Notice how the playbook parses the JSON, extracts specific information, and filters the data based on a condition.
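Filtering can also be done without a loop, by parsing the JSON once and applying list filters. The following is a sketch under stated assumptions: it emits a small inline JSON document with echo rather than reading /tmp/services.json, and the fact name services is illustrative:

```yaml
---
# Sketch: parse JSON once, then filter the list with selectattr and map
- name: Filter parsed JSON without a loop
  hosts: local
  gather_facts: no
  tasks:
    - name: Emit a small JSON document
      shell: |
        echo '{"services": [{"name": "web", "status": "running"}, {"name": "database", "status": "stopped"}]}'
      register: json_output
      changed_when: false

    - name: Parse once and keep the list in a fact
      set_fact:
        services: "{{ (json_output.stdout | from_json).services }}"

    - name: List the names of running services
      debug:
        msg: "{{ services | selectattr('status', 'equalto', 'running') | map(attribute='name') | list }}"
```

With this input only web has the running status, so the final list contains a single name.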
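When running shell commands, it is important to handle potential errors:
Create a new file named error_handling.yml in the /home/labex/project directory with the following content:
---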
- name: Error Handling with Shell Commands
  hosts: local
  gather_facts: no
  tasks:
    - name: Run a potentially failing command
      shell: grep "nonexistent_pattern" /etc/passwd
      register: command_result
      ignore_errors: yes

    - name: Display success or failure
      debug:
        msg: "Command {{ 'succeeded' if command_result.rc == 0 else 'failed with return code ' + command_result.rc|string }}"

    - name: Run a custom failing command
      shell: exit 3
      register: exit_command
      ignore_errors: yes

    - name: Display detailed error information
      debug:
        msg: |
          Return code: {{ exit_command.rc }}
          Error message: {{ exit_command.stderr if exit_command.stderr else 'No error message' }}
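This playbook uses ignore_errors: yes to continue playbook execution even when a command fails, then inspects rc and stderr to report what happened.
Run the playbook: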
ansible-playbook -i inventory.ini error_handling.yml
You should see output similar to this:
PLAY [Error Handling with Shell Commands] *************************************
TASK [Run a potentially failing command] **************************************
fatal: [localhost]: FAILED! => {"changed": true, "cmd": "grep \"nonexistent_pattern\" /etc/passwd", "delta": "0:00:00.002916", "end": "2023-07-14 16:10:23.671519", "msg": "non-zero return code", "rc": 1, "start": "2023-07-14 16:10:23.668603", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
...ignoring
TASK [Display success or failure] *********************************************
ok: [localhost] => {
"msg": "Command failed with return code 1"
}
TASK [Run a custom failing command] *******************************************
fatal: [localhost]: FAILED! => {"changed": true, "cmd": "exit 3", "delta": "0:00:00.002447", "end": "2023-07-14 16:10:23.906121", "msg": "non-zero return code", "rc": 3, "start": "2023-07-14 16:10:23.903674", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
...ignoring
TASK [Display detailed error information] *************************************
ok: [localhost] => {
"msg": "Return code: 3\nError message: No error message\n"
}
PLAY RECAP ********************************************************************
localhost : ok=4 changed=2 unreachable=0 failed=0 skipped=0 rescued=0 ignored=2
This demonstrates how to capture and respond to different error conditions when running shell commands.
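For commands where a non-zero return code is not always an error, failed_when gives finer control than ignore_errors. grep is a common case: it returns 0 for a match, 1 for no match, and a value greater than 1 for a real error. A minimal sketch, with an illustrative variable name grep_result:

```yaml
---
# Sketch: treat "no match" as a normal result and fail only on real errors
- name: Distinguish "no match" from a real error
  hosts: local
  gather_facts: no
  tasks:
    - name: Search /etc/passwd for a pattern
      shell: grep "nonexistent_pattern" /etc/passwd
      register: grep_result
      failed_when: grep_result.rc > 1   # grep: 0 = match, 1 = no match, >1 = error
      changed_when: false

    - name: Report the outcome
      debug:
        msg: "{{ 'Pattern found' if grep_result.rc == 0 else 'Pattern not found' }}"
```

Here a missing pattern no longer shows up as a failed task, while an unreadable file would still stop the play.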
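In this final step, we will combine everything we have learned to create a practical Ansible playbook that collects system information, processes it, and generates a report. This represents a real-world scenario where Ansible's shell command handling is useful.
Let's create a comprehensive system information collection tool:
Create a new file named system_report.yml in the /home/labex/project directory with the following content:
---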
- name: Comprehensive System Report
  hosts: local
  gather_facts: no
  vars:
    report_file: /tmp/system_report.txt
  tasks:
    - name: Collect basic system information
      shell: |
        echo "SYSTEM REPORT" > {{ report_file }}
        echo "=============" >> {{ report_file }}
        echo "" >> {{ report_file }}
        echo "HOSTNAME: $(hostname)" >> {{ report_file }}
        echo "TIMESTAMP: $(date)" >> {{ report_file }}
        echo "" >> {{ report_file }}
        echo "SYSTEM INFORMATION" >> {{ report_file }}
        echo "------------------" >> {{ report_file }}
        echo "OS: $(cat /etc/os-release | grep PRETTY_NAME | cut -d= -f2)" >> {{ report_file }}
        echo "KERNEL: $(uname -r)" >> {{ report_file }}
        echo "UPTIME: $(uptime -p)" >> {{ report_file }}
        echo "" >> {{ report_file }}
        echo "RESOURCE UTILIZATION" >> {{ report_file }}
        echo "-------------------" >> {{ report_file }}
        echo "CPU LOAD: $(uptime | awk -F'load average:' '{print $2}')" >> {{ report_file }}
        echo "MEMORY USAGE:" >> {{ report_file }}
        free -h >> {{ report_file }}
        echo "" >> {{ report_file }}
        echo "DISK USAGE:" >> {{ report_file }}
        df -h >> {{ report_file }}
        echo "" >> {{ report_file }}
        echo "NETWORK INFORMATION" >> {{ report_file }}
        echo "-------------------" >> {{ report_file }}
        echo "IP ADDRESSES:" >> {{ report_file }}
        ip addr | grep "inet " | awk '{print $2}' >> {{ report_file }}
        echo "" >> {{ report_file }}
        echo "PROCESS INFORMATION" >> {{ report_file }}
        echo "-------------------" >> {{ report_file }}
        echo "TOP 5 CPU CONSUMING PROCESSES:" >> {{ report_file }}
        ps aux --sort=-%cpu | head -6 >> {{ report_file }}
        echo "" >> {{ report_file }}
        echo "TOP 5 MEMORY CONSUMING PROCESSES:" >> {{ report_file }}
        ps aux --sort=-%mem | head -6 >> {{ report_file }}
      register: report_generation

    - name: Check if report was generated successfully
      stat:
        path: "{{ report_file }}"
      register: report_stat

    - name: Display report generation status
      debug:
        msg: "Report generated successfully at {{ report_file }}"
      when: report_stat.stat.exists

    - name: Display report content
      shell: cat {{ report_file }}
      register: report_content
      when: report_stat.stat.exists

    - name: Show report content
      debug:
        msg: "{{ report_content.stdout_lines }}"
      when: report_stat.stat.exists

    - name: Analyze disk usage
      shell: df -h / | grep -v Filesystem | awk '{print $5}' | sed 's/%//'
      register: disk_usage
      when: report_stat.stat.exists

    - name: Generate disk usage alert if needed
      debug:
        msg: "ALERT: Disk usage on / is {{ disk_usage.stdout }}% which exceeds the 80% threshold!"
      when:
        - report_stat.stat.exists
        - disk_usage.stdout|int > 80

    - name: Generate disk usage warning if needed
      debug:
        msg: "WARNING: Disk usage on / is {{ disk_usage.stdout }}% which exceeds the 60% threshold."
      when:
        - report_stat.stat.exists
        - disk_usage.stdout|int > 60
        - disk_usage.stdout|int <= 80

    - name: Confirm normal disk usage
      debug:
        msg: "Disk usage on / is normal at {{ disk_usage.stdout }}%."
      when:
        - report_stat.stat.exists
        - disk_usage.stdout|int <= 60
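This playbook writes a report to the file named by report_file, uses the stat module to confirm that the file exists, displays its content, and then checks root filesystem usage against the 60% and 80% thresholds.
Run the playbook: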
ansible-playbook -i inventory.ini system_report.yml
You will see extensive output showing the playbook execution and the full system report. The output is long, so here is just a sample of what you might see:
PLAY [Comprehensive System Report] ********************************************
TASK [Collect basic system information] ***************************************
changed: [localhost]
TASK [Check if report was generated successfully] *****************************
ok: [localhost]
TASK [Display report generation status] ***************************************
ok: [localhost] => {
"msg": "Report generated successfully at /tmp/system_report.txt"
}
TASK [Display report content] *************************************************
changed: [localhost]
TASK [Show report content] ****************************************************
ok: [localhost] => {
"msg": [
"SYSTEM REPORT",
"=============",
"",
"HOSTNAME: ubuntu-vm",
"TIMESTAMP: Fri Jul 14 16:35:42 UTC 2023",
"",
"SYSTEM INFORMATION",
"------------------",
"OS: \"Ubuntu 22.04.1 LTS\"",
"KERNEL: 5.15.0-1023-azure",
"UPTIME: up 3 hours, 25 minutes",
...
Let's check the system report we generated:
cat /tmp/system_report.txt
This displays the complete report generated by our playbook.
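A report file can also be assembled from registered output instead of shell redirection. The sketch below is a much smaller report than the one above and is not part of the lab files; the mini_report path and the variable names kernel and host_name are hypothetical:

```yaml
---
# Sketch: capture values with shell, then write the file with the copy module
- name: Write a small report with the copy module
  hosts: local
  gather_facts: no
  vars:
    mini_report: /tmp/mini_report.txt   # hypothetical path
  tasks:
    - name: Collect the kernel version
      shell: uname -r
      register: kernel
      changed_when: false

    - name: Collect the hostname
      shell: hostname
      register: host_name
      changed_when: false

    - name: Render the report from registered output
      copy:
        dest: "{{ mini_report }}"
        content: |
          MINI REPORT
          HOSTNAME: {{ host_name.stdout }}
          KERNEL: {{ kernel.stdout }}
```

Because copy compares the rendered content with the existing file, the task reports changed only when the report actually differs.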
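For more complex operations, it is sometimes easier to write a dedicated shell script and call it from Ansible:
Create a new file named disk_analyzer.sh in the /home/labex/project directory with the following content:
#!/bin/bash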
## disk_analyzer.sh - A simple script to analyze disk usage
echo "DISK USAGE ANALYSIS"
echo "------------------"

## Get overall disk usage
ROOT_USAGE=$(df -h / | grep -v Filesystem | awk '{print $5}' | sed 's/%//')
echo "Root filesystem usage: ${ROOT_USAGE}%"

## Categorize the usage
if [ "$ROOT_USAGE" -gt 80 ]; then
  echo "STATUS: CRITICAL - Immediate action required"
elif [ "$ROOT_USAGE" -gt 60 ]; then
  echo "STATUS: WARNING - Consider cleaning up disk space"
else
  echo "STATUS: OK - Disk usage is within normal parameters"
fi
echo ""

## Find largest directories
echo "Top 5 largest directories in /var:"
du -h /var --max-depth=1 2> /dev/null | sort -hr | head -5
echo ""

## Find largest files
echo "Top 5 largest files in /var/log:"
find /var/log -type f -exec du -h {} \; 2> /dev/null | sort -hr | head -5
exit 0
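Make the script executable:
chmod +x /home/labex/project/disk_analyzer.sh
Create a playbook file named call_script.yml that calls the script:
touch /home/labex/project/call_script.yml
Add the following content to call_script.yml: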
---
- name: Call Custom Shell Script
  hosts: local
  gather_facts: no
  tasks:
    - name: Run disk analyzer script
      shell: /home/labex/project/disk_analyzer.sh
      register: script_output

    - name: Display script output
      debug:
        msg: "{{ script_output.stdout_lines }}"

    - name: Check for critical status
      debug:
        msg: "CRITICAL DISK USAGE DETECTED! Immediate action required."
      # Quoted so that YAML does not read the ": " inside the pattern as a key separator
      when: 'script_output.stdout is search("STATUS: CRITICAL")'
Run the playbook:
ansible-playbook -i inventory.ini call_script.yml
You should see output similar to this:
PLAY [Call Custom Shell Script] ***********************************************
TASK [Run disk analyzer script] ***********************************************
changed: [localhost]
TASK [Display script output] **************************************************
ok: [localhost] => {
"msg": [
"DISK USAGE ANALYSIS",
"------------------",
"Root filesystem usage: 38%",
"STATUS: OK - Disk usage is within normal parameters",
"",
"Top 5 largest directories in /var:",
"60M\t/var/lib",
"60M\t/var/cache",
"12M\t/var/log",
"4.0K\t/var/tmp",
"4.0K\t/var/mail",
"",
"Top 5 largest files in /var/log:",
"4.0M\t/var/log/journal/c75af53674ce472fb9654a1d5cf8cc37/system.journal",
"2.3M\t/var/log/auth.log",
"1.3M\t/var/log/syslog",
"724K\t/var/log/kern.log",
"428K\t/var/log/cloud-init.log"
]
}
TASK [Check for critical status] **********************************************
skipped: [localhost]
PLAY RECAP ********************************************************************
localhost : ok=2 changed=1 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
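This approach combines the power of shell scripts with Ansible's automation. The shell script handles the complex logic of disk analysis, while Ansible manages execution and further processing of the results.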
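When the target is a remote host rather than localhost, the script has to exist on that host before shell can run it. One option is Ansible's script module, which transfers a script from the control machine and executes it on the target. The following is a sketch that assumes the script path used in this lab; the second task is an illustrative way to pull out the STATUS line:

```yaml
---
# Sketch: run a script stored on the control machine against the target host
- name: Run a local script with the script module
  hosts: local
  gather_facts: no
  tasks:
    - name: Transfer and run disk_analyzer.sh
      script: /home/labex/project/disk_analyzer.sh
      register: script_output

    - name: Extract the STATUS line from the script output
      debug:
        msg: "{{ script_output.stdout_lines | select('match', '^STATUS:') | list }}"
```

On localhost the result should match the shell version, so the difference only matters once the inventory includes remote machines.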
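Through this lab, you have learned several important techniques for working with shell command output in Ansible. These skills will be invaluable as you build more complex automation solutions with Ansible.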
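In this lab, you learned how to work effectively with shell command output in Ansible playbooks. Starting with the basics of running shell commands and capturing their output, you progressed to more advanced techniques such as conditional execution, error handling, and processing structured data formats like JSON.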
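You have picked up several key skills:
- Running shell commands in Ansible playbooks with the shell module
- Capturing command output with the register directive
- Displaying output with the debug module
These techniques let you create more dynamic and responsive automation workflows that adapt to different system conditions and provide useful feedback about the operations being performed.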
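As you continue your Ansible journey, remember that while shell commands offer great flexibility, Ansible's built-in modules are usually the more robust and portable solution for common tasks. Use shell commands when you need to reuse existing shell scripts or perform complex operations that Ansible modules do not handle easily.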
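As one illustration of that advice, the test -f check used earlier in this lab could be replaced with the stat module, which this lab already uses for the report file. A minimal sketch, with an illustrative variable name hosts_stat:

```yaml
---
# Sketch: a module-based file check in place of a shell command
- name: Prefer a module over an equivalent shell command
  hosts: local
  gather_facts: no
  tasks:
    - name: Check for /etc/hosts with the stat module instead of "test -f"
      stat:
        path: /etc/hosts
      register: hosts_stat

    - name: Report the result
      debug:
        msg: "/etc/hosts exists: {{ hosts_stat.stat.exists }}"
```

The module returns a boolean directly, so there is no return code to interpret and no failure to ignore.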