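Ansible Command Module: From Basics to Practical Use

Introduction

In this lab, you will explore the Ansible Command module, which runs commands on managed hosts. It lets you invoke command-line programs directly from playbooks and tasks, which makes it a general-purpose tool for managing remote systems. You will learn how to run commands with the Command module, pass variables and arguments to them, and capture their output.

Create a Simple Ansible Playbook

In this step, you will create your first Ansible playbook, which uses the Command module to run a simple command.

First, navigate to the project directory: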

cd ~/project

Next, create a new file named simple_command.yml in a text editor of your choice. For example, with nano:

nano simple_command.yml

Add the following content to the file:

---
- name: Execute a simple command
  hosts: localhost
  tasks:
    - name: Run 'ls' command
      command: ls -l

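Let's break down this playbook:

The hosts: localhost line specifies that the playbook runs on the local machine.
The tasks section contains the list of tasks to run.
The command: ls -l line uses the Command module to run ls -l, which lists files and directories in long format.

Save the file and exit the editor (in nano, press Ctrl+X, then Y, then Enter).

Now run the playbook with the following command: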

ansible-playbook simple_command.yml

You should see output similar to the following:

PLAY [Execute a simple command] ************************************************

TASK [Gathering Facts] *********************************************************
ok: [localhost]

TASK [Run 'ls' command] ********************************************************
changed: [localhost]

PLAY RECAP *********************************************************************
localhost                  : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

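This output shows that the playbook ran successfully and executed ls -l on your local machine. The task is reported as changed because the Command module cannot tell whether a command modified the system, so by default it reports every successful run as a change.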
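Because ls -l only reads the directory, the changed status in the recap above can be misleading. The following is a minimal sketch, assuming you want read-only commands to be reported as ok rather than changed; it adds the changed_when task keyword to the same task:

```yaml
---
- name: Execute a read-only command
  hosts: localhost
  tasks:
    - name: Run 'ls' command
      command: ls -l
      # ls only reads the directory, so never report this task as a change
      changed_when: false
```

With this variant the recap should show changed=0 for the play, since the only command task no longer counts as a change.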

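Use Variables with the Command Module

In this step, you will learn how to use variables with the Ansible Command module. Variables make your playbooks more flexible and reusable.

Create a new file named variable_command.yml: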

nano variable_command.yml

Add the following content to the file:

---
- name: Use variables with the Command module
  hosts: localhost
  vars:
    file_path: /etc/passwd
    line_count: 5
  tasks:
    - name: Display the last few lines of a file
      command: "tail -n {{ line_count }} {{ file_path }}"
      register: command_output

    - name: Show the command output
      debug:
        var: command_output.stdout_lines

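This playbook introduces several new concepts:

The vars section defines variables that can be used throughout the playbook.
The {{ variable_name }} syntax references a variable inside the command.
The register keyword saves the result of the command in a variable named command_output.
The debug module displays the stdout_lines field of command_output, which holds the command's standard output as a list of lines.

Save the file and exit the editor.

Now run the playbook: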

ansible-playbook variable_command.yml

You should see output similar to the following:

PLAY [Use variables with the Command module] ***********************************

TASK [Gathering Facts] *********************************************************
ok: [localhost]

TASK [Display the last few lines of a file] ************************************
changed: [localhost]

TASK [Show the command output] *************************************************
ok: [localhost] => {
    "command_output.stdout_lines": [
        "games:x:5:60:games:/usr/games:/usr/sbin/nologin",
        "man:x:6:12:man:/var/cache/man:/usr/sbin/nologin",
        "lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin",
        "mail:x:8:8:mail:/var/mail:/usr/sbin/nologin",
        "news:x:9:9:news:/var/spool/news:/usr/sbin/nologin"
    ]
}

PLAY RECAP *********************************************************************
localhost                  : ok=3    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

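This output shows the last 5 lines of the /etc/passwd file, which confirms that both variables were substituted into the command. The exact lines depend on the accounts defined on your system.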
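When variable values may contain spaces or other special characters, building the command as a single string can split arguments in unexpected places. As a hedged alternative, the Command module also accepts its command as a list through the argv parameter. The sketch below reuses the variables from the playbook above; the second debug task is only an illustration of another field available on the registered result:

```yaml
---
- name: Pass variables to the Command module as an argument list
  hosts: localhost
  vars:
    file_path: /etc/passwd
    line_count: 5
  tasks:
    - name: Display the last few lines of a file
      command:
        # Each list item becomes exactly one argument; no word splitting occurs
        argv:
          - tail
          - "-n"
          - "{{ line_count }}"
          - "{{ file_path }}"
      register: command_output

    - name: Show the command output
      debug:
        var: command_output.stdout_lines

    - name: Show the return code
      debug:
        msg: "tail exited with return code {{ command_output.rc }}"
```

The free-form string and argv are mutually exclusive, so a task uses one or the other.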

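Capture and Process Command Output

In this step, you will learn how to capture the output of a command and process it further with Ansible.

Create a new file named process_output.yml: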

nano process_output.yml

Add the following content to the file:

---
- name: Capture and process command output
  hosts: localhost
  tasks:
    - name: Get disk usage information
      command: df -h
      register: df_output

    - name: Display all partitions
      debug:
        msg: "{{ df_output.stdout_lines }}"

    - name: Find root partition
      set_fact:
        root_partition: "{{ df_output.stdout_lines | select('match', '\\s+/$') | first | default('') }}"

    - name: Display root partition information
      debug:
        msg: "Root partition: {{ root_partition }}"
      when: root_partition != ''

    - name: Extract usage percentage
      set_fact:
        root_usage: "{{ root_partition.split()[-2].rstrip('%') | int }}"
      when: root_partition != ''

    - name: Display root partition usage
      debug:
        msg: "Root partition is {{ root_usage }}% full"
      when: root_partition != ''

    - name: Check if root partition is over 80% full
      fail:
        msg: "Warning: Root partition is over 80% full!"
      when: root_partition != '' and root_usage | int > 80

    - name: Display message if root partition not found
      debug:
        msg: "Root partition (/) not found in df output"
      when: root_partition == ''

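This playbook guards against the case where the root partition cannot be identified in the df output:

It displays all partitions so you can see what is available.
It uses a regular expression to look for the line whose mount point is /.
It adds when conditions to handle the case where the root partition is not found.
It uses the default('') filter to avoid an error when no line matches.

Save the file and exit the editor.

Now run the playbook: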

ansible-playbook process_output.yml

You should see output similar to the following:

PLAY [Capture and process command output] ************************************************

TASK [Gathering Facts] *******************************************************************
ok: [localhost]

TASK [Get disk usage information] ********************************************************
changed: [localhost]

TASK [Display all partitions] ************************************************************
ok: [localhost] => {
    "msg": [
        "Filesystem      Size  Used Avail Use% Mounted on",
        "overlay          20G  618M   20G   4% /",
        "tmpfs            64M     0   64M   0% /dev",
        "tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup",
        "shm              64M  128K   64M   1% /dev/shm",
        "/dev/vdb        100G   17G   84G  17% /etc/hosts"
    ]
}

TASK [Find root partition] ***************************************************************
ok: [localhost]

TASK [Display root partition information] ************************************************
skipping: [localhost]

TASK [Extract usage percentage] **********************************************************
skipping: [localhost]

TASK [Display root partition usage] ******************************************************
skipping: [localhost]

TASK [Check if root partition is over 80% full] ******************************************
skipping: [localhost]

TASK [Display message if root partition not found] ***************************************
ok: [localhost] => {
    "msg": "Root partition (/) not found in df output"
}

PLAY RECAP *******************************************************************************
localhost                  : ok=5    changed=1    unreachable=0    failed=0    skipped=4    rescued=0    ignored=0

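This output lists all partitions. In this sample run the root partition was not identified even though the df output contains a line for /, so the four tasks that depend on it were skipped and the fallback message was displayed. The likely cause is that Ansible's match test anchors the pattern at the start of each line, so \s+/$ cannot match a line that begins with a filesystem name. Actual values will vary with your system.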
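If you want the lookup to find the / line in output like the sample above, one option is the search test, which looks for the pattern anywhere in the line instead of only at its start. The following is a minimal sketch under that assumption; it is a shortened variant of the playbook above, not a replacement for every task in it:

```yaml
---
- name: Find the root partition with the search test
  hosts: localhost
  tasks:
    - name: Get disk usage information
      command: df -h
      register: df_output

    - name: Find root partition
      set_fact:
        # 'search' matches anywhere in the line; the pattern requires
        # whitespace followed by "/" at the very end of the line
        root_partition: "{{ df_output.stdout_lines | select('search', '\\s/$') | first | default('') }}"

    - name: Display root partition usage
      debug:
        msg: "Root partition is {{ root_partition.split()[-2].rstrip('%') | int }}% full"
      when: root_partition != ''
```

For the sample df output shown earlier, the matching line would be the overlay entry, whose second-to-last column is 4%.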

Use Command Module Options

In this step, you will explore some of the options that control how the Ansible Command module behaves.

Create a new file named command_options.yml:

nano command_options.yml

Add the following content to the file:

---
- name: Explore Command module options
  hosts: localhost
  tasks:
    - name: Run a command with a specific working directory
      command: pwd
      args:
        chdir: /tmp

    - name: Run a command with environment variables
      command: echo $MY_VAR
      environment:
        MY_VAR: "Hello from Ansible"

    - name: Run a command and ignore errors
      command: ls /nonexistent_directory
      ignore_errors: yes

    - name: Run a command with a timeout
      command: sleep 2
      async: 5
      poll: 0
      register: sleep_result

    - name: Check sleep command status
      async_status:
        jid: "{{ sleep_result.ansible_job_id }}"
      register: job_result
      until: job_result.finished
      retries: 5
      delay: 1

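This playbook demonstrates several ways to control how a command runs. Only chdir is an argument of the Command module itself; environment, ignore_errors, async, and poll are task keywords that work with any module:

chdir: changes the working directory before the command runs.
environment: sets environment variables for the command.
ignore_errors: continues the playbook even if the command fails.
async and poll: run the command asynchronously. async sets the maximum run time in seconds, and poll: 0 starts the command without waiting for it, so a later async_status task checks whether it has finished.

Save the file and exit the editor.

Now run the playbook: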

ansible-playbook command_options.yml

You should see output similar to the following:

PLAY [Explore Command module options]

TASK [Gathering Facts]
ok: [localhost]

TASK [Run a command with a specific working directory]
changed: [localhost]

TASK [Run a command with environment variables]
changed: [localhost]

TASK [Run a command and ignore errors]
fatal: [localhost]: FAILED! => {"changed": true, "cmd": ["ls", "/nonexistent_directory"], "delta": "0:00:00.006113", "end": "2024-09-06 09:40:43.373350", "msg": "non-zero return code", "rc": 2, "start": "2024-09-06 09:40:43.367237", "stderr": "ls: cannot access '/nonexistent_directory': No such file or directory", "stderr_lines": ["ls: cannot access '/nonexistent_directory': No such file or directory"], "stdout": "", "stdout_lines": []}
...ignoring

TASK [Run a command with a timeout]
changed: [localhost]

TASK [Check sleep command status]
FAILED - RETRYING: Check sleep command status (10 retries left).
FAILED - RETRYING: Check sleep command status (9 retries left).
FAILED - RETRYING: Check sleep command status (8 retries left).
FAILED - RETRYING: Check sleep command status (7 retries left).
FAILED - RETRYING: Check sleep command status (6 retries left).
FAILED - RETRYING: Check sleep command status (5 retries left).
FAILED - RETRYING: Check sleep command status (4 retries left).
FAILED - RETRYING: Check sleep command status (3 retries left).
FAILED - RETRYING: Check sleep command status (2 retries left).
FAILED - RETRYING: Check sleep command status (1 retries left).
fatal: [localhost]: FAILED! => {"ansible_job_id": "5877920468.2517", "attempts": 10, "changed": false, "finished": 0, "started": 1}

PLAY RECAP

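This output shows how each option affects the run. In this sample, the ls task failed but the failure was ignored, and the final status check used up its retries without the job reporting that it had finished, so your own result for that task may differ. The sample also shows 10 retries, whereas the playbook above sets retries: 5.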
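Besides chdir, the Command module accepts a creates argument that skips the command when a given path already exists, which is one way to make a command task safe to run repeatedly. The sketch below is illustrative only; the marker file path is a hypothetical name chosen for this example:

```yaml
---
- name: Skip a command when its result already exists
  hosts: localhost
  tasks:
    - name: Create a marker file only once
      command: touch /tmp/command_demo.marker
      args:
        # Hypothetical path: if it already exists, the command is not run
        creates: /tmp/command_demo.marker
```

On the first run the task should report changed; on later runs it should report ok, because the file named in creates is already present.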

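Use the Command Module in a Docker-Friendly Scenario

In this final step, you will use the Ansible Command module in a more realistic, Docker-friendly scenario: checking the status of the SSH service and managing it if necessary.

Create a new file named check_service_docker.yml: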

nano check_service_docker.yml

Add the following content to the file:

---
- name: Check and manage SSH service in Docker
  hosts: localhost
  become: yes # Allows Ansible to use sudo
  tasks:
    - name: Check SSH service status
      command: service ssh status
      register: ssh_status
      ignore_errors: yes

    - name: Display SSH service status
      debug:
        msg: "SSH service status: {{ ssh_status.stdout }}"

    - name: Start SSH service if not running
      command: service ssh start
      when: ssh_status.rc != 0

    - name: Verify SSH service is running
      command: service ssh status
      register: ssh_status_after

    - name: Display final SSH service status
      debug:
        msg: "SSH service status is now: {{ ssh_status_after.stdout }}"

    - name: Check if SSH port is listening
      # A pipe needs a shell; the command module would pass "|" to netstat as a plain argument
      shell: netstat -tuln | grep :22
      register: ssh_port_check
      ignore_errors: yes

    - name: Display SSH port status
      debug:
        msg: "SSH port 22 is {{ 'open' if ssh_port_check.rc == 0 else 'closed' }}"

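This playbook does the following:

Checks the status of the SSH service with the service command.
Displays the current status of the service.
Starts the service if it is not running.
Verifies the service status after the possible start.
Displays the final status of the service.
Checks whether the SSH port (22) is listening.
Displays the status of the SSH port.

Save the file and exit the editor.

Now run the playbook with sudo privileges: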

sudo ansible-playbook check_service_docker.yml

You should see output similar to the following:

PLAY [Check and manage SSH service in Docker] *****************************************

TASK [Gathering Facts] ****************************************************************
ok: [localhost]

TASK [Check SSH service status] *******************************************************
changed: [localhost]

TASK [Display SSH service status] *****************************************************
ok: [localhost] => {
    "msg": "SSH service status: * sshd is running"
}

TASK [Start SSH service if not running] ***********************************************
skipping: [localhost]

TASK [Verify SSH service is running] **************************************************
changed: [localhost]

TASK [Display final SSH service status] ***********************************************
ok: [localhost] => {
    "msg": "SSH service status is now: * sshd is running"
}

TASK [Check if SSH port is listening] *************************************************
changed: [localhost]

TASK [Display SSH port status] ********************************************************
ok: [localhost] => {
    "msg": "SSH port 22 is open"
}

PLAY RECAP ****************************************************************************
localhost                  : ok=7    changed=3    unreachable=0    failed=0    skipped=1    rescued=0    ignored=0

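This output shows that the SSH service was already running, so the start task was skipped. The playbook checked and verified the service status and confirmed that the SSH port is open.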

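Summary

In this lab, you explored what the Ansible Command module can do. You learned how to:

Create a simple Ansible playbook that uses the Command module to run basic commands.
Use variables with the Command module to make your playbooks more flexible and reusable.
Capture and process command output so that you can make decisions based on command results.
Use options such as changing the working directory, setting environment variables, and handling errors to control how commands run.
Apply the Command module to a practical scenario by checking and managing a system service.

These skills give you a solid foundation for automating system administration tasks and managing remote hosts efficiently with Ansible. As you keep working with Ansible, you will find the Command module a useful general-purpose tool in your automation toolkit.

Keep in mind that although the Command module is powerful, it is usually better to use a dedicated Ansible module when one is available (for example, the service module for managing services). Dedicated modules offer better idempotence and handle more complex scenarios out of the box.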
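As an illustration of that advice, the SSH check from the previous step could be expressed with the service module instead of service ssh status and service ssh start. This is a minimal sketch, assuming the service is named ssh as in the earlier playbook and that the service module can detect a supported init system in your environment, which may not hold in every container:

```yaml
---
- name: Manage SSH with the service module
  hosts: localhost
  become: yes
  tasks:
    - name: Ensure SSH service is started
      service:
        # Assumed service name, taken from the earlier playbook
        name: ssh
        state: started
```

The module starts the service only when it is not already running, so a single task covers both the status check and the conditional start.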

Keep practicing and exploring Ansible's features to build your automation skills and streamline your IT operations.