Linux strace Command with Practical Examples

Introduction

In this lab, we will use the Linux strace command to trace the system calls a process makes. System calls are how a program asks the kernel to open files, allocate memory, or write output, so a trace often reveals why a program fails or stalls. We will start by installing strace and tracing the ls command, then trace a Python script and summarize its system calls with the -c option, and finally filter the trace of a small C program while debugging it.

Introduction to strace Command

In this step, we will explore the strace command, a powerful tool in Linux that allows you to trace and monitor the system calls made by a running process. System calls are the interface between a process and the operating system, and understanding them can be crucial for debugging and troubleshooting issues.

Let's start by installing the strace package:

sudo apt-get update
sudo apt-get install -y strace

Example output:

Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
  libunwind8
Suggested packages:
  fakeroot
The following NEW packages will be installed:
  libunwind8 strace
0 upgraded, 2 newly installed, 0 to remove and 0 not upgraded.
Need to get 292 kB of archives.
After this operation, 1,054 kB of additional disk space will be used.
Do you want to continue? [Y/n] Y
...

Now, let's try using the strace command to trace a simple program. We'll use the ls command as an example:

strace ls

Example output:

execve("/usr/bin/ls", ["ls"], 0x7ffee4f7a0f0 /* 23 vars */) = 0
brk(NULL)                               = 0x55b7d6c23000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
...

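The output shows the sequence of system calls made by the ls command: execve starts the program, brk(NULL) queries the current end of the heap (the program break), access checks whether /etc/ld.so.preload exists and is readable (it does not, hence ENOENT), and openat opens the dynamic linker cache file. Each line has the form name(arguments) = return value; a return value of -1 is followed by the error code and its description.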

By analyzing the strace output, you can gain insights into how a program interacts with the operating system, which can be helpful for debugging and understanding program behavior.
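To make the structure of a trace line concrete, here is a minimal Python sketch that splits one line into the system call name, its arguments, and its result. The helper name parse_strace_line and the regular expression are illustrative assumptions, not part of strace, and the pattern only handles the simple single-line form shown above.

```python
import re

# Hypothetical pattern for the simple "name(args) = result" form only.
LINE_RE = re.compile(r'^(?P<name>\w+)\((?P<args>.*)\)\s+=\s+(?P<result>.+)$')


def parse_strace_line(line):
    """Return (syscall, args, result), or None if the line has another shape."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return match.group("name"), match.group("args"), match.group("result")


# Sample line copied from the ls trace above.
sample = 'access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)'
parsed = parse_strace_line(sample)
print(parsed)

# Lines such as the exit banner are not system calls and yield None.
banner = parse_strace_line("+++ exited with 0 +++")
print(banner)
```

Real traces also contain signals and calls split across lines by "<unfinished ...>", so a parser for real use would need more cases than this sketch covers.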

Tracing System Calls with strace

In this step, we will dive deeper into using the strace command to trace system calls made by a running process.

Let's start by creating a simple Python script that we can use for tracing:

cat > ~/project/example.py << EOF
import time

print("Hello, World!")
time.sleep(5)
EOF

Now, let's trace the execution of this script using strace:

strace python ~/project/example.py

Example output:

execve("/usr/bin/python", ["python", "/home/labex/project/example.py"], 0x7ffee4f7a0f0 /* 23 vars */) = 0
brk(NULL)                               = 0x55b7d6c23000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
...
write(1, "Hello, World!\n", 14)         = 14
time(NULL)                              = 1618304400
nanosleep({5, 0}, NULL)                 = 0
exit_group(0)                           = ?
+++ exited with 0 +++

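The output shows the sequence of system calls made by the Python script: execve starts the Python interpreter, write sends the "Hello, World!" message to standard output, and nanosleep pauses the script for 5 seconds. Treat the listing as illustrative, because the exact calls depend on the Python and C library versions: time.sleep may appear as select, nanosleep, or clock_nanosleep, and time is often served through the vDSO without a system call, in which case strace does not show it.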

You can use the strace output to understand how your program interacts with the operating system and identify any potential issues or performance bottlenecks.
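To connect a trace line to the code that causes it, the following minimal sketch calls os.write directly. os.write is a thin wrapper around the write system call, so under strace it should appear much like the write(1, "Hello, World!\n", 14) = 14 line above. The variable names are illustrative.

```python
import os

message = b"Hello, World!\n"

# os.write wraps the write system call: file descriptor 1 is standard
# output, and the return value is the number of bytes actually written.
bytes_written = os.write(1, message)

print(f"write returned {bytes_written} for a {len(message)}-byte message")
```

The third argument in the trace (14) is the requested length, and the value after the equals sign is what the kernel reports as written; the two match when the write completes in full.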

Let's try another example, this time tracing the execution of the ls command with some additional options:

strace -c ls -l ~/project

Example output:

total 4
-rw-r--r-- 1 labex labex 59 Apr 12 13:33 example.py
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 45.45    0.000005           5         1           execve
 27.27    0.000003           3         1           brk
  9.09    0.000001           0         2           access
  9.09    0.000001           1         1           openat
  9.09    0.000001           0         3           close
  0.00    0.000000           0         4           read
  0.00    0.000000           0         2           fstat
  0.00    0.000000           0         1           mmap
  0.00    0.000000           0         1           mprotect
  0.00    0.000000           0         1           munmap
  0.00    0.000000           0         2           ioctl
  0.00    0.000000           0         1           statfs
  0.00    0.000000           0         2           newfstatat
------ ----------- ----------- --------- --------- ----------------
100.00    0.000011                    22           total

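In this example, the -c option replaces the per-call trace with a summary table, which strace prints to standard error after the normal output of ls. For each system call, the table lists the percentage of the traced time, the total time in seconds, the average time per call in microseconds, the number of calls, and the number of calls that returned an error.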

This information can be useful for identifying performance bottlenecks or understanding the behavior of a program.
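If you want to rank the rows of such a summary programmatically, a short parser is enough. The following is a minimal sketch, assuming the column layout shown above; parse_summary is a hypothetical helper, and the sample rows are copied from the example output rather than from a new run.

```python
def parse_summary(text):
    """Parse the data rows of an `strace -c` table (hypothetical helper)."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows have 5 fields, or 6 when the errors column is filled in.
        if len(parts) not in (5, 6) or parts[-1] == "total":
            continue
        try:
            row = {
                "percent": float(parts[0]),
                "seconds": float(parts[1]),
                "usecs_per_call": int(parts[2]),
                "calls": int(parts[3]),
                "errors": int(parts[4]) if len(parts) == 6 else 0,
                "syscall": parts[-1],
            }
        except ValueError:
            continue  # header, separator, or unrelated program output
        rows.append(row)
    return rows


# Sample rows copied from the summary above.
SAMPLE = """\
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 45.45    0.000005           5         1           execve
 27.27    0.000003           3         1           brk
  0.00    0.000000           0         4           read
------ ----------- ----------- --------- --------- ----------------
100.00    0.000011                    22           total
"""

rows = parse_summary(SAMPLE)
busiest = max(rows, key=lambda r: r["calls"])
print(busiest["syscall"], busiest["calls"])
```

Sorting by calls highlights chatty programs, while sorting by seconds highlights calls where the kernel spends the most time.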

Debugging Processes with strace

In this step, we will learn how to use the strace command to debug running processes and identify potential issues.

Let's start by creating a simple C program that we can use for debugging:

cat > ~/project/example.c << EOF
#include <stdio.h>
#include <unistd.h>

int main() {
    printf("Hello, World!\n");
    sleep(5);
    return 0;
}
EOF

Now, let's compile the program and run it with strace:

gcc -o ~/project/example ~/project/example.c
strace ~/project/example

Example output:

execve("/home/labex/project/example", ["/home/labex/project/example"], 0x7ffee4f7a0f0 /* 23 vars */) = 0
brk(NULL)                               = 0x55b7d6c23000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
...
write(1, "Hello, World!\n", 14)         = 14
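clock_nanosleep(CLOCK_REALTIME, 0, {tv_sec=5, tv_nsec=0}, 0x7ffee4f7a0a0) = 0
exit_group(0)                           = ?
+++ exited with 0 +++

The output shows the system calls made by the C program: execve starts it, write sends the "Hello, World!" message to standard output (file descriptor 1), and clock_nanosleep pauses it for 5 seconds. The name sleep never appears in a trace because sleep is a C library function rather than a system call; depending on the glibc version, it is implemented with clock_nanosleep or nanosleep. Addresses and other details will differ on your system.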

Now, let's say we want to debug a problem with the program. We can use strace to identify the issue. For example, let's assume the program is not writing the expected output to a file. We can trace the file-related system calls to see what's happening:

strace -e trace=file ~/project/example

Example output:

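execve("/home/labex/project/example", ["/home/labex/project/example"], 0x7ffee4f7a0f0 /* 23 vars */) = 0
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
...
+++ exited with 0 +++

With trace=file, strace shows only the system calls that take a file name as an argument, so write, which operates on an already open file descriptor, is filtered out. The calls that remain are execve and the access and openat calls made by the dynamic loader; the program itself never opens a file. This tells us the missing file output is not caused by a failed open: the program simply never attempts one. To see the write to standard output, filter with -e trace=write or -e trace=desc instead.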

By using strace to trace specific system calls or the overall system call activity, you can often identify the root cause of issues in your programs and debug them more effectively.
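When a trace is long, failed calls are usually the first thing to look for. The following minimal sketch scans trace text for calls that returned -1 and reports the error code. It assumes the trace was saved to a file first (for example with strace's -o option); find_failures is a hypothetical helper, and the sample text is copied from the listings in this lab.

```python
import re

# Matches a failed call such as: name(args) = -1 ENOENT (description)
FAILURE_RE = re.compile(r'^(\w+)\(.*\)\s+=\s+-1\s+(E[A-Z0-9]+)')


def find_failures(trace_text):
    """Return (syscall, error code) pairs for calls that returned -1."""
    failures = []
    for line in trace_text.splitlines():
        match = FAILURE_RE.match(line)
        if match:
            failures.append((match.group(1), match.group(2)))
    return failures


# Sample lines copied from the traces shown in this lab.
SAMPLE_TRACE = """\
execve("/home/labex/project/example", ["/home/labex/project/example"], 0x7ffee4f7a0f0 /* 23 vars */) = 0
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
+++ exited with 0 +++
"""

failures = find_failures(SAMPLE_TRACE)
print(failures)
```

Not every failure is a bug: the ENOENT on /etc/ld.so.preload is routine, so the list is a starting point for investigation rather than a verdict.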

Summary

In this lab, we explored the Linux strace command, which traces the system calls a process makes. We installed strace and traced the ls command to see how a program starts up and interacts with the operating system. We then traced a Python script, read its write and sleep-related calls, and used the -c option to summarize system calls by count and time. Finally, we compiled a small C program and used -e trace=file to narrow the trace to file-related calls while debugging. Reading strace output in this way helps explain program behavior and locate the system call where a problem occurs.
