DAY 06: The Process Overseer

Introduction

Welcome, Junior System Administrator! It's a busy Monday morning at "LabEx," and a critical alert just came in: the main application server is experiencing a significant slowdown, affecting all users. The senior administrators are tied up in an emergency meeting, and it's up to you to investigate and stabilize the system.

This is your moment to shine. Your mission is to dive into the server's command line, diagnose the issue by inspecting the running processes, neutralize any resource-hogging culprits, and ensure essential services remain operational. By the end of this challenge, you'll have proven your ability to manage a live Linux environment under pressure, a core skill for any system administrator.

Important Notice
The upcoming challenges may exceed the scope of the Quick Start with Linux course.
If you encounter difficulties during the challenge:
  1. Temporarily skip the challenge and continue with subsequent Guided Labs in the Linux learning path.
  2. Discuss with Labby or view the solution.

Listing Active System Processes

Your first step as the Process Overseer is to get a complete picture of what's currently running on the server. A static snapshot of all active processes will help you begin your investigation and identify anything unusual.

Tasks

  • Use a single command to generate a detailed list of all processes running on the system.

Requirements

  • The command must display processes for all users, not just your own.
  • The output format should be user-oriented, showing details like the user who owns the process, CPU/memory usage, and the full command that started it.

Examples

After running the command, you should see output similar to this:

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.1 169848  9064 ?        Ss   08:30   0:02 /sbin/init
labex     1234  0.0  0.0   2324   564 pts/0    S+   08:35   0:00 bash /home/labex/project/resource_hog.sh
labex     1235  0.0  0.0   2324   564 ?        S    08:35   0:00 bash /home/labex/project/critical_service.sh
...

The output will show multiple processes with columns for user, process ID, CPU usage, memory usage, and the command that started each process.

Hints

  • The most common command for this task is ps.
  • Think about which options for the ps command would show processes for all users, in a user-friendly format, and include processes not attached to a terminal.
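
As a rough illustration of how ps option letters combine, the sketch below takes one snapshot and prints only its first few rows. It assumes a procps-style ps that accepts BSD-style options without a leading dash; the variable names snapshot and process_count are made up for this example.

```shell
# Take one static snapshot of the process table.
#   a = include processes belonging to all users
#   u = user-oriented columns (USER, PID, %CPU, %MEM, ..., COMMAND)
#   x = also include processes with no controlling terminal (TTY shown as "?")
snapshot=$(ps aux)

# Show only the header and the first few rows to keep the output readable.
printf '%s\n' "$snapshot" | head -n 5

# Count the processes in the snapshot (total lines minus the header line).
process_count=$(( $(printf '%s\n' "$snapshot" | wc -l) - 1 ))
echo "Processes in snapshot: $process_count"
```

Because the snapshot is stored in a variable, you can filter or count it several times without the process list changing between commands.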

Monitoring Process Resource Usage

The static list from ps was a good start, but the server's load is changing every second. You need a live, dynamic view to see which process is actively causing the slowdown. It's time to bring out a more powerful monitoring tool.

Tasks

  • Launch an interactive command-line utility to monitor system processes and their resource usage in real time.
  • Identify the name of the script that is consuming the most CPU.

Requirements

  • You must use a tool that provides a continuously updated, real-time view of system processes.
  • The tool should sort processes by CPU usage by default.
  • Once you have identified the top consumer, exit the tool to proceed to the next step.

Examples

When you launch the monitoring tool, you should see an interactive display that updates automatically, showing something like:

top - 09:15:30 up  1:45,  1 user,  load average: 1.50, 1.20, 0.85
Tasks: 105 total,   2 running, 103 sleeping,   0 stopped,   0 zombie
%Cpu(s): 45.0 us,  5.0 sy,  0.0 ni, 50.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :   2048.0 total,    850.4 free,    950.2 used,    247.4 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used,      0.0 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 1234 labex     20   0   12884   1564   1320 R  95.0   0.1   2:15.30 bash /home/labex/project/resource_hog.sh
 1235 labex     20   0   12884   1564   1320 S   0.0   0.1   0:00.00 bash /home/labex/project/critical_service.sh
    1 root      20   0  169848   9064   6868 S   0.0   0.4   0:02.15 systemd
...

The display will show system statistics at the top and a list of processes sorted by CPU usage, with the highest CPU-consuming process at the top.

Hints

  • This popular command is often referred to as the "Task Manager" of the Linux world.
  • You can exit this interactive tool by pressing the q key.
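
The interactive view is what this step asks for, but the same tool can also print a single plain-text refresh, which is handy when you want to save or search what you saw. This is a minimal sketch, assuming the procps version of top; the variable name top_snapshot is made up for this example.

```shell
# One-shot, non-interactive snapshot from top (procps top assumed).
#   -b    batch mode: print plain text instead of drawing the live screen
#   -n 1  stop after a single refresh
top_snapshot=$(top -b -n 1)

# The summary area comes first, followed by the process table.
printf '%s\n' "$top_snapshot" | head -n 12
```

Batch mode does not replace the live display; it only captures one moment of it, much like ps does.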

Identifying Key Processes

You've found the troublemaker: resource_hog.sh. However, a good sysadmin doesn't just terminate processes wildly. You also noticed critical_service.sh running. Before you take any action against the hog, you should identify and understand all the key processes running on the system.

Tasks

  • Find the Process ID (PID) of the critical_service.sh script.
  • Verify that the critical service is running properly.

Requirements

  • You must use the pgrep command to find the PID of the process running critical_service.sh.
  • The command should successfully locate the running process and display its PID.

Examples

After finding the PID with pgrep, you should see output like:

1235

This number (1235 in this example) is the Process ID of the critical service process.

You can verify the process details using:

ps -p 1235 -o pid,ppid,cmd

Which should show output similar to:

    PID    PPID CMD
   1235       1 /bin/bash /home/labex/project/critical_service.sh

Hints

  • pgrep can find a PID based on a process name.
  • Use pgrep -f to match against the full command line.
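
To see the difference that -f makes without touching the lab's own scripts, you could try pgrep on a throwaway process. In the sketch below, sleep 4242 is a hypothetical stand-in for a long-running service, and demo_pid and found_pids are names made up for this example.

```shell
# Start a throwaway background process to search for.
sleep 4242 &
demo_pid=$!

# Give the background process a moment to start.
sleep 1

# -f matches the pattern against the full command line, not just the process name.
# -x requires the whole command line to equal the pattern; without it, any other
# process whose command line merely contains "sleep 4242" would match as well.
found_pids=$(pgrep -f -x "sleep 4242")
echo "Started PID: $demo_pid"
echo "pgrep found: $found_pids"

# Clean up the demo process.
kill "$demo_pid"
wait "$demo_pid" 2>/dev/null || true
```

The PID printed by pgrep should include the one the shell reported in $! when the process was started.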

Terminating a Misbehaving Process

Now that you've identified the key processes, it's time to deal with the resource_hog.sh that has been slowing down the server. You need to terminate this process to restore normal operation.

Tasks

  • Terminate the resource_hog.sh process.

Requirements

  • You must use a command that can terminate a process based on its name, without needing to find its PID first.
  • Use the pkill command to stop the resource_hog.sh script.

Examples

To verify that the process has been terminated, you can check the process list afterward. Before termination, you might see:

labex     1234 95.0  0.0   2324   564 pts/0    R+   09:15   5:00 bash /home/labex/project/resource_hog.sh

After successful termination, running the same check command should show no matching processes (only the grep command itself):

labex     2345  0.0  0.0   2324   564 pts/0    S+   09:20   0:00 grep resource_hog

Hints

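  • The pkill command sends a termination signal (SIGTERM by default) to processes matched by name. For a script started as bash /home/labex/project/resource_hog.sh the process name is bash, so use pkill -f to match against the full command line instead.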
  • After running the command, you can use ps aux | grep resource_hog to verify that the process is no longer running.
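
If you want to rehearse name-based termination before using it on a real process, a disposable target is safer. The sketch below is one way to do that: sleep 2718 is a hypothetical stand-in for a misbehaving script, and demo_pid and exit_status are names made up for this example.

```shell
# Start a throwaway background process to terminate.
sleep 2718 &
demo_pid=$!

# Give the background process a moment to start.
sleep 1

# Send the default signal (SIGTERM) to every process whose full command line
# is exactly "sleep 2718". -x keeps the match from hitting anything else that
# merely contains the pattern, such as the shell running these commands.
pkill -f -x "sleep 2718"

# For a child killed by a signal, wait returns 128 plus the signal number.
# SIGTERM is signal 15.
exit_status=0
wait "$demo_pid" 2>/dev/null || exit_status=$?
echo "Exit status: $exit_status"

# pgrep exits non-zero when nothing matches, so it works as a check.
if pgrep -f -x "sleep 2718" > /dev/null; then
  echo "still running"
else
  echo "no matching process left"
fi
```

Checking with pgrep instead of ps aux | grep avoids the extra line for the grep command itself.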

Starting and Managing Background Processes

The server is stable again! Excellent work. Just as you're about to take a break, a developer sends you a message. They need you to run a long-running script, data_processor.sh, on the server. You can't keep your terminal session open for hours just for this script. You need to run it in the background so it continues even after you log out.

Tasks

  • Start the data_processor.sh script so that it runs in the background and is immune to hangups (i.e., it won't stop if you close your terminal).

Requirements

  • You must be in the ~/project directory.
  • Use the nohup command to run the script.
  • Use the & operator to send the process to the background.
  • Redirect all output (both standard output and standard error) from the script to a file named processor.log.

Examples

After successfully starting the script in the background, you should see output similar to:

[1] 3456
nohup: ignoring input and appending output to 'processor.log'

The [1] 3456 indicates the job number and process ID. You can verify the script is running by checking the log file:

cat processor.log

This might show output like:

Starting data processing at Mon Sep 11 10:30:00 UTC 2025

And you can confirm the process is still running:

ps aux | grep data_processor

Which should show the background process is active.

Hints

  • The nohup command stands for "no hang up."
  • The & symbol at the end of a command tells the shell to run it as a background job.
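  • You can redirect standard output to a file with > and then send standard error to the same place with 2>&1. The 2>&1 must come after the > redirection, because it copies wherever standard output points at that moment.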
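
The pieces named in these hints can be tried on a short-lived command first. The sketch below is only an illustration: the sh -c one-liner is a hypothetical stand-in for a real script, and workdir, log_file, and job_pid are names made up for this example.

```shell
# Work in a scratch directory so the demo log does not clutter the current one.
workdir=$(mktemp -d)
log_file="$workdir/demo.log"

# The stand-in command writes one line to stdout and one line to stderr.
#   nohup    ignore SIGHUP so the command survives the terminal closing
#   > file   send standard output to the log file
#   2>&1     send standard error to the same place (must come after "> file")
#   &        run the command as a background job
nohup sh -c 'echo "to stdout"; echo "to stderr" >&2' > "$log_file" 2>&1 &
job_pid=$!

# Wait for the short demo job to finish, then show what was captured.
wait "$job_pid"
cat "$log_file"
```

Both lines end up in the same file, which shows that the two redirections together capture all output from the command.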

Summary

Congratulations, Administrator! You have successfully navigated a critical server performance issue and demonstrated your mastery of Linux process management. The server is stable, the critical service is still running, and long-running tasks are humming along in the background.

In this challenge, you have proven your ability to:

  • List and inspect all running processes using ps.
  • Monitor system resources in real time with top.
  • Identify important processes using pgrep.
  • Terminate misbehaving processes cleanly with pkill.
  • Run and manage background jobs that persist after logout using nohup and &.

These are fundamental, high-value skills that are essential for any role in system administration, DevOps, or backend development. You've turned a potential crisis into an opportunity to showcase your expertise. Well done!
