Tune System Performance in RHEL

Introduction

In this lab, you will learn how to optimize RHEL system performance using tuned and manage process priorities with nice and renice. You will begin by verifying tuned installation and listing available profiles, then observe how changing tuned profiles impacts system parameters.

The lab will guide you through starting and monitoring CPU-intensive processes, followed by adjusting their priorities using nice and renice to understand their effect on resource allocation. Finally, you will learn how to clean up running processes, ensuring a complete understanding of performance tuning on RHEL.

Verify tuned Status and List Available Profiles

In this step, you will learn how to verify the status of the tuned daemon and list available tuning profiles on your RHEL system. tuned is a dynamic adaptive system tuning daemon that tunes system settings to optimize performance for specific workloads. It uses tuning profiles to apply a set of system-wide settings.

  1. Verify that the tuned daemon is running.
    In this container environment, we'll check if the tuned daemon is running by looking for its process. We can also verify its functionality by checking if it responds to commands.

    First, check if the tuned process is running:

    pgrep tuned

    If tuned is running, this command will return its Process ID (PID). If no PID is returned, you can start the daemon manually:

    sudo /usr/sbin/tuned &

    Then verify it's running:

    pgrep tuned

    You should see output similar to:

    739

    (Note: The PID value in your output will vary.)

    Additionally, you can verify that tuned is functional by checking if it responds to status queries:

    sudo tuned-adm active

    This should return the currently active profile without errors.

  2. List the available tuning profiles and identify the active profile.
    The tuned-adm list command displays all available tuning profiles and highlights the currently active one.

    sudo tuned-adm list

    You will be prompted for your password. Note the Current active profile in the output.

    Available profiles:
    - accelerator-performance     - Throughput performance based tuning with disabled higher latency STOP states
    - aws                         - Optimize for aws ec2 instances
    - balanced                    - General non-specialized tuned profile
    - balanced-battery            - Balanced profile biased towards power savings changes for battery
    - desktop                     - Optimize for the desktop use-case
    - hpc-compute                 - Optimize for HPC compute workloads
    - intel-sst                   - Configure for Intel Speed Select Base Frequency
    - latency-performance         - Optimize for deterministic performance at the cost of increased power consumption
    - network-latency             - Optimize for deterministic performance at the cost of increased power consumption, focused on low latency network performance
    - network-throughput          - Optimize for streaming network throughput, generally only necessary on older CPUs or 40G+ networks
    - optimize-serial-console     - Optimize for serial console use.
    - powersave                   - Optimize for low power consumption
    - throughput-performance      - Broadly applicable tuning that provides excellent performance across a variety of common server workloads
    - virtual-guest               - Optimize for running inside a virtual guest
    - virtual-host                - Optimize for running KVM guests
    Current active profile: virtual-guest
  3. Review the virtual-guest profile configuration.
    The virtual-guest profile is often the default for virtual machines. You can inspect its configuration file to understand what settings it applies.

    cat /usr/lib/tuned/virtual-guest/tuned.conf

    This command shows the tuned configuration for the virtual-guest profile, including parameters it inherits from other profiles.

    #
    ## tuned configuration
    #
    
    [main]
    summary=Optimize for running inside a virtual guest
    include=throughput-performance
    
    [vm]
    ## If a workload mostly uses anonymous memory and it hits this limit, the entire
    ## working set is buffered for I/O, and any more write buffering would require
    ## swapping, so it's time to throttle writes until I/O can catch up.  Workloads
    ## that mostly use file mappings may be able to use even higher values.
    #
    ## The generator of dirty data starts writeback at this percentage (system default
    ## is 20%)
    dirty_ratio = 30
    
    [sysctl]
    ## Filesystem I/O is usually much more efficient than swapping, so try to keep
    ## swapping low.  It's usually safe to go even lower than this on systems with
    ## server-grade storage.
    vm.swappiness = 30
  4. Verify that the vm.dirty_background_ratio parameter is applied.
    The virtual-guest profile includes throughput-performance. Let's check a parameter that throughput-performance typically sets, such as vm.dirty_background_ratio. This parameter controls when the system starts writing dirty pages to disk in the background.

    sysctl vm.dirty_background_ratio

    The output will show the current value of this kernel parameter.

    vm.dirty_background_ratio = 10
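
    Before moving on, it can help to record several related parameters at once so you can compare them after a profile switch. The loop below is a convenience sketch, not part of the lab's required commands; it reads the values straight from /proc/sys, which is what sysctl does under the hood.

```shell
# Snapshot a few memory-related kernel parameters. Reading /proc/sys
# directly is equivalent to `sysctl <name>` and needs no privileges.
for param in dirty_ratio dirty_background_ratio swappiness; do
    printf 'vm.%-24s %s\n' "$param" "$(cat /proc/sys/vm/$param)"
done
```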

Change tuned Profile and Observe System Parameter Changes

In this step, you will learn how to change the active tuned profile and observe the immediate effects on system parameters. Changing a tuned profile allows you to quickly apply a set of performance optimizations tailored for different workloads, such as throughput-intensive tasks or power saving.

  1. Change the current active tuning profile to throughput-performance.
    The throughput-performance profile is designed for systems that require high throughput, often by sacrificing some latency. It typically optimizes for disk I/O and network performance. Use the tuned-adm profile command to switch profiles.

    sudo tuned-adm profile throughput-performance

    You will be prompted for your password.

    $ sudo tuned-adm profile throughput-performance
    [sudo] password for user:
  2. Confirm the new active profile.
    After changing the profile, it's good practice to verify that the new profile is indeed active. You can do this using tuned-adm active.

    sudo tuned-adm active

    The output should now show throughput-performance as the active profile.

    Current active profile: throughput-performance
  3. Verify the vm.dirty_ratio and vm.swappiness parameters have changed.
    The throughput-performance profile modifies kernel parameters related to memory management, such as vm.dirty_ratio and vm.swappiness. Even though the virtual-guest profile inherits from throughput-performance, switching directly to the throughput-performance profile applies the base values without the virtual-guest specific modifications.

    • vm.dirty_ratio: This parameter defines the maximum percentage of system memory that can be filled with dirty pages (pages that have been modified but not yet written to disk) before the system starts writing them to disk. A higher value can improve throughput by allowing more data to be buffered in memory.
    • vm.swappiness: This parameter controls how aggressively the kernel swaps out anonymous memory (application data) from RAM to swap space. A lower value means the kernel will try to keep more application data in RAM, which is generally better for performance.

    Let's check their current values using sysctl.

    sysctl vm.dirty_ratio
    sysctl vm.swappiness

    You should observe that the values have changed from the virtual-guest profile settings (vm.dirty_ratio = 30, vm.swappiness = 30) to the base throughput-performance profile values:

    vm.dirty_ratio = 40
    vm.swappiness = 10

    (Note: These values reflect the base throughput-performance optimizations without the virtual-guest specific modifications.)
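
    To see exactly which parameters a profile switch changes, you can snapshot the values to a file before and after the switch and diff the two files. The helper function below is a sketch (the function name `snapshot` and the file names are illustrative); run the tuned-adm command between the two snapshots.

```shell
# Record selected kernel parameters to a file so a profile switch
# can be compared with diff afterwards.
snapshot() {
    for p in dirty_ratio dirty_background_ratio swappiness; do
        printf 'vm.%s = %s\n' "$p" "$(cat /proc/sys/vm/$p)"
    done > "$1"
}

snapshot before.txt
# sudo tuned-adm profile throughput-performance   # <-- switch profiles here
snapshot after.txt
diff before.txt after.txt || true   # lines that differ were changed by the profile
```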

Start and Monitor CPU-Intensive Processes on RHEL

In this step, you will learn how to start CPU-intensive processes and monitor their resource usage. This is crucial for understanding how processes consume system resources and how to identify bottlenecks. We will use the sha1sum /dev/zero command, which continuously calculates the SHA1 checksum of an endless stream of zeros, effectively consuming CPU cycles.

Important: This exercise uses commands that perform endless checksums on a device file, intentionally consuming significant CPU resources. You must terminate all exercise processes before leaving this exercise or moving to the next lab.

  1. Determine the number of CPU cores on your system.
    Understanding the number of CPU cores helps you decide how many CPU-intensive processes to run to fully utilize the system. You can find this information in /proc/cpuinfo.

    grep -c '^processor' /proc/cpuinfo

    This command counts the number of lines that start with processor, which corresponds to the number of logical CPU cores (or virtual processors).

    2

    (Note: Your output might show a different number of cores depending on the system's configuration.)
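
    Counting `processor` lines in /proc/cpuinfo is one of several equivalent ways to count logical CPUs. Two common alternatives are shown below for reference; on an unrestricted system all three report the same number, though nproc may report fewer if the process is confined to a CPU subset (for example by a cgroup or taskset).

```shell
# Three ways to count logical CPUs; these normally agree.
grep -c '^processor' /proc/cpuinfo
nproc
getconf _NPROCESSORS_ONLN
```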

  2. Start two sha1sum /dev/zero background processes for each CPU core.
    To simulate a heavily loaded system, we will start multiple instances of sha1sum /dev/zero. The & at the end of the command runs each process in the background, allowing you to continue using the terminal. For example, if you have 2 CPU cores, you would start 4 instances (2 instances per core × 2 cores).

    for i in $(seq 1 $(( $(grep -c '^processor' /proc/cpuinfo) * 2 ))); do sha1sum /dev/zero & done

    This command dynamically calculates the number of processes to start based on your CPU core count.

    [1] 1234
    [2] 1235
    [3] 1236
    [4] 1237

    (Note: PID values in your output will vary from the example.)

  3. Verify that the background jobs are running.
    The jobs command lists all processes currently running in the background from your shell session.

    jobs

    You should see a list of the sha1sum processes you just started.

    [1]   Running                 sha1sum /dev/zero &
    [2]   Running                 sha1sum /dev/zero &
    [3]   Running                 sha1sum /dev/zero &
    [4]-  Running                 sha1sum /dev/zero &
  4. Use the ps and pgrep commands to display the percentage of CPU usage for each sha1sum process.
    The ps command reports a snapshot of the current processes. We will combine it with pgrep to filter for sha1sum processes.

    • ps -o pid,pcpu,nice,comm: This specifies the output format: Process ID (pid), percentage of CPU usage (pcpu), nice value (nice), and command name (comm).
    • $(pgrep sha1sum): This command substitution finds the PIDs of all processes named sha1sum and passes them as arguments to ps.

    ps -o pid,pcpu,nice,comm $(pgrep sha1sum)

    You should see each sha1sum process consuming a significant percentage of CPU.

        PID %CPU  NI COMMAND
       5248 48.8   0 sha1sum
       5249 48.7   0 sha1sum
       5250 48.8   0 sha1sum
       5251 48.8   0 sha1sum

    (Note: The %CPU values might fluctuate but should be high, indicating heavy CPU usage. The NI column shows the nice value.)
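
    As an alternative to filtering with pgrep, you can list the busiest processes system-wide. While the sha1sum instances are running, they should dominate the top of this list.

```shell
# Show the five processes using the most CPU, sorted in descending
# order of %CPU (the --sort option takes a column name; a leading
# '-' means descending).
ps -eo pid,pcpu,nice,comm --sort=-pcpu | head -n 5
```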

  5. Terminate all running sha1sum processes and verify none are left.
    It's crucial to clean up these CPU-intensive processes before proceeding. The pkill command terminates processes based on their name.

    pkill sha1sum

    Now, verify that no sha1sum jobs are running in the background.

    jobs

    The output should be empty, or indicate that all jobs are terminated.

    [1]   Terminated              sha1sum /dev/zero
    [2]   Terminated              sha1sum /dev/zero
    [3]   Terminated              sha1sum /dev/zero
    [4]-  Terminated              sha1sum /dev/zero

    (Note: You might see "Terminated" messages, which is expected as the processes are being stopped.)

Adjust Process Priority with nice and renice on RHEL

In this step, you will learn how to influence the scheduling priority of processes using the nice and renice commands. The nice value (also known as niceness) of a process indicates its priority to the Linux scheduler. A lower nice value (more negative) means higher priority, while a higher nice value (more positive) means lower priority. The range for nice values is typically from -20 (highest priority) to 19 (lowest priority), with 0 being the default.
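The nice value of any process can also be read directly from the kernel. In /proc/<pid>/stat, field 19 is the nice value; the simple parsing below assumes the command name in field 2 contains no spaces or parentheses, which holds for the processes used in this lab.

```shell
# Print the nice value of the current shell, read straight from /proc.
awk '{print "nice:", $19}' /proc/$$/stat
```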

  1. Start multiple instances of sha1sum /dev/zero & and then start one additional instance with a nice level of 10.
    We will start several sha1sum processes to simulate a busy system. Then, we'll start one with a deliberately lower priority (higher nice value) to observe the effect.

    First, start three regular instances (adjust the count to your system if desired, starting at least as many instances as you have logical processors to create contention):

    for i in {1..3}; do sha1sum /dev/zero & done

    Next, start the fourth instance with a nice level of 10. This process will have a lower priority compared to the others.

    nice -n 10 sha1sum /dev/zero &

    You will see output similar to this, indicating the PIDs of the background processes:

    [1] 5443
    [2] 5444
    [3] 5445
    [4] 5446

    (Note: PID values in your output will vary.)

  2. Use ps and pgrep commands to display the PID, percentage of CPU usage, nice value, and executable name for each process.
    Observe the %CPU and NI columns. The instance with the nice value of 10 should display a lower percentage of CPU usage than the other instances, as the scheduler gives it less CPU time.

    ps -o pid,pcpu,nice,comm $(pgrep sha1sum)

    Look for the process with NI value 10. Its %CPU should be noticeably lower than the others.

        PID %CPU  NI COMMAND
       5443 56.8   0 sha1sum
       5444 58.0   0 sha1sum
       5445 56.5   0 sha1sum
       5446  6.7  10 sha1sum

    (Note: The exact %CPU values will vary based on system load and core count, but the process with nice 10 should have a lower share.)
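
    You can also let awk total the CPU share per nice level, which makes the penalty paid by the nice-10 process easy to quantify. This is a convenience one-liner, assuming at least one sha1sum process is running when you execute it.

```shell
# Sum %CPU (field 1) grouped by nice value (field 2) for all sha1sum
# processes; the '=' after each column name suppresses the header line.
ps -o pcpu=,nice= $(pgrep sha1sum) | \
    awk '{ total[$2] += $1 } END { for (n in total) printf "nice %s: %.1f%% total CPU\n", n, total[n] }'
```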

  3. Use the sudo renice command to change the nice level of one of the regular processes to 5.
    The renice command allows you to change the nice value of an already running process. We will demonstrate this by changing one of the regular processes (nice value 0) to a nice value of 5.

    First, identify the PID of one of the sha1sum processes that has a nice value of 0 from the output of the previous ps command. Let's use the first one from the example above (PID 5443).

    sudo renice -n 5 <PID_of_regular_process>

    Replace <PID_of_regular_process> with the actual PID you identified. For example:

    sudo renice -n 5 5443

    You should see output confirming the priority change:

    5443 (process ID) old priority 0, new priority 5
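
    Rather than copying a PID by hand, you can renice every matching process in one pass. This is a convenience sketch; note that sudo is only strictly required when lowering a nice value or changing processes you do not own, since raising the nice value of your own processes needs no privileges.

```shell
# Renice all running sha1sum processes to 5 in a single pass.
for pid in $(pgrep sha1sum); do
    sudo renice -n 5 "$pid"
done
```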
  4. Repeat the ps and pgrep commands to display the CPU percentage and nice level.
    Observe the change in CPU usage for the process whose nice value you modified. The process with nice value 5 should now have slightly lower CPU usage compared to the processes with nice value 0, but higher than the process with nice value 10.

    ps -o pid,pcpu,nice,comm $(pgrep sha1sum)

    You should see the NI value for the modified process is now 5, and its CPU usage reflects its new priority level.

        PID %CPU  NI COMMAND
       5443 55.4   5 sha1sum
       5444 67.2   0 sha1sum
       5445 67.1   0 sha1sum
       5446  7.5  10 sha1sum

    (Note: The exact %CPU values will vary, but you should observe that processes with lower nice values (higher priority) get more CPU time.)

Clean Up Running Processes

In this final step, you will ensure that all background processes started during the lab are properly terminated. This is a critical cleanup step to prevent unintended resource consumption and ensure the lab environment is reset for future use.

  1. Use the pkill command to terminate all running processes with the sha1sum name pattern.
    The pkill command is an efficient way to send a signal (by default, SIGTERM) to processes based on their name. This will stop all sha1sum processes you started in the previous steps.

    pkill sha1sum

    You might see messages indicating that processes have been terminated.

    [3]-  Terminated              sha1sum /dev/zero
    [2]-  Terminated              sha1sum /dev/zero
    [4]+  Terminated              nice -n 10 sha1sum /dev/zero
    [1]+  Terminated              sha1sum /dev/zero
  2. Verify that no sha1sum processes are still running.
    You can use pgrep to check if any sha1sum processes are still active. If pgrep returns no output, it means no such processes are running.

    pgrep sha1sum

    This command should return no output, indicating that all sha1sum processes have been successfully terminated.

    $ pgrep sha1sum
    $
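
    The two commands above can be combined into a single defensive cleanup sequence that signals any leftovers, waits briefly for them to exit, and reports the result.

```shell
# Terminate any remaining sha1sum processes, wait for them to exit,
# then confirm none are left. `|| true` keeps the script going even
# when there is nothing to kill.
pkill sha1sum 2>/dev/null || true
sleep 1
if pgrep sha1sum > /dev/null; then
    echo "warning: sha1sum processes still running"
else
    echo "cleanup complete"
fi
```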

Summary

In this lab, we learned how to manage and utilize tuned for system performance optimization on RHEL. We began by verifying that the tuned daemon was running and listing the available tuning profiles, understanding that tuned dynamically adapts system settings for specific workloads using these profiles. We then inspected the virtual-guest profile configuration and used sysctl to confirm the kernel parameters it applies.

The lab further guided us through changing tuned profiles to observe their impact on system parameters, demonstrating how different profiles can alter system behavior. We also gained practical experience in starting and monitoring CPU-intensive processes, which is crucial for identifying performance bottlenecks. Finally, we learned to adjust process priorities using nice and renice commands to manage resource allocation effectively, and concluded by cleaning up running processes to restore the system to its initial state.