Hashing with SHA-256 in Cryptography

Introduction

Welcome to the world of cryptography! In this lab, you will get hands-on experience with one of the most fundamental concepts in modern security: cryptographic hashing. Specifically, we will be working with the SHA-256 algorithm.

A cryptographic hash function is a mathematical algorithm that takes an input (or 'message') of any size and returns a fixed-size string of bytes. This output is typically called a 'digest' or 'hash'. SHA-256, for example, always produces a 256-bit (32-byte) hash.

These functions have several important properties:

  • Deterministic: The same input will always produce the same output.
  • One-way: It is computationally infeasible to reverse the function and find the original input from its hash.
  • Avalanche Effect: A small change in the input (like changing a single character) will produce a drastically different output hash.
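The fixed-size and deterministic properties can be checked with a few lines of Python. This is a minimal sketch using the standard hashlib module; the helper name sha256_hex is only an illustrative choice, not something the lab provides.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()

short_digest = sha256_hex(b"a")
long_digest = sha256_hex(b"a" * 1_000_000)

# Fixed size: 32 bytes, i.e. 64 hexadecimal characters, whatever the input length.
print(len(short_digest), len(long_digest))

# Deterministic: hashing the same bytes twice gives the same digest.
print(sha256_hex(b"hello") == sha256_hex(b"hello"))
```

The one-way property cannot be shown this directly: there is simply no function in hashlib that turns a digest back into its input.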

Throughout this lab, you will use the openssl command-line tool and a simple Python script to explore these properties and understand how hashing is used in real-world scenarios like verifying file integrity and securing passwords.

Hash Function Properties

In this step, you will use the openssl command-line tool to explore two core properties of hash functions: being deterministic and the avalanche effect. A function is deterministic if the same input always produces the same output. The avalanche effect means a tiny change in the input results in a completely different output hash.

First, let's generate a SHA-256 hash for the string "hello". We will use the echo command to pass the string to openssl.

echo -n "hello" | openssl dgst -sha256

The -n flag in echo is important; it prevents echo from adding a newline character to the end of the string, which would change the resulting hash.

You should see an output like this:

SHA2-256(stdin)= 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
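To see why the -n flag matters, the following sketch hashes the same text with and without a trailing newline. It uses Python's hashlib rather than openssl, on the assumption that Python 3 is available in your environment.

```python
import hashlib

# The five bytes of "hello", as produced by `echo -n "hello"`.
without_newline = hashlib.sha256(b"hello").hexdigest()

# The six bytes that a plain `echo "hello"` would send: the text plus "\n".
with_newline = hashlib.sha256(b"hello\n").hexdigest()

print(without_newline)
print(with_newline)
print(without_newline == with_newline)
```

The first digest should match the hexadecimal value printed by openssl above, while the second is unrelated to it.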

Now, let's run the exact same command again to demonstrate the deterministic property.

echo -n "hello" | openssl dgst -sha256

Notice that the output is identical. This confirms that for the same input, the SHA-256 hash is always the same.

SHA2-256(stdin)= 2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824

Next, let's demonstrate the avalanche effect. We will make a very small change to our input string—changing "hello" to "Hello" (with a capital 'H').

echo -n "Hello" | openssl dgst -sha256

Observe the new hash:

SHA2-256(stdin)= 185f8db32271fe25f561a6fc938b2e264306ec304eda518007d1764826381969

Compare this hash to the one for "hello". Only the capitalization of the first letter changed, which in ASCII is a single bit ('h' is 0x68, 'H' is 0x48), yet the resulting hash is completely different. This is the avalanche effect in action, and it is a critical feature of a secure hash function.
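The avalanche effect can also be measured rather than eyeballed. The sketch below counts how many bits differ between the two inputs and between their digests; differing_bits is a hypothetical helper written for this illustration.

```python
import hashlib

def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

lower, upper = b"hello", b"Hello"
print("input bits changed: ", differing_bits(lower, upper))

digest_lower = hashlib.sha256(lower).digest()
digest_upper = hashlib.sha256(upper).digest()
print("output bits changed:", differing_bits(digest_lower, digest_upper), "of 256")
```

For a well-behaved hash function you would expect roughly half of the 256 output bits to flip, even though only one input bit changed.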

Compute File Hash

In this step, you will compute the SHA-256 hash of a text file. This is a common practice used to verify file integrity. When you download a file from the internet, websites often provide a checksum (a hash) so you can verify that the file was not corrupted during download or tampered with.

The setup script for this lab has already created a file named message.txt in your current directory (~/project). First, let's view its contents using the cat command.

cat message.txt

You will see the following content:

This is a secret message.

Now, let's compute the SHA-256 hash of this file. The syntax is similar to what you used before, but instead of piping input, you provide the filename as an argument to the openssl dgst command.

openssl dgst -sha256 message.txt

The command will process the file and print its SHA-256 hash. The output will look like this:

SHA2-256(message.txt)= 6432f513cfd40d47c8584494c0524468257e50dc1a0422f73becac85189543f8

This hash serves as a digital fingerprint for the current content of message.txt. If anyone changes even a single character in the file, the hash will change completely, as you'll see in a later step.
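The same file hash can be computed in Python. This sketch reads the file in chunks so that large downloads do not need to fit in memory; the function name sha256_of_file and the 64 KiB chunk size are illustrative assumptions, and the demonstration uses a throwaway temporary file rather than the lab's message.txt.

```python
import hashlib
import os
import tempfile

def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in fixed-size chunks and return the hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # read() returns b"" at end of file, which stops the iteration.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Demonstration on a temporary file (not the lab's message.txt).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example file contents\n")
    demo_path = tmp.name

try:
    print(sha256_of_file(demo_path))
finally:
    os.remove(demo_path)
```

Calling sha256_of_file("message.txt") in ~/project should print the same hexadecimal value as the openssl command, since both hash the raw bytes of the file.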

Generate Multiple Hashes

In this step, you'll get more practice generating SHA-256 hashes for different string inputs. This will help reinforce your understanding of how different inputs produce different hashes. We will continue to use the echo -n command piped to openssl to ensure we are only hashing the string itself, without any extra characters.

First, let's generate the hash for the string "labex".

echo -n "labex" | openssl dgst -sha256

The output will be the SHA-256 hash for "labex":

SHA2-256(stdin)= 679e75b679886c5eaf8aaab88ddfc0181e6dae14cff346db8ba398bd7b2e31fe

Next, let's try a different string, "crypto", to see its unique hash.

echo -n "crypto" | openssl dgst -sha256

As expected, this produces a completely different hash:

SHA2-256(stdin)= da2f073e06f78938166f247273729dfe465bf7e46105c13ce7cc651047bf0ca4

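This exercise demonstrates that distinct pieces of data, no matter how small or large, produce distinct hash values in practice. Collisions must exist in principle, because there are infinitely many possible inputs and only 2^256 possible outputs, but no SHA-256 collision has been publicly found. This property is fundamental to how hashes are used for data verification, in blockchain technologies, and in digital signatures.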
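If you would rather hash several strings in one go, a short Python loop does the job. This is only a convenience sketch; the list of inputs is taken from the strings used so far in this lab.

```python
import hashlib

inputs = ["hello", "Hello", "labex", "crypto"]

# Map each string to the hex digest of its UTF-8 bytes.
digests = {text: hashlib.sha256(text.encode("utf-8")).hexdigest() for text in inputs}

for text, digest in digests.items():
    print(f"{text:>6}  {digest}")

# Every digest has the same length, and no two inputs share a digest.
print(len(set(digests.values())) == len(inputs))
```

Each printed digest should match the hexadecimal part of the corresponding openssl output.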

Demonstrate Collision Resistance

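In this step, you will observe the avalanche effect on a file by slightly modifying message.txt and seeing how its hash changes. This is closely related to collision resistance, which means it is computationally infeasible to find two different inputs that produce the same hash. Seeing one modified file produce a different hash does not prove collision resistance, but it shows why a matching hash is strong evidence that a file is unchanged.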
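Collision resistance depends heavily on the length of the digest. As a rough illustration, the sketch below deliberately weakens SHA-256 by keeping only the first 2 bytes (16 bits) of its output, and a collision then turns up after a brute-force search over a few hundred inputs. The helper names and the choice of 2 bytes are assumptions made for this demonstration; nothing here finds a collision in full SHA-256.

```python
import hashlib
from itertools import count

def truncated_sha256(data: bytes, n_bytes: int = 2) -> bytes:
    """Return only the first n_bytes of the SHA-256 digest (a deliberately weak hash)."""
    return hashlib.sha256(data).digest()[:n_bytes]

def find_collision(n_bytes: int = 2):
    """Try the inputs b"0", b"1", b"2", ... until two share a truncated digest."""
    seen = {}
    for i in count():
        message = str(i).encode("ascii")
        tag = truncated_sha256(message, n_bytes)
        if tag in seen:
            return seen[tag], message
        seen[tag] = message

first, second = find_collision()
print(first, second, truncated_sha256(first).hex())
```

With all 32 bytes kept, the same search would need on the order of 2^128 attempts, which is why finding a full SHA-256 collision is considered infeasible.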

First, let's re-calculate the hash of the original message.txt file to have it fresh in our minds.

openssl dgst -sha256 message.txt

You should see the original hash again:

SHA2-256(message.txt)= 6432f513cfd40d47c8584494c0524468257e50dc1a0422f73becac85189543f8

Now, let's make a very small change to the file. We will append a period (.) to the end of the file using the echo command with the >> redirection operator, which appends output to a file. Because echo is used without -n here, it also writes a newline after the period.

echo "." >> message.txt

You can verify the change was made by viewing the file's content again.

cat message.txt

You will see the period on a new line at the end:

This is a secret message.
.

Now, let's re-hash the modified file.

openssl dgst -sha256 message.txt

The new hash will be:

SHA2-256(message.txt)= 4106e1c985a4ee1754fff76b8bffda0c4844679885cb70758f24cd43e771daac

Compare this new hash with the original one. They are completely different. This demonstration shows that even a tiny change to a file (here, an appended period and newline) results in a radically different hash, making accidental corruption or tampering easy to detect.
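In practice, tamper detection means recording a hash once and comparing against it later. The following sketch shows that workflow under some assumptions: it works on a scratch copy in a temporary directory instead of your real message.txt, and the function names file_sha256 and matches_baseline are hypothetical.

```python
import hashlib
import hmac
import os
import tempfile

def file_sha256(path: str) -> str:
    """Return the SHA-256 hex digest of the file at path."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()

def matches_baseline(path: str, baseline_hex: str) -> bool:
    """Return True if the file's current digest equals the recorded baseline."""
    return hmac.compare_digest(file_sha256(path), baseline_hex)

with tempfile.TemporaryDirectory() as workdir:
    # Hypothetical stand-in for message.txt, created in a scratch directory.
    path = os.path.join(workdir, "message_copy.txt")
    with open(path, "wb") as handle:
        handle.write(b"This is a secret message.\n")

    baseline = file_sha256(path)
    print(matches_baseline(path, baseline))  # file untouched since the baseline

    with open(path, "ab") as handle:
        handle.write(b".\n")  # a period plus newline, like echo "." >> file
    print(matches_baseline(path, baseline))  # file modified after the baseline
```

This only detects changes if the recorded baseline itself is trustworthy, which is why download pages publish checksums over a separate, trusted channel where possible.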

Create Password Hash

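In this step, you will move beyond the command line and write a simple Python script to hash a password. Storing passwords in plain text is a major security vulnerability. The standard practice is to store a hash of the password instead. When a user tries to log in, the system hashes the password they entered and compares it to the stored hash. This lab uses plain SHA-256 to keep the idea simple; production systems add a per-user salt and use a deliberately slow password-hashing function such as bcrypt, scrypt, Argon2, or PBKDF2.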

The setup script has already created an empty file named hash_password.py. You will now add code to it using the nano text editor.

Open the file with nano:

nano hash_password.py

Now, copy and paste the following Python code into the nano editor:

import hashlib

# The password we want to hash
password = "mysecretpassword"

# Hash functions in Python work on bytes, not strings.
# We must encode the string into bytes first, using UTF-8.
password_bytes = password.encode('utf-8')

# Create a new SHA-256 hash object.
sha256_hash = hashlib.sha256(password_bytes)

# Get the hexadecimal representation of the hash.
hex_digest = sha256_hash.hexdigest()

print(f"The password is: {password}")
print(f"The SHA-256 hash is: {hex_digest}")

This script does the following:

  1. Imports the hashlib library, which provides various hashing algorithms.
  2. Defines a password string.
  3. Encodes the string into bytes using .encode('utf-8'). This is a crucial step, as hash functions operate on bytes.
  4. Creates a SHA-256 hash object initialized with the password bytes.
  5. Retrieves the final hash in a readable hexadecimal format using .hexdigest().
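As a side note on the hash object, passing the bytes to hashlib.sha256() directly is equivalent to creating an empty hash object and feeding it data with update(), possibly in several pieces. The sketch below illustrates this; splitting the password at byte 8 is an arbitrary choice for the demonstration.

```python
import hashlib

password_bytes = "mysecretpassword".encode("utf-8")

# One-shot: data passed to the constructor, as in the lab script.
one_shot = hashlib.sha256(password_bytes)

# Incremental: start empty, then feed the same bytes in two pieces.
incremental = hashlib.sha256()
incremental.update(password_bytes[:8])
incremental.update(password_bytes[8:])

print(one_shot.hexdigest() == incremental.hexdigest())
print(one_shot.digest_size)  # digest length in bytes
```

The incremental form is what makes it possible to hash data that arrives in chunks, such as a large file or a network stream.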

After pasting the code, save the file and exit nano by pressing Ctrl+X, then Y, and then Enter.

Finally, run your Python script from the terminal:

python3 hash_password.py

The script will execute and print the password and its corresponding SHA-256 hash:

The password is: mysecretpassword
The SHA-256 hash is: 94aefb8be78b2b7c344d11d1ba8a79ef087eceb19150881f69460b8772753263

You have successfully used Python to perform cryptographic hashing, a skill essential for secure application development.
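The login check described at the start of this step can be sketched in a few more lines. This is a simplified illustration that follows the lab's unsalted SHA-256 approach; hash_password and check_login are hypothetical names, and the stored hash would normally come from a database rather than a variable.

```python
import hashlib
import hmac

def hash_password(password: str) -> str:
    """Return the SHA-256 hex digest of a password (unsalted, as in this lab)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

def check_login(attempt: str, stored_hash: str) -> bool:
    """Hash the attempted password and compare it to the stored hash."""
    # compare_digest avoids leaking, through timing, how many characters matched.
    return hmac.compare_digest(hash_password(attempt), stored_hash)

stored = hash_password("mysecretpassword")  # what the system would keep on record
print(check_login("mysecretpassword", stored))
print(check_login("wrongpassword", stored))
```

Note that the system never needs to recover the original password: it only compares hashes.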

Summary

Congratulations on completing this lab! You have gained practical, hands-on experience with the SHA-256 cryptographic hash function.

In this lab, you learned:

  • The core properties of hash functions: they are deterministic, one-way, and exhibit the avalanche effect.
  • How to use the openssl dgst -sha256 command in a Linux environment to compute hashes for both strings and files.
  • The importance of hashing for verifying file integrity and detecting tampering.
  • How to use Python's hashlib library to programmatically generate SHA-256 hashes, a common task in password security.

Hashing is a cornerstone of modern cybersecurity. The skills you've practiced here are fundamental to understanding more advanced topics like digital signatures, message authentication codes (MACs), and blockchain technology. For a next step, you could research "salting" passwords, which adds another layer of security on top of hashing.
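As a starting point for that research, here is a minimal sketch of salted password hashing using PBKDF2 from Python's standard library. The function names, the 16-byte salt, and the iteration count are illustrative assumptions, not recommendations from this lab; check current guidance before choosing parameters for a real system.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative value only; real systems tune this

def derive_key(password: str, salt: bytes) -> bytes:
    """Derive a key from the password and salt using PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)

def create_record(password: str):
    """Generate a fresh random salt and return (salt, derived key) for storage."""
    salt = os.urandom(16)
    return salt, derive_key(password, salt)

def verify(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    return hmac.compare_digest(derive_key(password, salt), expected_key)

salt_a, key_a = create_record("mysecretpassword")
salt_b, key_b = create_record("mysecretpassword")

print(key_a == key_b)  # same password, different salts
print(verify("mysecretpassword", salt_a, key_a))
print(verify("wrongpassword", salt_a, key_a))
```

Because each record gets its own random salt, two users with the same password end up with different stored values, and precomputed tables of common password hashes stop being useful.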