Introduction
In this lab, you will explore the process of cracking password-protected PDF documents using John the Ripper, an open-source password cracking tool. You will learn how to create a password-protected PDF, extract its hash, and then attempt to crack the password. This lab will also cover different PDF encryption types and provide best practices for securing your PDF documents. Understanding these concepts is crucial for both cybersecurity professionals and anyone handling sensitive information in PDF format.
Create a Password-Protected PDF Document
In this step, you will create a simple text file and convert it into a password-protected PDF document. We will use the enscript and ps2pdf commands for this purpose. enscript converts text files to PostScript, and ps2pdf converts PostScript to PDF.
First, create a simple text file named secret.txt in your ~/project directory.
echo "This is a secret document." > ~/project/secret.txt
cat ~/project/secret.txt
Next, convert secret.txt to a PostScript file.
enscript -p ~/project/secret.ps ~/project/secret.txt
ls -l ~/project/secret.ps
Now, convert the PostScript file to a password-protected PDF. We will set both the user password and the owner password to labex123. The user password restricts opening the document, while the owner password restricts permissions like printing or editing.
ps2pdf -sOwnerPassword=labex123 -sUserPassword=labex123 ~/project/secret.ps ~/project/protected.pdf
ls -l ~/project/protected.pdf
You have successfully created a password-protected PDF document.
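As a quick sanity check, the sketch below tests whether a file looks like an encrypted PDF. It relies on the fact that an encrypted PDF references an /Encrypt dictionary from its trailer. This is a byte-level heuristic rather than a real PDF parser, and the function names looks_encrypted and check_file are made up for this illustration.

```python
from pathlib import Path


def looks_encrypted(pdf_bytes: bytes) -> bool:
    """Heuristic check: an encrypted PDF carries an /Encrypt entry in its trailer.

    This scans raw bytes and does not parse the file, so a document that merely
    contains the text "/Encrypt" would be reported as encrypted too.
    """
    return pdf_bytes.startswith(b"%PDF-") and b"/Encrypt" in pdf_bytes


def check_file(path: str) -> bool:
    """Read a file from disk (expanding a leading ~) and apply the heuristic."""
    return looks_encrypted(Path(path).expanduser().read_bytes())


# Synthetic inputs, so the sketch runs without any real PDF on disk.
encrypted_sample = b"%PDF-1.4\ntrailer\n<< /Root 1 0 R /Encrypt 5 0 R >>\n"
plain_sample = b"%PDF-1.4\ntrailer\n<< /Root 1 0 R >>\n"
print(looks_encrypted(encrypted_sample), looks_encrypted(plain_sample))
```

Calling check_file("~/project/protected.pdf") should return True for the file created in this step, assuming ps2pdf applied the passwords.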
Extract Hash from PDF using pdf2john
In this step, you will extract the hash from the password-protected PDF document using pdf2john.py. pdf2john.py is a Python script that comes with John the Ripper and is designed to extract crackable hashes from PDF files.
First, locate the pdf2john.py script. In a source build it lives in the run directory of John the Ripper; packaged installations often place it under /usr/share/john, which is where this lab looks.
find /usr/share/john -name pdf2john.py
Now, use pdf2john.py to extract the hash from ~/project/protected.pdf and save it to a file named pdf_hash.txt.
python3 /usr/share/john/pdf2john.py ~/project/protected.pdf > ~/project/pdf_hash.txt
cat ~/project/pdf_hash.txt
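The output will show a hash string that John the Ripper can attempt to crack. The line typically starts with the file name, followed by the $pdf$ marker and a series of fields separated by asterisks: the encryption algorithm version, the security handler revision, the key length in bits, the permission flags, and the hex-encoded document ID and password-verification entries.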
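To make the structure of that hash line concrete, here is a minimal parsing sketch. It assumes the commonly seen field order V*R*bits*P*EncryptMetadata*IDlen*ID*Ulen*U*Olen*O; other pdf2john versions may emit extra fields, so treat the layout as an assumption and compare it against your own pdf_hash.txt. The sample line uses placeholder hex values and is not a real hash.

```python
from typing import NamedTuple


class PdfHash(NamedTuple):
    label: str              # text before the $pdf$ marker, usually the file name
    version: int            # encryption algorithm version (V)
    revision: int           # security handler revision (R)
    key_bits: int           # key length in bits
    permissions: int        # permission flags (P), a signed integer
    encrypt_metadata: bool
    doc_id: bytes
    u_entry: bytes          # user password verification data (U)
    o_entry: bytes          # owner password verification data (O)


def parse_pdf_hash(line: str) -> PdfHash:
    """Split one pdf2john-style line into its fields (assumed layout, see above)."""
    marker = "$pdf$"
    start = line.find(marker)
    if start < 0:
        raise ValueError("no $pdf$ marker found")
    label = line[:start].rstrip(":")
    # Ignore anything after a further ':' (some versions append extra columns).
    body = line[start + len(marker):].strip().split(":", 1)[0]
    fields = body.split("*")
    if len(fields) < 11:
        raise ValueError("expected at least 11 '*'-separated fields")
    version, revision, key_bits, permissions, enc_meta, id_len = (
        int(value) for value in fields[:6]
    )
    doc_id = bytes.fromhex(fields[6])
    u_len, u_entry = int(fields[7]), bytes.fromhex(fields[8])
    o_len, o_entry = int(fields[9]), bytes.fromhex(fields[10])
    if (len(doc_id), len(u_entry), len(o_entry)) != (id_len, u_len, o_len):
        raise ValueError("declared lengths do not match the hex data")
    return PdfHash(label, version, revision, key_bits, permissions,
                   bool(enc_meta), doc_id, u_entry, o_entry)


# Hypothetical sample with placeholder hex, for illustration only.
sample = ("protected.pdf:$pdf$2*3*128*-4*1*16*" + "ab" * 16
          + "*32*" + "cd" * 32 + "*32*" + "ef" * 32)
parsed = parse_pdf_hash(sample)
print(parsed.label, parsed.revision, parsed.key_bits)
```

A revision of 2 or 3 indicates RC4, which gives an early hint of how quickly password guesses can be tested.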
Crack PDF Hash with John the Ripper
In this step, you will use John the Ripper to crack the extracted PDF hash. We will use a simple wordlist for this demonstration.
First, create a wordlist file named wordlist.txt in your ~/project directory. Include labex123 (the correct password) and some other common passwords.
echo -e "password\n123456\nlabex123\nqwerty" > ~/project/wordlist.txt
cat ~/project/wordlist.txt
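Now, use John the Ripper with wordlist.txt to crack the hash in pdf_hash.txt. The wordlist path is written with $HOME because many shells do not expand ~ when it appears in the middle of an argument such as --wordlist=~/project/wordlist.txt.
john --wordlist="$HOME/project/wordlist.txt" ~/project/pdf_hash.txt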
John the Ripper will process the hash and the wordlist. If it finds a match, it will display the cracked password.
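After cracking, John stores the result in its pot file (typically ~/.john/john.pot for a system-wide install), so running the same attack again reports that no password hashes are left to crack. To view the passwords John has found and stored, use the --show option.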
john --show ~/project/pdf_hash.txt
You should see labex123 as the cracked password for protected.pdf.
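Conceptually, a wordlist attack is a loop that tries each candidate against a verification function until one matches. The sketch below shows only that loop. The verifier is a stand-in based on SHA-256, not the PDF password check that John the Ripper implements, and the names dictionary_attack and sha256_matches are invented for this example.

```python
import hashlib
from typing import Callable, Iterable, Optional


def dictionary_attack(candidates: Iterable[str],
                      verify: Callable[[str], bool]) -> Optional[str]:
    """Return the first candidate accepted by verify, or None if none match."""
    for word in candidates:
        word = word.rstrip("\r\n")  # tolerate lines read straight from a file
        if word and verify(word):
            return word
    return None


# Stand-in target: a SHA-256 digest of the lab password.
# Real PDF hashes use a different, PDF-specific derivation.
target_digest = hashlib.sha256(b"labex123").hexdigest()


def sha256_matches(candidate: str) -> bool:
    """Stand-in verifier used only to demonstrate the attack loop."""
    return hashlib.sha256(candidate.encode("utf-8")).hexdigest() == target_digest


wordlist = ["password", "123456", "labex123", "qwerty"]
found = dictionary_attack(wordlist, sha256_matches)
print(found)
```

The attack succeeds only because the password is in the list, which is why passwords that appear in common wordlists offer little protection.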
Understand PDF Encryption Types
In this step, you will learn about different PDF encryption types and how they affect the security of a PDF document. PDF encryption has evolved over time, with newer versions offering stronger protection.
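PDF encryption typically uses RC4 or AES. Key length matters, but so does the way the key is derived from the password, because that determines how quickly an attacker can test each guess.
- RC4 40-bit: The oldest and weakest method. The 40-bit key space is small enough to search exhaustively, so the document can be decrypted no matter how strong the password is.
- RC4 128-bit: A common older standard. The key is too long to brute-force directly, but each password guess is cheap to test, so weak passwords fall quickly to wordlist attacks.
- AES 128-bit: A stronger cipher than RC4, but the password check works in much the same way, so password strength remains the deciding factor.
- AES 256-bit: The strongest encryption currently available for PDFs. Prefer the later revision standardized in PDF 2.0, which slows down password guessing; the earlier AES 256-bit variant introduced with Acrobat 9 has a known weakness that makes guesses faster to test.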
You can use exiftool to inspect the encryption details of a PDF. Let's check the protected.pdf we created.
exiftool ~/project/protected.pdf | grep "Encryption"
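The output will show the encryption scheme and key length, for example a 128-bit or 256-bit entry; the exact wording depends on your exiftool version and on the tool that created the PDF. Our ps2pdf command typically uses RC4 128-bit by default for compatibility with older viewers. ps2pdf passes its options to Ghostscript, whose -dEncryptionR and -dKeyLength options select the RC4 revision and key length. Stronger encryption such as AES requires specific options or a different PDF creation tool.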
Understanding the encryption type is crucial because it directly impacts the effort required to crack a password: it determines how quickly each guess can be tested and, for the weakest schemes, whether the key itself can be searched exhaustively.
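A rough calculation shows why key length and password length both matter. The sketch below converts a search space into bits and estimates the worst-case time to exhaust it. The guess rate is a hypothetical round number chosen for illustration, not a benchmark of John the Ripper or any hardware.

```python
import math


def keyspace_bits(alphabet_size: int, length: int) -> float:
    """Size of the search space, in bits, for passwords of a fixed length."""
    return length * math.log2(alphabet_size)


def worst_case_seconds(bits: float, guesses_per_second: float) -> float:
    """Time needed to try every value in a search space of the given size."""
    return 2.0 ** bits / guesses_per_second


# Hypothetical rate, assumed for illustration only.
RATE = 1e9  # guesses per second

cases = [
    ("40-bit RC4 key", 40.0),
    ("8 lowercase letters", keyspace_bits(26, 8)),
    ("16 printable ASCII characters", keyspace_bits(95, 16)),
]
for label, bits in cases:
    years = worst_case_seconds(bits, RATE) / (365 * 24 * 3600)
    print(f"{label}: {bits:.1f} bits, about {years:.3g} years at the assumed rate")
```

Under this assumed rate, both the 40-bit key and the short lowercase password are exhausted quickly, while the long random password is far out of reach.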
Secure PDF Documents Properly
In this step, you will learn best practices for securing PDF documents to prevent unauthorized access and cracking. While tools like John the Ripper can crack weak passwords, a long random password combined with modern encryption can make cracking impractical.
Here are key recommendations for securing your PDF documents:
Use Strong Passwords: This is the most critical step. A strong password should be:
- Long (at least 12-16 characters).
- Complex (mix of uppercase, lowercase, numbers, and symbols).
- Unique (not used for any other account or document).
- Random (avoid dictionary words or personal information).
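One way to meet all four criteria is to let the computer pick the password. The following minimal sketch uses Python's secrets module, which draws from the operating system's cryptographically secure random source. The function name generate_password and the default length of 16 are choices made for this example.

```python
import secrets
import string

# Candidate characters: letters, digits, and ASCII punctuation.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 16) -> str:
    """Return a random password containing all four character classes."""
    if length < 4:
        raise ValueError("length must be at least 4 to fit every character class")
    while True:
        password = "".join(secrets.choice(ALPHABET) for _ in range(length))
        # Retry until lowercase, uppercase, digit, and symbol are all present.
        if (any(c.islower() for c in password)
                and any(c.isupper() for c in password)
                and any(c.isdigit() for c in password)
                and any(c in string.punctuation for c in password)):
            return password


print(generate_password())
```

If you pass a generated password on the command line, quote it so that the shell does not interpret the symbols, and store it in a password manager rather than next to the document.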
Use Strong Encryption Algorithms: Always choose the highest available encryption standard, preferably AES 256-bit. When creating PDFs, ensure your software is configured to use modern encryption.
Set Both User and Owner Passwords:
- User Password: Restricts opening the document. This is the primary protection.
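- Owner Password: Restricts permissions like printing, copying content, editing, or adding comments. These restrictions are enforced by the viewing software rather than by the encryption itself, so anyone who can open the document may be able to bypass them with a tool that ignores the permission flags. Use a different value than the user password; if both are identical, as in this lab, opening the document also grants owner access.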
Avoid Storing Passwords with Documents: Never store the password for a PDF in the same location or on the same system as the PDF itself.
Regularly Update Software: Ensure your PDF creation and viewing software is up-to-date to benefit from the latest security patches and encryption standards.
Consider Digital Signatures: For authenticity and integrity, digital signatures can verify the document's origin and ensure it hasn't been tampered with.
By following these practices, you can significantly enhance the security of your PDF documents and protect sensitive information from unauthorized access.
Summary
In this lab, you have gained hands-on experience with cracking password-protected PDF documents using John the Ripper. You learned how to create a password-protected PDF, extract its hash using pdf2john.py, and then successfully crack the password. Furthermore, you explored different PDF encryption types and understood their implications for security. Finally, you reviewed essential best practices for properly securing PDF documents, emphasizing the importance of strong passwords and modern encryption standards. This knowledge is vital for protecting sensitive information and understanding potential vulnerabilities in PDF files.


