Introduction
In this lab, you will learn how to effectively use Gobuster, a popular directory and file brute-forcing tool, to exclude specific HTTP status codes from its scan results. When performing web enumeration, you often encounter many "Not Found" (404) or other irrelevant responses that clutter the output. By filtering these out, you can focus on more meaningful results, making your reconnaissance efforts more efficient and targeted. This lab will guide you through identifying noisy status codes, running a baseline scan, and then applying exclusion filters using Gobuster's -b flag.
Identify Noisy Status Codes to Ignore (e.g., 404)
In this step, you will understand why it's important to identify and exclude certain HTTP status codes during a Gobuster scan. When Gobuster attempts to access non-existent paths, the web server typically responds with a "404 Not Found" status code. These 404 responses can flood your scan output, making it difficult to spot legitimate findings. Other status codes, like 3xx redirects or 5xx server errors, might also be considered "noisy" depending on your specific reconnaissance goals.
To illustrate this, we will first run a simple curl command against our local web server to see how it responds to a non-existent path.
Open your terminal and execute the following command:
curl -I http://localhost:8000/nonexistent_page
You should see output similar to the following, indicating a 404 Not Found status:
HTTP/1.0 404 Not Found
Server: SimpleHTTP/0.6
Date: ...
Content-type: text/html
Content-Length: ...
This 404 Not Found is a common "noisy" response. In the following steps, you will learn how to tell Gobuster to ignore such responses.
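When only the status code matters, the header dump above can be reduced to a single number. The following is a minimal sketch and not part of the original lab: the helper name status_of is hypothetical, and it assumes the first line of its input is an HTTP status line like the one shown above.

```shell
# status_of: read HTTP response headers on stdin and print the status code.
# Hypothetical helper for illustration; it assumes the first line looks like
# "HTTP/1.0 404 Not Found".
status_of() {
  awk 'NR == 1 { print $2 }'
}

# Demonstration on a saved status line (no network access needed).
printf 'HTTP/1.0 404 Not Found\r\nServer: SimpleHTTP/0.6\r\n' | status_of

# Against the lab server, the equivalent would be:
#   curl -sI http://localhost:8000/nonexistent_page | status_of
```

The demonstration line prints 404. Piping curl -sI into the helper should give the same result for any path the server does not serve.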
Run a Scan Without Any Filtering
In this step, you will perform a basic Gobuster scan without any status code filtering. This will serve as a baseline to demonstrate how much noise (e.g., 404 responses) can be generated in the output. You will use the dir mode of Gobuster to enumerate directories and files on our local web server.
Execute the following command in your terminal:
gobuster dir -u http://localhost:8000 -w ~/project/wordlist.txt
Let's break down the command:
gobuster dir: Specifies that we want to use the directory/file enumeration mode.
-u http://localhost:8000: Sets the target URL to our local web server.
-w ~/project/wordlist.txt: Specifies the wordlist file to use for brute-forcing.
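Observe the output. You will likely see many entries with (Status: 404) next to them, indicating that Gobuster tried to access paths that do not exist on the server. If no 404 entries appear, your Gobuster version is probably applying its default blacklist: in Gobuster 3.x, -b defaults to 404, so these responses are hidden unless you override it.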
...
/nonexistent_page (Status: 404)
/admin (Status: 404)
/test (Status: 404)
/existing_dir (Status: 200)
/another_file.txt (Status: 200)
/redirect_me (Status: 302)
/forbidden_area (Status: 403)
/server_error (Status: 500)
...
As you can see, the output is cluttered with 404 responses, making it harder to identify the actual existing resources.
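To put a number on the noise, the results can be tallied per status code. The sketch below works on the sample lines shown above rather than on a live scan. The variable sample_output and the helper count_by_status are hypothetical names, and the parsing assumes each result line contains the text "Status: " followed by the code.

```shell
# Sample lines copied from the output above; real Gobuster output may carry
# extra fields, which this sketch ignores.
sample_output='/nonexistent_page (Status: 404)
/admin (Status: 404)
/test (Status: 404)
/existing_dir (Status: 200)
/another_file.txt (Status: 200)
/redirect_me (Status: 302)
/forbidden_area (Status: 403)
/server_error (Status: 500)'

# count_by_status: read result lines on stdin, print "<code> <count>" per code.
count_by_status() {
  awk 'match($0, /Status: [0-9]+/) {
         code = substr($0, RSTART + 8, RLENGTH - 8)   # skip the "Status: " prefix
         seen[code]++
       }
       END { for (code in seen) print code, seen[code] }' | sort
}

printf '%s\n' "$sample_output" | count_by_status
```

For this sample, 404 accounts for three of the eight lines, which is the share of output the exclusion filter is meant to remove.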
Use the -b Flag to Exclude 404 Not Found
In this step, you will learn how to use Gobuster's -b (or --status-codes-blacklist) flag to exclude specific HTTP status codes from the results. Filtering out irrelevant responses lets you focus on the paths that actually exist. We will specifically exclude the 404 Not Found status code, which is the most common noisy response.
The -b flag takes a comma-separated list of status codes to exclude.
Execute the following command in your terminal to run the Gobuster scan, excluding 404 responses:
gobuster dir -u http://localhost:8000 -w ~/project/wordlist.txt -b 404
Let's look at the new flag:
-b 404: Tells Gobuster to exclude any results that return an HTTP status code of 404.
Observe the output carefully. You should notice that all entries with (Status: 404) are now gone, resulting in a much cleaner and more focused list of results.
...
/existing_dir (Status: 200)
/another_file.txt (Status: 200)
/redirect_me (Status: 302)
/forbidden_area (Status: 403)
/server_error (Status: 500)
...
This significantly improves the readability and usefulness of your scan results.
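The same comma-separated exclusion can be applied after the fact to output you have already saved. This is a rough sketch of that idea, not how Gobuster implements -b internally: exclude_status and sample_results are hypothetical names, and the helper assumes result lines in the "(Status: NNN)" form shown above.

```shell
# exclude_status CODES: drop result lines whose status is in the
# comma-separated list CODES, similar in spirit to what -b does during a scan.
# Hypothetical post-processing helper for illustration.
exclude_status() {
  codes=$(printf '%s' "$1" | tr ',' '|')   # "404,500" becomes "404|500"
  grep -Ev "\(Status: (${codes})\)"
}

# Sample lines taken from the unfiltered output shown earlier.
sample_results='/admin (Status: 404)
/existing_dir (Status: 200)
/redirect_me (Status: 302)
/server_error (Status: 500)'

# Excluding a single code, then two codes at once.
printf '%s\n' "$sample_results" | exclude_status 404
printf '%s\n' "$sample_results" | exclude_status 404,500
```

The first call keeps the 200, 302, and 500 lines. The second keeps only the 200 and 302 lines, showing how a longer list narrows the output further.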
Execute the Scan and Observe the Cleaner Output
In this step, you will re-run the Gobuster scan with the -b 404 flag and pay close attention to the output to confirm that the 404 responses are indeed excluded. This reinforces your understanding of how the exclusion flag works and its impact on the scan results.
Execute the command again to see the filtered output:
gobuster dir -u http://localhost:8000 -w ~/project/wordlist.txt -b 404
As the scan progresses, you will see only the entries that returned status codes other than 404. This demonstrates the effectiveness of the -b flag in reducing noise.
Example of expected cleaner output:
...
/existing_dir (Status: 200)
/another_file.txt (Status: 200)
/redirect_me (Status: 302)
/forbidden_area (Status: 403)
/server_error (Status: 500)
...
Notice how the nonexistent_page, admin, and test entries (which returned 404) are no longer present in the output. This makes it much easier to identify valid resources.
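If both runs are saved to files, the comparison can be done mechanically instead of by eye. The sketch below assumes hypothetical file names baseline.txt and filtered.txt, filled here with the sample lines from this lab. In practice they could be written with Gobuster's -o output flag. The helper removed_by_filter is also a hypothetical name.

```shell
# Hypothetical file names. In the lab they could be produced with Gobuster's
# -o flag (write results to a file); here they hold the sample lines above.
workdir=$(mktemp -d)

cat > "$workdir/baseline.txt" <<'EOF'
/nonexistent_page (Status: 404)
/admin (Status: 404)
/test (Status: 404)
/existing_dir (Status: 200)
/another_file.txt (Status: 200)
/redirect_me (Status: 302)
/forbidden_area (Status: 403)
/server_error (Status: 500)
EOF

# Stand-in for the "-b 404" run: everything except the 404 lines.
grep -v 'Status: 404' "$workdir/baseline.txt" > "$workdir/filtered.txt"

# removed_by_filter BASELINE FILTERED: print lines present only in BASELINE.
removed_by_filter() {
  grep -vxF -f "$2" "$1"
}

removed_by_filter "$workdir/baseline.txt" "$workdir/filtered.txt"

# Remove the temporary directory when finished:
#   rm -rf "$workdir"
```

For the sample data this prints the three 404 lines, confirming that the filter removed nothing else.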
Combine -s and -b for Precise Filtering
In this final step, you will learn how to combine the -s (include status codes) and -b (exclude status codes) flags for even more precise filtering. While -b is great for removing noise, sometimes you only want to see specific types of responses, like successful ones (200 OK) or redirects (3xx).
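The -s flag (--status-codes) takes a comma-separated list of status codes to include in the results. This lab describes the combination as two passes: the -s filter is applied first, and the -b filter is then applied to what remains. Be aware that this behavior is version-dependent: the Gobuster 3.x help text describes -b as overriding -s, and some releases refuse to run when both are set, so check gobuster dir --help if the command below is rejected.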
Let's say you only want to see 200 OK and 302 Found responses, while still explicitly excluding 404 Not Found.
Execute the following command:
gobuster dir -u http://localhost:8000 -w ~/project/wordlist.txt -s 200,302 -b 404
Here's the breakdown:
-s 200,302: Tells Gobuster to only show results with status codes 200 or 302.
-b 404: Tells Gobuster to exclude results with status code 404. (In this specific case, 404 would already be excluded by -s; it is included here to demonstrate the combination.)
Observe the output. You should now only see entries with Status: 200 and Status: 302.
...
/existing_dir (Status: 200)
/another_file.txt (Status: 200)
/redirect_me (Status: 302)
...
This powerful combination allows you to fine-tune your Gobuster scans to retrieve only the most relevant information, significantly improving your efficiency in web reconnaissance.
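The include-then-exclude order described in this step can also be reproduced on saved output, which is useful when your Gobuster version handles the two flags differently. This is a minimal sketch under stated assumptions: filter_status and saved_results are hypothetical names, both arguments must be non-empty comma-separated lists, and result lines are expected in the "(Status: NNN)" form used throughout this lab.

```shell
# filter_status INCLUDE EXCLUDE: keep lines whose status is in INCLUDE, then
# drop any whose status is in EXCLUDE. Both are comma-separated lists.
# Hypothetical post-processing helper; it does not call Gobuster.
filter_status() {
  include=$(printf '%s' "$1" | tr ',' '|')
  exclude=$(printf '%s' "$2" | tr ',' '|')
  grep -E "\(Status: (${include})\)" | grep -Ev "\(Status: (${exclude})\)"
}

# Sample lines taken from the unfiltered output shown earlier.
saved_results='/admin (Status: 404)
/existing_dir (Status: 200)
/another_file.txt (Status: 200)
/redirect_me (Status: 302)
/forbidden_area (Status: 403)
/server_error (Status: 500)'

printf '%s\n' "$saved_results" | filter_status 200,302 404
```

With these arguments the exclude pass has nothing left to remove, mirroring the note above that 404 is already ruled out by the include list.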
Summary
In this lab, you have successfully learned how to exclude specific HTTP status codes from your Gobuster scan results. You started by understanding why filtering is necessary, especially for common "noisy" responses like 404 Not Found. You then performed a baseline scan to observe the unfiltered output. The core of this lab involved using the -b flag to exclude unwanted status codes, leading to a much cleaner and more focused result set. Finally, you explored how to combine the -s (include) and -b (exclude) flags for even more precise control over your Gobuster scans. This skill is invaluable for efficient and targeted web reconnaissance.
