Use Tshark for Network Traffic Analysis

Introduction

In this lab, you will learn how to use tshark, the command-line interface of the powerful network protocol analyzer Wireshark. Mastering tshark allows you to streamline network analysis workflows, automate tasks, and gain deeper insights into network traffic.

This lab will guide you through different command-line options and practical scenarios. It will equip you with the skills to efficiently analyze network captures and troubleshoot network-related issues. The command-line approach has significant advantages over the graphical interface, especially for large capture files or automated analysis.


Understanding and Capturing Network Traffic with Tshark

In this step, we'll begin analyzing network traffic with Tshark. First, you'll learn what Tshark is and why it's a valuable tool for network analysis. Then, we'll identify the network interfaces on your system, because you need to know where to capture traffic from. Finally, we'll capture network traffic with Tshark, the Wireshark command-line interface.

What is tshark?

Tshark is essentially the command-line version of Wireshark. While Wireshark has a graphical interface that's great for visual inspection, Tshark offers the same core functionality without the need for a graphical display. With Tshark, you can capture packets from a network. Packets are like small envelopes that carry data across a network. Tshark also displays detailed information about these packets, such as where they came from, where they're going, and what kind of data they're carrying. You can save the captured data in a file for later analysis. This tool is particularly useful in several scenarios:

Automated network monitoring: You can set up scripts to run Tshark regularly and check for any unusual network activity.
Analyzing large capture files efficiently: Since it's a command-line tool, it can handle large amounts of data more quickly than some graphical alternatives.
Running on servers without a graphical interface: Servers often don't have a graphical display, and Tshark can be run directly from the command line.
Integrating network analysis into scripts: You can use Tshark commands within your own scripts to perform custom network analysis tasks.

Installing tshark

Before we can start using Tshark, we need to make sure it's installed on your system. To install Tshark on a system that uses the apt package manager (like Ubuntu), run the following command in your terminal. The sudo part gives you administrative privileges, apt install is used to install packages, -y automatically answers yes to any prompts, and tshark is the package we want to install.

sudo apt install -y tshark
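After installing, it can help to confirm that the tshark binary is actually on your PATH. The following is a minimal sketch; the helper name have_tool is hypothetical and not part of tshark.

```shell
# Return success if the named command is on PATH.
# have_tool is an illustrative helper, not a tshark feature.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool tshark; then
    # Print only the first line of the version banner.
    tshark --version 2>/dev/null | head -n 1
else
    echo "tshark not found; install it with: sudo apt install -y tshark"
fi
```

Either branch prints a single line, so the check is easy to reuse at the top of a script.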

Identifying Your Network Interfaces

Before you can start capturing network traffic, you need to know which network interface to monitor. A network interface is like a door through which your computer connects to a network. To list all the available network interfaces on your system, run the following command:

tshark -D

This command will display a list of all the network interfaces on your system. The output will look something like this:

1. eth0
2. eth1
3. lo (Loopback)
4. any (Pseudo-device that captures on all interfaces)

In our lab, we'll use the any interface. This is a special pseudo-device that allows us to capture traffic from all the available network interfaces at once.
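If you want to use the interface list in a script, you can strip the numbering and descriptions. The sketch below assumes the output has the "N. name (description)" shape shown in the listing; the function name interface_names and the sample input are illustrative only.

```shell
# Read `tshark -D` style lines on stdin (for example "3. lo (Loopback)")
# and print only the interface names, one per line.
interface_names() {
    sed -E 's/^[0-9]+\. ([^ ]+).*$/\1/'
}

# Sample input modelled on the listing above. On a real system you
# would run: tshark -D | interface_names
printf '%s\n' '1. eth0' '2. eth1' '3. lo (Loopback)' \
    '4. any (Pseudo-device that captures on all interfaces)' | interface_names
```

With the sample input, this prints eth0, eth1, lo, and any on separate lines.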

Capturing Network Traffic

Now that we know which interface to use, let's start capturing some network traffic. The basic syntax for capturing traffic with Tshark is as follows:

tshark -i <interface> -w <output_file>

Here's what each part means:

-i <interface>: This option specifies which network interface you want to capture traffic from. You can replace <interface> with the name of the actual interface, like eth0 or any.
-w <output_file>: This option specifies the location and name of the file where you want to save the captured packets.

First, let's make sure the directory that will hold our capture file exists. The mkdir -p command creates a directory only if it doesn't already exist; here we create /home/labex/project, the location used by every command in this lab.

mkdir -p /home/labex/project

Now, let's start the capture. We'll use the any interface and save the captured packets in a file named capture.pcapng in the /home/labex/project directory.

tshark -i any -w /home/labex/project/capture.pcapng

Once you run this command, you'll see output indicating that Tshark has started capturing packets. The output will look like this:

Capturing on 'any'

To actually capture some meaningful traffic, we need to generate it. Open a new terminal tab and run the following command. The curl command is used to transfer data from a server. Here, we're trying to access the website https://www.example.com.

curl https://www.example.com

After generating the traffic, go back to the terminal where Tshark is running and press Ctrl + C to stop the capture. You'll see a message indicating how many packets were captured. It might look like this:

Capturing on 'any'
164 packets captured
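Stopping a capture with Ctrl + C works interactively, but a script needs the capture to end on its own. tshark's -a duration:N option stops after N seconds and -c N stops after N packets; neither is used elsewhere in this lab, so treat the sketch below as an optional addition. The helper name bounded_capture_cmd and the limits 30 and 500 are hypothetical, and the helper only prints the command (a dry run) rather than running it.

```shell
# Print the tshark command for a capture that stops on its own.
# $1 = interface, $2 = output file, $3 = max seconds, $4 = max packets
bounded_capture_cmd() {
    printf 'tshark -i %s -w %s -a duration:%s -c %s\n' "$1" "$2" "$3" "$4"
}

# Dry run: show the command instead of executing it.
bounded_capture_cmd any /home/labex/project/capture.pcapng 30 500
```

When both limits are given, the capture ends as soon as either one is reached.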

Examining the Captured File

To make sure that the capture file was created successfully, we can use the ls -l command. This command lists the files in a directory and shows detailed information about them. Run the following command to check the capture file:

ls -l /home/labex/project/capture.pcapng

You should see output similar to this:

-rw-r--r-- 1 labex labex 24680 Jan 27 12:34 /home/labex/project/capture.pcapng

Now, let's take a quick look at what we captured. We'll use Tshark again, but this time with the -r option. The -r option is used to read a capture file. We'll pipe the output to the head -10 command, which will show us the first 10 lines of output, one line per packet.

tshark -r /home/labex/project/capture.pcapng | head -10

This command displays a one-line summary of each packet, including the timestamp (when the packet was captured), the source and destination addresses (where the packet came from and where it was going), and the protocol used.

Filtering Network Traffic with Tshark

In this step, we'll explore how to apply filters to network traffic captures. When dealing with network traffic, capture files can be quite large and filled with a vast amount of data. Filtering helps us focus on specific types of packets that we're interested in. This is crucial because it allows us to analyze large capture files more efficiently and identify relevant traffic patterns.

Understanding Display Filters

Tshark uses display filters to select which packets to display or process from a capture file. Think of these filters as a way to tell Tshark which packets you want to look at. They use a specific syntax to define matching criteria based on protocol fields. For example, you can tell Tshark to only show packets that belong to a certain protocol or have a specific IP address. The basic syntax for applying a display filter is:

tshark -r <input_file> -Y "<filter_expression>"

Let's break down the components of this command:

-r <input_file>: This part of the command specifies the capture file that Tshark should read. It's like telling Tshark where to find the network traffic data.
-Y "<filter_expression>": This specifies the display filter that you want to apply. The filter expression is a set of rules that define which packets should be selected.

Common Display Filter Examples

Here are some useful filter expressions that you can use. These examples cover different aspects of packet filtering, such as filtering by protocol, IP address, port, HTTP method, DNS query, and combining multiple filters.

Filter by protocol: tcp, udp, icmp, http, dns. For example, if you use tcp, Tshark will only show packets that use the TCP protocol.
Filter by IP address: ip.addr == 192.168.1.1. This filter will show only packets that have the IP address 192.168.1.1 either as the source or destination.
Filter by port: tcp.port == 80 or tcp.port == 443. These filters will show packets that use TCP ports 80 or 443. Port 80 is commonly used for HTTP traffic, and port 443 is used for HTTPS traffic.
Filter by HTTP method: http.request.method == "GET". This filter will show only HTTP requests that use the GET method.
Filter by DNS query: dns.qry.name contains "example.com". This filter will show DNS packets where the query name contains the string "example.com".
Combining filters: tcp.port == 80 and http.request.method == "POST". This filter combines two conditions. It will show only packets that use TCP port 80 and have an HTTP POST request.
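When a filter mixes and with or, parentheses make the intended grouping explicit. The sketch below builds such an expression from separate conditions; the helper name join_filter is hypothetical, and the conditions are taken from the examples above.

```shell
# Join display-filter conditions with "and", wrapping each one in
# parentheses so that an inner "or" keeps its intended grouping.
join_filter() {
    result=""
    for cond in "$@"; do
        if [ -z "$result" ]; then
            result="($cond)"
        else
            result="$result and ($cond)"
        fi
    done
    printf '%s\n' "$result"
}

join_filter 'ip.addr == 192.168.1.1' 'tcp.port == 80 or tcp.port == 443'
```

The printed expression can be passed to tshark unchanged, for example: tshark -r <input_file> -Y "$(join_filter 'ip.addr == 192.168.1.1' 'tcp.port == 443')".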

Applying Filters to Our Capture

Let's start by filtering for HTTPS traffic (TCP port 443) from our capture file. We'll use the following command:

tshark -r /home/labex/project/capture.pcapng -Y "tcp.port == 443"

When you run this command, Tshark will read the capture file /home/labex/project/capture.pcapng and apply the filter tcp.port == 443. As a result, you should see only packets that use TCP port 443, which is typically used for HTTPS traffic. The output will include details about these packets, such as the source and destination IP addresses, port numbers, and packet flags. Here's an example of what the output might look like:

  1   0.000000 192.168.1.100 → 93.184.216.34 TCP 74 43210 → 443 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 WS=256 SACK_PERM=1
  2   0.023456 93.184.216.34 → 192.168.1.100 TCP 74 443 → 43210 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=128 SACK_PERM=1
  3   0.023789 192.168.1.100 → 93.184.216.34 TCP 66 43210 → 443 [ACK] Seq=1 Ack=1 Win=64240 Len=0
  ...

Let's try another filter to look for DNS traffic. We'll use the following command:

tshark -r /home/labex/project/capture.pcapng -Y "dns"

This command will display only the DNS packets in the capture. The output will show DNS queries and responses, including details such as the query name and the response IP address. Here's an example of what the output might look like:

  8   0.034567 192.168.1.100 → 8.8.8.8 DNS 82 Standard query 0x1234 A example.com
  9   0.056789 8.8.8.8 → 192.168.1.100 DNS 98 Standard query response 0x1234 A example.com A 93.184.216.34

Counting Packets by Type

You can also use filters to count specific types of packets. This can be useful for getting an overview of the traffic in a capture file. For example, to count the number of TCP packets, we can use the following command:

tshark -r /home/labex/project/capture.pcapng -Y "tcp" | wc -l

In this command, tshark reads the capture file and applies the filter tcp. The output of tshark is then piped (|) to the wc -l command, which counts the number of lines in the output. Since each line represents a packet, this gives us the number of TCP packets in the capture file.
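The same counting idea can be wrapped in a small function and repeated for several filters. This is a sketch only: the function name count_packets and the choice of protocols are assumptions, and the loop runs only if tshark and the capture file are present.

```shell
# Count packets in a capture that match a display filter.
# $1 = capture file, $2 = display filter
count_packets() {
    tshark -r "$1" -Y "$2" 2>/dev/null | wc -l | tr -d ' '
}

capture=/home/labex/project/capture.pcapng
if command -v tshark >/dev/null 2>&1 && [ -r "$capture" ]; then
    # One line per protocol: name, a tab, then the packet count.
    for proto in tcp udp dns; do
        printf '%s\t%s\n' "$proto" "$(count_packets "$capture" "$proto")"
    done
fi
```

The tr call removes the padding that some versions of wc add around the number, which keeps the output easy to compare in scripts.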

Let's count the number of HTTPS packets and save the result to a file. We'll use the following command:

tshark -r /home/labex/project/capture.pcapng -Y "tcp.port == 443" | wc -l > /home/labex/project/filtered_packet_count.txt

This command is similar to the previous one, but instead of just displaying the count, we redirect (>) the output to a file named filtered_packet_count.txt. You can view the result with the following command:

cat /home/labex/project/filtered_packet_count.txt

The output is a single number: the count of packets that match the filter.

Extracting Specific Fields

Tshark can extract specific fields from packets using the -T fields and -e options. This is useful when you're only interested in certain information from the packets, such as the host, method, and URI of an HTTP request. Here's an example command:

tshark -r /home/labex/project/capture.pcapng -Y "http" -T fields -e http.host -e http.request.method -e http.request.uri

In this command, tshark reads the capture file, applies the filter http to select only HTTP packets, and then uses the -T fields option to specify that we want to extract fields. The -e option is used to specify which fields to extract. In this case, we're extracting the http.host, http.request.method, and http.request.uri fields. The output might look like this:

example.com	GET	/index.html
example.com	GET	/images/logo.png
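Because -T fields separates values with tabs by default, the output is easy to summarize with standard tools. The sketch below counts requests per host; the function name requests_per_host is hypothetical, and the third sample row (www.example.com) is invented for illustration.

```shell
# Read tab-separated "host method uri" lines on stdin and print a
# request count per host, busiest host first.
requests_per_host() {
    awk -F '\t' '$1 != "" { count[$1]++ } END { for (h in count) print count[h], h }' |
        sort -k1,1nr -k2,2
}

# Sample rows in the shape of the output above; the last row is made up.
printf 'example.com\tGET\t/index.html\nexample.com\tGET\t/images/logo.png\nwww.example.com\tGET\t/\n' |
    requests_per_host
```

On a real capture you would pipe the tshark command above into requests_per_host instead of using printf.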

Analyzing and Exporting Network Traffic with Tshark

In this step, we'll focus on how to export network traffic in different formats and perform basic traffic analysis using tshark. These skills are crucial because they allow you to share the captured data with your colleagues or use it in other tools. By the end of this section, you'll be able to handle different file formats and extract valuable information from network traffic.

Understanding Capture File Formats

Wireshark and tshark can write captured traffic in several formats. Only pcapng and pcap are capture file formats that you select with the -F option; the text-based formats are produced with the -T option and are covered later in this step. Each format has characteristics that determine how the data can be used later.

pcapng: This is the default format used by Wireshark. It supports multiple interfaces and has advanced features. It's a great choice when you need to capture complex network scenarios.
pcap: The classic format. It's compatible with older tools, but it has fewer features compared to pcapng. If you need to work with legacy systems, this format might be your go-to.
csv: Comma-separated values, produced with -T fields and -E separator=,. This format is very useful when you want to import the data into spreadsheets for further analysis.
json: JavaScript Object Notation, produced with -T json. It's ideal for programmatic analysis, as it can be easily parsed by programming languages.
text: A plain text format that is human-readable, and tshark's default output. It's useful when you want to quickly view the data without any special tools.

Exporting to Different File Formats

To change the format of a capture file, you can use the -F option in tshark. The general command structure is as follows:

tshark -r <input_file> -F <format> -w <output_file>

Here, -r specifies the input file, -F sets the output format, and -w defines the output file.

Let's take an example and export our capture to the pcap format:

tshark -r /home/labex/project/capture.pcapng -F pcap -w /home/labex/project/export.pcap

When this command runs successfully, you won't see any output on the screen. To confirm that the export was successful, you can use the ls command to list the details of the exported file:

ls -l /home/labex/project/export.pcap

You should see output similar to this:

-rw-r--r-- 1 labex labex 22468 Jan 27 12:45 /home/labex/project/export.pcap
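The ls output confirms that the file exists, but not which format it is in. One way to check is to look at the first four bytes, which differ between pcapng and classic pcap. The following is a sketch under that assumption; the function name capture_format is hypothetical.

```shell
# Print "pcapng", "pcap", or "unknown" based on the first four bytes of a file.
capture_format() {
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    case "$magic" in
        0a0d0d0a) echo pcapng ;;          # pcapng Section Header Block type
        d4c3b2a1|a1b2c3d4) echo pcap ;;   # classic pcap, microsecond timestamps
        4d3cb2a1|a1b23c4d) echo pcap ;;   # classic pcap, nanosecond timestamps
        *) echo unknown ;;
    esac
}

# Report the format of each lab file that exists.
for f in /home/labex/project/capture.pcapng /home/labex/project/export.pcap; do
    if [ -r "$f" ]; then
        printf '%s: %s\n' "$f" "$(capture_format "$f")"
    fi
done
```

If the export worked, the original file should be reported as pcapng and the exported file as pcap.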

Analyzing Protocol Statistics

Tshark is not only useful for exporting files but also for generating various statistics about the captured traffic. Let's explore some of these statistical analysis options.

Protocol Hierarchy Statistics

If you want to see how different protocols are distributed in your capture, you can use the following command:

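tshark -r /home/labex/project/capture.pcapng -q -z io,phs

The -z option specifies the statistics type. In this case, io,phs stands for protocol hierarchy statistics. The -q option suppresses the per-packet output so that only the statistics are printed. The output shows the hierarchy of protocols together with the number of frames and bytes seen for each one. The simplified listing below shows only the hierarchy: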

Protocol Hierarchy Statistics
|
+ Ethernet
  + Internet Protocol Version 4
    + Transmission Control Protocol
      + Transport Layer Security
        + Hypertext Transfer Protocol Secure
    + User Datagram Protocol
      + Domain Name System

Conversation Statistics

To analyze the conversations between endpoints in the network, you can use the following command:

tshark -r /home/labex/project/capture.pcapng -q -z conv,tcp

This command focuses on TCP conversations. The -q option hides the per-packet output so that only the table is printed. The table shows statistics such as the endpoints involved, the number of packets exchanged, and the total bytes transferred.

TCP Conversations
                                               |       <-      | |       ->      | |     Total     |    Relative    |   Duration   |
                                               | Frames  Bytes | | Frames  Bytes | | Frames  Bytes |      Start     |              |
192.168.1.100:43210 <-> 93.184.216.34:443          24   18765      18    4532      42    23297       0.000000000        8.2345

HTTP Request Statistics

If your capture contains HTTP traffic, you can analyze the HTTP requests using the following command:

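tshark -r /home/labex/project/capture.pcapng -q -z http_req,tree

The http_req,tree statistic organizes the HTTP requests by host and URI and shows the number of requests for each; -q hides the per-packet output. The related http,tree statistic counts packets by request method and response status code instead.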

HTTP/Requests:
 /index.html                                    1 requests
 /images/logo.png                               2 requests
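The -z option can be given more than once, so several of the statistics above can be produced in a single pass over the file. The sketch below combines two of them; the function name capture_report is hypothetical, and -q is added to suppress the per-packet lines.

```shell
# Print the protocol hierarchy and TCP conversation tables for one
# capture in a single pass. -q suppresses the per-packet lines.
capture_report() {
    tshark -r "$1" -q -z io,phs -z conv,tcp
}

capture=/home/labex/project/capture.pcapng
if command -v tshark >/dev/null 2>&1 && [ -r "$capture" ]; then
    capture_report "$capture"
fi
```

Reading the file once is noticeably cheaper than running tshark separately for each table when the capture is large.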

Exporting to Different Text Formats

Apart from binary formats, tshark can also export data to text formats, which are often easier to analyze.

Exporting to CSV

To export specific fields from the capture to a CSV file, you can use the following command:

tshark -r /home/labex/project/capture.pcapng -T fields -e frame.number -e ip.src -e ip.dst -e tcp.srcport -e tcp.dstport -E header=y -E separator=, > /home/labex/project/tcp_summary.csv

Here, -T fields specifies that we want to export specific fields. The -e option is used to define the fields we want to export, such as frame number, source and destination IPs, and source and destination TCP ports. -E header=y adds a header to the CSV file, and -E separator=, sets the separator as a comma.

Examining the CSV Export

After exporting the data to a CSV file, you can quickly view the first few lines of the file using the head command:

head -5 /home/labex/project/tcp_summary.csv

The output might look like this:

frame.number,ip.src,ip.dst,tcp.srcport,tcp.dstport
1,192.168.1.100,93.184.216.34,43210,443
2,93.184.216.34,192.168.1.100,443,43210
3,192.168.1.100,93.184.216.34,43210,443
...
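Once the data is in CSV form, ordinary text tools can answer simple questions about it. As an illustration, the sketch below counts packets per source address; the function name top_sources is hypothetical, and the sample rows are the ones shown above.

```shell
# Read the CSV on stdin, skip the header row, and count packets per
# source address (column 2), most active first.
top_sources() {
    awk -F ',' 'NR > 1 && $2 != "" { n[$2]++ } END { for (ip in n) print n[ip], ip }' |
        sort -k1,1nr -k2,2
}

printf '%s\n' \
    'frame.number,ip.src,ip.dst,tcp.srcport,tcp.dstport' \
    '1,192.168.1.100,93.184.216.34,43210,443' \
    '2,93.184.216.34,192.168.1.100,443,43210' \
    '3,192.168.1.100,93.184.216.34,43210,443' | top_sources
```

With the sample rows, 192.168.1.100 is listed first with a count of 2. For the real file, run: top_sources < /home/labex/project/tcp_summary.csv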

Advanced Tshark Techniques and Piping

In this step, you'll learn advanced tshark techniques. These techniques are crucial for network analysis as they allow you to perform complex operations on network traffic data. Specifically, you'll learn how to read network traffic from standard input (stdin) and how to combine tshark with other command-line tools using pipes. Mastering these skills will enable you to create powerful network analysis workflows, which can save you time and effort when dealing with large amounts of network data.

Understanding Linux Pipes and Standard Input

In the Linux operating system, pipes (|) are a very useful feature. They act as a bridge between two commands, allowing you to send the output of one command as input to another command. This way, you can chain multiple commands together to perform more complex tasks. Standard input (stdin) is a data stream that a program reads for input. When you use the - symbol with many command-line tools, it's a signal to the tool that the input should come from stdin instead of a file. This gives you more flexibility in how you process data.
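The - convention is easiest to see with a simple tool before applying it to tshark. This small sketch uses cat and a temporary file; nothing in it is specific to tshark.

```shell
# Both cat commands print the same two lines: the first reads a named
# file, the second is given "-" and therefore reads from stdin.
tmp=$(mktemp)
printf 'first\nsecond\n' > "$tmp"
cat "$tmp"
cat "$tmp" | cat -
rm -f "$tmp"
```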

Reading Network Traffic from Standard Input

Tshark can read capture data from standard input using the -r - option. This feature is extremely useful when you want to process data from another command or filter a capture on the fly. Instead of reading directly from a file, you pipe data into tshark.

The basic syntax for reading network traffic from standard input is:

cat <input_file> | tshark -r -

Let's try this with our capture file. The following command reads the capture file and displays all packets, similar to running tshark -r capture.pcapng.

cat /home/labex/project/capture.pcapng | tshark -r -

The output will show all packets in the capture, like this:

  1   0.000000 192.168.1.100 → 93.184.216.34 TCP 74 43210 → 443 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 WS=256 SACK_PERM=1
  2   0.023456 93.184.216.34 → 192.168.1.100 TCP 74 443 → 43210 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=128 SACK_PERM=1
  ...

Filtering on Standard Input

You can also apply filters when reading from stdin. This allows you to focus on specific types of network traffic.

cat /home/labex/project/capture.pcapng | tshark -r - -Y "tcp.port == 80"

This command will display only HTTP traffic (TCP port 80) from the capture. By using the filter, you can quickly isolate the data you're interested in.

Creating a Pipeline for Network Analysis

Let's create a more complex pipeline that performs multiple operations on the network capture data. This pipeline will:

Read the capture file
Filter for DNS traffic
Extract only the DNS query names
Sort them alphabetically
Remove duplicates
Save the result to a file

cat /home/labex/project/capture.pcapng | tshark -r - -Y "dns" -T fields -e dns.qry.name | sort | uniq > /home/labex/project/dns_queries.txt

Let's examine the result by running the following command:

cat /home/labex/project/dns_queries.txt

The output will show a sorted list of unique DNS query names from your capture, like this:

example.com
www.example.com
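A small variation on this pipeline keeps the duplicates long enough to count them, which shows how often each name was looked up. The sketch below is illustrative: the function name query_frequency is hypothetical, and the sample input is made up.

```shell
# Read one DNS query name per line on stdin and print how often each
# occurs, most frequent first. Blank lines are dropped.
query_frequency() {
    grep -v '^$' | sort | uniq -c | sort -k1,1nr -k2,2 | awk '{ print $1, $2 }'
}

printf 'www.example.com\nexample.com\n\nwww.example.com\n' | query_frequency
```

To use it on the capture, replace the sort | uniq stage of the pipeline above with query_frequency.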

Combining Tshark with Other Tools

Tshark can be combined with other command-line tools for more powerful analysis.

Counting Packet Types with grep

cat /home/labex/project/capture.pcapng | tshark -r - | grep TCP | wc -l > /home/labex/project/tcp_count.txt

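This pipeline approximates the number of TCP packets in the capture. grep keeps every summary line that contains the string TCP, and wc -l counts those lines. Because the match is textual, packets whose protocol column shows a higher-level protocol such as TLSv1.2 are typically not counted; the display filter -Y "tcp" shown earlier gives an exact count.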

Extracting HTTP User Agents with sed

cat /home/labex/project/capture.pcapng | tshark -r - -Y "http.user_agent" -T fields -e http.user_agent | sed 's/,/\n/g' > /home/labex/project/user_agents.txt

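This extracts all HTTP user agent strings. When a packet contains a field more than once, tshark joins the values with commas, so sed puts each value on its own line. Note that a comma inside a user agent string is split in the same way.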

Saving Output from Stdin to a File

Let's save the complete output from a stdin tshark analysis to a file. This way, you can review the data later.

cat /home/labex/project/capture.pcapng | tshark -r - > /home/labex/project/stdin_output.txt

Let's verify the content by running the following command:

head -5 /home/labex/project/stdin_output.txt

This should show the first 5 lines of the analysis, similar to:

  1   0.000000 192.168.1.100 → 93.184.216.34 TCP 74 43210 → 443 [SYN] Seq=0 Win=64240 Len=0 MSS=1460 WS=256 SACK_PERM=1
  2   0.023456 93.184.216.34 → 192.168.1.100 TCP 74 443 → 43210 [SYN, ACK] Seq=0 Ack=1 Win=65535 Len=0 MSS=1460 WS=128 SACK_PERM=1
  3   0.023789 192.168.1.100 → 93.184.216.34 TCP 66 43210 → 443 [ACK] Seq=1 Ack=1 Win=64240 Len=0
  4   0.024012 192.168.1.100 → 93.184.216.34 TLSv1.2 192 Client Hello
  5   0.045678 93.184.216.34 → 192.168.1.100 TLSv1.2 1023 Server Hello, Certificate, Server Key Exchange, Server Hello Done

Summary

In this lab, you have learned how to effectively use the Wireshark command-line interface (tshark) for network traffic analysis. First, you grasped the basic concepts of tshark and learned to capture network traffic from network interfaces. Then, you explored applying filters to focus on specific traffic types, which is crucial when dealing with large capture files.

You also learned to export network traffic in different formats for sharing or further analysis. Moreover, you explored tshark's statistical analysis capabilities to understand network traffic composition. Finally, you advanced to more complex techniques, such as reading traffic from standard input and creating analysis pipelines by combining tshark with other command-line tools. These skills offer advantages over the graphical Wireshark interface in scenarios like handling large files, performing automated analysis, analyzing on remote servers, and creating repeatable workflows. By mastering these techniques, you have enhanced your network troubleshooting and security analysis capabilities for more efficient work in various networking contexts.