Checking Disk Usage in HDFS
To check disk usage in HDFS, use the hdfs dfsadmin command for cluster-wide figures and the hdfs dfs -du command for individual paths. The hdfs dfsadmin command provides several options for retrieving information about the HDFS cluster, including disk usage.
Checking Total Disk Usage
To get the total disk usage of the HDFS cluster, you can use the following command:
hdfs dfsadmin -report
This command will display the following information:
- Total capacity of the cluster
- Total used space
- Total remaining space
- Number of live and dead DataNodes
- Decommissioned and decommissioning DataNodes
Here's an example output:
Configured Capacity: 1000000000 (1.00 GB)
Present Capacity: 900000000 (900.00 MB)
DFS Remaining: 800000000 (800.00 MB)
DFS Used: 100000000 (100.00 MB)
DFS Used%: 11.11%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0
Pending deletion blocks: 0
-------------------------------------------------
Live datanodes (1):
Name: 127.0.0.1:50010 (localhost)
Hostname: localhost
Decommission Status : Normal
Configured Capacity: 1000000000 (1.00 GB)
DFS Used: 100000000 (100.00 MB)
Non DFS Used: 100000000 (100.00 MB)
DFS Remaining: 800000000 (800.00 MB)
DFS Used%: 10.00%
DFS Remaining%: 80.00%
Last contact: Fri Apr 14 16:04:12 UTC 2023
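The report starts with a cluster-wide summary (capacity, used space, remaining space, and block health), followed by one section per DataNode with that node's capacity figures and status. In the summary, DFS Used% is DFS Used divided by Present Capacity, which is why 100 MB used out of 900 MB appears as 11.11%.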
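To make the relationship between the summary figures concrete, here is a minimal sketch that recomputes the cluster-wide usage percentage from report text. The function name cluster_used_pct is hypothetical, and the parsing assumes the line layout of the example output above, which may differ between Hadoop versions.

```shell
# Hypothetical helper: recompute the cluster-wide "DFS Used%" from report
# text read on stdin. Assumes the line layout of the example report; only the
# first (cluster-wide) occurrence of each field is used, so the per-DataNode
# sections that follow are ignored.
cluster_used_pct() {
  awk -F': *' '
    $1 == "Present Capacity" && present == "" { split($2, a, " "); present = a[1] }
    $1 == "DFS Used"         && used == ""    { split($2, a, " "); used = a[1] }
    END {
      # Without a usable Present Capacity there is nothing to divide by.
      if (present == "" || present == 0) { print "unknown"; exit 1 }
      printf "%.2f\n", used * 100 / present
    }
  '
}
```

It would typically be fed from the real command, for example: hdfs dfsadmin -report | cluster_used_pct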
Checking Disk Usage for a Specific Path
To check the disk usage for a specific path in HDFS, you can use the following command:
hdfs dfs -du -h <path>
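Replace <path> with the HDFS path you want to check. The -h option prints sizes in human-readable units (such as megabytes or gigabytes) instead of raw bytes. For a directory, the command lists the usage of each entry directly under it; add the -s option to print a single aggregate total for the path instead.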
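When a directory has many entries, it often helps to see only the largest ones. The following is a rough sketch under stated assumptions: the function name largest_entries is hypothetical, the input is the output of hdfs dfs -du run without -h (so the first column is a raw byte count that sorts numerically), and the path is the last column. Paths containing spaces are not handled.

```shell
# Hypothetical helper: print the N largest entries (default 5) from
# `hdfs dfs -du` output read on stdin, as "<bytes><TAB><path>".
# Assumes raw byte counts in the first column and the path in the last column.
largest_entries() {
  sort -k1,1nr | head -n "${1:-5}" | awk '{ printf "%s\t%s\n", $1, $NF }'
}
```

For example: hdfs dfs -du <path> | largest_entries 10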
Used together, hdfs dfsadmin -report shows how full the cluster is overall, and hdfs dfs -du shows which paths account for that usage, so you can monitor and manage disk usage in your HDFS cluster.
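As one possible way to turn the report into a simple monitoring check, the sketch below reads the first DFS Used% line from report text and signals when it exceeds a limit. The function name check_dfs_usage and the default limit of 80 are illustrative assumptions, not part of HDFS.

```shell
# Hypothetical helper: read report text on stdin and compare the first
# (cluster-wide) "DFS Used%" value against a limit given as $1 (default 80).
# Exit status: 0 = within limit, 1 = over limit, 2 = field not found.
check_dfs_usage() {
  awk -v limit="${1:-80}" -F': *' '
    $1 == "DFS Used%" { sub(/%/, "", $2); pct = $2 + 0; found = 1; exit }
    END {
      if (!found) { print "DFS Used% not found"; exit 2 }
      if (pct > limit + 0) {
        printf "WARN: %.2f%% used (limit %s%%)\n", pct, limit
        exit 1
      }
      printf "OK: %.2f%% used\n", pct
    }
  '
}
```

A scheduled job could run it as: hdfs dfsadmin -report | check_dfs_usage 85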