Introduction
In this lab, you will learn how to manage the limited storage capacity of a camel caravan crossing the Arabian Desert, using HDFS space quotas and name quotas as your planning tools. Your task is to plan and allocate cargo space carefully so that precious cargo, including spices and fine silks, is transported safely and intact.
Explore Current Storage Limits
In this step, you will explore the current storage usage of the Hadoop Distributed File System (HDFS) and familiarize yourself with the existing directories and files.
- Switch to the hadoop user with the su - hadoop command. Then use the following commands to create a directory and an empty file:
hdfs dfs -mkdir -p /cargo_space/fine_silks
hdfs dfs -touchz /cargo_space/spices.txt
- Use the following command to view the directory and file you created:
hdfs dfs -ls -R /cargo_space
This command recursively lists the contents of the /cargo_space directory.
- Query the statistics for the directory, including its quota information:
hdfs dfs -count -q /cargo_space
Here is an explanation of the command:
- hdfs: the command-line tool for the Hadoop Distributed File System.
- dfs: the set of subcommands that operate on the file system.
- -count: counts the directories, files, and bytes under the specified path.
- -q: also displays quota information, namely the name quota and space quota set on the directory and how much of each remains.
- /cargo_space: the directory the command operates on.
You will see the following results:
none inf none inf 2 1 0 /cargo_space
The fields, from left to right, mean the following:
- none: no name quota (limit on the number of files and directories) is set.
- inf: the remaining name quota is unlimited.
- none: no space quota is set.
- inf: the remaining space quota is unlimited.
- 2: the directory count, which includes /cargo_space itself and /cargo_space/fine_silks.
- 1: the file count, here spices.txt.
- 0: the content size in bytes; spices.txt is empty.
- /cargo_space: the path that was queried.
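The field order described above can be made easier to read with a small wrapper. The following is a minimal sketch, assuming the eight-column layout shown in this step; describe_count is a hypothetical helper name, not part of Hadoop, and it does not handle paths that contain spaces.

```shell
#!/usr/bin/env bash
# describe_count is a hypothetical helper, not a Hadoop command.
# It labels each field of one `hdfs dfs -count -q` output line read from stdin.
describe_count() {
  awk '{
    printf "name_quota=%s\n", $1
    printf "remaining_name_quota=%s\n", $2
    printf "space_quota=%s\n", $3
    printf "remaining_space_quota=%s\n", $4
    printf "dir_count=%s\n", $5
    printf "file_count=%s\n", $6
    printf "content_size=%s\n", $7
    printf "path=%s\n", $8
  }'
}

# Sample line copied from the output shown in this step.
echo "none inf none inf 2 1 0 /cargo_space" | describe_count
```

On a live cluster, the same helper could be fed directly with hdfs dfs -count -q /cargo_space | describe_count.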
Set a Space Quota for a Directory
In this step, you will learn how to set a space quota for a directory in HDFS, which limits the total disk space used by that directory and its subdirectories.
- Set a quota of 1 GB (1073741824 bytes) for the /cargo_space directory by running:
hdfs dfsadmin -setSpaceQuota 1073741824 /cargo_space
This command sets a disk space quota of 1 GB for the /cargo_space directory and its subdirectories.
- Query the directory's statistics again to confirm that the space quota is in place:
hdfs dfs -count -q /cargo_space
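The byte value passed to -setSpaceQuota is simple arithmetic, and it helps to remember that HDFS charges every replica of a block against the space quota. The sketch below shows both calculations. The helper names gib_to_bytes and usable_bytes are hypothetical, and the replication factor of 3 is an assumption; check the dfs.replication setting on your own cluster.

```shell
#!/usr/bin/env bash
# gib_to_bytes and usable_bytes are hypothetical helper names.

# Convert a whole number of gigabytes (1024^3 bytes each, as used in this lab) to bytes.
gib_to_bytes() {
  echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Approximate bytes of file data that fit under a space quota,
# given that each replica counts against the quota.
# $1 = quota in bytes, $2 = replication factor
usable_bytes() {
  echo $(( $1 / $2 ))
}

quota=$(gib_to_bytes 1)
echo "quota bytes: $quota"
# Replication factor 3 is an assumption, not a value taken from this lab.
echo "usable bytes at replication 3: $(usable_bytes "$quota" 3)"
```

Under that assumption, a 1 GB quota holds roughly a third of a gigabyte of actual file data.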
Set a Name Quota for a Directory
In addition to setting a disk space quota, HDFS also allows you to set a quota for the maximum number of files and directories within a directory. In this step, you will learn how to set this name quota.
- Set a quota of 10 files/directories for the /cargo_space directory by running:
hdfs dfsadmin -setQuota 10 /cargo_space
This command sets a name quota of 10 files and directories for the /cargo_space directory and its subdirectories.
- To verify the quota, run the following command:
hdfs dfs -count -q /cargo_space
This command will display the current number of files and directories, as well as the quota limit for the specified directory.
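A name quota counts the directory itself as well as everything beneath it, so the names in use equal the directory count plus the file count. The sketch below checks whether a planned number of new entries would still fit. The helper names names_used and fits_name_quota are hypothetical, and the sample line is constructed from the values in this lab (quota 10, two directories, one file) rather than captured from a cluster.

```shell
#!/usr/bin/env bash
# names_used and fits_name_quota are hypothetical helper names.
# Both take one `hdfs dfs -count -q` output line as their first argument.

# Names charged to the quota: directory count (field 5) plus file count (field 6).
names_used() {
  printf '%s\n' "$1" | awk '{ print $5 + $6 }'
}

# Succeed if adding $2 more entries stays within the name quota in field 1.
fits_name_quota() {
  printf '%s\n' "$1" | awk -v extra="$2" '
    $1 == "none" { exit 0 }   # no name quota set, so anything fits
    { exit (($5 + $6 + extra <= $1) ? 0 : 1) }
  '
}

# Illustrative line built from this lab's values, not real cluster output.
line="10 7 1073741824 1073741824 2 1 0 /cargo_space"
echo "names in use: $(names_used "$line")"
if fits_name_quota "$line" 8; then
  echo "8 more entries fit"
else
  echo "8 more entries would exceed the name quota"
fi
```

With three names already in use, seven more entries fit under a quota of 10 and an eighth does not.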
Remove Quota Limits on Directories
In this step, you will learn how to remove quota limits for directories in HDFS, which includes both disk space quotas and name quotas set previously.
Removing Disk Space Quotas
- For the /cargo_space directory, run the following command to remove its disk space quota:
hdfs dfsadmin -clrSpaceQuota /cargo_space
This command removes the disk space quota limit for the /cargo_space directory and its subdirectories.
- To confirm that the space quota has been removed, query the directory's statistics again:
hdfs dfs -count -q /cargo_space
Removing Name Quotas
- For the /cargo_space directory, run the following command to remove its name quota (the limit on the number of files and directories):
hdfs dfsadmin -clrQuota /cargo_space
This command removes the file and directory quota limits for the /cargo_space directory and its subdirectories.
- To verify that the quota has been removed, run the following command:
hdfs dfs -count -q /cargo_space
At this point, the /cargo_space directory is no longer subject to the quota limits set earlier, and the first and third fields of the output read none again.
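The final check can also be scripted. The following minimal sketch inspects one line of hdfs dfs -count -q output and reports whether both quotas are gone. The function name quotas_cleared is hypothetical, and the sample line is the unlimited output shown in the first step of this lab.

```shell
#!/usr/bin/env bash
# quotas_cleared is a hypothetical helper name.
# It succeeds only if both the name quota (field 1) and the space quota
# (field 3) of a `hdfs dfs -count -q` output line read "none".
quotas_cleared() {
  printf '%s\n' "$1" | awk '{ exit (($1 == "none" && $3 == "none") ? 0 : 1) }'
}

# Sample line taken from the first step of this lab.
line="none inf none inf 2 1 0 /cargo_space"
if quotas_cleared "$line"; then
  echo "no quotas set"
else
  echo "quota still present"
fi
```

On a live cluster, the line could come from line="$(hdfs dfs -count -q /cargo_space)".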
Summary
Congratulations! You have successfully completed the Hadoop Quota Management lab and learned the basic techniques for managing storage resources in the Hadoop Distributed File System (HDFS). Through the camel caravan scenario, you explored current storage usage, set space quotas and name quotas, and removed them again. This hands-on experience builds practical skills and highlights the importance of efficient resource management in distributed systems such as Hadoop.



