Exporting Data to the Galactic Trade Network
In this step, you will learn how to export processed data from Hadoop to the Galactic Trade Network, ensuring that the cargo information is accessible to all member systems.
First, create a new directory in HDFS called /home/hadoop/exports:
hdfs dfs -mkdir -p /home/hadoop/exports
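Before moving on, you can confirm the directory landed where you expect. A minimal check, assuming the hdfs CLI is available on the node you are working from:

```shell
# Check that the exports directory exists in HDFS.
# "hdfs dfs -test -d" exits 0 when the path is an existing directory.
if command -v hdfs >/dev/null 2>&1; then
  if hdfs dfs -test -d /home/hadoop/exports; then
    check_msg="exports directory is present in HDFS"
  else
    check_msg="exports directory is missing - re-run the mkdir step"
  fi
else
  check_msg="hdfs CLI not found; run this on the Hadoop node"
fi
echo "$check_msg"
```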
Now, launch the Hive shell by executing the following command:
hive
Run a Hive query to process the orion_manifest.csv file and generate a summary report:
CREATE TABLE orion_manifest (
  item STRING,
  quantity INT,
  origin STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;
LOAD DATA INPATH '/home/hadoop/imports/orion_manifest.csv' INTO TABLE orion_manifest;
INSERT OVERWRITE DIRECTORY '/home/hadoop/exports/orion_summary'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
SELECT origin, SUM(quantity) AS total_quantity
FROM orion_manifest
GROUP BY origin;
exit;
This Hive query creates a table from the orion_manifest.csv file, processes the data, and stores the summary report in the /home/hadoop/exports/orion_summary directory in HDFS. Note that LOAD DATA INPATH moves the source file out of /home/hadoop/imports into Hive's warehouse directory rather than copying it.
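If you want to sanity-check what the GROUP BY will produce before running it in Hive, you can mimic the same aggregation locally with awk on a few made-up sample rows (the data below is illustration only, not the real manifest):

```shell
# Build a tiny sample manifest matching the table schema (item,quantity,origin).
cat > /tmp/orion_manifest_sample.csv <<'EOF'
dilithium,10,Vulcan
tritanium,5,Andoria
dilithium,7,Vulcan
EOF

# Reproduce "SELECT origin, SUM(quantity) ... GROUP BY origin":
# column 3 is origin, column 2 is quantity.
summary=$(awk -F',' '{ total[$3] += $2 } END { for (o in total) print o "," total[o] }' \
    /tmp/orion_manifest_sample.csv | sort)
echo "$summary"
# Prints:
# Andoria,5
# Vulcan,17
```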
Export the summary report from HDFS to the local filesystem:
mkdir -p /home/hadoop/galactic_exports
hadoop fs -get /home/hadoop/exports/orion_summary/* /home/hadoop/galactic_exports/
These commands create a galactic_exports directory under /home/hadoop on the local filesystem and copy the files from the /home/hadoop/exports/orion_summary directory in HDFS into it.
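Hive's INSERT OVERWRITE DIRECTORY output arrives as header-less part files (named like 000000_0), which are awkward to share as-is. A small sketch that stitches the parts behind a header row; a mock part file in /tmp stands in for the real export so the example runs anywhere:

```shell
# Mock part file standing in for the real Hive export output.
mkdir -p /tmp/galactic_exports_demo
printf 'Vulcan,17\nAndoria,5\n' > /tmp/galactic_exports_demo/000000_0

# Concatenate all part files behind a header row into one shareable CSV.
{
  echo "origin,total_quantity"
  cat /tmp/galactic_exports_demo/000000_*
} > /tmp/orion_summary.csv

head -n 1 /tmp/orion_summary.csv
# Prints: origin,total_quantity
```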
Finally, upload the summary report to the Galactic Trade Network using the scp command:
scp /home/hadoop/galactic_exports/* localhost:/home/hadoop/incoming/reports/
This command securely copies the files from the galactic_exports directory to the /home/hadoop/incoming/reports/ directory on the localhost server, making the summary report available to all member systems of the Galactic Trade Network. In practice, replace localhost with a real server, e.g. trade.network.com.
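Since the user, host, and destination path will differ in a real deployment, it can help to build the scp command from variables and review it before executing. The user and host below are placeholders, not real endpoints:

```shell
# Hypothetical parameters; substitute your real target.
remote_user="hadoop"
remote_host="trade.network.com"   # placeholder host from the walkthrough
remote_dir="/home/hadoop/incoming/reports"

# Assemble the transfer command so it can be inspected before running.
scp_cmd="scp /home/hadoop/galactic_exports/* ${remote_user}@${remote_host}:${remote_dir}/"
echo "$scp_cmd"
# Prints: scp /home/hadoop/galactic_exports/* hadoop@trade.network.com:/home/hadoop/incoming/reports/

# Uncomment to actually perform the upload:
# eval "$scp_cmd"
```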