Compiling and Running the Job
In this step, we will compile the Java classes and run the MapReduce job on the Hadoop cluster.
First, we need to compile the Java classes:
javac -source 8 -target 8 -classpath $HADOOP_HOME/share/hadoop/common/hadoop-common-3.3.6.jar:$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-core-3.3.6.jar:. *.java
This command compiles the Java classes and places the resulting .class files in the current directory. The -classpath option lists the Hadoop library jars needed to compile code that uses Hadoop classes (alternatively, the hadoop classpath command prints the full runtime classpath, so the jar versions need not be hard-coded). The -source and -target options pin the Java source and bytecode versions to Java 8, matching the Java version used by the Hadoop runtime.
Next, package the compiled class files into a JAR archive with the jar command:
jar -cvf Artifact.jar *.class
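The jar tool packages the .class files into a ZIP-based archive that Hadoop can ship to the cluster nodes. As a side illustration (not part of the lab workflow), the same kind of archive can be created and inspected programmatically with the standard java.util.jar API; the file and entry names below are made up for the example:

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

public class JarDemo {
    public static void main(String[] args) throws IOException {
        File jar = new File("demo.jar");
        // Create a jar with a single (dummy) class-file entry,
        // mirroring what `jar -cvf demo.jar Hello.class` would do.
        try (JarOutputStream out = new JarOutputStream(new FileOutputStream(jar))) {
            out.putNextEntry(new JarEntry("Hello.class"));
            out.write(new byte[] { (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE });
            out.closeEntry();
        }
        // Read the archive back and confirm the entry is present.
        try (JarFile jf = new JarFile(jar)) {
            System.out.println(jf.getJarEntry("Hello.class") != null); // prints "true"
        }
    }
}
```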
Finally, we can run the MapReduce job. The data about the desert is already stored in the /input HDFS directory. Note that the /output directory must not already exist, or the job will fail; if a previous run created it, remove it first with hdfs dfs -rm -r /output. Then launch the job:
hadoop jar Artifact.jar ArtifactDriver /input /output
After executing the command, you should see logs indicating the progress of the MapReduce job. Once the job completes, you can find the output files in the /output
HDFS directory. Use the following commands to list and view the results:
hdfs dfs -ls /output
hdfs dfs -cat /output/part-r-00000
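With Hadoop's default TextOutputFormat, each line of part-r-00000 holds one key and its value separated by a tab character. As a minimal sketch of how such a line can be consumed downstream (the key "pottery" and value "3" are invented sample values, not actual job output):

```java
public class OutputLine {
    // Split one TextOutputFormat line ("key<TAB>value") into its two fields.
    static String[] parse(String line) {
        int tab = line.indexOf('\t');
        return new String[] { line.substring(0, tab), line.substring(tab + 1) };
    }

    public static void main(String[] args) {
        // Sample line for illustration only; real keys/values come from the reducer.
        String[] kv = parse("pottery\t3");
        System.out.println(kv[0] + " = " + kv[1]); // prints "pottery = 3"
    }
}
```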