Implementing the Mapper and Reducer
In this step, we will create a Mapper and a Reducer class to process the book data using the MapReduce paradigm.
Custom BookMapper
First, create a new Java file named BookMapper.java in the /home/hadoop directory with the following content:
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
// BookMapper extends the Mapper class to process text input files
// Input key-value pairs are LongWritable (line number) and Text (line content)
// Output key-value pairs are Text (author name) and Book (book details)
public class BookMapper extends Mapper<LongWritable, Text, Text, Book> {
    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        // Split the input line by comma
        String[] bookData = value.toString().split(",");
        // Skip malformed lines that do not contain exactly three fields
        if (bookData.length != 3) {
            return;
        }
        // Extract title, author, and year from the input line
        String title = bookData[0];
        String author = bookData[1];
        int year = Integer.parseInt(bookData[2]);
        // Write the author and book details to the context
        context.write(new Text(author), new Book(title, author, year));
    }
}
This BookMapper class takes a line of input data in the format "title,author,year" and emits a key-value pair with the author as the key and a Book object as the value.
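To see the per-line parsing in isolation, here is a standalone sketch of the same split-and-extract logic, runnable without Hadoop. The sample line and class name are hypothetical, chosen only for illustration:

```java
// Standalone sketch of the parsing step BookMapper applies to each input line.
// No Hadoop dependencies; the sample record is hypothetical.
public class ParseSketch {
    public static void main(String[] args) {
        String line = "The Hobbit,J.R.R. Tolkien,1937";
        // Same splitting logic as in BookMapper.map
        String[] bookData = line.split(",");
        String title = bookData[0];
        String author = bookData[1];
        int year = Integer.parseInt(bookData[2]);
        // The author becomes the output key; the rest becomes the Book value
        System.out.println(author + " -> " + title + " (" + year + ")");
    }
}
```

Note that a comma-separated format like this assumes titles themselves contain no commas; records that violate that assumption would split into more than three fields.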
Custom BookReducer
Next, create a new Java file named BookReducer.java in the /home/hadoop directory with the following content:
import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
// BookReducer extends the Reducer class to aggregate book details by author
// Input key-value pairs are Text (author name) and Book (book details)
// Output key-value pairs are Text (author name) and Book (aggregated book details)
public class BookReducer extends Reducer<Text, Book, Text, Book> {
    @Override
    protected void reduce(Text key, Iterable<Book> values, Context context) throws IOException, InterruptedException {
        // Iterate through the books for the same author and write each one to the context
        for (Book book : values) {
            context.write(key, book);
        }
    }
}
This BookReducer class simply emits the input key-value pairs as-is, effectively grouping the books by author.
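The grouping itself happens in the shuffle-and-sort phase between map and reduce: the framework collects every value emitted under the same key and hands them to a single reduce call. In plain Java, that step can be sketched as follows; the author/title records are hypothetical:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class GroupSketch {
    public static void main(String[] args) {
        // Mapper output: (author, title) pairs, in arbitrary order
        String[][] pairs = {
            {"Tolkien", "The Hobbit"},
            {"Orwell", "1984"},
            {"Tolkien", "The Silmarillion"}
        };
        // The shuffle phase groups values by key before reduce runs
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (String[] pair : pairs) {
            grouped.computeIfAbsent(pair[0], k -> new ArrayList<>()).add(pair[1]);
        }
        // Each entry corresponds to one reduce(key, values) invocation
        grouped.forEach((author, titles) ->
            System.out.println(author + " -> " + titles));
    }
}
```

Because the framework has already done this grouping, the reducer above only needs to loop over the values it receives.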
Compile The Files
Finally, compile the Java classes using the following command:
## Compile the Java classes
javac -source 8 -target 8 -classpath $HADOOP_HOME/share/hadoop/common/hadoop-common-3.3.6.jar:$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-core-3.3.6.jar:. BookMapper.java BookReducer.java