public class GuaguaMapper<MASTER_RESULT extends ml.shifu.guagua.io.Bytable,WORKER_RESULT extends ml.shifu.guagua.io.Bytable>
extends org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>
GuaguaMapper is the Hadoop Mapper implementation for both the guagua master and guagua workers.
Use `(GuaguaInputSplit) context.getInputSplit()` to check whether a task is a guagua master or a guagua worker.
`guaguaService` is the common interface for both the guagua master and worker implementations; according to
`isMaster`, either the master service or the worker service is selected.
The guagua MapReduce implementation uses only a mapper, no reducer. In this mapper,
`run(org.apache.hadoop.mapreduce.Mapper.Context)` is overridden while
`Mapper.map(Object, Object, org.apache.hadoop.mapreduce.Mapper.Context)` is not, since there is no need to iterate
over the mapper's raw input.
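The master/worker dispatch described above can be sketched roughly as below. This is a minimal illustration only, not the real Guagua code: `GuaguaInputSplitStub` and `GuaguaServiceStub` are simplified stand-ins for `GuaguaInputSplit` and `GuaguaService`, assuming the split exposes an `isMaster()` flag as the description suggests.

```java
// Minimal sketch (not the real GuaguaMapper): selecting a master or worker
// service based on the input split's isMaster flag.
public class MasterWorkerDispatchSketch {

    // Stand-in for ml.shifu.guagua.hadoop.io.GuaguaInputSplit (assumed shape).
    static class GuaguaInputSplitStub {
        private final boolean master;
        GuaguaInputSplitStub(boolean master) { this.master = master; }
        boolean isMaster() { return master; }
    }

    // Stand-in for ml.shifu.guagua.GuaguaService implementations.
    interface GuaguaServiceStub { String name(); }

    static GuaguaServiceStub selectService(GuaguaInputSplitStub split) {
        // The real mapper does the equivalent of
        // ((GuaguaInputSplit) context.getInputSplit()).isMaster()
        // to decide which service runs in this task.
        return split.isMaster()
                ? () -> "master-service"
                : () -> "worker-service";
    }

    public static void main(String[] args) {
        System.out.println(selectService(new GuaguaInputSplitStub(true)).name());
        System.out.println(selectService(new GuaguaInputSplitStub(false)).name());
    }
}
```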
| Constructor and Description |
|---|
| `GuaguaMapper()` |
| Modifier and Type | Method and Description |
|---|---|
| `protected void` | `cleanup(org.apache.hadoop.mapreduce.Mapper.Context context)` |
| `ml.shifu.guagua.GuaguaService` | `getGuaguaService()` |
| `boolean` | `isMaster()` |
| `void` | `run(org.apache.hadoop.mapreduce.Mapper.Context context)` Run guagua service according to the `isMaster` setting. |
| `void` | `setGuaguaService(ml.shifu.guagua.GuaguaService guaguaService)` |
| `void` | `setMaster(boolean isMaster)` |
| `protected void` | `setup(org.apache.hadoop.mapreduce.Mapper.Context context)` |
protected void setup(org.apache.hadoop.mapreduce.Mapper.Context context)
              throws IOException, InterruptedException

Overrides: `setup` in class `org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>`

Throws: `IOException`, `InterruptedException`

public void run(org.apache.hadoop.mapreduce.Mapper.Context context)
            throws IOException, InterruptedException

Run the guagua service according to the `isMaster` setting. Iteration and coordination are handled inside the
service run.
`cleanup(org.apache.hadoop.mapreduce.Mapper.Context)` is called in a finally block to make sure resources
are cleaned up.
Guagua tries its best to update progress for each iteration; task status is also updated on each iteration in the Hadoop job web UI.

Overrides: `run` in class `org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>`

Throws: `IOException`, `InterruptedException`

protected void cleanup(org.apache.hadoop.mapreduce.Mapper.Context context)
                throws IOException, InterruptedException

Overrides: `cleanup` in class `org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text,org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>`

Throws: `IOException`, `InterruptedException`

public boolean isMaster()
public void setMaster(boolean isMaster)
public ml.shifu.guagua.GuaguaService getGuaguaService()
public void setGuaguaService(ml.shifu.guagua.GuaguaService guaguaService)
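The `run()` lifecycle described above (service running per iteration, progress reported each iteration, cleanup guaranteed in a finally block) can be sketched as below. The `Service` interface and method names here are illustrative stand-ins under assumed shapes, not the real `ml.shifu.guagua.GuaguaService` API.

```java
// Hedged sketch of the run() lifecycle: start, iterate with per-iteration
// progress reporting, and cleanup in a finally block so resources are
// released even if an iteration fails.
public class RunLifecycleSketch {

    // Illustrative stand-in for a guagua-style service (assumed shape).
    interface Service {
        void start();
        boolean runOneIteration();   // returns true while more iterations remain
        void stop();
    }

    static int runService(Service service,
                          java.util.function.IntConsumer progressReporter) {
        int iteration = 0;
        try {
            service.start();
            while (service.runOneIteration()) {
                iteration++;
                // Mirrors how the mapper updates task progress/status each
                // iteration (visible in the Hadoop job web UI).
                progressReporter.accept(iteration);
            }
        } finally {
            // Cleanup lives in finally, like cleanup(Context) in the mapper.
            service.stop();
        }
        return iteration;
    }

    public static void main(String[] args) {
        final int[] remaining = {3};
        int done = runService(new Service() {
            public void start() { }
            public boolean runOneIteration() { return remaining[0]-- > 0; }
            public void stop() { }
        }, i -> System.out.println("progress: iteration " + i));
        System.out.println("iterations run: " + done);
    }
}
```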
Copyright © 2019. All Rights Reserved.