Class GuaguaMapper<MASTER_RESULT extends ml.shifu.guagua.io.Bytable, WORKER_RESULT extends ml.shifu.guagua.io.Bytable>


  • public class GuaguaMapper<MASTER_RESULT extends ml.shifu.guagua.io.Bytable, WORKER_RESULT extends ml.shifu.guagua.io.Bytable>
    extends org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.LongWritable, org.apache.hadoop.io.Text, org.apache.hadoop.io.Text, org.apache.hadoop.io.Text>
    GuaguaMapper is the Hadoop Mapper implementation for both guagua master and guagua workers.

    Use (GuaguaInputSplit) context.getInputSplit() to check whether this task is the guagua master or a guagua worker.

    guaguaService is the common interface for both the guagua master and worker implementations. The value of isMaster determines whether the master service or the worker service is used.

    The guagua MapReduce implementation uses only a mapper, with no reducer. This mapper overrides run(org.apache.hadoop.mapreduce.Mapper.Context) but not Mapper.map(Object, Object, org.apache.hadoop.mapreduce.Mapper.Context), because guagua does not need to iterate over the mapper's raw input.
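
The role selection described above can be sketched in plain Java without any Hadoop dependency. This is only an illustration of the pattern, not guagua's actual code: RoleSplit, RoleService, MasterRoleService, WorkerRoleService and RoleSelector are hypothetical stand-ins for the input split, the shared service interface and its two implementations.

```java
// Hypothetical stand-in for an input split that carries a master/worker flag.
final class RoleSplit {
    private final boolean master;

    RoleSplit(boolean master) {
        this.master = master;
    }

    boolean isMaster() {
        return master;
    }
}

// Hypothetical stand-in for the interface shared by master and worker services.
interface RoleService {
    String name();
}

final class MasterRoleService implements RoleService {
    @Override
    public String name() {
        return "master";
    }
}

final class WorkerRoleService implements RoleService {
    @Override
    public String name() {
        return "worker";
    }
}

final class RoleSelector {
    private RoleSelector() {
    }

    // Picks the service implementation from the flag carried by the split.
    static RoleService select(RoleSplit split) {
        return split.isMaster() ? new MasterRoleService() : new WorkerRoleService();
    }
}
```

Because both implementations sit behind one interface, the code that drives the service does not need to know which role the task was assigned.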

    • Nested Class Summary

      • Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.Mapper

        org.apache.hadoop.mapreduce.Mapper.Context
    • Constructor Summary

      Constructors 
      Constructor Description
      GuaguaMapper()  
    • Constructor Detail

      • GuaguaMapper

        public GuaguaMapper()
    • Method Detail

      • setup

        protected void setup(org.apache.hadoop.mapreduce.Mapper.Context context)
                      throws IOException,
                             InterruptedException
        Overrides:
        setup in class org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.LongWritable, org.apache.hadoop.io.Text, org.apache.hadoop.io.Text, org.apache.hadoop.io.Text>
        Throws:
        IOException
        InterruptedException
      • run

        public void run(org.apache.hadoop.mapreduce.Mapper.Context context)
                 throws IOException,
                        InterruptedException
        Runs the guagua service according to the isMaster setting. Iteration and coordination are handled inside the running service.

        cleanup(org.apache.hadoop.mapreduce.Mapper.Context) is called in a finally block to make sure resources are released.

        Guagua makes a best effort to update progress on each iteration. The task status shown in the Hadoop job web UI is also updated on each iteration.

        Overrides:
        run in class org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.LongWritable, org.apache.hadoop.io.Text, org.apache.hadoop.io.Text, org.apache.hadoop.io.Text>
        Throws:
        IOException
        InterruptedException
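
The control flow described for run (the service drives the iterations, progress is reported once per iteration, and cleanup always runs in a finally block) can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the actual GuaguaMapper source: ProgressListener, IterativeService, FixedIterationService and RunLoopSketch are hypothetical names, and the failAt parameter exists only to simulate a failing iteration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback invoked once per completed iteration.
interface ProgressListener {
    void onIteration(int current, int total);
}

// Hypothetical stand-in for a service that owns the iteration loop.
interface IterativeService {
    void run(ProgressListener listener);

    void stop();
}

// Runs a fixed number of iterations; failAt == 0 means it never fails.
final class FixedIterationService implements IterativeService {
    private final int total;
    private final int failAt;
    boolean stopped = false;

    FixedIterationService(int total, int failAt) {
        this.total = total;
        this.failAt = failAt;
    }

    @Override
    public void run(ProgressListener listener) {
        for (int i = 1; i <= total; i++) {
            if (i == failAt) {
                throw new IllegalStateException("iteration " + i + " failed");
            }
            listener.onIteration(i, total);
        }
    }

    @Override
    public void stop() {
        stopped = true;
    }
}

final class RunLoopSketch {
    // Records events in order so the sequence can be inspected afterwards.
    final List<String> events = new ArrayList<>();

    void run(IterativeService service) {
        try {
            // The service iterates; this method only receives progress callbacks.
            service.run((current, total) -> events.add("progress " + current + "/" + total));
        } finally {
            // Runs whether the service finished normally or threw.
            cleanup(service);
        }
    }

    void cleanup(IterativeService service) {
        service.stop();
        events.add("cleanup");
    }
}
```

With new FixedIterationService(3, 2), the second iteration throws, yet the recorded events still end with "cleanup" because the finally block runs before the exception propagates to the caller.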
      • cleanup

        protected void cleanup(org.apache.hadoop.mapreduce.Mapper.Context context)
                        throws IOException,
                               InterruptedException
        Overrides:
        cleanup in class org.apache.hadoop.mapreduce.Mapper<org.apache.hadoop.io.LongWritable, org.apache.hadoop.io.Text, org.apache.hadoop.io.Text, org.apache.hadoop.io.Text>
        Throws:
        IOException
        InterruptedException
      • isMaster

        public boolean isMaster()
      • setMaster

        public void setMaster(boolean isMaster)
      • getGuaguaService

        public ml.shifu.guagua.GuaguaService getGuaguaService()
      • setGuaguaService

        public void setGuaguaService(ml.shifu.guagua.GuaguaService guaguaService)