Class GuaguaMapReduceClient


  • public class GuaguaMapReduceClient
    extends Object
    GuaguaMapReduceClient is the entry point for a Guagua MapReduce implementation application.

    To use it in normal Hadoop mode, use main(String[]) as the entry point.

    To run jobs in parallel:

     GuaguaMapReduceClient client = new GuaguaMapReduceClient();
     client.addJob(args);
     client.addJob(args);
     client.run();
     

    WARNING: Within one GuaguaMapReduceClient instance, make sure jobs added through addJob(String[]) have no duplicated job names.

    If a job fails, it is re-submitted and retried; after two failed attempts it is not retried again.

    • Method Summary

      Modifier and Type Method Description
      static void addInputPath​(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path path)  
      void addJob​(String[] args)
      Add new job to JobControl instance.
      protected double calculateProgress​(Set<String> successJobs, org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl jc, org.apache.hadoop.mapred.JobClient jobClient)
      Compute the progress of the current job submitted through the JobControl object jc to the JobClient jobClient.
      org.apache.hadoop.mapreduce.Job createJob​(String[] args)
      Create Hadoop job according to arguments from main.
      static void main​(String[] args)  
      protected double progressOfRunningJob​(org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob cjob, org.apache.hadoop.mapred.JobClient jobClient)
      Returns the progress of a Job j which is part of a submitted JobControl object.
      int run()
      Run all jobs added to JobControl.
      String toFakedStateString​(org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob controlledJob)  
    • Constructor Detail

      • GuaguaMapReduceClient

        public GuaguaMapReduceClient()
        Default constructor. Construct default JobControl instance.
    • Method Detail

      • run

        public int run()
                throws IOException
        Run all jobs added to JobControl.
        Returns:
        0 if all jobs run successfully, 1 if any job fails
        Throws:
        IOException
      • toFakedStateString

        public String toFakedStateString​(org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob controlledJob)
      • calculateProgress

        protected double calculateProgress​(Set<String> successJobs,
                                           org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl jc,
                                           org.apache.hadoop.mapred.JobClient jobClient)
                                    throws IOException
        Compute the progress of the current job submitted through the JobControl object jc to the JobClient jobClient.
        Parameters:
        jc - The JobControl object that has been submitted
        jobClient - The JobClient to which it has been submitted
        Returns:
        The progress as a percentage in double format
        Throws:
        IOException - If an IOException occurs while connecting to the JobTracker.
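        The aggregation performed by calculateProgress can be sketched in plain Java. This is a minimal illustration, not the actual implementation: the helper names and the way a running job's fraction is tracked are assumptions; only the idea that succeeded jobs count as fully done and the sum is normalized by the total job count comes from the description above.

        ```java
        import java.util.Arrays;
        import java.util.HashMap;
        import java.util.HashSet;
        import java.util.List;
        import java.util.Map;
        import java.util.Set;

        public class ProgressSketch {

            /**
             * Sketch of overall progress across a set of jobs: each succeeded job
             * counts as fully done (1.0), each still-running job contributes its
             * fractional progress, and the sum is normalized by the total number
             * of jobs and returned as a percentage.
             */
            static double calculateProgress(Set<String> successJobs,
                                            List<String> allJobs,
                                            Map<String, Double> runningProgress) {
                double done = 0d;
                for (String job : allJobs) {
                    if (successJobs.contains(job)) {
                        done += 1d;                       // finished jobs are 100% done
                    } else if (runningProgress.containsKey(job)) {
                        done += runningProgress.get(job); // running jobs are partly done
                    }
                }
                return 100d * done / allJobs.size();
            }

            public static void main(String[] args) {
                Set<String> success = new HashSet<>(Arrays.asList("job-1"));
                Map<String, Double> running = new HashMap<>();
                running.put("job-2", 0.5d); // hypothetical running job at 50%
                double p = calculateProgress(success,
                        Arrays.asList("job-1", "job-2"), running);
                System.out.println(p); // 75.0
            }
        }
        ```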
      • progressOfRunningJob

        protected double progressOfRunningJob​(org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob cjob,
                                              org.apache.hadoop.mapred.JobClient jobClient)
                                       throws IOException
        Returns the progress of a Job j which is part of a submitted JobControl object. The progress is for this Job alone, so it has to be scaled down by the number of jobs that are present in the JobControl.
        Parameters:
        cjob - The Job for which progress is required
        jobClient - The JobClient to which it has been submitted
        Returns:
        The percentage progress of this Job
        Throws:
        IOException - If an IOException occurs while connecting to the JobTracker.
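        The scaling-down described above can be illustrated with a self-contained sketch. The 50/50 weighting of map and reduce progress is an assumption for illustration (it mirrors a common convention for overall job progress), not a statement about this class's actual arithmetic.

        ```java
        public class RunningJobProgressSketch {

            /**
             * Sketch: progress of one running job, scaled down by the number of
             * jobs in the whole JobControl so that per-job contributions sum to
             * at most 1.0. The equal map/reduce weighting is assumed.
             */
            static double progressOfRunningJob(double mapProgress,
                                               double reduceProgress,
                                               int totalJobs) {
                double jobProgress = (mapProgress + reduceProgress) / 2d;
                return jobProgress / totalJobs;
            }

            public static void main(String[] args) {
                // One job of four, maps done, reduces half done:
                // contributes (1.0 + 0.5) / 2 / 4 = 0.1875 to the overall progress.
                System.out.println(progressOfRunningJob(1.0, 0.5, 4));
            }
        }
        ```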
      • addInputPath

        public static void addInputPath​(org.apache.hadoop.conf.Configuration conf,
                                        org.apache.hadoop.fs.Path path)
                                 throws IOException
        Throws:
        IOException
      • createJob

        public org.apache.hadoop.mapreduce.Job createJob​(String[] args)
                                                  throws IOException
        Create Hadoop job according to arguments from main.
        Throws:
        IOException