Package ml.shifu.guagua.mapreduce
Class GuaguaMapReduceClient
java.lang.Object
    ml.shifu.guagua.mapreduce.GuaguaMapReduceClient
public class GuaguaMapReduceClient extends Object
GuaguaMapReduceClient is the entry point for the guagua MapReduce implementation. To use it in normal Hadoop mode, use main(String[]) as the entry point. To run jobs in parallel:

    GuaguaMapReduceClient client = new GuaguaMapReduceClient();
    client.addJob(args);
    client.addJob(args);
    client.run();

WARNING: Within one GuaguaMapReduceClient instance, make sure jobs added via addJob(String[]) do not have duplicated job names. If a job fails, it is re-submitted and retried; once it has failed more than two times, it is not retried again.
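The retry behavior described above can be sketched as a small predicate. This is a hypothetical illustration, not the client's actual bookkeeping; in particular, reading "over two" as "more than two recorded failures" is an assumption.

```java
// Hypothetical sketch of the retry policy described above: a failed job is
// re-submitted until its failure count exceeds two (assumed threshold).
public class RetryPolicySketch {
    static final int MAX_FAILURES = 2; // assumed from "if failed times over two, no re-try"

    // true if the job should be re-submitted after `failedTimes` failures
    static boolean shouldRetry(int failedTimes) {
        return failedTimes <= MAX_FAILURES;
    }

    public static void main(String[] args) {
        System.out.println(shouldRetry(1)); // a job that failed once is retried
        System.out.println(shouldRetry(3)); // more than two failures: no retry
    }
}
```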
-
Constructor Summary
GuaguaMapReduceClient()
    Default constructor.
-
Method Summary
Modifier and Type                 Method and Description

static void                       addInputPath(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path path)

void                              addJob(String[] args)
                                      Add new job to JobControl instance.

protected double                  calculateProgress(Set<String> successJobs, org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl jc, org.apache.hadoop.mapred.JobClient jobClient)
                                      Compute the progress of the current job submitted through the JobControl object jc to the JobClient jobClient.

org.apache.hadoop.mapreduce.Job   createJob(String[] args)
                                      Create Hadoop job according to arguments from main.

static void                       main(String[] args)

protected double                  progressOfRunningJob(org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob cjob, org.apache.hadoop.mapred.JobClient jobClient)
                                      Returns the progress of a Job which is part of a submitted JobControl object.

int                               run()
                                      Run all jobs added to JobControl.

String                            toFakedStateString(org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob controlledJob)
-
Method Detail
-
addJob
public void addJob(String[] args) throws IOException

Add new job to JobControl instance.

Throws:
    IOException
-
run
public int run() throws IOException

Run all jobs added to JobControl.

Returns:
    0 if all jobs run successfully, 1 if any job fails
Throws:
    IOException
-
toFakedStateString
public String toFakedStateString(org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob controlledJob)
-
calculateProgress
protected double calculateProgress(Set<String> successJobs, org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl jc, org.apache.hadoop.mapred.JobClient jobClient) throws IOException

Compute the progress of the current job submitted through the JobControl object jc to the JobClient jobClient.

Parameters:
    jc - The JobControl object that has been submitted
    jobClient - The JobClient to which it has been submitted
Returns:
    The progress as a percentage in double format
Throws:
    IOException - In case of any IOException connecting to the JobTracker.
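A plain-Java sketch of how such an overall figure could be aggregated. The formula below is an assumption for illustration only; the real calculateProgress reads live job state from the JobClient.

```java
// Hypothetical aggregation: each of the totalJobs jobs contributes an equal
// share; succeeded jobs count fully, running jobs by their own progress.
public class ProgressAggregationSketch {
    // successCount: number of already-succeeded jobs
    // runningProgress: per-running-job progress values in [0, 1]
    // totalJobs: total number of jobs submitted through the JobControl
    static double calculateProgress(int successCount, double[] runningProgress, int totalJobs) {
        double done = successCount;
        for (double p : runningProgress) {
            done += p;
        }
        return done / totalJobs; // fraction in [0, 1]
    }

    public static void main(String[] args) {
        // 1 finished job plus one running at 50%, out of 2 jobs total
        System.out.println(calculateProgress(1, new double[] { 0.5 }, 2)); // 0.75
    }
}
```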
-
progressOfRunningJob
protected double progressOfRunningJob(org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob cjob, org.apache.hadoop.mapred.JobClient jobClient) throws IOException

Returns the progress of a Job which is part of a submitted JobControl object. The progress here is for this Job alone, so it has to be scaled down by the number of jobs present in the JobControl.

Parameters:
    cjob - The Job for which progress is required
    jobClient - The JobClient to which it has been submitted
Returns:
    The percentage progress of this Job
Throws:
    IOException - In case of any IOException connecting to the JobTracker.
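The scaling mentioned above can be illustrated with plain arithmetic. Averaging the map and reduce phases is an assumption made for this sketch; the actual method queries the JobClient for live task progress.

```java
// Hypothetical scaling: one job's progress (here, the average of its map and
// reduce phases) divided by the number of jobs in the JobControl, so that all
// jobs together sum to at most 1.0.
public class JobProgressSketch {
    static double progressOfRunningJob(double mapProgress, double reduceProgress, int totalJobs) {
        double jobProgress = (mapProgress + reduceProgress) / 2.0; // assumed phase weighting
        return jobProgress / totalJobs;
    }

    public static void main(String[] args) {
        // maps done, reduces halfway, 3 jobs in the JobControl
        System.out.println(progressOfRunningJob(1.0, 0.5, 3)); // 0.25
    }
}
```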
-
addInputPath
public static void addInputPath(org.apache.hadoop.conf.Configuration conf, org.apache.hadoop.fs.Path path) throws IOException

Throws:
    IOException
-
createJob
public org.apache.hadoop.mapreduce.Job createJob(String[] args) throws IOException

Create Hadoop job according to arguments from main.

Throws:
    IOException
-
main
public static void main(String[] args) throws IOException, InterruptedException, ClassNotFoundException