public class GuaguaMRRecordReader
extends org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
GuaguaMRRecordReader is a mock implementation of the MapReduce reader interface; it does not read any real data.
To update progress, both currentIteration and totalIterations must be set. currentIteration
should only be set in GuaguaMapper.run.
Why is currentIteration static? The current iteration of a task cannot be passed to
GuaguaRecordReader because the mapper context provides no API for it, so a static field is used to update
the current iteration.
If currentIteration is not set in each iteration, it can only start from 0. This progress update does not
work well for task fail-over (TODO).
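The static-field approach described above can be illustrated with a minimal sketch that does not depend on Hadoop. The class name MockIterationReader, the ratio used in getProgress, and the stopping rule in nextKeyValue are assumptions made for illustration only; they are not taken from the Guagua source.

```java
// Hypothetical stand-in for the reader described above; it has no Hadoop dependency.
final class MockIterationReader {
    // Shared by every reader in the JVM, mirroring the static field in the description.
    private static volatile int currentIteration = 0;

    private final int totalIterations;

    // Default constructor: totalIterations defaults to 0.
    MockIterationReader() {
        this(0);
    }

    MockIterationReader(int totalIterations) {
        this.totalIterations = totalIterations;
    }

    // Called by the driver (in Guagua, the mapper) once per iteration.
    static void setCurrentIteration(int iteration) {
        currentIteration = iteration;
    }

    // Assumption: iteration continues while currentIteration is below totalIterations.
    boolean nextKeyValue() {
        return currentIteration < totalIterations;
    }

    // Assumption: progress is the completed fraction, clamped to 1; 0 when no total is known.
    float getProgress() {
        if (totalIterations <= 0) {
            return 0f;
        }
        return Math.min(1f, (float) currentIteration / totalIterations);
    }
}

// Small driver showing how progress follows the static iteration counter.
final class ProgressDemo {
    public static void main(String[] args) {
        MockIterationReader reader = new MockIterationReader(4);
        for (int i = 0; i <= 4; i++) {
            MockIterationReader.setCurrentIteration(i);
            System.out.println("iteration " + i + " progress " + reader.getProgress());
        }
    }
}
```

Because the counter is static, a new task attempt in a fresh JVM starts again from 0 unless the driver sets it, which is consistent with the fail-over limitation noted above.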
| Constructor and Description |
|---|
| GuaguaMRRecordReader() Default constructor; totalIterations is set to the default of 0. |
| GuaguaMRRecordReader(int totalIterations) Constructor with totalIterations setting. |
| Modifier and Type | Method and Description |
|---|---|
| void | close() |
| org.apache.hadoop.io.LongWritable | getCurrentKey() This is a mock to hide Hadoop raw map iteration on map input key. |
| org.apache.hadoop.io.Text | getCurrentValue() This is a mock to hide Hadoop raw map iteration on map input value. |
| float | getProgress() In each iteration context.nextKeyValue should be called and currentIteration is updated, so the progress is updated. |
| void | initialize(org.apache.hadoop.mapreduce.InputSplit inputSplit, org.apache.hadoop.mapreduce.TaskAttemptContext context) |
| boolean | nextKeyValue() Update iteration number. |
| static void | setCurrentIteration(int currentIteration) Should only be called in GuaguaMapper Progress callback. |
public GuaguaMRRecordReader()
Default constructor; totalIterations is set to the default of 0.

public GuaguaMRRecordReader(int totalIterations)
Constructor with totalIterations setting.
Parameters: totalIterations - total iterations for such guagua job.

public void close() throws IOException
Specified by: close in interface Closeable
Specified by: close in interface AutoCloseable
Specified by: close in class org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
Throws: IOException

public float getProgress() throws IOException
In each iteration context.nextKeyValue should be called and currentIteration is updated, so the progress is updated.
Specified by: getProgress in class org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
Throws: IOException

public org.apache.hadoop.io.LongWritable getCurrentKey() throws IOException, InterruptedException
Specified by: getCurrentKey in class org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
Throws: IOException, InterruptedException

public org.apache.hadoop.io.Text getCurrentValue() throws IOException, InterruptedException
Specified by: getCurrentValue in class org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
Throws: IOException, InterruptedException

public void initialize(org.apache.hadoop.mapreduce.InputSplit inputSplit, org.apache.hadoop.mapreduce.TaskAttemptContext context) throws IOException, InterruptedException
Specified by: initialize in class org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
Throws: IOException, InterruptedException

public boolean nextKeyValue() throws IOException, InterruptedException
Specified by: nextKeyValue in class org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
Throws: IOException, InterruptedException

public static void setCurrentIteration(int currentIteration)
Copyright © 2019. All Rights Reserved.