Type Parameters:
MASTER_RESULT - master result for computation in each iteration.
WORKER_RESULT - worker result for computation in each iteration.
KEY - key type for each record.
VALUE - value type for each record.

public abstract class AbstractCombineWorkerComputable<MASTER_RESULT extends Bytable,WORKER_RESULT extends Bytable,KEY extends Bytable,VALUE extends Bytable> extends Object implements WorkerComputable<MASTER_RESULT,WORKER_RESULT>

Like AbstractWorkerComputable, this class exposes only doCompute(Bytable, Bytable, WorkerContext), which is called for each record, to the user. Here, however, the first iteration, which loads the data, is included in the computation.
The worker result should be updated in doCompute(Bytable, Bytable, WorkerContext); it is sent to the master once all records have been processed in an iteration.
To load data successfully, make sure the GuaguaRecordReader is initialized first in initRecordReader(GuaguaFileSplit):

    this.setRecordReader(new GuaguaSequenceAsTextRecordReader());
    this.getRecordReader().initialize(fileSplit);

or use the other constructor directly:

    this.setRecordReader(new GuaguaSequenceAsTextRecordReader(fileSplit));

After the data is loaded in the first iteration, it can be stored in a collection (in memory or on disk) for use in later iterations. Guarding against out-of-memory (OOM) errors is the user's responsibility.
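
The load-once, reuse-later pattern can be illustrated without the Guagua library. The following is a minimal sketch, not Guagua's implementation: `Record` and `CachingWorker` are hypothetical stand-ins invented for this example, and the record cap is just one assumed way of failing fast before the heap is exhausted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CachingWorkerSketch {

    /** Hypothetical stand-in for one key/value record read from a split. */
    static final class Record {
        final String key;
        final double value;

        Record(String key, double value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Hypothetical worker: reads the source once, then iterates over its cache. */
    static final class CachingWorker {
        private final List<Record> cache = new ArrayList<>();
        private final int maxCachedRecords; // assumed guard against OOM
        private boolean loaded = false;
        private int sourceReads = 0;

        CachingWorker(int maxCachedRecords) {
            this.maxCachedRecords = maxCachedRecords;
        }

        /** One iteration: returns the sum of all record values. */
        double compute(Iterable<Record> source) {
            double sum = 0.0;
            if (!loaded) {
                // First iteration: load and compute in the same pass.
                for (Record r : source) {
                    sourceReads++;
                    if (cache.size() >= maxCachedRecords) {
                        throw new IllegalStateException("cache limit exceeded");
                    }
                    cache.add(r);
                    sum += r.value;
                }
                loaded = true;
            } else {
                // Later iterations: the source is not touched again.
                for (Record r : cache) {
                    sum += r.value;
                }
            }
            return sum;
        }

        int getSourceReads() {
            return sourceReads;
        }
    }

    public static void main(String[] args) {
        List<Record> data = Arrays.asList(
                new Record("a", 1.0), new Record("b", 2.0), new Record("c", 3.5));
        CachingWorker worker = new CachingWorker(10);
        System.out.println(worker.compute(data));     // first iteration: prints 6.5
        System.out.println(worker.compute(data));     // second iteration: prints 6.5
        System.out.println(worker.getSourceReads());  // prints 3
    }
}
```

In a real subclass the cache would hold whatever the record reader produces; a disk-backed collection could replace the list when the data does not fit in memory.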
| Modifier | Constructor and Description |
|---|---|
| protected | AbstractCombineWorkerComputable() |
| protected | AbstractCombineWorkerComputable(boolean isOrder) |

| Modifier and Type | Method and Description |
|---|---|
| WORKER_RESULT | compute(WorkerContext<MASTER_RESULT,WORKER_RESULT> context) Worker computation for each iteration. |
| abstract void | doCompute(KEY currentKey, VALUE currentValue, WorkerContext<MASTER_RESULT,WORKER_RESULT> context) Computation for each record; all updates can be set on WORKER_RESULT via context.setCurrentWorkerResult(WORKER_RESULT); |
| GuaguaRecordReader<KEY,VALUE> | getRecordReader() |
| abstract void | init(WorkerContext<MASTER_RESULT,WORKER_RESULT> workerContext) Initialization work for the whole computation. |
| abstract void | initRecordReader(GuaguaFileSplit fileSplit) Each GuaguaFileSplit must be initialized before loading data. |
| protected void | postLoad(WorkerContext<MASTER_RESULT,WORKER_RESULT> workerContext) Do some post work after loading data. |
| protected void | preLoad(WorkerContext<MASTER_RESULT,WORKER_RESULT> workerContext) Do some pre work before loading data. |
| void | setRecordReader(GuaguaRecordReader<KEY,VALUE> recordReader) |

protected AbstractCombineWorkerComputable()

protected AbstractCombineWorkerComputable(boolean isOrder)

public WORKER_RESULT compute(WorkerContext<MASTER_RESULT,WORKER_RESULT> context) throws IOException
Description copied from interface: WorkerComputable
Worker computation for each iteration.
Specified by: compute in interface WorkerComputable<MASTER_RESULT extends Bytable,WORKER_RESULT extends Bytable>
Parameters: context - the worker context instance, which includes worker info, the master result of the last iteration, and other useful info for each iteration.
Throws: IOException - if any I/O exception occurs during computation, for example while reading data.

protected void preLoad(WorkerContext<MASTER_RESULT,WORKER_RESULT> workerContext)

protected void postLoad(WorkerContext<MASTER_RESULT,WORKER_RESULT> workerContext)

public abstract void initRecordReader(GuaguaFileSplit fileSplit) throws IOException
Each GuaguaFileSplit must be initialized before loading data.
Throws: IOException

public abstract void init(WorkerContext<MASTER_RESULT,WORKER_RESULT> workerContext)

public abstract void doCompute(KEY currentKey, VALUE currentValue, WorkerContext<MASTER_RESULT,WORKER_RESULT> context)
Computation for each record; all updates can be set on WORKER_RESULT via context.setCurrentWorkerResult(WORKER_RESULT);

public GuaguaRecordReader<KEY,VALUE> getRecordReader()

public void setRecordReader(GuaguaRecordReader<KEY,VALUE> recordReader)
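
The per-record contract of doCompute, where each call folds one record into the worker result held by the context, can be sketched as follows. This is an illustration under stated assumptions rather than Guagua code: `SketchContext`, `PerRecordWorker`, and `LengthSumWorker` are hypothetical stand-ins, and the `getCurrentWorkerResult()` accessor on the stand-in context is assumed for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PerRecordWorkerSketch {

    /** Hypothetical stand-in for WorkerContext: holds only the current worker result. */
    static final class SketchContext<W> {
        private W currentWorkerResult;

        void setCurrentWorkerResult(W result) {
            this.currentWorkerResult = result;
        }

        W getCurrentWorkerResult() { // assumed accessor, for illustration only
            return currentWorkerResult;
        }
    }

    /** Hypothetical template: drives the record loop and returns the accumulated result. */
    abstract static class PerRecordWorker<K, V, W> {
        W compute(Map<K, V> records, SketchContext<W> context) {
            context.setCurrentWorkerResult(null); // start each iteration with no result
            for (Map.Entry<K, V> entry : records.entrySet()) {
                doCompute(entry.getKey(), entry.getValue(), context);
            }
            // After all records are processed, this is what would go to the master.
            return context.getCurrentWorkerResult();
        }

        abstract void doCompute(K currentKey, V currentValue, SketchContext<W> context);
    }

    /** Example subclass: the worker result is the total length of all values. */
    static final class LengthSumWorker extends PerRecordWorker<Integer, String, Long> {
        @Override
        void doCompute(Integer currentKey, String currentValue, SketchContext<Long> context) {
            Long soFar = context.getCurrentWorkerResult();
            long updated = (soFar == null ? 0L : soFar) + currentValue.length();
            context.setCurrentWorkerResult(updated);
        }
    }

    public static void main(String[] args) {
        Map<Integer, String> records = new LinkedHashMap<>();
        records.put(1, "alpha");
        records.put(2, "beta");
        records.put(3, "gamma");
        Long result = new LengthSumWorker().compute(records, new SketchContext<>());
        System.out.println(result); // prints 14
    }
}
```

The subclass never sees the loop; it only reads the running result, folds in one record, and writes the result back.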
Copyright © 2015. All Rights Reserved.