ml.shifu.guagua.worker
Class AbstractCombineWorkerComputable<MASTER_RESULT extends Bytable,WORKER_RESULT extends Bytable,KEY extends Bytable,VALUE extends Bytable>

java.lang.Object
  extended by ml.shifu.guagua.worker.AbstractCombineWorkerComputable<MASTER_RESULT,WORKER_RESULT,KEY,VALUE>
Type Parameters:
MASTER_RESULT - master result for computation in each iteration.
WORKER_RESULT - worker result for computation in each iteration.
KEY - key type for each record
VALUE - value type for each record
All Implemented Interfaces:
WorkerComputable<MASTER_RESULT,WORKER_RESULT>

public abstract class AbstractCombineWorkerComputable<MASTER_RESULT extends Bytable,WORKER_RESULT extends Bytable,KEY extends Bytable,VALUE extends Bytable>
extends Object
implements WorkerComputable<MASTER_RESULT,WORKER_RESULT>

An efficient implementation that loads data and performs computation together. Unlike AbstractWorkerComputable, it exposes only doCompute(Bytable, Bytable, WorkerContext), which is called for each record, and the data loading done in the first iteration is itself part of the computation.

The worker result should be updated in doCompute(Bytable, Bytable, WorkerContext); it is sent to the master once all records have been processed in an iteration.
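
The per-record pattern described above can be sketched without the framework. This is a minimal illustration, not the Guagua implementation: Context and CombineWorker are hypothetical stand-ins for WorkerContext and this class, and the key is simply a running record index.

```java
import java.util.Arrays;
import java.util.List;

/** Framework-free sketch; none of these types are the real Guagua API. */
final class PerRecordSketch {

    /** Hypothetical stand-in for WorkerContext: it only holds the current worker result. */
    static final class Context<R> {
        private R currentWorkerResult;

        void setCurrentWorkerResult(R result) {
            this.currentWorkerResult = result;
        }

        R getCurrentWorkerResult() {
            return currentWorkerResult;
        }
    }

    /** Hypothetical stand-in for the abstract worker: compute() drives doCompute() per record. */
    abstract static class CombineWorker<V, R> {
        abstract void doCompute(long key, V value, Context<R> context);

        R compute(Iterable<V> records, Context<R> context) {
            long key = 0;
            for (V value : records) {
                doCompute(key++, value, context);
            }
            // Whatever doCompute left in the context is what the master would receive.
            return context.getCurrentWorkerResult();
        }
    }

    /** Example subclass: accumulates a running sum in the worker result. */
    static final class SumWorker extends CombineWorker<Double, Double> {
        @Override
        void doCompute(long key, Double value, Context<Double> context) {
            Double soFar = context.getCurrentWorkerResult();
            context.setCurrentWorkerResult((soFar == null ? 0.0 : soFar) + value);
        }
    }

    static double sum(List<Double> records) {
        Double result = new SumWorker().compute(records, new Context<Double>());
        return result == null ? 0.0 : result;
    }

    public static void main(String[] args) {
        System.out.println(sum(Arrays.asList(1.0, 2.0, 3.5))); // prints 6.5
    }
}
```

In a real subclass the same accumulation would go through context.setCurrentWorkerResult(WORKER_RESULT), with WORKER_RESULT being a Bytable type.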

To load data successfully, make sure the GuaguaRecordReader is initialized first, in initRecordReader(GuaguaFileSplit):

 this.setRecordReader(new GuaguaSequenceAsTextRecordReader());
 this.getRecordReader().initialize(fileSplit);
 
or use a constructor that takes the file split directly:
 this.setRecordReader(new GuaguaSequenceAsTextRecordReader(fileSplit));
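
To show why the reader has to be initialized for every split, here is a framework-free sketch. Split, LineReader and Worker are hypothetical stand-ins (not GuaguaFileSplit, GuaguaRecordReader or this class), and the initialize/nextKeyValue/getCurrentValue contract is an assumption modeled on common record-reader designs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Framework-free sketch; none of these types are the real Guagua API. */
final class ReaderInitSketch {

    /** Hypothetical split: a chunk of lines held in memory. */
    static final class Split {
        final List<String> lines;

        Split(String... lines) {
            this.lines = Arrays.asList(lines);
        }
    }

    /** Hypothetical reader with an assumed initialize/next/current contract. */
    static final class LineReader {
        private Iterator<String> iterator;
        private String value;

        LineReader() {
        }

        /** Convenience constructor that initializes the reader immediately. */
        LineReader(Split split) {
            initialize(split);
        }

        void initialize(Split split) {
            this.iterator = split.lines.iterator();
            this.value = null;
        }

        boolean nextKeyValue() {
            if (iterator == null) {
                throw new IllegalStateException("reader not initialized");
            }
            if (!iterator.hasNext()) {
                return false;
            }
            value = iterator.next();
            return true;
        }

        String getCurrentValue() {
            return value;
        }
    }

    /** Hypothetical worker: re-initializes its reader for each split before reading it. */
    static final class Worker {
        private LineReader recordReader;

        void initRecordReader(Split split) {
            this.recordReader = new LineReader();
            this.recordReader.initialize(split);
        }

        List<String> load(List<Split> splits) {
            List<String> loaded = new ArrayList<String>();
            for (Split split : splits) {
                initRecordReader(split);
                while (recordReader.nextKeyValue()) {
                    loaded.add(recordReader.getCurrentValue());
                }
            }
            return loaded;
        }
    }

    static List<String> loadAll() {
        return new Worker().load(Arrays.asList(new Split("a", "b"), new Split("c")));
    }

    /** Reading from a reader that was never initialized fails fast in this sketch. */
    static boolean failsWithoutInit() {
        try {
            new LineReader().nextKeyValue();
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(loadAll()); // prints [a, b, c]
    }
}
```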
 

After the data is loaded in the first iteration, it can be stored in collections (in memory or on disk) for use in later iterations. Guarding against out-of-memory (OOM) errors is the user's responsibility.
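
One way to cache records during the first iteration and reuse them later is sketched below. CachingWorker is a hypothetical stand-in, not part of Guagua, and the fixed record limit is only one assumed guard against running out of memory; a real worker might spill to disk instead of failing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Framework-free sketch; CachingWorker is not part of the Guagua API. */
final class CachedIterationSketch {

    static final class CachingWorker {
        private final int maxInMemory;
        private final List<Double> cache = new ArrayList<Double>();
        private boolean loaded;

        CachingWorker(int maxInMemory) {
            this.maxInMemory = maxInMemory;
        }

        /** First call reads the source and fills the cache; later calls use only the cache. */
        double compute(Iterable<Double> source) {
            double sum = 0.0;
            if (!loaded) {
                for (Double value : source) {
                    if (cache.size() >= maxInMemory) {
                        // Fail fast rather than grow without bound.
                        throw new IllegalStateException("cache limit " + maxInMemory + " exceeded");
                    }
                    cache.add(value);
                    sum += value;
                }
                loaded = true;
            } else {
                for (Double value : cache) {
                    sum += value;
                }
            }
            return sum;
        }
    }

    static double[] runIterations(int iterations) {
        CachingWorker worker = new CachingWorker(10);
        List<Double> source = Arrays.asList(1.0, 2.0, 3.0);
        double[] results = new double[iterations];
        for (int i = 0; i < iterations; i++) {
            // After the first pass an empty source is passed in, so the result must come from the cache.
            results[i] = worker.compute(i == 0 ? source : Collections.<Double>emptyList());
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(runIterations(3))); // prints [6.0, 6.0, 6.0]
    }
}
```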


Constructor Summary
protected AbstractCombineWorkerComputable()
           
protected AbstractCombineWorkerComputable(boolean isOrder)
           
 
Method Summary
 WORKER_RESULT compute(WorkerContext<MASTER_RESULT,WORKER_RESULT> context)
          Worker computation for each iteration.
abstract  void doCompute(KEY currentKey, VALUE currentValue, WorkerContext<MASTER_RESULT,WORKER_RESULT> context)
          Per-record computation; updates can be stored in WORKER_RESULT via context.setCurrentWorkerResult(WORKER_RESULT).
 GuaguaRecordReader<KEY,VALUE> getRecordReader()
           
abstract  void init(WorkerContext<MASTER_RESULT,WORKER_RESULT> workerContext)
          Initialization work for the whole computation
abstract  void initRecordReader(GuaguaFileSplit fileSplit)
          Each GuaguaFileSplit must be initialized before loading data.
protected  void postLoad(WorkerContext<MASTER_RESULT,WORKER_RESULT> workerContext)
          Do some post work after loading data.
protected  void preLoad(WorkerContext<MASTER_RESULT,WORKER_RESULT> workerContext)
          Do some pre work before loading data.
 void setRecordReader(GuaguaRecordReader<KEY,VALUE> recordReader)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

AbstractCombineWorkerComputable

protected AbstractCombineWorkerComputable()

AbstractCombineWorkerComputable

protected AbstractCombineWorkerComputable(boolean isOrder)
Method Detail

compute

public WORKER_RESULT compute(WorkerContext<MASTER_RESULT,WORKER_RESULT> context)
                                      throws IOException
Description copied from interface: WorkerComputable
Worker computation for each iteration.

Specified by:
compute in interface WorkerComputable<MASTER_RESULT extends Bytable,WORKER_RESULT extends Bytable>
Parameters:
context - the worker context instance, which includes worker info, the master result of the last iteration, and other useful information for each iteration.
Returns:
the worker result of each iteration.
Throws:
IOException - if an I/O error occurs during computation, for example while reading data.

preLoad

protected void preLoad(WorkerContext<MASTER_RESULT,WORKER_RESULT> workerContext)
Do some pre work before loading data.


postLoad

protected void postLoad(WorkerContext<MASTER_RESULT,WORKER_RESULT> workerContext)
Do some post work after loading data.
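
The role of the two load hooks can be sketched as follows. This is an assumption-laden illustration: the Worker class is a hypothetical stand-in, the hooks are assumed to default to no-ops, and the ordering shown (preLoad, then doCompute for each record, then postLoad) is inferred from the descriptions here rather than taken from the Guagua source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Framework-free sketch; Worker is not the real Guagua class. */
final class LoadHooksSketch {

    abstract static class Worker {
        final List<String> trace = new ArrayList<String>();

        /** Assumed to be an optional hook: does nothing unless overridden. */
        void preLoad() {
        }

        /** Assumed to be an optional hook: does nothing unless overridden. */
        void postLoad() {
        }

        abstract void doCompute(String record);

        /** Assumed order: preLoad, one doCompute call per record, postLoad. */
        void compute(List<String> records) {
            preLoad();
            for (String record : records) {
                doCompute(record);
            }
            postLoad();
        }
    }

    static List<String> trace() {
        Worker worker = new Worker() {
            @Override
            void preLoad() {
                trace.add("preLoad");
            }

            @Override
            void doCompute(String record) {
                trace.add("doCompute:" + record);
            }

            @Override
            void postLoad() {
                trace.add("postLoad");
            }
        };
        worker.compute(Arrays.asList("a", "b"));
        return worker.trace;
    }

    public static void main(String[] args) {
        System.out.println(trace()); // prints [preLoad, doCompute:a, doCompute:b, postLoad]
    }
}
```

Typical uses would be opening a resource or starting a timer in preLoad and closing it or logging record counts in postLoad.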


initRecordReader

public abstract void initRecordReader(GuaguaFileSplit fileSplit)
                               throws IOException
Each GuaguaFileSplit must be initialized before loading data.

Throws:
IOException

init

public abstract void init(WorkerContext<MASTER_RESULT,WORKER_RESULT> workerContext)
Initialization work for the whole computation


doCompute

public abstract void doCompute(KEY currentKey,
                               VALUE currentValue,
                               WorkerContext<MASTER_RESULT,WORKER_RESULT> context)
Per-record computation; updates can be stored in WORKER_RESULT via context.setCurrentWorkerResult(WORKER_RESULT).


getRecordReader

public GuaguaRecordReader<KEY,VALUE> getRecordReader()

setRecordReader

public void setRecordReader(GuaguaRecordReader<KEY,VALUE> recordReader)


Copyright © 2014. All Rights Reserved.