Package org.pipecraft.pipes.terminal
Class AsyncEnqueuingSharderPipe<T>
- java.lang.Object
- org.pipecraft.pipes.terminal.TerminalPipe
- org.pipecraft.pipes.terminal.AsyncEnqueuingSharderPipe<T>
- Type Parameters:
T - The item data type
- All Implemented Interfaces:
Closeable, AutoCloseable, BasePipe
public class AsyncEnqueuingSharderPipe<T> extends TerminalPipe
A terminal pipe that receives an async pipe as input and shards its contents into multiple queues according to a sharding criterion based on item values. When there are relatively few shards, this pipe is a good alternative to AsyncSharderPipe (when used as an intermediate step), because it doesn't involve disk IO.

In case of errors, the start() method unblocks and exits with the exception, as required by the spec. In addition, queue consumers will read a special error marker placed by this class. Similarly, completion is reported by sending a completion marker to all queues.

The implementation allows any thread to call close() after start() has been invoked. In case of a premature close, no markers (error/success) are sent to the output queues, meaning that the caller is responsible for releasing them.

Caveats:
1. This implementation fills multiple queues, so it's recommended to use bounded queues and be aware of their total memory consumption.
2. To prevent a deadlock, the caller should not run the queue consumers and the start() method on the same thread. Alternatively, one can use the asyncStart() method.
3. Queue consumers should not try to drain some queues before others using blocking calls, since this will result in a deadlock.
4. Queue consumers should be aware of the reserved error and successful completion markers, and handle them differently from a standard item.
- Author:
- Eyal Schneider
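The fourth caveat above (consumers must treat the reserved markers differently from regular items) can be sketched as a consumer loop. This is a minimal illustration that uses only java.util.concurrent and does not call the pipe itself. The class name MarkerAwareConsumer and its drain method are hypothetical, and the sketch assumes that markers are recognized by reference identity, as the "reserved reference" wording in the constructor parameters suggests.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical helper, not part of the library.
public class MarkerAwareConsumer {
    /**
     * Takes items from the queue until a marker is seen.
     * Assumption: markers are compared by reference (==), not by equals().
     */
    public static <T> List<T> drain(BlockingQueue<T> queue, T successMarker, T errorMarker)
            throws InterruptedException {
        List<T> items = new ArrayList<>();
        while (true) {
            T item = queue.take(); // blocks until an item or a marker is available
            if (item == successMarker) {
                return items; // the producer completed normally
            }
            if (item == errorMarker) {
                throw new IllegalStateException("The producer reported an error");
            }
            items.add(item);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // new String(...) yields references distinct from any regular item
        String success = new String("SUCCESS");
        String error = new String("ERROR");
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.put("a");
        queue.put("SUCCESS"); // same text, different reference: treated as a regular item
        queue.put(success);
        System.out.println(drain(queue, success, error));
    }
}
```

Comparing by reference lets the item type keep its full value range, since an ordinary item that happens to equal a marker's content is still delivered as data.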
Constructor Summary
Constructors
AsyncEnqueuingSharderPipe(AsyncPipe<T> input, List<? extends BlockingQueue<T>> queues, Function<? super T,Integer> selectorFunction, T successMarker, T errorMarker)
- Constructor
AsyncEnqueuingSharderPipe(AsyncPipe<T> input, List<? extends BlockingQueue<T>> queues, T successMarker, T errorMarker)
- Constructor. Uses hash-based sharding into queues
-
Method Summary
Modifier and Type, Method, Description
Future<Void> asyncStart()
- A special async version of the standard start() method.
void close()
float getProgress()
int[] getShardSizes()
void start()
- Performs pre-processing prior to item flow through the pipe.
Constructor Detail
-
AsyncEnqueuingSharderPipe
public AsyncEnqueuingSharderPipe(AsyncPipe<T> input, List<? extends BlockingQueue<T>> queues, Function<? super T,Integer> selectorFunction, T successMarker, T errorMarker)
Constructor
- Parameters:
input - The input pipe
queues - The queues to write to. Their order determines the indices used by the selector function.
selectorFunction - Given an item, selects the index of the queue to write the item to. Must return an integer between 0 and queues.size() - 1, inclusive.
successMarker - A special (reserved reference) item value used for indicating successful completion to queue consumers.
errorMarker - A special (reserved reference) item value used for indicating an error to queue consumers.
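A selector function satisfying the range requirement above might look like the following sketch. The class ShardSelectors and the method byHash are hypothetical names introduced for illustration. This is one possible selector, not necessarily the hashing scheme used by the library's hash-based constructor.

```java
import java.util.function.Function;

// Hypothetical helper, not part of the library.
public class ShardSelectors {
    /** Maps an item to a queue index in [0, shardCount) using its hash code. */
    public static <T> Function<T, Integer> byHash(int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        // floorMod keeps the result non-negative even for negative hash codes,
        // which the plain % operator would not
        return item -> Math.floorMod(item.hashCode(), shardCount);
    }

    public static void main(String[] args) {
        Function<String, Integer> selector = byHash(4);
        for (String key : new String[] {"apple", "banana", "cherry"}) {
            System.out.println(key + " -> queue " + selector.apply(key));
        }
    }
}
```

Such a function could be passed as the selectorFunction argument, with shardCount equal to queues.size().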
-
AsyncEnqueuingSharderPipe
public AsyncEnqueuingSharderPipe(AsyncPipe<T> input, List<? extends BlockingQueue<T>> queues, T successMarker, T errorMarker)
Constructor. Uses hash-based sharding into queues
- Parameters:
input - The input pipe
queues - The queues to write to. Their order determines the indices used for shard selection.
successMarker - A special (reserved reference) item value used for indicating successful completion to queue consumers.
errorMarker - A special (reserved reference) item value used for indicating an error to queue consumers.
Method Detail
-
close
public void close() throws IOException
- Throws:
IOException
-
start
public void start() throws PipeException, InterruptedException
Description copied from interface: BasePipe
Performs pre-processing prior to item flow through the pipe. Implementations must call the same method for all their input pipes before accessing their items. This is typically done here.
- Throws:
PipeException - In case of pipe errors in this pipe or somewhere upstream.
InterruptedException - In case the operation has been interrupted by another thread.
-
asyncStart
public Future<Void> asyncStart()
A special async version of the standard start() method. The caller may run this method before starting the queue consumers or vice versa, without risking a deadlock. The returned future can be used to detect pipe completion and to get the exception, if any.
- Returns:
- the future representing the completion of this terminal pipe
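The threading arrangement behind this method can be illustrated with a stand-in that uses only java.util.concurrent. The sketch below does not call the pipe. BackgroundProducerDemo, startProducer and consume are hypothetical names, and the producer task merely plays the role of a blocking start() call running on its own thread while the caller consumes a bounded queue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in, not the library's implementation.
public class BackgroundProducerDemo {
    /** Runs the blocking producer on the executor and returns a future for its completion. */
    public static Future<Void> startProducer(ExecutorService executor, BlockingQueue<String> queue,
                                             List<String> items, String doneMarker) {
        Callable<Void> task = () -> {
            for (String item : items) {
                queue.put(item); // blocks while the bounded queue is full
            }
            queue.put(doneMarker);
            return null;
        };
        return executor.submit(task);
    }

    /** Consumes on the calling thread until the marker arrives (compared by reference). */
    public static List<String> consume(BlockingQueue<String> queue, String doneMarker)
            throws InterruptedException {
        List<String> received = new ArrayList<>();
        for (String item = queue.take(); item != doneMarker; item = queue.take()) {
            received.add(item);
        }
        return received;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            String done = new String("DONE");
            // Capacity 2 is smaller than the item count, so producing and consuming
            // sequentially on one thread would block forever on put().
            BlockingQueue<String> queue = new ArrayBlockingQueue<>(2);
            Future<Void> completion =
                    startProducer(executor, queue, Arrays.asList("a", "b", "c", "d"), done);
            List<String> received = consume(queue, done);
            completion.get(); // rethrows a producer failure wrapped in ExecutionException
            System.out.println(received);
        } finally {
            executor.shutdown();
        }
    }
}
```

Calling get() on the returned future after consumption mirrors the documented use of the future: detecting completion and retrieving the exception, if any.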
-
getShardSizes
public int[] getShardSizes()
- Returns:
- The counts of items written to each shard, as an array. Item i corresponds to queue #i. Call this method only after start() has been called and completed successfully.
-
getProgress
public float getProgress()
- Specified by:
getProgress in interface BasePipe
- Overrides:
getProgress in class TerminalPipe
- Returns:
- The pipe flow progress, as a floating-point number between 0.0 and 1.0. Important implementation rules:
1) Calling this method before the start() call completes isn't allowed and has undefined behavior.
2) Implementations should make a best effort to provide an estimate of the progress this pipe has made (0.0 - 1.0).
3) When the pipe is fully consumed, getProgress() should return 1.0.
4) Results must be monotonic, i.e. consecutive calls must never return decreasing values.
5) Thread safety: progress may be maintained by some threads and monitored by others. Implementations must be thread safe.
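Rules 3 to 5 above can be met with a single atomic counter, as in the following sketch. ProgressTracker is a hypothetical class, not part of the library, and it assumes the total item count is known in advance, which a real pipe may only be able to estimate.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper, not part of the library.
public class ProgressTracker {
    private final long total; // assumption: the total number of items is known up front
    private final AtomicLong processed = new AtomicLong();

    public ProgressTracker(long total) {
        if (total <= 0) {
            throw new IllegalArgumentException("total must be positive");
        }
        this.total = total;
    }

    /** Records one processed item; safe to call from any thread. */
    public void itemProcessed() {
        processed.incrementAndGet();
    }

    /** Returns a value in [0.0, 1.0] that never decreases, because the counter only grows. */
    public float getProgress() {
        return Math.min(1.0f, (float) processed.get() / total);
    }
}
```

Clamping with Math.min keeps the result at 1.0 even if more items arrive than the declared total.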