Package org.pipecraft.pipes.terminal
Class SharderByItemPipe<T>
- java.lang.Object
  - org.pipecraft.pipes.terminal.TerminalPipe
    - org.pipecraft.pipes.terminal.SharderByItemPipe<T>
- Type Parameters:
T - The input items' data type
- All Implemented Interfaces:
Closeable, AutoCloseable, BasePipe
- Direct Known Subclasses:
SharderByHashPipe
public class SharderByItemPipe<T> extends TerminalPipe
A terminal pipe that splits the contents of the input pipe into multiple files, according to a sharding criterion computed from each item. The original order is preserved within each shard. Note that this implementation keeps all shard files open at the same time, so make sure the system can handle that many open files.
- Author:
- Eyal Schneider
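The behavior described above (one file per shard id, input order kept within each shard, every shard file open until the end) can be illustrated with a minimal, library-independent sketch. This is not the pipecraft implementation: the class name `ShardSketch`, the `shard` method, and the use of plain text lines in place of an `EncoderFactory` are all assumptions made for illustration only.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration of sharding by item; not part of pipecraft.
public class ShardSketch {

    /**
     * Writes each item as one text line into the file folder/<shardId>,
     * where shardId is produced by the selector. All shard writers stay
     * open until the whole input has been consumed.
     *
     * @return the number of items written to each shard
     */
    public static Map<String, Integer> shard(List<String> items,
                                             Function<String, String> selector,
                                             Path folder) throws IOException {
        Map<String, BufferedWriter> writers = new LinkedHashMap<>();
        Map<String, Integer> sizes = new LinkedHashMap<>();
        try {
            for (String item : items) {
                String shardId = selector.apply(item);
                if (shardId == null) {
                    throw new IllegalArgumentException("Selector returned null for item: " + item);
                }
                // Open the shard file lazily, the first time its id is seen.
                BufferedWriter writer = writers.get(shardId);
                if (writer == null) {
                    writer = Files.newBufferedWriter(folder.resolve(shardId), StandardCharsets.UTF_8);
                    writers.put(shardId, writer);
                }
                // Items are appended in arrival order, so order is kept per shard.
                writer.write(item);
                writer.newLine();
                sizes.merge(shardId, 1, Integer::sum);
            }
        } finally {
            // Simplified cleanup: a failing close() leaves later writers unclosed.
            for (BufferedWriter writer : writers.values()) {
                writer.close();
            }
        }
        return sizes;
    }

    public static void main(String[] args) throws IOException {
        Path folder = Files.createTempDirectory("shards");
        List<String> items = List.of("apple", "banana", "avocado", "blueberry", "apricot");
        // Shard by first letter: files "a" and "b" are created in the folder.
        Map<String, Integer> sizes = shard(items, s -> s.substring(0, 1), folder);
        System.out.println(sizes); // prints {a=3, b=2}
    }
}
```

Because every distinct shard id holds an open writer until the end, the number of distinct ids the selector can return is bounded by the process's open-file limit, which is the constraint the class description warns about.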
-
-
Constructor Summary
Constructors
- SharderByItemPipe(Pipe<T> input, EncoderFactory<? super T> encoderFactory, FailableFunction<? super T,String,PipeException> shardSelectorFunction, File folder)
- SharderByItemPipe(Pipe<T> input, EncoderFactory<? super T> encoderFactory, FailableFunction<? super T,String,PipeException> shardSelectorFunction, File folder, FileWriteOptions writeOptions)
-
Method Summary
All Methods (all are concrete instance methods)
- void close()
- Map<String,Integer> getShardSizes()
- void start(): Performs pre-processing prior to item flow through the pipe.
Methods inherited from class org.pipecraft.pipes.terminal.TerminalPipe
getProgress
Constructor Detail
-
SharderByItemPipe
public SharderByItemPipe(Pipe<T> input, EncoderFactory<? super T> encoderFactory, FailableFunction<? super T,String,PipeException> shardSelectorFunction, File folder, FileWriteOptions writeOptions)
Constructor
- Parameters:
input - The input pipe
encoderFactory - The encoder factory to use for writing items into the different shards
shardSelectorFunction - Given an item, selects the corresponding shard id. Files will use this id as a name. Must not return null for any non-null input.
folder - The folder in which to place all shards. Must exist.
writeOptions - Specifies how the shard files should be written
-
SharderByItemPipe
public SharderByItemPipe(Pipe<T> input, EncoderFactory<? super T> encoderFactory, FailableFunction<? super T,String,PipeException> shardSelectorFunction, File folder)
Constructor
- Parameters:
input - The input pipe
encoderFactory - The encoder factory to use for writing items into the different shards
shardSelectorFunction - Given an item, selects the corresponding shard id. Files will use this id as a name. Must not return null for any non-null input.
folder - The folder in which to place all shards. Must exist.
-
-
Method Detail
-
close
public void close() throws IOException
- Throws:
IOException
-
start
public void start() throws PipeException, InterruptedException
Description copied from interface: BasePipe
Performs pre-processing prior to item flow through the pipe. Implementations must call the same method for all their input pipes before accessing their items. This is typically done here.
- Throws:
PipeException - In case of pipe errors in this pipe or somewhere upstream.
InterruptedException - In case the operation has been interrupted by another thread.
-
-