Package org.pipecraft.pipes.sync.source
Class StorageMultiTxtFileReaderPipe<B>
- java.lang.Object
  - org.pipecraft.pipes.sync.inter.CompoundPipe<String>
    - org.pipecraft.pipes.sync.source.StorageMultiTxtFileReaderPipe<B>
- Type Parameters:
B - The bucket's file metadata type in the cloud storage implementation being used. For more features and for general binary files, use StorageMultiFileReaderPipe.
- All Implemented Interfaces:
Closeable, AutoCloseable, BasePipe, Pipe<String>
public class StorageMultiTxtFileReaderPipe<B> extends CompoundPipe<String>
Reads data from multiple files under a folder in cloud storage, as if they were concatenated in a predefined order. Files in supported formats are automatically decompressed according to their extensions (see Compression). In contrast to the more general and flexible StorageMultiFileReaderPipe, here the data is always streamed and not downloaded first, so it may be less efficient in some cases.
- Author:
- Eyal Schneider
-
-
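The behavior described above (files read as one concatenated sequence of lines, in name order, with decompression chosen by extension) can be illustrated with a minimal JDK-only sketch. This is not the library's implementation: local files stand in for cloud objects, only the .gz extension is handled, the lines are collected into a list rather than streamed, and the names ConcatReadSketch, open and readAllLines are hypothetical.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.GZIPInputStream;

// Hypothetical illustration only; local files stand in for cloud storage objects.
final class ConcatReadSketch {

  /** Opens a file, wrapping it in a gzip decoder when its name ends with ".gz". */
  static InputStream open(Path file) throws IOException {
    InputStream in = Files.newInputStream(file);
    if (file.getFileName().toString().endsWith(".gz")) {
      try {
        return new GZIPInputStream(in);
      } catch (IOException e) {
        in.close(); // do not leak the raw stream if the gzip header is invalid
        throw e;
      }
    }
    return in;
  }

  /** Returns the lines of all regular files in the folder, in lexicographic name order. */
  static List<String> readAllLines(Path folder) throws IOException {
    List<Path> files;
    try (Stream<Path> listing = Files.list(folder)) {
      files = listing
          .filter(Files::isRegularFile)
          .sorted(Comparator.comparing((Path p) -> p.getFileName().toString()))
          .collect(Collectors.toList());
    }
    List<String> lines = new ArrayList<>();
    for (Path file : files) {
      // UTF-8 is assumed here, matching the default of the shorter constructors.
      try (BufferedReader reader = new BufferedReader(
          new InputStreamReader(open(file), StandardCharsets.UTF_8))) {
        String line;
        while ((line = reader.readLine()) != null) {
          lines.add(line);
        }
      }
    }
    return lines;
  }
}
```

Calling ConcatReadSketch.readAllLines on a folder that holds a.txt.gz and b.txt yields the decompressed lines of a.txt.gz followed by the lines of b.txt.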
Constructor Summary
Constructors
- StorageMultiTxtFileReaderPipe(Storage<?,B> storage, String bucket, String folderPath): Constructor. Scans the remote files in lexicographic name order.
- StorageMultiTxtFileReaderPipe(Storage<?,B> storage, String bucket, String folderPath, Charset charset, int chunkSize, String fileRegex, Comparator<B> comparator): Constructor.
- StorageMultiTxtFileReaderPipe(Storage<?,B> storage, String bucket, String folderPath, Comparator<B> comparator): Constructor. Assumes UTF-8 encoding of all files, and doesn't apply any filter on files to read from.
-
Method Summary
- protected Pipe<String> createPipeline()
Methods inherited from class org.pipecraft.pipes.sync.inter.CompoundPipe
close, getProgress, next, peek, start
Constructor Detail
-
StorageMultiTxtFileReaderPipe
public StorageMultiTxtFileReaderPipe(Storage<?,B> storage, String bucket, String folderPath, Charset charset, int chunkSize, String fileRegex, Comparator<B> comparator)
Constructor
- Parameters:
storage - The cloud storage connector
bucket - The bucket to read the files from
folderPath - The full path of the folder to read the files from
charset - The charset used
chunkSize - The size (in bytes) of each chunk read from storage at once, or 0 to use the default size
fileRegex - A regular expression applied to file names to determine which files to read
comparator - A comparator defining the order in which files are read
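How fileRegex and comparator combine can be sketched without the library. The example below is a hedged illustration, not the pipe's actual code: FileMeta is a hypothetical stand-in for the storage-specific metadata type B, select is a hypothetical helper, and the regex is assumed to be matched against the whole file name (the source does not say whether full or partial matching is used).

```java
import java.util.Comparator;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical illustration of "filter by name regex, then order by comparator".
final class FileSelectionSketch {

  /** Hypothetical stand-in for the metadata type B of a concrete storage connector. */
  static final class FileMeta {
    final String name;
    final long sizeBytes;

    FileMeta(String name, long sizeBytes) {
      this.name = name;
      this.sizeBytes = sizeBytes;
    }
  }

  /**
   * Keeps the files whose name fully matches fileRegex (an assumption),
   * then sorts them with the given comparator to fix the read order.
   */
  static List<FileMeta> select(List<FileMeta> listing, String fileRegex,
                               Comparator<FileMeta> order) {
    Pattern pattern = Pattern.compile(fileRegex);
    return listing.stream()
        .filter(meta -> pattern.matcher(meta.name).matches())
        .sorted(order)
        .collect(Collectors.toList());
  }
}
```

For instance, passing the regex part-\d+\.txt together with Comparator.comparingLong((FileMeta m) -> m.sizeBytes).reversed() would skip marker files such as _SUCCESS and read the largest matching file first.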
-
StorageMultiTxtFileReaderPipe
public StorageMultiTxtFileReaderPipe(Storage<?,B> storage, String bucket, String folderPath, Comparator<B> comparator)
Constructor. Assumes UTF-8 encoding of all files, and doesn't apply any filter on files to read from.
- Parameters:
storage - The cloud storage connector
bucket - The bucket to read the files from
folderPath - The full path of the folder to read the files from
comparator - A comparator defining the order in which files are read
-
StorageMultiTxtFileReaderPipe
public StorageMultiTxtFileReaderPipe(Storage<?,B> storage, String bucket, String folderPath)
Constructor. Scans the remote files in lexicographic name order. Performs no filtering, and assumes UTF-8 encoding.
- Parameters:
storage - The cloud storage connector
bucket - The bucket to read the files from
folderPath - The full path of the folder to read the files from
-
-
Method Detail
-
createPipeline
protected Pipe<String> createPipeline() throws PipeException, InterruptedException
- Specified by:
createPipeline in class CompoundPipe<String>
- Returns:
- A new pipeline to represent the logic of this pipe
- Throws:
PipeException - In case of a pipeline creation error
InterruptedException - If the thread is interrupted
-
-