Class StorageMultiTxtFileReaderPipe<B>

  • Type Parameters:
    B - The bucket's file metadata type in the cloud storage implementation being used. For more features, and for general binary files, use StorageMultiFileReaderPipe.
    All Implemented Interfaces:
    Closeable, AutoCloseable, BasePipe, Pipe<String>

    public class StorageMultiTxtFileReaderPipe<B>
    extends CompoundPipe<String>
    Reads data from multiple files under a folder in cloud storage, as if they were concatenated in a predefined order. Files in supported formats are decompressed automatically according to their extensions (see Compression). In contrast to the more general and flexible StorageMultiFileReaderPipe, the data here is always streamed rather than downloaded first, so it may be less efficient in some cases.
    Author:
    Eyal Schneider
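The behaviour described above (ordered, concatenated reading with decompression chosen by file extension) can be illustrated with a small JDK-only sketch. It is not the library's implementation: the in-memory `Map` stands in for a cloud folder, only the `.gz` extension is handled, and the class and method names (`ConcatenatedTextReadSketch`, `open`, `readAllLines`, `gzip`) are hypothetical. It also assumes that each emitted `String` is one line of text, which this page does not state.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class ConcatenatedTextReadSketch {

    // Wraps the raw stream in a decompressor chosen by file extension.
    // Only ".gz" is handled in this sketch.
    public static InputStream open(String name, byte[] content) throws IOException {
        InputStream raw = new ByteArrayInputStream(content);
        return name.endsWith(".gz") ? new GZIPInputStream(raw) : raw;
    }

    // Streams every file in lexicographic name order and returns all lines
    // as a single sequence, as if the files had been concatenated.
    public static List<String> readAllLines(Map<String, byte[]> folder, Charset charset) {
        List<String> lines = new ArrayList<>();
        // TreeMap iterates its keys in natural (lexicographic) order.
        for (Map.Entry<String, byte[]> file : new TreeMap<>(folder).entrySet()) {
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(open(file.getKey(), file.getValue()), charset))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    lines.add(line);
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return lines;
    }

    // Helper that builds gzip-compressed sample data.
    public static byte[] gzip(String text) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        Map<String, byte[]> folder = Map.of(
                "b.txt.gz", gzip("three\nfour\n"),
                "a.txt", "one\ntwo\n".getBytes(StandardCharsets.UTF_8));
        // prints [one, two, three, four]
        System.out.println(readAllLines(folder, StandardCharsets.UTF_8));
    }
}
```

Each file is read through its own reader rather than by joining raw bytes, so a file that lacks a trailing newline does not merge its last line with the first line of the next file.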
    • Constructor Detail

      • StorageMultiTxtFileReaderPipe

        public StorageMultiTxtFileReaderPipe(Storage<?,B> storage,
                                             String bucket,
                                             String folderPath,
                                             Charset charset,
                                             int chunkSize,
                                             String fileRegex,
                                             Comparator<B> comparator)
        Constructor
        Parameters:
        storage - The cloud storage connector
        bucket - The bucket to read the files from
        folderPath - The full path of the folder to read the files from
        charset - The charset used to decode the files
        chunkSize - The size (in bytes) of each chunk read from storage at once, or 0 to use the default size.
        fileRegex - A regular expression that determines which files to read, based on the file name
        comparator - A comparator that defines the order in which files are read
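How fileRegex and comparator might combine can be sketched without the library. The example below filters a folder listing by name and then sorts the survivors with a comparator over the metadata type. `FileMeta` is a hypothetical stand-in for `B`, `select` is an illustrative helper, and the choice of full-name matching (`Matcher.matches()`) is an assumption; this page does not say whether the real filter requires a full or partial match.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;
import java.util.regex.Pattern;

public class FileSelectionSketch {

    // Hypothetical stand-in for the metadata type B; real connectors define their own.
    public static final class FileMeta {
        private final String name;
        private final long lastModified;

        public FileMeta(String name, long lastModified) {
            this.name = name;
            this.lastModified = lastModified;
        }

        public String getName() { return name; }
        public long getLastModified() { return lastModified; }
    }

    // Keeps the files whose name fully matches fileRegex, then orders them with comparator.
    public static <B> List<B> select(List<B> files, Function<B, String> nameOf,
                                     String fileRegex, Comparator<B> comparator) {
        Pattern pattern = Pattern.compile(fileRegex);
        List<B> selected = new ArrayList<>();
        for (B file : files) {
            if (pattern.matcher(nameOf.apply(file)).matches()) {
                selected.add(file);
            }
        }
        selected.sort(comparator);
        return selected;
    }

    // Selects the "events-<n>.log" files and orders them by modification time.
    public static List<String> demo() {
        List<FileMeta> listing = List.of(
                new FileMeta("events-2.log", 300L),
                new FileMeta("events-1.log", 100L),
                new FileMeta("README.md", 50L),
                new FileMeta("events-3.log", 200L));
        Comparator<FileMeta> byTime = Comparator.comparingLong(FileMeta::getLastModified);
        List<String> names = new ArrayList<>();
        for (FileMeta file : select(listing, FileMeta::getName, "events-\\d+\\.log", byTime)) {
            names.add(file.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // prints [events-1.log, events-3.log, events-2.log]
        System.out.println(demo());
    }
}
```

Because the comparator works on the metadata type rather than on the name alone, the read order can depend on any attribute the storage connector exposes, such as modification time or size.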
      • StorageMultiTxtFileReaderPipe

        public StorageMultiTxtFileReaderPipe(Storage<?,B> storage,
                                             String bucket,
                                             String folderPath,
                                             Comparator<B> comparator)
        Constructor. Assumes UTF-8 encoding of all files, and applies no filter to the files being read.
        Parameters:
        storage - The cloud storage connector
        bucket - The bucket to read the files from
        folderPath - The full path of the folder to read the files from
        comparator - A comparator that defines the order in which files are read
      • StorageMultiTxtFileReaderPipe

        public StorageMultiTxtFileReaderPipe(Storage<?,B> storage,
                                             String bucket,
                                             String folderPath)
        Constructor. Scans the remote files in lexicographic name order. Performs no filtering, and assumes UTF-8 encoding.
        Parameters:
        storage - The cloud storage connector
        bucket - The bucket to read the files from
        folderPath - The full path of the folder to read the files from
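Lexicographic name order can differ from numeric order when file names carry unpadded numbers. The sketch below shows the difference using plain `String` ordering, which is assumed here to be what "lexicographic" means; the class name, the helper names, and the "part-<n>.txt" naming scheme are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class NameOrderSketch {

    // Sorts names by plain String comparison (character by character).
    public static List<String> lexicographic(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        sorted.sort(Comparator.naturalOrder());
        return sorted;
    }

    // Sorts names by the number between the last '-' and the last '.'.
    // Assumes every name looks like "part-<n>.txt".
    public static List<String> byNumericSuffix(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        sorted.sort(Comparator.comparingInt(NameOrderSketch::partNumber));
        return sorted;
    }

    // Extracts <n> from a name of the form "part-<n>.txt".
    static int partNumber(String name) {
        int dash = name.lastIndexOf('-');
        int dot = name.lastIndexOf('.');
        return Integer.parseInt(name.substring(dash + 1, dot));
    }

    public static void main(String[] args) {
        List<String> names = List.of("part-2.txt", "part-10.txt", "part-1.txt");
        // prints [part-1.txt, part-10.txt, part-2.txt]
        System.out.println(lexicographic(names));
        // prints [part-1.txt, part-2.txt, part-10.txt]
        System.out.println(byNumericSuffix(names));
    }
}
```

If numeric order matters, zero-padding the file names (for example part-01.txt) or using the constructor that accepts a comparator avoids the mismatch.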