Class MultiTxtFileReaderPipe

  • All Implemented Interfaces:
    Closeable, AutoCloseable, BasePipe, Pipe<String>

    public class MultiTxtFileReaderPipe
    extends CompoundPipe<String>
    Reads data from multiple files on the local disk under a given folder, as if they were concatenated in a predefined order. Files with the gz extension are decompressed as they are read. For more features, and for general binary files, use MultiFileReaderPipe.
    Author:
    Eyal Schneider
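    The behavior described above can be approximated with the JDK alone. The following is a minimal sketch, not the library's implementation: the class name ConcatenatedReadSketch and the method readAllLines are hypothetical. It also rests on assumptions the description does not confirm: that the regex must match the whole file name, that ".gz" is detected by file name suffix, and that the pipe's String items are text lines.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Pattern;
import java.util.zip.GZIPInputStream;

// Hypothetical helper, for illustration only; not part of the documented library.
public class ConcatenatedReadSketch {

    // Reads every matching file in the folder, in comparator order,
    // and returns all lines as if the files had been concatenated.
    static List<String> readAllLines(File folder, Charset charset, String fileRegex,
                                     Comparator<File> comparator) throws IOException {
        Pattern pattern = Pattern.compile(fileRegex);
        // Assumption: the regex is applied to the bare file name and must match it fully.
        File[] files = folder.listFiles(
                f -> f.isFile() && pattern.matcher(f.getName()).matches());
        if (files == null) {
            throw new IOException("Not a readable folder: " + folder);
        }
        Arrays.sort(files, comparator);

        List<String> lines = new ArrayList<>();
        for (File file : files) {
            try (InputStream raw = new FileInputStream(file);
                 // Assumption: gzip files are recognized by the ".gz" suffix.
                 InputStream in = file.getName().endsWith(".gz")
                         ? new GZIPInputStream(raw) : raw;
                 BufferedReader reader =
                         new BufferedReader(new InputStreamReader(in, charset))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    lines.add(line);
                }
            }
        }
        return lines;
    }
}
```

    Unlike this sketch, which collects everything into a list, a pipe presumably produces its items incrementally, so memory use need not grow with the total size of the folder.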
    • Constructor Detail

      • MultiTxtFileReaderPipe

        public MultiTxtFileReaderPipe(File folder,
                                      Charset charset,
                                      int bufferSize,
                                      String fileRegex,
                                      Comparator<File> comparator)
        Constructor
        Parameters:
        folder - The local path of the folder to read from
        charset - The charset used for decoding the files
        bufferSize - The read buffer size to use for every file. Use 0 for the default buffer size (8k)
        fileRegex - Used for determining which files to read from, based on the file name (excluding path)
        comparator - A comparator on files, defining the order in which files are read
      • MultiTxtFileReaderPipe

        public MultiTxtFileReaderPipe(File folder,
                                      Comparator<File> comparator)
        Constructor. Assumes UTF-8, no file filtering, and the default buffer size.
        Parameters:
        folder - The folder to read from
        comparator - A comparator defining the order in which files are read
      • MultiTxtFileReaderPipe

        public MultiTxtFileReaderPipe(File folder)
        Constructor. Assumes UTF-8, no file filtering, the default buffer size, and lexicographic order of files.
        Parameters:
        folder - The folder to read from
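
        Lexicographic order is not always the intended reading order for numbered files, which is one reason to pass an explicit comparator to the other constructors. The following is a minimal sketch of such a comparator, using only the JDK. The class name FileOrderSketch, the helper firstNumber, and the constant BY_FIRST_NUMBER are hypothetical names, and the "part-N.txt" naming scheme is an assumption made for illustration.

```java
import java.io.File;
import java.util.Comparator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only; not part of the documented library.
public class FileOrderSketch {

    // First run of digits in the file name (capped at 18 digits so it fits in a long).
    private static final Pattern FIRST_NUMBER = Pattern.compile("\\d{1,18}");

    // Returns the first number found in the file name, or Long.MAX_VALUE
    // when there is none, so that unnumbered files sort last.
    static long firstNumber(File f) {
        Matcher m = FIRST_NUMBER.matcher(f.getName());
        return m.find() ? Long.parseLong(m.group()) : Long.MAX_VALUE;
    }

    // Orders files by their first embedded number, then by name to break ties.
    static final Comparator<File> BY_FIRST_NUMBER =
            Comparator.comparingLong(FileOrderSketch::firstNumber)
                      .thenComparing(File::getName);
}
```

        Under plain lexicographic ordering of names, "part-10.txt" sorts before "part-2.txt". With BY_FIRST_NUMBER the order becomes part-1.txt, part-2.txt, part-10.txt. Such a comparator could be passed as the comparator argument of the constructors above.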