Package org.corpus_tools.pepper.impl
Class CorpusPathResolver
- java.lang.Object
-
- org.corpus_tools.pepper.impl.CorpusPathResolver
-
public class CorpusPathResolver extends Object
-
-
Field Summary
Fields
Modifier and Type | Field | Description
static int | NUMBER_OF_SAMPLED_FILES | The number of files which are read for sampling when invoking #findAppropriateImporters(URI).
static int | NUMBER_OF_SAMPLED_LINES | The number of lines in a file which are read for sampling when invoking #findAppropriateImporters(URI).
protected com.google.common.collect.Multimap<String,org.corpus_tools.pepper.impl.CorpusPathResolver.FileContent> | readFilesGroupedByExtension |
protected com.google.common.collect.Multimap<String,File> | unreadFilesGroupedByExtension |
-
Constructor Summary
Constructors
Modifier | Constructor | Description
protected | CorpusPathResolver() |
public | CorpusPathResolver(org.eclipse.emf.common.util.URI corpusPath) |
-
Method Summary
All Methods | Instance Methods | Concrete Methods
Modifier and Type | Method | Description
protected Collection<org.corpus_tools.pepper.impl.CorpusPathResolver.FileContent> | getXFilesWithExtension(int numOfFiles, int numOfLinesToRead, String fileEnding) |
protected com.google.common.collect.Multimap<String,File> | groupFilesByEnding(org.eclipse.emf.common.util.URI corpusPath) | Groups files by their file ending into a multimap.
protected String | readFirstLines(File file, int numOfLinesToRead) | Reads the first X lines of the passed file and returns them as a String.
Collection<String> | sampleFileContent(int numberOfSampledFiles, int numberOfSampledLines, String... fileEndings) | Returns numberOfSampledLines lines from each file in a sampled set of numberOfSampledFiles files having one of the endings specified by fileEndings, collected recursively from the specified corpus path.
Collection<String> | sampleFileContent(String... fileEndings) |
protected Collection<File> | sampleFiles(Collection<File> files, int numberOfSampledFiles) | Creates a sampled set of numberOfSampledFiles files recursively from directory dir with specified endings.
protected void | setCorpusPath(org.eclipse.emf.common.util.URI corpusPath) |
-
-
-
Field Detail
-
NUMBER_OF_SAMPLED_FILES
public static final int NUMBER_OF_SAMPLED_FILES
The number of files which are read for sampling when invoking #findAppropriateImporters(URI).
See Also:
- Constant Field Values
-
NUMBER_OF_SAMPLED_LINES
public static final int NUMBER_OF_SAMPLED_LINES
The number of lines in a file which are read for sampling when invoking #findAppropriateImporters(URI).
See Also:
- Constant Field Values
-
unreadFilesGroupedByExtension
protected com.google.common.collect.Multimap<String,File> unreadFilesGroupedByExtension
-
readFilesGroupedByExtension
protected com.google.common.collect.Multimap<String,org.corpus_tools.pepper.impl.CorpusPathResolver.FileContent> readFilesGroupedByExtension
-
-
Constructor Detail
-
CorpusPathResolver
protected CorpusPathResolver()
-
CorpusPathResolver
public CorpusPathResolver(org.eclipse.emf.common.util.URI corpusPath) throws FileNotFoundException
Throws:
FileNotFoundException
-
-
Method Detail
-
setCorpusPath
protected void setCorpusPath(org.eclipse.emf.common.util.URI corpusPath) throws FileNotFoundException
Throws:
FileNotFoundException
-
sampleFileContent
public Collection<String> sampleFileContent(String... fileEndings)
-
sampleFileContent
public Collection<String> sampleFileContent(int numberOfSampledFiles, int numberOfSampledLines, String... fileEndings)
Returns numberOfSampledLines lines from each file in a sampled set of numberOfSampledFiles files having one of the endings specified by fileEndings, collected recursively from the specified corpus path.
Parameters:
numberOfSampledFiles - number of files to be read
numberOfSampledLines - number of lines to be read
fileEndings - endings to be considered. If no endings are specified, all files are considered.
Returns:
- the first numberOfSampledLines lines of each of the numberOfSampledFiles sampled files
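The per-file part of the sampling described above amounts to reading only the first few lines of a file. The following is a minimal sketch using only the JDK; it is not Pepper's implementation, and the class name `FirstLinesSketch` and the method `firstLines` are hypothetical. It takes a `Reader` instead of a `File` so that it can be tried without touching the file system.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of Pepper: reads a bounded number of leading lines.
final class FirstLinesSketch {

    /** Returns at most maxLines leading lines of the source, joined by '\n'. */
    static String firstLines(Reader source, int maxLines) {
        StringBuilder out = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            int read = 0;
            // Check the limit first so that no extra line is consumed.
            while (read < maxLines && (line = reader.readLine()) != null) {
                if (read > 0) {
                    out.append('\n');
                }
                out.append(line);
                read++;
            }
        } catch (IOException e) {
            // Assumption: callers prefer an unchecked exception here.
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(firstLines(new StringReader("<a>\n<b/>\n</a>\n"), 2));
    }
}
```

Stopping after a fixed number of lines keeps the cost of sampling independent of file size, which matters when a corpus contains very large files.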
-
groupFilesByEnding
protected com.google.common.collect.Multimap<String,File> groupFilesByEnding(org.eclipse.emf.common.util.URI corpusPath) throws FileNotFoundException
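Groups files by their file ending into a multimap. The key is the ending.
Parameters:
corpusPath - the corpus path whose files are grouped
Returns:
- a multimap from each file ending to the files having that ending
Throws: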
FileNotFoundException
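Grouping by ending, as described above, can be illustrated without Guava. The sketch below uses a plain `Map<String, List<File>>` as a stand-in for the `Multimap` and operates on an already collected list of files. The names `EndingGroupingSketch`, `endingOf`, and `groupByEnding` are hypothetical, and the convention that the ending is the text after the last dot (with an empty key for names without a dot) is an assumption, not something this page confirms for Pepper.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Pepper: groups files by file ending.
final class EndingGroupingSketch {

    /** Assumed convention: the ending is the text after the last '.', or "" if there is none. */
    static String endingOf(File file) {
        String name = file.getName();
        int dot = name.lastIndexOf('.');
        return dot < 0 ? "" : name.substring(dot + 1);
    }

    /** Maps each ending to the files that carry it. */
    static Map<String, List<File>> groupByEnding(Collection<File> files) {
        Map<String, List<File>> grouped = new TreeMap<>();
        for (File file : files) {
            grouped.computeIfAbsent(endingOf(file), key -> new ArrayList<>()).add(file);
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<File> files = Arrays.asList(new File("a.xml"), new File("b.txt"), new File("c.xml"));
        System.out.println(groupByEnding(files).keySet());
    }
}
```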
-
getXFilesWithExtension
protected Collection<org.corpus_tools.pepper.impl.CorpusPathResolver.FileContent> getXFilesWithExtension(int numOfFiles, int numOfLinesToRead, String fileEnding)
-
sampleFiles
protected Collection<File> sampleFiles(Collection<File> files, int numberOfSampledFiles)
Creates a sampled set of numberOfSampledFiles files recursively from directory dir with specified endings.
Parameters:
dir - the directory to be traversed recursively
numberOfSampledFiles - number of files to be sampled
fileEndings - endings of files to be sampled
Returns:
- a collection of files having one of the endings in endings in directory dir
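Drawing a bounded sample from a collection of files can be sketched as follows. This page does not state how Pepper chooses the sample, so the random selection here is an assumption; `FileSamplingSketch` and `sample` are hypothetical names, and the explicit `Random` parameter exists only to make the sketch reproducible.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of Pepper: picks up to sampleSize distinct files.
final class FileSamplingSketch {

    /** Returns min(sampleSize, files.size()) files chosen by shuffling a copy of the input. */
    static List<File> sample(Collection<File> files, int sampleSize, Random random) {
        List<File> shuffled = new ArrayList<>(files); // copy, so the input is left untouched
        Collections.shuffle(shuffled, random);
        int size = Math.min(Math.max(sampleSize, 0), shuffled.size());
        return new ArrayList<>(shuffled.subList(0, size));
    }

    public static void main(String[] args) {
        List<File> files = Arrays.asList(new File("a.xml"), new File("b.xml"), new File("c.xml"));
        System.out.println(sample(files, 2, new Random(42)).size());
    }
}
```

Shuffling a copy and truncating it guarantees that no file is picked twice and that a request larger than the input simply returns every file.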
-
-