public class BAMInputFormat extends org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.LongWritable,SAMRecordWritable>
InputFormat for BAM files. Values
are the individual records; see BAMRecordReader for the meaning of
the key.

| Modifier and Type | Field and Description |
|---|---|
| static boolean | DEBUG_BAM_SPLITTER |
| static String | INTERVALS_PROPERTY: Filter by region, like -L in SAMtools. |
| static String | KEEP_PAIRED_READS_TOGETHER_PROPERTY: If set to true, ensure that for paired reads both reads in a pair are always in the same split for queryname-sorted BAM files. |

| Constructor and Description |
|---|
| BAMInputFormat() |

| Modifier and Type | Method and Description |
|---|---|
| org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.LongWritable,SAMRecordWritable> | createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext ctx): Returns a BAMRecordReader initialized with the parameters. |
| List<org.apache.hadoop.mapreduce.InputSplit> | getSplits(org.apache.hadoop.mapreduce.JobContext job): The splits returned are FileVirtualSplits. |
| List<org.apache.hadoop.mapreduce.InputSplit> | getSplits(List<org.apache.hadoop.mapreduce.InputSplit> splits, org.apache.hadoop.conf.Configuration cfg) |
| boolean | isSplitable(org.apache.hadoop.mapreduce.JobContext job, org.apache.hadoop.fs.Path path) |
| static <T extends htsjdk.samtools.util.Locatable> void | setIntervals(org.apache.hadoop.conf.Configuration conf, List<T> intervals) |

Methods inherited from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat: addInputPath, addInputPathRecursively, addInputPaths, computeSplitSize, getBlockIndex, getFormatMinSplitSize, getInputDirRecursive, getInputPathFilter, getInputPaths, getMaxSplitSize, getMinSplitSize, listStatus, makeSplit, setInputDirRecursive, setInputPathFilter, setInputPaths, setInputPaths, setMaxInputSplitSize, setMinInputSplitSize

public static final boolean DEBUG_BAM_SPLITTER
public static final String KEEP_PAIRED_READS_TOGETHER_PROPERTY
If set to true, ensure that for paired reads both reads in a pair are
always in the same split for queryname-sorted BAM files.
Note: only use this option if both reads of every pair are present; otherwise, reads may be silently dropped.
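
To illustrate what keeping both reads of a pair in the same split means for a queryname-sorted file, the following is a minimal sketch using only the JDK. It is not Hadoop-BAM's implementation: the class `PairBoundarySketch` and the method `adjustBoundary` are hypothetical names, and records are reduced to their read names. The sketch moves a proposed split boundary forward until it no longer falls between adjacent records that share a name.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of Hadoop-BAM.
public class PairBoundarySketch {

    /**
     * Returns a boundary index (the index of the first record of the next split)
     * that does not separate adjacent records with the same read name.
     * The list of names is assumed to be queryname-sorted.
     */
    public static int adjustBoundary(List<String> names, int boundary) {
        int b = boundary;
        // Advance while the record at the boundary has the same name as the one before it.
        while (b > 0 && b < names.size() && names.get(b).equals(names.get(b - 1))) {
            b++;
        }
        return b;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("r1", "r1", "r2", "r2", "r3", "r3");
        // A boundary at index 3 would separate the two "r2" records, so it moves to 4.
        System.out.println(adjustBoundary(names, 3));
        // A boundary at index 4 already falls between pairs and is left unchanged.
        System.out.println(adjustBoundary(names, 4));
    }
}
```

Any scheme of this kind relies on mates being adjacent in the file, which is consistent with the note above that the option should only be used when both reads of every pair are present.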
public static final String INTERVALS_PROPERTY
Filter by region, like -L in SAMtools. Takes a comma-separated
list of intervals, e.g. chr1:1-20000,chr2:12000-20000. For
programmatic use, setIntervals(Configuration, List) should be preferred.

public static <T extends htsjdk.samtools.util.Locatable> void setIntervals(org.apache.hadoop.conf.Configuration conf, List<T> intervals)
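
The comma-separated interval format shown for INTERVALS_PROPERTY can be sketched with plain JDK code. This is a hedged illustration based only on the example string chr1:1-20000,chr2:12000-20000; it does not reproduce Hadoop-BAM's own parsing. The class `IntervalStringSketch` and its nested `Interval` type are hypothetical stand-ins, not htsjdk's Locatable.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Hadoop-BAM or htsjdk.
public class IntervalStringSketch {

    /** Minimal stand-in for an interval: contig name plus start and end positions. */
    public static final class Interval {
        public final String contig;
        public final int start;
        public final int end;

        public Interval(String contig, int start, int end) {
            this.contig = contig;
            this.start = start;
            this.end = end;
        }
    }

    /** Joins intervals as contig:start-end, separated by commas. */
    public static String format(List<Interval> intervals) {
        StringBuilder sb = new StringBuilder();
        for (Interval iv : intervals) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(iv.contig).append(':').append(iv.start).append('-').append(iv.end);
        }
        return sb.toString();
    }

    /** Parses a comma-separated list of contig:start-end entries. */
    public static List<Interval> parse(String value) {
        List<Interval> out = new ArrayList<>();
        for (String part : value.split(",")) {
            // Use the last ':' and last '-' so contig names containing them still parse.
            int colon = part.lastIndexOf(':');
            int dash = part.lastIndexOf('-');
            if (colon < 0 || dash < colon) {
                throw new IllegalArgumentException("Bad interval: " + part);
            }
            out.add(new Interval(
                    part.substring(0, colon),
                    Integer.parseInt(part.substring(colon + 1, dash)),
                    Integer.parseInt(part.substring(dash + 1))));
        }
        return out;
    }

    public static void main(String[] args) {
        String value = "chr1:1-20000,chr2:12000-20000";
        // Round trip: parse the example string and format it back.
        System.out.println(format(parse(value)));
    }
}
```

In a real job, a string in this form would be the value stored under INTERVALS_PROPERTY in the Hadoop Configuration; as stated above, setIntervals(Configuration, List) is the preferred route for programmatic use.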

public org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.LongWritable,SAMRecordWritable> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext ctx) throws InterruptedException, IOException
Returns a BAMRecordReader initialized with the parameters.
Specified by: createRecordReader in class org.apache.hadoop.mapreduce.InputFormat<org.apache.hadoop.io.LongWritable,SAMRecordWritable>
Throws: InterruptedException, IOException

public List<org.apache.hadoop.mapreduce.InputSplit> getSplits(org.apache.hadoop.mapreduce.JobContext job) throws IOException
The splits returned are FileVirtualSplits.
Overrides: getSplits in class org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.LongWritable,SAMRecordWritable>
Throws: IOException

public List<org.apache.hadoop.mapreduce.InputSplit> getSplits(List<org.apache.hadoop.mapreduce.InputSplit> splits, org.apache.hadoop.conf.Configuration cfg) throws IOException
Throws: IOException

public boolean isSplitable(org.apache.hadoop.mapreduce.JobContext job, org.apache.hadoop.fs.Path path)
Overrides: isSplitable in class org.apache.hadoop.mapreduce.lib.input.FileInputFormat<org.apache.hadoop.io.LongWritable,SAMRecordWritable>

Copyright © 2016. All rights reserved.