Calculates the set of unique attribute values that occur for the given tag, and the number of times each value occurs.
The name of the optional field whose values are to be counted.
A Map whose keys are the values of the tag, and whose values are the number of times each tag value occurs.
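The counting semantics described above can be illustrated with plain Scala collections instead of Spark RDDs. This is only a sketch under stated assumptions: the `Read` case class, the `valueCounts` helper, and the modeling of attributes as a `Map[String, String]` are hypothetical stand-ins, not part of the documented API.

```scala
object TagValueCountSketch {
  // Hypothetical stand-in for a read record: optional fields modeled as tag -> value.
  final case class Read(name: String, attributes: Map[String, String])

  // Counts how often each value of `tag` occurs; reads lacking the tag are skipped.
  def valueCounts(reads: Seq[Read], tag: String): Map[String, Long] =
    reads
      .flatMap(_.attributes.get(tag))
      .groupBy(identity)
      .map { case (value, occurrences) => value -> occurrences.size.toLong }

  // Illustrative data only.
  val reads: Seq[Read] = Seq(
    Read("r1", Map("RG" -> "lane1", "NM" -> "0")),
    Read("r2", Map("RG" -> "lane1")),
    Read("r3", Map("RG" -> "lane2", "NM" -> "2")),
    Read("r4", Map.empty)
  )
}
```

Calling `TagValueCountSketch.valueCounts(TagValueCountSketch.reads, "RG")` on this sample data yields a map in which `lane1` has a count of 2 and `lane2` has a count of 1; the read without an `RG` tag contributes nothing.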
Converts a set of records into an RDD of pairs, each consisting of a unique tagString found within the records and the count of records that have that attribute.
An RDD of attribute name / count pairs.
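To make the difference from per-value counting concrete, the sketch below counts how many records carry each attribute name, regardless of the attribute's value. It uses in-memory Scala collections rather than an RDD, and the `Read` type and `tagCounts` helper are hypothetical names introduced for illustration.

```scala
object TagNameCountSketch {
  // Hypothetical stand-in for a read record: optional fields modeled as tag -> value.
  final case class Read(name: String, attributes: Map[String, String])

  // For every attribute name, counts the records that carry it.
  // Map keys are unique, so a record contributes at most once per tag.
  def tagCounts(reads: Seq[Read]): Map[String, Long] =
    reads
      .flatMap(_.attributes.keys)
      .groupBy(identity)
      .map { case (tag, hits) => tag -> hits.size.toLong }

  // Illustrative data only.
  val reads: Seq[Read] = Seq(
    Read("r1", Map("RG" -> "lane1", "NM" -> "0")),
    Read("r2", Map("RG" -> "lane1")),
    Read("r3", Map("RG" -> "lane2", "NM" -> "2"))
  )
}
```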
Returns the subset of the ADAMRecords which have an attribute with the given name.
The name of the attribute to filter on (should be two characters long).
An RDD[ADAMRecord] containing the subset of records with a tag that matches the given name.
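The filtering behaviour can be sketched on an ordinary `Seq` as follows. The `Read` type and `filterByTag` helper are hypothetical, and the `require` check is an assumption that mirrors the "two characters" note on the parameter; whether the real method enforces this is not stated in the source.

```scala
object TagFilterSketch {
  // Hypothetical stand-in for a read record: optional fields modeled as tag -> value.
  final case class Read(name: String, attributes: Map[String, String])

  // Keeps only the reads that carry an attribute named `tag`.
  def filterByTag(reads: Seq[Read], tag: String): Seq[Read] = {
    // Assumption: reject names that are not exactly two characters long.
    require(tag.length == 2, s"tag name must be two characters, got '$tag'")
    reads.filter(_.attributes.contains(tag))
  }

  // Illustrative data only.
  val reads: Seq[Read] = Seq(
    Read("r1", Map("RG" -> "lane1", "NM" -> "0")),
    Read("r2", Map("RG" -> "lane1")),
    Read("r3", Map("NM" -> "2"))
  )
}
```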
Aggregates a sequence dictionary from the individual reference sequences used in this dataset.
A sequence dictionary describing the reference contigs in this dataset.
Groups all reads by reference position and returns a non-aggregated pileup RDD.
If true, creates pileups for non-primary aligned reads. Default is false.
An RDD of ADAMPileups without aggregation.
Groups all reads by reference position, with all reference position bases grouped into a rod.
Size in basepairs of buckets. Larger buckets take more time per bucket to convert, but have lower skew. Default is 1000.
If true, creates rods for non-primary aligned reads. Default is false.
An RDD of ADAMRods.
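The relationship between non-aggregated pileups and rods can be sketched with in-memory collections. The sketch rests on loud simplifications: reads are assumed to align without gaps (no CIGAR handling), bucketing by size is not modeled at all, and `Read`, `Pileup`, `toPileups`, and `toRods` are hypothetical names rather than the library's types or methods.

```scala
object RodSketch {
  // Hypothetical simplified read: gapless alignment beginning at `start`.
  final case class Read(name: String, start: Long, sequence: String, primary: Boolean = true)

  // One observed base at one reference position (a non-aggregated pileup entry).
  final case class Pileup(position: Long, base: Char, readName: String)

  // Expands a read into one pileup entry per base it covers.
  def toPileups(read: Read): Seq[Pileup] =
    read.sequence.zipWithIndex.map { case (base, offset) =>
      Pileup(read.start + offset, base, read.name)
    }

  // Groups all pileup entries sharing a reference position into one "rod".
  // Non-primary reads are skipped unless `secondaryAlignments` is true.
  def toRods(reads: Seq[Read], secondaryAlignments: Boolean = false): Map[Long, Seq[Pileup]] =
    reads
      .filter(r => secondaryAlignments || r.primary)
      .flatMap(toPileups)
      .groupBy(_.position)

  // Illustrative data only.
  val reads: Seq[Read] = Seq(
    Read("r1", 10L, "ACG"),
    Read("r2", 11L, "CGT"),
    Read("r3", 10L, "TTT", primary = false)
  )
}
```

In a distributed setting the grouping step is where skew matters, since positions with deep coverage produce large groups. That is presumably what the bucket size parameter is meant to balance, although the sketch does not attempt to reproduce it.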
Groups all reads by record group and read name.
SingleReadBuckets with primary, secondary, and unmapped reads.
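A minimal sketch of this grouping, using plain Scala collections, is shown here. The `Read` and `Bucket` types and the `bucketize` helper are hypothetical; in particular, the `mapped` and `primary` flags are assumed fields used to split each group into its three parts.

```scala
object ReadBucketSketch {
  // Hypothetical simplified read with only the fields needed for grouping.
  final case class Read(recordGroup: String, name: String, mapped: Boolean, primary: Boolean)

  // Hypothetical bucket holding all reads that share a record group and read name.
  final case class Bucket(primaryMapped: Seq[Read], secondaryMapped: Seq[Read], unmapped: Seq[Read])

  // Groups reads by (record group, read name), then splits each group into
  // primary-mapped, secondary-mapped, and unmapped reads.
  def bucketize(reads: Seq[Read]): Map[(String, String), Bucket] =
    reads.groupBy(r => (r.recordGroup, r.name)).map { case (key, group) =>
      val (mapped, unmapped) = group.partition(_.mapped)
      val (primary, secondary) = mapped.partition(_.primary)
      key -> Bucket(primary, secondary, unmapped)
    }

  // Illustrative data only.
  val reads: Seq[Read] = Seq(
    Read("rg1", "frag1", mapped = true, primary = true),
    Read("rg1", "frag1", mapped = true, primary = false),
    Read("rg1", "frag1", mapped = false, primary = false),
    Read("rg2", "frag1", mapped = true, primary = true)
  )
}
```

Keying on the pair rather than on the read name alone keeps reads with identical names from different record groups in separate buckets.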
For a single RDD element, returns zero or more sequence record elements.
Element from which to extract sequence records.
A Seq of sequence records.