Created by bryan on 4/17/15.
A wrapper around the attrTuple (key) and value pair. Includes the attrTuple-type explicitly, rather than embedding the corresponding information in the type of 'value', because otherwise it'd be difficult to extract the correct type for Byte and NumericSequence values.
Roughly analogous to Picard's SAMTagAndValue.
The string key associated with this pair.
An enumerated value representing the type of the 'value' parameter.
The 'value' half of the pair.
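A minimal sketch of such a key/type/value wrapper, assuming the type is carried as a separate field. The names `TagTypeSketch`, `AttributeSketch`, and `samTypeCode` are hypothetical and chosen for illustration; only the single-letter type codes come from the SAM optional-field convention.

```scala
// Hypothetical sketch: these names are illustrative, not the library's API.
sealed trait TagTypeSketch

object TagTypeSketch {
  case object CharacterTag extends TagTypeSketch
  case object IntegerTag extends TagTypeSketch
  case object FloatTag extends TagTypeSketch
  case object StringTag extends TagTypeSketch
  case object ByteSequenceTag extends TagTypeSketch
  case object NumericSequenceTag extends TagTypeSketch

  // Map each tag type to its SAM optional-field type code.
  def samTypeCode(t: TagTypeSketch): Char = t match {
    case CharacterTag       => 'A'
    case IntegerTag         => 'i'
    case FloatTag           => 'f'
    case StringTag          => 'Z'
    case ByteSequenceTag    => 'H'
    case NumericSequenceTag => 'B'
  }
}

// The type travels with the pair, so it never has to be inferred from `value`.
final case class AttributeSketch(tag: String, tagType: TagTypeSketch, value: Any)
```

Keeping `tagType` explicit means a byte sequence and a numeric sequence can share a runtime representation for `value` and still be told apart.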
Coverage record for CoverageRDD. Contains Region indexed by contig name, start and end, as well as count of coverage at each base pair in that region.
Specifies chromosomal location of coverage
Specifies start position of coverage
Specifies end position of coverage
Specifies count of coverage at location
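The four fields above could be modeled roughly as follows. This is a sketch under assumptions: `CoverageSketch` and its helper methods are hypothetical, and the end position is assumed to be exclusive, which the text above does not state.

```scala
// Hypothetical sketch: field and method names are illustrative.
// Assumption: `end` is exclusive, so the record spans [start, end).
final case class CoverageSketch(contigName: String, start: Long, end: Long, count: Double) {
  // Number of base pairs covered by this record.
  def width: Long = end - start

  // The count applies to every base in the region, so the total is count * width.
  def totalBases: Double = count * width
}
```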
The standard DNA alphabet with A, T, C, and G.
Creates a multi-reference-region collection of NonoverlappingRegions -- see the scaladocs to NonoverlappingRegions.
The evaluation of a regionJoin takes place with respect to a complete partition on the total space of the genome. NonoverlappingRegions is a class to compute the value of that partition, and to allow us to assign one or more elements of that partition to a new ReferenceRegion (see the 'regionsFor' method).
NonoverlappingRegions takes, as input, an 'input-set' of regions. These are arbitrary ReferenceRegions, which may be overlapping, identical, disjoint, etc. The input-set of regions _must_ all be located on the same reference chromosome (i.e. must all have the same refName); the generalization to reference regions from multiple chromosomes is in MultiContigNonoverlappingRegions, below.
NonoverlappingRegions produces, internally, a 'nonoverlapping-set' of regions. This is basically the set of _distinct unions_ of the input-set regions.
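One plausible way to compute such a partition is sketched below: collect every start and end coordinate, sort them, and treat each adjacent pair of endpoints as one piece. This is an illustration under assumptions, not the class's actual implementation; `IntervalSketch`, `NonoverlappingSketch`, and the signature of `regionsFor` here are hypothetical, and all intervals are assumed to lie on one chromosome.

```scala
// Hypothetical sketch: half-open intervals [start, end) on a single chromosome.
final case class IntervalSketch(start: Long, end: Long) {
  def overlaps(other: IntervalSketch): Boolean =
    start < other.end && other.start < end
}

object NonoverlappingSketch {
  // Sorted, distinct start/end coordinates of all input intervals.
  def endpoints(input: Seq[IntervalSketch]): Vector[Long] =
    input.flatMap(r => Seq(r.start, r.end)).distinct.sorted.toVector

  // Adjacent endpoint pairs, keeping only pieces covered by some input interval.
  def partition(input: Seq[IntervalSketch]): Vector[IntervalSketch] =
    endpoints(input)
      .sliding(2)
      .collect { case Vector(s, e) => IntervalSketch(s, e) }
      .filter(piece => input.exists(_.overlaps(piece)))
      .toVector

  // Pieces of the partition that a query region touches.
  def regionsFor(input: Seq[IntervalSketch], query: IntervalSketch): Vector[IntervalSketch] =
    partition(input).filter(_.overlaps(query))
}
```

For the inputs [0, 10), [5, 15), and [20, 30), this sketch yields the pieces [0, 5), [5, 10), [10, 15), and [20, 30); the uncovered gap [15, 20) is dropped.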
This class is similar to SingleReadBucket, except it breaks the reads down further.
Rather than stopping at primary/secondary/unmapped, this will break it down further into whether they are paired or unpaired, and then whether they are the first or second of the pair.
This is useful because each of the resulting sequences will usually hold a single read.
Builds a dictionary containing record groups. Record groups must have a unique name across all samples in the dictionary. This dictionary provides numerical IDs for each group; these IDs are only consistent when referencing a single dictionary.
A seq of record groups to populate the dictionary.
AssertionError
Throws an assertion error if there are multiple record groups with the same name.
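The uniqueness check and the per-dictionary numbering can be sketched as below. The names `RecordGroupSketch`, `RecordGroupDictionarySketch`, and `getIndex` are hypothetical, and assigning IDs by sorted name is an assumption made for this illustration.

```scala
// Hypothetical sketch: names and the ID-assignment scheme are illustrative.
final case class RecordGroupSketch(sample: String, name: String)

final class RecordGroupDictionarySketch(groups: Seq[RecordGroupSketch]) {
  // Assumption: IDs are positions in the sorted list of group names.
  private val ids: Map[String, Int] =
    groups.map(_.name).sorted.zipWithIndex.toMap

  // Duplicate names collapse into one map entry, so the sizes differ.
  assert(ids.size == groups.length, "Record group names must be unique across samples.")

  // IDs are only meaningful relative to this dictionary instance.
  def getIndex(name: String): Int = ids(name)
}
```

Because the IDs depend on which groups are present, the same group name can receive a different ID in another dictionary.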
Represents a contiguous region of the reference genome.
The name of the sequence (chromosome) in the reference genome
The 0-based residue-coordinate for the start of the region
The 0-based residue-coordinate for the first residue after the start which is not in the region -- i.e. [start, end) define a 0-based half-open interval.
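The half-open convention can be illustrated with a small sketch. `RegionSketch` and its methods are hypothetical names used for this example only.

```scala
// Hypothetical sketch of a 0-based, half-open region [start, end).
final case class RegionSketch(referenceName: String, start: Long, end: Long) {
  require(start >= 0 && end >= start, "expected 0 <= start <= end")

  // With a half-open interval, the width is simply end - start.
  def width: Long = end - start

  // `start` is inside the region; `end` is the first position outside it.
  def contains(pos: Long): Boolean = pos >= start && pos < end

  // Regions that merely touch (one ends where the other starts) do not overlap.
  def overlaps(other: RegionSketch): Boolean =
    referenceName == other.referenceName && start < other.end && other.start < end
}
```

One consequence of this convention is that adjacent regions such as [10, 20) and [20, 30) tile the reference without sharing any position.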
Utility class within the SequenceDictionary; represents unique reference name-to-id correspondence
A symbol in an alphabet
a character which represents the symbol
a character which represents the complement of the symbol
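A symbol with its complement, and a DNA alphabet built from such symbols, might look like the sketch below. `SymbolSketch`, `DnaSketch`, and `reverseComplement` are hypothetical names; the reverse-complement helper is included only as an example of a related operation.

```scala
// Hypothetical sketch: a symbol pairs a character with its complement.
final case class SymbolSketch(label: Char, complement: Char)

object DnaSketch {
  // The standard DNA alphabet: A pairs with T, and C pairs with G.
  val symbols: Seq[SymbolSketch] = Seq(
    SymbolSketch('A', 'T'),
    SymbolSketch('T', 'A'),
    SymbolSketch('C', 'G'),
    SymbolSketch('G', 'C')
  )

  private val complementOf: Map[Char, Char] =
    symbols.map(s => s.label -> s.complement).toMap

  // Reverse the sequence, then complement each base.
  // Throws NoSuchElementException for characters outside the alphabet.
  def reverseComplement(sequence: String): String =
    sequence.reverse.map(c => complementOf(c))
}
```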
Converts from avro Feature to Coverage.
SequenceDictionary contains the (bijective) map between Ints (the referenceId) and Strings (the referenceName) from the header of a BAM file, or the combined result of multiple such SequenceDictionaries.
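The bijection between IDs and names can be sketched with two maps and a consistency check. This is an illustration only: `SequenceDictionarySketch` and its methods are hypothetical, and merging several dictionaries is not shown.

```scala
// Hypothetical sketch of a bijective referenceId <-> referenceName mapping.
final class SequenceDictionarySketch(entries: Seq[(Int, String)]) {
  private val idToName: Map[Int, String] = entries.toMap
  private val nameToId: Map[String, Int] = entries.map(_.swap).toMap

  // If any ID or name repeats, one of the maps shrinks and the check fails.
  require(
    idToName.size == entries.size && nameToId.size == entries.size,
    "each id must map to exactly one name and vice versa"
  )

  def name(id: Int): Option[String] = idToName.get(id)
  def id(name: String): Option[Int] = nameToId.get(name)
}
```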
Note: VariantContext inherits its name from the Picard VariantContext, and is not related to the SparkContext object. If you're looking for the latter, see org.bdgenomics.adam.rdd.variation.VariationContext
An alphabet of symbols and related operations