bio.fmindex
Source code: bio/fmindex.seq
- class bio.fmindex.bntann
Magic methods:
- __init__(name: str, anno: str, offset: int, len: int)
- class bio.fmindex.bntamb
Magic methods:
- __init__(offset: int, amb: byte)
- class bio.fmindex.bntseq
Arbitrary-length 2-bit packed sequence, adapted from BWA
Properties:
- _n_holes
Magic methods:
- __pickle__(jar: Jar)
- __unpickle__(jar: Jar)
- __init__()
- __init__(path: str)
- __init__(sequence: seq)
- __len__()
- __bool__()
Methods:
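The 2-bit packing behind bntseq can be illustrated in plain Python. This is a minimal sketch, not the library's implementation: the helper names pack_2bit and unpack_2bit are hypothetical, the A=0, C=1, G=2, T=3 encoding with the first base in the high bits follows BWA's convention and is assumed here, and ambiguous bases are not handled.

```python
# Illustrative only: pack_2bit / unpack_2bit are hypothetical helpers,
# not part of bio.fmindex.
CODES = {"A": 0, "C": 1, "G": 2, "T": 3}  # assumed encoding (BWA convention)
BASES = "ACGT"


def pack_2bit(s: str) -> bytearray:
    """Pack an A/C/G/T string into a bytearray, four bases per byte."""
    packed = bytearray((len(s) + 3) // 4)
    for i, base in enumerate(s):
        shift = (3 - (i & 3)) * 2  # first base of each byte goes in the high bits
        packed[i >> 2] |= CODES[base] << shift
    return packed


def unpack_2bit(packed: bytearray, n: int) -> str:
    """Recover the first n bases from a buffer produced by pack_2bit."""
    out = []
    for i in range(n):
        shift = (3 - (i & 3)) * 2
        out.append(BASES[(packed[i >> 2] >> shift) & 3])
    return "".join(out)
```

The sequence length has to be stored separately, because the last byte may be only partly filled.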
- class bio.fmindex.FMInterval
FM-index interval
Magic methods:
- __bool__()
- __len__()
- class bio.fmindex.FMDInterval
FMD-index interval
Magic methods:
- __new__()
- __bool__()
- __len__()
- __invert__()
FMDInterval for the reverse complemented sequence
Methods:
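The idea behind inverting an FMD-index interval can be sketched as follows. This assumes the bi-interval layout used by BWA's FMD-index (the interval start for a pattern, the interval start for its reverse complement, and a shared size); the class name BiInterval and its field names are hypothetical, and whether FMDInterval stores exactly these fields is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BiInterval:
    """Hypothetical stand-in for an FMD-index interval (not the library's type)."""
    lo: int      # start of the suffix-array interval of a pattern W
    rc_lo: int   # start of the interval of W's reverse complement
    size: int    # number of occurrences, shared by both intervals

    def __len__(self) -> int:
        return self.size

    def __bool__(self) -> bool:
        return self.size > 0

    def __invert__(self) -> "BiInterval":
        # The interval of the reverse complement swaps the two starts.
        return BiInterval(self.rc_lo, self.lo, self.size)
```

Under this layout, inverting twice yields the original interval, and an empty interval is falsy.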
- class bio.fmindex.SMEM
Super-Maximal Exact Match (SMEM)
Properties:
Magic methods:
- __new__(interval: FMDInterval, start: int, stop: int)
- __new__(interval: FMDInterval)
- __new__()
- __len__()
- __bool__()
- bio.fmindex.smems(self, q: seq, x: int = 0, min_intv: int = 1, min_seed: int = 1, mems: Optional[List[SMEM]] = None, prev: Optional[List[SMEM]] = None, curr: Optional[List[SMEM]] = None)
Returns a list of SMEMs given an FM-index or FMD-index (self), a query sequence (q), and a start position (x; 0-based). Adapted from BWA-MEM’s bwt_smem1a().
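To make the notion of an SMEM concrete, here is a brute-force sketch in plain Python. It is not the algorithm used by smems or by BWA-MEM: it uses substring search instead of an index, scans the whole query rather than working from a start position, and ignores the reverse strand. The function name naive_smems is hypothetical.

```python
from typing import List, Tuple


def naive_smems(text: str, q: str, min_seed: int = 1) -> List[Tuple[int, int]]:
    """Return (start, stop) query intervals of all SMEMs of q against text.

    An SMEM is an exact match that is not contained in any longer exact
    match on the query. stop is non-inclusive.
    """
    smems = []
    prev_stop = 0  # furthest stop reached by any earlier start
    for start in range(len(q)):
        # Longest match beginning at `start`.
        stop = start
        while stop < len(q) and q[start:stop + 1] in text:
            stop += 1
        # The match is super-maximal only if it reaches beyond every match
        # that began earlier; otherwise it is contained in one of them.
        if stop > start and stop > prev_stop and stop - start >= min_seed:
            smems.append((start, stop))
        prev_stop = max(prev_stop, stop)
    return smems
```

For example, with text "GATTACAGATT" and query "TTACAGG", the query prefix "TTACAG" and the final "G" are the two SMEMs; raising min_seed to 2 keeps only the first.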
- bio.fmindex.OCC_INTV_SHIFT
- bio.fmindex.OCC_INTERVAL
- bio.fmindex.OCC_INTV_MASK
- class bio.fmindex.FMDIndex
FMD-index: a bi-directional FM-index. Based on BWA-MEM’s implementation. Note that this implementation does not perform SA-compression.
Magic methods:
- __pickle__(jar: Jar)
- __unpickle__(jar: Jar)
- __init__()
- __init__(path: str)
Constructs an FM-index from the FASTA file at the specified path
- __init__(sequence: seq)
Constructs an FM-index from the specified sequence
- __getitem__(x: Tuple[FMInterval, seq])
Equivalent to self.update(x[0], x[1]).
- __getitem__(x: Tuple[FMDInterval, seq])
Equivalent to self.biupdate(x[0], x[1]).
- __getitem__(intv: FMInterval)
Iterator over all 0-based positions from specified interval
- __getitem__(s: seq)
Iterator over all 0-based positions at which the given sequence appears
Methods:
- occ(k: int, c: seq)
FM-index occ operation
- less(c: seq)
FM-index less operation
- interval(c: seq)
FMInterval corresponding to given length-1 sequence
- biinterval(c: seq)
FMDInterval corresponding to given length-1 sequence
- smems(q: seq, x: int = 0, min_intv: int = 1, min_seed: int = 1, mems: Optional[List[SMEM]] = None, prev: Optional[List[SMEM]] = None, curr: Optional[List[SMEM]] = None)
See smems function
- update(intv: FMInterval, c: seq)
Returns given FMInterval extended by base c
- biupdate(intv: FMDInterval, c: seq)
Returns given FMDInterval extended by base c
- count(s: seq)
Returns how many times the given sequence appears in the index
- results(intv: FMInterval, both_strands: bool = False)
Iterator over (contig ID (int), contig name (str), 0-based position (int), reversed? (bool)) tuples corresponding to the specified interval. both_strands determines whether to include reverse-complemented results.
- loci(intv: FMInterval, both_strands: bool = True)
Iterator over Locus values contained in the specified interval. both_strands determines whether to include reverse-complemented results.
- biresults(smem: SMEM)
Iterator over (contig ID (int), contig name (str), 0-based position (int), reversed? (bool)) tuples corresponding to the interval of the specified SMEM.
- biloci(intv: FMDInterval)
Iterator over Locus values contained in the specified interval.
- locate(s: seq, both_strands: bool = False)
Returns the results for the interval corresponding to the given sequence
- sequence(start: int, stop: int, rid: int = -1, name: str = '')
Obtains the underlying sequence from this index given 0-based start and stop (non-inclusive) positions and either contig ID rid or contig name name. Note that ambiguous bases are randomly replaced with A/C/G/T.
- contig(rid: int)
Returns the Contig with the specified ID
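The (contig ID, contig name, 0-based position) part of the tuples yielded by results can be understood as a lookup from a position in the concatenated reference to a contig. The following is a minimal sketch of that lookup, assuming contigs are concatenated in order and described by name, offset, and length fields like those of bntann. The names ContigAnn and resolve are hypothetical, strand handling is omitted, and how the library resolves positions internally is not stated in this reference.

```python
from bisect import bisect_right
from typing import List, NamedTuple, Tuple


class ContigAnn(NamedTuple):
    """Hypothetical per-contig annotation, loosely modelled on bntann."""
    name: str
    offset: int  # start of the contig in the concatenated reference
    length: int


def resolve(anns: List[ContigAnn], pos: int) -> Tuple[int, str, int]:
    """Map a 0-based global position to (contig ID, contig name, position in contig).

    Assumes anns is sorted by offset.
    """
    offsets = [a.offset for a in anns]
    rid = bisect_right(offsets, pos) - 1  # last contig starting at or before pos
    if rid < 0 or pos >= anns[rid].offset + anns[rid].length:
        raise IndexError(f"position {pos} is outside every contig")
    return rid, anns[rid].name, pos - anns[rid].offset
```

Binary search keeps each lookup logarithmic in the number of contigs, which matters when an interval contains many positions.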
- class bio.fmindex.FMIndex
FM-index data structure. Note that this implementation does not perform SA-compression.
Magic methods:
- __pickle__(jar: Jar)
- __unpickle__(jar: Jar)
- __init__()
- __init__(s: seq)
Constructs an FM-index from the specified sequence
- __init__(path: str)
Constructs an FM-index from the FASTA file at the specified path
- __init__(path: str, FMD: bool)
Constructs an FM-index from the FASTA file at the specified path. FMD controls whether this index should be bi-directional.
- __getitem__(x: Tuple[FMInterval, seq])
Equivalent to self.update(x[0], x[1]).
- __getitem__(x: Tuple[FMDInterval, seq])
Equivalent to self.biupdate(x[0], x[1]).
- __prefetch__(x: Tuple[FMInterval, seq])
- __prefetch__(x: Tuple[FMDInterval, seq])
- __getitem__(intv: FMInterval)
Iterator over all 0-based positions from specified interval
- __getitem__(s: seq)
Iterator over all 0-based positions at which the given sequence appears
Methods:
- occ(k: int, c: seq)
FM-index occ operation
- less(c: seq)
FM-index less operation
- interval(c: seq)
FMInterval corresponding to given length-1 sequence
- biinterval(c: seq)
FMDInterval corresponding to given length-1 sequence
- smems(q: seq, x: int = 0, min_intv: int = 1, min_seed: int = 1, mems: Optional[List[SMEM]] = None, prev: Optional[List[SMEM]] = None, curr: Optional[List[SMEM]] = None)
See smems function
- update(intv: FMInterval, c: seq)
Returns given FMInterval extended by base c
- biupdate(intv: FMDInterval, c: seq)
Returns given FMDInterval extended by base c
- count(s: seq)
Returns how many times the given sequence appears in the index
- results(intv: FMInterval)
Iterator over (contig ID (int), contig name (str), 0-based position (int), reversed? (bool)) tuples corresponding to the specified interval. both_strands determines whether to include reverse-complemented results.
- loci(intv: FMInterval)
Iterator over Locus values contained in the specified interval. both_strands determines whether to include reverse-complemented results.
- biresults(smem: SMEM)
Iterator over (contig ID (int), contig name (str), 0-based position (int), reversed? (bool)) tuples corresponding to the interval of the specified SMEM.
- biloci(intv: FMDInterval)
Iterator over Locus values contained in the specified interval.
- locate(s: seq)
Returns the results for the interval corresponding to the given sequence
- sequence(start: int, stop: int, rid: int = -1, name: str = '')
Obtains the underlying sequence from this index given 0-based start and stop (non-inclusive) positions and either contig ID rid or contig name name. Note that ambiguous bases are randomly replaced with A/C/G/T.
- contig(rid: int)
Returns the Contig with the specified ID
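The core FM-index operations listed above (occ, less, interval, update, count, and locating positions) can be sketched in plain Python. This is a textbook illustration under stated assumptions, not the library's implementation: the class TinyFMIndex is hypothetical, intervals are half-open (lo, hi) pairs over the suffix array, occ(k, c) counts c in the first k characters of the BWT, and the suffix array is built by naive sorting and kept uncompressed. The library's own conventions for occ and interval bounds may differ.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [lo, hi) range of suffix-array rows


class TinyFMIndex:
    """Hypothetical toy FM-index over a single string (no contigs, one strand)."""

    def __init__(self, text: str):
        self.text = text + "$"  # sentinel, sorts before A/C/G/T
        n = len(self.text)
        # Naive suffix array: fine for a sketch, far too slow for genomes.
        self.sa = sorted(range(n), key=lambda i: self.text[i:])
        # BWT: the character preceding each sorted suffix (wraps to "$").
        self.bwt = "".join(self.text[i - 1] for i in self.sa)

    def occ(self, k: int, c: str) -> int:
        """Number of occurrences of c in the first k characters of the BWT."""
        return self.bwt[:k].count(c)

    def less(self, c: str) -> int:
        """Number of characters in the text that are strictly smaller than c."""
        return sum(ch < c for ch in self.text)

    def interval(self, c: str) -> Interval:
        """Suffix-array interval of all suffixes starting with the base c."""
        lo = self.less(c)
        return lo, lo + self.occ(len(self.bwt), c)

    def update(self, intv: Interval, c: str) -> Interval:
        """Extend the pattern of intv by prepending base c (backward search step)."""
        lo, hi = intv
        base = self.less(c)
        return base + self.occ(lo, c), base + self.occ(hi, c)

    def search(self, s: str) -> Interval:
        """Interval of all suffixes that start with s."""
        intv = (0, len(self.bwt))
        for c in reversed(s):  # backward search consumes the pattern right to left
            intv = self.update(intv, c)
            if intv[0] >= intv[1]:
                break
        return intv

    def count(self, s: str) -> int:
        lo, hi = self.search(s)
        return max(0, hi - lo)

    def locate(self, s: str) -> List[int]:
        """Sorted 0-based positions at which s occurs in the text."""
        lo, hi = self.search(s)
        return sorted(self.sa[i] for i in range(lo, hi))
```

A short usage example: for the text "GATTACAGATT", count("ATT") gives 2 and locate("ATT") gives positions 1 and 8.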