bio.fmindex

Source code: bio/fmindex.seq

class bio.fmindex.bntann

Magic methods:

__init__(name: str, anno: str, offset: int, len: int)

class bio.fmindex.bntamb

Magic methods:

__init__(offset: int, amb: byte)

class bio.fmindex.bntseq

Arbitrary-length 2-bit packed sequence, adapted from BWA
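To illustrate what "2-bit packed" means, here is a minimal sketch in plain Python (not Seq, and not this class's implementation). The helper names `pack`, `get_base`, and `unpack` are hypothetical, and the bit layout (A=0, C=1, G=2, T=3, four bases per byte, first base in the high bits) is an assumption modeled on BWA's packed format.

```python
# Assumed encoding, modeled on BWA: A=0, C=1, G=2, T=3.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack(dna: str) -> bytearray:
    """Pack a DNA string into 2 bits per base, four bases per byte."""
    pac = bytearray((len(dna) + 3) // 4)
    for i, base in enumerate(dna):
        shift = (~i & 3) << 1  # 6, 4, 2, 0: first base lands in the top bits
        pac[i >> 2] |= CODE[base] << shift
    return pac


def get_base(pac: bytearray, i: int) -> str:
    """Read the i-th base back out of the packed buffer."""
    shift = (~i & 3) << 1
    return BASES[(pac[i >> 2] >> shift) & 3]


def unpack(pac: bytearray, n: int) -> str:
    """Decode the first n bases of the packed buffer."""
    return "".join(get_base(pac, i) for i in range(n))
```

Because a 2-bit code has no room for ambiguous bases such as N, a packed sequence needs side information (here presumably the bntamb records) to remember where they were.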

Properties:

_n_holes

Magic methods:

__pickle__(jar: Jar)
__unpickle__(jar: Jar)
__init__()
__init__(path: str)
__init__(sequence: seq)
__len__()
__bool__()

Methods:

get_pac(pac: ptr[u8], l: int)
depos(pos: int)
pos2rid(pos: int)
intv2rid(rb: int, re: int)
cnt_ambi(pos: int, len: int)
get_seq(lo: int, hi: int)

type bio.fmindex.FMInterval

FM-index interval

Magic methods:

__init__()
__bool__()
__len__()

type bio.fmindex.FMDInterval

FMD-index interval

Magic methods:

__init__()
__bool__()
__len__()
__invert__()

FMDInterval for the reverse complemented sequence

Methods:

forward()

Forward-direction FMInterval corresponding to this interval

revcomp()

Reverse-direction FMInterval corresponding to this interval

type bio.fmindex.SMEM

Super-Maximal Exact Match (SMEM)

Properties:

interval

Corresponding FMDInterval

start

SMEM start position on query (0-based)

stop

SMEM stop position on query (non-inclusive; 0-based)

Magic methods:

__init__(interval: FMDInterval, start: int, stop: int)
__init__(interval: FMDInterval)
__init__()
__len__()
__bool__()

bio.fmindex.smems(self, q: seq, x: int = 0, min_intv: int = 1, min_seed: int = 1, mems: list[SMEM] = None, prev: list[SMEM] = None, curr: list[SMEM] = None)

Returns a list of SMEMs given an FM-index or FMD-index (self), a query sequence (q) and a start position (x; 0-based). Adapted from BWA-MEM’s bwt_smem1a().
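For readers unfamiliar with the term, the following brute-force sketch in plain Python shows what a super-maximal exact match is. It is not the bwt_smem1a() algorithm and uses no index: `naive_smems` and `min_len` are hypothetical names, the function scans the whole query rather than starting from a position x, and it ignores the reverse-complement strand.

```python
def naive_smems(ref: str, q: str, min_len: int = 1):
    """Return (start, stop) query intervals of super-maximal exact matches.

    An interval qualifies if q[start:stop] occurs in ref, cannot be
    extended in either direction, and is not contained in another match.
    """
    smems = []
    prev_stop = 0  # furthest stop reached by any earlier start
    for start in range(len(q)):
        # Extend to the right for as long as the substring occurs in ref.
        stop = start
        while stop < len(q) and q[start:stop + 1] in ref:
            stop += 1
        # A match that ends no further than an earlier one is contained in it.
        if stop > start and stop > prev_stop and stop - start >= min_len:
            smems.append((start, stop))
        prev_stop = max(prev_stop, stop)
    return smems
```

The stop positions are non-inclusive, matching the start/stop convention of the SMEM type above.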

bio.fmindex.OCC_INTV_SHIFT
bio.fmindex.OCC_INTERVAL
bio.fmindex.OCC_INTV_MASK

class bio.fmindex.FMDIndex

FMD-index: a bi-directional FM-index. Based on BWA-MEM’s implementation. Note that this implementation does not perform SA-compression.

Magic methods:

__pickle__(jar: Jar)
__unpickle__(jar: Jar)
__init__()
__init__(path: str)

Constructs an FMD-index from the FASTA file at the specified path

__init__(sequence: seq)

Constructs an FMD-index from the specified sequence

__getitem__(x: tuple[FMInterval, seq])

Equivalent to self.update(x[0], x[1]).

__getitem__(x: tuple[FMDInterval, seq])

Equivalent to self.biupdate(x[0], x[1]).

__getitem__(intv: FMInterval)

Iterator over all 0-based positions from specified interval

__getitem__(s: seq)

Iterator over all 0-based positions at which the given sequence appears
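Conceptually, locating a sequence means finding the contiguous block of suffix-array rows whose suffixes start with it, then reading off the text positions stored in those rows. A minimal sketch in plain Python, using an explicit sorted suffix list instead of an FM-index (the name `positions` is hypothetical, and the `"~"` upper bound assumes an alphabet of ASCII letters):

```python
from bisect import bisect_left, bisect_right


def positions(text: str, pattern: str):
    """Return the sorted 0-based positions at which pattern occurs in text."""
    # Suffix array: suffix start positions in lexicographic order.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    suffixes = [text[i:] for i in sa]
    # Suffixes starting with `pattern` form one contiguous block [lo, hi).
    lo = bisect_left(suffixes, pattern)
    hi = bisect_right(suffixes, pattern + "~")  # "~" sorts after A-Z and a-z
    return sorted(sa[lo:hi])
```

The half-open range `[lo, hi)` plays the role of an interval here; its width is the number of occurrences.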

Methods:

occ(k: int, c: seq)

FM-index occ operation

less(c: seq)

FM-index less operation
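As a reminder of the textbook definitions behind these two operations, here is a small plain-Python sketch over an explicit Burrows-Wheeler transform. The function names mirror the methods above, but the signatures are hypothetical, and counting over `bwt[:k]` (exclusive of row k) is a choice made for this sketch, not a statement about this library's convention.

```python
def bwt_of(text: str) -> str:
    """Burrows-Wheeler transform of text with a '$' sentinel appended."""
    s = text + "$"
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


def less(bwt: str, c: str) -> int:
    """Number of characters in the text that sort strictly before c."""
    return sum(1 for ch in bwt if ch < c)


def occ(bwt: str, k: int, c: str) -> int:
    """Number of occurrences of c in the first k characters of the BWT."""
    return bwt[:k].count(c)
```

A real FM-index precomputes `less` as a small table and samples `occ` at fixed intervals (compare the OCC_INTERVAL constant above) instead of scanning the string.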

interval(c: seq)

FMInterval corresponding to given length-1 sequence

biinterval(c: seq)

FMDInterval corresponding to given length-1 sequence

smems(q: seq, x: int = 0, min_intv: int = 1, min_seed: int = 1, mems: list[SMEM] = None, prev: list[SMEM] = None, curr: list[SMEM] = None)

See the module-level smems function

update(intv: FMInterval, c: seq)

Returns given FMInterval extended by base c

biupdate(intv: FMDInterval, c: seq)

Returns given FMDInterval extended by base c

count(s: seq)

Returns how many times the given sequence appears in the index
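Counting with an FM-index is usually done by backward search: start from the interval covering every row, then extend by one base at a time from the end of the pattern, as update does for a single base. The plain-Python sketch below shows the idea under the textbook definitions of occ and less; `count_occurrences` is a hypothetical name and the half-open interval convention is this sketch's own.

```python
def bwt_of(text: str) -> str:
    """Burrows-Wheeler transform of text with a '$' sentinel appended."""
    s = text + "$"
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


def count_occurrences(text: str, pattern: str) -> int:
    """Count occurrences of pattern in text by backward search."""
    bwt = bwt_of(text)
    lo, hi = 0, len(bwt)  # half-open interval of suffix-array rows
    for c in reversed(pattern):
        smaller = sum(1 for ch in bwt if ch < c)  # the "less" operation
        lo = smaller + bwt[:lo].count(c)          # the "occ" operation
        hi = smaller + bwt[:hi].count(c)
        if lo >= hi:
            return 0  # interval became empty: no occurrence
    return hi - lo
```

Each loop iteration corresponds to one interval extension, so the cost grows with the pattern length rather than the text length once occ and less are precomputed.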

results(intv: FMInterval, both_strands: bool = False)

Iterator over (contig ID (int), contig name (str), 0-based position (int), reversed? (bool)) tuples corresponding to the specified interval. both_strands determines whether to include reverse-complemented results.
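The shape of these tuples can be illustrated with a plain-Python sketch that maps a position in a concatenated reference back to a contig. It assumes a BWA-style layout in which all contigs are concatenated (total length L) and followed by the reverse complement of that concatenation, so positions at or beyond L lie on the reverse strand. Whether this index uses exactly that layout is an assumption, and `to_result` is a hypothetical helper.

```python
from bisect import bisect_right


def to_result(pos: int, names: list, lengths: list):
    """Map a global position to (contig ID, contig name, position, reversed?)."""
    # Start offset of each contig in the forward concatenation.
    offsets = [0]
    for n in lengths:
        offsets.append(offsets[-1] + n)
    total = offsets[-1]
    is_rev = pos >= total
    # Mirror reverse-strand positions back onto the forward strand.
    fwd = 2 * total - 1 - pos if is_rev else pos
    rid = bisect_right(offsets, fwd) - 1  # last contig starting at or before fwd
    return rid, names[rid], fwd - offsets[rid], is_rev
```

For example, with contigs of lengths 10 and 5, global position 12 falls at offset 2 of the second contig on the forward strand.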

loci(intv: FMInterval, both_strands: bool = True)

Iterator over Locus values contained in the specified interval. both_strands determines whether to include reverse complemented results.

biresults(smem: SMEM)

Iterator over (contig ID (int), contig name (str), 0-based position (int), reversed? (bool)) tuples corresponding to the interval of the specified SMEM.

biloci(intv: FMDInterval)

Iterator over Locus values contained in the specified interval.

locate(s: seq, both_strands: bool = False)

Equivalent to results() on the interval corresponding to the given sequence

sequence(start: int, stop: int, rid: int = -1, name: str = '')

Obtains the underlying sequence from this index given 0-based start and stop (non-inclusive) positions and either contig ID rid or contig name name. Note that ambiguous bases are randomly replaced with A/C/G/T.

contigs()

Iterator over the Contig values contained in this index

contig(rid: int)

Returns the Contig with the specified ID

class bio.fmindex.FMIndex

FM-index data structure. Note that this implementation does not perform SA-compression.

Magic methods:

__pickle__(jar: Jar)
__unpickle__(jar: Jar)
__init__()
__init__(s: seq)

Constructs an FM-index from the specified sequence

__init__(path: str)

Constructs an FM-index from the FASTA file at the specified path

__init__(path: str, FMD: bool)

Constructs an FM-index from the FASTA file at the specified path. FMD controls whether this index should be bi-directional.

__getitem__(x: tuple[FMInterval, seq])

Equivalent to self.update(x[0], x[1]).

__getitem__(x: tuple[FMDInterval, seq])

Equivalent to self.biupdate(x[0], x[1]).

__prefetch__(x: tuple[FMInterval, seq])
__prefetch__(x: tuple[FMDInterval, seq])
__getitem__(intv: FMInterval)

Iterator over all 0-based positions from specified interval

__getitem__(s: seq)

Iterator over all 0-based positions at which the given sequence appears

Methods:

occ(k: int, c: seq)

FM-index occ operation

less(c: seq)

FM-index less operation

interval(c: seq)

FMInterval corresponding to given length-1 sequence

biinterval(c: seq)

FMDInterval corresponding to given length-1 sequence

smems(q: seq, x: int = 0, min_intv: int = 1, min_seed: int = 1, mems: list[SMEM] = None, prev: list[SMEM] = None, curr: list[SMEM] = None)

See the module-level smems function

update(intv: FMInterval, c: seq)

Returns given FMInterval extended by base c

biupdate(intv: FMDInterval, c: seq)

Returns given FMDInterval extended by base c

count(s: seq)

Returns how many times the given sequence appears in the index

results(intv: FMInterval)

Iterator over (contig ID (int), contig name (str), 0-based position (int), reversed? (bool)) tuples corresponding to the specified interval. both_strands determines whether to include reverse-complemented results.

loci(intv: FMInterval)

Iterator over Locus values contained in the specified interval. both_strands determines whether to include reverse complemented results.

biresults(smem: SMEM)

Iterator over (contig ID (int), contig name (str), 0-based position (int), reversed? (bool)) tuples corresponding to the interval of the specified SMEM.

biloci(intv: FMDInterval)

Iterator over Locus values contained in the specified interval.

locate(s: seq)

Equivalent to results() on the interval corresponding to the given sequence

sequence(start: int, stop: int, rid: int = -1, name: str = '')

Obtains the underlying sequence from this index given 0-based start and stop (non-inclusive) positions and either contig ID rid or contig name name. Note that ambiguous bases are randomly replaced with A/C/G/T.

contigs()

Iterator over the Contig values contained in this index

contig(rid: int)

Returns the Contig with the specified ID