pyckmeans.distance package

Submodules

pyckmeans.distance.c_interop module

pyckmeans.distance.c_interop.jc_distance(alignment: numpy.ndarray, pairwise_deletion: bool = True) -> numpy.ndarray

Calculate Jukes-Cantor distance for a nucleotide alignment.

Parameters
alignment : numpy.ndarray

n*m numpy alignment, where n is the number of entries and m is the number of sites. Bases must be encoded in the format of pyckmeans.io.NucleotideAlignment.

pairwise_deletion : bool, optional

Calculate distances with pairwise deletion in case of missing data, by default True.

Returns
numpy.ndarray

n*n distance matrix.
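To illustrate what the Jukes-Cantor distance represents, the following minimal sketch applies the standard Jukes-Cantor formula, d = -3/4 * ln(1 - 4p/3), to a proportion p of differing sites. The helper name `jc_correct` is hypothetical, and returning NaN at saturation (p >= 0.75) is an assumption of this sketch; the source does not say how the compiled routine handles that case.

```python
import math


def jc_correct(p: float) -> float:
    """Jukes-Cantor distance for a proportion p of differing sites.

    Hypothetical helper for illustration; not part of pyckmeans.
    """
    if p >= 0.75:
        # The logarithm is undefined at or beyond saturation.
        # Returning NaN here is an assumption of this sketch.
        return float("nan")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# The corrected distance is always at least as large as the raw proportion.
print(jc_correct(0.1))
```

The correction grows with p because multiple substitutions at the same site become more likely as sequences diverge.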

pyckmeans.distance.c_interop.k2p_distance(alignment: numpy.ndarray, pairwise_deletion: bool = True) -> numpy.ndarray

Calculate Kimura 2-parameter distance for a nucleotide alignment.

Parameters
alignment : numpy.ndarray

n*m numpy alignment, where n is the number of entries and m is the number of sites. Bases must be encoded in the format of pyckmeans.io.NucleotideAlignment.

pairwise_deletion : bool, optional

Calculate distances with pairwise deletion in case of missing data, by default True.

Returns
numpy.ndarray

n*n distance matrix.
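As a rough illustration of the Kimura 2-parameter model, the sketch below computes the distance between two plain-string sequences with the standard formula d = -1/2 * ln(1 - 2P - Q) - 1/4 * ln(1 - 2Q), where P and Q are the proportions of transitions and transversions. The function name `k2p_pair`, the use of plain strings instead of the encoded numpy alignment, and the NaN return for undefined cases are all assumptions of this sketch, not the library's implementation.

```python
import math

VALID = set("ACGT")
# Transitions are purine<->purine (A/G) or pyrimidine<->pyrimidine (C/T) changes.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def k2p_pair(seq_a: str, seq_b: str) -> float:
    """Kimura 2-parameter distance between two aligned sequences.

    Hypothetical helper; sites with a non-ACGT character in either
    sequence are skipped (pairwise deletion).
    """
    compared = transitions = transversions = 0
    for a, b in zip(seq_a, seq_b):
        if a not in VALID or b not in VALID:
            continue
        compared += 1
        if a == b:
            continue
        if (a, b) in TRANSITIONS:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        return float("nan")  # assumption: no comparable sites
    p = transitions / compared
    q = transversions / compared
    term_a = 1.0 - 2.0 * p - q
    term_b = 1.0 - 2.0 * q
    if term_a <= 0.0 or term_b <= 0.0:
        return float("nan")  # assumption: formula undefined
    return -0.5 * math.log(term_a) - 0.25 * math.log(term_b)


print(k2p_pair("AAAAAAAAAA", "GAAAAAAAAA"))  # one transition in ten sites
print(k2p_pair("AAAAAAAAAA", "CAAAAAAAAA"))  # one transversion in ten sites
```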

pyckmeans.distance.c_interop.p_distance(alignment: numpy.ndarray, pairwise_deletion: bool = True) -> numpy.ndarray

Calculate p-distance for a nucleotide alignment.

Parameters
alignment : numpy.ndarray

n*m numpy alignment, where n is the number of entries and m is the number of sites. Bases must be encoded in the format of pyckmeans.io.NucleotideAlignment.

pairwise_deletion : bool, optional

Calculate distances with pairwise deletion in case of missing data, by default True.

Returns
numpy.ndarray

n*n distance matrix.
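The following pure-Python sketch shows how an n*n p-distance matrix with pairwise deletion could be computed. It works on plain strings rather than the encoded numpy alignment the real function expects, and the name `p_distance_matrix` and the NaN value for pairs without comparable sites are assumptions made for illustration only.

```python
VALID = set("ACGT")


def p_distance_matrix(seqs):
    """Proportion of differing sites for every pair of aligned sequences.

    Hypothetical helper; a site is ignored for a given pair if either
    sequence has a non-ACGT character there (pairwise deletion).
    """
    n = len(seqs)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            compared = diffs = 0
            for a, b in zip(seqs[i], seqs[j]):
                if a in VALID and b in VALID:
                    compared += 1
                    diffs += a != b
            # Assumption: NaN when a pair shares no comparable sites.
            d = diffs / compared if compared else float("nan")
            dist[i][j] = dist[j][i] = d
    return dist


matrix = p_distance_matrix(["ACGT", "ACGA", "A-GT"])
for row in matrix:
    print(row)
```

Note that the gap in the third sequence only removes that site from comparisons involving the third sequence; the first two sequences are still compared over all four sites.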

Module contents

distance

Module for distance calculations.

class pyckmeans.distance.DistanceMatrix(dist_mat: numpy.ndarray, names: Optional[Iterable[str]] = None)

Bases: object

Distance Matrix, optionally named.

Parameters
dist_mat : numpy.ndarray

n*n distance matrix.

names : Optional[Iterable[str]]

Names, by default None.

Raises
IncompatibleNamesError

Raised if dimension of names and dist_mat are incompatible.

Attributes
shape

Get matrix shape.

Methods

from_csv(file_path[, header, index_col, sep])

Read distance matrix from CSV file.

from_phylip(file_path)

Read PHYLIP distance matrix.

to_csv(file_path[, force])

Write DistanceMatrix object to CSV.

to_phylip(file_path[, force])

Write distance matrix to file in PHYLIP matrix format.

static from_csv(file_path: str, header: Optional[int] = 0, index_col: Optional[int] = 0, sep: str = ',', **kwargs) -> pyckmeans.distance.DistanceMatrix

Read distance matrix from CSV file.

Parameters
file_path : str

Path to CSV file.

header : Optional[int]

Row of the CSV file that contains the sample names, by default 0 (the first row). Passed to pandas.read_csv().

index_col : Optional[int]

Index column, by default 0: the first column is expected to contain the sample names. Passed to pandas.read_csv().

sep : str

Column separator, by default ‘,’. Passed to pandas.read_csv().

**kwargs

Additional keyword arguments passed to pandas.read_csv().

Returns
pyckmeans.distance.DistanceMatrix

DistanceMatrix object.
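The sketch below shows the CSV layout that the defaults header=0 and index_col=0 imply: sample names in the first row and again in the first column. It parses the text with the standard-library csv module purely to make the layout explicit; the method itself delegates to pandas.read_csv(), and the sample names s1 to s3 and their values are made up.

```python
import csv
import io

# Hypothetical file content: first row and first column hold sample names.
csv_text = (
    ",s1,s2,s3\n"
    "s1,0.0,0.1,0.2\n"
    "s2,0.1,0.0,0.3\n"
    "s3,0.2,0.3,0.0\n"
)

rows = list(csv.reader(io.StringIO(csv_text), delimiter=","))

# Row 0 is the header (header=0); its first cell is the empty corner cell.
names = rows[0][1:]

# Column 0 of every remaining row is the index (index_col=0).
matrix = [[float(value) for value in row[1:]] for row in rows[1:]]

print(names)
print(matrix)
```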

static from_phylip(file_path: str) -> pyckmeans.distance.DistanceMatrix

Read PHYLIP distance matrix.

Returns
DistanceMatrix

DistanceMatrix object.

property shape: Tuple[int]

Get matrix shape.

Returns
Tuple[int]

Matrix shape.

to_csv(file_path: str, force: bool = False)

Write DistanceMatrix object to CSV.

Parameters
file_path : str

CSV file path.

force : bool, optional

Force overwrite if file_path already exists, by default False.

to_phylip(file_path: str, force: bool = False)

Write distance matrix to file in PHYLIP matrix format.

Parameters
file_path : str

Output file path.

force : bool, optional

Force overwrite if the file already exists, by default False.
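For orientation, the sketch below formats a small matrix in the general square PHYLIP distance layout: a first line with the number of entries, followed by one line per entry containing its name and its distances. The function name `format_phylip`, the 10-character name padding, and the four-decimal precision are assumptions of this sketch; the source does not state the exact formatting that to_phylip writes.

```python
def format_phylip(names, dist):
    """Render a square distance matrix as PHYLIP-style text.

    Hypothetical helper for illustration; padding and precision
    are choices of this sketch, not documented library behavior.
    """
    lines = [str(len(names))]
    for name, row in zip(names, dist):
        values = " ".join(f"{value:.4f}" for value in row)
        lines.append(f"{name:<10} {values}")
    return "\n".join(lines) + "\n"


text = format_phylip(
    ["s1", "s2", "s3"],
    [
        [0.0, 0.1, 0.2],
        [0.1, 0.0, 0.3],
        [0.2, 0.3, 0.0],
    ],
)
print(text)
```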

exception pyckmeans.distance.IncompatibleNamesError

Bases: Exception

exception pyckmeans.distance.InvalidDistanceTypeError

Bases: Exception

Raised if an invalid distance_type is passed.

pyckmeans.distance.alignment_distance(alignment: pyckmeans.io.nucleotide_alignment.NucleotideAlignment, distance_type: str = 'p', pairwise_deletion: bool = True) -> pyckmeans.distance.DistanceMatrix

Calculate genetic distance based on a nucleotide alignment.

Parameters
alignment : pyckmeans.io.NucleotideAlignment

Nucleotide alignment.

distance_type : str, optional

Type of genetic distance to calculate, by default ‘p’. Available distance types are p-distances (‘p’), Jukes-Cantor distances (‘jc’), and Kimura 2-parameter distances (‘k2p’).

pairwise_deletion : bool, optional

Use pairwise deletion to deal with missing data, by default True. If False, complete deletion is applied. Gaps (“-”, “~”, “ ”), “?”, and ambiguous bases are treated as missing data.

Returns
DistanceMatrix

n*n distance matrix.

Raises
InvalidDistanceTypeError

Raised if invalid distance_type is passed.
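The difference between the two missing-data strategies can be sketched as follows. Under complete deletion, any site with missing data in at least one sequence is dropped from the whole alignment before distances are computed, whereas pairwise deletion only skips a site for the pairs it affects. The helper `complete_deletion` is hypothetical and operates on plain strings, treating every non-ACGT character as missing; it is not how pyckmeans represents alignments internally.

```python
VALID = set("ACGT")


def complete_deletion(seqs):
    """Drop every alignment column that has missing data in any sequence.

    Hypothetical helper for illustration; not part of pyckmeans.
    """
    n_sites = len(seqs[0])
    keep = [k for k in range(n_sites) if all(s[k] in VALID for s in seqs)]
    return ["".join(s[k] for k in keep) for s in seqs]


alignment = ["ACGT", "A-GA", "ACNT"]

# Columns 1 and 2 contain missing data, so only columns 0 and 3 survive.
print(complete_deletion(alignment))
```

With pairwise deletion, the first and third sequences would still be compared over three sites (columns 0, 1, and 3), so the two strategies can yield different distances on the same alignment.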