pyckmeans.distance package
Submodules
pyckmeans.distance.c_interop module
- pyckmeans.distance.c_interop.jc_distance(alignment: numpy.ndarray, pairwise_deletion: bool = True) → numpy.ndarray
Calculate Jukes-Cantor distance for a nucleotide alignment.
- Parameters
- alignment : numpy.ndarray
n*m numpy alignment, where n is the number of entries and m is the number of sites. Bases must be encoded in the format of pyckmeans.io.NucleotideAlignment.
- pairwise_deletion : bool, optional
Calculate distances with pairwise deletion in case of missing data, by default True.
- Returns
- numpy.ndarray
n*n distance matrix.
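As a rough illustration of the correction behind Jukes-Cantor distances, the sketch below applies the standard formula d = -(3/4) ln(1 - (4/3) p) to a single p-distance. The helper name `jc_from_p` is hypothetical and not part of pyckmeans; the library's compiled implementation may treat edge cases such as saturation differently.

```python
import math


def jc_from_p(p: float) -> float:
    """Jukes-Cantor correction of a p-distance (illustrative, not pyckmeans code)."""
    if p >= 0.75:
        # The logarithm is undefined once p reaches 3/4 (saturation).
        return math.nan
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# The corrected distance is slightly larger than the raw proportion of differences.
print(jc_from_p(0.1))
```

For small p the corrected value stays close to p; the gap widens as sequences diverge, which is the reason for applying the correction.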
- pyckmeans.distance.c_interop.k2p_distance(alignment: numpy.ndarray, pairwise_deletion: bool = True) → numpy.ndarray
Calculate Kimura 2-parameter distance for a nucleotide alignment.
- Parameters
- alignment : numpy.ndarray
n*m numpy alignment, where n is the number of entries and m is the number of sites. Bases must be encoded in the format of pyckmeans.io.NucleotideAlignment.
- pairwise_deletion : bool, optional
Calculate distances with pairwise deletion in case of missing data, by default True.
- Returns
- numpy.ndarray
n*n distance matrix.
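The Kimura 2-parameter model distinguishes transitions (A<->G, C<->T) from transversions. A minimal sketch of the textbook formula d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P and Q are the proportions of transitions and transversions, is given below for a single pair of sequences. It works on plain strings rather than the encoded arrays this module expects, and the function name `k2p_sketch` is hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_sketch(seq_a: str, seq_b: str) -> float:
    """Kimura 2-parameter distance for two aligned strings (illustrative only)."""
    n_valid = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Skip sites where either base is not A, C, G, or T (pairwise deletion).
        if a not in "ACGT" or b not in "ACGT":
            continue
        n_valid += 1
        if a == b:
            continue
        # A substitution within purines or within pyrimidines is a transition.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if n_valid == 0:
        return math.nan
    p = transitions / n_valid
    q = transversions / n_valid
    w1 = 1.0 - 2.0 * p - q
    w2 = 1.0 - 2.0 * q
    if w1 <= 0.0 or w2 <= 0.0:
        # The distance is undefined for saturated sequence pairs.
        return math.nan
    return -0.5 * math.log(w1) - 0.25 * math.log(w2)


# 20 sites with two transitions (A->G) and one transversion (C->A).
seq_a = "A" * 10 + "C" * 10
seq_b = "GG" + "A" * 8 + "A" + "C" * 9
print(k2p_sketch(seq_a, seq_b))
```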
- pyckmeans.distance.c_interop.p_distance(alignment: numpy.ndarray, pairwise_deletion: bool = True) → numpy.ndarray
Calculate p-distance for a nucleotide alignment.
- Parameters
- alignment : numpy.ndarray
n*m numpy alignment, where n is the number of entries and m is the number of sites. Bases must be encoded in the format of pyckmeans.io.NucleotideAlignment.
- pairwise_deletion : bool, optional
Calculate distances with pairwise deletion in case of missing data, by default True.
- Returns
- numpy.ndarray
n*n distance matrix.
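To make the effect of pairwise_deletion concrete, here is a pure-NumPy sketch of a p-distance computation that supports both pairwise and complete deletion. The integer encoding (0 to 3 for bases, -1 for missing data) and the name `p_distance_sketch` are assumptions made for this example; they are not the encoding of pyckmeans.io.NucleotideAlignment, and the compiled implementation in this module is not reproduced here.

```python
import numpy as np

MISSING = -1  # hypothetical code for gaps and ambiguous bases


def p_distance_sketch(aln, pairwise_deletion=True):
    """Proportion of differing sites for every pair of rows (illustrative only)."""
    aln = np.asarray(aln)
    n = aln.shape[0]
    if not pairwise_deletion:
        # Complete deletion: drop every site that is missing in any sequence.
        keep = (aln != MISSING).all(axis=0)
        aln = aln[:, keep]
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            # Pairwise deletion: only sites present in both sequences count.
            valid = (aln[i] != MISSING) & (aln[j] != MISSING)
            n_valid = valid.sum()
            if n_valid == 0:
                d = np.nan
            else:
                d = ((aln[i] != aln[j]) & valid).sum() / n_valid
            dist[i, j] = dist[j, i] = d
    return dist


aln = np.array([
    [0, 1, 2, 3, 0],
    [0, 1, 2, 2, MISSING],
    [1, 1, MISSING, 3, 0],
])
pairwise = p_distance_sketch(aln, pairwise_deletion=True)
complete = p_distance_sketch(aln, pairwise_deletion=False)
print(pairwise)
print(complete)
```

With pairwise deletion each pair uses as many sites as it can, so the denominators differ between pairs; with complete deletion all pairs are compared over the same three fully observed sites.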
Module contents
distance
Module for distance calculations.
- class pyckmeans.distance.DistanceMatrix(dist_mat: numpy.ndarray, names: Optional[Iterable[str]] = None)
Bases: object
Distance matrix, optionally named.
- Parameters
- dist_mat : numpy.ndarray
n*n distance matrix.
- names : Optional[Iterable[str]]
Names of the entries, one per row and column of dist_mat, by default None.
- Raises
- IncompatibleNamesError
Raised if the dimensions of names and dist_mat are incompatible.
- Attributes
shape : Get matrix shape.
Methods
from_csv(file_path[, header, index_col, sep]) : Read distance matrix from CSV file.
from_phylip(file_path) : Read PHYLIP distance matrix.
to_csv(file_path[, force]) : Write DistanceMatrix object to CSV.
to_phylip(file_path[, force]) : Write distance matrix to file in PHYLIP matrix format.
- static from_csv(file_path: str, header: Optional[int] = 0, index_col: Optional[int] = 0, sep: str = ',', **kwargs) → pyckmeans.distance.DistanceMatrix
Read distance matrix from CSV file.
- Parameters
- file_path : str
Path to CSV file.
- header : Optional[int]
Row in the CSV file that contains the sample names. Passed to pandas.read_csv(). By default 0, meaning the first row.
- index_col : Optional[int]
Column used as the index. By default 0, meaning the first column is expected to contain sample names. Passed to pandas.read_csv().
- sep : str
Column separator, by default ‘,’. Passed to pandas.read_csv().
- **kwargs
Additional keyword arguments passed to pandas.read_csv().
- Returns
- pyckmeans.distance.DistanceMatrix
DistanceMatrix object.
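The default values header=0 and index_col=0 imply a CSV layout in which the first row and the first column both hold the sample names. The snippet below is only a sketch of that layout using pandas.read_csv() directly on an in-memory string; the sample names and values are made up, and how from_csv builds the DistanceMatrix from the parsed table is not shown here.

```python
import io

import pandas as pd

# Made-up example data: first row and first column carry the sample names.
csv_text = """,s1,s2,s3
s1,0.0,0.1,0.2
s2,0.1,0.0,0.3
s3,0.2,0.3,0.0
"""

df = pd.read_csv(io.StringIO(csv_text), header=0, index_col=0, sep=",")
names = list(df.columns)   # sample names taken from the header row
dist_mat = df.to_numpy()   # n*n array of distances

print(names)
print(dist_mat.shape)
```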
- static from_phylip(file_path: str) → pyckmeans.distance.DistanceMatrix
Read PHYLIP distance matrix.
- Returns
- DistanceMatrix
DistanceMatrix object.
- property shape: Tuple[int]
Get matrix shape.
- Returns
- Tuple[int]
Matrix shape.
- to_csv(file_path: str, force: bool = False)
Write DistanceMatrix object to CSV.
- Parameters
- file_path : str
CSV file path.
- force : bool, optional
Force overwrite if file_path already exists, by default False.
- to_phylip(file_path: str, force: bool = False)
Write distance matrix to file in PHYLIP matrix format.
- Parameters
- file_path : str
Output file path.
- force : bool, optional
Force overwrite if file exists, by default False.
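For readers unfamiliar with the format, the following sketch writes a square matrix in a relaxed PHYLIP layout: the number of entries on the first line, then one line per entry with its name followed by its distances. The function `write_phylip_sketch`, the number formatting, and the use of FileExistsError when force is False are assumptions for illustration; the exact output and the exception raised by to_phylip may differ.

```python
import os
import tempfile


def write_phylip_sketch(dist_mat, names, file_path, force=False):
    """Write a square distance matrix in relaxed PHYLIP layout (illustrative only)."""
    if os.path.exists(file_path) and not force:
        # Assumed behavior: refuse to overwrite unless force is set.
        raise FileExistsError(f"{file_path} already exists")
    with open(file_path, "w") as handle:
        handle.write(f"{len(names)}\n")
        for name, row in zip(names, dist_mat):
            values = " ".join(f"{d:.6f}" for d in row)
            handle.write(f"{name} {values}\n")


out_path = os.path.join(tempfile.mkdtemp(), "dist.phy")
write_phylip_sketch([[0.0, 0.5], [0.5, 0.0]], ["a", "b"], out_path)
with open(out_path) as handle:
    lines = handle.read().splitlines()
print(lines)
```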
- exception pyckmeans.distance.IncompatibleNamesError
Bases: Exception
- exception pyckmeans.distance.InvalidDistanceTypeError
Bases: Exception
Raised if an invalid distance_type is passed.
- pyckmeans.distance.alignment_distance(alignment: pyckmeans.io.nucleotide_alignment.NucleotideAlignment, distance_type: str = 'p', pairwise_deletion: bool = True) → pyckmeans.distance.DistanceMatrix
Calculate genetic distance based on a nucleotide alignment.
- Parameters
- alignment : pyckmeans.io.NucleotideAlignment
Nucleotide alignment.
- distance_type : str, optional
Type of genetic distance to calculate, by default ‘p’. Available distance types are p-distances (‘p’), Jukes-Cantor distances (‘jc’), and Kimura 2-parameter distances (‘k2p’).
- pairwise_deletion : bool, optional
Use pairwise deletion to deal with missing data, by default True. If False, complete deletion is applied. Gaps (“-”, “~”, “ ”), “?”, and ambiguous bases are treated as missing data.
- Returns
- DistanceMatrix
n*n distance matrix.
- Raises
- InvalidDistanceTypeError
Raised if invalid distance_type is passed.