compute_distance_matrix

datafold.pcfold.distance.compute_distance_matrix(X, Y=None, metric='euclidean', cut_off=None, k=None, backend='guess_optimal', **backend_kwargs)

Compute distance matrix with different settings and backends.

Parameters:
  • X (ndarray) – Point cloud of shape (n_samples_X, n_features_X).

  • Y (Optional[ndarray]) – Reference point cloud of shape (n_samples_Y, n_features_Y) for component-wise computation. If not given, then Y=X (pairwise computation).

  • metric (str) – Distance metric. Must be supported by the selected backend.

  • cut_off (Optional[float]) –

    Distances larger than cut_off are set to zero. The parameter controls the degree of sparsity in the distance matrix.

    Note

    The pseudo-metric “sqeuclidean” is handled differently: the cut-off must be stated as a Euclidean distance (not as a squared distance).

  • k (Optional[int]) – Minimum number of neighbors per point. Ignored if cut_off=np.inf, which indicates a dense distance matrix in which all pairwise distances are computed.

  • backend (Union[str, type[DistanceAlgorithm]]) – Backend to compute distance matrix.

  • **backend_kwargs – Keyword arguments passed to the selected backend.

Returns:

Distance matrix of shape (n_samples_X, n_samples_X) if Y=None, else of shape (n_samples_Y, n_samples_X).

Return type:

Union[numpy.ndarray, scipy.sparse.csr_matrix]
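
Example:

The following is a minimal, dependency-free sketch of the cut-off semantics described above. It is not datafold's implementation. The helper name `dense_distances` is hypothetical, it works on plain Python lists rather than an ndarray or csr_matrix, and storing cut distances as explicit zeros in a dense nested list is an assumption made for illustration only. It shows the result shape (one row per point in Y, one column per point in X) and that the cut-off for “sqeuclidean” is compared in Euclidean units.

```python
import math


def dense_distances(X, Y=None, metric="euclidean", cut_off=None):
    """Hypothetical helper (not part of datafold).

    Returns a nested list with one row per point in Y and one column
    per point in X.
    """
    if Y is None:
        Y = X  # pairwise computation
    if metric not in ("euclidean", "sqeuclidean"):
        raise ValueError(f"unsupported metric: {metric}")

    rows = []
    for y in Y:
        row = []
        for x in X:
            d = math.dist(y, x)  # Euclidean distance between the two points
            # The cut-off is compared in Euclidean units, also for "sqeuclidean".
            if cut_off is not None and d > cut_off:
                row.append(0.0)  # distances beyond the cut-off become zero
            elif metric == "sqeuclidean":
                row.append(d * d)
            else:
                row.append(d)
        rows.append(row)
    return rows


# Three collinear points with Euclidean distances 5 (neighbours) and 10 (ends).
X = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]

# cut_off=6.0 keeps the squared distance 25 (Euclidean 5 <= 6)
# and drops the pair at Euclidean distance 10.
D = dense_distances(X, metric="sqeuclidean", cut_off=6.0)

# A single reference point gives a result with one row and three columns.
E = dense_distances(X, Y=[(0.0, 0.0)])
```

If the cut-off were interpreted as a squared distance instead, `cut_off=6.0` would also discard the neighbouring pairs, since their squared distance of 25 exceeds 6.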