campa.tl.Cluster
- class Cluster(config, cluster_mpp=None, save_config=False)
Cluster data.
Contains functions to create a (subsampled) MPPData for clustering, to cluster it, and to project the clustering to other MPPData objects.
Cluster is initialised from a cluster config dictionary with the following keys:
- data_config: name of the data config to use, should be registered in campa.ini
- data_dirs: where to read data from (relative to DATA_DIR defined in data config)
- process_like_dataset: name of dataset that gives parameters for processing (except subsampling/subsetting)
- subsample: (bool) subsampling of pixels
- subsample_kwargs: kwargs for MPPData.subsample() defining the fraction of pixels to be sampled
- subset: (bool) subset to objects with certain metadata
- subset_kwargs: kwargs to MPPData.subset() defining which objects to subset to
- seed: random seed to make subsampling reproducible
- cluster_data_dir: name of the dir containing the mpp_data that is clustered, relative to EXPERIMENT_DIR
- cluster_name: name of the cluster assignment file
- cluster_rep: representation that should be clustered (name of existing file, should be predicted with Predictor.get_representation())
- cluster_method: leiden or kmeans (kmeans not tested)
- leiden_resolution: resolution parameter for leiden clustering
- kmeans_n: number of clusters for kmeans
- umap: (bool) predict UMAP of cluster_rep
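As an illustration of the keys listed above, a cluster config could be written as a plain Python dictionary. This is only a sketch: every value is a placeholder, and the `frac` entry in `subsample_kwargs` is an assumed argument name rather than one confirmed on this page. The final check only verifies that all documented keys are present.

```python
# Hypothetical cluster config; all values below are placeholders.
cluster_config = {
    "data_config": "ExampleData",            # hypothetical name; must be registered in campa.ini
    "data_dirs": ["plate1/well_A"],          # hypothetical paths, relative to DATA_DIR
    "process_like_dataset": "example_dataset",
    "subsample": True,
    "subsample_kwargs": {"frac": 0.1},       # assumed kwarg name for MPPData.subsample()
    "subset": False,
    "subset_kwargs": {},
    "seed": 42,
    "cluster_data_dir": "example_experiment/aggregated/sub-0.1",  # hypothetical, relative to EXPERIMENT_DIR
    "cluster_name": "clustering",
    "cluster_rep": "latent",                 # hypothetical name of an existing representation file
    "cluster_method": "leiden",              # "leiden" or "kmeans" (kmeans not tested)
    "leiden_resolution": 0.8,
    "kmeans_n": 20,
    "umap": True,
}

# Keys documented for the cluster config dictionary.
EXPECTED_KEYS = {
    "data_config", "data_dirs", "process_like_dataset",
    "subsample", "subsample_kwargs", "subset", "subset_kwargs", "seed",
    "cluster_data_dir", "cluster_name", "cluster_rep", "cluster_method",
    "leiden_resolution", "kmeans_n", "umap",
}

# Report any documented key that the config forgot to set.
missing_keys = EXPECTED_KEYS - cluster_config.keys()
if missing_keys:
    raise KeyError(f"cluster config is missing keys: {sorted(missing_keys)}")

# With campa installed, this dictionary would be passed as the `config`
# argument of Cluster(config, cluster_mpp=None, save_config=False).
```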
- Parameters
Attributes
- cluster_annotation: Cluster annotation pd.DataFrame, read from {cluster_name}_annotation.csv.
- cluster_mpp: MPPData that is used for clustering.
- config: Cluster config.
Methods
- add_cluster_annotation(annotation, to_col[, ...]): Add annotation and colormap to clustering.
- add_cluster_colors(colors[, from_col]): Add colours to clustering or to annotation.
- add_umap(): Calculate the UMAP if it does not yet exist but should be calculated.
- Use cluster parameters to create and save Cluster.cluster_mpp to use for clustering.
- Cluster Cluster.cluster_mpp using cluster_method defined in Cluster.config.
- from_cluster_data_dir(data_dir): Initialise from existing cluster_data_dir.
- from_exp(exp[, cluster_config, data_dir]): Initialise from experiment for clustering of entire data that went into creating training data.
- from_exp_split(exp): Initialise from experiment for clustering of val/test split.
- get_hpa_localisation([cluster_name, thresh, ...]): Query subcellular localisation for each cluster from Human Protein Atlas (https://www.proteinatlas.org).
- get_nndescent_index([recreate]): Calculate and return pynndescent index of existing clustering for fast prediction of new data.
- predict_cluster_imgs(exp): Predict cluster images from experiment.
- predict_cluster_rep(exp): Use experiment to predict the necessary cluster representation.
- project_clustering(mpp_data[, save_dir, ...]): Project already computed clustering from Cluster.cluster_mpp to mpp_data.
- set_cluster_name(cluster_name): Change the cluster name and reload cluster_mpp and cluster_annotation.
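The idea behind projecting an existing clustering to new data can be sketched with a nearest-neighbour lookup: each new data point receives the cluster label of its closest already-clustered point. The sketch below is not campa's implementation. It swaps the pynndescent index for a brute-force search in pure Python, and the function name `project_labels` and all data are hypothetical.

```python
from math import dist


def project_labels(reference_points, reference_labels, new_points):
    """Give each new point the label of its nearest reference point.

    Brute-force stand-in for an approximate nearest-neighbour index;
    only suitable for small illustrative inputs.
    """
    projected = []
    for point in new_points:
        # Index of the reference point with the smallest Euclidean distance.
        nearest = min(
            range(len(reference_points)),
            key=lambda i: dist(reference_points[i], point),
        )
        projected.append(reference_labels[nearest])
    return projected


# Hypothetical clustered representation: four points in two clusters.
reference_points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9)]
reference_labels = ["0", "0", "1", "1"]

# Hypothetical new data to which the clustering is projected.
new_points = [(0.0, 0.1), (4.8, 5.1)]
projected = project_labels(reference_points, reference_labels, new_points)
print(projected)  # ['0', '1']
```

A brute-force search scales with the product of both dataset sizes, which is why a precomputed index is the practical choice for pixel-level data.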