get_hpa_localisation

Cluster.get_hpa_localisation(cluster_name='clustering_res0.5', thresh=1, max_num_channels=3, limit_to_groups=None, **kwargs)

Query the subcellular localisation for each cluster from the Human Protein Atlas (https://www.proteinatlas.org).

Calculates cluster loadings and returns the subcellular localisations of the channels that are enriched in each cluster. Requires a “hpa_gene_name” column in the channel_metadata.csv file in DATA_DIR to map channel names to the genes available in HPA.

Parameters
  • cluster_name (str) – Clustering to calculate localisations for. Must exist already.

  • thresh (float) – Minimum z-scored intensity value a channel must have in a cluster to be considered for the HPA query. thresh=0 considers all enriched channels of the cluster.

  • max_num_channels (int) – Maximum number of channels to be considered for the HPA query. The channels with the highest z-scored intensity values are used. If None, all channels passing thresh are used.

  • limit_to_groups (Mapping[str, str | list[str]] | None) – Dict with obs as keys and groups from obs as values, used to subset the data before calculating loadings.

  • kwargs (Any) – Keyword arguments for campa.tl.query_hpa_subcellular_location().
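
The interplay of thresh and max_num_channels can be sketched in plain Python. This is only an illustration of the selection rule described above, not CAMPA's implementation: the helper select_channels, the channel names, and the choice of an inclusive comparison (>=) against thresh are all assumptions made for the example.

```python
from typing import List, Mapping, Optional


def select_channels(
    loadings: Mapping[str, float],
    thresh: float = 1.0,
    max_num_channels: Optional[int] = 3,
) -> List[str]:
    """Pick channels for one cluster from its z-scored loadings.

    Hypothetical helper for illustration; not part of CAMPA.
    """
    # Keep channels whose z-scored intensity reaches the threshold
    # (inclusive comparison is an assumption of this sketch).
    passing = {ch: z for ch, z in loadings.items() if z >= thresh}
    # Rank the remaining channels from highest to lowest z-score.
    ranked = sorted(passing, key=lambda ch: passing[ch], reverse=True)
    # None means "no cap": return every channel that passed the threshold.
    if max_num_channels is None:
        return ranked
    return ranked[:max_num_channels]


# Made-up z-scored loadings of five placeholder channels in one cluster.
example_loadings = {"ch_a": 2.5, "ch_b": 1.2, "ch_c": 0.4, "ch_d": 1.8, "ch_e": 3.1}

top_three = select_channels(example_loadings)
all_passing = select_channels(example_loadings, max_num_channels=None)
```

With the defaults, only the three highest-scoring channels that pass thresh are kept; setting max_num_channels=None keeps every channel that passes.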

Returns

Results dictionary with clusters as keys and the return value of campa.tl.query_hpa_subcellular_location() as values.

Return type

Mapping[str, Mapping[str, Any]]
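
Because the return type is a nested mapping, the result can be inspected generically without knowing the inner keys in advance. The sketch below uses a made-up results dictionary; the cluster names and the inner keys are placeholders, since the actual fields are determined by campa.tl.query_hpa_subcellular_location() and are not listed here.

```python
from typing import Any, Dict, List, Mapping


def list_result_fields(results: Mapping[str, Mapping[str, Any]]) -> Dict[str, List[str]]:
    """Return the sorted inner keys available for each cluster.

    Hypothetical helper for illustration; not part of CAMPA.
    """
    return {cluster: sorted(per_cluster) for cluster, per_cluster in results.items()}


# Placeholder data with the documented shape Mapping[str, Mapping[str, Any]].
# The inner keys are invented for this example.
mock_results = {
    "cluster_1": {"placeholder_field_b": [1, 2], "placeholder_field_a": "x"},
    "cluster_2": {},
}

fields = list_result_fields(mock_results)
for cluster, names in fields.items():
    print(cluster, names)
```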