subset

MPPData.subset(frac=None, num=None, obj_ids=None, nona_condition=False, copy=False, **kwargs)

Object-level subsetting of MPPData.

Several filters for subsetting can be defined:

  • subsetting to a random fraction or number of objects

  • filtering by object ids

  • filtering by values in MPPData.metadata

  • filtering by condition values

Metadata filters are passed as keyword arguments: objects are restricted to those with the specified value(s) for the given key in the metadata table.
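The metadata filtering described above can be illustrated with a small, self-contained pandas sketch. It is not the library's implementation: the helper `metadata_mask`, the example columns, and the spelling of the token as the string `"NO_NAN"` are assumptions made for illustration only.

```python
import pandas as pd

NO_NAN = "NO_NAN"  # assumed spelling of the special token


def metadata_mask(metadata: pd.DataFrame, **kwargs) -> pd.Series:
    """Hypothetical helper: boolean mask of rows (objects) passing every filter."""
    mask = pd.Series(True, index=metadata.index)
    for key, value in kwargs.items():
        if isinstance(value, str) and value == NO_NAN:
            # Keep every row whose entry in this column is not NaN.
            mask &= metadata[key].notna()
        else:
            # A single string is treated as a one-element list of allowed entries.
            allowed = [value] if isinstance(value, str) else list(value)
            mask &= metadata[key].isin(allowed)
    return mask


# Toy metadata table with one row per object (made-up values).
metadata = pd.DataFrame(
    {
        "obj_id": [1, 2, 3, 4],
        "well": ["A1", "A1", "B2", "C3"],
        "treatment": ["ctrl", None, "drug", "ctrl"],
    }
)

selected = metadata[metadata_mask(metadata, well=["A1", "B2"], treatment=NO_NAN)]
print(selected["obj_id"].tolist())  # [1, 3]
```

Object 2 is dropped because its treatment is missing, and object 4 because its well is not among the allowed entries.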

Parameters
  • frac (Optional[float]) – Fraction of objects to randomly subsample. Applied after all other filters.

  • num (Optional[int]) – Number of objects to randomly subsample. Ignored if frac is given, since frac takes precedence. Applied after all other filters.

  • obj_ids (Union[ndarray, List[int], None]) – Object ids to subset to.

  • nona_condition (bool) – If True, all entries with NaN conditions are filtered out. Because of the way conditions are created, this makes it possible to, for example, keep only entries whose values in the specified column fall in the low and high quantiles and filter out everything else.

  • copy (bool) – If True, return a new MPPData object; otherwise modify the object in place.

  • kwargs (Any) – Keys are column names in the metadata table. Values (str or list of str) are the allowed entries for that key in the metadata table for selected objects. NO_NAN is a special token that selects all values except NaN.
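The ordering rules for frac, num, and obj_ids can be sketched as follows. This is a minimal NumPy illustration rather than the library's code: the helper `choose_objects`, its `seed` argument, and the truncation of `frac * n` to an integer are assumptions.

```python
import numpy as np


def choose_objects(all_ids, obj_ids=None, frac=None, num=None, seed=0):
    """Hypothetical helper: filter by id first, then randomly subsample."""
    ids = np.asarray(all_ids)
    if obj_ids is not None:
        # Deterministic filters are applied before any random subsampling.
        ids = ids[np.isin(ids, obj_ids)]
    if frac is not None:
        # frac takes precedence: it overrides num when both are given.
        num = int(frac * len(ids))  # rounding behaviour is an assumption
    if num is not None:
        rng = np.random.default_rng(seed)
        ids = rng.choice(ids, size=min(num, len(ids)), replace=False)
    return np.sort(ids)


# 100 objects, restricted to ids 0..49, then 20% of the remainder is sampled.
kept = choose_objects(range(100), obj_ids=range(50), frac=0.2, num=40)
print(len(kept))  # 10
```

Because the random subsample is drawn after the id filter, frac=0.2 yields 20% of the 50 remaining objects rather than 20% of all 100, and num=40 has no effect.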

Return type

Subsetted MPPData