campa.data.create_dataset

create_dataset(params)

Create an NNDataset.

The parameters determine how the data is selected and processed. The following keys are expected in params:

  • dataset_name: name of the resulting dataset that is defined by these parameters (relative to DATA_DIR/datasets)

  • data_config: name of data configuration (registered in campa.ini)

  • data_dirs: where to read data from (relative to DATA_DIR defined in data config)

  • channels: list of channel names to include in this dataset

  • condition: list of conditions, each of which should be defined in the data config. The suffix _one_hot converts the condition into a one-hot encoded vector. Conditions are concatenated, except when they are given as a list of lists; in that case the condition is defined as a pairwise combination of the conditions.

  • condition_kwargs: kwargs to MPPData.add_conditions()

  • split_kwargs: kwargs to MPPData.train_val_test_split()

  • test_img_size: standard size of images in the test set. Images are padded/truncated to this size

  • subset: (bool) subset to objects with certain metadata.

  • subset_kwargs: kwargs to MPPData.subset() defining which objects to subset to

  • subsample: (bool) subsampling of pixels (only for train/val)

  • subsample_kwargs: kwargs for MPPData.subsample() defining the fraction of pixels to be sampled

  • neighborhood: (bool) add local neighbourhood to samples in NNDataset

  • neighborhood_size: size of neighbourhood

  • normalise: (bool) Intensity normalisation

  • normalise_kwargs: kwargs to MPPData.normalise()

  • seed: random seed to make subsampling reproducible
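The keys above could be assembled into a parameter dict along the following lines. This is a minimal sketch, not a tested configuration: every value (dataset name, data config name, directories, channel names, condition name, and the contents of all *_kwargs dicts) is a hypothetical placeholder, and missing_keys is an illustrative helper that is not part of campa. The keyword names inside the *_kwargs dicts are assumptions and should be checked against the corresponding MPPData methods.

```python
from typing import Any, Mapping

# Keys that the list above says create_dataset expects in `params`.
EXPECTED_KEYS = {
    "dataset_name", "data_config", "data_dirs", "channels", "condition",
    "condition_kwargs", "split_kwargs", "test_img_size", "subset",
    "subset_kwargs", "subsample", "subsample_kwargs", "neighborhood",
    "neighborhood_size", "normalise", "normalise_kwargs", "seed",
}


def missing_keys(params: Mapping[str, Any]) -> set:
    """Hypothetical helper (not part of campa): expected keys absent from params."""
    return EXPECTED_KEYS - set(params)


# All values below are placeholders chosen for illustration only.
params = {
    "dataset_name": "example_dataset",        # relative to DATA_DIR/datasets
    "data_config": "ExampleData",             # must be registered in campa.ini
    "data_dirs": ["plate1/well_A", "plate1/well_B"],
    "channels": ["channel_1", "channel_2"],
    "condition": ["treatment_one_hot"],       # _one_hot suffix: one-hot encode
    "condition_kwargs": {},                   # passed to MPPData.add_conditions()
    "split_kwargs": {"train_frac": 0.7, "val_frac": 0.2},   # assumed kwarg names
    "test_img_size": 225,
    "subset": False,
    "subset_kwargs": {},                      # passed to MPPData.subset()
    "subsample": True,
    "subsample_kwargs": {"frac": 0.01},       # assumed kwarg name
    "neighborhood": True,
    "neighborhood_size": 3,
    "normalise": True,
    "normalise_kwargs": {},                   # passed to MPPData.normalise()
    "seed": 42,
}

# Check that no expected key was forgotten before handing the dict over.
assert not missing_keys(params)

# With campa installed and the data config in place, the call would be:
# from campa.data import create_dataset
# create_dataset(params)
```

Checking for missing keys before the call is only a convenience for catching typos early; the documentation above does not state how create_dataset itself reacts to an incomplete dict.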

Parameters

params (Mapping[str, Any]) – parameter dict

Return type

None