avalanche.benchmarks.generators.nc_benchmark

avalanche.benchmarks.generators.nc_benchmark(train_dataset: Union[Sequence[Union[IDatasetWithTargets, ITensorDataset, Subset, ConcatDataset, ClassificationDataset]], IDatasetWithTargets, ITensorDataset, Subset, ConcatDataset, ClassificationDataset], test_dataset: Union[Sequence[Union[IDatasetWithTargets, ITensorDataset, Subset, ConcatDataset, ClassificationDataset]], IDatasetWithTargets, ITensorDataset, Subset, ConcatDataset, ClassificationDataset], n_experiences: int, task_labels: bool, *, shuffle: bool = True, seed: Optional[int] = None, fixed_class_order: Optional[Sequence[int]] = None, per_exp_classes: Optional[Dict[int, int]] = None, class_ids_from_zero_from_first_exp: bool = False, class_ids_from_zero_in_each_exp: bool = False, one_dataset_per_exp: bool = False, train_transform=None, eval_transform=None, reproducibility_data: Optional[Dict[str, Any]] = None) → NCScenario

This is the high-level benchmark generator for the “New Classes” (NC) case. Given a sequence of train and test datasets, it creates the continual stream of data as a series of experiences. Each experience will contain all the instances belonging to a certain set of classes, and a class won’t be assigned to more than one experience.

This is the reference helper function for creating instances of Class- or Task-Incremental benchmarks.

The task_labels parameter determines whether each incremental experience gets an increasing task label or whether, on the contrary, a default task label 0 is assigned to all experiences. This is useful for differentiating between Single-Incremental-Task and Multi-Task scenarios.

There are other important parameters that can be specified in order to tweak the behaviour of the resulting benchmark. Please take a few minutes to read and understand them as they may save you a lot of work.

This generator features an integrated reproducibility mechanism that allows the user to store and later re-load a benchmark. For more info see the reproducibility_data parameter.
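For example, the following minimal sketch builds a class-incremental benchmark by splitting a torchvision dataset into 5 experiences of 2 classes each (the dataset, transforms, seed and split below are illustrative choices, not requirements of the generator):

    from torchvision.datasets import MNIST
    from torchvision.transforms import ToTensor

    from avalanche.benchmarks.generators import nc_benchmark

    # Plain torchvision datasets can be passed directly (they expose `targets`).
    train_set = MNIST("./data", train=True, download=True)
    test_set = MNIST("./data", train=False, download=True)

    # 10 classes split into 5 experiences of 2 classes each. task_labels=False
    # assigns the default task label 0 to every experience (class-incremental).
    benchmark = nc_benchmark(
        train_set,
        test_set,
        n_experiences=5,
        task_labels=False,
        seed=1234,
        train_transform=ToTensor(),
        eval_transform=ToTensor(),
    )

    for experience in benchmark.train_stream:
        print(
            "Experience", experience.current_experience,
            "-> classes", experience.classes_in_this_experience,
        )

Passing task_labels=True instead would give each experience its own ascending task label, turning the same split into a Multi-Task benchmark. Note that, since every class is assigned to exactly one experience, the number of classes must be divisible by n_experiences unless per_exp_classes is used.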

Parameters
  • train_dataset – A list of training datasets, or a single dataset.

  • test_dataset – A list of test datasets, or a single test dataset.

  • n_experiences – The number of incremental experiences. This is not used when multiple train/test datasets are passed with the one_dataset_per_exp parameter set to True.

  • task_labels – If True, each experience will have an ascending task label. If False, the task label will be 0 for all the experiences.

  • shuffle – If True, the class (or experience) order will be shuffled. Defaults to True.

  • seed – If shuffle is True and seed is not None, the class (or experience) order will be shuffled according to the seed. When None, the current PyTorch random number generator state will be used. Defaults to None.

  • fixed_class_order – If not None, the class order to use (overrides the shuffle argument). Very useful for enhancing reproducibility. Defaults to None.

  • per_exp_classes – If not None, a dictionary whose keys are (0-indexed) experience IDs and whose values are the number of classes to include in the respective experiences. The dictionary doesn’t have to contain a key for each experience! All the remaining experiences will contain an equal amount of the remaining classes. The remaining number of classes must be divisible without remainder by the remaining number of experiences. For instance, to include 50 classes in the first experience while equally distributing the remaining classes across the remaining experiences, just pass the “{0: 50}” dictionary as the per_exp_classes parameter (a sketch using this option follows the parameter list). Defaults to None.

  • class_ids_from_zero_from_first_exp – If True, original class IDs will be remapped so that they will appear as having an ascending order. For instance, if the resulting class order after shuffling (or defined by fixed_class_order) is [23, 34, 11, 7, 6, …] and class_ids_from_zero_from_first_exp is True, then all the patterns belonging to class 23 will appear as belonging to class “0”, class “34” will be mapped to “1”, class “11” to “2” and so on. This is very useful when drawing confusion matrices and when dealing with algorithms with dynamic head expansion. Defaults to False. Mutually exclusive with the class_ids_from_zero_in_each_exp parameter.

  • class_ids_from_zero_in_each_exp – If True, original class IDs will be mapped to range [0, n_classes_in_exp) for each experience. Defaults to False. Mutually exclusive with the class_ids_from_zero_from_first_exp parameter.

  • one_dataset_per_exp – Available only when multiple train/test datasets are provided. If True, each dataset will be treated as an experience. Mutually exclusive with the per_exp_classes and fixed_class_order parameters. Overrides the n_experiences parameter. Defaults to False.

  • train_transform – The transformation to apply to the training data, e.g. a random crop, a normalization or a concatenation of different transformations (see the torchvision.transforms documentation for a comprehensive list of possible transformations). Defaults to None.

  • eval_transform – The transformation to apply to the test data, e.g. a random crop, a normalization or a concatenation of different transformations (see the torchvision.transforms documentation for a comprehensive list of possible transformations). Defaults to None.

  • reproducibility_data – If not None, overrides all the other benchmark definition options. This is usually a dictionary containing data used to reproduce a specific experiment. One can use the get_reproducibility_data method to get (and even distribute) the experiment setup so that it can be loaded by passing it as this parameter. In this way one can be sure that the same specific experimental setup is being used (for reproducibility purposes). Beware that, in order to reproduce an experiment, the same train and test datasets must be used. A usage sketch is given after the Returns section below. Defaults to None.
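As a concrete illustration of per_exp_classes, fixed_class_order and the class remapping flags, here is a sketch in the task-incremental setting (it reuses the train_set and test_set objects from the first sketch above; the specific numbers are arbitrary):

    from avalanche.benchmarks.generators import nc_benchmark

    benchmark = nc_benchmark(
        train_set,
        test_set,
        n_experiences=4,
        task_labels=True,
        # The first experience gets 4 classes; the remaining 6 classes are
        # split evenly across the remaining 3 experiences (2 classes each).
        per_exp_classes={0: 4},
        # Explicit class order (overrides the shuffle/seed arguments).
        fixed_class_order=[9, 8, 7, 6, 5, 4, 3, 2, 1, 0],
        # Within each experience, remap the original class IDs to the range
        # [0, n_classes_in_exp), as is common in multi-head setups.
        class_ids_from_zero_in_each_exp=True,
    )

    for experience in benchmark.train_stream:
        print(experience.task_label, experience.classes_in_this_experience)

With these settings the first experience holds the original classes [9, 8, 7, 6] and each later experience holds two of the remaining classes, with IDs remapped to start from zero inside every experience and an ascending task label per experience.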

Returns

A properly initialized NCScenario instance.
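
The reproducibility mechanism described for the reproducibility_data parameter can be used roughly as follows (a sketch reusing the benchmark, train_set and test_set objects from the examples above):

    # Export the exact setup (class order, class-to-experience assignment, ...)
    # of an existing benchmark so that it can be stored or shared.
    repro_data = benchmark.get_reproducibility_data()

    # Rebuild an identical benchmark later. The other definition options are
    # ignored when reproducibility_data is provided, but the very same train
    # and test datasets must be used.
    rebuilt_benchmark = nc_benchmark(
        train_set,
        test_set,
        n_experiences=4,
        task_labels=True,
        reproducibility_data=repro_data,
    )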