avalanche.benchmarks.generators.data_incremental_benchmark

avalanche.benchmarks.generators.data_incremental_benchmark(benchmark_instance: GenericCLScenario, experience_size: int, shuffle: bool = False, drop_last: bool = False, split_streams: Sequence[str] = ('train',), custom_split_strategy: Optional[Callable[[ClassificationExperience], Sequence[make_classification_dataset]]] = None, experience_factory: Optional[Callable[[ClassificationStream, int], ClassificationExperience]] = None)[source]

High-level benchmark generator for a Data Incremental setup.

This generator accepts an existing benchmark instance and returns a version of it in which experiences have been split in order to produce a Data Incremental stream.

In its base form, this generator splits train experiences into experiences of a fixed, configurable size. The split can also be performed on other streams (such as the test one) if needed.

The custom_split_strategy parameter can be used if a more specific splitting is required.

Beware that experience splitting is NOT executed lazily: the splitting process takes place immediately. When using a custom splitting strategy, consider optimizing the split process for speed.

Please note that each mini-experience will have a task labels field equal to that of the originating experience.

The complete_test_set_only field of the resulting benchmark instance will be True only if the same field of the original benchmark instance is True and the resulting test stream contains exactly one experience.
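The fixed-size splitting described above (optionally shuffle, then chunk, then optionally drop a short tail) can be sketched in plain Python. This is an illustrative re-implementation over a generic sequence of samples, not Avalanche's actual code: the real strategy operates on Avalanche datasets and uses the default PyTorch random number generator for shuffling.

```python
import random


def fixed_size_split(samples, experience_size, shuffle=False, drop_last=False):
    """Split a sequence of samples into chunks of ``experience_size``.

    Illustrative sketch of the default splitting behaviour: optional
    shuffling first, then fixed-size chunking, then optional dropping
    of an incomplete final chunk.
    """
    indices = list(range(len(samples)))
    if shuffle:
        random.shuffle(indices)  # stdlib RNG here; Avalanche uses torch's
    chunks = []
    for start in range(0, len(indices), experience_size):
        chunk = [samples[i] for i in indices[start:start + experience_size]]
        if drop_last and len(chunk) < experience_size:
            break  # incomplete tail: discarded when drop_last=True
        chunks.append(chunk)
    return chunks


# 10 samples split into experiences of size 4:
data = list(range(10))
print([len(c) for c in fixed_size_split(data, 4)])                  # [4, 4, 2]
print([len(c) for c in fixed_size_split(data, 4, drop_last=True)])  # [4, 4]
```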

Parameters
  • benchmark_instance – The benchmark to split.

  • experience_size – The size of each resulting experience, as an int. Ignored if custom_split_strategy is used.

  • shuffle – If True, experiences will be split by first shuffling instances in each experience. This will use the default PyTorch random number generator at its current state. Defaults to False. Ignored if custom_split_strategy is used.

  • drop_last – If True, the last experience will be dropped if it doesn’t contain experience_size instances. Defaults to False. Ignored if custom_split_strategy is used.

  • split_streams – The list of streams to split. By default only the “train” stream will be split.

  • custom_split_strategy – A function that implements a custom splitting strategy. The function must accept an experience and return a list of datasets, each describing an experience. Defaults to None, which means that the standard splitting strategy will be used (which creates experiences of size experience_size). A good starting point for understanding the mechanism is the implementation of the standard splitting function, fixed_size_experience_split_strategy().

  • experience_factory – The experience factory. Defaults to GenericExperience.

Returns

The Data Incremental benchmark instance.
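To illustrate the custom_split_strategy contract (a callable that accepts an experience and returns a sequence of datasets), here is a toy sketch. The DummyExperience class and the halving logic are made up for illustration; a real strategy would receive a ClassificationExperience and return Avalanche classification datasets obtained by subsetting experience.dataset.

```python
def halve_experience(experience):
    """Toy custom_split_strategy: split an experience's dataset into
    two (nearly) equal parts.

    A real strategy receives a ClassificationExperience and must return
    a sequence of datasets; here a plain list stands in for the dataset.
    """
    dataset = experience.dataset
    mid = (len(dataset) + 1) // 2  # first half gets the extra sample
    return [dataset[:mid], dataset[mid:]]


class DummyExperience:
    """Minimal stand-in for ClassificationExperience (illustration only)."""

    def __init__(self, dataset):
        self.dataset = dataset


parts = halve_experience(DummyExperience(list(range(7))))
print([len(p) for p in parts])  # [4, 3]
```

The returned datasets become the mini-experiences of the new stream, so the strategy fully controls how many experiences each original experience is split into.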