avalanche.benchmarks.utils.data_loader.ReplayDataLoader

class avalanche.benchmarks.utils.data_loader.ReplayDataLoader(data: AvalancheDataset, memory: AvalancheDataset | None = None, oversample_small_tasks: bool = False, batch_size: int = 32, batch_size_mem: int = 32, task_balanced_dataloader: bool = False, distributed_sampling: bool = True, **kwargs)[source]

Custom data loader for rehearsal/replay strategies.

__init__(data: AvalancheDataset, memory: AvalancheDataset | None = None, oversample_small_tasks: bool = False, batch_size: int = 32, batch_size_mem: int = 32, task_balanced_dataloader: bool = False, distributed_sampling: bool = True, **kwargs)[source]

Custom data loader for rehearsal strategies.

This data loader iterates in parallel over two datasets, the current data and the rehearsal memory, and builds each mini-batch by concatenating samples from both. Mini-batches are balanced using the task label, i.e. each mini-batch contains a balanced number of examples from all the tasks present in the data and in the memory.

The length of the loader is determined only by the current task data and is the same as it would be when creating a data loader for this dataset alone.

If oversample_small_tasks == True, smaller tasks are oversampled to match the largest one.
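
A minimal usage sketch (current_data and memory_data are placeholder AvalancheDataset instances, e.g. the dataset of the current experience and a replay buffer built from previous experiences; the usual (inputs, targets, task labels) mini-batch format is assumed):

from avalanche.benchmarks.utils.data_loader import ReplayDataLoader

# current_data and memory_data are assumed AvalancheDataset instances.
loader = ReplayDataLoader(
    current_data,
    memory=memory_data,
    batch_size=32,                # samples drawn from the current data
    batch_size_mem=32,            # samples drawn from the memory
    oversample_small_tasks=True,  # repeat smaller tasks to match the largest
)

for x, y, t in loader:
    # Each mini-batch concatenates current-data and memory samples;
    # t holds the task label of every example in the batch.
    ...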

Parameters:
  • data – the AvalancheDataset containing the current task data.

  • memory – the AvalancheDataset used as rehearsal memory. Defaults to None.

  • oversample_small_tasks – whether smaller tasks should be oversampled to match the largest one.

  • batch_size – the size of the data batch. It must be greater than or equal to the number of tasks.

  • batch_size_mem – the size of the memory batch. If task_balanced_dataloader is set to True, it must be greater than or equal to the number of tasks.

  • task_balanced_dataloader – if True, the buffer data loaders will be task-balanced; otherwise, a single data loader is created for all the buffer samples (see the sketch after this parameter list).

  • distributed_sampling – if True, apply the PyTorch DistributedSampler. Defaults to True. Note: the distributed sampler is not applied when not running distributed training, even if True is passed.

  • kwargs – data loader arguments used to instantiate the loader for each task separately. See the PyTorch DataLoader documentation.
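
As a hedged example of the task-balanced configuration (current_data and memory_data are placeholder AvalancheDataset instances, with memory_data assumed to contain samples from 4 tasks), batch_size_mem is split across the tasks in the buffer, and extra keyword arguments are forwarded to the underlying PyTorch DataLoader instances:

from avalanche.benchmarks.utils.data_loader import ReplayDataLoader

loader = ReplayDataLoader(
    current_data,
    memory=memory_data,             # assumed to hold samples from 4 tasks
    batch_size=32,
    batch_size_mem=8,               # >= number of tasks; roughly 8 // 4 = 2 per task
    task_balanced_dataloader=True,  # task-balanced sampling over the buffer
    num_workers=2,                  # forwarded to the inner DataLoaders
    pin_memory=True,                # forwarded as well
)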

Methods

__init__(data[, memory, ...])

Custom data loader for rehearsal strategies.