avalanche.benchmarks.utils.data_loader.GroupBalancedDataLoader

class avalanche.benchmarks.utils.data_loader.GroupBalancedDataLoader(datasets: Sequence[AvalancheDataset], oversample_small_groups: bool = False, batch_size: int = 32, distributed_sampling: bool = True, **kwargs)[source]

Data loader that balances data from multiple datasets.

__init__(datasets: Sequence[AvalancheDataset], oversample_small_groups: bool = False, batch_size: int = 32, distributed_sampling: bool = True, **kwargs)[source]

Data loader that balances data from multiple datasets.

Mini-batches emitted by this dataloader are created by collating together mini-batches from each group. It may be used to balance data among classes, experiences, tasks, and so on.

If oversample_small_groups == True smaller groups are oversampled to match the largest group. Otherwise, once data from a group is completely iterated, the group will be skipped.

Parameters:

datasets – an instance of AvalancheDataset.
oversample_small_groups – whether smaller groups should be oversampled to match the largest one.
batch_size – the size of the batch. It must be greater than or equal to the number of groups.
distributed_sampling – If True, apply the PyTorch DistributedSampler. Defaults to True. Note: the distributed sampler is not applied if not running a distributed training, even when True is passed.
kwargs – data loader arguments used to instantiate the loader for each group separately. See pytorch DataLoader.

Methods

__init__(datasets[, ...])

Data loader that balances data from multiple datasets.