avalanche.benchmarks.split_online_stream

avalanche.benchmarks.split_online_stream(original_stream: Iterable[DatasetExperience], experience_size: int, shuffle: bool = True, drop_last: bool = False, experience_split_strategy: Callable[[DatasetExperience[TCLDataset], int], Iterable[OnlineCLExperience[TCLDataset]]] | None = None, access_task_boundaries: bool = False, seed: int | None = None) → CLStream[DatasetExperience[TCLDataset]]

Split a stream of large batches to create an online stream of small mini-batches.

The resulting stream can be used for Online Continual Learning (OCL) scenarios (or data-incremental, or other online-based settings).

For efficiency reasons, the resulting stream is an iterator that generates experiences on demand.

Parameters:

original_stream – The stream with the original data.
experience_size – The size of each experience in the resulting stream, as an int. Ignored if experience_split_strategy is used.
shuffle – If True, experiences will be split by first shuffling instances in each experience. This will use the default PyTorch random number generator at its current state. Defaults to True. Ignored if experience_split_strategy is used.
drop_last – If True, the last experience is dropped when it contains fewer than experience_size instances. Defaults to False. Ignored if experience_split_strategy is used.
experience_split_strategy – A function that implements a custom splitting strategy. The function must accept an experience (and, per the signature, an int) and return an iterable of experiences. Defaults to None, which means that the standard splitting strategy will be used (which creates experiences of size experience_size). A good starting point for understanding the mechanism is the implementation of the standard splitting function fixed_size_experience_split().
seed – Random seed used for shuffling by the default splitter. Defaults to None.

Returns:

A lazy online stream with experiences of size experience_size.
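
The splitting behaviour described above can be illustrated with a minimal plain-Python sketch. It is not Avalanche's implementation: the names fixed_size_split_sketch and split_stream_sketch are hypothetical, experiences are modelled as plain lists of ints, and shuffling uses Python's random module rather than PyTorch. It only shows how experience_size, shuffle, drop_last and seed are assumed to interact, and how a generator keeps the resulting stream lazy.

```python
import random
from typing import Iterable, Iterator, List, Optional, Sequence


def fixed_size_split_sketch(
    samples: Sequence[int],
    experience_size: int,
    shuffle: bool = True,
    drop_last: bool = False,
    seed: Optional[int] = None,
) -> Iterator[List[int]]:
    """Yield chunks of at most `experience_size` items from one large experience."""
    indices = list(range(len(samples)))
    if shuffle:
        # Illustration only: the library shuffles with PyTorch, not `random`.
        random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), experience_size):
        chunk = indices[start:start + experience_size]
        if drop_last and len(chunk) < experience_size:
            return  # skip the incomplete trailing chunk
        yield [samples[i] for i in chunk]


def split_stream_sketch(
    stream: Iterable[Sequence[int]], experience_size: int, **kwargs
) -> Iterator[List[int]]:
    """Lazily chain the sub-experiences of every experience in `stream`."""
    for experience in stream:
        yield from fixed_size_split_sketch(experience, experience_size, **kwargs)


# Two "large" experiences of 5 samples each, split into mini-experiences of 2.
toy_stream = [list(range(0, 5)), list(range(10, 15))]
sizes = [len(e) for e in split_stream_sketch(toy_stream, 2, shuffle=False)]
sizes_dropped = [
    len(e) for e in split_stream_sketch(toy_stream, 2, shuffle=False, drop_last=True)
]
```

Because both functions are generators, no sub-experience is built until the consumer asks for it, which mirrors the on-demand behaviour of the returned stream. In this sketch, drop_last=True removes the undersized trailing chunk of each original experience.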