Training module
Training Templates
Templates define the training/eval loop for each setting (supervised CL, online CL, RL, …). Each template exposes a set of callbacks that plugins can use to execute code inside the training/eval loops.
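The template/plugin callback pattern can be sketched as follows. This is a minimal, illustrative skeleton: the class and hook names below are simplified stand-ins, not Avalanche's actual API.

```python
# Minimal sketch of the template/plugin callback pattern
# (illustrative only; names do not match Avalanche's real classes).

class Plugin:
    """A plugin overrides only the callbacks it cares about."""
    def before_training_exp(self, template): ...
    def after_training_exp(self, template): ...

class CountingPlugin(Plugin):
    """Example plugin: counts how many experiences were trained on."""
    def __init__(self):
        self.seen = 0
    def after_training_exp(self, template):
        self.seen += 1

class Template:
    """The training loop triggers each plugin's callbacks."""
    def __init__(self, plugins):
        self.plugins = plugins

    def train(self, experiences):
        for exp in experiences:
            for p in self.plugins:
                p.before_training_exp(self)
            # ... one training pass over `exp` would go here ...
            for p in self.plugins:
                p.after_training_exp(self)

counter = CountingPlugin()
Template([counter]).train(["exp0", "exp1", "exp2"])
print(counter.seen)  # 3
```

The real templates offer many more hooks (before/after each iteration, forward pass, backward pass, eval, …), but they are wired in exactly this way.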
Templates
Templates are defined in the avalanche.training.templates module.
- Base class for continual learning skeletons.
- Base SGD class for continual learning skeletons.
- Base class for continual learning strategies.
Plugin ABCs
ABCs for plugins are available in avalanche.core.
- ABC for BaseTemplate plugins.
- ABC for BaseSGDTemplate plugins.
- ABC for SupervisedTemplate plugins.
Training Strategies
Ready-to-use continual learning strategies.
- Cumulative training strategy.
- Joint training on the entire stream.
- Naive finetuning.
- AR1 with Latent Replay.
- Deep Streaming Linear Discriminant Analysis.
- iCaRL strategy.
- Progressive Neural Network strategy.
- CWR* strategy.
- Experience replay strategy.
- Experience replay strategy.
- GDumb strategy.
- Learning without Forgetting (LwF) strategy.
- Average Gradient Episodic Memory (A-GEM) strategy.
- Gradient Episodic Memory (GEM) strategy.
- Elastic Weight Consolidation (EWC) strategy.
- Synaptic Intelligence strategy.
- Continual Prototype Evolution strategy.
- Less-Forgetful Learning strategy.
- Generative Replay strategy.
- Memory Aware Synapses (MAS) strategy.
- Bias Correction (BiC) strategy.
- Maximally Interfered Replay strategy; see the ER_MIR plugin for details.
- ER ACE, as proposed in "New Insights on Reducing Abrupt Representation Change in Online Continual Learning" by Lucas Caccia et al.
- Learning to Prompt (L2P) strategy.
- Supervised Contrastive Replay from https://arxiv.org/pdf/2103.13885.pdf.
- Task-incremental fixed-network parameter isolation with PackNet.
- From-scratch training strategy.
- Expert Gate strategy.
- Implements the DER and DER++ strategies from the "Dark Experience For General Continual Learning" paper by Buzzega et al.
- ER AML, as proposed in "New Insights on Reducing Abrupt Representation Change in Online Continual Learning" by Lucas Caccia et al.
- Stores some last-layer features and uses them for replay.
Replay Buffers and Selection Strategies
Buffers to store past samples according to different policies and selection strategies.
Buffers
- ABC for rehearsal buffers to store exemplars.
- Buffer updated with reservoir sampling.
- A buffer that stores exemplars for rehearsal in separate groups.
- Rehearsal buffer with samples balanced over experiences.
- Stores samples for replay, equally divided over classes.
- Stores samples for replay using a custom selection strategy and grouping.
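The reservoir-sampling buffer above keeps a fixed-size sample in which every item of the stream is retained with equal probability. A sketch of the classic algorithm in plain Python (illustrative only, not the Avalanche implementation, which operates on datasets rather than raw items):

```python
import random

class ReservoirBuffer:
    """Fixed-size buffer filled by classic reservoir sampling:
    after seeing n items, each one is stored with probability
    max_size / n."""
    def __init__(self, max_size):
        self.max_size = max_size
        self.buffer = []
        self.n_seen = 0

    def update(self, items):
        for x in items:
            self.n_seen += 1
            if len(self.buffer) < self.max_size:
                self.buffer.append(x)
            else:
                # Replace a stored item with probability max_size / n_seen.
                j = random.randrange(self.n_seen)
                if j < self.max_size:
                    self.buffer[j] = x

buf = ReservoirBuffer(max_size=5)
buf.update(range(100))
print(len(buf.buffer))  # 5
```

Because the retention probability is uniform over the whole stream, no experience or class balancing is performed; the balanced buffers listed above exist precisely to add that grouping on top.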
Selection strategies
- Base class to define how to select a subset of exemplars from a dataset.
- Selects exemplars at random from the dataset.
- Base class to select exemplars based on their features.
- The herding strategy as described in iCaRL.
- A greedy algorithm that selects the remaining exemplar closest to the center of all elements (in feature space).
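The closest-to-center rule can be sketched as follows: rank candidates by their distance to the mean of all feature vectors and keep the closest k. This toy version uses plain Python lists as feature vectors; the real selection strategies work on features extracted by the model.

```python
def closest_to_center(features, k):
    """Return the indices of the k feature vectors closest to the
    mean of all feature vectors (squared Euclidean distance)."""
    dim = len(features[0])
    n = len(features)
    center = [sum(f[d] for f in features) / n for d in range(dim)]

    def dist2(f):
        return sum((a - b) ** 2 for a, b in zip(f, center))

    order = sorted(range(n), key=lambda i: dist2(features[i]))
    return order[:k]

feats = [[0.0], [1.0], [10.0], [2.0]]
# mean = 3.25; distances 3.25, 2.25, 6.75, 1.25 -> indices 3 and 1 win
print(closest_to_center(feats, 2))  # [3, 1]
```

Herding is subtler: instead of ranking each exemplar independently, it greedily picks exemplars so that the *mean of the selected set* tracks the class mean, which generally yields a more representative subset.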
Loss Functions
- Similar to the Knowledge Distillation loss.
- RegularizationMethod implements regularization strategies.
- Learning without Forgetting.
- Asymmetric cross-entropy (ACE) criterion used in "New Insights on Reducing Abrupt Representation Change in Online Continual Learning" by Lucas Caccia et al.
- Supervised Contrastive Replay loss as defined in Eq.
- Masked Cross-Entropy.
- Asymmetric metric learning (AML) criterion used in "New Insights on Reducing Abrupt Representation Change in Online Continual Learning" by Lucas Caccia et al.
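The distillation-style loss listed first matches the current model's softened predictions against those of the previous model, which is also the core of LwF. A plain-Python sketch of temperature-scaled knowledge distillation, assuming the standard KL-divergence formulation (Hinton et al.), not Avalanche's exact implementation:

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax over a list of logits."""
    exps = [math.exp(z / T) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher_soft || student_soft), scaled by T^2 so gradients
    keep a comparable magnitude across temperatures."""
    p = softmax(teacher_logits, T)   # soft targets from the old model
    q = softmax(student_logits, T)   # current model's soft predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return T * T * kl

print(distillation_loss([1.0, 2.0], [1.0, 2.0]))  # 0.0 (identical logits)
```

The loss is zero when the two models agree and grows as the current model's predictions on old classes drift, which is what penalizes forgetting.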
Training Plugins
Plugins can be added to any CL strategy to support additional behavior.
Utilities in avalanche.training.plugins.
- Early stopping and model checkpoint plugin.
- Manager for logging and metrics.
- Learning rate scheduler plugin.
Strategies implemented as plugins in avalanche.training.plugins.
- Average Gradient Episodic Memory plugin.
- Continual Prototype Evolution plugin.
- CWR* plugin.
- Elastic Weight Consolidation (EWC) plugin.
- GDumb plugin.
- Riemannian Walk (RWalk) plugin.
- Gradient Episodic Memory plugin.
- Gradient-based Sample Selection (GSS) replay plugin.
- Less-Forgetful Learning (LFL) plugin.
- Learning without Forgetting plugin.
- Experience replay plugin.
- Synaptic Intelligence plugin.
- Memory Aware Synapses (MAS) plugin.
- TrainGeneratorAfterExpPlugin makes sure that, after each experience of training the solver of a scholar model, the generator is also trained on the data of the current experience.
- Riemannian Walk (RWalk) plugin.
- Experience generative replay plugin.
- Bias Correction (BiC) plugin.
- Maximally Interfered Retrieval plugin. Implements the strategy defined in "Online Continual Learning with Maximally Interfered Retrieval" (https://arxiv.org/abs/1908.04742).
- Retrospective Adversarial Replay (RAR) plugin (https://openreview.net/forum?id=XEoih0EwCwL). Replay-based methods keep a small buffer of past samples, but models still fit the most recently seen data and remain prone to catastrophic forgetting under distribution shift. RAR synthesizes adversarial samples near the forgetting boundary by perturbing a buffered sample towards its nearest neighbor from the current task in a latent representation space; replaying such samples refines the boundary between previous and current tasks, combating forgetting and reducing bias towards the current task. To offset the limits of a small buffer, a MixUp-based strategy increases replay variation by replaying mixed augmentations. The authors report that RAR outperforms other continual learning baselines on widely used benchmarks, especially with small buffers, and support this with ablation and hyperparameter-sensitivity studies.
- From Scratch Training plugin.
- Updates FeCAM covariances and prototypes using all the data seen so far. WARNING: this is an oracle and thus breaks the assumptions usually made in continual learning algorithms (it stores the full dataset); it is meant as an upper bound for FeCAM-based methods (e.g. when estimating prototype and covariance drift).
- Updates FeCAM covariances and prototypes using the data contained in a memory buffer.
- Updates FeCAM covariances and prototypes using the current task data (at the end of each task).
- Updates NCM prototypes using the data contained in a memory buffer (as is done in iCaRL).
- Updates NCM prototypes using all the data seen so far. WARNING: this is an oracle and thus breaks the assumptions usually made in continual learning algorithms (it stores the full dataset); it is meant as an upper bound for NCM-based methods (e.g. when estimating prototype drift).
- Updates the NCM prototypes using the current task data.
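The NCM (Nearest Class Mean) updaters above all maintain one prototype per class: the mean of that class's feature vectors; they differ only in which data the mean is computed from (buffer, current task, or the full history in the oracle case). A minimal sketch of the prototype computation in plain Python (illustrative only):

```python
def class_prototypes(features, labels):
    """Return {class: mean feature vector} for paired lists of
    feature vectors and integer class labels."""
    sums, counts = {}, {}
    for f, y in zip(features, labels):
        if y not in sums:
            sums[y] = list(f)
            counts[y] = 1
        else:
            sums[y] = [a + b for a, b in zip(sums[y], f)]
            counts[y] += 1
    # Prototype = per-class mean of the feature vectors.
    return {y: [v / counts[y] for v in s] for y, s in sums.items()}

protos = class_prototypes([[0.0, 2.0], [2.0, 0.0], [4.0, 4.0]],
                          [0, 0, 1])
print(protos)  # {0: [1.0, 1.0], 1: [4.0, 4.0]}
```

At prediction time an NCM classifier assigns a sample to the class whose prototype is nearest in feature space, which is why keeping the prototypes in sync with the drifting feature extractor matters.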