avalanche.training.AR1

class avalanche.training.AR1(criterion=None, lr: float = 0.001, momentum=0.9, l2=0.0005, train_epochs: int = 4, init_update_rate: float = 0.01, inc_update_rate=5e-05, max_r_max=1.25, max_d_max=0.5, inc_step=4.1e-05, rm_sz: int = 1500, freeze_below_layer: str = 'lat_features.19.bn.beta', latent_layer_num: int = 19, ewc_lambda: float = 0, train_mb_size: int = 128, eval_mb_size: int = 128, device=None, plugins: typing.Optional[typing.List[avalanche.core.SupervisedPlugin]] = None, evaluator: avalanche.training.plugins.evaluation.EvaluationPlugin = <avalanche.training.plugins.evaluation.EvaluationPlugin object>, eval_every=-1)

AR1 with Latent Replay.

This implementation allows the use of both Synaptic Intelligence and Latent Replay to protect the lower levels of the model from forgetting.

While the original papers show how to use these two techniques in a mutually exclusive way, this implementation allows both to be used concurrently. This behaviour is controlled by the constructor arguments.
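As a rough sketch of how the constructor arguments might be combined: the defaults below are copied from the signature above, while `AR1_DEFAULTS`, `ar1_kwargs` and `build_ar1` are hypothetical helper names introduced only for illustration. Per the parameter descriptions, a non-zero `ewc_lambda` turns on Synaptic Intelligence regularization; the value `4e5` is an arbitrary placeholder, not a recommended setting.

```python
# Defaults copied from the documented AR1 constructor signature.
AR1_DEFAULTS = {
    "lr": 0.001,
    "momentum": 0.9,
    "l2": 0.0005,
    "train_epochs": 4,
    "init_update_rate": 0.01,
    "inc_update_rate": 5e-05,
    "max_r_max": 1.25,
    "max_d_max": 0.5,
    "inc_step": 4.1e-05,
    "rm_sz": 1500,
    "freeze_below_layer": "lat_features.19.bn.beta",
    "latent_layer_num": 19,
    "ewc_lambda": 0,
    "train_mb_size": 128,
    "eval_mb_size": 128,
    "eval_every": -1,
}


def ar1_kwargs(**overrides):
    """Merge overrides into the documented defaults (hypothetical helper)."""
    unknown = sorted(set(overrides) - set(AR1_DEFAULTS))
    if unknown:
        raise TypeError(f"unknown AR1 argument(s): {unknown}")
    return {**AR1_DEFAULTS, **overrides}


def build_ar1(**overrides):
    """Instantiate AR1; needs Avalanche installed, so the import is local."""
    from avalanche.training import AR1
    return AR1(**ar1_kwargs(**overrides))


# Defaults: ewc_lambda is 0, so Synaptic Intelligence is not applied and the
# replay buffer keeps its default size (rm_sz=1500).
replay_only = ar1_kwargs()

# Both techniques together: a non-zero ewc_lambda also enables SI.
# The value is an arbitrary placeholder, not a tuned setting.
replay_and_si = ar1_kwargs(ewc_lambda=4e5)
```

With Avalanche installed, `build_ar1(ewc_lambda=4e5)` would pass the merged arguments to the real constructor; the dictionaries alone can be inspected without it.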

__init__(criterion=None, lr: float = 0.001, momentum=0.9, l2=0.0005, train_epochs: int = 4, init_update_rate: float = 0.01, inc_update_rate=5e-05, max_r_max=1.25, max_d_max=0.5, inc_step=4.1e-05, rm_sz: int = 1500, freeze_below_layer: str = 'lat_features.19.bn.beta', latent_layer_num: int = 19, ewc_lambda: float = 0, train_mb_size: int = 128, eval_mb_size: int = 128, device=None, plugins: typing.Optional[typing.List[avalanche.core.SupervisedPlugin]] = None, evaluator: avalanche.training.plugins.evaluation.EvaluationPlugin = <avalanche.training.plugins.evaluation.EvaluationPlugin object>, eval_every=-1)

Creates an instance of the AR1 strategy.

Parameters
  • criterion – The loss criterion to use. Defaults to None, in which case the cross entropy loss is used.

  • lr – The learning rate (SGD optimizer).

  • momentum – The momentum (SGD optimizer).

  • l2 – The L2 penalty used for weight decay.

  • train_epochs – The number of training epochs. Defaults to 4.

  • init_update_rate – The initial update rate of BatchReNorm layers.

  • inc_update_rate – The incremental update rate of BatchReNorm layers.

  • max_r_max – The maximum r value of BatchReNorm layers.

  • max_d_max – The maximum d value of BatchReNorm layers.

  • inc_step – The incremental step of r and d values of BatchReNorm layers.

  • rm_sz – The size of the replay buffer. The replay buffer is shared across classes. Defaults to 1500.

  • freeze_below_layer – A string describing the name of the layer to use while freezing the lower (nearest to the input) part of the model. The given layer is not frozen (exclusive).

  • latent_layer_num – The number of the layer to use as the Latent Replay Layer. Usually this is the same layer as freeze_below_layer.

  • ewc_lambda – The Synaptic Intelligence lambda term. Defaults to 0, which means that the Synaptic Intelligence regularization will not be applied.

  • train_mb_size – The train minibatch size. Defaults to 128.

  • eval_mb_size – The eval minibatch size. Defaults to 128.

  • device – The device to use. Defaults to None (cpu).

  • plugins – (optional) list of SupervisedPlugin instances.

  • evaluator – (optional) instance of EvaluationPlugin for logging and metric computations.

  • eval_every – The frequency of calls to eval inside the training loop. -1 disables the evaluation. 0 means eval is called only at the end of the learning experience. Values >0 mean that eval is called every eval_every epochs and at the end of the learning experience.
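The three `eval_every` cases described above can be sketched as a small schedule function. This is one reading of the description, not the library's code: `eval_points` is a hypothetical helper, counting completed epochs from 1 is an assumption, and treating every negative value like -1 is also an assumption.

```python
def eval_points(eval_every, train_epochs):
    """Return the completed-epoch counts after which eval would be called.

    Assumption: epochs are counted from 1, and "the end of the learning
    experience" coincides with the last training epoch.
    """
    if eval_every < 0 or train_epochs <= 0:
        # -1 disables periodic evaluation entirely.
        return []
    if eval_every == 0:
        # Only at the end of the learning experience.
        return [train_epochs]
    # Every eval_every epochs...
    points = list(range(eval_every, train_epochs + 1, eval_every))
    # ...and at the end of the learning experience.
    if not points or points[-1] != train_epochs:
        points.append(train_epochs)
    return points
```

For the default `train_epochs=4`, this gives no evaluation for `eval_every=-1`, a single evaluation after epoch 4 for `eval_every=0`, and evaluations after epochs 3 and 4 for `eval_every=3`.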

Methods

__init__([criterion, lr, momentum, l2, ...])

Creates an instance of the AR1 strategy.

backward()

Run the backward pass.

criterion()

Loss function.

eval(exp_list, **kwargs)

Evaluates the current model on a series of experiences and returns the last recorded value for each metric.

eval_dataset_adaptation(**kwargs)

Initialize self.adapted_dataset.

eval_epoch(**kwargs)

Evaluation loop over the current self.dataloader.

filter_bn_and_brn(param_def)

forward()

Compute the model's output given the current mini-batch.

make_eval_dataloader([num_workers, ...])

Initializes the eval data loader.

  • num_workers – How many subprocesses to use for data loading. 0 means that the data will be loaded in the main process. Defaults to 0.

  • pin_memory – If True, the data loader will copy Tensors into CUDA pinned memory before returning them. Defaults to True.

make_optimizer()

Optimizer initialization.

make_train_dataloader([num_workers, shuffle])

Called after the dataset instantiation.

model_adaptation([model])

Adapts the model to the current data.

optimizer_step()

Execute the optimizer step (weights update).

stop_training()

Signals to stop training at the next iteration.

train(experiences[, eval_streams])

Training loop.

train_dataset_adaptation(**kwargs)

Initialize self.adapted_dataset.

training_epoch(**kwargs)

Training epoch.
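A minimal sketch of how `train` and `eval` might be driven over a stream of experiences, assuming `train` accepts a single experience per call. `run_streams` and `RecordingStrategy` are hypothetical names: the latter is a stand-in that only records calls so the loop can run without Avalanche, and it mimics just the two method signatures listed above.

```python
def run_streams(strategy, train_stream, test_stream):
    """Train on each experience in turn and evaluate after each one."""
    results = []
    for experience in train_stream:
        strategy.train(experience)
        # eval returns the last recorded value for each metric.
        results.append(strategy.eval(test_stream))
    return results


class RecordingStrategy:
    """Stand-in that records calls; it is not an Avalanche class."""

    def __init__(self):
        self.calls = []

    def train(self, experiences, eval_streams=None):
        self.calls.append(("train", experiences))

    def eval(self, exp_list, **kwargs):
        self.calls.append(("eval", tuple(exp_list)))
        trained = sum(1 for name, _ in self.calls if name == "train")
        # Fake metric dictionary, for illustration only.
        return {"n_train_calls": trained}


demo = RecordingStrategy()
history = run_streams(demo, ["exp0", "exp1"], ["test0", "test1"])
```

An AR1 instance could be passed in place of `demo`, with the benchmark's train and test streams replacing the placeholder lists.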

Attributes

is_eval

True if the strategy is in evaluation mode.

mb_task_id

Current mini-batch task labels.

mb_x

Current mini-batch input.

mb_y

Current mini-batch target.
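The attributes above can be pictured as read-only views over the current mini-batch. The sketch below assumes the mini-batch is stored as a tuple of (input, target, task labels); that layout, the `mbatch` and `is_training` fields, and the `MiniBatchView` class are assumptions made for illustration, not a description of the library's internals.

```python
class MiniBatchView:
    """Hypothetical stand-in exposing the documented attribute names."""

    def __init__(self):
        self.mbatch = None        # assumed layout: (x, y, task_labels)
        self.is_training = False  # assumed flag toggled by train/eval

    @property
    def is_eval(self):
        # True if the strategy is in evaluation mode.
        return not self.is_training

    @property
    def mb_x(self):
        # Current mini-batch input.
        return self.mbatch[0]

    @property
    def mb_y(self):
        # Current mini-batch target.
        return self.mbatch[1]

    @property
    def mb_task_id(self):
        # Current mini-batch task labels (assumed to be the last element).
        return self.mbatch[-1]


view = MiniBatchView()
view.mbatch = ([[0.1, 0.2], [0.3, 0.4]], [3, 7], [0, 0])
```

A plugin callback would typically read these attributes from the strategy during an iteration rather than set them itself.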