Evaluation module

This module provides a number of metrics to monitor continual learning performance.
Metrics subclass the PluginMetric class, which provides all the callbacks needed to include custom metric logic at specific points of the continual learning workflow.

evaluation.metrics

Metrics helper functions

High-level functions to obtain specific plugin metric objects (to be passed to the EvaluationPlugin).
This is the recommended way to build metrics. Use these functions when available.
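
For example, a minimal sketch of building an evaluator from these helpers (assuming a standard Avalanche installation; the enabled flags below are only illustrative):

    from avalanche.evaluation.metrics import accuracy_metrics, forgetting_metrics, \
        loss_metrics, timing_metrics
    from avalanche.logging import InteractiveLogger
    from avalanche.training.plugins import EvaluationPlugin

    # Each helper returns a list of plugin metrics, one per enabled granularity.
    eval_plugin = EvaluationPlugin(
        accuracy_metrics(minibatch=True, epoch=True, experience=True, stream=True),
        loss_metrics(epoch=True, stream=True),
        forgetting_metrics(experience=True, stream=True),
        timing_metrics(epoch=True),
        loggers=[InteractiveLogger()],
    )

    # The plugin is then passed to a strategy via its `evaluator` argument, e.g.
    # Naive(model, optimizer, criterion, evaluator=eval_plugin).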

accuracy_metrics(*[, minibatch, epoch, ...])

Helper method that can be used to obtain the desired set of plugin metrics.

loss_metrics(*[, minibatch, epoch, ...])

Helper method that can be used to obtain the desired set of plugin metrics.

bwt_metrics(*[, experience, stream])

Helper method that can be used to obtain the desired set of plugin metrics.

forgetting_metrics(*[, experience, stream])

Helper method that can be used to obtain the desired set of plugin metrics.

forward_transfer_metrics(*[, experience, stream])

Helper method that can be used to obtain the desired set of plugin metrics.

confusion_matrix_metrics([num_classes, ...])

Helper method that can be used to obtain the desired set of plugin metrics.

cpu_usage_metrics(*[, minibatch, epoch, ...])

Helper method that can be used to obtain the desired set of plugin metrics.

disk_usage_metrics(*[, paths_to_monitor, ...])

Helper method that can be used to obtain the desired set of plugin metrics.

gpu_usage_metrics(gpu_id[, every, ...])

Helper method that can be used to obtain the desired set of plugin metrics.

ram_usage_metrics(*[, every, minibatch, ...])

Helper method that can be used to obtain the desired set of plugin metrics.

timing_metrics(*[, minibatch, epoch, ...])

Helper method that can be used to obtain the desired set of plugin metrics.

MAC_metrics(*[, minibatch, epoch, experience])

Helper method that can be used to obtain the desired set of plugin metrics.

labels_repartition_metrics(*[, on_train, ...])

Create plugins to monitor the repartition (distribution) of labels.

mean_scores_metrics(*[, on_train, on_eval, ...])

Helper to create plugins to show the scores of the true class, averaged by new and old classes.
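
The system and resource helpers follow the same pattern. A sketch, assuming a machine with at least one visible GPU for gpu_usage_metrics (defaults such as the sampling interval may differ between Avalanche versions):

    from avalanche.evaluation.metrics import cpu_usage_metrics, disk_usage_metrics, \
        gpu_usage_metrics, ram_usage_metrics
    from avalanche.training.plugins import EvaluationPlugin

    # Resource metrics are enabled per granularity, exactly like the
    # performance helpers above.
    system_eval_plugin = EvaluationPlugin(
        cpu_usage_metrics(experience=True),
        ram_usage_metrics(every=1, experience=True),   # sample RAM usage every second
        gpu_usage_metrics(gpu_id=0, experience=True),  # requires GPU 0 to be visible
        disk_usage_metrics(paths_to_monitor=["."], experience=True),
    )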

Stream Metrics

Stream metrics work at eval time only. Stream metrics return the average of metric results over all the experiences present in the evaluation stream.
If the evaluation stream is sliced during evaluation (e.g., strategy.eval(benchmark.test_stream[0:2])), the sliced-out experiences are not included in the average, as in the sketch below.
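
A sketch of this slicing behaviour, assuming a SplitMNIST benchmark and a Naive strategy (module paths such as avalanche.training.supervised may differ slightly between Avalanche versions):

    from torch.nn import CrossEntropyLoss
    from torch.optim import SGD

    from avalanche.benchmarks.classic import SplitMNIST
    from avalanche.evaluation.metrics import accuracy_metrics
    from avalanche.models import SimpleMLP
    from avalanche.training.plugins import EvaluationPlugin
    from avalanche.training.supervised import Naive

    benchmark = SplitMNIST(n_experiences=5)
    model = SimpleMLP(num_classes=benchmark.n_classes)
    evaluator = EvaluationPlugin(accuracy_metrics(experience=True, stream=True))
    strategy = Naive(
        model, SGD(model.parameters(), lr=0.01), CrossEntropyLoss(),
        train_epochs=1, evaluator=evaluator,
    )

    for experience in benchmark.train_stream:
        strategy.train(experience)
        # Stream accuracy averaged over all five test experiences.
        strategy.eval(benchmark.test_stream)
        # Stream accuracy averaged over the first two experiences only:
        # the sliced-out experiences do not contribute.
        strategy.eval(benchmark.test_stream[0:2])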

StreamAccuracy()

At the end of the entire stream of experiences, this plugin metric reports the average accuracy over all patterns seen in all experiences.

TrainedExperienceAccuracy()

At the end of each experience, this plugin metric reports the average accuracy for only the experiences that the model has been trained on so far.

StreamLoss()

At the end of the entire stream of experiences, this metric reports the average loss over all patterns seen in all experiences.

StreamBWT()

The StreamBWT metric, emitting the average BWT across all experiences encountered during training.

StreamForgetting()

The StreamForgetting metric, describing the average evaluation accuracy loss detected over all experiences observed during training.

StreamForwardTransfer()

The Forward Transfer averaged over all the evaluation experiences.

StreamConfusionMatrix(num_classes, ...)

The Stream Confusion Matrix metric.

WandBStreamConfusionMatrix([class_names])

Confusion Matrix metric compatible with Weights and Biases logger.

StreamCPUUsage()

The average stream CPU usage metric.

StreamDiskUsage(paths_to_monitor)

The average stream Disk usage metric.

StreamTime()

The stream time metric.

StreamMaxRAM([every])

The Stream Max RAM metric.

StreamMaxGPU(gpu_id[, every])

The Stream Max GPU metric.

MeanScoresEvalPluginMetric(image_creator, ...)

Plugin to show the scores of the true class during evaluation, averaged by new and old classes.

Experience Metrics

Experience metrics work at eval time only. Experience metrics return the average metric results over all the patterns in the experience.

ExperienceAccuracy()

At the end of each experience, this plugin metric reports the average accuracy over all patterns seen in that experience.

ExperienceLoss()

At the end of each experience, this metric reports the average loss over all patterns seen in that experience.

ExperienceBWT()

The Experience Backward Transfer metric.

ExperienceForgetting()

The ExperienceForgetting metric, describing the accuracy loss detected for a certain experience.

ExperienceForwardTransfer()

The Forward Transfer computed on each experience separately.

ExperienceCPUUsage()

The average experience CPU usage metric.

ExperienceDiskUsage(paths_to_monitor)

The average experience Disk usage metric.

ExperienceTime()

The experience time metric.

ExperienceMAC()

At the end of each experience, this metric reports the MAC computed on a single pattern.

ExperienceMaxRAM([every])

The Experience Max RAM metric.

ExperienceMaxGPU(gpu_id[, every])

The Experience Max GPU metric.

Epoch Metrics

Epoch metrics work at train time only. Epoch metrics return the average metric results over all the patterns in the training dataset.

EpochAccuracy()

The average accuracy over a single training epoch.

EpochLoss()

The average loss over a single training epoch.

EpochCPUUsage()

The Epoch CPU usage metric.

EpochDiskUsage(paths_to_monitor)

The Epoch Disk usage metric.

EpochTime()

The epoch elapsed time metric.

EpochMAC()

The MAC at the end of each epoch computed on a single pattern.

EpochMaxRAM([every])

The Epoch Max RAM metric.

EpochMaxGPU(gpu_id[, every])

The Epoch Max GPU metric.

MeanScoresTrainPluginMetric(image_creator, ...)

Plugin to show the scores of the true class during the last training epochs of each experience, averaged by new and old classes.

RunningEpoch Metrics

Running epoch metrics work at train time only. They return the average metric results over all the patterns encountered up to the current iteration of the training epoch.

RunningEpochAccuracy()

The average accuracy across all minibatches up to the current epoch iteration.

RunningEpochLoss()

The average loss across all minibatches up to the current epoch iteration.

RunningEpochCPUUsage()

The running epoch CPU usage metric.

RunningEpochTime()

The running epoch time metric.

Minibatch Metrics

Minibatch metrics work at train time only. Minibatch metrics return the average metric results over all the patterns in the current minibatch.

MinibatchAccuracy()

The minibatch plugin accuracy metric.

MinibatchLoss()

The minibatch loss metric.

MinibatchCPUUsage()

The minibatch CPU usage metric.

MinibatchDiskUsage(paths_to_monitor)

The minibatch Disk usage metric.

MinibatchTime()

The minibatch time metric.

MinibatchMAC()

The minibatch MAC metric.

MinibatchMaxRAM([every])

The Minibatch Max RAM metric.

MinibatchMaxGPU(gpu_id[, every])

The Minibatch Max GPU metric.

evaluation.metric_definitions

General interfaces on which metrics are built.

Metric(*args, **kwargs)

Definition of a standalone metric.

PluginMetric()

A metric that can be used together with EvaluationPlugin.

GenericPluginMetric(metric[, reset_at, ...])

This class provides a generic implementation of a Plugin Metric.
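
As an illustration only, a hedged sketch of a custom metric built on these interfaces; the MaxLogit metric and its plugin wrapper are hypothetical examples, and exact signatures (e.g., the generic parameters of GenericPluginMetric) may vary slightly across Avalanche versions:

    from avalanche.evaluation.metric_definitions import GenericPluginMetric, Metric


    class MaxLogit(Metric):
        """Standalone metric (hypothetical): running maximum of the model's logits."""

        def __init__(self):
            self._max = float("-inf")

        def update(self, mb_output):
            # mb_output is expected to be the model output tensor for a minibatch.
            self._max = max(self._max, float(mb_output.max()))

        def result(self):
            return self._max

        def reset(self):
            self._max = float("-inf")


    class MaxLogitPluginMetric(GenericPluginMetric):
        """Plugin wrapper: resets and emits the value once per evaluation experience."""

        def __init__(self):
            super().__init__(
                MaxLogit(), reset_at="experience", emit_at="experience", mode="eval"
            )

        def update(self, strategy):
            # GenericPluginMetric wires the callbacks; here we only define how to
            # update the wrapped standalone metric from the current strategy state.
            self._metric.update(strategy.mb_output)

        def __str__(self):
            return "MaxLogit_Exp"

An instance of MaxLogitPluginMetric could then be passed to the EvaluationPlugin alongside the metrics produced by the helper functions above.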