Epoch metrics are computed only at the end of each epoch. A custom epoch metric must implement the interface class below; a list of predefined metrics is also provided.

Epoch Metric Interface

class poutyne.framework.metrics.EpochMetric[source]

The abstract class representing an epoch metric, which can be accumulated at each batch and calculated at the end of the epoch.

abstract forward(y_pred, y_true)[source]

Defines the behavior of the metric when it is called.

  • y_pred – The prediction of the model.

  • y_true – Target to evaluate the model.

abstract get_metric()[source]

Compute and return the metric.
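The two-method shape of the interface can be illustrated with a dependency-free sketch. The names EpochMetricSketch and EpochAccuracy are hypothetical and not part of Poutyne; plain lists of labels stand in for the model's predictions and targets, and resetting the counters inside get_metric is a choice made for this sketch rather than something the interface above specifies.

```python
from abc import ABC, abstractmethod


class EpochMetricSketch(ABC):
    """Hypothetical stand-in mirroring the two abstract methods documented above."""

    @abstractmethod
    def forward(self, y_pred, y_true):
        """Accumulate statistics for one batch."""

    @abstractmethod
    def get_metric(self):
        """Compute and return the metric for the whole epoch."""


class EpochAccuracy(EpochMetricSketch):
    """Accuracy accumulated batch by batch and computed once per epoch."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def forward(self, y_pred, y_true):
        # In this sketch, y_pred and y_true are plain lists of class labels.
        self.correct += sum(int(p == t) for p, t in zip(y_pred, y_true))
        self.total += len(y_true)

    def get_metric(self):
        accuracy = self.correct / self.total if self.total else 0.0
        # Assumption of this sketch: clear the counters so the next epoch starts fresh.
        self.correct = 0
        self.total = 0
        return accuracy


metric = EpochAccuracy()
metric.forward([0, 1, 1], [0, 1, 0])  # batch 1: 2 of 3 correct
metric.forward([2, 2], [2, 2])        # batch 2: 2 of 2 correct
epoch_accuracy = metric.get_metric()  # 4 of 5 correct over the epoch
```

Accumulating raw counts and dividing once at the end gives the exact epoch-level value, whereas averaging per-batch accuracies would be biased when batches differ in size.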

Epoch Metrics

class poutyne.framework.metrics.FBeta(metric: str = 'fscore', average: str = 'micro', beta: float = 1.0)[source]

The source code of this class is under the Apache v2 License; it was copied from the AllenNLP project and modified.

Compute precision, recall, F-measure and support for each class.

The precision is the ratio tp / (tp + fp) where tp is the number of true positives and fp the number of false positives. The precision is intuitively the ability of the classifier not to label as positive a sample that is negative.

The recall is the ratio tp / (tp + fn) where tp is the number of true positives and fn the number of false negatives. The recall is intuitively the ability of the classifier to find all the positive samples.

The F-beta score can be interpreted as a weighted harmonic mean of the precision and recall, where an F-beta score reaches its best value at 1 and worst score at 0.

If we have precision and recall, the F-beta score is simply: F-beta = (1 + beta ** 2) * precision * recall / (beta ** 2 * precision + recall)

The beta parameter sets the weight of recall relative to precision: beta > 1 favors recall, beta < 1 favors precision, and beta == 1.0 means recall and precision are equally important.
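As a worked illustration of the formula above, the following sketch computes precision, recall and F-beta from raw counts. The helper names precision_recall and fbeta_score and the example counts are made up for this illustration and are not part of the library; returning 0.0 when a denominator is zero is an assumption of the sketch.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from raw counts (0.0 when a denominator is zero)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def fbeta_score(precision, recall, beta=1.0):
    """F-beta = (1 + beta ** 2) * precision * recall / (beta ** 2 * precision + recall)."""
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


# Illustrative counts: 8 true positives, 2 false positives, 4 false negatives.
precision, recall = precision_recall(tp=8, fp=2, fn=4)  # 0.8 and 2/3

f_half = fbeta_score(precision, recall, beta=0.5)  # leans toward precision
f_one = fbeta_score(precision, recall, beta=1.0)   # plain harmonic mean
f_two = fbeta_score(precision, recall, beta=2.0)   # leans toward recall
```

Because precision is higher than recall in this example, the score falls as beta grows: the larger beta is, the more the lower recall drags the result down.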

The support is the number of occurrences of each class in y_true.

  • metric (str) – One of {'fscore', 'precision', 'recall'}. Whether to return the F-score, the precision or the recall. (Default value = 'fscore')

  • average (Union[str, int]) –

    One of {'micro' (default), 'macro', label_number}. If the argument is an integer, the score for that class (the label number) is calculated. Otherwise, this determines the type of averaging performed on the data:

    'micro': Calculate metrics globally by counting the total true positives, false negatives and false positives.

    'macro': Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.

    (Default value = 'micro')

  • beta (float) – The strength of recall versus precision in the F-score. (Default value = 1.0)
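The difference between the averaging modes can be seen on a small hand-made example. This is a plain-Python sketch, not the library's implementation; the helper names and the toy labels are hypothetical, and the F1 expression used is the beta == 1 case of the formula above rewritten in terms of counts.

```python
from collections import Counter


def per_class_counts(y_true, y_pred, labels):
    """Count true positives, false positives and false negatives per label."""
    counts = {label: {"tp": 0, "fp": 0, "fn": 0} for label in labels}
    for target, predicted in zip(y_true, y_pred):
        if target == predicted:
            counts[target]["tp"] += 1
        else:
            counts[predicted]["fp"] += 1  # predicted label was wrong
            counts[target]["fn"] += 1     # true label was missed
    return counts


def f1_from_counts(tp, fp, fn):
    """F1 = 2 * tp / (2 * tp + fp + fn), i.e. F-beta with beta == 1."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Toy data with imbalanced classes (label 0 is the most frequent).
y_true = [0, 0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 0, 1, 1, 2, 2]
labels = [0, 1, 2]

counts = per_class_counts(y_true, y_pred, labels)

# 'micro': pool the counts over all labels, then compute one score.
micro_f1 = f1_from_counts(
    sum(c["tp"] for c in counts.values()),
    sum(c["fp"] for c in counts.values()),
    sum(c["fn"] for c in counts.values()),
)

# 'macro': compute one score per label, then take the unweighted mean.
per_class_f1 = {label: f1_from_counts(**c) for label, c in counts.items()}
macro_f1 = sum(per_class_f1.values()) / len(labels)

# An integer value for average corresponds to a single entry, e.g. label 1.
label_1_f1 = per_class_f1[1]

# Support: number of occurrences of each class in y_true.
support = Counter(y_true)
```

Here the micro average is higher than the macro average because the frequent label 0 is predicted well and dominates the pooled counts, while the macro average gives the weaker labels 1 and 2 equal weight.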

class poutyne.framework.metrics.F1(average='micro')[source]

Alias class for FBeta where metric == 'fscore' and beta == 1.