poutyne.framework.callbacks

Callbacks are a way to interact with the optimization process. For instance, the ModelCheckpoint callback allows saving the weights of the epoch with the best “score”, the EarlyStopping callback allows stopping the training when the “score” has not improved for a while, etc. The following presents the callbacks available in Poutyne, but first, the documentation of the Callback class shows which methods are available in a callback and what arguments they are provided with.

Callback class

class poutyne.framework.callbacks.Callback[source]
params

Contains a key ‘epoch’ and a key ‘steps_per_epoch’ which are passed to the fit function in Model. It may contain other keys.

Type

dict

model

a reference to the Model object which is using the callback.

Type

Model

on_backward_end(batch)[source]

Is called after the backpropagation but before the optimization step.

Parameters

batch (int) – The batch number.

on_batch_begin(batch, logs)[source]

Is called at the beginning of each batch.

Parameters
  • batch (int) – The batch number.

  • logs (dict) – Usually an empty dict.

on_batch_end(batch, logs)[source]

Is called at the end of each batch.

Parameters
  • batch (int) – The batch number.

  • logs (dict) –

    Contains the following keys:

    • 'batch': The batch number.

    • 'loss': The loss of the batch.

    • 'time': The computation time of the batch.

    • Other metrics: One key for each type of metric.

Example:

logs = {'batch': 6, 'time': 0.10012837, 'loss': 4.34462, 'accuracy': 0.766}

on_epoch_begin(epoch, logs)[source]

Is called at the beginning of each epoch.

Parameters
  • epoch (int) – The epoch number.

  • logs (dict) – Usually an empty dict.

on_epoch_end(epoch, logs)[source]

Is called at the end of each epoch.

Parameters
  • epoch (int) – The epoch number.

  • logs (dict) –

    Contains the following keys:

    • 'epoch': The epoch number.

    • 'loss': The average loss of the batches.

    • 'time': The computation time of the epoch.

    • Other metrics: One key for each type of metric. The metrics are also averaged.

    • 'val_loss': The average loss of the batches on the validation set.

    • Other metrics: One key for each type of metric on the validation set, prefixed with val_. The metrics are also averaged.

Example:

logs = {'epoch': 6, 'time': 3.141519837, 'loss': 4.34462, 'accuracy': 0.766,
        'val_loss': 5.2352, 'val_accuracy': 0.682}

on_train_begin(logs)[source]

Is called at the beginning of the training.

Parameters

logs (dict) – Usually an empty dict.

on_train_end(logs)[source]

Is called at the end of the training.

Parameters

logs (dict) – Usually an empty dict.
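
Example

A minimal custom callback as a sketch (the callback name is illustrative and not part of Poutyne; the Model construction and the data generators are elided with ..., and the 'val_loss' key assumes a validation set is given to fit_generator()):

from poutyne.framework import Model
from poutyne.framework.callbacks import Callback

class EpochPrinter(Callback):
    def on_train_begin(self, logs):
        # self.params is set by the Model; 'epoch' is the number of epochs
        # passed to the fit function.
        print('Training for {} epochs'.format(self.params['epoch']))

    def on_epoch_end(self, epoch, logs):
        # logs contains 'loss', 'time', 'val_loss' and one key per metric.
        print('Epoch {}: loss={:.4f}, val_loss={:.4f}'.format(
            epoch, logs['loss'], logs['val_loss']))

model = Model(...)
model.fit_generator(..., callbacks=[EpochPrinter()])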

Poutyne’s Callbacks

class poutyne.framework.callbacks.TerminateOnNaN[source]

Stops the training when the loss is either NaN or inf.

class poutyne.framework.callbacks.BestModelRestore(*, monitor='val_loss', mode='min', verbose=False)[source]

Restore the weights of the best model at the end of the training depending on a monitored quantity.

Parameters
  • monitor (string) – Quantity to monitor. (Default value = ‘val_loss’)

  • mode (string) – One of {min, max}. Whether the monitored quantity has to be maximized or minimized. For instance, for val_accuracy, this should be max, and for val_loss, this should be min, etc. (Default value = ‘min’)

  • verbose (bool) – Whether to display a message when the model has improved or when restoring the best model. (Default value = False)

class poutyne.framework.callbacks.EarlyStopping(*, monitor='val_loss', min_delta=0, patience=0, verbose=False, mode='min')[source]

The source code of this class is under the MIT License and was copied from the Keras project, and has been modified.

Stop training when a monitored quantity has stopped improving.

Parameters
  • monitor (string) – Quantity to be monitored. (Default value = ‘val_loss’)

  • min_delta (float) – Minimum change in the monitored quantity to qualify as an improvement, i.e. an absolute change of less than min_delta will count as no improvement. (Default value = 0)

  • patience (int) – Number of epochs with no improvement after which training will be stopped. (Default value = 0)

  • verbose (bool) – Whether to print when early stopping is done. (Default value = False)

  • mode (string) – One of {min, max}. In min mode, training will stop when the quantity monitored has stopped decreasing; in max mode it will stop when the quantity monitored has stopped increasing. (Default value = ‘min’)
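
Example

A minimal sketch combining EarlyStopping with BestModelRestore (the hyperparameter values are arbitrary; the Model construction and the data generators are elided with ...):

from poutyne.framework import Model
from poutyne.framework.callbacks import EarlyStopping, BestModelRestore

# Stop when the validation loss has not improved by at least 0.001 for 5 epochs,
# and restore the weights of the best epoch at the end of the training.
callbacks = [
    EarlyStopping(monitor='val_loss', mode='min', min_delta=0.001, patience=5, verbose=True),
    BestModelRestore(monitor='val_loss', mode='min', verbose=True),
]

model = Model(...)
model.fit_generator(..., callbacks=callbacks)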

class poutyne.framework.callbacks.DelayCallback(callbacks, *, epoch_delay=None, batch_delay=None)[source]

Delays one or many callbacks for a certain number of epochs or number of batches. If both epoch_delay and batch_delay are provided, the longer one has precedence.

Parameters
  • callbacks (Callback, list of Callback) – A callback or a list of callbacks to delay.

  • epoch_delay (int, optional) – Number of epochs to delay.

  • batch_delay (int, optional) – Number of batches to delay. The number of batches can span many epochs. When the batch delay expires (i.e. more than batch_delay batches have been done), the on_epoch_begin method is called on the callback(s) before the on_batch_begin method.
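
Example

A minimal sketch (the checkpoint filename and the delay are arbitrary):

from poutyne.framework.callbacks import DelayCallback, ModelCheckpoint

# Start writing checkpoints only once the first 5 epochs are done.
delayed_checkpoint = DelayCallback(ModelCheckpoint('epoch_{epoch}.ckpt'), epoch_delay=5)

model.fit_generator(..., callbacks=[delayed_checkpoint])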

class poutyne.framework.callbacks.ClipNorm(parameters, max_norm, *, norm_type=2)[source]

Uses PyTorch’s torch.nn.utils.clip_grad_norm_ method to clip the gradients.

See:

torch.nn.utils.clip_grad_norm_

class poutyne.framework.callbacks.ClipValue(parameters, clip_value)[source]

Uses PyTorch’s torch.nn.utils.clip_grad_value_ method to clip the gradients.

See:

torch.nn.utils.clip_grad_value_
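
Example

A minimal sketch for both callbacks (network is assumed to be the PyTorch module wrapped by the Model; the clipping values are arbitrary):

from poutyne.framework import Model
from poutyne.framework.callbacks import ClipNorm, ClipValue

# Clip the norm of all gradients to at most 1 ...
clip_norm = ClipNorm(network.parameters(), max_norm=1)
# ... or clip each gradient value to [-0.5, 0.5].
clip_value = ClipValue(network.parameters(), clip_value=0.5)

model = Model(...)
model.fit_generator(..., callbacks=[clip_norm])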

Logging

class poutyne.framework.callbacks.CSVLogger(filename, *, batch_granularity=False, separator=',', append=False)[source]

Callback that outputs the result of each epoch or batch into a CSV file.

Parameters
  • filename (string) – The filename of the CSV.

  • batch_granularity (bool) – Whether to also output the result of each batch in addition to the epochs. (Default value = False)

  • separator (string) – The separator to use in the CSV. (Default value = ‘,’)

  • append (bool) – Whether to append to an existing file. (Default value = False)
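
Example

A minimal sketch (the filename is arbitrary):

from poutyne.framework.callbacks import CSVLogger

# Log the statistics of every epoch to a CSV file, appending to it if it
# already exists.
csv_logger = CSVLogger('training_log.csv', append=True)

model.fit_generator(..., callbacks=[csv_logger])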

class poutyne.framework.callbacks.TensorBoardLogger(writer)[source]

Callback that outputs the result of each epoch or batch into a TensorBoard experiment folder.

Parameters

writer (SummaryWriter) – The tensorboard writer.

Example

Using TensorBoardLogger:

from tensorboardX import SummaryWriter
# or from torch.utils.tensorboard import SummaryWriter
from poutyne.framework import Model
from poutyne.framework.callbacks import TensorBoardLogger

writer = SummaryWriter('runs')
tb_logger = TensorBoardLogger(writer)

model = Model(...)
model.fit_generator(..., callbacks=[tb_logger])

Checkpointing

Poutyne provides callbacks for checkpointing the state of the optimization so that it can be stopped and restarted at a later point. All the checkpointing classes inherit from the PeriodicSaveCallback class and thus have the same arguments in their constructors. They may have other arguments specific to their purpose.

class poutyne.framework.callbacks.PeriodicSaveCallback(filename, *, monitor='val_loss', mode='min', save_best_only=False, period=1, verbose=False, temporary_filename=None, atomic_write=True, open_mode='wb')[source]

The source code of this class is under the MIT License and was copied from the Keras project, and has been modified.

Write a file after every epoch. filename can contain named formatting options, which will be filled with the value of epoch and keys in logs (passed in on_epoch_end). For example: if filename is weights.{epoch:02d}-{val_loss:.2f}.txt, then save_file() will be called with a file descriptor for a file with the epoch number and the validation loss in its filename.

By default, the file is written atomically to the specified filename so that the training can be killed and restarted later using the same filename for periodic file saving. To do so, a temporary file is created using the system’s tmp directory and then moved to the final destination after the checkpoint is made. On some systems, this move may not be possible. To address this problem, it is possible to specify the destination of the temporary file using the temporary_filename argument.

Parameters
  • filename (string) – Path to save the model file.

  • monitor (string) – Quantity to monitor. (Default value = ‘val_loss’)

  • verbose (bool) – Whether to display a message when saving and restoring a checkpoint. (Default value = False)

  • save_best_only (bool) – If save_best_only is true, the latest best model according to the quantity monitored will not be overwritten. (Default value = False)

  • mode (string) – One of {min, max}. If save_best_only is true, the decision to overwrite the current save file is made based on either the maximization or the minimization of the monitored quantity. For val_accuracy, this should be max, for val_loss this should be min, etc. (Default value = ‘min’)

  • period (int) – Interval (number of epochs) between checkpoints. (Default value = 1)

  • temporary_filename (string, optional) – Temporary filename for the checkpoint so that the last checkpoint can be written atomically. See the atomic_write argument.

  • atomic_write (bool) – Whether to write the checkpoint atomically. See the description above for details. (Default value = True)

  • open_mode (str) – mode option passed to open(). (Default value = ‘wb’)

class poutyne.framework.callbacks.ModelCheckpoint(*args, restore_best=False, **kwargs)[source]

Save the model after every epoch. See poutyne.framework.PeriodicSaveCallback for the arguments’ descriptions.

Parameters

restore_best (bool) – If restore_best is true, the weights of the network will be reset to the last best checkpoint done. This option only works when save_best_only is also true. (Default value = False)

See:

poutyne.framework.PeriodicSaveCallback
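
Example

A minimal sketch (the filenames are arbitrary):

from poutyne.framework.callbacks import ModelCheckpoint

# Keep only the best weights according to the validation loss and restore them
# at the end of the training.
best_checkpoint = ModelCheckpoint('best_weights.ckpt', monitor='val_loss', mode='min',
                                  save_best_only=True, restore_best=True, verbose=True)

# Also save the weights of every epoch, with the epoch number and the validation
# loss formatted into the filename.
every_epoch = ModelCheckpoint('weights.{epoch:02d}-{val_loss:.2f}.ckpt')

model.fit_generator(..., callbacks=[best_checkpoint, every_epoch])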

class poutyne.framework.callbacks.OptimizerCheckpoint(filename, *, monitor='val_loss', mode='min', save_best_only=False, period=1, verbose=False, temporary_filename=None, atomic_write=True, open_mode='wb')[source]

Save the state of the optimizer after every epoch. The optimizer can be reloaded as follows.

model = Model(model, optimizer, loss_function)
model.load_optimizer_state(filename)

See poutyne.framework.PeriodicSaveCallback for the arguments’ descriptions.

See:

poutyne.framework.PeriodicSaveCallback

class poutyne.framework.callbacks.LRSchedulerCheckpoint(lr_scheduler, *args, **kwargs)[source]

Save the state of an LR scheduler callback after every epoch. The LR scheduler callback should not be passed to the fit*() methods since it is called by this callback instead. The LR scheduler can be reloaded as follows.

lr_scheduler = AnLRSchedulerCallback(...)
lr_scheduler.load_state(filename)

See poutyne.framework.PeriodicSaveCallback for the arguments’ descriptions.

Parameters

lr_scheduler – An LR scheduler callback.

See:

poutyne.framework.PeriodicSaveCallback

class poutyne.framework.callbacks.PeriodicSaveLambda(func, *args, **kwargs)[source]

Call a lambda with a file descriptor after every epoch. See poutyne.framework.PeriodicSaveCallback for the arguments’ descriptions.

Parameters

func (fd, int, dict -> None) – The lambda that will be called with a file descriptor, the epoch number and the epoch logs.

See:

poutyne.framework.PeriodicSaveCallback
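
Example

A minimal sketch that appends the logs of each epoch to a JSON Lines file (the filename is arbitrary; open_mode='a' and atomic_write=False are used so that the file is appended to as plain text rather than rewritten atomically in binary mode):

import json
from poutyne.framework.callbacks import PeriodicSaveLambda

save_logs = PeriodicSaveLambda(lambda fd, epoch, logs: fd.write(json.dumps(logs) + '\n'),
                               'logs.jsonl',
                               open_mode='a',
                               atomic_write=False)

model.fit_generator(..., callbacks=[save_logs])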

LR Schedulers

Poutyne’s callbacks for learning rate schedulers are just wrappers around PyTorch’s learning rate schedulers and thus have the same arguments, except for the optimizer, which has to be omitted.
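
Example

A minimal sketch (the hyperparameter values are arbitrary):

from poutyne.framework.callbacks.lr_scheduler import ExponentialLR, ReduceLROnPlateau

# Same arguments as the corresponding PyTorch schedulers, without the optimizer.
exp_scheduler = ExponentialLR(gamma=0.95)
plateau_scheduler = ReduceLROnPlateau(monitor='val_loss', mode='min', patience=3)

model.fit_generator(..., callbacks=[exp_scheduler])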

class poutyne.framework.callbacks.lr_scheduler.CosineAnnealingLR(*args, **kwargs)
See:

PyTorch CosineAnnealingLR

class poutyne.framework.callbacks.lr_scheduler.CosineAnnealingWarmRestarts(*args, **kwargs)
See:

PyTorch CosineAnnealingWarmRestarts

class poutyne.framework.callbacks.lr_scheduler.CyclicLR(*args, **kwargs)
See:

PyTorch CyclicLR

class poutyne.framework.callbacks.lr_scheduler.ExponentialLR(*args, **kwargs)
See:

PyTorch ExponentialLR

class poutyne.framework.callbacks.lr_scheduler.LambdaLR(*args, **kwargs)
See:

PyTorch LambdaLR

class poutyne.framework.callbacks.lr_scheduler.MultiStepLR(*args, **kwargs)
See:

PyTorch MultiStepLR

class poutyne.framework.callbacks.lr_scheduler.StepLR(*args, **kwargs)
See:

PyTorch StepLR

class poutyne.framework.callbacks.lr_scheduler.ReduceLROnPlateau(*args, monitor='val_loss', **kwargs)[source]
Parameters

monitor (string) – The quantity to monitor. (Default value = ‘val_loss’)

See:

PyTorch ReduceLROnPlateau

Policies

The policies module is an alternative way to configure your training process. It gives you fine-grained control over the process.

The training is divided into phases with the Phase class. A Phase contains parameter spaces (e.g. learning rate, or momentum, or both) for the optimizer. You chain Phase instances by passing them to an OptimizerPolicy. OptimizerPolicy is a Callback that uses the phases, steps through them, and sets the parameters of the optimizer (see the example after OptimizerPolicy below).

class poutyne.framework.callbacks.policies.Phase(*, lr=None, momentum=None)[source]

A Phase defines how to configure an optimizer.

For each train step it returns a dictionary that contains the configuration for the optimizer.

Parameters
  • lr – a configuration space for the learning rate (optional).

  • momentum – a configuration space for the momentum (optional).

class poutyne.framework.callbacks.policies.OptimizerPolicy(phases: List, *, initial_step: int = 0)[source]

Combine different Phase instances in an OptimizerPolicy and execute them one after the other.

Parameters
  • phases – A list of Phase instances.

  • initial_step – The step at which to start the policy. Used for restarting.
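
Example

A minimal sketch chaining two phases, a linear warm-up followed by a cosine annealing (the number of steps and the learning rates are arbitrary):

from poutyne.framework.callbacks.policies import OptimizerPolicy, Phase, linspace, cosinespace

steps_per_phase = 1000
policy = OptimizerPolicy([
    Phase(lr=linspace(1e-4, 1e-2, steps_per_phase)),
    Phase(lr=cosinespace(1e-2, 1e-4, steps_per_phase)),
])

model.fit_generator(..., callbacks=[policy])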

poutyne.framework.callbacks.policies.linspace(start: float, end: float, steps: int)[source]

A lazy linear parameter space that goes from start to end in steps steps.

Parameters
  • start – the start point.

  • end – the end point.

  • steps – the number of steps between start and end.

Example

>>> list(linspace(0, 1, 3))
[0.0, 0.5, 1.0]

poutyne.framework.callbacks.policies.cosinespace(start, end, steps)[source]

A lazy cosine parameter space that goes from start to end in steps steps.

Parameters
  • start – the start point.

  • end – the end point.

  • steps – the number of steps between start and end.

Example

>>> list(cosinespace(0, 1, 3))
[0.0, 0.5, 1.0]

High Level Policies

Ready to use policies.

poutyne.framework.callbacks.policies.one_cycle_phases(steps: int, lr: Tuple[float, float] = (0.1, 1), momentum: Tuple[float, float] = (0.95, 0.85), finetune_lr: float = 0.01, finetune_fraction: float = 0.1) → List[poutyne.framework.callbacks.policies.Phase][source]

The “one-cycle” policy as described in the paper “Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates”.

You might want to read the paper and adjust the parameters.

Parameters
  • steps – the total number of steps to take.

  • lr – tuple for the triangular learning rate (start, middle).

  • momentum – tuple for the triangular momentum (start, middle).

  • finetune_lr – target learning rate for the final finetuning. Should be smaller than min(lr).

  • finetune_fraction – fraction of steps used for the finetuning. Must be between 0 and 1.

Returns

A list of configured Phase instances.

References

“Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates”

Leslie N. Smith, Nicholay Topin https://arxiv.org/abs/1708.07120
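
Example

A minimal sketch (the number of steps and the learning rates are arbitrary):

from poutyne.framework.callbacks.policies import OptimizerPolicy, one_cycle_phases

# One cycle over 10000 optimization steps, ending with a small finetuning learning rate.
policy = OptimizerPolicy(one_cycle_phases(steps=10000, lr=(0.01, 0.1), finetune_lr=0.001))

model.fit_generator(..., callbacks=[policy])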

poutyne.framework.callbacks.policies.sgdr_phases(base_cycle_length: int, cycles: int, lr: Tuple[float, float] = (1.0, 0.1), cycle_mult: int = 2) → List[poutyne.framework.callbacks.policies.Phase][source]

The “SGDR” policy as described in the paper “SGDR: Stochastic Gradient Descent with Warm Restarts”.

Note that the total number of steps is calculated as follows: total_steps = sum(base_cycle_length * (cycle_mult ** i) for i in range(cycles))

You might want to read the paper and adjust the parameters.

Parameters
  • base_cycle_length – number of steps for the first cycle.

  • cycles – the number of repetitions.

  • lr – tuple for the learning rate for one cycle: (start, end).

  • cycle_mult – factor by which the length of each cycle is multiplied relative to the previous one, so the cycle lengths grow exponentially.

Returns

A list of configured Phase instances.

References

“SGDR: Stochastic Gradient Descent with Warm Restarts”

Ilya Loshchilov, Frank Hutter https://arxiv.org/abs/1608.03983
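
Example

A minimal sketch (the cycle lengths are arbitrary):

from poutyne.framework.callbacks.policies import OptimizerPolicy, sgdr_phases

# Three cycles whose lengths double each time:
# total_steps = 100 + 200 + 400 = 700 optimization steps.
policy = OptimizerPolicy(sgdr_phases(base_cycle_length=100, cycles=3, cycle_mult=2))

model.fit_generator(..., callbacks=[policy])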