Experiment

class poutyne.Experiment(directory: str, network: torch.nn.modules.module.Module, *, device: Union[torch.device, List[torch.device], List[str], None, str] = None, logging: bool = True, optimizer: Union[torch.optim.optimizer.Optimizer, str] = 'sgd', loss_function: Optional[Union[Callable, str]] = None, batch_metrics: Optional[List] = None, epoch_metrics: Optional[List] = None, monitoring: bool = True, monitor_metric: Optional[str] = None, monitor_mode: Optional[str] = None, task: Optional[str] = None)

The Experiment class provides a straightforward experimentation tool for efficient, fully customizable fine-tuning of the whole neural network training procedure with PyTorch. The Experiment object handles the training and testing processes and keeps a record of all pertinent information through the automatic logging option.

Parameters
  • directory (str) – Path to the experiment’s working directory. Will be used for automatic logging.

  • network (torch.nn.Module) – A PyTorch network.

  • device (Union[torch.device, List[torch.device], List[str], str, None]) – The device to which the model is sent or, for multiple GPUs, the list of devices to which the model is sent. To use all available GPUs, pass the string “all”. By default, the current device is used as the main one. If None, the model is kept on its current device. (Default value = None)

  • logging (bool) – Whether or not to log the experiment’s progress. If True, various logging callbacks are inserted automatically, for example to output training and testing stats and to save model checkpoints. See train() and test() for more details. (Default value = True)

  • optimizer (Union[torch.optim.Optimizer, str]) – If a PyTorch Optimizer, it must already be initialized. If a str, it should be the optimizer’s name in PyTorch (e.g. ‘Adam’ for torch.optim.Adam). (Default value = ‘sgd’)

  • loss_function (Union[Callable, str], optional) – Loss layer or custom loss function. It can also be a string with the same name as a PyTorch loss function (either the functional or object name). The loss function must have the signature loss_function(input, target), where input is the prediction of the network and target is the ground truth. If None, it defaults, in priority order, to the model’s own loss function or to the default loss function associated with the task. (Default value = None)

  • batch_metrics (List, optional) – List of functions with the same signature as the loss function. Each metric can be any PyTorch loss function. It can also be a string with the same name as a PyTorch loss function (either the functional or object name). ‘accuracy’ (or just ‘acc’) is also a valid metric. Each metric function is called on each batch of the optimization and on the validation batches at the end of the epoch. (Default value = None)

  • epoch_metrics (List, optional) – List of functions with the same signature as EpochMetric. (Default value = None)

  • monitoring (bool) – Whether or not to monitor the training. If True, the best epoch is tracked. If False, monitor_metric and monitor_mode are not used and, when testing, the last epoch is used to test the model instead of the best epoch. (Default value = True)

  • monitor_metric (str, optional) –

    Which metric to consider for best model performance calculation. Should be in the format ‘{metric_name}’ or ‘val_{metric_name}’ (e.g. ‘val_loss’). If None, will follow the value suggested by task or default to ‘val_loss’. If monitoring is set to False, this argument is ignored.

    Warning

    If you do not plan on using a validation set, you must set the monitor metric to another value.

  • monitor_mode (str, optional) – Which mode, either ‘min’ or ‘max’, should be used when considering the monitor_metric value. If None, will follow the value suggested by task or default to ‘min’. If monitoring is set to False, this argument is ignored.

  • task (str, optional) – Any str beginning with either ‘classif’ or ‘reg’. Specifying a task can assign default values to loss_function, batch_metrics, epoch_metrics, monitor_metric and monitor_mode. For a task that begins with ‘reg’, the only default value is the loss function, which is the mean squared error. For a task that begins with ‘classif’, the default loss function is the cross-entropy loss, the default batch metric is the accuracy, the default epoch metric is the F1 score, and monitoring defaults to ‘val_acc’ with the ‘max’ mode. (Default value = None)
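
The way task fills in defaults can be summarized by the following minimal sketch. It is not Poutyne's implementation: the helper resolve_task_defaults, the string identifiers it returns ('mse', 'cross_entropy', 'f1') and the ValueError for an unknown task are assumptions chosen for illustration. Only the behavior described in the parameter list above is taken from the documentation, and arguments passed explicitly are assumed to take precedence over these defaults.

```python
from typing import Any, Dict, Optional


def resolve_task_defaults(task: Optional[str]) -> Dict[str, Any]:
    """Hypothetical helper mirroring the documented `task` defaults."""
    # Fallbacks documented for monitor_metric and monitor_mode when no
    # task-specific value applies.
    defaults: Dict[str, Any] = {
        "loss_function": None,
        "batch_metrics": [],
        "epoch_metrics": [],
        "monitor_metric": "val_loss",
        "monitor_mode": "min",
    }
    if task is None:
        return defaults
    if task.startswith("classif"):
        defaults.update(
            loss_function="cross_entropy",
            batch_metrics=["accuracy"],
            epoch_metrics=["f1"],
            monitor_metric="val_acc",
            monitor_mode="max",
        )
    elif task.startswith("reg"):
        # Regression only sets the loss; monitoring keeps the fallbacks.
        defaults.update(loss_function="mse")
    else:
        # Assumption: an unknown task is treated as an error.
        raise ValueError("task must begin with 'classif' or 'reg'")
    return defaults


classif_defaults = resolve_task_defaults("classification")
reg_defaults = resolve_task_defaults("regression")
```

Because only the prefix is checked, 'classif', 'classification' and 'classifier' would all select the same defaults in this sketch.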

Examples

Using a PyTorch DataLoader, on a classification task with the SGD optimizer:

import torch
from torch.utils.data import DataLoader, TensorDataset
from poutyne import Experiment

num_features = 20
num_classes = 5

# Our training dataset with 800 samples.
num_train_samples = 800
train_x = torch.rand(num_train_samples, num_features)
train_y = torch.randint(num_classes, (num_train_samples, ), dtype=torch.long)
train_dataset = TensorDataset(train_x, train_y)
train_generator = DataLoader(train_dataset, batch_size=32)

# Our validation dataset with 200 samples.
num_valid_samples = 200
valid_x = torch.rand(num_valid_samples, num_features)
valid_y = torch.randint(num_classes, (num_valid_samples, ), dtype=torch.long)
valid_dataset = TensorDataset(valid_x, valid_y)
valid_generator = DataLoader(valid_dataset, batch_size=32)

# Our network
pytorch_network = torch.nn.Linear(num_features, num_classes)

# Initialization of our experimentation and network training
exp = Experiment('./simple_example',
                 pytorch_network,
                 optimizer='sgd',
                 task='classif')
exp.train(train_generator, valid_generator, epochs=5)

The above code will yield an output similar to the lines below. Note the automatic checkpoint saving in the experiment directory whenever the monitored metric improves.

Epoch 1/5 0.09s Step 25/25: loss: 6.351375, acc: 1.375000, val_loss: 6.236106, val_acc: 5.000000
Epoch 1: val_acc improved from -inf to 5.00000, saving file to ./simple_example/checkpoint_epoch_1.ckpt
Epoch 2/5 0.10s Step 25/25: loss: 6.054254, acc: 14.000000, val_loss: 5.944495, val_acc: 19.500000
Epoch 2: val_acc improved from 5.00000 to 19.50000, saving file to ./simple_example/checkpoint_epoch_2.ckpt
Epoch 3/5 0.09s Step 25/25: loss: 5.759377, acc: 22.875000, val_loss: 5.655412, val_acc: 21.000000
Epoch 3: val_acc improved from 19.50000 to 21.00000, saving file to ./simple_example/checkpoint_epoch_3.ckpt
...
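
The "improved from ... to ..." messages above follow from comparing each epoch's monitored value with the best one seen so far. The sketch below illustrates that bookkeeping under stated assumptions: the function name epochs_with_new_best is hypothetical, and the code is not taken from Poutyne's checkpoint callbacks.

```python
import math
from typing import List


def epochs_with_new_best(values: List[float], mode: str = "max") -> List[int]:
    """Return the 1-based epochs at which the monitored value improves."""
    if mode not in ("min", "max"):
        raise ValueError("mode must be 'min' or 'max'")
    # Start from the worst possible value so the first epoch always improves,
    # which matches the "improved from -inf" message in the sample output.
    best = -math.inf if mode == "max" else math.inf
    improved_at = []
    for epoch, value in enumerate(values, start=1):
        is_better = value > best if mode == "max" else value < best
        if is_better:
            best = value
            improved_at.append(epoch)
    return improved_at


# val_acc values from the first three epochs of the sample output above.
saved_epochs = epochs_with_new_best([5.0, 19.5, 21.0], mode="max")
```

With a 'min' mode such as the default val_loss monitoring, the same loop starts from +inf and keeps the smallest value instead.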

Training can now easily be resumed from the last checkpoint:

exp.train(train_generator, valid_generator, epochs=10)
Restoring model from ./simple_example/checkpoint_epoch_3.ckpt
Loading weights from ./simple_example/checkpoint.ckpt and starting at epoch 6.
Loading optimizer state from ./simple_example/checkpoint.optim and starting at epoch 6.
Epoch 6/10 0.16s Step 25/25: loss: 4.897135, acc: 22.875000, val_loss: 4.813141, val_acc: 20.500000
Epoch 7/10 0.10s Step 25/25: loss: 4.621514, acc: 22.625000, val_loss: 4.545359, val_acc: 20.500000
Epoch 8/10 0.24s Step 25/25: loss: 4.354721, acc: 23.625000, val_loss: 4.287117, val_acc: 20.500000
...

Testing is also very intuitive:

exp.test(test_generator)
Restoring model from ./simple_example/checkpoint_epoch_9.ckpt
Found best checkpoint at epoch: 9
lr: 0.01, loss: 4.09892, acc: 23.625, val_loss: 4.04057, val_acc: 21.5
On best model: test_loss: 4.06664, test_acc: 17.5

Finally, all the pertinent metrics specified to the Experiment at each epoch are stored in a specific logging file, found here at ‘./simple_example/log.tsv’.

epoch       time                lr      loss                    acc     val_loss            val_acc
1       0.0721172170015052  0.01    6.351375141143799       1.375   6.23610631942749        5.0
2       0.0298177790245972  0.01    6.054253826141357       14.000  5.94449516296386        19.5
3       0.0637106419890187  0.01    5.759376544952392       22.875  5.65541223526001        21.0
...
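
Since the log is a plain tab-separated file, it can be inspected with the standard library alone. The following sketch assumes a header row followed by one row per epoch, as in the excerpt above. It reads an in-memory copy with abridged values rather than the real './simple_example/log.tsv'.

```python
import csv
import io

# Stand-in for the contents of ./simple_example/log.tsv (values abridged
# from the excerpt above); a real run would use open(path, newline="").
sample_log = (
    "epoch\ttime\tlr\tloss\tacc\tval_loss\tval_acc\n"
    "1\t0.0721\t0.01\t6.3514\t1.375\t6.2361\t5.0\n"
    "2\t0.0298\t0.01\t6.0543\t14.0\t5.9445\t19.5\n"
    "3\t0.0637\t0.01\t5.7594\t22.875\t5.6554\t21.0\n"
)

rows = list(csv.DictReader(io.StringIO(sample_log), delimiter="\t"))

# csv yields strings, so convert the monitored column before comparing.
best_row = max(rows, key=lambda row: float(row["val_acc"]))
print(best_row["epoch"], best_row["val_acc"])
```

For a metric monitored in 'min' mode, such as val_loss, min() would replace max().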

We could also use more than one GPU (on a single node) by using the device argument:

# Initialization of our experimentation and network training
exp = Experiment('./simple_example',
                 pytorch_network,
                 optimizer='sgd',
                 task='classif',
                 device="all")
exp.train(train_generator, valid_generator, epochs=5)

get_path(*paths: str) -> str

Returns the path inside the experiment directory.
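
As a rough illustration, the behavior can be thought of as joining the given path components onto the experiment directory. The function get_path_sketch below is a hypothetical stand-in written for this page, not the library's code.

```python
import os


def get_path_sketch(directory: str, *paths: str) -> str:
    """Hypothetical stand-in: join sub-paths onto the experiment directory."""
    return os.path.join(directory, *paths)


log_path = get_path_sketch("./simple_example", "log.tsv")
```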

get_best_epoch_stats() -> Dict

Returns all computed statistics corresponding to the best epoch according to the monitor_metric and monitor_mode attributes.

Returns

dict where each key is a column name in the logging output file and values are the ones found at the best epoch.

get_saved_epochs()

Returns a pandas DataFrame in which each row corresponds to an epoch that has a saved checkpoint.

Returns

pandas DataFrame in which each row corresponds to an epoch that has a saved checkpoint.

train(train_generator, valid_generator=None, **kwargs) -> List[Dict]

Trains or finetunes the model on a dataset using a generator. If a previous training already occurred and lasted a total of n_previous epochs, then the model’s weights will be set to the last checkpoint and the training will be resumed for epochs range (n_previous, epochs].
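
The interval (n_previous, epochs] excludes the epochs already completed and includes the target epoch. A minimal sketch of that arithmetic, using the hypothetical helper epochs_to_run:

```python
def epochs_to_run(n_previous: int, epochs: int) -> range:
    """Epochs covered by a resumed run: the interval (n_previous, epochs]."""
    # The range is empty when n_previous >= epochs.
    return range(n_previous + 1, epochs + 1)


# A first run of 5 epochs followed by a call with epochs=10.
resumed = list(epochs_to_run(n_previous=5, epochs=10))
```

Here resumed covers epochs 6 through 10, which is consistent with the "starting at epoch 6" message shown in the class-level example.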

If the Experiment has logging enabled (i.e. self.logging is True), numerous callbacks will be automatically included. Notably, two ModelCheckpoint objects will take care of saving the last and every new best (according to monitor mode) model weights in appropriate checkpoint files. OptimizerCheckpoint and LRSchedulerCheckpoint will also handle the saving of the optimizer’s and LR schedulers’ respective states for future retrieval. Moreover, an AtomicCSVLogger will save all available epoch statistics in an output .tsv file. Lastly, a TensorBoardLogger handles automatic TensorBoard logging of various neural network statistics.

Warning

With Jupyter Notebooks in Firefox, if colorama is installed and colors are enabled (as they are by default), a large number of epochs and steps per epoch can cause a spike in memory usage in Firefox. The problem does not occur in Google Chrome/Chromium. To avoid this problem, you can disable the colors by passing progress_options={'coloring': False}. See this GitHub issue for details.

Parameters
  • train_generator – Generator-like object for the training set. See fit_generator() for details on the types of generators supported.

  • valid_generator (optional) – Generator-like object for the validation set. See fit_generator() for details on the types of generators supported. (Default value = None)

  • callbacks (List[Callback]) – List of callbacks that will be called during training. These callbacks are added after those used in this method (see above), so you can assume that they are called after them. (Default value = None)

  • lr_schedulers – List of learning rate schedulers. (Default value = None)

  • keep_only_last_best (bool) – Whether only the last saved best checkpoint is kept. Applies only when save_every_epoch is false. (Default value = False)

  • save_every_epoch (bool, optional) – Whether or not to save the experiment model’s weights after every epoch. (Default value = False)

  • disable_tensorboard (bool, optional) – Whether or not to disable the automatic tensorboard logging callbacks. (Default value = False)

  • seed (int, optional) – Seed used to make the sampling deterministic. (Default value = 42)

  • kwargs – Any keyword arguments to pass to fit_generator().

Returns

List of dict containing the history of each epoch.
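
The returned history can be processed like any list of dictionaries. The sketch below uses a hand-written history whose keys mirror the log columns shown earlier ('epoch', 'loss', 'acc', 'val_loss', 'val_acc'). The exact keys of a real run depend on the configured metrics, so treat these names and values as assumptions.

```python
from typing import Dict, List

# Hypothetical history for three epochs, with values abridged from the
# sample output in the class-level example.
history: List[Dict[str, float]] = [
    {"epoch": 1, "loss": 6.351, "acc": 1.375, "val_loss": 6.236, "val_acc": 5.0},
    {"epoch": 2, "loss": 6.054, "acc": 14.0, "val_loss": 5.944, "val_acc": 19.5},
    {"epoch": 3, "loss": 5.759, "acc": 22.875, "val_loss": 5.655, "val_acc": 21.0},
]

# One value per epoch, e.g. for plotting a validation curve.
val_acc_curve = [entry["val_acc"] for entry in history]

# Entry of the epoch with the highest validation accuracy.
best_entry = max(history, key=lambda entry: entry["val_acc"])
```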

train_dataset(train_dataset, valid_dataset=None, **kwargs) -> List[Dict]

Trains or finetunes the model on a dataset. If a previous training already occurred and lasted a total of n_previous epochs, then the model’s weights will be set to the last checkpoint and the training will be resumed for epochs range (n_previous, epochs].

If the Experiment has logging enabled (i.e. self.logging is True), numerous callbacks will be automatically included. Notably, two ModelCheckpoint objects will take care of saving the last and every new best (according to monitor mode) model weights in appropriate checkpoint files. OptimizerCheckpoint and LRSchedulerCheckpoint will also handle the saving of the optimizer’s and LR schedulers’ respective states for future retrieval. Moreover, an AtomicCSVLogger will save all available epoch statistics in an output .tsv file. Lastly, a TensorBoardLogger handles automatic TensorBoard logging of various neural network statistics.

Parameters
  • train_dataset (Dataset) – Training dataset.

  • valid_dataset (Dataset) – Validation dataset.

  • callbacks (List[Callback]) – List of callbacks that will be called during training. These callbacks are added after those used in this method (see above), so you can assume that they are called after them. (Default value = None)

  • lr_schedulers – List of learning rate schedulers. (Default value = None)

  • keep_only_last_best (bool) – Whether only the last saved best checkpoint is kept. Applies only when save_every_epoch is false. (Default value = False)

  • save_every_epoch (bool, optional) – Whether or not to save the experiment model’s weights after every epoch. (Default value = False)

  • disable_tensorboard (bool, optional) – Whether or not to disable the automatic tensorboard logging callbacks. (Default value = False)

  • seed (int, optional) – Seed used to make the sampling deterministic. (Default value = 42)

  • kwargs – Any keyword arguments to pass to fit_dataset().

Returns

List of dict containing the history of each epoch.

train_data(x, y, validation_data=None, **kwargs) -> List[Dict]

Trains or finetunes the model on data in the form of NumPy arrays or torch tensors. If a previous training already occurred and lasted a total of n_previous epochs, then the model’s weights will be set to the last checkpoint and the training will be resumed for epochs range (n_previous, epochs].

If the Experiment has logging enabled (i.e. self.logging is True), numerous callbacks will be automatically included. Notably, two ModelCheckpoint objects will take care of saving the last and every new best (according to monitor mode) model weights in appropriate checkpoint files. OptimizerCheckpoint and LRSchedulerCheckpoint will also handle the saving of the optimizer’s and LR schedulers’ respective states for future retrieval. Moreover, an AtomicCSVLogger will save all available epoch statistics in an output .tsv file. Lastly, a TensorBoardLogger handles automatic TensorBoard logging of various neural network statistics.

Parameters
  • x (Union[Tensor, ndarray] or Union[tuple, list] of Union[Tensor, ndarray]) – Training dataset. Union[Tensor, ndarray] if the model has a single input. Union[tuple, list] of Union[Tensor, ndarray] if the model has multiple inputs.

  • y (Union[Tensor, ndarray] or Union[tuple, list] of Union[Tensor, ndarray]) – Target. Union[Tensor, ndarray] if the model has a single output. Union[tuple, list] of Union[Tensor, ndarray] if the model has multiple outputs.

  • validation_data (Tuple[x_val, y_val]) – Same format as x and y previously described. Validation dataset on which to evaluate the loss and any model metrics at the end of each epoch. The model will not be trained on this data. (Default value = None)

  • callbacks (List[Callback]) – List of callbacks that will be called during training. These callbacks are added after those used in this method (see above), so you can assume that they are called after them. (Default value = None)

  • lr_schedulers – List of learning rate schedulers. (Default value = None)

  • keep_only_last_best (bool) – Whether only the last saved best checkpoint is kept. Applies only when save_every_epoch is false. (Default value = False)

  • save_every_epoch (bool, optional) – Whether or not to save the experiment model’s weights after every epoch. (Default value = False)

  • disable_tensorboard (bool, optional) – Whether or not to disable the automatic tensorboard logging callbacks. (Default value = False)

  • seed (int, optional) – Seed used to make the sampling deterministic. (Default value = 42)

  • kwargs – Any keyword arguments to pass to fit().

Returns

List of dict containing the history of each epoch.

load_checkpoint(checkpoint: Union[int, str], *, verbose: bool = False, strict: bool = True) -> Optional[Dict]

Loads the model’s weights with the weights at a given checkpoint epoch.

Parameters
  • checkpoint (Union[int, str]) –

    Which checkpoint to load the model’s weights from.

    • If ‘best’, will load the best weights according to monitor_metric and monitor_mode.

    • If ‘last’, will load the last model checkpoint.

    • If int, will load the checkpoint of the specified epoch.

    • If a path (str), will load the model pickled state_dict weights (for instance, saved as torch.save(a_pytorch_network.state_dict(), "./a_path.p")).

  • verbose (bool, optional) – Whether or not to print the checkpoint filename, and the best epoch number and stats when checkpoint is ‘best’. (Default value = False)

Returns

If checkpoint is ‘best’, returns the best epoch stats, as per get_best_epoch_stats(). If checkpoint is ‘last’, returns the last epoch stats. If checkpoint is an int, returns the stats of that epoch. If checkpoint is a path, returns the stats of that specific checkpoint. Otherwise, returns None.
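
The four accepted forms of checkpoint can be pictured as a simple dispatch. The sketch below is an illustration only: the helper resolve_checkpoint_file is hypothetical, and the file names 'checkpoint.ckpt' and 'checkpoint_epoch_{n}.ckpt' are inferred from the sample output in the class-level example rather than from the library's source.

```python
import os
from typing import Union


def resolve_checkpoint_file(directory: str, checkpoint: Union[int, str], best_epoch: int) -> str:
    """Hypothetical helper: map a `checkpoint` argument to a weights file."""
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(checkpoint, bool):
        raise TypeError("checkpoint must be an int or a str")
    if isinstance(checkpoint, int):
        return os.path.join(directory, f"checkpoint_epoch_{checkpoint}.ckpt")
    if checkpoint == "best":
        return os.path.join(directory, f"checkpoint_epoch_{best_epoch}.ckpt")
    if checkpoint == "last":
        return os.path.join(directory, "checkpoint.ckpt")
    # Any other string is treated as a path to a saved state_dict.
    return checkpoint


last_file = resolve_checkpoint_file("./simple_example", "last", best_epoch=3)
best_file = resolve_checkpoint_file("./simple_example", "best", best_epoch=3)
```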

test(test_generator, **kwargs)

Computes and returns the loss and the metrics of the model on a given test examples generator.

If the Experiment has logging enabled (i.e. self.logging is True), a checkpoint (the best one by default) is loaded and test and validation statistics are saved in a specific test output .tsv file. Otherwise, the current weights of the network are used for testing and statistics are only shown in the standard output.

Parameters
  • test_generator – Generator-like object for the test set. See fit_generator() for details on the types of generators supported.

  • checkpoint (Union[str, int]) –

    Which model checkpoint weights to load for the test evaluation.

    • If ‘best’, will load the best weights according to monitor_metric and monitor_mode.

    • If ‘last’, will load the last model checkpoint.

    • If int, will load the checkpoint of the specified epoch.

    • If a path (str), will load the model pickled state_dict weights (for instance, saved as torch.save(a_pytorch_network.state_dict(), "./a_path.p")).

    This argument has no effect when logging is disabled. (Default value = ‘best’)

  • seed (int, optional) – Seed used to make the sampling deterministic. (Default value = 42)

  • name (str) – Prefix of the test log file. (Default value = ‘test’)

  • kwargs – Any keyword arguments to pass to evaluate_generator().

If the Experiment has logging enabled (i.e. self.logging is True), one callback will be automatically included to save the test metrics. Moreover, an AtomicCSVLogger will save the test metrics in an output .tsv file.

Returns

dict mapping each test metric name to its value.

test_dataset(test_dataset, **kwargs) -> Dict

Computes and returns the loss and the metrics of the model on a given test dataset.

If the Experiment has logging enabled (i.e. self.logging is True), a checkpoint (the best one by default) is loaded and test and validation statistics are saved in a specific test output .tsv file. Otherwise, the current weights of the network are used for testing and statistics are only shown in the standard output.

Parameters
  • test_dataset (Dataset) – Test dataset.

  • checkpoint (Union[str, int]) –

    Which model checkpoint weights to load for the test evaluation.

    • If ‘best’, will load the best weights according to monitor_metric and monitor_mode.

    • If ‘last’, will load the last model checkpoint.

    • If int, will load the checkpoint of the specified epoch.

    • If a path (str), will load the model pickled state_dict weights (for instance, saved as torch.save(a_pytorch_network.state_dict(), "./a_path.p")).

    This argument has no effect when logging is disabled. (Default value = ‘best’)

  • seed (int, optional) – Seed used to make the sampling deterministic. (Default value = 42)

  • name (str) – Prefix of the test log file. (Default value = ‘test’)

  • kwargs – Any keyword arguments to pass to evaluate_dataset().

If the Experiment has logging enabled (i.e. self.logging is True), one callback will be automatically included to save the test metrics. Moreover, an AtomicCSVLogger will save the test metrics in an output .tsv file.

Returns

dict mapping each test metric name to its value.

test_data(x, y, **kwargs) -> Dict

Computes and returns the loss and the metrics of the model on a given test dataset.

If the Experiment has logging enabled (i.e. self.logging is True), a checkpoint (the best one by default) is loaded and test and validation statistics are saved in a specific test output .tsv file. Otherwise, the current weights of the network are used for testing and statistics are only shown in the standard output.

Parameters
  • x (Union[Tensor, ndarray] or Union[tuple, list] of Union[Tensor, ndarray]) – Input to the model. Union[Tensor, ndarray] if the model has a single input. Union[tuple, list] of Union[Tensor, ndarray] if the model has multiple inputs.

  • y (Union[Tensor, ndarray] or Union[tuple, list] of Union[Tensor, ndarray]) – Target, corresponding ground truth. Union[Tensor, ndarray] if the model has a single output. Union[tuple, list] of Union[Tensor, ndarray] if the model has multiple outputs.

  • checkpoint (Union[str, int]) –

    Which model checkpoint weights to load for the test evaluation.

    • If ‘best’, will load the best weights according to monitor_metric and monitor_mode.

    • If ‘last’, will load the last model checkpoint.

    • If int, will load the checkpoint of the specified epoch.

    • If a path (str), will load the model pickled state_dict weights (for instance, saved as torch.save(a_pytorch_network.state_dict(), "./a_path.p")).

    This argument has no effect when logging is disabled. (Default value = ‘best’)

  • seed (int, optional) – Seed used to make the sampling deterministic. (Default value = 42)

  • name (str) – Prefix of the test log file. (Default value = ‘test’)

  • kwargs – Any keyword arguments to pass to evaluate().

If the Experiment has logging enabled (i.e. self.logging is True), one callback will be automatically included to save the test metrics. Moreover, an AtomicCSVLogger will save the test metrics in an output .tsv file.

Returns

dict mapping each test metric name to its value.

infer(generator, **kwargs) -> Any

Returns the predictions of the network given batches of samples x, where the tensors are converted into NumPy arrays.

Parameters
  • generator – Generator-like object for the dataset. The generator must yield a batch of samples. See the fit_generator() method for details on the types of generators supported. This should only yield input data x and NOT the target y.

  • checkpoint (Union[str, int]) –

    Which model checkpoint weights to load for the prediction.

    • If ‘best’, will load the best weights according to monitor_metric and monitor_mode.

    • If ‘last’, will load the last model checkpoint.

    • If int, will load the checkpoint of the specified epoch.

    • If a path (str), will load the model pickled state_dict weights (for instance, saved as torch.save(a_pytorch_network.state_dict(), "./a_path.p")).

    This argument has no effect when logging is disabled. (Default value = ‘best’)

  • kwargs – Any keyword arguments to pass to predict_generator().

Returns

Depends on the value of concatenate_returns. By default (concatenate_returns is True), the data structures (tensor, tuple, list, dict) returned as predictions for the batches are merged together. In the merge, the tensors are converted into NumPy arrays and then concatenated together. If concatenate_returns is False, a list of the predictions for the batches is returned, with tensors converted into NumPy arrays.
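
The difference between the two return modes can be sketched with plain Python lists standing in for NumPy arrays. The helper merge_predictions is hypothetical and handles only flat per-batch lists. The library itself is described above as also merging tuples, lists and dicts of tensors, which this sketch does not attempt.

```python
from itertools import chain
from typing import List, Union

Batch = List[float]


def merge_predictions(batches: List[Batch], concatenate_returns: bool = True) -> Union[Batch, List[Batch]]:
    """Hypothetical helper: merge per-batch predictions or keep them separate."""
    if concatenate_returns:
        # One flat sequence with every sample, in batch order.
        return list(chain.from_iterable(batches))
    # One entry per batch, each holding that batch's predictions.
    return [list(batch) for batch in batches]


batch_predictions = [[0.1, 0.9], [0.4, 0.6], [0.7]]
merged = merge_predictions(batch_predictions)
per_batch = merge_predictions(batch_predictions, concatenate_returns=False)
```

Keeping the batches separate can be useful when the last batch is smaller or when per-batch post-processing is needed.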

infer_dataset(dataset, **kwargs) -> Any

Returns the inferred predictions of the network given a dataset, where the tensors are converted into NumPy arrays.

Parameters
  • dataset (Dataset) – Dataset. Must not return y, just x.

  • checkpoint (Union[str, int]) –

    Which model checkpoint weights to load for the prediction.

    • If ‘best’, will load the best weights according to monitor_metric and monitor_mode.

    • If ‘last’, will load the last model checkpoint.

    • If int, will load the checkpoint of the specified epoch.

    • If a path (str), will load the model pickled state_dict weights (for instance, saved as torch.save(a_pytorch_network.state_dict(), "./a_path.p")).

    This argument has no effect when logging is disabled. (Default value = ‘best’)

  • kwargs – Any keyword arguments to pass to predict_dataset().

Returns

The predictions, in the format output by the model.

infer_data(x, **kwargs) -> Any

Returns the inferred predictions of the network given a dataset x, where the tensors are converted into NumPy arrays.

Parameters
  • x (Union[Tensor, ndarray] or Union[tuple, list] of Union[Tensor, ndarray]) – Input to the model. Union[Tensor, ndarray] if the model has a single input. Union[tuple, list] of Union[Tensor, ndarray] if the model has multiple inputs.

  • checkpoint (Union[str, int]) –

    Which model checkpoint weights to load for the prediction.

    • If ‘best’, will load the best weights according to monitor_metric and monitor_mode.

    • If ‘last’, will load the last model checkpoint.

    • If int, will load the checkpoint of the specified epoch.

    • If a path (str), will load the model pickled state_dict weights (for instance, saved as torch.save(a_pytorch_network.state_dict(), "./a_path.p")).

    This argument has no effect when logging is disabled. (Default value = ‘best’)

  • kwargs – Any keyword arguments to pass to predict().

Returns

The predictions, in the format output by the model.

is_better_than(another_experiment) -> bool

Compares the results of this Experiment with those of another experiment. For the comparison to be possible, both Experiments must be logged and must monitor the same metric with the same monitor mode (“min” or “max”).

Parameters

  • another_experiment (Experiment) – Another Poutyne experiment to compare results with.

Returns

Whether the Experiment is better than the Experiment to compare with.
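
A minimal sketch of such a comparison is given below, assuming each experiment can be summarized by its monitored metric, its monitor mode and the best value it reached. The BestResult dataclass, the free function is_better_than and the ValueError raised on a mismatch are all hypothetical and written for illustration. They do not reproduce the method's actual code.

```python
from dataclasses import dataclass


@dataclass
class BestResult:
    """Hypothetical summary of an experiment's best monitored value."""
    monitor_metric: str
    monitor_mode: str  # 'min' or 'max'
    value: float


def is_better_than(current: BestResult, other: BestResult) -> bool:
    """Compare two summaries that monitor the same metric in the same mode."""
    same_setup = (current.monitor_metric, current.monitor_mode) == (
        other.monitor_metric,
        other.monitor_mode,
    )
    if not same_setup:
        # Assumption: mismatched setups are reported as an error.
        raise ValueError("both experiments must monitor the same metric with the same mode")
    if current.monitor_mode == "max":
        return current.value > other.value
    return current.value < other.value


exp_a = BestResult("val_acc", "max", 21.5)
exp_b = BestResult("val_acc", "max", 19.5)
a_wins = is_better_than(exp_a, exp_b)
```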