Train models

The following code details the functions and classes that are available to train an image classification model (including multilabel image classification) and optimize its hyperparameters. Progressive learning can also be performed to add new classes to a model that was already trained with the library, without losing all the information learned.

An example of how to use these functions can be found in Train and optimize model.

Image classification including Multilabel classification

class decavision.model_training.tfrecords_image_classifier.ImageClassifier(tfrecords_folder, batch_size=128, transfer_model='Inception', augment=True, input_shape=None, multilabel=False)

Class to train an image classification model by using transfer learning. A hyperparameter optimization tool is also provided. Data must be saved in tfrecords format. See data_preparation.generate_tfrecords to go from image data to tfrecords.

Parameters
  • tfrecords_folder (str) – location of tfrecords (can be on google storage if authenticated), saved in folders train and val, filenames of the form filenumber-numberofimages.tfrec

  • batch_size (int) – size of batches of data used for training

  • transfer_model (str) – pretrained model to use for transfer learning, can be one of Inception, Xception, Inception_Resnet, Resnet, (EfficientNet) B0, B3, B5, B7 or (EfficientNetV2) V2-S, V2-M, V2-L

  • augment (boolean) – Whether to augment the training data, default is True

  • input_shape (tuple(int,int)) – shape of the input images for the model, if not specified, recommended sizes are used for each one

  • multilabel (boolean) – whether each image is attached to multiple classes

data_augment(image, label)

Data augmentation pipeline that augments the data by randomly flipping the images and changing their brightness and saturation for each batch during training.

References: https://www.wouterbulten.nl/blog/tech/data-augmentation-using-tensorflow-data-dataset/#code https://www.tensorflow.org/tutorials/images/data_augmentation

Returns

augmented images and labels
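The kind of augmentation described above can be illustrated with a minimal, library-free sketch. This is a conceptual analogue only, not decavision's actual implementation, which operates on TensorFlow tensors inside the input pipeline; the image here is a plain nested list of pixel values in [0, 1]:

```python
import random

def random_flip(image):
    """Mirror each row left-right with 50% probability (analogous to a random horizontal flip)."""
    if random.random() < 0.5:
        return [list(reversed(row)) for row in image]
    return image

def random_brightness(image, max_delta=0.1):
    """Shift every pixel by a single random delta, clipped back to [0, 1]."""
    delta = random.uniform(-max_delta, max_delta)
    return [[min(1.0, max(0.0, p + delta)) for p in row] for row in image]

# A tiny 2x2 grayscale "image"; real pipelines apply the same idea per batch.
image = [[0.2, 0.5], [0.8, 1.0]]
augmented = random_brightness(random_flip(image))
```

Because each transformation is random, the same image yields different augmented versions on every epoch, which is what makes augmentation act as a regularizer.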

fit(save_model=None, export_model=None, patience=0, epochs=5, hidden_size=1024, learning_rate=0.001, learning_rate_fine_tuning=0.0001, dropout=0.5, l2_lambda=0.0005, fine_tuning=True, verbose=True, logs=None, activation='swish')

Train an image classification model based on a pretrained model. A classification layer is added to the pretrained model, potentially preceded by an extra combination of Dense, Dropout and BatchNorm layers. Only the added layers are trained, unless fine tuning is requested, in which case a second round of training is done with the last block of the pretrained model unfrozen. Training can be stopped early if there is no sufficient improvement in accuracy.

If one of the EfficientNet B models is used, the model includes a layer that normalizes the pixels. This preprocessing step is not included in the other models, so it has to be done on the data separately.

Parameters
  • learning_rate (float) – learning rate used when training extra layers

  • learning_rate_fine_tuning (float) – learning rate used when fine tuning pretrained model

  • epochs (int) – number of epochs done when training (doubled if fine tuning)

  • activation (str) – activation function to use in extra layer, any keras activation is possible

  • hidden_size (int) – number of neurons in extra layer, no layer if 0

  • save_model (str) – specify a name for the trained model to save it in .h5 format

  • export_model (str) – specify a name for the trained model to save it in .pb format

  • dropout (float) – rate of dropout to use in extra layer (<1)

  • verbose (bool) – show details of training or not

  • fine_tuning (bool) – fine tune pretrained model or not

  • l2_lambda (float) – amount of L2 regularization to include in extra layer

  • patience (int) – if non zero, stop training when improvement in val accuracy is not observed for the given number of epochs. If used, best model is restored when training is stopped

  • logs (str) – if specified, tensorboard is used and logs are saved at this location
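A hedged usage sketch of training with this class, assuming the library is installed and the tfrecords were already generated with data_preparation.generate_tfrecords; the folder name, model choice and hyperparameter values are illustrative only:

```python
# Guard the import so the sketch degrades gracefully if decavision is absent.
try:
    from decavision.model_training.tfrecords_image_classifier import ImageClassifier
except ImportError:  # library not installed in this environment
    ImageClassifier = None

if ImageClassifier is not None:
    classifier = ImageClassifier(
        tfrecords_folder="data/tfrecords",  # must contain train/ and val/ subfolders
        batch_size=64,
        transfer_model="B3",  # EfficientNet B3
    )
    # First round trains only the added layers; since fine_tuning=True (the
    # default), a second round of the same length unfreezes the last block.
    classifier.fit(
        epochs=5,
        hidden_size=1024,
        dropout=0.5,
        patience=2,             # early stopping on validation accuracy
        save_model="my_model",  # saved in .h5 format
        logs="logs",            # enables tensorboard logging
    )
```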

get_training_dataset()

Extract data from training tfrecords located in tfrecords_folder. Data is shuffled and augmented.

Returns

iterable dataset with content of training tfrecords (images and labels)

Return type

tf.data.Dataset

get_validation_dataset()

Extract data from validation tfrecords located in tfrecords_folder.

Returns

iterable dataset with content of validation tfrecords (images and labels)

Return type

tf.data.Dataset
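The two dataset methods can be used to inspect the data directly. A hedged sketch, assuming the library and its TensorFlow dependency are installed and that `data/tfrecords` (illustrative) holds the generated files:

```python
# Guarded import so the sketch does not hard-fail without the library.
try:
    from decavision.model_training.tfrecords_image_classifier import ImageClassifier
except ImportError:
    ImageClassifier = None

if ImageClassifier is not None:
    classifier = ImageClassifier(tfrecords_folder="data/tfrecords")
    train_ds = classifier.get_training_dataset()    # shuffled and augmented
    val_ds = classifier.get_validation_dataset()    # left untouched
    # Both are tf.data.Dataset objects, so standard methods apply:
    for images, labels in train_ds.take(1):
        print(images.shape, labels.shape)  # one batch of images and labels
```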

hyperparameter_optimization(num_iterations=20, n_random_starts=10, patience=0, save_results=False)

Try different combinations of hyperparameters to find the best model possible. Start by trying random combinations, then learn from the previous tries. A scikit-optimize checkpoint is saved at each step in the working directory; if a checkpoint is present in the working directory, the optimization resumes from where it left off. Logs of all tries are also saved in the working directory. The hyperparameters that are varied are epochs, hidden_size, learning_rate, learning_rate_fine_tuning, fine_tuning, dropout and l2_lambda. The best combination can be saved at the end of the optimization.

Parameters
  • n_random_starts (int) – number of random combinations of hyperparameters first tried

  • num_iterations (int) – total number of hyperparameter combinations to try (aim for a 1:1 to 2:1 ratio num_iterations/n_random_starts)

  • patience (int) – if non zero, stop training when improvement in val accuracy is not observed for the given number of epochs. If used, best model is restored when training is stopped

  • save_results (bool) – decide to save optimal hyperparameters in hyperparameters_dimensions.pickle when done
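A hedged sketch of launching an optimization run, assuming the library is installed; the folder name is illustrative, and the iteration counts follow the 2:1 ratio suggested above (10 random tries, then 10 guided ones):

```python
# Guarded import: the sketch is a no-op if decavision is unavailable.
try:
    from decavision.model_training.tfrecords_image_classifier import ImageClassifier
except ImportError:
    ImageClassifier = None

if ImageClassifier is not None:
    classifier = ImageClassifier(tfrecords_folder="data/tfrecords")
    classifier.hyperparameter_optimization(
        num_iterations=20,    # total combinations tried
        n_random_starts=10,   # purely random exploration first
        patience=2,           # early stopping within each training run
        save_results=True,    # keep the best combination when done
    )
```

Because the scikit-optimize checkpoint is written to the working directory at each step, an interrupted run can simply be relaunched from the same directory to continue.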

Progressive learning

class decavision.model_training.progressive_learning.ProgressiveLearner(tfrecords_folder, model_path, transfer_model, batch_size=128)

Class to update an already trained model with new classes without losing too much of the information learned about the old classes.

Parameters
  • tfrecords_folder (str) – location of tfrecords (can be on google storage if authenticated), saved in folders train and val, filenames of the form filenumber-numberofimages.tfrec

  • model_path (str) – path to .h5 model trained with this library on the old classes

  • transfer_model (str) – pretrained model that was used to train the old model, can be one of Inception, Xception, Inception_Resnet, Resnet, B0, B3, B5, B7, V2-S, V2-M, V2-L or V2-XL

  • batch_size (int) – size of batches of data used for training

fit(learning_rate=0.001, learning_rate_fine_tuning=0.0001, epochs=5, save_model=False, verbose=True, fine_tuning=True, logs=None)

Train an image classification model based on a model trained with a smaller number of classes. The whole model is trained, unless fine tuning is requested, in which case a second round of training is done with the last layers of the model unfrozen. Training can be stopped early if there is no sufficient improvement in accuracy.

Parameters
  • learning_rate (float) – learning rate used when training whole model

  • learning_rate_fine_tuning (float) – learning rate used when fine tuning last layers

  • epochs (int) – number of epochs done when training (doubled if fine tuning)

  • save_model (str) – specify a name for the trained model to save it, model is saved in .h5 if the name contains the extension and in .pb if no extension in the name

  • verbose (bool) – show details of training or not

  • fine_tuning (bool) – fine tune last layers or not

  • min_accuracy (float) – if specified, stop training when improvement in accuracy is smaller than min_accuracy

  • logs (str) – if specified, tensorboard is used and logs are saved at this location
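A hedged sketch of progressive learning, assuming the library is installed, that a model was previously trained and saved with ImageClassifier, and that new tfrecords including the new classes were generated; all names are illustrative:

```python
# Guarded import so the sketch degrades gracefully without the library.
try:
    from decavision.model_training.progressive_learning import ProgressiveLearner
except ImportError:
    ProgressiveLearner = None

if ProgressiveLearner is not None:
    learner = ProgressiveLearner(
        tfrecords_folder="data/tfrecords_new",  # data including the new classes
        model_path="my_model.h5",               # model trained on the old classes
        transfer_model="B3",                    # must match the original training
    )
    # Whole model is trained first; fine_tuning=True adds a second round
    # with the last layers unfrozen (epochs are doubled).
    learner.fit(
        epochs=5,
        fine_tuning=True,
        save_model="updated_model.h5",  # .h5 extension -> saved in .h5 format
    )
```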

hyperparameter_optimization()

This class is not implemented for progressive learning.