
Passing `training=true` When Doing TensorFlow Training

TensorFlow's official tutorial says that we should pass `base_model(training=False)` during training so that the BN layers do not update their mean and variance statistics. My question is: why?

Solution 1:

When a model is trained from scratch, batch norm should be enabled so the BN layers can tune their mean and variance, as you mentioned. Fine-tuning or transfer learning is a different situation: you already have a model that can do more than you need, and you want to specialize that pre-trained model to your own task and data set. In this case most of the weights are frozen and only some layers closest to the output are changed. Since BN layers are used throughout the model, you should freeze them as well. Check this explanation again:

> **Important note about BatchNormalization layers**
>
> Many models contain `tf.keras.layers.BatchNormalization` layers. This layer is a special case and precautions should be taken in the context of fine-tuning, as shown later in this tutorial.
>
> When you set `layer.trainable = False`, the `BatchNormalization` layer will run in inference mode, and will not update its mean and variance statistics.
>
> When you unfreeze a model that contains `BatchNormalization` layers in order to do fine-tuning, you should keep the `BatchNormalization` layers in inference mode by passing `training=False` when calling the base model. Otherwise, the updates applied to the non-trainable weights will destroy what the model has learned.

Source: TensorFlow's transfer learning tutorial, details regarding freezing layers.
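
Here is a minimal, runnable sketch of the pattern the tutorial describes. The MobileNetV2 base, the input size, and the single-logit head are illustrative assumptions, not part of the original question; the two lines that matter are `base_model.trainable = False` and `base_model(inputs, training=False)`:

```python
import tensorflow as tf

# Load a pre-trained base and freeze all of its weights
# (illustrative choice of base model and input size).
base_model = tf.keras.applications.MobileNetV2(
    input_shape=(160, 160, 3), include_top=False, weights="imagenet"
)
base_model.trainable = False  # optimizer will not update these weights

inputs = tf.keras.Input(shape=(160, 160, 3))
# training=False forces the BatchNormalization layers to run in
# inference mode: they use their stored moving mean/variance instead
# of per-batch statistics, even after layers are later unfrozen.
x = base_model(inputs, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(1)(x)  # single-logit head (assumption)
model = tf.keras.Model(inputs, outputs)

model.compile(
    optimizer=tf.keras.optimizers.Adam(1e-4),
    loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
)
```

Note that these are two independent switches: `trainable = False` freezes the weights so gradient updates don't touch them, while `training=False` at call time controls which statistics the BN layers use. The second keeps doing its job even if you later set `base_model.trainable = True` to fine-tune the top layers.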
