Transfer Learning: Model Is Giving Unchanged Loss Results. Is It Not Training?

January 14, 2023 Post a Comment

I'm trying to train a Regression Model on Inception V3. Inputs are images of size (96,320,3). There are a total of 16k+ images out of which 12k+ are for training and the rest for v

Solution 1:

As your problem is a regression problem, the activation of the last layer should be linear instead of relu. And also the learning rate is too high, you should consider to lower it according to your overall set up. Here I'm showing a code sample with MNIST.

# data 
(xtrain, train_target), (xtest, test_target) = tf.keras.datasets.mnist.load_data()
# train_x, MNIST is gray scale, so in order to use it in pretrained weights , extending it to 3 axix
x_train = np.expand_dims(xtrain, axis=-1)
x_train = np.repeat(x_train, 3, axis=-1)
x_train = x_train.astype('float32') / 255
# prepare the label for regression model 
ytrain4 = tf.square(tf.cast(train_target, tf.float32))

# base model 
inception = InceptionV3(weights='imagenet', include_top=False, input_shape=(75,75,3))
inception.trainable = False

# inputs layer
driving_input = tf.keras.layers.Input(shape=(28,28,3))
resized_input = tf.keras.layers.Lambda(lambda image: tf.image.resize(image,(75,75)))(driving_input)
inp = inception(resized_input)

# top model 
x = GlobalAveragePooling2D()(inp)
x = Dense(512, activation = 'relu')(x)
x = Dense(256, activation = 'relu')(x)
x = Dropout(0.25)(x)
x = Dense(128, activation = 'relu')(x)
x = Dense(64, activation = 'relu')(x)
x = Dropout(0.25)(x)
result = Dense(1, activation = 'linear')(x)

# hyper-param
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(initial_learning_rate=0.0001, 
                                                             decay_steps=100000, decay_rate=0.95)
optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule)
loss = tf.keras.losses.Huber(delta=0.5, reduction="auto", name="huber_loss")

# build models
model = tf.keras.Model(inputs = driving_input, outputs = result)
model.compile(optimizer=optimizer, loss=loss)

# callbacks
checkpoint = tf.keras.callbacks.ModelCheckpoint(filepath="./ckpts/model.h5", monitor='val_loss', save_best_only=True)
stopper = tf.keras.callbacks.EarlyStopping(monitor='val_loss', min_delta=0.0003, patience = 10)

batch_size = 32
epochs = 10

# fit 
model.fit(x=x_train, y=ytrain4, shuffle=True, validation_split=0.2, epochs=epochs, 
          batch_size=batch_size, verbose=1, callbacks=[checkpoint, stopper])

Output

1500/1500 [==============================] - 27s 18ms/step - loss: 5.2239 - val_loss: 3.6060
Epoch 2/10
1500/1500 [==============================] - 26s 17ms/step - loss: 3.5634 - val_loss: 2.9022
Epoch 3/10
1500/1500 [==============================] - 26s 17ms/step - loss: 3.0629 - val_loss: 2.5063
Epoch 4/10
1500/1500 [==============================] - 26s 17ms/step - loss: 2.7615 - val_loss: 2.3764
Epoch 5/10
1500/1500 [==============================] - 26s 17ms/step - loss: 2.5371 - val_loss: 2.1303
Epoch 6/10
1500/1500 [==============================] - 26s 17ms/step - loss: 2.3848 - val_loss: 2.1373
Epoch 7/10
1500/1500 [==============================] - 26s 17ms/step - loss: 2.2653 - val_loss: 1.9039
Epoch 8/10
1500/1500 [==============================] - 26s 17ms/step - loss: 2.1581 - val_loss: 1.9087
Epoch 9/10
1500/1500 [==============================] - 26s 17ms/step - loss: 2.0518 - val_loss: 1.7193
Epoch 10/10
1500/1500 [==============================] - 26s 17ms/step - loss: 1.9699 - val_loss: 1.8837

Baca Juga

Getting Started with Python

Transfer Learning: Model Is Giving Unchanged Loss Results. Is It Not Training?

Solution 1:

Post a Comment for "Transfer Learning: Model Is Giving Unchanged Loss Results. Is It Not Training?"