
cyclic-learning-rate's Introduction

Cyclic Learning Rate (CLR)

TensorFlow implementation of cyclic learning rate from the paper: Smith, Leslie N. "Cyclical learning rates for training neural networks." 2017.

TOC

  1. What is CLR?
  2. Usage
  3. Functional Tests
  4. License

What is CLR?

CLR improves the way the learning rate is scheduled during training, providing better convergence and helping to regularize deep learning models. It eliminates the need to experimentally search for the best global learning rate by letting the rate vary cyclically between a lower and an upper boundary. The idea is to divide the training process into cycles determined by a stepsize parameter, which defines the number of iterations in half a cycle. The author suggests setting the stepsize to:

stepsize = (2 to 10) times the number of iterations in an epoch

For example, with 50,000 training examples and a batch size of 100 (500 iterations per epoch), any stepsize between 1,000 and 5,000 iterations falls in this range.

The learning rate is computed as:

cycle = floor( 1 + global_step / ( 2 * step_size ) )
x = abs( global_step / step_size - 2 * cycle + 1 )
clr = learning_rate + ( max_lr - learning_rate ) * max( 0, 1 - x )
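
As a sanity check, here is a minimal Python sketch of the 'triangular' policy. The names mirror the formula above; the parameter values (learning_rate=0.01, max_lr=0.1, step_size=20) are example choices, not the library's defaults:

import math

def triangular_clr(global_step, learning_rate=0.01, max_lr=0.1, step_size=20):
    # cycle: index of the current triangle (1-based)
    cycle = math.floor(1 + global_step / (2 * step_size))
    # x: position within the current cycle, mapped so that 1 - x
    # rises from 0 to 1 and falls back to 0 over one full cycle
    x = abs(global_step / step_size - 2 * cycle + 1)
    # linear interpolation between the lower and upper boundaries
    return learning_rate + (max_lr - learning_rate) * max(0.0, 1 - x)

# The rate climbs for step_size iterations, then descends back:
print([round(triangular_clr(s), 3) for s in (0, 10, 20, 30, 40)])
# -> [0.01, 0.055, 0.1, 0.055, 0.01]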

The author proposes three variations of this policy:

  • 'triangular': Default, linearly increasing then linearly decreasing the learning rate at each cycle.
  • 'triangular2': The same as the triangular policy, except that the learning rate difference is cut in half at the end of each cycle, so the range between the boundaries shrinks as training progresses.
  • 'exp_range': The learning rate varies between the minimum and maximum boundaries, and each boundary value declines by an exponential factor of:
f = gamma ^ global_step

where global_step is the current iteration count and gamma is a constant passed as an argument to cyclic_learning_rate.
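
Both variants reuse the same triangular wave and only rescale its amplitude. A compact sketch of all three modes, under the same example parameters as the sketch above (the scaling rules follow the descriptions in the list):

import math

def clr_value(global_step, mode='triangular', learning_rate=0.01,
              max_lr=0.1, step_size=20, gamma=0.997):
    cycle = math.floor(1 + global_step / (2 * step_size))
    x = abs(global_step / step_size - 2 * cycle + 1)
    amplitude = (max_lr - learning_rate) * max(0.0, 1 - x)
    if mode == 'triangular2':
        amplitude /= 2 ** (cycle - 1)      # halve the range after each cycle
    elif mode == 'exp_range':
        amplitude *= gamma ** global_step  # decay the range exponentially
    return learning_rate + amplitude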

Usage

Upgrade to the latest version of TensorFlow:

!pip install --upgrade tensorflow
import tensorflow as tf
tf.__version__

[out]: '1.9.0'

Eager mode

Eager mode evaluates operations immediately, without building graphs.

Enable eager execution:

import tensorflow as tf
tf.enable_eager_execution()

Generate cyclic learning rates:

import clr
import matplotlib.pyplot as plt
%matplotlib inline

print(tf.executing_eagerly()) # => True

rates = []

for i in range(0, 250):
    # in eager mode, cyclic_learning_rate returns a zero-argument
    # callable that yields the rate for the given step
    x = clr.cyclic_learning_rate(i, mode='exp_range', gamma=.997)
    rates.append(x())

plt.xlabel('iterations (epochs)')
plt.ylabel('learning rate')
plt.plot(range(250), rates)

#plt.savefig('exp_range.png', dpi=600)
[out]:
True

[plot: learning rate over 250 iterations, 'exp_range' mode]

Graph mode

import tensorflow as tf
import clr
import matplotlib.pyplot as plt
%matplotlib inline

print(tf.executing_eagerly()) # => False

rates = []

with tf.Session() as sess:
    for i in range(0, 250):
        # in graph mode, cyclic_learning_rate returns a tensor to be evaluated
        rates.append(sess.run(clr.cyclic_learning_rate(i, mode='exp_range', gamma=.997)))

plt.xlabel('iterations (epochs)')
plt.ylabel('learning rate')
plt.plot(range(250), rates)

#plt.savefig('exp_range.png', dpi=600)
[out]:
False

[plot: learning rate over 250 iterations, 'exp_range' mode]

Training Example:

  • 'triangular2' mode cyclic learning rate:
...
global_step = tf.Variable(0, trainable=False)
optimizer = tf.train.AdamOptimizer(
    learning_rate=clr.cyclic_learning_rate(global_step=global_step,
                                           mode='triangular2'))
train_op = optimizer.minimize(loss_op, global_step=global_step)
...
with tf.Session() as sess:
    sess.run(init)
    for step in range(1, num_steps+1):
        # keep global_step in sync with the training loop
        assign_op = global_step.assign(step)
        sess.run(assign_op)
...
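
For context, here is a self-contained sketch of the same wiring on a toy problem. The regression data, loss, and step count are made-up stand-ins, not from the repository; note that minimize() already increments global_step once per training step, so this variant skips the explicit assign:

import tensorflow as tf
import clr

# Hypothetical toy regression problem standing in for loss_op above
x = tf.constant([[1.0], [2.0], [3.0], [4.0]])
y = tf.constant([[2.0], [4.0], [6.0], [8.0]])
w = tf.Variable([[0.0]])
loss_op = tf.reduce_mean(tf.square(tf.matmul(x, w) - y))

global_step = tf.Variable(0, trainable=False)
lr = clr.cyclic_learning_rate(global_step=global_step, mode='triangular2')
optimizer = tf.train.AdamOptimizer(learning_rate=lr)
train_op = optimizer.minimize(loss_op, global_step=global_step)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(1, 201):
        # the learning rate tensor can be evaluated alongside the train op
        _, current_lr = sess.run([train_op, lr])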

Running Functional Tests

from clr_test import CyclicLearningRateTest

CyclicLearningRateTest().test_triangular()
CyclicLearningRateTest().test_triangular2()
CyclicLearningRateTest().test_exp_range()

License

This project is licensed under the MIT License - see the LICENSE.md file for details.

Inspired by Brad Kenstler's Keras CLR implementation.

cyclic-learning-rate's Issues

Not working in tf 2.0?

Hi,
I was trying to use the cyclic LR in TF 2.0. Below is my test code:

import tensorflow as tf
print(tf.__version__)
from debug import clr


mnist = tf.keras.datasets.mnist
(x_train, y_train), (_, _) = mnist.load_data()
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_train = x_train / 255.0
y_train = y_train.astype('int32')

buffer_size = 5000
batch_size = 100
ds = tf.data.Dataset.from_tensor_slices((x_train, y_train)).shuffle(buffer_size)
ds = ds.batch(batch_size)

class My_Model(tf.keras.Model):
  def __init__(self):
    super(My_Model, self).__init__()
    self.flatten = tf.keras.layers.Flatten()
    self.fc1 = tf.keras.layers.Dense(10)

  def call(self, x, training=True):
    x = self.flatten(x)
    x = self.fc1(x)
    return x

def loss(logits, labels):
  return tf.reduce_mean(
      tf.nn.sparse_softmax_cross_entropy_with_logits(
          logits=logits, labels=labels))

step_counter = tf.Variable(0, trainable=False, dtype=tf.int64)

learning_rate = clr.cyclic_learning_rate(step_counter,
                                         learning_rate=0.01,
                                         max_lr=0.1,
                                         step_size=100,
                                         gamma=0.99994,
                                         mode='triangular',
                                         name=None)

opt = tf.optimizers.Adam(learning_rate=learning_rate())

model = My_Model()

epochs = 2
for epoch in range(epochs):
  for batch, (images, labels) in enumerate(ds):
    with tf.GradientTape() as tape:
      logits = model(images, training=True)
      loss_value = loss(logits, labels)
      # print('loss', loss_value.numpy())
    grads = tape.gradient(loss_value, model.variables)
    opt.apply_gradients(zip(grads, model.trainable_variables))
    print('LR {}'.format(opt.learning_rate.numpy()))
    step_counter.assign_add(1)
    print('Step Counter  {}'.format(step_counter.numpy()))

For some reason, the learning rate is not being updated. Is there something wrong with the way I am using it?

Thank you,
Best,
SK
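
One plausible cause, judging from the eager-mode example above rather than a confirmed fix: learning_rate() evaluates the schedule once, so the optimizer is constructed with a constant value and never sees later values of step_counter. TF 2.x Keras optimizers also accept a zero-argument callable for learning_rate, so passing the schedule itself lets it be re-evaluated on every step:

# Possible fix (sketch): pass the callable, not its one-time evaluated value
opt = tf.optimizers.Adam(learning_rate=learning_rate)  # note: no ()

# the current rate can then be checked by evaluating the schedule directly
print('LR {}'.format(learning_rate().numpy()))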

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.