
cyclic-learning-rate's Introduction

Cyclic Learning Rate (CLR)

TensorFlow implementation of cyclic learning rate from the paper: Smith, Leslie N. "Cyclical learning rates for training neural networks." 2017.

TOC

  1. What is CLR?
  2. Usage
  3. Functional Tests
  4. License

What is CLR?

CLR improves the way the learning rate is scheduled during training, providing better convergence and helping to regularize deep learning models. It eliminates the need to experimentally search for the best global learning rate by letting the rate vary cyclically between a lower and an upper boundary. The idea is to divide the training process into cycles determined by a stepsize parameter, which defines the number of iterations in half a cycle. The author suggests setting the stepsize to:

stepsize = (2 to 10) times the number of iterations in an epoch

For example, with 50,000 training examples and a batch size of 100 (500 iterations per epoch), any stepsize between 1,000 and 5,000 iterations falls in this range.

The learning rate is computed as:

cycle = floor( 1 + global_step / ( 2 * step_size ) )
x = abs( global_step / step_size - 2 * cycle + 1 )
clr = learning_rate + ( max_lr - learning_rate ) * max( 0, 1 - x )
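
As a sanity check, here is a minimal Python sketch of the 'triangular' policy. The names mirror the formula above; the parameter values (learning_rate=0.01, max_lr=0.1, step_size=20) are example choices, not the library's defaults:

import math

def triangular_clr(global_step, learning_rate=0.01, max_lr=0.1, step_size=20):
    # cycle: index of the current triangle (1-based)
    cycle = math.floor(1 + global_step / (2 * step_size))
    # x: position within the current cycle, mapped so that 1 - x
    # rises from 0 to 1 and falls back to 0 over one full cycle
    x = abs(global_step / step_size - 2 * cycle + 1)
    # linear interpolation between the lower and upper boundaries
    return learning_rate + (max_lr - learning_rate) * max(0.0, 1 - x)

# The rate climbs for step_size iterations, then descends back:
print([round(triangular_clr(s), 3) for s in (0, 10, 20, 30, 40)])
# -> [0.01, 0.055, 0.1, 0.055, 0.01]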

The author proposes three variations of this policy:

  • 'triangular': Default, linearly increasing then linearly decreasing the learning rate at each cycle.
  • 'triangular2': The same as the triangular policy, except that the learning rate difference is cut in half at the end of each cycle, so the range between the boundaries shrinks as training progresses.
  • 'exp_range': The learning rate varies between the minimum and maximum boundaries, and each boundary value declines by an exponential factor of:
f = gamma ^ global_step

where global_step is the current iteration count and gamma is a constant passed as an argument to cyclic_learning_rate.
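
Both variants reuse the same triangular wave and only rescale its amplitude. A compact sketch of all three modes, under the same example parameters as the sketch above (the scaling rules follow the descriptions in the list):

import math

def clr_value(global_step, mode='triangular', learning_rate=0.01,
              max_lr=0.1, step_size=20, gamma=0.997):
    cycle = math.floor(1 + global_step / (2 * step_size))
    x = abs(global_step / step_size - 2 * cycle + 1)
    amplitude = (max_lr - learning_rate) * max(0.0, 1 - x)
    if mode == 'triangular2':
        amplitude /= 2 ** (cycle - 1)      # halve the range after each cycle
    elif mode == 'exp_range':
        amplitude *= gamma ** global_step  # decay the range exponentially
    return learning_rate + amplitude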

Usage

Upgrade to the latest version of TensorFlow:

!pip install --upgrade tensorflow
import tensorflow as tf
tf.__version__

[out]: '1.9.0'

Eager mode

Eager mode evaluates operations immediately, without building graphs.

Enable eager execution:

import tensorflow as tf
tf.enable_eager_execution()

Generate cyclic learning rates:

import clr
import matplotlib.pyplot as plt
%matplotlib inline

print(tf.executing_eagerly()) # => True

rates = []

for i in range(0, 250):
    # in eager mode, cyclic_learning_rate returns a zero-argument
    # callable that yields the rate for the given step
    x = clr.cyclic_learning_rate(i, mode='exp_range', gamma=.997)
    rates.append(x())

plt.xlabel('iterations (epochs)')
plt.ylabel('learning rate')
plt.plot(range(250), rates)

#plt.savefig('exp_range.png', dpi=600)
[out]:
True

[plot: learning rate over 250 iterations, 'exp_range' mode]

Graph mode

import tensorflow as tf
import clr
import matplotlib.pyplot as plt
%matplotlib inline

print(tf.executing_eagerly()) # => False

rates = []

with tf.Session() as sess:
    for i in range(0, 250):
        # in graph mode, cyclic_learning_rate returns a tensor to be evaluated
        rates.append(sess.run(clr.cyclic_learning_rate(i, mode='exp_range', gamma=.997)))

plt.xlabel('iterations (epochs)')
plt.ylabel('learning rate')
plt.plot(range(250), rates)

#plt.savefig('exp_range.png', dpi=600)
[out]:
False

[plot: learning rate over 250 iterations, 'exp_range' mode]

Training Example:

  • 'triangular2' mode cyclic learning rate:
...
global_step = tf.Variable(0, trainable=False)
optimizer = tf.train.AdamOptimizer(
    learning_rate=clr.cyclic_learning_rate(global_step=global_step,
                                           mode='triangular2'))
train_op = optimizer.minimize(loss_op, global_step=global_step)
...
with tf.Session() as sess:
    sess.run(init)
    for step in range(1, num_steps+1):
        # keep global_step in sync with the training loop
        assign_op = global_step.assign(step)
        sess.run(assign_op)
...
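
For context, here is a self-contained sketch of the same wiring on a toy problem. The regression data, loss, and step count are made-up stand-ins, not from the repository; note that minimize() already increments global_step once per training step, so this variant skips the explicit assign:

import tensorflow as tf
import clr

# Hypothetical toy regression problem standing in for loss_op above
x = tf.constant([[1.0], [2.0], [3.0], [4.0]])
y = tf.constant([[2.0], [4.0], [6.0], [8.0]])
w = tf.Variable([[0.0]])
loss_op = tf.reduce_mean(tf.square(tf.matmul(x, w) - y))

global_step = tf.Variable(0, trainable=False)
lr = clr.cyclic_learning_rate(global_step=global_step, mode='triangular2')
optimizer = tf.train.AdamOptimizer(learning_rate=lr)
train_op = optimizer.minimize(loss_op, global_step=global_step)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(1, 201):
        # the learning rate tensor can be evaluated alongside the train op
        _, current_lr = sess.run([train_op, lr])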

Running Functional Tests

from clr_test import CyclicLearningRateTest

CyclicLearningRateTest().test_triangular()
CyclicLearningRateTest().test_triangular2()
CyclicLearningRateTest().test_exp_range()

License

This project is licensed under the MIT License - see the LICENSE.md file for details.

Inspired by Brad Kenstler's Keras CLR implementation.

cyclic-learning-rate's Issues

Not working in tf 2.0?

Hi,
I was trying to use the cyclic LR in TF 2.0. Below is my test code:

import tensorflow as tf
print(tf.__version__)
from debug import clr


mnist = tf.keras.datasets.mnist
(x_train, y_train), (_, _) = mnist.load_data()
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_train = x_train / 255.0
y_train = y_train.astype('int32')

buffer_size = 5000
batch_size = 100
ds = tf.data.Dataset.from_tensor_slices((x_train, y_train)).shuffle(buffer_size)
ds = ds.batch(batch_size)

class My_Model(tf.keras.Model):
  def __init__(self):
    super(My_Model, self).__init__()
    self.flatten = tf.keras.layers.Flatten()
    self.fc1 = tf.keras.layers.Dense(10)

  def call(self, x, training=True):
    x = self.flatten(x)
    x = self.fc1(x)
    return x

def loss(logits, labels):
  return tf.reduce_mean(
      tf.nn.sparse_softmax_cross_entropy_with_logits(
          logits=logits, labels=labels))

step_counter = tf.Variable(0, trainable=False, dtype=tf.int64)

learning_rate = clr.cyclic_learning_rate(step_counter,
                                         learning_rate=0.01,
                                         max_lr=0.1,
                                         step_size=100,
                                         gamma=0.99994,
                                         mode='triangular',
                                         name=None)

opt = tf.optimizers.Adam(learning_rate=learning_rate())

model = My_Model()

epochs = 2
for epoch in range(epochs):
  for batch, (images, labels) in enumerate(ds):
    with tf.GradientTape() as tape:
      logits = model(images, training=True)
      loss_value = loss(logits, labels)
      # print('loss', loss_value.numpy())
    grads = tape.gradient(loss_value, model.variables)
    opt.apply_gradients(zip(grads, model.trainable_variables))
    print('LR {}'.format(opt.learning_rate.numpy()))
    step_counter.assign_add(1)
    print('Step Counter  {}'.format(step_counter.numpy()))

For some reason, the learning rate is not being updated. Is there something wrong with the way I am using it?

Thank you,
Best,
SK
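
One plausible cause, judging from the eager-mode example above rather than a confirmed fix: learning_rate() evaluates the schedule once, so the optimizer is constructed with a constant value and never sees later values of step_counter. TF 2.x Keras optimizers also accept a zero-argument callable for learning_rate, so passing the schedule itself lets it be re-evaluated on every step:

# Possible fix (sketch): pass the callable, not its one-time evaluated value
opt = tf.optimizers.Adam(learning_rate=learning_rate)  # note: no ()

# the current rate can then be checked by evaluating the schedule directly
print('LR {}'.format(learning_rate().numpy()))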

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.