
caps-learn's Introduction

CaPS-Learn: Convergence-aware Parameter Skipped Learning System in Distributed Environment

2021 CSE61401 AI framework Project

Hyunjoon Jeong 20215350, UNIST

Requirements

torch == 1.8.0 (recommended)
torchvision == 0.8.0 (recommended)
tqdm == 4.60.0 (recommended)

How to Use

To train your model, just wrap your optimizer in CapsOptimizer.
See the usage example below and capslearn/run/resnet50_run.py.
You can test CapsOptimizer with the script files in capslearn/scripts.

import torch
import capslearn.torch.optimizer as opt
from capslearn.torch.distributed import DistributedDataParallel

...

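# Build the base optimizer as usual, then wrap it with CapsOptimizer.
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
# CapsParameter is a dict of keyword arguments passed to CapsOptimizer.
optimizer = opt.CapsOptimizer(optimizer, **CapsParameter)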
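
The wrapping step above follows a common pattern: the wrapper exposes the same step() and zero_grad() calls as the optimizer it holds, so the training loop does not change. The sketch below illustrates only that general pattern with stand-in classes. RecordingOptimizer and WrappedOptimizer are hypothetical names, and nothing here reflects how CapsOptimizer is actually implemented.

```python
class RecordingOptimizer:
    """Hypothetical stand-in for a real optimizer such as torch.optim.Adam."""

    def __init__(self):
        self.steps = 0
        self.zeroed = 0

    def step(self):
        self.steps += 1

    def zero_grad(self):
        self.zeroed += 1


class WrappedOptimizer:
    """Hypothetical wrapper with the same interface as the inner optimizer."""

    def __init__(self, inner, **options):
        self.inner = inner
        self.options = options  # extra keyword arguments, kept for the wrapper's own use

    def step(self):
        # A real wrapper would run its own logic here before delegating.
        self.inner.step()

    def zero_grad(self):
        self.inner.zero_grad()


base = RecordingOptimizer()
optimizer = WrappedOptimizer(base, example_option=1)  # example_option is made up

# The loop is written exactly as it would be for the unwrapped optimizer.
for _ in range(3):
    optimizer.zero_grad()
    optimizer.step()

print(base.steps, base.zeroed)
```

Because the wrapper keeps the optimizer interface, switching between the wrapped and unwrapped optimizer is a one-line change in the training script.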

Run example

See the examples in capslearn/scripts:
https://github.com/with1015/CaPS-Learn/tree/main/capslearn/scripts

python3 resnet50_run.py \
  --epochs 100 \
  --batch-size 32 \
  --workers 16 \
  --lr 0.001 \
  --unchange-rate 90.0 \
  --lower-bound 0.0 \
  --scheduling-freq 10 \
  --history-length 5 \
  --round-factor -1 \
  --random-select 0.001 \
  --world-size 4 \
  --rank 0 \
  --master-addr $MASTER_ADDR \
  --master-port $MASTER_PORT \
  $DATA_DIR
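
The command above starts a single worker (rank 0 of 4). As a rough sketch, assuming one process is started per rank with ranks numbered 0 to world-size minus 1, the per-rank command lines could be generated as follows. build_command is a hypothetical helper, and the address, port, and dataset path are placeholder values, not project defaults.

```python
import shlex


def build_command(rank, world_size, master_addr, master_port, data_dir):
    """Return the argv list for one worker (hypothetical helper)."""
    return [
        "python3", "resnet50_run.py",
        "--world-size", str(world_size),
        "--rank", str(rank),
        "--master-addr", master_addr,
        "--master-port", str(master_port),
        data_dir,
    ]


WORLD_SIZE = 4
# Placeholder connection settings and dataset path (assumptions).
commands = [
    build_command(rank, WORLD_SIZE, "127.0.0.1", 29500, "/path/to/dataset")
    for rank in range(WORLD_SIZE)
]

for command in commands:
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(command))
```

Only the distributed flags are shown; the training and CapsOptimizer flags from the example would be appended unchanged for every rank.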

Input Parameter List

# Training parameter
data $DIR : dataset directory (positional argument)
--workers $INT : number of data-loading workers
--epochs $INT : number of training epochs
--batch-size $INT : training batch size
--lr $FLOAT : learning rate
--momentum $FLOAT : momentum used by the optimizer
--weight-decay $FLOAT : weight decay used by the optimizer

# Distributed training parameter
--gpu $INT : GPU ID to train on in the current node
--rank $INT : rank that identifies each worker
--world-size $INT : world size for DistributedDataParallel (must equal the total number of workers)
--master-addr $STR : master node address used to reach the parameter server
--master-port $STR : port used for communication

# CapsOptimizer options
--unchange-rate $FLOAT : initial ratio of unchanged parameters within one layer
--scheduling-freq $INT : how often the unchange rate is adjusted
--lower-bound $FLOAT : lowest value the unchange rate scheduling may reach
--max-bound $FLOAT : highest value the unchange rate scheduling may reach
--history-length $INT : size of the history queue used for unchange rate scheduling
--random-select $INT : use random index selection in CapsOptimizer instead of the naive solution
--hbs-init $INT : use history-based selection with an initial step instead of the naive solution
--round-factor $INT : rounding digit used when comparing layers (not recommended)
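
For readers writing their own run script, the list above maps naturally onto argparse. The sketch below is an assumption about how such a parser could look, not the parser in resnet50_run.py. It covers the flags from the run example except --random-select, whose documented type ($INT) differs from the value passed there (0.001). No defaults are set because the project's defaults are not documented here.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring part of the documented flag list."""
    parser = argparse.ArgumentParser(description="CaPS-Learn style run script (sketch)")
    # Training parameters
    parser.add_argument("data", help="dataset directory")
    parser.add_argument("--workers", type=int)
    parser.add_argument("--epochs", type=int)
    parser.add_argument("--batch-size", type=int)
    parser.add_argument("--lr", type=float)
    # Distributed training parameters
    parser.add_argument("--rank", type=int)
    parser.add_argument("--world-size", type=int)
    parser.add_argument("--master-addr", type=str)
    parser.add_argument("--master-port", type=str)
    # CapsOptimizer options
    parser.add_argument("--unchange-rate", type=float)
    parser.add_argument("--lower-bound", type=float)
    parser.add_argument("--max-bound", type=float)
    parser.add_argument("--scheduling-freq", type=int)
    parser.add_argument("--history-length", type=int)
    parser.add_argument("--round-factor", type=int)
    return parser


# Placeholder values for the address, port, and dataset path (assumptions).
args = build_parser().parse_args([
    "--epochs", "100", "--batch-size", "32", "--workers", "16", "--lr", "0.001",
    "--unchange-rate", "90.0", "--lower-bound", "0.0",
    "--scheduling-freq", "10", "--history-length", "5", "--round-factor", "-1",
    "--world-size", "4", "--rank", "0",
    "--master-addr", "127.0.0.1", "--master-port", "29500",
    "/path/to/dataset",
])
print(args.epochs, args.batch_size, args.unchange_rate, args.data)
```

argparse converts dashes in flag names to underscores, so --batch-size is read as args.batch_size and --unchange-rate as args.unchange_rate.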
