
Comments (5)

ZeguanXiao commented on July 28, 2024

@speediedan Happy to see your quick response. Approach (2) is exactly what I want. It gives users more flexibility to control the learning scheduler's arguments. And this approach may be easily compatible with Schedulers from other frameworks such as transformers.


speediedan commented on July 28, 2024

Thanks for your valuable feedback @ZeguanXiao! Delighted you're finding Finetuning Scheduler at least somewhat useful so far and happy to help further extend it to better support your (and likely other researchers') use cases.

When you indicate you'd like to re-initialize the lr_scheduler after one or more phase transitions, I can think of at least two different interpretations/approaches:

  1. Re-initialize the scheduler to its initial configuration at the beginning of each new phase for all param groups
    • There isn't a reset() method available on the PyTorch _LRScheduler base class, but having looked at it, this should be implementable in a way that is compatible with most of the standard PyTorch schedulers (though a few would need special handling for scheduler-specific counter attributes); a rough sketch of such a helper follows this list
    • This would be implemented more cleanly, though, if the desired lr_scheduler arguments were set as a required dict arg to FinetuningScheduler (e.g. reinit_lr)
    • The upside of this approach would be that it cleanly integrates into both implicit and explicit finetuning schedule modes
  2. Instantiate a completely new lr_scheduler given a desired scheduler dict definition:
    • Rather than just specifying a per-phase 'lr', an lr_scheduler configuration dict could be passed for any specified finetuning phase; the optimizer in that phase would then be wrapped in a new lr_scheduler

    • i.e. something like

      # in finetuning schedule definition
      0:
        new_scheduler:
          schedule_init:
            class_path: torch.optim.lr_scheduler.CosineAnnealingWarmRestarts
            init_args:
              T_0: 1
              T_mult: 2
              eta_min: 1.0e-07
            pl_lrs_cfg:
              interval: epoch
              frequency: 1
              name: CosineAnnealingWithWarmRestartsLR
        params:
        ...
      1: 
        new_scheduler:
          class_path: torch.optim.lr_scheduler.StepLR
          init_args:
            step_size: 1
            gamma: 0.7
        ...
      
      # in finetuning_scheduler extension
      # instantiate new scheduler around optimizer, then wrap in PL config object etc...
      ...
      config.scheduler = LRSchedulerConfig(**scheduler)
    • Note that with this approach, the new_scheduler would be applied to all param groups beginning at the specified stage since Finetuning Scheduler only supports single optimizer, single scheduler configurations at the present time
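
For reference, here's a rough sketch of the kind of reset helper contemplated in approach (1). It's simplified and only rewinds the counters common to most standard schedulers; scheduler-specific state (e.g. T_cur on CosineAnnealingWarmRestarts) would need extra handling:

from torch.optim.lr_scheduler import _LRScheduler

def reset_scheduler(scheduler: _LRScheduler) -> None:
    # restore each param group's lr to the value captured at scheduler construction
    for group, base_lr in zip(scheduler.optimizer.param_groups, scheduler.base_lrs):
        group["lr"] = base_lr
    # rewind the counters _LRScheduler sets during its initial step
    scheduler.last_epoch = 0
    scheduler._step_count = 1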

I'm guessing the functionality in approach (2) is closer to what you're envisioning, but please let me know. Right now my suspicion is that a composition of the best of both approaches would be possible, using (1) for implicit mode and (2) for explicit mode. As soon as I get some bandwidth, I'll try implementing/testing and let you know how it goes. Anyway, thanks again for your thoughts, and let me know what you think! Also, feel free to reach out to me on the pytorch-lightning slack workspace.


speediedan commented on July 28, 2024

Thanks again for the feedback and your enhancement suggestion @ZeguanXiao ... I just merged a PR that adds LR scheduler reinitialization functionality in both explicit and implicit scheduled finetuning modes. πŸ˜„ πŸŽ‰ πŸš€

In the "Advanced Usage" section of the latest documentation, I've added detailed instructions on using the new functionality along with a couple examples.

A new version of the Finetuning Scheduler extension is now available if you want to check it out (v0.1.4). Would love to hear how you're using the Finetuning Scheduler extension (whether here on github or via email or slack). Feel free to reach out anytime. Good luck with your research!


ZeguanXiao commented on July 28, 2024

@speediedan Thanks for the great contribution. I have used the new feature for several days, and my use case is described below. (The figure and the .yaml do not match exactly; they just show the lr trend.)
[Screenshot: learning rate trend across the finetuning phases, 2022-06-12]

defaults:
  - default.yaml

ft_schedule:
  "0":
    params:
      - xxx.xxx
  "1":
    params:
      - xxx.xxx
    lr: 2.0e-5
    new_lr_scheduler:
      lr_scheduler_init:
        class_path: src.utils.linear_warmup_scheduler.LinearWarmupLR
        init_args:
          num_warmup_steps: 500
          num_training_steps: 2500
      pl_lrs_cfg:
        interval: step
        frequency: 1
      init_pg_lrs: [2.0e-5, 2.0e-5]
  1. I use Hydra + PyTorch Lightning (PL). When passing a file name as the ft_schedule argument, we can't log the detailed fts configuration with the PL logger. I just keep the fts config as a Hydra config, so it is possible to log it and easy to override it on the command line.

  2. In the current implementation, the fts callback only supports integers as phase keys, but Hydra doesn't support them. As I want to override the fts config on the command line, I just set the phase keys as strings in the .yaml file and manually convert them to integers afterwards.

  3. Currently the callback only supports _LRScheduler subclasses. For people who want to use the huggingface transformers get_*_schedule_with_warmup functions, here is an example (derived from PyTorch and transformers).

import types
import warnings

import torch


class OneLambdaLR(torch.optim.lr_scheduler._LRScheduler):
    def __init__(self, optimizer, lr_lambda, last_epoch=-1, verbose=False):
        self.optimizer = optimizer
        self.lr_lambda = lr_lambda
        super(OneLambdaLR, self).__init__(optimizer, last_epoch, verbose)

    def state_dict(self):
        state_dict = {key: value for key, value in self.__dict__.items() if key not in ('optimizer', 'lr_lambda')}
        state_dict['lr_lambda'] = None

        if not isinstance(self.lr_lambda, types.FunctionType):
            state_dict['lr_lambda'] = self.lr_lambda.__dict__.copy()
        return state_dict

    def load_state_dict(self, state_dict):
        lr_lambda = state_dict.pop('lr_lambda')
        self.__dict__.update(state_dict)
        # Restore state_dict keys in order to prevent side effects
        state_dict['lr_lambda'] = lr_lambda

        if lr_lambda is not None:
            self.lr_lambda.__dict__.update(lr_lambda)

    def get_lr(self):
        if not self._get_lr_called_within_step:
            warnings.warn("To get the last learning rate computed by the scheduler, "
                          "please use `get_last_lr()`.")

        return [lr * self.lr_lambda(self.last_epoch) for lr in self.base_lrs]

class LinearWarmupLR(OneLambdaLR):
    def __init__(self, optimizer, num_warmup_steps, num_training_steps, last_epoch=-1):
        def lr_lambda(current_step: int):
            if current_step < num_warmup_steps:
                return float(current_step) / float(max(1, num_warmup_steps))
            return max(
                0.0, float(num_training_steps - current_step) / float(max(1, num_training_steps - num_warmup_steps))
            )

        super(LinearWarmupLR, self).__init__(optimizer, lr_lambda, last_epoch)
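
A quick sanity check of the LinearWarmupLR above, using the same values as my .yaml (illustrative only; the model and optimizer here are placeholders):

model = torch.nn.Linear(10, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2.0e-5)
scheduler = LinearWarmupLR(optimizer, num_warmup_steps=500, num_training_steps=2500)

for step in range(1000):
    optimizer.step()    # lr ramps linearly up to 2.0e-5 over the first 500 steps,
    scheduler.step()    # then decays linearly toward 0 by step 2500
print(scheduler.get_last_lr())  # [1.5e-05] at step 1000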


speediedan commented on July 28, 2024

Thanks so much for the valuable feedback @ZeguanXiao! Impressive work; I look forward to seeing how your research progresses! πŸš€ πŸŽ‰ Please find my thoughts and updates inline below:

  1. I use Hydra + PyTorch Lightning (PL). When passing a file name as the ft_schedule argument, we can't log the detailed fts configuration with the PL logger. I just keep the fts config as a Hydra config, so it is possible to log it and easy to override it on the command line.

Nice! This is the intended design (allowing ft_schedule to be a dict or file for user flexibility)
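
For example, something along these lines should work if you'd prefer to hand the callback a dict directly (a minimal sketch; the parameter names are placeholders, and you should adjust the import to match the installed fts version):

from pytorch_lightning import Trainer
from finetuning_scheduler import FinetuningScheduler

# placeholder phase definitions; substitute your model's parameter names
ft_schedule = {
    0: {"params": ["model.classifier.weight", "model.classifier.bias"]},
    1: {"params": ["model.pooler.dense.weight", "model.pooler.dense.bias"], "lr": 2.0e-5},
}
trainer = Trainer(callbacks=[FinetuningScheduler(ft_schedule=ft_schedule)])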

  2. In the current implementation, the fts callback only supports integers as phase keys, but Hydra doesn't support them. As I want to override the fts config on the command line, I just set the phase keys as strings in the .yaml file and manually convert them to integers afterwards.

To make things easier for users with similar needs, I've added some new functionality (not yet merged into master or released) that converts non-integer schedule keys to integers where possible and proceeds with training if the other schedule sanity checks pass (it also dumps the modified schedule for the user to review).
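
(Not the actual implementation, just the kind of normalization involved, with a hypothetical helper name:)

def _coerce_schedule_keys(schedule: dict) -> dict:
    # sketch only: convert string phase keys such as "0", "1" to ints,
    # surfacing a clear error if a key cannot be interpreted as an integer
    try:
        return {int(k): v for k, v in schedule.items()}
    except (TypeError, ValueError) as err:
        raise ValueError(f"Non-integer finetuning phase key could not be converted: {err}") from err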

  3. Currently the callback only supports _LRScheduler subclasses. For people who want to use the huggingface transformers get_*_schedule_with_warmup functions, here is an example (derived from PyTorch and transformers).

Great suggestion! To enrich support for using LambdaLR-style lr schedulers with fts, I've added an option that allows for implicit extension of existing lr_lambdas lists for schedulers with that attribute (sketched below). I've also added some tests that verify LambdaLR scheduler handling proceeds as expected.

LambdaLR schedulers (like the HF ones and the one you reference above) should be directly usable as can be observed in those tests.
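
To give a sense of the idea (simplified, not fts's actual implementation): when a later phase adds param groups, a LambdaLR's per-group state can be implicitly extended to cover them, e.g.

import torch

model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD([{"params": [model.weight], "lr": 0.1}])
scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lambda step: 0.95 ** step)

# a subsequent finetuning phase thaws more parameters, adding a param group
optimizer.add_param_group({"params": [model.bias], "lr": 0.1})

# extend the scheduler's per-group state to cover the new group,
# here by reusing the last existing lambda and base lr
num_new = len(optimizer.param_groups) - len(scheduler.lr_lambdas)
scheduler.lr_lambdas.extend(scheduler.lr_lambdas[-1:] * num_new)
scheduler.base_lrs.extend(scheduler.base_lrs[-1:] * num_new)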

In the future, my plan is to move away from nominal subtyping in favor of structural subtyping. Unfortunately, since PL still supports Python 3.7 and full structural subtyping support via Protocols isn't available in typing until Python 3.8, I think it's best to hold off switching to the structural subtyping approach until 3.8 is the minimum supported Python.
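
(i.e., once 3.8 is the floor, something roughly like the following; an illustrative protocol, not a committed API:)

from typing import Any, Dict, Protocol, runtime_checkable

@runtime_checkable
class LRSchedulerLike(Protocol):
    # structural check: anything providing these methods would be accepted,
    # whether or not it subclasses torch.optim.lr_scheduler._LRScheduler
    def step(self, *args: Any, **kwargs: Any) -> None: ...
    def state_dict(self) -> Dict[str, Any]: ...
    def load_state_dict(self, state_dict: Dict[str, Any]) -> None: ...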

As soon as I get a chance to make the relevant documentation updates for these features, I'll add them to master and they should subsequently be included in the next Finetuning Scheduler release. Thanks again for your great feedback/work. Good luck with your continued research and feel free to reach out anytime!

