
warpgrad's People

Contributors

flennerhag


warpgrad's Issues

Parameter setting for n-way k-shot learning

Hi Flennerhag,

I want to reproduce the accuracy of Warp-Leap in Table 1 for the Omniglot dataset.
However, I am confused about the parameter settings needed to run n-way k-shot learning with this repository.
In main.py of the Omniglot implementation, it seems that you set the parameters for 20-way classification (via the --classes argument), but I do not see an argument for setting the number of shots.

Could you please show how to set the number of shots for this implementation, e.g., for 20-way 100-shot or another scenario such as 5-way 5-shot?
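
For reference, this is the kind of flag I was expecting to find in main.py (hypothetical; --classes exists, but --shots is only my guess at how k would be exposed):

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--classes', type=int, default=20,
                    help='n in n-way classification (exists in main.py)')
parser.add_argument('--shots', type=int, default=5,
                    help='k in k-shot learning (hypothetical flag)')
args = parser.parse_args()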

Thanks,
Dwi

About reproducing Table 4 in Appendix G

Hi Flennerhag,

Congratulations on the nice work! I really enjoyed reading your paper.

Just a simple question about reproducing Table 4 in Appendix G: could you let me know which reference code you used for KFAC? Or do you plan to release your code for KFAC or any of the other baseline experiments based on random initializations?
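
For context, here is my (possibly naive) understanding of the KFAC preconditioner I tried to reproduce, sketched for a single linear layer; this is illustrative only, not the reference implementation I am asking about:

import torch

def kfac_precondition(dW, a, g, damping=1e-3):
    # dW: (out, in) weight gradient; a: (batch, in) layer inputs;
    # g: (batch, out) gradients w.r.t. the pre-activations.
    A = a.t() @ a / a.size(0)  # input second-moment factor, (in, in)
    G = g.t() @ g / g.size(0)  # gradient second-moment factor, (out, out)
    A_inv = torch.inverse(A + damping * torch.eye(A.size(0)))
    G_inv = torch.inverse(G + damping * torch.eye(G.size(0)))
    # Kronecker-factored natural-gradient step: G^{-1} dW A^{-1}
    return G_inv @ dW @ A_inv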

Also, could you provide a brief explanation of how to use run_multi.py?

Thank you!

Best regards,
Hae Beom Lee.

Out of GPU Memory

Hi! I ran into a problem. When I use my own dataset with a ResNet-50 model and run the command python main.py --meta_model warp_leap --suffix myrun18, GPU memory keeps growing until the limit is exceeded and the program stops. I found that when I comment out

self._state_buffer[slot].append(clone_state(state, device='cpu'))

in warpgrad/warpgrad.py, the problem does not appear. But isn't this line supposed to increase CPU memory rather than GPU memory? Why does GPU memory keep growing? It is very strange.
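
My current guess (I have not verified the internals of clone_state) is that the cloned tensors stay attached to the autograd graph, so every stored state keeps the GPU-side graph alive. Detaching before the copy stops the growth for me, assuming the state is a dict of tensors:

# Hypothetical replacement for clone_state: detach each tensor from the
# autograd graph before cloning it to the CPU, so no GPU graph is retained.
def clone_state(state, device='cpu'):
    return {k: v.detach().clone().to(device) for k, v in state.items()}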

How to create a minimal working example?

I am struggling to implement a minimal example using this repository as a library. Based on the code sample in the readme file, could you clarify which helper classes I should use in the following steps?

  1. Assume I have a three-layer neural network defined as follows:
import torch.nn as nn

class SimpleModel(nn.Module):
    # The second linear layer will be used as the warp layer.
    def __init__(self, hidden_dim):
        super(SimpleModel, self).__init__()
        self.linear1 = nn.Linear(2, hidden_dim)
        self.relu1 = nn.ReLU()
        self.linear2 = nn.Linear(hidden_dim, hidden_dim)
        self.relu2 = nn.ReLU()
        self.linear3 = nn.Linear(hidden_dim, 2)

    def forward(self, x):
        out = self.linear1(x)
        out = self.relu1(out)
        out = self.linear2(out)
        out = self.relu2(out)
        out = self.linear3(out)
        return out

# It seems that any model passed to the `Warp` class constructor must provide an `init_adaptation` method.
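
For concreteness, here is my guess at such a method, added to SimpleModel (purely an assumption on my part; the readme does not spell out what it should do):

    def init_adaptation(self):
        # Assumption: re-initialize the task-specific (adaptation) layers
        # before each new task, leaving the warp layer linear2 untouched.
        for layer in (self.linear1, self.linear3):
            layer.reset_parameters()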
  2. Creating the dataloaders:
  • How should I define the task generator classes to use with model.register_task(mytask)? (My current guess at a minimal task is sketched below.)
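
Here is my minimal guess at a "task": a plain iterable of (x, y) batches, which is what the training loop below consumes (an assumption on my part; register_task may need a richer interface):

import torch

# A toy binary-classification task: a list of (inputs, labels) batches.
# Inputs are 2-dimensional to match SimpleModel above. (Hypothetical.)
def make_task(n_batches=10, batch_size=16):
    return [(torch.randn(batch_size, 2),
             torch.randint(0, 2, (batch_size,)))
            for _ in range(n_batches)]

mytask = make_task()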
  3. After handling the first two steps, according to the instructions, I should be able to initialize the Warpgrad classes as follows:
updater = warpgrad.DualUpdater(criterion=nn.CrossEntropyLoss())
buffer = warpgrad.ReplayBuffer()

# I am already going to use an inner optimizer and a meta optimizer for the task adaptation problem.
# What is the connection between these optimizer parameters and the algorithm itself?

optimizer_parameters = warpgrad.OptimizerParameters(
                trainable=False,
                default_lr=1e-3,
                default_momentum=0.9)
simple_model = SimpleModel(hidden_dim=100)

model = warpgrad.Warp(simple_model,
                      [simple_model.linear1, simple_model.relu1, simple_model.relu2, simple_model.linear3],  # List of nn.Modules for adaptation
                      [simple_model.linear2],  # List of nn.Modules for warping
                      updater,
                      buffer,
                      optimizer_parameters)

# Should we use the optimizers defined in the Warpgrad library as `meta_opt_class`?
meta_opt = meta_opt_class(model.warp_parameters(), **meta_opt_kwargs)
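
My working assumption (which I would like confirmed) is that model.warp_parameters() returns ordinary torch parameters, so a standard optimizer should work as meta_opt_class, e.g.:

import torch.optim as optim

# Assumption: warp_parameters() yields plain nn.Parameter objects,
# so any torch.optim optimizer can consume them.
meta_opt = optim.Adam(model.warp_parameters(), lr=1e-3)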
  4. Implementing the training loop:
def meta_step_fn():
    meta_opt.step()
    meta_opt.zero_grad()

for meta_step in range(meta_steps):
    meta_batch = mytaskgenerator.sample()
    for task in meta_batch:
        model.init_adaptation()     # Initialize adaptation on the model side
        # How should a task be defined? It seems that it needs to be a specific class.
        model.register_task(task)   # Register task in replay buffer
        model.collect()             # Turn parameter collection on

        # Should this optimizer also come from the warpgrad library?
        opt = opt_cls(model.adapt_parameters(), **opt_kwargs)

        for x, y in task:
            loss = criterion(model(x), y)
            loss.backward()
            opt.step()
            opt.zero_grad()

        # Evaluation
        model.eval()
        model.no_collect()
        your_evaluation(task, model)

    # Meta-objective
    model.backward(meta_step_fn)
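
For completeness, here is the inner-loop optimizer I currently plug in as opt_cls (again assuming a plain torch optimizer is acceptable for model.adapt_parameters()):

import torch.optim as optim

# Assumption: adapt_parameters() also yields plain nn.Parameter objects.
opt_cls, opt_kwargs = optim.SGD, dict(lr=1e-2, momentum=0.9)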
