Comments (9)

Yangqing commented on April 27, 2024

Something like a regularizer that could be attached to a layer, similar to
the one I wrote in decaf (see e.g.
https://github.com/UCB-ICSI-Vision-Group/decaf-release/blob/master/decaf/base.py#L217).
Caffe hasn't got a regularizer in place yet, mainly because I was simply
using weight decay for the imagenet training.

Yangqing

On Mon, Jan 27, 2014 at 8:46 AM, aravindhm [email protected] wrote:

Is there an easy way to implement L1 regularization on the weight matrix
of a fully connected network. Similarly I want to penalize the L1 norm of
features in each layer. What is the best way to do that using caffe?

Reply to this email directly or view it on GitHub: https://github.com//issues/60.
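For reference, the decaf-style regularizer Yangqing describes could be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not actual decaf or Caffe code; the function name, interface, and strength value are made up:

```python
import numpy as np

# Hypothetical sketch in the spirit of decaf's Regularizer: the penalty
# is added to the loss, and its subgradient is added to the weight
# gradient before the solver update. Not Caffe API.
def l1_regularizer(weights, strength):
    """Return the L1 penalty and its subgradient for a weight blob."""
    penalty = strength * np.abs(weights).sum()
    grad = strength * np.sign(weights)  # subgradient of strength * |w|
    return penalty, grad

# usage: fold the regularizer into an existing weight gradient
w = np.array([0.5, -2.0, 0.0])
weight_diff = np.zeros_like(w)  # gradient from the data term
penalty, reg_diff = l1_regularizer(w, strength=0.01)
weight_diff += reg_diff
```

At zero the subgradient is taken as 0 (what `np.sign` returns), which is the usual convention for L1 penalties.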

from caffe.

kloudkl commented on April 27, 2024

@aravindhm, I found that you have already implemented an L1 norm layer in your own branch. Would you please take a look at my implementation (#113), which follows the advice of @Yangqing, and tell me whether we are solving the same problem? As far as I can see, your contribution is relatively independent of mine and is well worth merging back into the master branch here.

There are a large number of public and private forks of Caffe out there. I had a look at some of the branches that were updated recently; authors are working very actively on various problems. A diverse community will certainly accelerate the evolution of this project. It is a very healthy phenomenon.

But at the same time, I hope there is as little duplicated effort as possible. I would like the project owners, contributors, and everyone who cares about this project to discuss the issue and work out a solution.

aravindhm commented on April 27, 2024

My branch has made too many modifications. All the changes were made to the boost-eigen branch (I couldn't buy MKL) and a few were made to master. Some of them are still broken. They include:

  1. A method to share weights across blobs by averaging the gradients before the update. The momentum is not averaged.
  2. Making the Euclidean layer a proper loss layer (top->size() <= 1), so that the network prints an error at test time instead of having no output. I had added a GPU implementation, but it didn't give any performance gain because there are very few floating-point operations per byte loaded, so I removed it.
  3. A tanh layer. Sparse convolutional autoencoders used this instead of ReLU.
  4. Changing the ReLU layer to use Thrust, because otherwise it launches many threads that each do very little work. I found this a problem because older Tesla GPUs (like the ones on Amazon EC2 cg1.4xlarge) cannot launch as many threads as newer ones, and I got a configuration error when the kernel was called.
  5. An L1Norm layer - I'm still working on this; the gradient check fails. But the regularizer layer implemented by @kloudkl is much better, since the L1Norm layer is not useful in any popular architecture except as a regularizer.
  6. A bunch of example files to dump network parameters, parameter differences, etc. to stdout.
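As a rough illustration of what a tanh layer and its gradient check involve, a standalone NumPy sketch might look like this. None of these names come from the actual branch; it only shows the forward/backward math and the kind of finite-difference check that a broken layer would fail:

```python
import numpy as np

# Illustrative tanh layer passes; not code from the branch in question.
def tanh_forward(bottom):
    return np.tanh(bottom)

def tanh_backward(top_diff, top_data):
    # d/dx tanh(x) = 1 - tanh(x)^2, computed from the cached output
    return top_diff * (1.0 - top_data ** 2)

def max_gradient_error(x, eps=1e-5):
    """Compare the analytic gradient against central differences."""
    analytic = tanh_backward(np.ones_like(x), tanh_forward(x))
    numeric = (tanh_forward(x + eps) - tanh_forward(x - eps)) / (2 * eps)
    return float(np.max(np.abs(analytic - numeric)))

x = np.array([-1.0, 0.0, 0.5, 2.0])
assert max_gradient_error(x) < 1e-8  # a correct backward pass stays tiny
```

A layer whose backward pass disagrees with the forward pass (as in the L1Norm case above) shows up as a large error here.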

Since the commits for these are interleaved, merging is very tough. Can merging be done on a file-by-file basis?

aravindhm commented on April 27, 2024

If a dev branch is made, I can copy at least the tanh layer into it and have that merged without disturbing other branches?

shelhamer commented on April 27, 2024

Merging can be done commit-by-commit through cherry-picking, and with interactive rebasing (see the GitHub help topic and the Git book chapter) anything is possible.

To start, branch from whatever branch has all your intermingled work, and then you can sift out the desired changes from there. For instance, you could create a tanh branch, weight-sharing branch, etc.

Rebasing is how I have been integrating boost-eigen changes while still tracking master and selecting from merges like #97. Cherry-picking is sometimes helpful, but relying on it all the time usually suggests deeper workflow issues to sort out.
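Concretely, the branch-then-sift route could look like the following. This is a throwaway demonstration repo with made-up file, branch, and commit names, not the actual Caffe history:

```shell
# Demo: sift one commit out of an intermingled branch onto a clean
# topic branch. All names here are hypothetical.
git init -b master demo && cd demo
git config user.email demo@example.com && git config user.name demo
git commit --allow-empty -m "initial"

git checkout -b messy                  # the intermingled work
echo "tanh forward/backward" > tanh_layer.cpp
git add tanh_layer.cpp && git commit -m "add tanh layer"
echo "unrelated tweak" > other.cpp
git add other.cpp && git commit -m "unrelated change"

# replay just the tanh commit onto a topic branch off master
tanh_sha=$(git log --format=%H --grep="tanh layer" messy)
git checkout master && git checkout -b tanh
git cherry-pick "$tanh_sha"            # tanh branch now has only that change
```

Running `git rebase -i master` on the messy branch instead would let you drop or reorder the unrelated commits interactively.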

Hope these tips help.

kloudkl commented on April 27, 2024

It seems that @aravindhm has solved the problem in #116. We don't have to write a step-by-step guide ourselves; a how-to-contribute doc with links to the most helpful external guides or tutorials is enough.

aravindhm commented on April 27, 2024

I didn't cherry-pick this time. I created a fresh local copy of master, made a branch off master (tanh), copied the files in manually (very small effort in this case), and sent a pull request.

kloudkl commented on April 27, 2024

@aravindhm, your branch has a lot more good features, and I hope they will be picked out and merged back too, if you would like to do that. If they are mixed together in the commits, copying each of them separately is perhaps the only way to go. Any method that works is fine; we don't have to be bound by the tools.

shelhamer commented on April 27, 2024

Sparsity penalties are addressed by #113.
