Comments (3)

Yangqing commented on April 27, 2024

I am in general against writing a matrix class, or using an existing matrix
class (in which case it would be very tricky to synchronize CPU and GPU
operations). What we essentially need is a Tensor class that supports
4-dimensional array operations, but that involves substantial changes to
more than half of the code.

I am also a little against Matlab-style implementations. For example, the
code:

activations = input.exp();
probs = activations.rdiv(activations.sum(dim));

effectively allocates two arrays, activations and probs, and then discards
them on the fly. Of course this could be written more carefully by
preallocating arrays, as in exp(input, &activation), but the chained style
would invite careless code more often. The current code actually requires
you to explicitly define such "buffer" blobs, which I believe is important
for writing efficient code.
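
To make the contrast concrete, here is a minimal sketch of the two styles
(softmax_chained, softmax_buffered, and sum_and_rdiv are hypothetical
illustrations, not existing Caffe functions):

// Matlab style: each call allocates a fresh temporary that is soon
// discarded, which churns memory inside a training loop.
Matrix softmax_chained(Matrix& input) {
  Matrix activations = input.exp();
  return activations.rdiv(activations.sum(0));
}

// Preallocated style: the caller owns explicit buffers that are reused
// across iterations, as the current layer code does with "buffer" blobs.
void softmax_buffered(const Blob<float>& input,
                      Blob<float>* activations,  // reused scratch buffer
                      Blob<float>* probs) {      // reused output buffer
  exp(input, activations);            // writes into *activations
  sum_and_rdiv(*activations, probs);  // hypothetical fused helper
}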

I do like the idea of separating the interface from the actual
implementations. The Blob class is sort of halfway there - I was iterating
fast when writing all that code, but one can imagine a better separation
between the blob operation interfaces and the actual blob implementations
(e.g. do add(blob1, blob2), or conv(blob1, blob2)), which is essentially
what you are proposing here. At this stage, I don't think refactoring is an
urgent issue though.
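
For illustration, such an interface layer might be sketched as free
functions over blobs (the signatures below are hypothetical, not existing
Caffe declarations):

// Operation interfaces declared once, decoupled from any backend;
// a CPU and a GPU implementation can live behind the same declaration,
// with the output blob passed in explicitly rather than returned.
template <typename Dtype>
void add(const Blob<Dtype>& a, const Blob<Dtype>& b, Blob<Dtype>* out);

template <typename Dtype>
void conv(const Blob<Dtype>& input, const Blob<Dtype>& filter,
          Blob<Dtype>* output);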

Yangqing

On Thu, Jan 23, 2014 at 9:20 PM, kloudkl [email protected] wrote:

Currently, the algorithm code is quite aware of the memory layout of the
underlying data. Adding a Matrix class in between helps separate the
concerns of different modules, which is good practice in software
engineering.

The biggest benefit is to simplify coding and improve development
productivity. It will also ease understanding of the existing and future
algorithms. As a result, we will see accelerated development and adoption.

The Matrix class is intended to be a view of a 2D array contained in a
Blob. Its main functionality is to provide high-level wrappers for the
common operations.

using boost::move;

template <typename Dtype>
class Matrix {
 public:
  Matrix();
  Matrix(shared_ptr<Blob<Dtype> > blob);
  Matrix mul(Matrix& that) {
    Matrix product;
    caffe_gpu_gemm(...);
    return move(product);
  }
  Matrix add(Matrix& that);
  // minus, div, rdiv, sqr, pow, exp, conv, sum, max, min, mean, std,
  // ones, zeros, rand, randn, size, rows, cols, row, col, roi,
  // t/transpose, rot90, ...
 private:
  shared_ptr<Blob<Dtype> > blob_;
  size_t num_;
  size_t channel_;
  size_t offset_;
};

So that we can write code like the following snippets.
The convolution:

output = image.conv(filter);

The fully connected layer:

output = weight.mul(input).add(bias);

The ReLU activation:

activation = input.max(0);

The Softmax activation

activations = input.exp();
probs = activations.rdiv(activations.sum(dim));

As you can see, the API is highly inspired by its MATLAB counterparts,
which also motivated ArrayFire C++. But of course the snippets are only a
rough sketch, and many more details need to be considered. For example, if
the performance price of boost move operations is too high, they could be
replaced by shared_ptr, which would complicate the user code a little.
Another question is whether we should pass in a shared_ptr to the result
matrix instead of returning it. More importantly, the GPU code may differ
greatly from the CPU code depending on whether CUDA plays well with the
proposed API syntax.
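
As an illustration of the second question, the member signature might
change from the returning form to an output-parameter form (a sketch only):

Matrix mul(Matrix& that);                           // returns a temporary
void mul(Matrix& that, shared_ptr<Matrix> result);  // fills a preallocated
                                                    // result, so the gemm
                                                    // writes directly into
                                                    // an existing blob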

Therefore, this issue's scope is limited to the implementation of the
Matrix classes for both kinds of devices. Porting the algorithms should be
delayed until benchmark results show no performance gap between the
low-level API and the proposed high-level one.
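
A minimal sketch of such a benchmark harness, assuming both API variants
compute the same result and differ only in allocation behavior:

#include <ctime>

// Runs a callable 'iters' times and returns milliseconds per call, so
// the low-level and high-level implementations of the same layer can be
// timed under identical conditions.
template <typename F>
double benchmark_ms(F f, int iters) {
  const std::clock_t start = std::clock();
  for (int i = 0; i < iters; ++i) {
    f();
  }
  return 1000.0 * (std::clock() - start) / CLOCKS_PER_SEC / iters;
}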

Efforts to refine the APIs and help implement them are welcome.



kloudkl commented on April 27, 2024

Thanks for your suggestions! In the larger context of this proposal, I have
been wondering for a while: what are the vision, scope, prioritized do's,
and don'ts of Caffe? If you have a plan that can direct the community
toward a shared destination, it would concentrate the limited resources out
there and lead to more effective development and wider adoption in the near
future.


Yangqing commented on April 27, 2024

Closed per #85.

