
Comments (6)

sugarme commented on June 18, 2024

@oliverdain,

Thanks for the suggestion. I was wondering how this would work in combination with the libtorch installation for CPU and GPU, or does it handle the gotch installation only?

Feel free to open a PR. Please update the installation guide in the README.md file as well. Thanks.

from gotch.

oliverdain commented on June 18, 2024

Hi @sugarme. I've started on this. I installed libtorch for CPU and exported all the env vars, etc. Running go test ./... in a clone of this repo builds some, but not all, packages correctly and produces what look to me like real compilation errors. This happens with both libtorch 2.1.2 and libtorch 2.1.0 (the latter being what's in your setup.sh files).

I see that you automatically generated many of the bindings, but I didn't find much info on how that was done. Maybe they just need to be regenerated?

There are actually quite a few more failures than this, but this should give you the idea:

$ go test ./...
?       github.com/sugarme/gotch        [no test files]
ok      github.com/sugarme/gotch/dutil  0.678s
?       github.com/sugarme/gotch/example/augmentation   [no test files]
?       github.com/sugarme/gotch/example/basic  [no test files]
?       github.com/sugarme/gotch/example/char-rnn       [no test files]
?       github.com/sugarme/gotch/example/cifar  [no test files]
?       github.com/sugarme/gotch/example/convert-model  [no test files]
?       github.com/sugarme/gotch/example/debug-memory   [no test files]
?       github.com/sugarme/gotch/example/jit    [no test files]
?       github.com/sugarme/gotch/example/jit-train      [no test files]
?       github.com/sugarme/gotch/example/mem    [no test files]
# github.com/sugarme/gotch/example/neural-style-transfer
example/neural-style-transfer/main.go:156:19: cannot use &inputLayers[idx] (value of type **ts.Tensor) as *ts.Tensor value in argument to styleLoss
example/neural-style-transfer/main.go:156:38: cannot use &styleLayers[idx] (value of type **ts.Tensor) as *ts.Tensor value in argument to styleLoss
example/neural-style-transfer/main.go:162:38: cannot use &contentLayers[idx] (value of type **ts.Tensor) as *ts.Tensor value in argument to inputLayers[idx].MustMseLoss
?       github.com/sugarme/gotch/example/mnist  [no test files]
?       github.com/sugarme/gotch/example/mnist-fp16     [no test files]
# github.com/sugarme/gotch/example/scheduler
example/scheduler/main.go:39:18: cannot use []ts.Tensor{…} (value of type []ts.Tensor) as []*ts.Tensor value in argument to o.AddParamGroup
?       github.com/sugarme/gotch/example/pickle [no test files]
?       github.com/sugarme/gotch/example/pretrained-model       [no test files]
?       github.com/sugarme/gotch/example/tensor-grad    [no test files]
# github.com/sugarme/gotch/example/translation
example/translation/main.go:84:22: cannot use []ts.Tensor{…} (value of type []ts.Tensor) as []*ts.Tensor value in argument to ts.MustCat
example/translation/main.go:106:29: cannot use []ts.Tensor{…} (value of type []ts.Tensor) as []*ts.Tensor value in argument to ts.MustCat
example/translation/main.go:110:57: attnWeights.MustBmm(encOutputsTs, true).MustSqueeze1 undefined (type *ts.Tensor has no field or method MustSqueeze1)
example/translation/main.go:113:20: cannot use []ts.Tensor{…} (value of type []ts.Tensor) as []*ts.Tensor value in argument to ts.MustCat
example/translation/main.go:159:26: cannot use encOutputs (variable of type []ts.Tensor) as []*ts.Tensor value in argument to ts.MustStack
example/translation/main.go:219:26: cannot use encOutputs (variable of type []ts.Tensor) as []*ts.Tensor value in argument to ts.MustStack
# github.com/sugarme/gotch/example/yolo
example/yolo/darknet.go:401:29: cannot use []ts.Tensor{…} (value of type []ts.Tensor) as []*ts.Tensor value in argument to ts.MustCat
example/yolo/darknet.go:515:23: cannot use layers (variable of type []ts.Tensor) as []*ts.Tensor value in argument to ts.MustCat
example/yolo/darknet.go:543:21: cannot use detections (variable of type []ts.Tensor) as []*ts.Tensor value in argument to ts.MustCat
example/yolo/main.go:193:20: imageTmp.MustDiv1 undefined (type *ts.Tensor has no field or method MustDiv1)
?       github.com/sugarme/gotch/example/tensor-io      [no test files]
?       github.com/sugarme/gotch/example/transfer-learning      [no test files]
?       github.com/sugarme/gotch/example/yolo/freetype  [no test files]
?       github.com/sugarme/gotch/libtch [no test files]
ok      github.com/sugarme/gotch/half   (cached)
?       github.com/sugarme/gotch/vision [no test files]
?       github.com/sugarme/gotch/vision/aug     [no test files]
+---------------------------------------------------------------------------+
| Memory Stats: Start                                                       |
+---------------------------------------------------------------------------+
|  Allocated heap objects                                              773  |
|  Released heap objects                                                19  |
|  Living heap objects                                                 754  |
|  Memory in use by heap objects (bytes)                            298272  |
|  Reserved memory (by Go runtime for heap, stack,...) (bytes)    11926544  |
|  Total pause time by GC (nanoseconds)                                  0  |
|  Number of GC called                                                   0  |
+---------------------------------------------------------------------------+
vs created...
vs deleted...
+---------------------------------------------------------------------------+
| Memory Stats: Final                                                       |
+---------------------------------------------------------------------------+
|  Allocated heap objects                                             6726  |
|  Released heap objects                                              5977  |
|  Living heap objects                                                 749  |
|  Memory in use by heap objects (bytes)                            325360  |
|  Reserved memory (by Go runtime for heap, stack,...) (bytes)    83575064  |
|  Total pause time by GC (nanoseconds)                             542866  |
|  Number of GC called                                                  10  |
+---------------------------------------------------------------------------+
Loss: 23.000
Loss: 0.336
Loss: 0.307
Loss: 0.281
Loss: 0.257
2024/01/17 17:16:12 Libtorch API Error: element 0 of tensors does not require grad and does not have a grad_fn
Exception raised from run_backward at ../torch/csrc/autograd/autograd.cpp:109 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x6b (0x7fe8eb5ac1fb in /home/oliver/Documents/code/main/go/dist/libs/libtorch_cpu/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0xbf (0x7fe8eb5a6d6f in /home/oliver/Documents/code/main/go/dist/libs/libtorch_cpu/lib/libc10.so)
frame #2: <unknown function> + 0x45f4db8 (0x7fe8d9bf4db8 in /home/oliver/Documents/code/main/go/dist/libs/libtorch_cpu/lib/libtorch_cpu.so)
frame #3: torch::autograd::backward(std::vector<at::Tensor, std::allocator<at::Tensor> > const&, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, c10::optional<bool>, bool, std::vector<at::Tensor, std::allocator<at::Tensor> > const&) + 0x6a (0x7fe8d9bf82ea in /home/oliver/Documents/code/main/go/dist/libs/libtorch_cpu/lib/libtorch_cpu.so)
frame #4: <unknown function> + 0x46599bd (0x7fe8d9c599bd in /home/oliver/Documents/code/main/go/dist/libs/libtorch_cpu/lib/libtorch_cpu.so)
frame #5: at::Tensor::_backward(c10::ArrayRef<at::Tensor>, c10::optional<at::Tensor> const&, c10::optional<bool>, bool) const + 0x49 (0x7fe8d69a09f9 in /home/oliver/Documents/code/main/go/dist/libs/libtorch_cpu/lib/libtorch_cpu.so)
frame #6: at_backward + 0x4f (0x742edf in /tmp/go-build428359870/b198/nn.test)
frame #7: /tmp/go-build428359870/b198/nn.test() [0x55d2c4]

goroutine 34 [running]:
runtime/debug.Stack()
        /usr/local/go/src/runtime/debug/stack.go:24 +0x65
github.com/sugarme/gotch/ts.TorchErr()
        /home/oliver/Documents/code/gotch/ts/error.go:45 +0x4b
github.com/sugarme/gotch/ts.(*Tensor).Backward(0x616159?)
        /home/oliver/Documents/code/gotch/ts/tensor.go:813 +0x2a
github.com/sugarme/gotch/ts.(*Tensor).MustBackward(0x681ef9?)
        /home/oliver/Documents/code/gotch/ts/tensor.go:821 +0x19
github.com/sugarme/gotch/nn.(*Optimizer).BackwardStep(0xc00007d3e0, 0xc000014018?)
        /home/oliver/Documents/code/gotch/nn/optimizer.go:316 +0x7b
github.com/sugarme/gotch/nn_test.TestOptimizer(0xc00050ed00)
        /home/oliver/Documents/code/gotch/nn/optimizer_test.go:53 +0x57e
testing.tRunner(0xc00050ed00, 0x83ad18)
        /usr/local/go/src/testing/testing.go:1576 +0x10b
created by testing.(*T).Run
        /usr/local/go/src/testing/testing.go:1629 +0x3ea

FAIL    github.com/sugarme/gotch/nn     17.413s
--- FAIL: ExampleLoadInfo (55.28s)


sugarme commented on June 18, 2024

@oliverdain,

From the log, I think gotch actually compiled. Some tests failed as a result of the API change from []ts.Tensor (older API) to []*ts.Tensor, which I thought had been fixed already. As long as you can create a simple example with some tensor operations and get what you expect, I think gotch is okay with the bindings and APIs; just leave out the unit tests in subpackages.
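For readers hitting the same compile errors, here is a minimal sketch of what the []ts.Tensor to []*ts.Tensor change means for calling code. It deliberately does not import gotch: Tensor, toPtrs, and cat are hypothetical stand-ins (not gotch APIs) so the snippet runs without libtorch. The idea is simply to build a slice of pointers to the existing elements before passing it to a function that now expects []*Tensor.

```go
package main

import "fmt"

// Tensor is a hypothetical stand-in for ts.Tensor so this sketch
// runs without libtorch installed.
type Tensor struct {
	ID int
}

// toPtrs converts a []Tensor into a []*Tensor by taking the address of
// each slice element. Indexing (rather than ranging over values) makes
// every pointer refer to the original element, not to a loop copy.
func toPtrs(xs []Tensor) []*Tensor {
	out := make([]*Tensor, len(xs))
	for i := range xs {
		out[i] = &xs[i]
	}
	return out
}

// cat is a hypothetical function with the newer parameter shape
// ([]*Tensor). It only sums IDs; it performs no tensor operation.
func cat(tensors []*Tensor) int {
	total := 0
	for _, t := range tensors {
		total += t.ID
	}
	return total
}

func main() {
	old := []Tensor{{ID: 1}, {ID: 2}, {ID: 3}}
	// cat(old) would fail to compile: []Tensor is not []*Tensor.
	fmt.Println(cat(toPtrs(old)))
}
```

The reverse mistake also appears in the log: when a slice already holds pointers, writing &layers[idx] yields a **Tensor, so the element should be passed as layers[idx] directly.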

Please stick to libtorch 2.1.0 as the latest.

I created a branch, https://github.com/sugarme/gotch/tree/buildtag, that you can open a PR against for tracking.

Cheers,


oliverdain commented on June 18, 2024

Just sent you a PR that fixes the unit tests. I'll work on the build tag thing tomorrow.


sugarme commented on June 18, 2024

@oliverdain,

Thanks for the fix.


nullbull commented on June 18, 2024

I very much agree with this user's suggestion to use build tags instead of script-generated code. I plan to use this library in a production environment, but because this code is generated by a script and my company's platform has no way to run that script, my only option is to fork the repo and hardcode lib.go for CPU use.
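As a rough illustration of the build-tag idea discussed in this thread, here is a minimal sketch of a file selected by Go build constraints. The file name, the gpu tag, and the backend constant are all assumptions made for this example; none of them reflect how gotch's lib.go is actually organized.

```go
//go:build !gpu

// Hypothetical file backend_cpu.go: compiled by default and excluded
// when the build is run with `-tags gpu`.
package main

import "fmt"

// backend names the libtorch flavor this build targets. A sibling file
// (for example backend_gpu.go, guarded by `//go:build gpu`) would define
// the same constant as "gpu" and hold the CUDA-specific settings.
const backend = "cpu"

func main() {
	fmt.Println("libtorch backend:", backend)
}
```

With a layout like this, a plain go build picks the CPU file, while go build -tags gpu would pick the GPU counterpart, so no script has to rewrite source files before compiling.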

