This is a really helpful project! I'm trying to get this working with our build but th

Consider using build tags instead of a bash script (gotch, open issue)

oliverdain commented on June 18, 2024

Consider using build tags instead of a bash script

from gotch.

Comments (6)

sugarme commented on June 18, 2024

@oliverdain,

Thanks for the suggestion. I was wondering how this would work in combination with the libtorch installation for CPU and GPU, or would it handle the gotch installation only?

Feel free to open a PR. Please update the installation guide in the README.md file as well. Thanks.

oliverdain commented on June 18, 2024

Hi @sugarme. I've started on this. I installed libtorch for CPU and exported all the env vars, etc. Running go test ./... in a clone of this repo builds some, but not all, packages correctly and throws what look like real compilation errors to me. This happens with both libtorch 2.1.2 and libtorch 2.1.0 (the latter being what's in your setup.sh files).

I see that you automatically generated many of the bindings but didn't see much info on how that was done. I think maybe they just need to be re-generated?

There are actually quite a few more failures than this, but this should give you the idea:

$ go test ./...
?       github.com/sugarme/gotch        [no test files]
ok      github.com/sugarme/gotch/dutil  0.678s
?       github.com/sugarme/gotch/example/augmentation   [no test files]
?       github.com/sugarme/gotch/example/basic  [no test files]
?       github.com/sugarme/gotch/example/char-rnn       [no test files]
?       github.com/sugarme/gotch/example/cifar  [no test files]
?       github.com/sugarme/gotch/example/convert-model  [no test files]
?       github.com/sugarme/gotch/example/debug-memory   [no test files]
?       github.com/sugarme/gotch/example/jit    [no test files]
?       github.com/sugarme/gotch/example/jit-train      [no test files]
?       github.com/sugarme/gotch/example/mem    [no test files]
# github.com/sugarme/gotch/example/neural-style-transfer
example/neural-style-transfer/main.go:156:19: cannot use &inputLayers[idx] (value of type **ts.Tensor) as *ts.Tensor value in argument to styleLoss
example/neural-style-transfer/main.go:156:38: cannot use &styleLayers[idx] (value of type **ts.Tensor) as *ts.Tensor value in argument to styleLoss
example/neural-style-transfer/main.go:162:38: cannot use &contentLayers[idx] (value of type **ts.Tensor) as *ts.Tensor value in argument to inputLayers[idx].MustMseLoss
?       github.com/sugarme/gotch/example/mnist  [no test files]
?       github.com/sugarme/gotch/example/mnist-fp16     [no test files]
# github.com/sugarme/gotch/example/scheduler
example/scheduler/main.go:39:18: cannot use []ts.Tensor{…} (value of type []ts.Tensor) as []*ts.Tensor value in argument to o.AddParamGroup
?       github.com/sugarme/gotch/example/pickle [no test files]
?       github.com/sugarme/gotch/example/pretrained-model       [no test files]
?       github.com/sugarme/gotch/example/tensor-grad    [no test files]
# github.com/sugarme/gotch/example/translation
example/translation/main.go:84:22: cannot use []ts.Tensor{…} (value of type []ts.Tensor) as []*ts.Tensor value in argument to ts.MustCat
example/translation/main.go:106:29: cannot use []ts.Tensor{…} (value of type []ts.Tensor) as []*ts.Tensor value in argument to ts.MustCat
example/translation/main.go:110:57: attnWeights.MustBmm(encOutputsTs, true).MustSqueeze1 undefined (type *ts.Tensor has no field or method MustSqueeze1)
example/translation/main.go:113:20: cannot use []ts.Tensor{…} (value of type []ts.Tensor) as []*ts.Tensor value in argument to ts.MustCat
example/translation/main.go:159:26: cannot use encOutputs (variable of type []ts.Tensor) as []*ts.Tensor value in argument to ts.MustStack
example/translation/main.go:219:26: cannot use encOutputs (variable of type []ts.Tensor) as []*ts.Tensor value in argument to ts.MustStack
# github.com/sugarme/gotch/example/yolo
example/yolo/darknet.go:401:29: cannot use []ts.Tensor{…} (value of type []ts.Tensor) as []*ts.Tensor value in argument to ts.MustCat
example/yolo/darknet.go:515:23: cannot use layers (variable of type []ts.Tensor) as []*ts.Tensor value in argument to ts.MustCat
example/yolo/darknet.go:543:21: cannot use detections (variable of type []ts.Tensor) as []*ts.Tensor value in argument to ts.MustCat
example/yolo/main.go:193:20: imageTmp.MustDiv1 undefined (type *ts.Tensor has no field or method MustDiv1)
?       github.com/sugarme/gotch/example/tensor-io      [no test files]
?       github.com/sugarme/gotch/example/transfer-learning      [no test files]
?       github.com/sugarme/gotch/example/yolo/freetype  [no test files]
?       github.com/sugarme/gotch/libtch [no test files]
ok      github.com/sugarme/gotch/half   (cached)
?       github.com/sugarme/gotch/vision [no test files]
?       github.com/sugarme/gotch/vision/aug     [no test files]
+---------------------------------------------------------------------------+
| Memory Stats: Start                                                       |
+---------------------------------------------------------------------------+
|  Allocated heap objects                                              773  |
|  Released heap objects                                                19  |
|  Living heap objects                                                 754  |
|  Memory in use by heap objects (bytes)                            298272  |
|  Reserved memory (by Go runtime for heap, stack,...) (bytes)    11926544  |
|  Total pause time by GC (nanoseconds)                                  0  |
|  Number of GC called                                                   0  |
+---------------------------------------------------------------------------+
vs created...
vs deleted...
+---------------------------------------------------------------------------+
| Memory Stats: Final                                                       |
+---------------------------------------------------------------------------+
|  Allocated heap objects                                             6726  |
|  Released heap objects                                              5977  |
|  Living heap objects                                                 749  |
|  Memory in use by heap objects (bytes)                            325360  |
|  Reserved memory (by Go runtime for heap, stack,...) (bytes)    83575064  |
|  Total pause time by GC (nanoseconds)                             542866  |
|  Number of GC called                                                  10  |
+---------------------------------------------------------------------------+
Loss: 23.000
Loss: 0.336
Loss: 0.307
Loss: 0.281
Loss: 0.257
2024/01/17 17:16:12 Libtorch API Error: element 0 of tensors does not require grad and does not have a grad_fn
Exception raised from run_backward at ../torch/csrc/autograd/autograd.cpp:109 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x6b (0x7fe8eb5ac1fb in /home/oliver/Documents/code/main/go/dist/libs/libtorch_cpu/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0xbf (0x7fe8eb5a6d6f in /home/oliver/Documents/code/main/go/dist/libs/libtorch_cpu/lib/libc10.so)
frame #2: <unknown function> + 0x45f4db8 (0x7fe8d9bf4db8 in /home/oliver/Documents/code/main/go/dist/libs/libtorch_cpu/lib/libtorch_cpu.so)
frame #3: torch::autograd::backward(std::vector<at::Tensor, std::allocator<at::Tensor> > const&, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, c10::optional<bool>, bool, std::vector<at::Tensor, std::allocator<at::Tensor> > const&) + 0x6a (0x7fe8d9bf82ea in /home/oliver/Documents/code/main/go/dist/libs/libtorch_cpu/lib/libtorch_cpu.so)
frame #4: <unknown function> + 0x46599bd (0x7fe8d9c599bd in /home/oliver/Documents/code/main/go/dist/libs/libtorch_cpu/lib/libtorch_cpu.so)
frame #5: at::Tensor::_backward(c10::ArrayRef<at::Tensor>, c10::optional<at::Tensor> const&, c10::optional<bool>, bool) const + 0x49 (0x7fe8d69a09f9 in /home/oliver/Documents/code/main/go/dist/libs/libtorch_cpu/lib/libtorch_cpu.so)
frame #6: at_backward + 0x4f (0x742edf in /tmp/go-build428359870/b198/nn.test)
frame #7: /tmp/go-build428359870/b198/nn.test() [0x55d2c4]

goroutine 34 [running]:
runtime/debug.Stack()
        /usr/local/go/src/runtime/debug/stack.go:24 +0x65
github.com/sugarme/gotch/ts.TorchErr()
        /home/oliver/Documents/code/gotch/ts/error.go:45 +0x4b
github.com/sugarme/gotch/ts.(*Tensor).Backward(0x616159?)
        /home/oliver/Documents/code/gotch/ts/tensor.go:813 +0x2a
github.com/sugarme/gotch/ts.(*Tensor).MustBackward(0x681ef9?)
        /home/oliver/Documents/code/gotch/ts/tensor.go:821 +0x19
github.com/sugarme/gotch/nn.(*Optimizer).BackwardStep(0xc00007d3e0, 0xc000014018?)
        /home/oliver/Documents/code/gotch/nn/optimizer.go:316 +0x7b
github.com/sugarme/gotch/nn_test.TestOptimizer(0xc00050ed00)
        /home/oliver/Documents/code/gotch/nn/optimizer_test.go:53 +0x57e
testing.tRunner(0xc00050ed00, 0x83ad18)
        /usr/local/go/src/testing/testing.go:1576 +0x10b
created by testing.(*T).Run
        /usr/local/go/src/testing/testing.go:1629 +0x3ea

FAIL    github.com/sugarme/gotch/nn     17.413s
--- FAIL: ExampleLoadInfo (55.28s)

sugarme commented on June 18, 2024

@oliverdain,

From the log, I think gotch itself actually compiled. Some examples and tests failed because of the API change from []ts.Tensor (older API) to []*ts.Tensor, which I thought had been fixed already. As long as you can create a simple example with some tensor operations and get the results you expect, I think gotch's bindings and APIs are okay; just leave out the unit tests in the subpackages.
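For readers hitting the same "cannot use []ts.Tensor as []*ts.Tensor" errors, the fix on the caller side is usually to build a slice of pointers. The following is only a minimal sketch: Tensor is a stand-in type so the code compiles without libtorch, and countTensors and toPointers are hypothetical helpers, not part of gotch.

```go
package main

import "fmt"

// Tensor is a stand-in for gotch's ts.Tensor (an assumption for illustration),
// so this sketch compiles without libtorch installed.
type Tensor struct{ ID int }

// countTensors is a hypothetical function with the newer-style signature:
// it accepts a slice of pointers, as ts.MustCat does in the log above.
func countTensors(tensors []*Tensor) int { return len(tensors) }

// toPointers converts a []Tensor into a []*Tensor without copying elements.
func toPointers(values []Tensor) []*Tensor {
	ptrs := make([]*Tensor, len(values))
	for i := range values {
		// Take the address of the slice element itself, not of a loop copy.
		ptrs[i] = &values[i]
	}
	return ptrs
}

func main() {
	old := []Tensor{{ID: 1}, {ID: 2}}
	// countTensors(old) would not compile: []Tensor is not assignable to []*Tensor.
	fmt.Println(countTensors(toPointers(old)))
}
```

In new code it is simpler to declare the slice as []*Tensor from the start, so no conversion is needed.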

Please stick to libtorch 2.1.0, which is the latest version to use with gotch.

I created a branch https://github.com/sugarme/gotch/tree/buildtag that you can open a PR against for tracking.

Cheers,

oliverdain commented on June 18, 2024

Just sent you a PR that fixes the unit tests. I'll work on the build tag thing tomorrow.

sugarme commented on June 18, 2024

@oliverdain,

Thanks for the fix.

nullbull commented on June 18, 2024

I very much agree with the suggestion to use build tags instead of a script that rewrites the code. I plan to use this library in a production environment, but because this code is generated by the script, and our company's platform gives us no way to run that script, my only option was to fork the repo and hardcode lib.go for CPU use.
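To illustrate what the build-tag approach discussed in this thread could look like, here is a minimal sketch of a file that is compiled only when a gpu tag is absent. The tag name gpu, the file name lib_cpu.go, and the backend constant are all assumptions for illustration; this is not gotch's actual layout, and the real lib.go would carry cgo linker settings rather than a string constant.

```go
//go:build !gpu

// lib_cpu.go (hypothetical file name): compiled by default and skipped
// when the build is run with the "gpu" tag.
package main

import "fmt"

// backend names the libtorch flavour this build targets. A sibling file
// guarded by "//go:build gpu" would define the same constant with a
// different value, so exactly one definition is compiled per build.
const backend = "cpu"

// describeBackend reports which variant was selected at compile time.
func describeBackend() string {
	return "libtorch backend: " + backend
}

func main() {
	fmt.Println(describeBackend())
}
```

With such a layout, a plain go build would pick the CPU file, while go build -tags gpu would pick the GPU sibling, so no script has to rewrite source files before compiling.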
