Comments (6)
Thanks for the suggestion. I was wondering how it would work in combination with the libtorch installation for CPU and GPU, or does it handle the gotch installation only?
Feel free to PR. Please update the installation guide in the README.md file as well. Thanks.
from gotch.
Hi @sugarme. I've started on this. I got libtorch installed for CPU and exported all the env vars, etc. Running go test ./... in a clone of this repo builds some, but not all, packages correctly and throws what look like real compilation errors to me. This happens with both libtorch 2.1.2 and libtorch 2.1.0 (the latter being what's in your setup.sh files).
I see that you automatically generated many of the bindings, but I didn't see much info on how that was done. I think maybe they just need to be regenerated?
There are actually quite a few more failures than this, but this should give you the idea:
$ go test ./...
? github.com/sugarme/gotch [no test files]
ok github.com/sugarme/gotch/dutil 0.678s
? github.com/sugarme/gotch/example/augmentation [no test files]
? github.com/sugarme/gotch/example/basic [no test files]
? github.com/sugarme/gotch/example/char-rnn [no test files]
? github.com/sugarme/gotch/example/cifar [no test files]
? github.com/sugarme/gotch/example/convert-model [no test files]
? github.com/sugarme/gotch/example/debug-memory [no test files]
? github.com/sugarme/gotch/example/jit [no test files]
? github.com/sugarme/gotch/example/jit-train [no test files]
? github.com/sugarme/gotch/example/mem [no test files]
# github.com/sugarme/gotch/example/neural-style-transfer
example/neural-style-transfer/main.go:156:19: cannot use &inputLayers[idx] (value of type **ts.Tensor) as *ts.Tensor value in argument to styleLoss
example/neural-style-transfer/main.go:156:38: cannot use &styleLayers[idx] (value of type **ts.Tensor) as *ts.Tensor value in argument to styleLoss
example/neural-style-transfer/main.go:162:38: cannot use &contentLayers[idx] (value of type **ts.Tensor) as *ts.Tensor value in argument to inputLayers[idx].MustMseLoss
? github.com/sugarme/gotch/example/mnist [no test files]
? github.com/sugarme/gotch/example/mnist-fp16 [no test files]
# github.com/sugarme/gotch/example/scheduler
example/scheduler/main.go:39:18: cannot use []ts.Tensor{…} (value of type []ts.Tensor) as []*ts.Tensor value in argument to o.AddParamGroup
? github.com/sugarme/gotch/example/pickle [no test files]
? github.com/sugarme/gotch/example/pretrained-model [no test files]
? github.com/sugarme/gotch/example/tensor-grad [no test files]
# github.com/sugarme/gotch/example/translation
example/translation/main.go:84:22: cannot use []ts.Tensor{…} (value of type []ts.Tensor) as []*ts.Tensor value in argument to ts.MustCat
example/translation/main.go:106:29: cannot use []ts.Tensor{…} (value of type []ts.Tensor) as []*ts.Tensor value in argument to ts.MustCat
example/translation/main.go:110:57: attnWeights.MustBmm(encOutputsTs, true).MustSqueeze1 undefined (type *ts.Tensor has no field or method MustSqueeze1)
example/translation/main.go:113:20: cannot use []ts.Tensor{…} (value of type []ts.Tensor) as []*ts.Tensor value in argument to ts.MustCat
example/translation/main.go:159:26: cannot use encOutputs (variable of type []ts.Tensor) as []*ts.Tensor value in argument to ts.MustStack
example/translation/main.go:219:26: cannot use encOutputs (variable of type []ts.Tensor) as []*ts.Tensor value in argument to ts.MustStack
# github.com/sugarme/gotch/example/yolo
example/yolo/darknet.go:401:29: cannot use []ts.Tensor{…} (value of type []ts.Tensor) as []*ts.Tensor value in argument to ts.MustCat
example/yolo/darknet.go:515:23: cannot use layers (variable of type []ts.Tensor) as []*ts.Tensor value in argument to ts.MustCat
example/yolo/darknet.go:543:21: cannot use detections (variable of type []ts.Tensor) as []*ts.Tensor value in argument to ts.MustCat
example/yolo/main.go:193:20: imageTmp.MustDiv1 undefined (type *ts.Tensor has no field or method MustDiv1)
? github.com/sugarme/gotch/example/tensor-io [no test files]
? github.com/sugarme/gotch/example/transfer-learning [no test files]
? github.com/sugarme/gotch/example/yolo/freetype [no test files]
? github.com/sugarme/gotch/libtch [no test files]
ok github.com/sugarme/gotch/half (cached)
? github.com/sugarme/gotch/vision [no test files]
? github.com/sugarme/gotch/vision/aug [no test files]
+---------------------------------------------------------------------------+
| Memory Stats: Start |
+---------------------------------------------------------------------------+
| Allocated heap objects 773 |
| Released heap objects 19 |
| Living heap objects 754 |
| Memory in use by heap objects (bytes) 298272 |
| Reserved memory (by Go runtime for heap, stack,...) (bytes) 11926544 |
| Total pause time by GC (nanoseconds) 0 |
| Number of GC called 0 |
+---------------------------------------------------------------------------+
vs created...
vs deleted...
+---------------------------------------------------------------------------+
| Memory Stats: Final |
+---------------------------------------------------------------------------+
| Allocated heap objects 6726 |
| Released heap objects 5977 |
| Living heap objects 749 |
| Memory in use by heap objects (bytes) 325360 |
| Reserved memory (by Go runtime for heap, stack,...) (bytes) 83575064 |
| Total pause time by GC (nanoseconds) 542866 |
| Number of GC called 10 |
+---------------------------------------------------------------------------+
Loss: 23.000
Loss: 0.336
Loss: 0.307
Loss: 0.281
Loss: 0.257
2024/01/17 17:16:12 Libtorch API Error: element 0 of tensors does not require grad and does not have a grad_fn
Exception raised from run_backward at ../torch/csrc/autograd/autograd.cpp:109 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) + 0x6b (0x7fe8eb5ac1fb in /home/oliver/Documents/code/main/go/dist/libs/libtorch_cpu/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 0xbf (0x7fe8eb5a6d6f in /home/oliver/Documents/code/main/go/dist/libs/libtorch_cpu/lib/libc10.so)
frame #2: <unknown function> + 0x45f4db8 (0x7fe8d9bf4db8 in /home/oliver/Documents/code/main/go/dist/libs/libtorch_cpu/lib/libtorch_cpu.so)
frame #3: torch::autograd::backward(std::vector<at::Tensor, std::allocator<at::Tensor> > const&, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, c10::optional<bool>, bool, std::vector<at::Tensor, std::allocator<at::Tensor> > const&) + 0x6a (0x7fe8d9bf82ea in /home/oliver/Documents/code/main/go/dist/libs/libtorch_cpu/lib/libtorch_cpu.so)
frame #4: <unknown function> + 0x46599bd (0x7fe8d9c599bd in /home/oliver/Documents/code/main/go/dist/libs/libtorch_cpu/lib/libtorch_cpu.so)
frame #5: at::Tensor::_backward(c10::ArrayRef<at::Tensor>, c10::optional<at::Tensor> const&, c10::optional<bool>, bool) const + 0x49 (0x7fe8d69a09f9 in /home/oliver/Documents/code/main/go/dist/libs/libtorch_cpu/lib/libtorch_cpu.so)
frame #6: at_backward + 0x4f (0x742edf in /tmp/go-build428359870/b198/nn.test)
frame #7: /tmp/go-build428359870/b198/nn.test() [0x55d2c4]
goroutine 34 [running]:
runtime/debug.Stack()
/usr/local/go/src/runtime/debug/stack.go:24 +0x65
github.com/sugarme/gotch/ts.TorchErr()
/home/oliver/Documents/code/gotch/ts/error.go:45 +0x4b
github.com/sugarme/gotch/ts.(*Tensor).Backward(0x616159?)
/home/oliver/Documents/code/gotch/ts/tensor.go:813 +0x2a
github.com/sugarme/gotch/ts.(*Tensor).MustBackward(0x681ef9?)
/home/oliver/Documents/code/gotch/ts/tensor.go:821 +0x19
github.com/sugarme/gotch/nn.(*Optimizer).BackwardStep(0xc00007d3e0, 0xc000014018?)
/home/oliver/Documents/code/gotch/nn/optimizer.go:316 +0x7b
github.com/sugarme/gotch/nn_test.TestOptimizer(0xc00050ed00)
/home/oliver/Documents/code/gotch/nn/optimizer_test.go:53 +0x57e
testing.tRunner(0xc00050ed00, 0x83ad18)
/usr/local/go/src/testing/testing.go:1576 +0x10b
created by testing.(*T).Run
/usr/local/go/src/testing/testing.go:1629 +0x3ea
FAIL github.com/sugarme/gotch/nn 17.413s
--- FAIL: ExampleLoadInfo (55.28s)
From the log, I think gotch actually compiled. Some tests failed as a result of the API change from []ts.Tensor (older API) to []*ts.Tensor, which I thought had been fixed already. As long as you can create a simple example with some tensor operations and get what you expect, I think gotch is okay with the bindings and APIs; just leave out the unit tests in the subpackages.
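The compile errors in the log follow from that change: call sites still build []ts.Tensor values, or take the address of an element that is already a pointer (hence **ts.Tensor). A minimal sketch of the adjustment is below. It assumes a stand-in Tensor type and hypothetical helpers (toPointers, names) so that it compiles without libtorch; it is not gotch's own code.

```go
package main

import (
	"fmt"
	"strings"
)

// Tensor is a hypothetical stand-in for ts.Tensor, used only so this
// sketch builds without gotch or libtorch installed.
type Tensor struct{ Name string }

// toPointers converts an older-style value slice into the pointer slice
// that the newer function signatures expect.
func toPointers(vals []Tensor) []*Tensor {
	ptrs := make([]*Tensor, len(vals))
	for i := range vals {
		// Take the address of the slice element itself, not of a loop copy.
		ptrs[i] = &vals[i]
	}
	return ptrs
}

// names stands in for a function that now takes []*Tensor
// (the role ts.MustCat plays in the log above).
func names(tensors []*Tensor) string {
	parts := make([]string, 0, len(tensors))
	for _, t := range tensors {
		parts = append(parts, t.Name)
	}
	return strings.Join(parts, ",")
}

func main() {
	old := []Tensor{{Name: "a"}, {Name: "b"}}
	fmt.Println(names(toPointers(old))) // prints "a,b"
}
```

Where a slice is already []*Tensor, the element can be passed directly (layers[idx] rather than &layers[idx]), which is the case behind the **ts.Tensor errors.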
Please stick to libtorch 2.1.0, which is the latest version to use with gotch.
I created a branch https://github.com/sugarme/gotch/tree/buildtag that you can PR for tracking.
Cheers,
Just sent you a PR that fixes the unit tests. I'll work on the build tag thing tomorrow.
Thanks for the fix.
I very much agree with this user's suggestion to use build tags instead of a script that rewrites the code. I plan to use this library in a production environment, but because this code is generated by a script, and my company's platform cannot run that script to replace the code, my only option is to fork the repository and hardcode lib.go for CPU use.
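For readers unfamiliar with the idea, here is a minimal sketch of how a Go build constraint can select CPU or GPU code at compile time without any script. The file name, the gpu tag, and the identifiers are hypothetical; gotch's actual lib.go is not reproduced here.

```go
//go:build !gpu

// Hypothetical file device_cpu.go. A sibling file (say device_gpu.go) would
// start with "//go:build gpu" and define the same identifiers for CUDA.
package main

import "fmt"

// useGPU is fixed at compile time by whichever file the constraint selects.
const useGPU = false

// deviceName reports which variant was compiled in.
func deviceName() string {
	if useGPU {
		return "cuda"
	}
	return "cpu"
}

func main() {
	fmt.Println("built for:", deviceName())
}
```

With this layout, a plain go build picks the CPU file, and go build -tags gpu would pick the GPU one, so a consumer of the module does not need to run anything that edits source files.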