gotch's Introduction

Overview

gotch is a thin wrapper around the PyTorch C++ API (Libtorch). It builds on Libtorch's already optimized C++ tensor APIs (3039) and dynamic graph computation with CUDA support, and provides idiomatic Go APIs for developing and implementing deep learning in Go.

Some features are:

  • Comprehensive PyTorch tensor APIs (2525)
  • Fully featured PyTorch dynamic graph computation
  • JIT interface to run models trained/saved using the PyTorch Python API
  • Load pretrained PyTorch models and run inference
  • Pure Go APIs to build and train neural network models with both CPU and GPU support
  • Most recent image models
  • NLP language models: Transformer in a separate package built with gotch and a pure Go Tokenizer.

gotch is in active development and may introduce breaking API changes. Feel free to open pull requests, report issues or discuss any concerns. All contributions are welcome.

The current version of gotch is v0.9.1.

Dependencies

  • Libtorch C++ v2.1.0 library of PyTorch
  • Clang-17/Clang++-17 compilers

Installation

  • The default CUDA version is 11.8 if CUDA is available; otherwise the CPU version is used.
  • The default PyTorch C++ API version is 2.1.0.

NOTE: libtorch will be installed at /usr/local/lib

CPU

Step 1: Set up libtorch (skip this step if a valid libtorch is already installed on your machine!)

    wget https://github.com/sugarme/gotch/releases/download/v0.9.0/setup-libtorch.sh
    chmod +x setup-libtorch.sh
    export CUDA_VER=cpu && bash setup-libtorch.sh

Update Environment: on Debian/Ubuntu, add or update the following lines in the .bashrc file

    export GOTCH_LIBTORCH="/usr/local/lib/libtorch"
    export LIBRARY_PATH="$LIBRARY_PATH:$GOTCH_LIBTORCH/lib"
    export CPATH="$CPATH:$GOTCH_LIBTORCH/lib:$GOTCH_LIBTORCH/include:$GOTCH_LIBTORCH/include/torch/csrc/api/include"
    export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$GOTCH_LIBTORCH/lib"

Step 2: Setup gotch

    wget https://github.com/sugarme/gotch/releases/download/v0.9.0/setup-gotch.sh
    chmod +x setup-gotch.sh
    export CUDA_VER=cpu && export GOTCH_VER=v0.9.1 && bash setup-gotch.sh

GPU

NOTE: make sure your machine has a working CUDA installation.

Step 1: Set up libtorch (skip this step if a valid libtorch is already installed on your machine!)

    wget https://github.com/sugarme/gotch/releases/download/v0.9.0/setup-libtorch.sh
    chmod +x setup-libtorch.sh

    export CUDA_VER=11.8 && bash setup-libtorch.sh

Update Environment: on Debian/Ubuntu, add or update the following lines in the .bashrc file

    export GOTCH_LIBTORCH="/usr/local/lib/libtorch"
    export LIBRARY_PATH="$LIBRARY_PATH:$GOTCH_LIBTORCH/lib"
    export CPATH="$CPATH:$GOTCH_LIBTORCH/lib:$GOTCH_LIBTORCH/include:$GOTCH_LIBTORCH/include/torch/csrc/api/include"
    export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$GOTCH_LIBTORCH/lib:/usr/lib64-nvidia:/usr/local/cuda-${CUDA_VERSION}/lib64"

Step 2: Setup gotch

    wget https://github.com/sugarme/gotch/releases/download/v0.9.0/setup-gotch.sh
    chmod +x setup-gotch.sh
    export CUDA_VER=11.8 && export GOTCH_VER=v0.9.1 && bash setup-gotch.sh
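Several of the problems reported further down this page come from environment variables that were not picked up by the shell. As a rough aid, the following sketch checks whether the libtorch lib directory from the steps above appears in LIBRARY_PATH and LD_LIBRARY_PATH. It uses only the Go standard library; the helper name missingFromPathList is made up for this illustration and is not part of gotch.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// missingFromPathList reports which of the wanted directories are absent
// from a colon-separated path list such as LD_LIBRARY_PATH.
// (Hypothetical helper, not part of gotch.)
func missingFromPathList(pathList string, wanted []string) []string {
	present := make(map[string]bool)
	for _, p := range strings.Split(pathList, ":") {
		if p != "" {
			present[filepath.Clean(p)] = true
		}
	}
	var missing []string
	for _, w := range wanted {
		if !present[filepath.Clean(w)] {
			missing = append(missing, w)
		}
	}
	return missing
}

func main() {
	libtorch := os.Getenv("GOTCH_LIBTORCH")
	if libtorch == "" {
		fmt.Println("GOTCH_LIBTORCH is not set")
		return
	}
	libDir := filepath.Join(libtorch, "lib")
	for _, name := range []string{"LIBRARY_PATH", "LD_LIBRARY_PATH"} {
		if missing := missingFromPathList(os.Getenv(name), []string{libDir}); len(missing) > 0 {
			fmt.Printf("%s is missing %v\n", name, missing)
		} else {
			fmt.Printf("%s contains %s\n", name, libDir)
		}
	}
}
```

Running it in a fresh shell (after sourcing .bashrc) shows whether the exports actually reached the environment that go build and go run will see.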

Examples

Basic tensor operations

import (
	"fmt"

	"github.com/sugarme/gotch"
	"github.com/sugarme/gotch/ts"
)

func basicOps() {

xs := ts.MustRand([]int64{3, 5, 6}, gotch.Float, gotch.CPU)
fmt.Printf("%8.3f\n", xs)
fmt.Printf("%i", xs)

/*
(1,.,.) =
   0.391     0.055     0.638     0.514     0.757     0.446  
   0.817     0.075     0.437     0.452     0.077     0.492  
   0.504     0.945     0.863     0.243     0.254     0.640  
   0.850     0.132     0.763     0.572     0.216     0.116  
   0.410     0.660     0.156     0.336     0.885     0.391  

(2,.,.) =
   0.952     0.731     0.380     0.390     0.374     0.001  
   0.455     0.142     0.088     0.039     0.862     0.939  
   0.621     0.198     0.728     0.914     0.168     0.057  
   0.655     0.231     0.680     0.069     0.803     0.243  
   0.853     0.729     0.983     0.534     0.749     0.624  

(3,.,.) =
   0.734     0.447     0.914     0.956     0.269     0.000  
   0.427     0.034     0.477     0.535     0.440     0.972  
   0.407     0.945     0.099     0.184     0.778     0.058  
   0.482     0.996     0.085     0.605     0.282     0.671  
   0.887     0.029     0.005     0.216     0.354     0.262  



TENSOR INFO:
        Shape:          [3 5 6]
        DType:          float32
        Device:         {CPU 1}
        Defined:        true
*/

// Basic tensor operations
ts1 := ts.MustArange(ts.IntScalar(6), gotch.Int64, gotch.CPU).MustView([]int64{2, 3}, true)
defer ts1.MustDrop()
ts2 := ts.MustOnes([]int64{3, 4}, gotch.Int64, gotch.CPU)
defer ts2.MustDrop()

mul := ts1.MustMatmul(ts2, false)
defer mul.MustDrop()

fmt.Printf("ts1:\n%2d", ts1)
fmt.Printf("ts2:\n%2d", ts2)
fmt.Printf("mul tensor (ts1 x ts2):\n%2d", mul)

/*
ts1:
 0   1   2  
 3   4   5  

ts2:
 1   1   1   1  
 1   1   1   1  
 1   1   1   1  

mul tensor (ts1 x ts2):
 3   3   3   3  
12  12  12  12  
*/


// In-place operation
ts3 := ts.MustOnes([]int64{2, 3}, gotch.Float, gotch.CPU)
fmt.Printf("Before:\n%v", ts3)
ts3.MustAddScalar_(ts.FloatScalar(2.0))
fmt.Printf("After (ts3 + 2.0):\n%v", ts3)

/*
Before:
1  1  1  
1  1  1  

After (ts3 + 2.0):
3  3  3  
3  3  3  
*/
}
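The matrix product printed above can be checked by hand. The sketch below reproduces the same arithmetic with plain Go slices instead of tensors; it does not use gotch, and the matmul helper is written only for this illustration.

```go
package main

import "fmt"

// matmul multiplies an m×k matrix by a k×n matrix stored as nested slices.
func matmul(a, b [][]int64) [][]int64 {
	m, k, n := len(a), len(b), len(b[0])
	out := make([][]int64, m)
	for i := 0; i < m; i++ {
		out[i] = make([]int64, n)
		for j := 0; j < n; j++ {
			for p := 0; p < k; p++ {
				out[i][j] += a[i][p] * b[p][j]
			}
		}
	}
	return out
}

func main() {
	// Same values as ts1 (arange(6) viewed as 2x3) and ts2 (3x4 of ones).
	ts1 := [][]int64{{0, 1, 2}, {3, 4, 5}}
	ts2 := [][]int64{{1, 1, 1, 1}, {1, 1, 1, 1}, {1, 1, 1, 1}}
	for _, row := range matmul(ts1, ts2) {
		fmt.Println(row) // [3 3 3 3] then [12 12 12 12]
	}
}
```

Because ts2 is all ones, each output entry is simply the sum of the corresponding row of ts1: 0+1+2 = 3 and 3+4+5 = 12.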

Simplified Convolutional neural network

import (
    "fmt"

    "github.com/sugarme/gotch"
    "github.com/sugarme/gotch/nn"
    "github.com/sugarme/gotch/ts"
)

type Net struct {
    conv1 *nn.Conv2D
    conv2 *nn.Conv2D
    fc    *nn.Linear
}

func newNet(vs *nn.Path) *Net {
    conv1 := nn.NewConv2D(vs, 1, 16, 2, nn.DefaultConv2DConfig())
    conv2 := nn.NewConv2D(vs, 16, 10, 2, nn.DefaultConv2DConfig())
    fc := nn.NewLinear(vs, 10, 10, nn.DefaultLinearConfig())

    return &Net{
        conv1,
        conv2,
        fc,
    }
}

func (n Net) ForwardT(xs *ts.Tensor, train bool) *ts.Tensor {
    xs = xs.MustView([]int64{-1, 1, 8, 8}, false)

    outC1 := xs.Apply(n.conv1)
    outMP1 := outC1.MaxPool2DDefault(2, true)
    defer outMP1.MustDrop()

    outC2 := outMP1.Apply(n.conv2)
    outMP2 := outC2.MaxPool2DDefault(2, true)
    outView2 := outMP2.MustView([]int64{-1, 10}, true)
    defer outView2.MustDrop()

    outFC := outView2.Apply(n.fc)
    return outFC.MustRelu(true)
}

func main() {

    vs := nn.NewVarStore(gotch.CPU)
    net := newNet(vs.Root())

    xs := ts.MustOnes([]int64{8, 8}, gotch.Float, gotch.CPU)

    logits := net.ForwardT(xs, false)
    fmt.Printf("Logits: %0.3f", logits)
}

//Logits: 0.000  0.000  0.000  0.225  0.321  0.147  0.000  0.207  0.000  0.000
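It can help to trace why an 8x8 input ends up as exactly 10 features before the linear layer. The sketch below walks the spatial size through the network using the standard output-size formula. It assumes the default convolution config means stride 1 and no padding, and that MaxPool2DDefault uses a stride equal to its kernel size; both are assumptions about the defaults, not something stated in the text above.

```go
package main

import "fmt"

// outSize returns the output length along one spatial dimension for a
// convolution or pooling window (dilation 1):
// floor((in + 2*padding - kernel) / stride) + 1.
func outSize(in, kernel, stride, padding int64) int64 {
	return (in+2*padding-kernel)/stride + 1
}

func main() {
	size := int64(8)              // input is viewed as [-1, 1, 8, 8]
	size = outSize(size, 2, 1, 0) // conv1, kernel 2: 8 -> 7
	size = outSize(size, 2, 2, 0) // max pool 2:      7 -> 3
	size = outSize(size, 2, 1, 0) // conv2, kernel 2: 3 -> 2
	size = outSize(size, 2, 2, 0) // max pool 2:      2 -> 1
	fmt.Println(size)             // 1
}
```

Under these assumptions conv2 leaves 10 channels of size 1x1, so the view to [-1, 10] yields one row of 10 features, which matches the 10 logits printed above.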

Play with gotch on Google Colab or locally

Getting Started

License

gotch is Apache 2.0 licensed.

Acknowledgement

  • This project was inspired by, and uses many concepts from, tch-rs, the Libtorch Rust binding.

gotch's People

Contributors

pjongy, strongerxi, sugarme, tony84727

gotch's Issues

Cuda is not available

CPU only works fine; but when I run

gotch.NewCuda().CudaIfAvailable()

I get "Cuda is not available."

Can anybody help me out?

I have very carefully followed every step in the installation guide. And even got the creating $GOTCH_LIB_FILE for GPU message at the end of the installation process.

OS: Ubuntu 20.04
GPU: RTX 3060

/usr/local/cuda-11.3 ✅✅

$ ls /usr/local/cuda/include | grep cudnn
cudnn.h
$ ls /usr/local/cuda/lib64 | grep cudnn  
libcudnn_adv_infer.so
libcudnn_adv_infer.so.8
(...)
libcudnn_static.a
libcudnn_static_v8.a

$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Mon_May__3_19:15:13_PDT_2021
Cuda compilation tools, release 11.3, V11.3.109
Build cuda_11.3.r11.3/compiler.29920130_0

$ echo $LD_LIBRARY_PATH;
$ echo $CUDA_VERSION;
$ echo $CUDA_VERSION;
$ echo $CU_VERSION;
$ echo $GOTCH_LIBTORCH;
$ echo $LIBRARY_PATH;
$ echo $CPATH;
$ echo $LD_LIBRARY_PATH;

/usr/local/cuda-11.3/lib64::/usr/local/lib/libtorch/lib:/usr/lib64-nvidia:/usr/local/cuda-11.3/lib64
11.3
11.3

/usr/local/lib/libtorch
:/usr/local/lib/libtorch/lib
:/usr/local/lib/libtorch/lib:/usr/local/lib/libtorch/include:/usr/local/lib/libtorch/include/torch/csrc/api/include
/usr/local/cuda-11.3/lib64::/usr/local/lib/libtorch/lib:/usr/lib64-nvidia:/usr/local/cuda-11.3/lib64
$ nvidia-smi

Tue Aug 23 00:56:51 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01    Driver Version: 515.65.01    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:03:00.0  On |                  N/A |
|  0%   39C    P8    19W / 170W |    283MiB / 12288MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1049      G   /usr/lib/xorg/Xorg                 29MiB |
|    0   N/A  N/A      1567      G   /usr/lib/xorg/Xorg                 97MiB |
|    0   N/A  N/A      1701      G   /usr/bin/gnome-shell               40MiB |
|    0   N/A  N/A      2081      G   ...867554538896762554,131072       65MiB |
|    0   N/A  N/A     31849      G   ...RendererForSitePerProcess       39MiB |
+-----------------------------------------------------------------------------+

One interesting thing:

when I run the go program with go run *.go it prints the error message "Cuda is not available." as I mentioned before; but when I run go build *.go && ./main it gets stuck. No output, nothing.

nvidia drivers + CUDA 11.3 + cudnn were installed using this gist / script in a fresh Ubuntu 20.04 partition

then installed libtorch according to the README guide using export CUDA_VER=11.3 && bash setup-libtorch.sh

and finally gotch using export CUDA_VER=11.3 && export GOTCH_VER=v0.7.0 && bash setup-gotch.sh

no errors linking nor compiling. everything just executed fine as supposed.

I have already tried with Ubuntu 22.04 + CUDA 11.7 + libtorch 1.12 (not 1.11) but wouldn't even compile.
Maybe I'll try Ubuntu 18 + CUDA 10.2
Or downgrading nvidia drivers

🤷‍♂️🤷‍♂️

Any help appreciated

Error trying to get Tuple of tensor with NewIValue

Hi, I'm trying to convert 2 tensors into a tuple. However, when I try to use ts.NewIValue([]ts.IValue{tensora, tensorb}) I get this error: NewIValue method call - 'slice -> struct' case - Unsupported type (slice). I managed to make it work by editing line 90 of the tensor/jit.go file to:

switch reflect.TypeOf(v).Elem().String() {
case "tensor.IValue":

Is this something that should be fixed, or am I using NewIValue incorrectly?
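For readers unfamiliar with the reflection involved, here is a small standalone sketch of how a function can detect that it was handed a slice of a particular struct type. It is not gotch's implementation: the IValue type below is a hypothetical stand-in, and the sketch compares reflect.Type values directly instead of matching on the type's string name as the patch above does.

```go
package main

import (
	"fmt"
	"reflect"
)

// IValue is a hypothetical stand-in for the library's IValue type.
type IValue struct {
	Name string
}

// describe reports whether v is a slice and, if so, whether its elements
// are IValue structs or something else.
func describe(v interface{}) string {
	t := reflect.TypeOf(v)
	if t == nil || t.Kind() != reflect.Slice {
		return "not a slice"
	}
	if t.Elem() == reflect.TypeOf(IValue{}) {
		return "slice of IValue"
	}
	return "slice of " + t.Elem().String()
}

func main() {
	fmt.Println(describe([]IValue{{Name: "a"}, {Name: "b"}}))
	fmt.Println(describe([]int64{1, 2, 3}))
	fmt.Println(describe(42))
}
```

Comparing types rather than names avoids depending on the package name that appears in the string form, which changes if the package is renamed.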

ImageNet.LoadImageAndResize224 crashes for some files.

This file causes an error:


fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x1 addr=0x25 pc=0x25]

runtime stack:
runtime.throw(0x130a881, 0x2a)
	/usr/local/go/src/runtime/panic.go:1116 +0x72
runtime.sigpanic()
	/usr/local/go/src/runtime/signal_unix.go:726 +0x269

goroutine 1 [syscall]:
runtime.cgocall(0x102e255, 0xc00061fab8, 0x13093bf)
	/usr/local/go/src/runtime/cgocall.go:133 +0x5b fp=0xc00061fa70 sp=0xc00061fa38 pc=0x461e7b
github.com/sugarme/gotch/libtch._Cfunc_atg_totype(0x7f23980136b0, 0x7f2398012950, 0x6)
	_cgo_gotypes.go:22434 +0x45 fp=0xc00061fab8 sp=0xc00061fa70 pc=0xe6b905
github.com/sugarme/gotch/libtch.AtgTotype.func1(0x7f23980136b0, 0x7f2398012950, 0xc000000006)
	/home/george/go/pkg/mod/github.com/sugarme/[email protected]/libtch/c-generated.go:6146 +0xac fp=0xc00061fb10 sp=0xc00061fab8 pc=0xee030c
github.com/sugarme/gotch/libtch.AtgTotype(0x7f23980136b0, 0x7f2398012950, 0xc000000006)
	/home/george/go/pkg/mod/github.com/sugarme/[email protected]/libtch/c-generated.go:6146 +0x4c fp=0xc00061fb60 sp=0xc00061fb10 pc=0xe958cc
github.com/sugarme/gotch/tensor.(*Tensor).Totype(0xc0000ae010, 0x13cea60, 0x1175080, 0x0, 0x0, 0x0, 0x0)
	/home/george/go/pkg/mod/github.com/sugarme/[email protected]/tensor/tensor-generated.go:17916 +0x106 fp=0xc00061fc00 sp=0xc00061fb60 pc=0xfedd46
github.com/sugarme/gotch/vision.(*ImageNet).Normalize(0xc00000f680, 0xc0000ae010, 0x0, 0x0, 0x0)
	/home/george/go/pkg/mod/github.com/sugarme/[email protected]/vision/imagenet.go:36 +0xfb fp=0xc00061fd08 sp=0xc00061fc00 pc=0x1006fdb
github.com/sugarme/gotch/vision.(*ImageNet).LoadImageAndResize224(0xc00000f680, 0x1315e01, 0x46, 0x0, 0x0, 0x0)
	/home/george/go/pkg/mod/github.com/sugarme/[email protected]/vision/imagenet.go:149 +0x24d fp=0xc00061fdd8 sp=0xc00061fd08 pc=0x10083ed
main.ProbsImage(0x1315e01, 0x46, 0xc000010208, 0x0, 0x0, 0x0)
	/home/george/Projects/unigui-go/examples/engine/main.go:229 +0x85 fp=0xc00061fe90 sp=0xc00061fdd8 pc=0x100c325
main.main()
	/home/george/Projects/unigui-go/examples/engine/main.go:194 +0x5e fp=0xc00061ff88 sp=0xc00061fe90 pc=0x100bf5e
runtime.main()
	/usr/local/go/src/runtime/proc.go:204 +0x1cf fp=0xc00061ffe0 sp=0xc00061ff88 pc=0x49df6f
runtime.goexit()
	/usr/local/go/src/runtime/asm_amd64.s:1374 +0x1 fp=0xc00061ffe8 sp=0xc00061ffe0 pc=0x4d4421

goroutine 6 [select]:
go.opencensus.io/stats/view.(*worker).start(0xc00007e7d0)
	/home/george/go/pkg/mod/[email protected]/stats/view/worker.go:154 +0x19e
created by go.opencensus.io/stats/view.init.0
	/home/george/go/pkg/mod/[email protected]/stats/view/worker.go:32 +0x5b

Best way to serialize tensor into bytes?

As title. This is useful for transferring tensor over network, which should be a common thing given golang's typical use cases.

Right now I'm using the following, but it involves some unnecessary file io.

gotch/ts/tensor.go

Lines 800 to 804 in 17f2c49

func (ts *Tensor) MustSave(path string) {
if err := ts.Save(path); err != nil {
log.Fatal(err)
}
}
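One way to avoid the file round trip, sketched here under stated assumptions, is to copy the tensor's values into a Go slice and pack the shape and values into a byte buffer yourself. The wire format below (dimension count, dimensions, then float32 values, all little-endian) is invented for this illustration and is not a gotch or PyTorch format; getting the values out of the tensor and rebuilding it on the other side is left to the library's own calls.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// encodeFloat32 packs a shape and flat values into bytes using a made-up
// layout: [ndim int64][dims int64...][values float32...], little-endian.
func encodeFloat32(shape []int64, vals []float32) ([]byte, error) {
	var buf bytes.Buffer
	if err := binary.Write(&buf, binary.LittleEndian, int64(len(shape))); err != nil {
		return nil, err
	}
	if err := binary.Write(&buf, binary.LittleEndian, shape); err != nil {
		return nil, err
	}
	if err := binary.Write(&buf, binary.LittleEndian, vals); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decodeFloat32 reverses encodeFloat32 and checks that the payload length
// matches the element count implied by the shape.
func decodeFloat32(data []byte) ([]int64, []float32, error) {
	r := bytes.NewReader(data)
	var ndim int64
	if err := binary.Read(r, binary.LittleEndian, &ndim); err != nil {
		return nil, nil, err
	}
	if ndim < 0 || ndim*8 > int64(r.Len()) {
		return nil, nil, fmt.Errorf("invalid dimension count %d", ndim)
	}
	shape := make([]int64, ndim)
	if err := binary.Read(r, binary.LittleEndian, shape); err != nil {
		return nil, nil, err
	}
	n := int64(1)
	for _, d := range shape {
		n *= d
	}
	if n < 0 || n*4 != int64(r.Len()) {
		return nil, nil, fmt.Errorf("payload does not match shape %v", shape)
	}
	vals := make([]float32, n)
	if err := binary.Read(r, binary.LittleEndian, vals); err != nil {
		return nil, nil, err
	}
	return shape, vals, nil
}

func main() {
	data, err := encodeFloat32([]int64{2, 3}, []float32{1, 2, 3, 4, 5, 6})
	if err != nil {
		panic(err)
	}
	shape, vals, err := decodeFloat32(data)
	fmt.Println(len(data), shape, vals, err)
}
```

The sketch handles only float32 data; a real protocol would also need to carry the dtype and decide how to treat other element types.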

[Question] Is it possible to load *.pkl model?

I trained a model and exported it as an export.pkl file (ResNet50). When I try to load it I get the error:
Libtorch API Error: [enforce fail at inline_container.cc:222] . file not found: archive/constants.pkl
Is it possible to load such a file?

NewRandnInit doesn't work as expected when mean=0.0

package main

import (
    "fmt"

    "github.com/sugarme/gotch" // needed for gotch.CPU below
    "github.com/sugarme/gotch/nn"
)

func main() {
	mat_3x5 := nn.NewRandnInit(0.0, 1.0).InitTensor([]int64{3, 5}, gotch.CPU)
	fmt.Printf("%v", mat_3x5)
}

output

   1.000     1.000     1.000     1.000     1.000
   1.000     1.000     1.000     1.000     1.000
   1.000     1.000     1.000     1.000     1.000

According to this line of code, when mean=0, all elements will be initialized with stdev

data[i] = float32(rand.NormFloat64()*r.mean + r.stdev)

This is different from PyTorch's randn function

""" Returns a tensor filled with random numbers from a normal distribution
with mean `0` and variance `1`"""

torch.randn(3, 5)

############### output ###########################
tensor([[ 0.1769, -2.0933, -0.8882,  0.0051,  0.9833],
        [-0.6342,  0.4093,  0.6266,  0.3935,  0.2045],
        [ 0.3055, -0.4522, -1.7044,  1.8426,  0.4553]])

I know MustRandn can solve this problem, but NewRandnInit as the default initialization method of Embedding will directly affect the initialization weights.

Is this a bug or is it designed this way for some reason?

install error on osx

Hi,

I have a small demo example I want to try but get the following error:

go run main.go
# github.com/sugarme/gotch/libtch
torch_api.cpp:1:9: fatal error: 'torch/csrc/autograd/engine.h' file not found

Then I used https://download.pytorch.org/libtorch/cpu/libtorch-macos-1.11.0.zip

and get this error:

go run main.go                                                                                                       

# github.com/sugarme/gotch/libtch
/Users/arnold1/go/pkg/mod/github.com/sugarme/[email protected]/libtch/tensor.go:100:16: cannot use c_ptr (variable of type *_Ctype_long) as type *_Ctype_longlong in variable declaration
/Users/arnold1/go/pkg/mod/github.com/sugarme/[email protected]/libtch/tensor.go:142:50: cannot use cindexes (variable of type *_Ctype_long) as type *_Ctype_longlong in variable declaration
/Users/arnold1/go/pkg/mod/github.com/sugarme/[email protected]/libtch/tensor.go:151:49: cannot use cindexes (variable of type *_Ctype_long) as type *_Ctype_longlong in variable declaration

Suggestion: Use Try suffix instead of Must prefix

Most users will likely use the "Must" versions of the API. Having to type "Must" for everything is thus unfortunate. It would be better to have the plain methods be of the "Must" variety, and then have the error-returning ones have a suffix like Try. This is the convention I adopted after extensive consideration of these issues :)

gotch.CUDA.IsAvailable() returns false but CUDA is in fact available

1. nvcc -V shows cuda version is 10.2

(base) root@7c8a9f63a67f:/dev/shm/gotorch_projects/test1# nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:24:38_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89

2. export CUDA_VER=10.2 && bash setup-libtorch.sh is successful.

3. The environment is OK. The following lines have been added to the .bashrc file (/usr/lib64-nvidia is not found)

export GOTCH_LIBTORCH="/usr/local/lib/libtorch"
export LIBRARY_PATH="$LIBRARY_PATH:$GOTCH_LIBTORCH/lib"
export CPATH="$CPATH:$GOTCH_LIBTORCH/lib:$GOTCH_LIBTORCH/include:$GOTCH_LIBTORCH/include/torch/csrc/api/include"
LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$GOTCH_LIBTORCH/lib:/usr/local/cuda-10.2/lib64"

4. export CUDA_VER=10.2 && export GOTCH_VER=v0.7.0 && bash setup-gotch.sh is successful.

5. my example

package main

import (
	"fmt"

	"github.com/sugarme/gotch"
)

func main() {
	if gotch.CUDA.IsAvailable() {
		fmt.Println("CUDA is available")
	} else {
		fmt.Println("No Find")
	}
}

The result shows "No Find", because gotch.CUDA.IsAvailable() returns false.
In addition, I can train a neural network on the GPU with gotorch (https://github.com/wangkuiyi/gotorch) or PyTorch. So CUDA is in fact available!

Call `ts.Drop()` `ts.MustDrop()` multiple times

Make ts.Drop() safe to call multiple times.

func (ts Tensor) Drop() (err error) {

	if !ts.MustDefined() {
		return nil
	}

	lib.AtFree(ts.ctensor)
	if err = TorchErr(); err != nil {
		return err
	}

	// NOTE. assign to a new undefined tensor, then check `ts.MustDefined`
	// before deleting at C land. Hence `Drop` method can be called
	// multiple times without worrying about double C memory delete panic.
	// Other pattern is `defer ts.MustDrop()` whenever a tensor is created.
	ts = NewTensor()

	return nil
}
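The idea in the snippet above is to reset the tensor after freeing it so that later calls become no-ops. In Go, a method with a value receiver works on a copy, so an assignment to the receiver is not visible to the caller. A minimal sketch of an idempotent release, using a pointer receiver and a hypothetical resource type (not gotch's Tensor), might look like this:

```go
package main

import "fmt"

// resource is a hypothetical stand-in for a value that owns C memory.
type resource struct {
	ptr   *int // stands in for the C pointer; nil once released
	freed int  // counts real frees, only to make the behaviour observable
}

func newResource() *resource {
	v := 0
	return &resource{ptr: &v}
}

// Drop releases the underlying memory at most once. The pointer receiver
// lets it clear r.ptr on the caller's value, so repeated calls do nothing.
func (r *resource) Drop() error {
	if r == nil || r.ptr == nil {
		return nil // already released
	}
	r.freed++ // placeholder for the real C-side free
	r.ptr = nil
	return nil
}

func main() {
	r := newResource()
	_ = r.Drop()
	_ = r.Drop() // second call is a no-op
	fmt.Println("frees performed:", r.freed)
}
```

The same guard works whether callers use defer or call Drop explicitly, as long as every copy of the handle goes through the same pointer.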

How to pass Dict[str, str] parameters to the JIT model instead of Dict[Any, Any]

gotch version: 0.4.5

pytorch version: 1.9.0

libtorch version: 1.9.0

I use the code below to generate a model that accepts Dict[str, str] as input parameters

import torch
from torch import nn
from typing import *

class TestModel(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x:Dict[str, str]):
        print(x["name"])

torch.jit.save(torch.jit.script(TestModel()), '/tmp/test.pt')

I call this model through the code below

package main

import (
	"fmt"

	ts "github.com/sugarme/gotch/tensor"
)

func main() {
    x := []ts.IValue{*ts.NewIValue(map[string]string{ "name": "gotch"})}
	
    test_model, _ := ts.ModuleLoad("/tmp/test.pt")
    _, err := test_model.ForwardIs(x)
    if err != nil {
        fmt.Println(err)
    }
}

Error thrown

Libtorch API Error: forward() Expected a value of type 'Dict[str, str]' for argument 'x' but instead found type 'Dict[Any, Any]'.

If I declare the parameter type as Dict[Any, Any], the above problem can be solved, but the Any type has many restrictions, and I cannot use it normally on PyTorch, such as accessing dictionary value by index

import torch
from torch import nn
from typing import *

class TestModel(nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x:Dict[Any, Any]):
        print(x["name"])

torch.jit.save(torch.jit.script(TestModel()), '/tmp/test.pt')

The above code fails to generate the jit model and throws the following exception

RuntimeError: 
Arguments for call are not valid.
The following variants are available:
  
  aten::__getitem__.t(t[](a) list, int idx) -> (t(*)):
  Could not match type Dict[Any, Any] to List[t] in argument 'list': Cannot match List[t] to Dict[Any, Any].
  
  aten::__getitem__.str(str s, int index) -> (str):
  Expected a value of type 'str' for argument 's' but instead found type 'Dict[Any, Any]'.
  
  aten::__getitem__.Dict_str(Dict(str, t) self, str key) -> (t(*)):
  Expected a value of type 'Dict[str, t]' for argument 'self' but instead found type 'Dict[Any, Any]'.
  
  aten::__getitem__.Dict_int(Dict(int, t) self, int key) -> (t(*)):
  Expected a value of type 'Dict[int, t]' for argument 'self' but instead found type 'Dict[Any, Any]'.
  
  aten::__getitem__.Dict_bool(Dict(bool, t) self, bool key) -> (t(*)):
  Expected a value of type 'Dict[bool, t]' for argument 'self' but instead found type 'Dict[Any, Any]'.
  
  aten::__getitem__.Dict_float(Dict(float, t) self, float key) -> (t(*)):
  Expected a value of type 'Dict[float, t]' for argument 'self' but instead found type 'Dict[Any, Any]'.
  
  aten::__getitem__.Dict_complex(Dict(complex, t) self, complex key) -> (t(*)):
  Expected a value of type 'Dict[complex, t]' for argument 'self' but instead found type 'Dict[Any, Any]'.
  
  aten::__getitem__.Dict_Tensor(Dict(Tensor, t) self, Tensor key) -> (t(*)):
  Expected a value of type 'Dict[Tensor, t]' for argument 'self' but instead found type 'Dict[Any, Any]'.

The original call is:
  File "<ipython-input-136-12d467a7bc62>", line 10
    def forward(self, x:Dict[Any, Any]):
        print(x["name"])
              ~~~~ <--- HERE

More generally, how to pass a Dict of specified k-v type, such as Dict[str, List[str]], Dict[str, List[float64]], Dict[str, int64] instead of GenericDict, does LibTorch 1.9.0 support this now?

Any help would be appreciated

Can't find Python script to save model to .npz

Hi, in issue #50 you said: "Convert Python model to numpy (.npz) then gotch provides ts.ReadNpz method to read .npz file. See example/convert-model." Then, in https://github.com/sugarme/gotch/blob/master/example/convert-model/main.go there is this comment:
// NOTE. Python script to save model to .npz can be found at https://github.com/sugarme/pytorch-pretrained/bert/bert-base-uncased-to-npz.py
But https://github.com/sugarme/pytorch-pretrained/bert/bert-base-uncased-to-npz.py returns "page not found".
Please help me.

Question

Hi,

Do you also support PyTorch 2.0? If not, where can I find PyTorch 2.0 for Go?

Thanks

example doesn't run on ubuntu

That error occurs because your Dockerfile didn't effectively update the CGO flags in the gotch package at $GOPATH/pkg/mod/github.com/sugarme/gotch@$GOTCH_VERSION/libtch/lib.go.

In your Dockerfile you should do these in order:

  1. Install compiler dependencies
  2. Install Go and set up the GOPATH environment
  3. Install libtorch
  4. Install gotch

Have a look at the Setup gotch CPU shell script for more detail (it needs to read the GOPATH env to know where the gotch package is located and to update its lib.go file with the CGO flags before your Go example can compile and run).

@sugarme I still have some issues with the demo example. What do I need to install for the compiler dependencies?

Where do I find which CGO flags I need to set? I see there is something in https://github.com/sugarme/gotch/blob/master/setup-gotch.sh, but who sets SRCDIR in that script?
Also, GOPATH in setup-gotch.sh is set automatically if it is not defined...

Will go build for my app need the same flags as the lib?

Do you have an example somewhere in this repo that shows how to compile main.go?

Here is my updated Dockerfile:

FROM ubuntu:22.04

ENV DEBIAN_FRONTEND noninteractive

# Install dependencies
RUN apt-get update && apt-get install -y --no-install-recommends build-essential ca-certificates cmake curl unzip nano wget g++

WORKDIR /home/developer
ENV HOME /home/developer
ENV GOPATH "$HOME/go"

# Install golang
RUN wget -c https://go.dev/dl/go1.19.5.linux-amd64.tar.gz \
&& rm -rf /usr/local/go && tar -C /usr/local -xzf go1.19.5.linux-amd64.tar.gz \
&& rm go1.19.5.linux-amd64.tar.gz
ENV PATH=$PATH:/usr/local/go/bin

# Install Libtorch - CPU
RUN wget https://raw.githubusercontent.com/sugarme/gotch/master/setup-libtorch.sh
RUN chmod +x setup-libtorch.sh
RUN sed -i 's/sudo//g' setup-libtorch.sh
ENV CUDA_VER=cpu
RUN bash setup-libtorch.sh

ENV GOTCH_LIBTORCH="/usr/local/lib/libtorch"
ENV LIBRARY_PATH="$LIBRARY_PATH:$GOTCH_LIBTORCH/lib"
ENV CPATH="$CPATH:$GOTCH_LIBTORCH/lib:$GOTCH_LIBTORCH/include:$GOTCH_LIBTORCH/include/torch/csrc/api/include"
ENV LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$GOTCH_LIBTORCH/lib"

# Setup gotch
RUN wget https://raw.githubusercontent.com/sugarme/gotch/master/setup-gotch.sh
RUN chmod +x setup-gotch.sh
RUN sed -i 's/sudo//g' setup-gotch.sh
ENV CUDA_VER=cpu
ENV GOTCH_VER=v0.7.0
RUN bash setup-gotch.sh

COPY main.go /home/developer/pytorch_demo/main.go

WORKDIR /home/developer/pytorch_demo
RUN go mod init main
RUN go mod tidy
#RUN go build main.go

ENTRYPOINT bash

my env variables:

$ printenv
HOSTNAME=17dbfb11a10c
PWD=/home/developer/pytorch_demo
HOME=/home/developer
CUDA_VER=cpu
TERM=xterm
LIBRARY_PATH=:/usr/local/lib/libtorch/lib
SHLVL=1
GOTCH_LIBTORCH=/usr/local/lib/libtorch
LD_LIBRARY_PATH=:/usr/local/lib/libtorch/lib
GOTCH_VER=v0.7.0
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/go/bin
CPATH=:/usr/local/lib/libtorch/lib:/usr/local/lib/libtorch/include:/usr/local/lib/libtorch/include/torch/csrc/api/include
DEBIAN_FRONTEND=noninteractive
GOPATH=/home/developer/go
_=/usr/bin/printenv

go build main.go errors:

go build main.go
# command-line-arguments
/usr/local/go/pkg/tool/linux_amd64/link: running g++ failed: exit status 1
/usr/bin/ld: cannot find -lcuda: No such file or directory
/usr/bin/ld: cannot find -lcudart: No such file or directory
/usr/bin/ld: cannot find -lcublas: No such file or directory
/usr/bin/ld: cannot find -lcudnn: No such file or directory
/usr/bin/ld: cannot find -lcaffe2_nvrtc: No such file or directory
/usr/bin/ld: cannot find -lnvrtc-builtins: No such file or directory
/usr/bin/ld: cannot find -lnvrtc: No such file or directory
/usr/bin/ld: cannot find -lnvToolsExt: No such file or directory
/usr/bin/ld: cannot find -lc10_cuda: No such file or directory
/usr/bin/ld: cannot find -ltorch_cuda: No such file or directory
collect2: error: ld returned 1 exit status

Memory leaks

I'm getting memory leaks even after calling MustDrop() on tensors.

This is my code:

package gopt

import (
	"log"
	"net/http"
	"errors"
	"os"

	"github.com/sugarme/gotch"
	"github.com/sugarme/gotch/ts"
	"github.com/sugarme/gotch/vision"
)

func GetFileContentType(path string) (string, error) {
	// Open File
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()

	// Only the first 512 bytes are used to sniff the content type.
	buffer := make([]byte, 512)

	_, err = f.Read(buffer)
	if err != nil {
		return "", err
	}

	// Use the net/http package's handy DetectContentType function. Always returns a valid
	// content-type by returning "application/octet-stream" if no others seemed to match.
	contentType := http.DetectContentType(buffer)

	return contentType, nil
}

type GoPt struct {
	Model  *ts.CModule
	Labels []string
	Imnet  *vision.ImageNet
}

func (gopt *GoPt) LoadModel(path string) {
	// Create ImageNet object to use for input resizing
	gopt.Imnet = vision.NewImageNet()

	model, err := ts.ModuleLoadOnDevice(path, gotch.CPU)
	if err != nil {
		log.Fatal(err)
	}
	gopt.Model = model
}

func (gopt *GoPt) Predict(path string) (string, error) {
	// Check the file is an image first
	contentType, err := GetFileContentType(path)
	if err != nil {
		return "", err
	}

	if contentType != "image/jpeg" {
		return "", errors.New("must be an image file")
	}

	// Load the image file and resize it
	image, err := gopt.Imnet.LoadImageAndResize224(path)
	if err != nil {
		return "", err
	}

	// Apply the forward pass of the model to get the logits.
	unsqueezed := image.MustUnsqueeze(int64(0), false)
	image.MustDrop()
	raw_output := unsqueezed.ApplyCModule(gopt.Model)
	unsqueezed.MustDrop()
	output := raw_output.MustSoftmax(-1, gotch.Float, true)
	raw_output.MustDrop()

	// Convert to list of floats to represent label probabilities
	probs := output.Vals().([]float32)
	output.MustDrop()

	// Find the index of the highest probability.
	maxVal := probs[0]
	maxIndex := 0
	for i, v := range probs {
		if v > maxVal {
			maxVal = v
			maxIndex = i
		}
	}

	return gopt.Labels[maxIndex], nil
}

Whenever I call Predict in a loop, the memory usage is blowing up.

Here's the code calling it:

package main

import (
	"flag"
	"fmt"
	"io/ioutil"
	"log"
	"time"

	"github.com/jbloxsome/gopt/gopt"
)

var (
	modelPath string
)

func init() {
	flag.StringVar(&modelPath, "modelpath", "./model.pt", "full path to exported pytorch model.")
}

func main() {
	flag.Parse()

	gopt := gopt.GoPt{
		Labels: []string{
			"false",
			"true",
		},
	}

	gopt.LoadModel(modelPath)

	root := "/test/false"

	files, err := ioutil.ReadDir(root)
	if err != nil {
		log.Fatal(err)
	}

	positive := 0
	negative := 0

	for i, file := range files {
		fmt.Println(i, file.Name())

		pred, err := gopt.Predict(root + "/" + file.Name())
		if err != nil {
			fmt.Println(err)
		}
		if pred == "true" {
			positive = positive + 1
		}
		if pred == "false" {
			negative = negative + 1
		}

		fmt.Println(pred)
	}

	fmt.Println("positives", positive)
	fmt.Println("negatives", negative)
	fmt.Println("total", positive+negative)
	percentPositive := 100 * positive / (positive + negative)
	fmt.Println("positive (%)", percentPositive)
	time.Sleep(time.Second * 60)
}

Add more optimizer functions

  • Add more optimizer functions e.g. AdamW corresponding to C++ functions in libtch/torch_api.h.
  • Implement lib.AtoAddParameters so that nn/varstore can use it instead of currently using lib.AtoAddParametersOld

Install issue on Amazon Cloud

[ec2-user@ip-172-31-1-42 ~]$ export CUDA_VER=cpu && bash setup-libtorch.sh
https://download.pytorch.org/libtorch/cucpu/libtorch-cxx11-abi-shared-with-deps-1.7.0%2Bcucpu.zip
wget: unrecognized option '--show-progress'
Usage: wget [OPTION]... [URL]...

Try `wget --help' for more options.
unzip: cannot find or open /tmp/libtorch-cxx11-abi-shared-with-deps-1.7.0%2Bcucpu.zip, /tmp/libtorch-cxx11-abi-shared-with-deps-1.7.0%2Bcucpu.zip.zip or /tmp/libtorch-cxx11-abi-shared-with-deps-1.7.0%2Bcucpu.zip.ZIP.
rm: cannot remove ‘/tmp/libtorch-cxx11-abi-shared-with-deps-1.7.0%2Bcucpu.zip’: No such file or directory

I wanted to install the libs because terrible issues forced me to do it.

  1. I cannot use the dynamic torch libraries on Amazon with a compiled gotch program (go build command). There seems to be no way to copy the .so torch libraries somewhere and force the program to use them.
  2. I tried to build a static version to get rid of the shared libraries, but that was impossible as well. ld refuses to link, although I gave it every option I could think of:
examples/engine$ go build -ldflags "-linkmode external -extldflags -static" *.go
# command-line-arguments
/usr/local/go/pkg/tool/linux_amd64/link: running g++ failed: exit status 1
/usr/bin/ld: cannot find -ltorch
/usr/bin/ld: cannot find -lc10
/usr/bin/ld: cannot find -ltorch_cpu
collect2: error: ld returned 1 exit status

The paths were configured as recommended; the problems arise only when running the program on Amazon and when building statically.

ForwardIs may crash when the forward function of a saved model has more than 3 output tensors

I used the ForwardIs function to get my model's forward results, calling it in a loop.
It works well when the forward function of my model has 3 or fewer outputs.
But the goroutine crashes when the forward function of the saved model has more than 3 output tensors.
The forward function looks like this; I return the same tensor several times to control the number of outputs.

def forward(self, inputs):
        value, actor_features = self.base(inputs)
        dist = self.dist(actor_features)

        action = dist.sample()
        action_prob = dist.probs

        action_log_probs = dist.log_probs(action)
        dist_entropy = dist.entropy().mean()

        #return value, action, action_log_probs, action_prob
        #return value, action, action_prob
        #return action_prob, action_prob, action_prob
        return action_prob, action_prob, action_prob, action_prob
go crash log:
[0.3334044710101944 0.3333876191996451 0.33320790979016046]
[0.3334044710101944 0.3333876191996451 0.33320790979016046]
traj step
process framestate
agent step 6
<nil>
[0.3334044710101944 0.3333876191996451 0.33320790979016046]
[0.3334044710101944 0.3333876191996451 0.33320790979016046]
[0.3334044710101944 0.3333876191996451 0.33320790979016046]
[0.3334044710101944 0.3333876191996451 0.33320790979016046]
traj step
process framestate
agent step 7
fatal error: unexpected signal during runtime execution
[signal SIGSEGV: segmentation violation code=0x80 addr=0x0 pc=0x7f50eed548df]

runtime stack:
runtime.throw({0xc5cac1, 0x2})
	/usr/lib/go-1.17/src/runtime/panic.go:1198 +0x71
runtime.sigpanic()
	/usr/lib/go-1.17/src/runtime/signal_unix.go:719 +0x396

goroutine 19 [syscall]:
runtime.cgocall(0xa5a0a0, 0xc0000b5ca0)
	/usr/lib/go-1.17/src/runtime/cgocall.go:156 +0x5c fp=0xc0000b5c78 sp=0xc0000b5c40 pc=0x4b789c
github.com/sugarme/gotch/libtch._Cfunc_atm_forward_(0x2793ae0, 0x7f50b8006240, 0x1)
	_cgo_gotypes.go:32259 +0x4d fp=0xc0000b5ca0 sp=0xc0000b5c78 pc=0x89776d
github.com/sugarme/gotch/libtch.AtmForward_.func1(0x8, 0xc00018d110, 0x0)
	/home/ubuntu/go/pkg/mod/github.com/sugarme/[email protected]/libtch/tensor.go:829 +0x71 fp=0xc0000b5ce8 sp=0xc0000b5ca0 pc=0x9256f1
github.com/sugarme/gotch/libtch.AtmForward_(0x7f50b8006240, 0x0, 0x1)
	/home/ubuntu/go/pkg/mod/github.com/sugarme/[email protected]/libtch/tensor.go:829 +0x25 fp=0xc0000b5d10 sp=0xc0000b5ce8 pc=0x925645
github.com/sugarme/gotch/ts.(*CModule).ForwardIs(0xc0000b0270, {0xc0000b5eb0, 0x1, 0x2})
	/home/ubuntu/go/pkg/mod/github.com/sugarme/[email protected]/ts/jit.go:1115 +0x34c fp=0xc0000b5e38 sp=0xc0000b5d10 pc=0x97416c
ai_service/predict.ModelPredict({0xc0000a86a8, 0xc0000b0008, 0xc0000b5f78})
	/home/ubuntu/pom/rl_training/cpu_code/ai_service/src/predict/predict_util.go:33 +0xfa fp=0xc0000b5ef0 sp=0xc0000b5e38 pc=0xa4859a
ai_service/predict.(*BotPredictor).ProcessFrameState(0xc0000a8690, 0xc0000902a0, 0xc000090300)

go code:

package main

import (
	"fmt"

	"github.com/sugarme/gotch/ts"
)

var model *ts.CModule

func ModelManager() {
	var err error
	if model, err = ts.ModuleLoad("../model/epoch_2500.pt"); err != nil {
		fmt.Println(err)
	}
	model.SetEval()
	fmt.Println(model)
}

func ModelPredict() {
	obsVec_ := []float64{0.18, 0.32}
	inputTensor, _ := ts.NewTensorFromData(obsVec_, []int64{1, 2})
	inputIVal := ts.NewIValue(*inputTensor)
	if m, err := model.ForwardIs([]ts.IValue{*inputIVal}); err == nil {
		for _, outTensor := range m.Value().([]ts.Tensor) {
			fmt.Println(outTensor.Vals())
		}
	}
}

func main() {
	ModelManager()
	for i := 0; i < 100; i++ {
		ModelPredict()
	}
}

pytorch 1.8 support?

I see you started on pytorch 1.7 support, but didn't complete it. What's involved in getting to pytorch 1.8?

My current use case is simpler than your ambition. I just have a model (and more once I solve this) I want to use golang to serve. I need to load the jit module and get arguments to it and values back. I think golang is a better api serving solution than cpp :-)

How to register our custom operator from a shared library like C++

I customized some operators (via C++) and called them in the JIT model.

from torch.utils.cpp_extension import load_inline
import torch
from typing import *

op_source = """
int64_t sum_loop(int64_t n) 
{
    int64_t sum = 0;
    for (int64_t i = 0; i < n; i++){
        sum+=1;
    }
    return sum;
}


TORCH_LIBRARY(my_ops, m) {
  m.def("sum_loop_cpp", &sum_loop);
}
"""
load_inline(name="my_ops", cpp_sources=op_source, is_python_module=False, verbose=True)

@torch.jit.script
def func(x:int) -> int:
    return torch.ops.my_ops.sum_loop_cpp(x)

func.save('/tmp/func.pt')

jit_func = torch.jit.load('/tmp/func.pt')
jit_func(100000) # 100000

When I load the JIT model through gotch (ts.ModuleLoad),

package main

import (
	"fmt"

	ts "github.com/sugarme/gotch/tensor"
)

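func main() {
    a := []ts.IValue{*ts.NewIValue(int64(100000))}
    // "func" is a reserved word in Go, so the loaded module needs another name.
    module, err := ts.ModuleLoad("/tmp/func.pt")
    if err != nil {
        fmt.Println(err)
        return
    }
    out, err := module.ForwardIs(a)
    if err != nil {
        fmt.Println(err)
    } else {
        fmt.Println(out)
    }
}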

I think that because I did not register my custom op at runtime, I get an exception:

Libtorch API Error:
Unknown builtin op: my_ops::sum_loop_cpp.
Could not find any similar ops to my_ops::sum_loop_cpp. This op may not exist or may not be currently supported in TorchScript.
:
  File "<ipython-input-1-ad0082b9b01e>", line 24
@torch.jit.script
def func(x:int) -> int:
    return torch.ops.my_ops.sum_loop_cpp(x)
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
Serialized   File "code/__torch__.py", line 7
  def forward(self: __torch__.PlaceholderModule,
    x: int) -> int:
    return ops.my_ops.sum_loop_cpp(x)
           ~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE

How can I register the custom ops correctly, similar to the C++ case introduced in this document:
https://pytorch.org/tutorials/advanced/torch_script_custom_ops.html#using-the-torchscript-custom-operator-in-c

error in setup-libtorch.sh

#!/bin/bash

LIBTORCH_VERSION="${LIBTORCH_VER:-1.11.0}"
CUDA_VERSION="${CUDA_VER:-11.3}"
echo ${CUDA_VERSION}
if [ "${CUDA_VERSION}"=="cpu" ]; then
  CU_VERSION="cpu"
else
  CU_VERSION="cu${CUDA_VERSION//./}"
fi
echo ${CU_VERSION}

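No matter what CUDA_VER is set to, CU_VERSION is always cpu.
The if test is wrong: without spaces around ==, the [ command sees a single non-empty string, which always evaluates to true. It should be if [ "${CUDA_VERSION}" == "cpu" ]; then. In other words, two spaces are missing.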
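
For illustration, the branch the script is meant to take can be restated as a minimal Go sketch. The function name cuSuffix is hypothetical and is not part of gotch or of setup-libtorch.sh; it only mirrors the intended mapping from CUDA_VER to CU_VERSION.

```go
package main

import (
	"fmt"
	"strings"
)

// cuSuffix mirrors the intended if/else of setup-libtorch.sh.
// Hypothetical helper, for illustration only.
func cuSuffix(cudaVer string) string {
	if cudaVer == "cpu" {
		return "cpu"
	}
	// Same effect as the shell expansion ${CUDA_VERSION//./}: drop the dots.
	return "cu" + strings.ReplaceAll(cudaVer, ".", "")
}

func main() {
	for _, v := range []string{"cpu", "11.3", "10.2"} {
		fmt.Printf("%s -> %s\n", v, cuSuffix(v))
	}
}
```

With the comparison written correctly, only the literal value cpu selects the CPU build; every other value is turned into a cuNNN suffix.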

Model works on python but not on GO.

Sorry again for being persistent, but I would really appreciate the help.

I managed to trace the model (scripting just doesn't work in either Python or Go).

It works in Python (poorly, but I only trained it for about 2 hours, so that's okay):
Screenshot from 2021-07-22 16-10-13

But in Go, loading the same traced file, I get this:
Screenshot from 2021-07-22 16-10-29

It looks like some of the final layers just don't work, or something similar.

Here is my GO code for loading:

func main() {



	device := gotch.CudaIfAvailable()

	image, err := vision.Load("/path/to/sample.jpg")
	if err != nil {
		log.Fatal(err)
	}

	imageTs, err := vision.Resize(image, 512, 512)
	if err != nil {
		log.Fatal(err)
	}

	usimage := imageTs.MustUnsqueeze(0, true)

	Img := usimage.MustTotype(gotch.Float, true)

	Img = Img.MustTo(device, true)

	model, err := ts.ModuleLoad("/path/to/traced_model.pt")
	if err != nil {
		log.Fatal(err)
	}

	output := Img.ApplyCModule(model)

	imag := output.MustDetach(false)
	result := imag.MustTo(gotch.CPU, true)

	// result = result.MustUnsqueeze(int64(0), false)

	saveFile := fmt.Sprintf("/path/to/res_%v", "sample.jpg")
	err = vision.Save(result, saveFile)
	if err != nil {
		log.Fatal(err)
	}

	// image := output.MustUnsqueeze(float64(0), false)

	fmt.Printf("done")

	// fmt.Printf("%20.20f\n", output)

}

Python code to load traced model:

cuda = torch.cuda.is_available()
if cuda:
    print('Cuda is available!')
    cudnn.benchmark = True
netG = torch.jit.load('/path/to/traced_model.pt')
if cuda:
    netG = netG.cuda()
for param in netG.parameters():
    param.requires_grad = False
netG.eval()
imgs = Image.open('/path/to/sample.jpg').resize((512,512))
img = transforms.ToTensor()(imgs).unsqueeze_(0).cuda()
g_images = netG(img)
save_image(g_images, '/path/to/res_sample.jpg')

Python code to trace the model:

cuda = torch.cuda.is_available()
if cuda:
    print('Cuda is available!')
    cudnn.benchmark = True
netG = STRnet2(3)
netG.load_state_dict(torch.load('/path/to/trained_state_dict/STE_2.pth'))
if cuda:
    netG = netG.cuda()
for param in netG.parameters():
    param.requires_grad = False
netG.eval()



img = torch.rand(1,3,512,512).cuda()
traced_script_module = torch.jit.trace(netG, img)
traced_script_module.save("/path/to/traced_model.pt")

Libtorch_cpu.so MKL linking error

I have created a Dockerfile to deploy gotch and run the simplified convolutional neural network in a container, but I ran into a linking error:

RUN apt-get -y update
RUN apt-get -y install apt-utils
ENV DEBIAN_FRONTEND=noninteractive
RUN ln -fs /usr/share/zoneinfo/Europe/Moscow /etc/localtime
RUN apt-get install -y tzdata
RUN dpkg-reconfigure --frontend noninteractive tzdata
RUN apt-get -y install wget sudo unzip git intel-mkl-full cmake golang
ENV LIBTORCH_PATH="/usr/local/lib"
ENV GOTCH_LIBTORCH="$LIBTORCH_PATH/libtorch"
ENV LIBRARY_PATH="$LIBRARY_PATH:$GOTCH_LIBTORCH/lib:/usr/lib/x86_64-linux-gnu/mkl"
ENV CPATH="$CPATH:$GOTCH_LIBTORCH/lib:$GOTCH_LIBTORCH/include:$GOTCH_LIBTORCH/include/torch/csrc/api/include:/usr/include"
ENV GOPATH="/root/go"
#
# THE CODE BELOW DO NOT WORK BECAUSE OF SOME ERRORS IN setup-libtorch.sh
# RUN wget https://raw.githubusercontent.com/sugarme/gotch/master/setup-libtorch.sh \
#    && chmod +x setup-libtorch.sh \
#    && export CUDA_VER=cpu && bash setup-libtorch.sh
#
# This code works
RUN export LIBTORCH_ZIP="libtorch-cxx11-abi-shared-with-deps-1.10.0%2Bcpu.zip" \
    && wget  -q --show-progress --progress=bar:force:noscroll  -O "/tmp/$LIBTORCH_ZIP" "https://download.pytorch.org/libtorch/cpu/$LIBTORCH_ZIP" \
    && unzip "/tmp/$LIBTORCH_ZIP" -d $LIBTORCH_PATH \
    && rm "/tmp/$LIBTORCH_ZIP" \
    && ldconfig
#
# Option 1. DO NOT WORK
# main.go:5:2: no Go files in /root/go/src/github.com/sugarme/gotch
#
#RUN wget https://raw.githubusercontent.com/sugarme/gotch/master/setup-gotch.sh
#RUN chmod +x setup-gotch.sh
#RUN export CUDA_VER=cpu && export GOTCH_VER=v0.5.0 && bash setup-gotch.sh
#
# Option 2. DO NOT WORK
# main.go:5:2: no Go files in /root/go/src/github.com/sugarme/gotch
#
#RUN export GOTCH_TEST_DIR="/tmp/gotch-test" \
#    && mkdir $GOTCH_TEST_DIR \
#    && cd $GOTCH_TEST_DIR \
#    && go mod init "github.com/sugarme/gotch-test" \
#    && go get -d "github.com/sugarme/[email protected]" \
#    && rm -rf $GOTCH_TEST_DIR
#
# Option 3. Starts building but returns linking errors after a while:
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_dft_compute_backward_zd_out_par'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_dft_compute_backward_d_par'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_lapack_dsbrdbn'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_lapack_zlaeh2'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_dft_compute_forward_d_par'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_dft_compute_backward_s_par'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_dft_compute_backward_zd_par'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_trans_mkl_zimatcopy_square_t_par'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_dft_compute_forward_sc_out_par'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_lapack_claeh2'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_lapack_slaeh2'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_dft_compute_backward_d_out_par'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_lapack_dlaed2'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_dft_compute_backward_c_out_par'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_lapack_zhbrdbn'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_dft_compute_forward_dz_par'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_dft_compute_forward_c_out_par'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_dft_compute_forward_s_par'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_dft_compute_backward_s_out_par'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_dft_compute_backward_cs_out_par'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_trans_mkl_comatcopy2_par'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_lapack_dlaeh2'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_dft_compute_backward_z_out_par'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_dft_compute_forward_sc_par'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_dft_compute_backward_cs_par'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_dft_compute_forward_z_par'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_lapack_dlaed3'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_dft_compute_backward_z_par'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_lapack_slaed3'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_trans_mkl_cimatcopy_square_t_par'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_dft_compute_forward_z_out_par'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_dft_compute_forward_c_par'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_lapack_ssbrdbn'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_trans_mkl_zomatcopy2_par'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_dft_compute_forward_dz_out_par'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_dft_compute_forward_s_out_par'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_dft_compute_forward_d_out_par'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_dft_compute_backward_c_par'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_lapack_chbrdbn'
#/usr/bin/ld: /usr/local/lib/libtorch/lib/../lib/libtorch_cpu.so: undefined reference to `mkl_lapack_slaed2'
#
RUN go get -d "github.com/sugarme/gotch"
COPY lib.go.txt "$GOPATH/pkg/mod/github.com/sugarme/[email protected]/libtch/lib.go"
COPY lib.go.txt "$GOPATH/src/github.com/sugarme/gotch/libtch/lib.go"
RUN ldconfig
COPY test.go.txt /home/main.go
WORKDIR /home
RUN go build main.go
ENTRYPOINT bash

test.go.txt
lib.go.txt

Are there any considerations on how to avoid these errors?

CGO flags

Problem:
At the moment, gotch uses a precompiled Pytorch C++ API at a fixed location, /opt/libtorch/.

The general idea is that when a user installs gotch with go get -u github.com/sugarme/gotch and/or go install ./..., a trigger checks environment variables such as LD_LIBRARY_PATH and LIBTORCH_PATH. If they are not set, a default location is used to download and cache Libtorch. The CGO flags should then point at the Libtorch path so that gotch compiles successfully.

Solution:

  • Create CGO flags and environment variables to select a flexible location to install libtorch
  • Maybe, create a function trigger in init() to auto-download a Libtorch version (depending on OS and GPU support) and cache it. For example in $HOME/.cache/gotch/libtorch.
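
The lookup order proposed here can be sketched in a few lines of Go. This is only an illustration of the idea, not gotch's actual behavior: the function name resolveLibtorchDir is hypothetical, LIBTORCH_PATH is the variable name floated in this issue, and the fallback is the cache directory suggested in the bullet above.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// resolveLibtorchDir picks the Libtorch directory: an explicit setting wins,
// otherwise a per-user cache directory under home is used.
// Hypothetical helper; neither the name nor the layout is confirmed by gotch.
func resolveLibtorchDir(envValue, home string) string {
	if envValue != "" {
		return envValue
	}
	return filepath.Join(home, ".cache", "gotch", "libtorch")
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Println("cannot determine home directory:", err)
		return
	}
	dir := resolveLibtorchDir(os.Getenv("LIBTORCH_PATH"), home)
	fmt.Println("libtorch directory:", dir)
}
```

Keeping the decision in a pure function makes it easy to test; downloading and caching Libtorch, and wiring the result into the CGO flags, are deliberately left out of the sketch.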

Failed to install gotch on MacOS

I encountered problems when I was trying to install gotch on MacOS

I've installed libtorch at /usr/local/lib/libtorch and executed setup-gotch.sh after setting environment variables. Here is the result:

GOPATH:'/usr/local/Cellar/go/1.15.6'
GOTCH_VERSION: 'v0.3.8'
CUDA_VERSION: '10.1'
go: creating new go.mod: module github.com/sugarme/gotch-test
/usr/local/Cellar/go/1.15.6/pkg/mod/github.com/sugarme/[email protected]/libtch/dummy_cuda_dependency.cpp existing. Deleting...
/usr/local/Cellar/go/1.15.6/pkg/mod/github.com/sugarme/[email protected]/libtch/lib.go existing. Deleting...
creating /usr/local/Cellar/go/1.15.6/pkg/mod/github.com/sugarme/[email protected]/libtch/dummy_cuda_dependency.cpp for GPU
creating /usr/local/Cellar/go/1.15.6/pkg/mod/github.com/sugarme/[email protected]/libtch/lib.go for GPU
sudo: ldconfig: command not found

However, I cannot use gotch in my Go project.
I also tried to install gotch using go get, but I got:

$ go get github.com/sugarme/gotch
go: github.com/sugarme/gotch upgrade => v0.3.11
# github.com/sugarme/gotch/libtch
ld: warning: directory not found for option '-L/usr/local/Cellar/go/1.15.6/pkg/mod/github.com/sugarme/[email protected]/libtch/libtorch/lib'
ld: warning: directory not found for option '-L/usr/local/cuda/lib64'
ld: library not found for -lcuda
clang: error: linker command failed with exit code 1 (use -v to see invocation)

windows env variables

On Windows I've added LIBTORCH and its path to the env variables. I'm getting the following error when I run go get "github.com/sugarme/gotch":

torch_api.cpp:1:9: fatal error: torch/csrc/autograd/engine.h: No such file or directory
1 | #include<torch/csrc/autograd/engine.h>
compilation terminated.

What else would be needed? Many thanks!

cross-compiling for other platform

Hello,

I want to know whether I can build an executable that loads a PyTorch model and runs inference on other platforms. How would I do it?

I've tried CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -ldflags '-s -w --extldflags "-static -fpic"', but it failed. The executable cannot load libtorch on Linux.

Thanks.

default libtorch folder

At the moment, gotch dictates that the Pytorch C++ API be installed at /opt/libtorch. To make this more flexible and automated, one solution to try is to create environment variables and a Makefile that installs the dependency (Libtorch) based on those variables, i.e.:

  1. $LIBTORCH specifies path to Pytorch C++ API
  2. Makefile to update environments and install dependency:
    • For Linux, update .bashrc or equivalent, where /path/to/libtorch is the path to the directory that was created when unzipping the file.

      export LIBTORCH=/path/to/libtorch
      export LD_LIBRARY_PATH=${LIBTORCH}/lib:$LD_LIBRARY_PATH
      
    • For Windows, assuming that X:\path\to\libtorch is the unzipped libtorch directory.

      • Navigate to Control Panel -> View advanced system settings -> Environment variables.
      • Create the LIBTORCH variable and set it to X:\path\to\libtorch.
      • Append X:\path\to\libtorch\lib to the Path variable.
    • If you prefer to temporarily set environment variables, in PowerShell you can run

      $Env:LIBTORCH = "X:\path\to\libtorch"
      $Env:Path += ";X:\path\to\libtorch\lib"
      

How to load pretrained pytorch model.

Sorry if this is a basic question, but I can't work out how to load a model trained with PyTorch. I have a .pth file and can rewrite the model.py file in Go (or save it to .pt with scripting), but how is this done in general?

How to convert lib.Ctensor to Tensor outside the tensor package?

When ForwardIs returns a PyTorch Tensor from the PyTorch JIT model, according to the code below,
https://github.com/sugarme/gotch/blob/master/tensor/jit.go#L449-L457

tensor := lib.AtiToTensor(cval.civalue) // <-----lib.Ctensor
if err := TorchErr(); err != nil {
	return nil, err
}
return &IValue{
	value: tensor, // ---> It might be better to change this to x Tensor{tensor} 
	kind:  TensorVal,
	name:  "Tensor",
}, nil

the value of the IValue has the lib.Ctensor type. As far as I know (I'm new to Go), I cannot convert it to the Tensor type outside the tensor package, because the ctensor field of Tensor is not exported.

type Tensor struct {
	ctensor lib.Ctensor
}

Example Error Build

Build Error: go build -o /tmp/__debug_bin4144044237 -gcflags all=-N -l .

github.com/sugarme/gotch/libtch

torch_api.cpp:1:9: fatal error: torch/csrc/autograd/engine.h: No such file or directory
1 | #include<torch/csrc/autograd/engine.h>
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated. (exit status 2)

Colab setup example using GopherNote (CPU or GPU)

Problem:
A Google Colab setup that uses the gophernote kernel to work with gotch would be helpful.

Solution:
There exists a Colab gophernote example. It should add environment variables that specify the correct location for installing Libtorch so that gotch can compile successfully.
A trial with gophernote on a local machine showed that gotch works well with it, but an attempt to set it up on Google Colab somehow does not work.

native Go tensor.Print()

Problem: at the moment, tensor.Print() calls a C function to print the tensor data to the console. It would be nice to have a native Go tensor printer that prints the values from Go.
A practical benefit is that when using gotch with gophernote, the output of tensor.Print() would appear in the notebook. This would improve the interactive mode of gotch with gophernote.

Solution:
A quick thought would be to get the tensor values into Go and reconstruct a print-out based on tensor.Size().
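
A minimal sketch of that idea, assuming the tensor's values are already available in Go as a flat, row-major []float64 and its size as a []int64. The helper names formatTensor and formatND are hypothetical and not part of gotch; fetching the values from a real tensor is left out.

```go
package main

import (
	"fmt"
	"strings"
)

// formatTensor checks that data matches shape, then renders it as nested brackets.
// Hypothetical helper: data is assumed to be flat and row-major.
func formatTensor(data []float64, shape []int64) (string, error) {
	want := int64(1)
	for _, d := range shape {
		if d < 0 {
			return "", fmt.Errorf("negative dimension in shape %v", shape)
		}
		want *= d
	}
	if want != int64(len(data)) {
		return "", fmt.Errorf("shape %v needs %d values, got %d", shape, want, len(data))
	}
	return formatND(data, shape), nil
}

// formatND recursively splits data along the first dimension of shape.
func formatND(data []float64, shape []int64) string {
	if len(shape) == 0 {
		// Scalar: exactly one value remains.
		return fmt.Sprintf("%g", data[0])
	}
	n := int(shape[0])
	if n == 0 {
		return "[]"
	}
	stride := len(data) / n
	parts := make([]string, n)
	for i := 0; i < n; i++ {
		parts[i] = formatND(data[i*stride:(i+1)*stride], shape[1:])
	}
	return "[" + strings.Join(parts, " ") + "]"
}

func main() {
	s, err := formatTensor([]float64{1, 2, 3, 4, 5, 6}, []int64{2, 3})
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(s)
}
```

A real printer would also need to handle other element types, column alignment, and truncation of large tensors; the sketch only covers the reconstruction from the size.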

Yolo model

Hi, I would like to know the process used to make the YOLO model in the example file. I always get an error when I try to load my own model (after exporting it as TorchScript, of course).
Thanks

Missing APIs with pattern `tensor *atg_`

There are 9 missing APIs with pattern: tensor *atg_ in tensor-generated.go and must-tensor-generated.go.

These missing APIs should be patched in a separate file: libtch/patch.go for the Go wrapper, with the Go APIs appended to tensor/patch.go.

JIT model multiple inputs

Hi, I just came across your project. Great work, these seem to be the only Go bindings that actually work!
I have a jitted model that takes two inputs and returns multiple outputs.
In C++, this seems to be done like this:
pytorch/pytorch#18337

std::vector<torch::jit::IValue> i;
std::vector<float> x = {2.0, 1.0, 0.5, 3.0, 4.0};
std::vector<float> y = {3.0, 2.0};
i.push_back(torch::tensor(x));
i.push_back(torch::tensor(y));

torch::Tensor output = module->forward(i).toTensor();

How would I do this in gotch? Forward only accepts a single tensor.

Simple code consumes all memory in seconds

I analyze my photos in a loop by calling this function:

func CalcProbs(imageFile string, model *ts.CModule, k int64 ) []TopItem {
	//runtime.GC()
	// Load the image file and resize it to the usual imagenet dimension of 224x224.
	image, err := imageNet.LoadImageAndResize224(imageFile)
	if err != nil {
		log.Fatal(err, imageFile)
	}
	// Apply the forward pass of the model to get the logits.
	output := image.MustUnsqueeze(int64(0), false).ApplyCModule(model).MustSoftmax(-1, gotch.Float, true)

	return Top(output, k)	
}

Each photo is no larger than 1 MB. After 50 calls the program consumes all free memory (~26 GB). Calling runtime.GC() manually doesn't help. What could it be?
In Python, slightly different code works fine for thousands of photos.

A way to convert/load image from multipart to tensor

Discussed in #53

Originally posted by isdzulqor July 24, 2021
Is there a way to convert an image to a tensor directly from a multipart file instead of from a path?

I want to run inference on an image sent through a REST API.

I only found loaders that take the path as a string:

vision.Load(path)
imageNet.LoadImage(path)

I'm assuming that writing the multipart file to disk first would be quite costly.
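
One possible direction, sketched with the Go standard library only: a multipart.File satisfies io.Reader, so the upload can be decoded in memory with image.Decode and unpacked into a flat float slice. The helper names decodeToCHW and samplePNG are hypothetical, and the channel-major layout scaled to [0, 1] is an assumption about what the model expects. Turning the slice into a gotch tensor (for example with a constructor such as ts.NewTensorFromData, which appears elsewhere on this page) is left out.

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	"image/color"
	_ "image/jpeg" // registers the JPEG decoder with image.Decode
	"image/png"    // registers the PNG decoder; also used to build the demo image
	"io"
)

// decodeToCHW decodes an image from any io.Reader and returns its RGB pixels
// in channel-major (C, H, W) order scaled to [0, 1], plus width and height.
// Hypothetical helper; the layout is an assumption about the model input.
func decodeToCHW(r io.Reader) ([]float32, int, int, error) {
	img, _, err := image.Decode(r)
	if err != nil {
		return nil, 0, 0, err
	}
	b := img.Bounds()
	w, h := b.Dx(), b.Dy()
	out := make([]float32, 3*w*h)
	for y := 0; y < h; y++ {
		for x := 0; x < w; x++ {
			// RGBA returns alpha-premultiplied 16-bit values in [0, 65535].
			r16, g16, b16, _ := img.At(b.Min.X+x, b.Min.Y+y).RGBA()
			i := y*w + x
			out[i] = float32(r16) / 65535
			out[w*h+i] = float32(g16) / 65535
			out[2*w*h+i] = float32(b16) / 65535
		}
	}
	return out, w, h, nil
}

// samplePNG builds a 2x1 PNG (one red pixel, one blue pixel) in memory,
// standing in for an uploaded file in this demo.
func samplePNG() (io.Reader, error) {
	img := image.NewRGBA(image.Rect(0, 0, 2, 1))
	img.Set(0, 0, color.RGBA{R: 255, A: 255})
	img.Set(1, 0, color.RGBA{B: 255, A: 255})
	var buf bytes.Buffer
	if err := png.Encode(&buf, img); err != nil {
		return nil, err
	}
	return &buf, nil
}

func main() {
	r, err := samplePNG()
	if err != nil {
		fmt.Println(err)
		return
	}
	px, w, h, err := decodeToCHW(r)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("width:", w, "height:", h, "values:", len(px))
}
```

In an HTTP handler, the reader would be the file returned by the request's multipart form instead of samplePNG, so nothing has to be written to disk.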
