Comments (10)
Sigh! I don't have OS X Yosemite, and I don't have an OS X machine with CUDA.
If anyone can give me SSH access to an OS X CUDA machine running Yosemite (or if you know how I can get a continuous build to run on this combo), it would be much appreciated. I'll set up continuous builds for these.
from cunn.
soumith - I've been exploring related issues in an effort to provide some support. Sorry, I am not conversant enough in the libs to help more, but I did discover some other failures that may be related:
torch.test() and nn.test() both work fine UNTIL the default tensor type is set to torch.FloatTensor; then there are many failures. Additionally, nn.test() fails more tests and segfaults regularly if cunn is used with FloatTensor as the default type.
Unfortunately, I did not see any change in nn.testcuda() when leaving the default type unchanged. They could be unrelated, but it all smells connected.
s
from cunn.
Fixed my cutorch finally! There is a lot of weird stuff going on with these malformed libraries and install_name_tool; I was only able to install it with cmake 3.1.
So I'm able to reproduce. @sneiman, are you running an old MacBook with 512 GB and a 650M?
These errors happen to me sometimes when I run the tests, not every time.
@soumith, what about a virtual machine?
from cunn.
I’m running a mid-2012 MacBook Pro Retina with 16 GB RAM, an Nvidia GeForce GT 650M with 1 GB VRAM, and a 4-core 2.7 GHz i7.
Yosemite (my bad) 10.10.1.
Does anyone have a better experience on a newer MBP? For the record, in my experience this particular MBP vintage has a lot of little problems: drivers that don’t run and strange USB behavior. For example, I cannot use the Tribe gaze tracker.
I am also dual booting xubuntu 14.04. It seems to have similar problems with the torch.test() and nn.test() with FloatTensor as default and with nn.testcuda(). Don’t take that to the bank – I was rushing to a meeting and did not keep good notes.
All of this is to have a workflow that makes it easy for me to go from OS X -> xubuntu on laptop, and run the same code for long training and parameter searches on the big GPU box sitting in my office.
I have not had problems with cutorch.
Got cunn to build: a local build with cmake 3.0.2 and the edit you advised. 3.0.2 is the latest on brew.
However, I am still having GPU training issues. Code that is identical to the CPU version fails to train at all, on both OS X and xubuntu. It seems very fragile as well. I'm still not convinced there isn’t something more deeply wrong, like the wrong stdlib in some library that the torch chain depends on.
I did finally get OS X and ubuntu behavior to be the same. Not the best behavior – but the same is good.
Still banging away at it,
S
This entire message is confidential. If it isn't intended for you, you may not use it – so please throw it away and forget about it.
From: Sergey Zagoruyko
Reply-To: torch/cunn
Date: Wednesday, December 3, 2014 at 5:56 PM
To: torch/cunn
Cc: Seth Neiman
Subject: Re: [cunn] nn.testcuda() produces unstable results on Yosemite 10.10.1 with CUDA 6.5 (#50)
from cunn.
Btw – I have not found a straightforward VM setup that gives useful GPU access for CUDA. If there is one, I'd love to know about it, as xubuntu and OS X were not designed to dual boot.
S
from cunn.
I am also dual booting xubuntu 14.04. It seems to have similar problems with the torch.test() and nn.test() with FloatTensor as default
If you use FloatTensor as the default, the Jacobian tests will fail (as completely expected). This is because we define the perturbation amount for calculating finite-difference-based derivatives to be 1e-6:
https://github.com/torch/nn/blob/master/Jacobian.lua#L70
from cunn.
Lots of modules on CPU use these jacobian tests to check for correctness.
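To illustrate the point about the perturbation size, here is a minimal sketch in plain Python. It is not taken from nn's Jacobian.lua; it emulates single precision by rounding every intermediate value through a 4-byte float, and the helper names (`f32`, `central_diff_f32`, `central_diff_f64`) and the choice of `sin` at x = 1 are assumptions made for the example. The float32 spacing near 1.0 is roughly 1.2e-7, so a 1e-6 step spans only a handful of representable values and the finite-difference estimate becomes coarse.

```python
import math
import struct


def f32(x):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def central_diff_f64(f, x, eps):
    """Central finite difference evaluated in double precision."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


def central_diff_f32(f, x, eps):
    """Same formula, rounding after every step to mimic float32 arithmetic."""
    x, eps = f32(x), f32(eps)
    hi = f32(f(f32(x + eps)))
    lo = f32(f(f32(x - eps)))
    return f32(f32(hi - lo) / f32(2 * eps))


x = 1.0
eps = 1e-6              # the perturbation size mentioned in the thread
exact = math.cos(x)     # analytic derivative of sin at x

err64 = abs(central_diff_f64(math.sin, x, eps) - exact)
err32 = abs(central_diff_f32(math.sin, x, eps) - exact)

print("float64 finite-difference error:", err64)
print("float32 finite-difference error:", err32)
```

Under these assumptions the emulated float32 estimate is off by several orders of magnitude more than the float64 one, which is consistent with Jacobian checks failing once FloatTensor is the default type.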
from cunn.
Got it. Thx.
From: Soumith Chintala
Reply-To: torch/cunn
Date: Wednesday, December 3, 2014 at 6:33 PM
To: torch/cunn
Cc: Seth Neiman
Subject: Re: [cunn] nn.testcuda() produces unstable results on Yosemite 10.10.1 with CUDA 6.5 (#50)
from cunn.
It seems to me that torch/trepl#13 fixed the concat-operator error shown in the screenshot. I've run testcuda several times and the only error I get is out of memory, so the out-of-memory error was probably triggering a call into trepl, which then raised the "no concat operator" error. I think this can be closed.
from cunn.
Thanks a lot, Sergey.
from cunn.