andersbll / cudarray


CUDA-based NumPy

License: MIT License

Makefile 1.24% Python 51.20% C++ 17.66% Cuda 29.90%

cudarray's Issues

function "lrnorm_bc01" missing in cuda-version installation

Hi @andersbll

Really nice package.

After "make && make install", I ran "python setup.py install", but the function "lrnorm_bc01" is missing from the resulting CUDA back-end installation. Did I miss a step during installation?

My current workaround: I searched the repo and found "lrnorm_bc01.pyx" only under "numpy_backend/". I had to edit "setup.py" by hand to compile it, import it in "cudarray/__init__.py", and rerun "python setup.py install".

Thanks

Initialization error when running code with a celery worker

I'm using your neural artistic style code to generate some art using Django with Celery to work through tasks.

I have CUDA enabled and everything works beautifully without Celery. As soon as I run the neural task through Celery, it spits out:

[2016-03-06 01:25:15,656: INFO/MainProcess] Received task: art.tasks.generate_art[8f039acb-e68c-4368-821f-dbf55d0b038b]
[2016-03-06 01:25:19,012: ERROR/MainProcess] Task art.tasks.generate_art[8f039acb-e68c-4368-821f-dbf55d0b038b] raised unexpected: ValueError(b'initialization error',)
Traceback (most recent call last):
File "/venv/deep/lib/python3.4/site-packages/celery/app/trace.py", line 240, in trace_task
R = retval = fun(*args, **kwargs)
File "/venv/deep/lib/python3.4/site-packages/celery/app/trace.py", line 438, in protected_call
return self.run(*args, **kwargs)
File "/home/deep/art/tasks.py", line 32, in generate_art
art.run()
File "/home/deep/art/models.py", line 57, in run
return self.generate(subject_img=self.subject.file, style_img=style_img, **kwargs)
File "/home/deep/art/models.py", line 97, in generate
smoothness)
File "/home/deep/neural/style_network.py", line 88, in __init__
self.x.setup(x_shape)
File "/venv/deep/lib/python3.4/site-packages/deeppy-0.1.dev0-py3.4.egg/deeppy/parameter.py", line 33, in setup
File "/venv/deep/lib/python3.4/site-packages/deeppy-0.1.dev0-py3.4.egg/deeppy/filler.py", line 67, in array
File "/venv/deep/lib/python3.4/site-packages/cudarray-0.1.dev0-py3.4-linux-x86_64.egg/cudarray/cudarray.py", line 242, in array
return ndarray(np_array.shape, np_data=np_array)
File "/venv/deep/lib/python3.4/site-packages/cudarray-0.1.dev0-py3.4-linux-x86_64.egg/cudarray/cudarray.py", line 36, in __init__
self._data = ArrayData(self.size, dtype, np_data)
File "cudarray/wrap/array_data.pyx", line 16, in cudarray.wrap.array_data.ArrayData.__init__ (./cudarray/wrap/array_data.cpp:1401)
File "cudarray/wrap/cudart.pyx", line 12, in cudarray.wrap.cudart.cudaCheck (./cudarray/wrap/cudart.cpp:763)
ValueError: b'initialization error'

You may be thinking "this is a Celery issue, go post there", but hang on.

I recompiled cudarray without CUDA (to use CPU), and tried it again with Celery and it works just as it should, but obviously I need CUDA enabled if I want it to finish any tasks within a month.

Looking further into this, I noticed that others have had issues with Celery and CUDA/GPU code, and have found workarounds that tell it to use the GPU in some way before the task runs.

A theory on CUDA not working with Celery off the bat: http://stackoverflow.com/questions/24744755/why-am-i-getting-cumemalloc-failed-not-initialized-even-though-i-am-initializ

Here's someone who figured it out with Theano by simply telling it to use the GPU in the task itself: http://stackoverflow.com/questions/33354272/runtimeerror-when-using-theano-shared-variable-in-a-celery-celery-worker

I've tried the ol'

os.environ['LD_LIBRARY_PATH'] = '/usr/local/cuda/lib64'
os.environ['CUDARRAY_BACKEND'] = 'cuda'

in the task itself, but I get the same initialization error.

Is there another way to pass some "use the GPU plz" context to CUDA using cudarray in my task?
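
If the cause is the one suggested in the Stack Overflow threads linked above (the CUDA runtime being touched in a parent process and then inherited by a forked worker), one thing to try is deferring the import until the task body runs. The sketch below is an illustration under assumptions, not a confirmed fix: `load_backend` is a hypothetical helper, and it assumes cudarray reads `CUDARRAY_BACKEND` when it is imported. Note also that changing `LD_LIBRARY_PATH` through `os.environ` inside a running process generally has no effect on the dynamic loader, which reads it at process start, so that variable is better exported before the worker is launched.

```python
import importlib
import os


def load_backend(module_name="cudarray", backend="cuda"):
    """Import the array library on first use inside the calling process.

    Hypothetical helper: call it at the top of the task body instead of
    importing the library at module level, so that nothing touches the
    GPU before the worker process exists.
    """
    # Assumption: the library picks its back-end from this variable
    # at import time, so it has to be set before the import.
    os.environ.setdefault("CUDARRAY_BACKEND", backend)
    return importlib.import_module(module_name)
```

Inside the task this would be used as `ca = load_backend()`, with no `import cudarray` (or import of anything that imports it) at the top of the tasks module.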

Can anyone supply a prebuilt cudarray for Windows?

I am trying to use https://github.com/mtyka/neural_artistic_style, which needs cudarray.

I have both "Python 2.7.11 |Anaconda 2.2.0 (32-bit)" and "Python 3.4.4 |Anaconda 2.3.0 (64-bit)". Please note that Anaconda currently supplies only Python 3.4.4, not Python 3.5.

I compiled cudarray without GPU support for "Python 2.7.11 |Anaconda 2.2.0 (32-bit)", but neural_artistic_style fails with:
[quote]
CUDArray: CUDA back-end not available, using NumPy.
... ...
File "E:\neural_artistic_style-master\matconvnet.py", line 38, in vgg_net
weights = np.transpose(weights, (3, 2, 0, 1)).astype(dp.float_)
MemoryError
[/quote]

However, I cannot compile cudarray for "Python 3.4.4 |Anaconda 2.3.0 (64-bit)".

So, can anyone give hints for building on x64, or supply a prebuilt package?
Thanks

without cuda option

First, this package combined with deeppy looks great! I am trying to install it on OS X without an NVIDIA card. I tried:

python setup.py --without-cuda install

and received the error:

Traceback (most recent call last):
  File "setup.py", line 108, in <module>
    ext_modules=cuda_extensions(),
  File "setup.py", line 19, in cuda_extensions
    raise IOError('CUDA directory does not exist: %s' % cuda_dir)
IOError: CUDA directory does not exist: /usr/local/cuda

In fact, I don't see anything in setup.py that would accept that option. Is this a bug, or am I not setting something up right?
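
For context, a custom flag like this is usually stripped from `sys.argv` before `setup()` parses the command line. The sketch below shows that general pattern only; `pop_flag` is a hypothetical helper and is not taken from cudarray's setup.py.

```python
import sys


def pop_flag(argv, flag):
    """Remove `flag` from argv if present and report whether it was there."""
    if flag in argv:
        argv.remove(flag)
        return True
    return False


if __name__ == "__main__":
    # Strip the custom flag before setuptools sees the command line,
    # then decide which extension list to build.
    without_cuda = pop_flag(sys.argv, "--without-cuda")
    print("building CUDA extensions:", not without_cuda)
```

If the setup.py in a checkout has no such handling, the flag cannot have any effect, which would match the traceback above going straight into `cuda_extensions()`.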

El Capitan & Dynamic Libraries

I get this when I try to run the test script:

CUDArray: Failed to load CUDA back-end.
Traceback (most recent call last):
File "./test.py", line 7, in <module>
import cudarray as ca
File "redacted/lib/python2.7/site-packages/cudarray-0.1.dev0-py2.7-macosx-10.5-x86_64.egg/cudarray/__init__.py", line 20, in <module>
from .cudarray import *
File "redacted/lib/python2.7/site-packages/cudarray-0.1.dev0-py2.7-macosx-10.5-x86_64.egg/cudarray/cudarray.py", line 2, in <module>
from .wrap.array_data import ArrayData
ImportError: dlopen(redacted/lib/python2.7/site-packages/cudarray-0.1.dev0-py2.7-macosx-10.5-x86_64.egg/cudarray/wrap/array_data.so, 2): Library not loaded: @rpath/libcudart.7.5.dylib
Referenced from: redacted/lib/python2.7/site-packages/cudarray-0.1.dev0-py2.7-macosx-10.5-x86_64.egg/cudarray/wrap/array_data.so
Reason: image not found

I suspect the cause is that in OS X El Capitan it is no longer possible to set DYLD_FALLBACK_LIBRARY_PATH or its friends. (Or, if it is possible, it now requires jumping through additional hoops.)

fp16

Is it possible to set up arrays using FP16 rather than FP32? (I didn't see anything obvious. The code appears to use fp32 right now; I seem to recall seeing it set to 32 somewhere in the code.)

If fp16 is possible, can fp32 arrays be converted to fp16 directly, the way type conversion works in NumPy?
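
For reference, this is the NumPy behaviour the question refers to. It is a host-side illustration only; whether cudarray can hold float16 data on the device is not answered in this thread.

```python
import numpy as np

x32 = np.array([0.1, 1.0, 1000.5, 2049.0], dtype=np.float32)
x16 = x32.astype(np.float16)   # rounds each element to half precision

print(x16.dtype, x16.itemsize)  # half the storage of float32
print(x16)                      # 2049.0 is not representable and rounds to 2048.0
```

The conversion halves memory use, but fp16 has only about three significant decimal digits and a largest finite value of 65504, so the round trip is lossy.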

ca.dot for vectors

For now you cannot use ca.dot with (N,) shapes because it throws an exception. The only workaround I found is:

def vectorT_dot(vec1, vec2):
    vec1_m = ca.reshape(vec1, (vec1.shape[0], 1))
    vec2_m = ca.reshape(vec2, (1, vec2.shape[0]))

    return ca.dot(vec1_m, vec2_m)

I hope you can implement this in a more convenient way in the library.
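
One thing worth noting about the snippet above: reshaping to (N, 1) and (1, N) yields the N x N outer product, while the scalar inner product needs (1, N) times (N, 1). The sketch below shows both orders with NumPy as a stand-in. The `xp` parameter is an assumption for illustration; passing `ca` should work only in so far as `ca.reshape` and `ca.dot` behave as in the snippet above.

```python
import numpy as np


def inner_via_matmul(vec1, vec2, xp=np):
    """Inner product of two (N,) vectors using only a 2-D dot."""
    row = xp.reshape(vec1, (1, vec1.shape[0]))   # (1, N)
    col = xp.reshape(vec2, (vec2.shape[0], 1))   # (N, 1)
    return xp.dot(row, col)                      # (1, 1)


def outer_via_matmul(vec1, vec2, xp=np):
    """Outer product: the shape order used in the snippet above."""
    col = xp.reshape(vec1, (vec1.shape[0], 1))   # (N, 1)
    row = xp.reshape(vec2, (1, vec2.shape[0]))   # (1, N)
    return xp.dot(col, row)                      # (N, N)
```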

arguments to random.uniform have no effect (with further implications)

I noticed that the arguments low and high to random.uniform have no effect. While this limitation is easy to work around, the bug stems from a more serious problem that prevents me from implementing numpy.random.poisson in cudarray: in random.cu, a CUDA kernel operating directly on the output array (like kernel_stretch) seems to have no effect on the output array.

Any help is highly appreciated!
Wieland
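
As a reference for the expected result, the stretching step amounts to mapping unit samples into the requested interval. This is a plain-Python sketch of that arithmetic, on the assumption that `kernel_stretch` is meant to compute the same thing; it says nothing about why the kernel has no effect.

```python
import random


def uniform(low=0.0, high=1.0, size=1, rng=random):
    """Draw `size` samples between low and high by stretching unit samples."""
    samples = []
    for _ in range(size):
        u = rng.random()                         # u in [0, 1)
        samples.append(low + (high - low) * u)   # shift and scale
    return samples
```

A quick check against the GPU output is that every returned value lies between `low` and `high`; with the bug described above, the values would stay in the unit interval regardless of the arguments.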

Wrong indexing

Well, core functions don't work properly.

In [5]: a = ca.array(np.array([[ 1.66528273, -1.64936948],
                               [ 1.72497928, -1.69695187]]))

In [6]: a
Out[6]: 
array([[ 1.66528273, -1.64936948],
       [ 1.72497928, -1.69695187]], dtype=float32)

In [7]: a[:,0]
Out[7]: array([ 1.66528273, -1.64936948], dtype=float32)

In [8]: a[0,:]
Out[8]: array([ 1.66528273, -1.64936948], dtype=float32)

I see column slicing is not implemented. Can you raise an error on this operation to make that clear?
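
For comparison, `a[:, 0]` should return `[1.66528273, 1.72497928]`. The output in the report would be consistent with a contiguous read from the start of a row-major buffer instead of a strided one; that is a guess about the cause and is not confirmed in this thread. The difference, in plain Python with hypothetical helper names:

```python
def take_row(flat, n_cols, i):
    """Row i of a row-major matrix: n_cols contiguous elements."""
    start = i * n_cols
    return flat[start:start + n_cols]


def take_col(flat, n_rows, n_cols, j):
    """Column j of a row-major matrix: one element every n_cols."""
    return [flat[i * n_cols + j] for i in range(n_rows)]


flat = [1.66528273, -1.64936948, 1.72497928, -1.69695187]
print(take_row(flat, 2, 0))      # first row
print(take_col(flat, 2, 2, 0))   # first column
```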

How is cudarray different from other work?

Hi,

I've read the technical report here: http://www2.compute.dtu.dk/~abll/pubs/larsen2014cudarray.pdf

You listed several packages, but I am not sure how cudarray compares with the others. Why cudarray? Which features does cudarray have that the others don't?

Thanks

Install requirements?

Hi,
It looks like this only compiles for Objective C. Am I doing something wrong with a gcc only machine, or is C++ not supported?

Thanks
-Tim

Install problem with Cython

I tried Cython 0.21, 0.22, and 0.23 and got the same error from python setup.py install:
'''

......

cudarray/wrap/cudnn.pyx:45:32: Cannot convert Python object to 'float *'

Error compiling Cython file:

...
float_ptr(poolout))

def bprop(self, ArrayData imgs, ArrayData poolout, ArrayData poolout_d,
          ArrayData imgs_d):
    self.ptr.bprop(
        <const float *> imgs.dev_ptr,
       ^

cudarray/wrap/cudnn.pyx:50:12: Python objects cannot be cast to pointers of primitive types

Error compiling Cython file:

...

def bprop(self, ArrayData imgs, ArrayData poolout, ArrayData poolout_d,
          ArrayData imgs_d):
    self.ptr.bprop(
        <const float *> imgs.dev_ptr,
        <const float *> poolout.dev_ptr,
       ^

cudarray/wrap/cudnn.pyx:51:12: Python objects cannot be cast to pointers of primitive types

Error compiling Cython file:

...
def bprop(self, ArrayData imgs, ArrayData poolout, ArrayData poolout_d,
ArrayData imgs_d):
self.ptr.bprop(
<const float *> imgs.dev_ptr,
<const float *> poolout.dev_ptr,
<const float *> poolout_d.dev_ptr, <float *> imgs_d.dev_ptr

^

cudarray/wrap/cudnn.pyx:52:12: Python objects cannot be cast to pointers of primitive types

Error compiling Cython file:

^

cudarray/wrap/cudnn.pyx:52:47: Python objects cannot be cast to pointers of primitive types

Error compiling Cython file:

...
ArrayData convout):
cdef int img_h = img_shape[0]
cdef int img_w = img_shape[1]
cdef int filter_h = filter_shape[0]
cdef int filter_w = filter_shape[1]
self.ptr.fprop(float_ptr(imgs), float_ptr(filters), n_imgs, n_channels,

^

cudarray/wrap/cudnn.pyx:81:32: Cannot convert Python object to 'float const *'

Error compiling Cython file:

^

cudarray/wrap/cudnn.pyx:81:49: Cannot convert Python object to 'float const *'

Error compiling Cython file:

...
cdef int img_h = img_shape[0]
cdef int img_w = img_shape[1]
cdef int filter_h = filter_shape[0]
cdef int filter_w = filter_shape[1]
self.ptr.fprop(float_ptr(imgs), float_ptr(filters), n_imgs, n_channels,
n_filters, img_h, img_w, filter_h, filter_w, float_ptr(convout))

^

cudarray/wrap/cudnn.pyx:82:66: Cannot convert Python object to 'float *'

Error compiling Cython file:

...
self.ptr.fprop(float_ptr(imgs), float_ptr(filters), n_imgs, n_channels,
n_filters, img_h, img_w, filter_h, filter_w, float_ptr(convout))

def bprop(self, ArrayData imgs, ArrayData filters, ArrayData convout_d,
          ArrayData imgs_d, ArrayData filters_d):
    cdef float *imgs_ptr = <float *>NULL if imgs is None \
                          ^

cudarray/wrap/cudnn.pyx:86:31: Cannot convert 'float *' to Python object

Error compiling Cython file:

...
self.ptr.fprop(float_ptr(imgs), float_ptr(filters), n_imgs, n_channels,
n_filters, img_h, img_w, filter_h, filter_w, float_ptr(convout))

def bprop(self, ArrayData imgs, ArrayData filters, ArrayData convout_d,
          ArrayData imgs_d, ArrayData filters_d):
    cdef float *imgs_ptr = <float *>NULL if imgs is None \
                          ^

cudarray/wrap/cudnn.pyx:86:31: Cannot convert Python object to 'float *'

Error compiling Cython file:

...
n_filters, img_h, img_w, filter_h, filter_w, float_ptr(convout))

def bprop(self, ArrayData imgs, ArrayData filters, ArrayData convout_d,
          ArrayData imgs_d, ArrayData filters_d):
    cdef float *imgs_ptr = <float *>NULL if imgs is None \
                                         else float_ptr(imgs)
                                                      ^

cudarray/wrap/cudnn.pyx:87:59: Cannot convert Python object to 'float *'

Error compiling Cython file:

...

def bprop(self, ArrayData imgs, ArrayData filters, ArrayData convout_d,
          ArrayData imgs_d, ArrayData filters_d):
    cdef float *imgs_ptr = <float *>NULL if imgs is None \
                                         else float_ptr(imgs)
    cdef float *imgs_d_ptr = <float *>NULL if imgs_d is None \
                            ^

cudarray/wrap/cudnn.pyx:88:33: Cannot convert 'float *' to Python object

Error compiling Cython file:

...

def bprop(self, ArrayData imgs, ArrayData filters, ArrayData convout_d,
          ArrayData imgs_d, ArrayData filters_d):
    cdef float *imgs_ptr = <float *>NULL if imgs is None \
                                         else float_ptr(imgs)
    cdef float *imgs_d_ptr = <float *>NULL if imgs_d is None \
                            ^

cudarray/wrap/cudnn.pyx:88:33: Cannot convert Python object to 'float *'

Error compiling Cython file:

...
def bprop(self, ArrayData imgs, ArrayData filters, ArrayData convout_d,
ArrayData imgs_d, ArrayData filters_d):
cdef float *imgs_ptr = <float *>NULL if imgs is None
else float_ptr(imgs)
cdef float *imgs_d_ptr = <float *>NULL if imgs_d is None
else float_ptr(imgs_d)

^

cudarray/wrap/cudnn.pyx:89:61: Cannot convert Python object to 'float *'

Error compiling Cython file:

...
ArrayData imgs_d, ArrayData filters_d):
cdef float *imgs_ptr = <float *>NULL if imgs is None
else float_ptr(imgs)
cdef float *imgs_d_ptr = <float *>NULL if imgs_d is None
else float_ptr(imgs_d)
cdef float *filters_d_ptr = <float *>NULL if filters_d is None \

^

cudarray/wrap/cudnn.pyx:90:36: Cannot convert 'float *' to Python object

Error compiling Cython file:

^

cudarray/wrap/cudnn.pyx:90:36: Cannot convert Python object to 'float *'

Error compiling Cython file:

...
cdef float *imgs_ptr = <float *>NULL if imgs is None
else float_ptr(imgs)
cdef float *imgs_d_ptr = <float *>NULL if imgs_d is None
else float_ptr(imgs_d)
cdef float *filters_d_ptr = <float *>NULL if filters_d is None
else float_ptr(filters_d)

^

cudarray/wrap/cudnn.pyx:91:64: Cannot convert Python object to 'float *'

Error compiling Cython file:

...
else float_ptr(imgs)
cdef float *imgs_d_ptr = <float *>NULL if imgs_d is None
else float_ptr(imgs_d)
cdef float *filters_d_ptr = <float *>NULL if filters_d is None
else float_ptr(filters_d)
self.ptr.bprop(imgs_ptr, float_ptr(filters),

^

cudarray/wrap/cudnn.pyx:92:42: Cannot convert Python object to 'float const *'

Error compiling Cython file:

...
cdef float *imgs_d_ptr = <float *>NULL if imgs_d is None
else float_ptr(imgs_d)
cdef float *filters_d_ptr = <float *>NULL if filters_d is None
else float_ptr(filters_d)
self.ptr.bprop(imgs_ptr, float_ptr(filters),
float_ptr(convout_d), imgs_d_ptr, filters_d_ptr)

^

cudarray/wrap/cudnn.pyx:93:32: Cannot convert Python object to 'float const *'
building 'cudarray.wrap.cudnn' extension
x86_64-linux-gnu-gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC -I/usr/local/cuda-7.0/include -I./include -I/usr/lib/python2.7/dist-packages/numpy/core/include -I/usr/lib/python2.7/dist-packages/numpy/core/include -I/usr/include/python2.7 -c ./cudarray/wrap/cudnn.cpp -o build/temp.linux-x86_64-2.7/./cudarray/wrap/cudnn.o -DCUDNN_ENABLED -O3 -fPIC -Wall -Wfatal-errors
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++ [enabled by default]
./cudarray/wrap/cudnn.cpp:1:2: error: #error Do not use this file, it is the result of a failed Cython compilation.
#error Do not use this file, it is the result of a failed Cython compilation.
^
compilation terminated due to -Wfatal-errors.
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
'''

compiling error: identifier "__atomic_fetch_add" is undefined

Hi,

I am trying to install cudarray but got the above error.

This is what I did:

export INSTALL_PREFIX=./test/
export CUDA_PREFIX=/usr/local/cuda-5.0/
export CUDNN_ENABLED=1
make

And this is what I got:

/usr/local/cuda-5.0//bin/nvcc -gencode arch=compute_20,code=sm_20 -gencode arch=compute_20,code=compute_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_30,code=compute_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_35,code=compute_35 -O3 --compiler-options '-DCUDNN_ENABLED -O3 -fPIC -Wall -Wfatal-errors' --ftz=true --prec-div=false -prec-sqrt=false --fmad=true -I./include -I/usr/local/cuda-5.0//include -c -o src/nnet/pool_b01.o src/nnet/pool_b01.cu
/usr/include/c++/4.8/ext/atomicity.h(49): error: identifier "__atomic_fetch_add" is undefined

/usr/include/c++/4.8/ext/atomicity.h(53): error: identifier "__atomic_fetch_add" is undefined

2 errors detected in the compilation of "/tmp/tmpxft_00004173_00000000-12_pool_b01.compute_35.cpp1.ii".
make: *** [src/nnet/pool_b01.o] Error 2

Any idea why this happened? Thanks

fatal error: 'numpy/arrayobject.h' file not found

Running the setup install for the non-CUDA back-end produces:

cudarray/numpy_backend/nnet/conv_bc01.c:274:10: fatal error: 'numpy/arrayobject.h' file not found

#include "numpy/arrayobject.h"

1 error generated.
error: command 'clang' failed with exit status 1

Any insights as to what might be happening?
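
The compiler is not being given NumPy's header directory. `numpy.get_include()` returns that directory, and the usual way to hand it to the C compiler is the `include_dirs` argument of the extension definition. Whether setup.py already does this at the reporter's revision is not known from the thread, so the snippet below is only a quick local check; `numpy_header_path` is a hypothetical helper name.

```python
import os

import numpy


def numpy_header_path():
    """Full path of the header the compiler failed to find."""
    return os.path.join(numpy.get_include(), "numpy", "arrayobject.h")


print(numpy_header_path(), os.path.isfile(numpy_header_path()))
```

If the file exists, the directory printed by `numpy.get_include()` is the one that needs to reach the compiler as a `-I` option.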

mkdir error in building libcudarray

I'm getting this error:

mkdir -p ./build
make: mkdir: No such file or directory
make: *** [build/libcudarray.so] Error 1

Any ideas? Thanks!

Update: I fixed this just by doing /bin/mkdir, but now it's saying:

ld: library not found for -lcudnn ?

Bugs in cudarray/cudarray/numpy_backend/nnet/lrnorm_bc01.pyx

In line 47: why is it "range(N+1)"? If we pad "half" zeros at both ends, it should be "range(tailLength)".

In line 79: the inline function "addToNormWindow" doesn't seem to update the variable "norm_window". I verified this by printing its value.

Thanks

Reference: decaf implementation
https://github.com/UCB-ICSI-Vision-Group/decaf-release/blob/master/decaf/layers/cpp/local_response_normalization.cpp

cudarray installation problem

Hi,

I am trying to install "cudarray" but I am facing a problem.

After running make and make install, python setup.py install gives the following error:
./cudarray/wrap/array_data.cpp:1384:53: fatal error: arithmetic on a pointer to void __pyx_v_self->dev_ptr = (__pyx_v_owner->dev_ptr + (__pyx_v_offset * ...

I am using Mac OS X 10.10.3 and GCC (Apple LLVM version 6.1.0 (clang-602.0.49) (based on LLVM 3.6.0svn)). I also have the NVIDIA compiler nvcc installed, using the following version:

CUDA back-end not available even though toolkit is installed and setup.py was run with no parameters.

I can import cudarray after installing everything but for some reason, I still can't use the CUDA back-end I know I have. Any help?

I get errors like these:
g++ -O3 -fPIC -Wall -Wfatal-errors -I./include -I/mnt/cuda-7.5/include -c -o src/nnet/conv_bc01_matmul.o src/nnet/conv_bc01_matmul.cpp
nvcc -gencode arch=compute_20,code=sm_20 -gencode arch=compute_20,code=compute_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_30,code=compute_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_35,code=compute_35 -O3 --compiler-options '-O3 -fPIC -Wall -Wfatal-errors' --ftz=true --prec-div=false -prec-sqrt=false --fmad=true -I./include -I/mnt/cuda-7.5/include -c -o src/nnet/pool_b01.o src/nnet/pool_b01.cu
g++ -O3 -fPIC -Wall -Wfatal-errors -I./include -I/mnt/cuda-7.5/include -c -o src/nnet/cudnn.o src/nnet/cudnn.cpp

when I run make.

Ok so if I manually switch to the CUDA back-end, it says it can't find a file called libcudart.sc

Installation without CUDA back-end failed

Hello,

I'm trying to install cudarray:

➜  Code  git clone https://github.com/andersbll/cudarray.git
Cloning into 'cudarray'...
remote: Counting objects: 970, done.
remote: Compressing objects: 100% (18/18), done.
remote: Total 970 (delta 3), reused 0 (delta 0)
Receiving objects: 100% (970/970), 196.66 KiB | 345.00 KiB/s, done.
Resolving deltas: 100% (566/566), done.
Checking connectivity... done.
➜  Code  cd cudarray
➜  cudarray git:(master) python setup.py --without-cuda install
Traceback (most recent call last):
  File "setup.py", line 112, in <module>
    ext_modules=numpy_extensions(),
  File "setup.py", line 77, in numpy_extensions
    return cythonize(cython_srcs, include_path=[numpy.get_include()])
  File "/Users/khalman/anaconda/lib/python2.7/site-packages/Cython/Build/Dependencies.py", line 714, in cythonize
    aliases=aliases)
  File "/Users/khalman/anaconda/lib/python2.7/site-packages/Cython/Build/Dependencies.py", line 637, in create_extension_list
    kwds = deps.distutils_info(file, aliases, base).values
  File "/Users/khalman/anaconda/lib/python2.7/site-packages/Cython/Build/Dependencies.py", line 541, in distutils_info
    return (self.transitive_merge(filename, self.distutils_info0, DistutilsInfo.merge)
  File "/Users/khalman/anaconda/lib/python2.7/site-packages/Cython/Build/Dependencies.py", line 551, in transitive_merge
    node, extract, merge, seen, {}, self.cimported_files)[0]
  File "/Users/khalman/anaconda/lib/python2.7/site-packages/Cython/Build/Dependencies.py", line 562, in transitive_merge_helper
    for next in outgoing(node):
  File "/Users/khalman/anaconda/lib/python2.7/site-packages/Cython/Utils.py", line 35, in wrapper
    res = cache[args] = f(self, *args)
  File "/Users/khalman/anaconda/lib/python2.7/site-packages/Cython/Build/Dependencies.py", line 490, in cimported_files
    pxd_file = self.find_pxd(module, filename)
  File "/Users/khalman/anaconda/lib/python2.7/site-packages/Cython/Utils.py", line 35, in wrapper
    res = cache[args] = f(self, *args)
  File "/Users/khalman/anaconda/lib/python2.7/site-packages/Cython/Build/Dependencies.py", line 474, in find_pxd
    pxd = self.context.find_pxd_file(relative, None)
  File "/Users/khalman/anaconda/lib/python2.7/site-packages/Cython/Compiler/Main.py", line 190, in find_pxd_file
    pxd = self.search_include_directories(qualified_name, ".pxd", pos, sys_path=True)
  File "/Users/khalman/anaconda/lib/python2.7/site-packages/Cython/Compiler/Main.py", line 231, in search_include_directories
    tuple(self.include_directories), qualified_name, suffix, pos, include, sys_path)
  File "/Users/khalman/anaconda/lib/python2.7/site-packages/Cython/Utils.py", line 22, in wrapper
    res = cache[args] = f(*args)
  File "/Users/khalman/anaconda/lib/python2.7/site-packages/Cython/Utils.py", line 111, in search_include_directories
    path = os.path.join(dir, dotted_filename)
  File "/Users/khalman/anaconda/lib/python2.7/posixpath.py", line 80, in join
    path += '/' + b
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 15: ordinal not in range(128)

For python I use anaconda.
Python Version: 2.7.8
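
A guess at the cause: byte 0xd0 is the lead byte of many Cyrillic characters in UTF-8, so one of the directories being joined (for example an entry on the include or module search path) may have a non-ASCII name, and Python 2 tries to decode it as ASCII when byte and unicode strings are mixed. The Python 3 snippet below only reproduces that decoding failure; the path in it is made up for illustration.

```python
raw = "/Users/Пример/cudarray".encode("utf-8")   # hypothetical path, as bytes

try:
    raw.decode("ascii")          # what an implicit str/unicode mix attempts in Python 2
except UnicodeDecodeError as exc:
    print(exc)

print(raw.decode("utf-8"))       # decoding with the right codec succeeds
```

If this is the cause, building from a directory tree with ASCII-only names would be a quick way to confirm it.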

Install not working on OS X

I clone the project, then:

~/D/cudarray master ❯ make
g++ -O3 -fPIC -Wall -Wfatal-errors -I./include -I/usr/local/cuda/include -c -o src/nnet/conv_bc01_matmul.o src/nnet/conv_bc01_matmul.cpp
In file included from src/nnet/conv_bc01_matmul.cpp:1:
./include/cudarray/common.hpp:8:10: fatal error: 'cuda_runtime_api.h' file not
      found
#include <cuda_runtime_api.h>
         ^
1 error generated.
make: *** [src/nnet/conv_bc01_matmul.o] Error 1

Compiler not supported directive

C:\Python27\lib\site-packages\setuptools\dist.py:285: UserWarning: Normalizing '0.1.dev' to '0.1.dev0'
  normalized_version,
running install
running bdist_egg
running egg_info
writing requirements to cudarray.egg-info\requires.txt
writing cudarray.egg-info\PKG-INFO
writing top-level names to cudarray.egg-info\top_level.txt
writing dependency_links to cudarray.egg-info\dependency_links.txt
reading manifest file 'cudarray.egg-info\SOURCES.txt'
writing manifest file 'cudarray.egg-info\SOURCES.txt'
installing library code to build\bdist.win32\egg
running install_lib
running build_py
running build_ext
skipping './cudarray\wrap\cudart.cpp' Cython extension (up-to-date)
building 'cudarray.wrap.cudart' extension
C:\Users\...\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\VC\Bin\cl.exe /c /nologo /Ox /MD /W3 /GS- /DNDEBUG -I/usr/local/cuda\include -I./include -IC:\Python27\lib\site-packages\numpy\core\include -IC:\Python27\l
ib\site-packages\numpy\core\include -IC:\Python27\include -IC:\Python27\PC /Tp./cudarray\wrap\cudart.cpp /Fobuild\temp.win32-2.7\Release\./cudarray\wrap\cudart.obj -O3 -fPIC -Wall -Wfatal-errors
cl : Command line error D8021 : invalid numeric argument '/Wfatal-errors'
error: command 'C:\\Users\\...\\AppData\\Local\\Programs\\Common\\Microsoft\\Visual C++ for Python\\9.0\\VC\\Bin\\cl.exe' failed with exit status 2
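
The log shows GCC-style options (`-O3 -fPIC -Wall -Wfatal-errors`) being appended to a `cl.exe` command line, and MSVC rejects them. A build script would need to choose flags per compiler. The sketch below shows one way to do that; `extra_compile_args` is a hypothetical helper, not something in cudarray's setup.py.

```python
import sys


def extra_compile_args(platform=sys.platform):
    """Pick optimisation and warning flags the active compiler understands."""
    if platform.startswith("win"):
        # MSVC (cl.exe) spelling; GCC-style options such as -fPIC
        # and -Wfatal-errors are not valid here.
        return ["/O2", "/W3"]
    return ["-O3", "-fPIC", "-Wall", "-Wfatal-errors"]
```

The include path `-I/usr/local/cuda\include` in the same log suggests the CUDA directory is also still at its Unix default and would need to point at the Windows toolkit location.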

ImportError: No module named conv_bc01

Ubuntu 14.04 LTS. CUDA 7.5.
However, I installed cudarray without CUDA support.
After installation I got:

grinya@mypc:~/cudarray$ python
>>> import cudarray
CUDArray: CUDA back-end not available, using NumPy.
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "cudarray/__init__.py", line 40, in <module>
from .numpy_backend import *
File "cudarray/numpy_backend/__init__.py", line 2, in <module>
from .nnet import *
File "cudarray/numpy_backend/nnet/__init__.py", line 3, in <module>
from .conv_bc01 import *
ImportError: No module named conv_bc01

But the file does exist in the directory "cudarray/numpy_backend/nnet". I do not understand this error.
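
One possible explanation: the interpreter was started inside the source checkout (`~/cudarray`), and the relative paths in the traceback suggest Python imported the `cudarray` directory from the current folder, where the `.pyx` files have not been compiled, instead of the installed package. A quick way to see which copy would be imported (`module_origin` is a hypothetical helper name):

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


print(module_origin("json"))       # a standard-library example
print(module_origin("cudarray"))   # None if cudarray is not importable here
```

If the printed path points into the checkout instead of site-packages, running Python from a different directory should pick up the installed build.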

transpose

I'm not sure if I've done something incorrectly but I notice the following issue:

a = ca.random.uniform(size=(5,3))
a
a.T

The transpose is 3x5, but it seems to be reading sequentially from top left to bottom right rather than transposing. I noticed that base.transpose sets a transposed flag, so I assumed other operators would use this flag rather than explicitly transposing the matrix, but I'm no longer sure what's happening. In my code, a.T doesn't work unless the array is 1xN or Nx1.

Thanks.
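
The symptom described (right shape, elements in sequential order) is what you get when the same flat buffer is reinterpreted with swapped dimensions instead of being reordered. The plain-Python sketch below shows the difference; it illustrates the symptom and makes no claim about how cudarray implements `.T`.

```python
def reinterpret(flat, n_rows, n_cols):
    """View a flat row-major buffer as n_rows x n_cols without moving data."""
    return [flat[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]


def transpose(flat, n_rows, n_cols):
    """True transpose of an n_rows x n_cols row-major buffer."""
    return [[flat[r * n_cols + c] for r in range(n_rows)]
            for c in range(n_cols)]


flat = list(range(15))             # a 5x3 matrix stored row by row
print(reinterpret(flat, 3, 5))     # sequential read: not a transpose
print(transpose(flat, 5, 3))       # element (c, r) comes from (r, c)
```

For 1xN and Nx1 arrays the two functions return the same result, which would explain why only those cases appear to work.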

Installing cuDNN (add <installpath> to the Include Directories...)

How can I complete the second step in the installation of cuDNN?

In your Visual Studio project properties, add <installpath> to the Include Directories and Library Directories lists and add cudnn.lib to Linker->Input->Additional Dependencies.

Here's what it looks like without this step:

`C:\Python27\cudarray>python setup.py install
C:\Python27\lib\site-packages\setuptools\dist.py:285: UserWarning: Normalizing '0.1.dev' to '0.1.dev0'
normalized_version,
running install

running bdist_egg

running egg_info

writing requirements to cudarray.egg-info\requires.txt

writing cudarray.egg-info\PKG-INFO

writing top-level names to cudarray.egg-info\top_level.txt
writing dependency_links to cudarray.egg-info\dependency_links.txt
reading manifest file 'cudarray.egg-info\SOURCES.txt'
writing manifest file 'cudarray.egg-info\SOURCES.txt'
installing library code to build\bdist.win-amd64\egg
running install_lib
running build_py
running build_ext
skipping './cudarray\wrap\cudart.cpp' Cython extension (up-to-date)
building 'cudarray.wrap.cudart' extension
C:\Users\Christian\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\VC\Bin\amd64\cl.exe /c /nologo /Ox /MD /W3 /GS- /DNDEBUG -I/usr/local/cuda\include -I./include -IC:\Python27\lib\site-packages\numpy\core\include -IC:\Python27\lib\site-packages\numpy\core\include -IC:\Python27\include -IC:\Python27\PC /Tp./cudarray\wrap\cudart.cpp /Fobuild\temp.win-amd64-2.7\Release./cudarray\wrap\cudart.obj -O3 -fPIC -Wall -Wfatal-errors
cl : Command line error D8021 : invalid numeric argument '/Wfatal-errors'
error: command 'C:\Users\Christian\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\VC\Bin\amd64\cl.exe' failed with exit status 2`

Visual C++ for Python is only available as a command prompt. Can I do something with the command prompt, or do I need a newer version of Visual Studio? I'm not sure how to proceed.

Thanks.

Python compile failure

Hi, thanks for putting this together, it looks nice. I was trying to take it for a spin, but python setup.py install failed with somewhat cryptic errors that I wasn't able to narrow down. I don't have much experience with Cython, so I'm not sure where to look or where things went wrong. Copy-pasting the beginning of the error (I'm on Ubuntu 12.04 with Python 2.7):

...
cythoning ./cudarray/wrap/array_data.pyx to ./cudarray/wrap/array_data.cpp

Error compiling Cython file:
------------------------------------------------------------
...
from libcpp cimport bool
cimport numpy as np

cdef extern from 'cudarray/common.hpp' namespace 'cudarray':
    ctypedef int bool_t;
                      ^
------------------------------------------------------------

cudarray/wrap/array_data.pxd:5:23: Syntax error in ctypedef statement

Error compiling Cython file:
------------------------------------------------------------
...
    @property
    def itemsize(self):
        return self.dtype.itemsize


cdef bool_t *bool_ptr(ArrayData a):
    ^
------------------------------------------------------------

cudarray/wrap/array_data.pyx:50:5: 'bool_t' is not a type identifier

Error compiling Cython file:
------------------------------------------------------------
...

cdef int *int_ptr(ArrayData a):
    return <int *> a.dev_ptr


cdef bool is_int(ArrayData a):
    ^
------------------------------------------------------------

cudarray/wrap/array_data.pyx:62:5: 'bool' is not a type identifier

Error compiling Cython file:
------------------------------------------------------------
...

cdef bool is_int(ArrayData a):
    return a.dtype == np.dtype('int32')


cdef bool is_float(ArrayData a):
    ^
------------------------------------------------------------

cudarray/wrap/array_data.pyx:66:5: 'bool' is not a type identifier

Error compiling Cython file:
------------------------------------------------------------
...
        self.dtype = dtype
        self.nbytes = size*dtype.itemsize
        self.owner = owner
        self.offset = offset
        if owner is None:
            cudaCheck(cudaMalloc(&self.dev_ptr, self.nbytes))
                                ^
------------------------------------------------------------

cudarray/wrap/array_data.pyx:16:33: Cannot take address of Python variable

Error compiling Cython file:
------------------------------------------------------------
...
        if owner is None:
            cudaCheck(cudaMalloc(&self.dev_ptr, self.nbytes))
        else:
            self.dev_ptr = owner.dev_ptr + offset*dtype.itemsize
        if np_data is not None:
            cudaCheck(cudaMemcpyAsync(self.dev_ptr, np.PyArray_DATA(np_data),
                                         ^
------------------------------------------------------------

cudarray/wrap/array_data.pyx:20:42: Cannot convert Python object to 'void *'

Error compiling Cython file:
------------------------------------------------------------
...
        if np_data is not None:
            cudaCheck(cudaMemcpyAsync(self.dev_ptr, np.PyArray_DATA(np_data),
                                      self.nbytes, cudaMemcpyHostToDevice))

    def to_numpy(self, np_array):
        cudaCheck(cudaMemcpy(np.PyArray_DATA(np_array), self.dev_ptr,
                                                           ^
------------------------------------------------------------

cudarray/wrap/array_data.pyx:24:60: Cannot convert Python object to 'void const *'

Error compiling Cython file:
------------------------------------------------------------
...
                             self.nbytes, cudaMemcpyDeviceToHost))
        return np_array

    def __dealloc__(self):
        if self.owner is None:
            cudaFree(self.dev_ptr)
                        ^
------------------------------------------------------------

cudarray/wrap/array_data.pyx:30:25: Cannot convert Python object to 'void *'

Error compiling Cython file:
------------------------------------------------------------
...
    def itemsize(self):
        return self.dtype.itemsize


cdef bool_t *bool_ptr(ArrayData a):
    return <bool_t *> a.dev_ptr
           ^
------------------------------------------------------------

cudarray/wrap/array_data.pyx:51:12: 'bool_t' is not a type identifier

Error compiling Cython file:
------------------------------------------------------------
...
cdef bool_t *bool_ptr(ArrayData a):
    return <bool_t *> a.dev_ptr


cdef float *float_ptr(ArrayData a):
    return <float *> a.dev_ptr
          ^
------------------------------------------------------------

cudarray/wrap/array_data.pyx:55:11: Python objects cannot be cast to pointers of primitive types

Error compiling Cython file:
------------------------------------------------------------
...
cdef float *float_ptr(ArrayData a):
    return <float *> a.dev_ptr


cdef int *int_ptr(ArrayData a):
    return <int *> a.dev_ptr
          ^
------------------------------------------------------------

cudarray/wrap/array_data.pyx:59:11: Python objects cannot be cast to pointers of primitive types
building 'cudarray.wrap.array_data' extension

Support for CuDNN Softmax forward and backward ops

Implementing the softmax forward function is straightforward in CudArray, but backprop suffers performance-wise because it requires multiple kernel launches. The resulting backprop formula also tends to be numerically unstable. Having the softmax forward and backward ops available in the cudnn module would be a massive help for neural networks where the softmax is used extensively, especially when the loss function attached to the softmax is not the cross-entropy loss (which avoids calculating the Jacobian).
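
For reference, the backward pass discussed here can be written without building the full Jacobian. The sketch below is plain Python, not CudArray or cuDNN code, and the function names are hypothetical. It assumes a single row of inputs and uses the identity dx_i = y_i * (dy_i - sum_j dy_j * y_j), checked against an explicit Jacobian version.

```python
import math


def softmax(x):
    """Numerically stable softmax of one row (a list of floats)."""
    m = max(x)  # subtract the row maximum so exp() cannot overflow
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_backward(y, dy):
    """Gradient w.r.t. the softmax input, given output y and upstream gradient dy.

    Uses dx_i = y_i * (dy_i - sum_j dy_j * y_j), a Jacobian-vector product
    that never builds the n-by-n Jacobian.
    """
    dot = sum(g * p for g, p in zip(dy, y))
    return [p * (g - dot) for p, g in zip(y, dy)]


def softmax_backward_jacobian(y, dy):
    """Reference version using the explicit Jacobian J[i][j] = y_i * (delta_ij - y_j)."""
    n = len(y)
    return [
        sum(dy[i] * y[i] * ((1.0 if i == j else 0.0) - y[j]) for i in range(n))
        for j in range(n)
    ]
```

On a GPU the short form amounts to one row-wise reduction followed by one elementwise operation, which is presumably where the multiple kernel launches come from when it is composed from generic array ops.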

can't make under windows

python setup.py build keeps giving me this:

c:\anaconda\lib\site-packages\setuptools-19.6.2-py2.7.egg\setuptools\dist.py:285: UserWarning: Normal
izing '0.1.dev' to '0.1.dev0'
running install
running bdist_egg
running egg_info
writing requirements to cudarray.egg-info\requires.txt
writing cudarray.egg-info\PKG-INFO
writing top-level names to cudarray.egg-info\top_level.txt
writing dependency_links to cudarray.egg-info\dependency_links.txt
reading manifest file 'cudarray.egg-info\SOURCES.txt'
writing manifest file 'cudarray.egg-info\SOURCES.txt'
installing library code to build\bdist.win-amd64\egg
running install_lib
running build_py
running build_ext
skipping './cudarray\wrap\cudart.cpp' Cython extension (up-to-date)
building 'cudarray.wrap.cudart' extension
C:\Users\oh\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\VC\Bin\amd64\cl.exe /c
/nologo /Ox /MD /W3 /GS- /DNDEBUG -Ic:\Anaconda\Lib\site-packages\cuda\include -I./include -Ic:\anaco
nda\lib\site-packages\numpy\core\include -Ic:\anaconda\lib\site-packages\numpy\core\include -Ic:\anac
onda\include -Ic:\anaconda\PC /Tp./cudarray\wrap\cudart.cpp /Fobuild\temp.win-amd64-2.7\Release./cud
array\wrap\cudart.obj -O3 -fPIC -Wall
cl : Command line warning D9002 : ignoring unknown option '-O3'
cl : Command line warning D9002 : ignoring unknown option '-fPIC'
cudart.cpp
c:\users\oh\appdata\local\programs\common\microsoft\visual c++ for python\9.0\vc\include\codeanalysis
\sourceannotations.h(81) : warning C4820: 'vc_attributes::Pre' : '4' bytes padding added after data m
ember 'vc_attributes::Pre::Access'
c:\users\oh\appdata\local\programs\common\microsoft\visual c++ for python\9.0\vc\include\codeanalysis
\sourceannotations.h(96) : warning C4820: 'vc_attributes::Pre' : '4' bytes padding added after data m
ember 'vc_attributes::Pre::NullTerminated'
c:\users\oh\appdata\local\programs\common\microsoft\visual c++ for python\9.0\vc\include\codeanalysis
\sourceannotations.h(112) : warning C4820: 'vc_attributes::Post' : '4' bytes padding added after data
member 'vc_attributes::Post::Access'
c:\users\oh\appdata\local\programs\common\microsoft\visual c++ for python\9.0\vc\include\codeanalysis
\sourceannotations.h(191) : warning C4820: 'vc_attributes::PreRange' : '4' bytes padding added after
data member 'vc_attributes::PreRange::Deref'
c:\users\oh\appdata\local\programs\common\microsoft\visual c++ for python\9.0\vc\include\codeanalysis
\sourceannotations.h(203) : warning C4820: 'vc_attributes::PostRange' : '4' bytes padding added after
data member 'vc_attributes::PostRange::Deref'
C:\Users\oh\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\VC\Include\io.h(60) : w
arning C4820: '_finddata32i64_t' : '4' bytes padding added after data member '_finddata32i64_t::name'

C:\Users\oh\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\VC\Include\io.h(64) : w
arning C4820: '_finddata64i32_t' : '4' bytes padding added after data member '_finddata64i32_t::attri
b'
C:\Users\oh\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\VC\Include\io.h(73) : w
arning C4820: '__finddata64_t' : '4' bytes padding added after data member '__finddata64_t::attrib'
C:\Users\oh\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\VC\Include\io.h(78) : w
arning C4820: '__finddata64_t' : '4' bytes padding added after data member '__finddata64_t::name'
C:\Users\oh\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\VC\Include\io.h(126) :
warning C4820: '_wfinddata64i32_t' : '4' bytes padding added after data member '_wfinddata64i32_t::at
trib'
C:\Users\oh\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\VC\Include\io.h(131) :
warning C4820: '_wfinddata64i32_t' : '4' bytes padding added after data member '_wfinddata64i32_t::na
me'
C:\Users\oh\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\VC\Include\io.h(135) :
warning C4820: '_wfinddata64_t' : '4' bytes padding added after data member '_wfinddata64_t::attrib'
C:\Users\oh\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\WinSDK\Include\basetsd.
h(114) : warning C4668: '__midl' is not defined as a preprocessor macro, replacing with '0' for '#if/
#elif'
C:\Users\oh\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\WinSDK\Include\basetsd.
h(424) : warning C4668: '_WIN32_WINNT' is not defined as a preprocessor macro, replacing with '0' for
'#if/#elif'
C:\Users\oh\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\VC\Include\stdio.h(62)
: warning C4820: '_iobuf' : '4' bytes padding added after data member '_iobuf::_cnt'
c:\anaconda\include\pyport.h(206) : warning C4668: 'SIZEOF_PID_T' is not defined as a preprocessor ma
cro, replacing with '0' for '#if/#elif'
C:\Users\oh\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\VC\Include\math.h(41) :
warning C4820: '_exception' : '4' bytes padding added after data member '_exception::type'
C:\Users\oh\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\VC\Include\sys/stat.h(1
11) : warning C4820: '_stat32' : '2' bytes padding added after data member '_stat32::st_gid'
C:\Users\oh\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\VC\Include\sys/stat.h(1
27) : warning C4820: 'stat' : '2' bytes padding added after data member 'stat::st_gid'
C:\Users\oh\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\VC\Include\sys/stat.h(1
43) : warning C4820: '_stat32i64' : '2' bytes padding added after data member '_stat32i64::st_gid'
C:\Users\oh\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\VC\Include\sys/stat.h(1
44) : warning C4820: '_stat32i64' : '4' bytes padding added after data member '_stat32i64::st_rdev'
C:\Users\oh\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\VC\Include\sys/stat.h(1
48) : warning C4820: '_stat32i64' : '4' bytes padding added after data member '_stat32i64::st_ctime'
C:\Users\oh\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\VC\Include\sys/stat.h(1
57) : warning C4820: '_stat64i32' : '2' bytes padding added after data member '_stat64i32::st_gid'
C:\Users\oh\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\VC\Include\sys/stat.h(1
71) : warning C4820: '_stat64' : '2' bytes padding added after data member '_stat64::st_gid'
C:\Users\oh\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.0\VC\Include\sys/stat.h(1
72) : warning C4820: '_stat64' : '4' bytes padding added after data member '_stat64::st_rdev'
c:\anaconda\include\object.h(358) : warning C4820: '_typeobject' : '4' bytes padding added after data
member '_typeobject::tp_flags'
c:\anaconda\include\object.h(411) : warning C4820: '_typeobject' : '4' bytes padding added after data
member '_typeobject::tp_version_tag'
c:\anaconda\include\unicodeobject.h(420) : warning C4820: '' : '4' bytes padding added a
fter data member '::hash'
c:\anaconda\include\intobject.h(26) : warning C4820: '' : '4' bytes padding added after
data member '::ob_ival'
c:\anaconda\include\stringobject.h(49) : warning C4820: '' : '7' bytes padding added aft
er data member '::ob_sval'
c:\anaconda\include\bytearrayobject.h(26) : warning C4820: '' : '4' bytes padding added
after data member '::ob_exports'
c:\anaconda\include\setobject.h(26) : warning C4820: '' : '4' bytes padding added after
data member '::hash'
c:\anaconda\include\setobject.h(56) : warning C4820: '_setobject' : '4' bytes padding added after dat
a member '_setobject::hash'
c:\anaconda\include\methodobject.h(42) : warning C4820: 'PyMethodDef' : '4' bytes padding added after
data member 'PyMethodDef::ml_flags'
c:\anaconda\include\fileobject.h(26) : warning C4820: '' : '4' bytes padding added after
data member '::f_skipnextlf'
c:\anaconda\include\fileobject.h(33) : warning C4820: '' : '4' bytes padding added after
data member '::writable'
c:\anaconda\include\genobject.h(23) : warning C4820: '' : '4' bytes padding added after
data member '::gi_running'
c:\anaconda\include\descrobject.h(28) : warning C4820: 'wrapperbase' : '4' bytes padding added after
data member 'wrapperbase::offset'
c:\anaconda\include\descrobject.h(32) : warning C4820: 'wrapperbase' : '4' bytes padding added after
data member 'wrapperbase::flags'
c:\anaconda\include\weakrefobject.h(37) : warning C4820: '_PyWeakReference' : '4' bytes padding added
after data member '_PyWeakReference::hash'
c:\anaconda\include\pystate.h(70) : warning C4820: '_ts' : '4' bytes padding added after data member
'_ts::use_tracing'
c:\anaconda\include\import.h(61) : warning C4820: '_frozen' : '4' bytes padding added after data memb
er '_frozen::size'
c:\anaconda\include\code.h(26) : warning C4820: '' : '4' bytes padding added after data
member '::co_firstlineno'
./cudarray\wrap\cudart.cpp(248) : fatal error C1083: Cannot open include file: 'driver_types.h': No s
uch file or directory
error: command 'C:\Users\oh\AppData\Local\Programs\Common\Microsoft\Visual C++ for Python\9.
0\VC\Bin\amd64\cl.exe' failed with exit status 2
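
The fatal error above says the compiler could not find 'driver_types.h' on any of the -I paths it was given. A minimal sketch for checking where that header actually lives is shown below. The helper names are hypothetical, and it assumes the CUDA toolkit installer has set the CUDA_PATH environment variable; if it has not, pass the directories to try by hand.

```python
import os


def find_header(name, include_dirs):
    """Return the first directory in include_dirs that contains name, else None."""
    for d in include_dirs:
        if os.path.isfile(os.path.join(d, name)):
            return d
    return None


def candidate_cuda_include_dirs(environ=os.environ):
    """Collect include directories to try.

    Assumption: CUDA_PATH points at the toolkit root and headers sit in
    its 'include' subdirectory.
    """
    dirs = []
    cuda_path = environ.get("CUDA_PATH")
    if cuda_path:
        dirs.append(os.path.join(cuda_path, "include"))
    return dirs


if __name__ == "__main__":
    # Prints the directory holding the header, or None if it was not found.
    print(find_header("driver_types.h", candidate_cuda_include_dirs()))
```

If this prints a directory that differs from the -I path in the cl.exe command line, the include path used by the build is the thing to look at.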

Install issue on Mac OS X

Hi, I tried to install it on my MacBook without the CUDA back-end, but I get an error message like "ImportError: No module named _build_utils.apple_accelerate".
Can you help me solve it?

The whole message is shown below

running install
Checking .pth file support in /Library/Python/2.7/site-packages/
/usr/bin/python -E -c pass
TEST PASSED: /Library/Python/2.7/site-packages/ appears to support .pth files
running bdist_egg
running egg_info
writing requirements to cudarray.egg-info/requires.txt
writing cudarray.egg-info/PKG-INFO
writing top-level names to cudarray.egg-info/top_level.txt
writing dependency_links to cudarray.egg-info/dependency_links.txt
reading manifest file 'cudarray.egg-info/SOURCES.txt'
writing manifest file 'cudarray.egg-info/SOURCES.txt'
installing library code to build/bdist.macosx-10.10-intel/egg
running install_lib
running build_py
running build_ext
creating build/bdist.macosx-10.10-intel/egg
creating build/bdist.macosx-10.10-intel/egg/cudarray
copying build/lib.macosx-10.10-intel-2.7/cudarray/__init__.py -> build/bdist.macosx-10.10-intel/egg/cudarray
copying build/lib.macosx-10.10-intel-2.7/cudarray/base.py -> build/bdist.macosx-10.10-intel/egg/cudarray
creating build/bdist.macosx-10.10-intel/egg/cudarray/batch
copying build/lib.macosx-10.10-intel-2.7/cudarray/batch/__init__.py -> build/bdist.macosx-10.10-intel/egg/cudarray/batch
copying build/lib.macosx-10.10-intel-2.7/cudarray/batch/linalg.py -> build/bdist.macosx-10.10-intel/egg/cudarray/batch
copying build/lib.macosx-10.10-intel-2.7/cudarray/cudarray.py -> build/bdist.macosx-10.10-intel/egg/cudarray
copying build/lib.macosx-10.10-intel-2.7/cudarray/elementwise.py -> build/bdist.macosx-10.10-intel/egg/cudarray
creating build/bdist.macosx-10.10-intel/egg/cudarray/extra
copying build/lib.macosx-10.10-intel-2.7/cudarray/extra/__init__.py -> build/bdist.macosx-10.10-intel/egg/cudarray/extra
copying build/lib.macosx-10.10-intel-2.7/cudarray/extra/array.py -> build/bdist.macosx-10.10-intel/egg/cudarray/extra
copying build/lib.macosx-10.10-intel-2.7/cudarray/helpers.py -> build/bdist.macosx-10.10-intel/egg/cudarray
copying build/lib.macosx-10.10-intel-2.7/cudarray/linalg.py -> build/bdist.macosx-10.10-intel/egg/cudarray
creating build/bdist.macosx-10.10-intel/egg/cudarray/nnet
copying build/lib.macosx-10.10-intel-2.7/cudarray/nnet/__init__.py -> build/bdist.macosx-10.10-intel/egg/cudarray/nnet
copying build/lib.macosx-10.10-intel-2.7/cudarray/nnet/conv.py -> build/bdist.macosx-10.10-intel/egg/cudarray/nnet
copying build/lib.macosx-10.10-intel-2.7/cudarray/nnet/image.py -> build/bdist.macosx-10.10-intel/egg/cudarray/nnet
copying build/lib.macosx-10.10-intel-2.7/cudarray/nnet/math.py -> build/bdist.macosx-10.10-intel/egg/cudarray/nnet
copying build/lib.macosx-10.10-intel-2.7/cudarray/nnet/pool.py -> build/bdist.macosx-10.10-intel/egg/cudarray/nnet
copying build/lib.macosx-10.10-intel-2.7/cudarray/nnet/special.py -> build/bdist.macosx-10.10-intel/egg/cudarray/nnet
creating build/bdist.macosx-10.10-intel/egg/cudarray/numpy_backend
copying build/lib.macosx-10.10-intel-2.7/cudarray/numpy_backend/__init__.py -> build/bdist.macosx-10.10-intel/egg/cudarray/numpy_backend
creating build/bdist.macosx-10.10-intel/egg/cudarray/numpy_backend/nnet
copying build/lib.macosx-10.10-intel-2.7/cudarray/numpy_backend/nnet/__init__.py -> build/bdist.macosx-10.10-intel/egg/cudarray/numpy_backend/nnet
copying build/lib.macosx-10.10-intel-2.7/cudarray/numpy_backend/nnet/activations.py -> build/bdist.macosx-10.10-intel/egg/cudarray/numpy_backend/nnet
copying build/lib.macosx-10.10-intel-2.7/cudarray/numpy_backend/nnet/conv.py -> build/bdist.macosx-10.10-intel/egg/cudarray/numpy_backend/nnet
copying build/lib.macosx-10.10-intel-2.7/cudarray/numpy_backend/nnet/conv_bc01.so -> build/bdist.macosx-10.10-intel/egg/cudarray/numpy_backend/nnet
copying build/lib.macosx-10.10-intel-2.7/cudarray/numpy_backend/nnet/lrnorm_bc01.so -> build/bdist.macosx-10.10-intel/egg/cudarray/numpy_backend/nnet
copying build/lib.macosx-10.10-intel-2.7/cudarray/numpy_backend/nnet/pool.py -> build/bdist.macosx-10.10-intel/egg/cudarray/numpy_backend/nnet
copying build/lib.macosx-10.10-intel-2.7/cudarray/numpy_backend/nnet/pool_bc01.so -> build/bdist.macosx-10.10-intel/egg/cudarray/numpy_backend/nnet
copying build/lib.macosx-10.10-intel-2.7/cudarray/numpy_backend/nnet/special.py -> build/bdist.macosx-10.10-intel/egg/cudarray/numpy_backend/nnet
copying build/lib.macosx-10.10-intel-2.7/cudarray/random.py -> build/bdist.macosx-10.10-intel/egg/cudarray
copying build/lib.macosx-10.10-intel-2.7/cudarray/reduction.py -> build/bdist.macosx-10.10-intel/egg/cudarray
creating build/bdist.macosx-10.10-intel/egg/cudarray/wrap
copying build/lib.macosx-10.10-intel-2.7/cudarray/wrap/__init__.py -> build/bdist.macosx-10.10-intel/egg/cudarray/wrap
byte-compiling build/bdist.macosx-10.10-intel/egg/cudarray/__init__.py to __init__.pyc
byte-compiling build/bdist.macosx-10.10-intel/egg/cudarray/base.py to base.pyc
byte-compiling build/bdist.macosx-10.10-intel/egg/cudarray/batch/__init__.py to __init__.pyc
byte-compiling build/bdist.macosx-10.10-intel/egg/cudarray/batch/linalg.py to linalg.pyc
byte-compiling build/bdist.macosx-10.10-intel/egg/cudarray/cudarray.py to cudarray.pyc
byte-compiling build/bdist.macosx-10.10-intel/egg/cudarray/elementwise.py to elementwise.pyc
byte-compiling build/bdist.macosx-10.10-intel/egg/cudarray/extra/__init__.py to __init__.pyc
byte-compiling build/bdist.macosx-10.10-intel/egg/cudarray/extra/array.py to array.pyc
byte-compiling build/bdist.macosx-10.10-intel/egg/cudarray/helpers.py to helpers.pyc
byte-compiling build/bdist.macosx-10.10-intel/egg/cudarray/linalg.py to linalg.pyc
byte-compiling build/bdist.macosx-10.10-intel/egg/cudarray/nnet/__init__.py to __init__.pyc
byte-compiling build/bdist.macosx-10.10-intel/egg/cudarray/nnet/conv.py to conv.pyc
byte-compiling build/bdist.macosx-10.10-intel/egg/cudarray/nnet/image.py to image.pyc
byte-compiling build/bdist.macosx-10.10-intel/egg/cudarray/nnet/math.py to math.pyc
byte-compiling build/bdist.macosx-10.10-intel/egg/cudarray/nnet/pool.py to pool.pyc
byte-compiling build/bdist.macosx-10.10-intel/egg/cudarray/nnet/special.py to special.pyc
byte-compiling build/bdist.macosx-10.10-intel/egg/cudarray/numpy_backend/__init__.py to __init__.pyc
byte-compiling build/bdist.macosx-10.10-intel/egg/cudarray/numpy_backend/nnet/__init__.py to __init__.pyc
byte-compiling build/bdist.macosx-10.10-intel/egg/cudarray/numpy_backend/nnet/activations.py to activations.pyc
byte-compiling build/bdist.macosx-10.10-intel/egg/cudarray/numpy_backend/nnet/conv.py to conv.pyc
byte-compiling build/bdist.macosx-10.10-intel/egg/cudarray/numpy_backend/nnet/pool.py to pool.pyc
byte-compiling build/bdist.macosx-10.10-intel/egg/cudarray/numpy_backend/nnet/special.py to special.pyc
byte-compiling build/bdist.macosx-10.10-intel/egg/cudarray/random.py to random.pyc
byte-compiling build/bdist.macosx-10.10-intel/egg/cudarray/reduction.py to reduction.pyc
byte-compiling build/bdist.macosx-10.10-intel/egg/cudarray/wrap/__init__.py to __init__.pyc
creating stub loader for cudarray/numpy_backend/nnet/conv_bc01.so
creating stub loader for cudarray/numpy_backend/nnet/pool_bc01.so
creating stub loader for cudarray/numpy_backend/nnet/lrnorm_bc01.so
byte-compiling build/bdist.macosx-10.10-intel/egg/cudarray/numpy_backend/nnet/conv_bc01.py to conv_bc01.pyc
byte-compiling build/bdist.macosx-10.10-intel/egg/cudarray/numpy_backend/nnet/pool_bc01.py to pool_bc01.pyc
byte-compiling build/bdist.macosx-10.10-intel/egg/cudarray/numpy_backend/nnet/lrnorm_bc01.py to lrnorm_bc01.pyc
creating build/bdist.macosx-10.10-intel/egg/EGG-INFO
copying cudarray.egg-info/PKG-INFO -> build/bdist.macosx-10.10-intel/egg/EGG-INFO
copying cudarray.egg-info/SOURCES.txt -> build/bdist.macosx-10.10-intel/egg/EGG-INFO
copying cudarray.egg-info/dependency_links.txt -> build/bdist.macosx-10.10-intel/egg/EGG-INFO
copying cudarray.egg-info/not-zip-safe -> build/bdist.macosx-10.10-intel/egg/EGG-INFO
copying cudarray.egg-info/requires.txt -> build/bdist.macosx-10.10-intel/egg/EGG-INFO
copying cudarray.egg-info/top_level.txt -> build/bdist.macosx-10.10-intel/egg/EGG-INFO
writing build/bdist.macosx-10.10-intel/egg/EGG-INFO/native_libs.txt
creating 'dist/cudarray-0.1.dev-py2.7-macosx-10.10-intel.egg' and adding 'build/bdist.macosx-10.10-intel/egg' to it
removing 'build/bdist.macosx-10.10-intel/egg' (and everything under it)
Processing cudarray-0.1.dev-py2.7-macosx-10.10-intel.egg
removing '/Library/Python/2.7/site-packages/cudarray-0.1.dev-py2.7-macosx-10.10-intel.egg' (and everything under it)
creating /Library/Python/2.7/site-packages/cudarray-0.1.dev-py2.7-macosx-10.10-intel.egg
Extracting cudarray-0.1.dev-py2.7-macosx-10.10-intel.egg to /Library/Python/2.7/site-packages
cudarray 0.1.dev is already the active version in easy-install.pth

Installed /Library/Python/2.7/site-packages/cudarray-0.1.dev-py2.7-macosx-10.10-intel.egg
Processing dependencies for cudarray==0.1.dev
Searching for numpy>=1.8
Reading https://pypi.python.org/simple/numpy/
Best match: numpy 1.10.1
Downloading https://pypi.python.org/packages/source/n/numpy/numpy-1.10.1.zip#md5=6f57c58bc5b28440fbeccd505da63d58
Processing numpy-1.10.1.zip
Writing /tmp/easy_install-0FU_YO/numpy-1.10.1/setup.cfg
Running numpy-1.10.1/setup.py -q bdist_egg --dist-dir /tmp/easy_install-0FU_YO/numpy-1.10.1/egg-dist-tmp-TgA5CX
Warning: distutils distribution has been initialized, it may be too late to add a subpackage compat
Traceback (most recent call last):
File "setup.py", line 161, in <module>
zip_safe=False,
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/core.py", line 151, in setup
dist.run_commands()
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/dist.py", line 953, in run_commands
self.run_command(cmd)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/distutils/dist.py", line 972, in run_command
cmd_obj.run()
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/setuptools/command/install.py", line 73, in run
self.do_egg_install()
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/setuptools/command/install.py", line 101, in do_egg_install
cmd.run()
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/setuptools/command/easy_install.py", line 374, in run
self.easy_install(spec, not self.no_deps)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/setuptools/command/easy_install.py", line 590, in easy_install
return self.install_item(None, spec, tmpdir, deps, True)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/setuptools/command/easy_install.py", line 641, in install_item
self.process_distribution(spec, dist, deps)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/setuptools/command/easy_install.py", line 687, in process_distribution
[requirement], self.local_index, self.easy_install
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 568, in resolve
dist = best[req.key] = env.best_match(req, self, installer)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 806, in best_match
return self.obtain(req, installer) # try and download/install
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/pkg_resources.py", line 818, in obtain
return installer(requirement)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/setuptools/command/easy_install.py", line 609, in easy_install
return self.install_item(spec, dist.location, tmpdir, deps)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/setuptools/command/easy_install.py", line 639, in install_item
dists = self.install_eggs(spec, download, tmpdir)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/setuptools/command/easy_install.py", line 825, in install_eggs
return self.build_and_install(setup_script, setup_base)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/setuptools/command/easy_install.py", line 1031, in build_and_install
self.run_setup(setup_script, setup_base, args)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/setuptools/command/easy_install.py", line 1016, in run_setup
run_setup(setup_script, args)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/setuptools/sandbox.py", line 69, in run_setup
lambda: execfile(
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/setuptools/sandbox.py", line 120, in run
return func()
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/setuptools/sandbox.py", line 71, in <lambda>
{'__file__':setup_script, '__name__':'__main__'}
File "setup.py", line 264, in <module>

File "setup.py", line 256, in setup_package

File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/distutils/core.py", line 135, in setup
config = configuration()
File "setup.py", line 156, in configuration
},
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/distutils/misc_util.py", line 966, in add_subpackage
caller_level = 2)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/distutils/misc_util.py", line 935, in get_subpackage
caller_level = caller_level + 1)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/distutils/misc_util.py", line 872, in _get_configuration_from_setup_py
config = setup_module.configuration(*args)
File "numpy/setup.py", line 11, in configuration
from Cython.Distutils.extension import Extension
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/distutils/misc_util.py", line 966, in add_subpackage
caller_level = 2)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/distutils/misc_util.py", line 935, in get_subpackage
caller_level = caller_level + 1)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy/distutils/misc_util.py", line 847, in _get_configuration_from_setup_py
('.py', 'U', 1))
File "/private/tmp/easy_install-0FU_YO/numpy-1.10.1/numpy/core/setup.py", line 13, in

ImportError: No module named _build_utils.apple_accelerate

Error while installing

I'm trying to install cudarray, but the installation terminates with the following error:

c++ -bundle -undefined dynamic_lookup -arch x86_64 -arch i386 -Wl,-F. build/temp.macosx-10.10-intel-2.7/./cudarray/wrap/cudart.o -L/usr/local/cuda/lib -lcudart -lcudarray -o build/lib.macosx-10.10-intel-2.7/cudarray/wrap/cudart.so -fPIC
ld: library not found for -lcudarray
clang: error: linker command failed with exit code 1 (use -v to see invocation)
error: command 'c++' failed with exit status 1

Any way to use MKL speedup in numpy?

Hi all,

I've tried accelerating cudarray computations with Anaconda's MKL packages, but I didn't see any speedup at all (Python still used only one core). Is there a way to make use of the acceleration?

Thanks!

Daniel
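
As a starting point for this kind of question, the sketch below reports the thread-related settings that MKL and OpenMP builds commonly read, plus NumPy's build configuration if NumPy is importable. The function name is hypothetical. Whether these settings affect CudArray's NumPy backend is an assumption: the compiled extensions that appear in the install logs (conv_bc01, pool_bc01, lrnorm_bc01) are separate from NumPy's BLAS calls, so they may not benefit at all.

```python
import os

# Environment variables commonly read by MKL and OpenMP runtimes.
THREAD_VARS = ("MKL_NUM_THREADS", "OMP_NUM_THREADS")


def threading_report(environ=os.environ):
    """Return the thread-related environment settings and the CPU count."""
    report = {name: environ.get(name) for name in THREAD_VARS}
    report["cpu_count"] = os.cpu_count()
    return report


if __name__ == "__main__":
    for key, value in threading_report().items():
        print(key, "=", value)
    try:
        import numpy
        numpy.show_config()  # lists the BLAS/LAPACK libraries NumPy was built against
    except ImportError:
        print("numpy is not installed")
```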

cudarray/numpy_backend/nnet/conv_bc01.c: No such file or directory

Could you help me see what is wrong?
running install
install_dir /usr/local/lib/python2.7/dist-packages/
Checking .pth file support in /usr/local/lib/python2.7/dist-packages/
/usr/bin/python -E -c pass
TEST PASSED: /usr/local/lib/python2.7/dist-packages/ appears to support .pth files
running bdist_egg
running egg_info
writing requirements to cudarray.egg-info/requires.txt
writing cudarray.egg-info/PKG-INFO
writing top-level names to cudarray.egg-info/top_level.txt
writing dependency_links to cudarray.egg-info/dependency_links.txt
reading manifest file 'cudarray.egg-info/SOURCES.txt'
writing manifest file 'cudarray.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
running build_ext
building 'cudarray.numpy_backend.nnet.conv_bc01' extension
x86_64-linux-gnu-gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC -I/usr/local/lib/python2.7/dist-packages/numpy/core/include -I/usr/include/python2.7 -c cudarray/numpy_backend/nnet/conv_bc01.c -o build/temp.linux-x86_64-2.7/cudarray/numpy_backend/nnet/conv_bc01.o
x86_64-linux-gnu-gcc: error: cudarray/numpy_backend/nnet/conv_bc01.c: No such file or directory
x86_64-linux-gnu-gcc: fatal error: no input files
compilation terminated.
error: command 'x86_64-linux-gnu-gcc' failed with exit status 4

make fails when CUDNN_ENABLED

I'm getting a compile error when using CUDNN:

g++ -DCUDNN_ENABLED -O3 -fPIC -Wall -Wfatal-errors -I./include -I/usr/local/cuda/include -c -o src/nnet/conv_bc01_matmul.o src/nnet/conv_bc01_matmul.cpp
nvcc -gencode arch=compute_20,code=sm_20 -gencode arch=compute_20,code=compute_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_30,code=compute_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_35,code=compute_35 -O3 --compiler-options '-DCUDNN_ENABLED -O3 -fPIC -Wall -Wfatal-errors' --ftz=true --prec-div=false -prec-sqrt=false --fmad=true -I./include -I/usr/local/cuda/include -c -o src/nnet/pool_b01.o src/nnet/pool_b01.cu
g++ -DCUDNN_ENABLED -O3 -fPIC -Wall -Wfatal-errors -I./include -I/usr/local/cuda/include -c -o src/nnet/cudnn.o src/nnet/cudnn.cpp
In file included from src/nnet/cudnn.cpp:5:0:
src/nnet/cudnn.cpp: In instantiation of ‘void cudarray::ConvBC01CuDNN<T>::bprop(const T*, const T*, const T*, T*, T*) [with T = float]’:
src/nnet/cudnn.cpp:216:16:   required from here
src/nnet/cudnn.cpp:206:5: error: cannot convert ‘const float*’ to ‘cudnnConvolutionBwdFilterAlgo_t’ for argument ‘8’ to ‘cudnnStatus_t cudnnConvolutionBackwardFilter(cudnnHandle_t, const void*, cudnnTensorDescriptor_t, const void*, cudnnTensorDescriptor_t, const void*, cudnnConvolutionDescriptor_t, cudnnConvolutionBwdFilterAlgo_t, void*, size_t, const void*, cudnnFilterDescriptor_t, void*)’
     ));
     ^
./include/cudarray/nnet/cudnn.hpp:85:44: note: in definition of macro ‘CUDNN_CHECK’
 #define CUDNN_CHECK(status) { cudnn_check((status), __FILE__, __LINE__); }
                                            ^
compilation terminated due to -Wfatal-errors.
make: *** [src/nnet/cudnn.o] Error 1

I've got the following versions installed:

CUDA 7.5.18
CUDNN 4.0

Seems to compile OK without CUDNN.

Any ideas? If there's something I can provide that might be useful please let me know.

And thanks for the awesome package and tools!

std::runtime_error with too big a batch

When the batch size is too big, I get the following error:

terminate called after throwing an instance of 'std::runtime_error'
  what():  src/nnet/pool_b01.cu:50: invalid configuration argument
/var/spool/torque/mom_priv/jobs/847742.hnode2.SC: line 25: 20375 Aborted

I printed
std::cout << "blocks "; std::cout << cuda_blocks(n_threads);

from pool_b01 and got : blocks 16988

The GPU I'm using has the following specs:

Detected 1 CUDA Capable device(s)

Device 0: "Tesla M2050"
  CUDA Driver Version / Runtime Version          6.5 / 6.5
  CUDA Capability Major/Minor version number:    2.0
  Total amount of global memory:                 2687 MBytes (2817982464 bytes)
  (14) Multiprocessors, ( 32) CUDA Cores/MP:     448 CUDA Cores
  GPU Clock rate:                                1147 MHz (1.15 GHz)
  Memory Clock rate:                             1566 Mhz
  Memory Bus Width:                              384-bit
  L2 Cache Size:                                 786432 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65535), 3D=(2048, 2048, 2048)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 32768
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  1536
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
  Max dimension size of a grid size    (x,y,z): (65535, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 2 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Enabled
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Bus ID / PCI location ID:           20 / 0
  Compute Mode:
     < Exclusive (only one host thread in one process is able to use ::cudaSetDevice() with this device) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 6.5, CUDA Runtime Version = 6.5, NumDevs = 1, Device0 = Tesla M2050
Result = PASS

missing vtable error when doing make

I got the following error when running make on macOS. Is it related to the C++ compiler?
nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2014 NVIDIA Corporation
Built on Thu_Jul_17_19:13:24_CDT_2014
Cuda compilation tools, release 6.5, V6.5.12

NOTE: a missing vtable usually means the first non-inline virtual member function has no definition.
"vtable for std::basic_stringbuf<char, std::char_traits<char>, std::allocator<char> >", referenced from:
cudarray::cuda_kernel_check(char const*, int) in pool_b01.o
cudarray::cuda_kernel_check(char const*, int) in array_ops.o
cudarray::cuda_check(cudaError, char const*, int) in array_ops.o
cudarray::cuda_check(cudaError, char const*, int) in reduction.o
cudarray::curand_check(curandStatus, char const*, int) in random.o
cudarray::cuda_kernel_check(char const*, int) in img2win.o
cudarray::cuda_kernel_check(char const*, int) in rescale.o
...
NOTE: a missing vtable usually means the first non-inline virtual member function has no definition.
ld: symbol(s) not found for architecture x86_64
clang: fatal error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [build/libcudarray.so] Error 1

Install Problems Ubuntu 14.04 with cuDNN and CUDA 7.5, Anaconda2, PyCharm IDE, numpy 1.10

Ultimately I'm getting the following error:

Traceback (most recent call last):
  File "/home/philglau/PycharmProjects/testing/cudarry_test.py", line 7, in <module>
    import cudarray as ca
  File "/home/philglau/anaconda2/lib/python2.7/site-packages/cudarray-0.1.dev0-py2.7-linux-x86_64.egg/cudarray/__init__.py", line 20, in <module>
    from .cudarray import *
  File "/home/philglau/anaconda2/lib/python2.7/site-packages/cudarray-0.1.dev0-py2.7-linux-x86_64.egg/cudarray/cudarray.py", line 2, in <module>
    from .wrap.array_data import ArrayData
ImportError: libcudart.so.7.5: cannot open shared object file: No such file or directory

I have CUDA and cuDNN installed and working in /usr/local/cuda. Before running make, the following environment variables were set:

philglau@gpu-workstation:~$ echo $INSTALL_PREFIX
/home/philglau/anaconda2
philglau@gpu-workstation:~$ echo $CUDA_PREFIX
/usr/local/cuda-7.5
philglau@gpu-workstation:~$ echo $CUDNN_ENABLED
1
philglau@gpu-workstation:~$

Inside /usr/local/cuda-7.5/lib64 I can see libcudart.so.7.5:

philglau@gpu-workstation:/usr/local/cuda-7.5/lib64$ ll *libcudart.so*
lrwxrwxrwx 1 root root     16 Aug 15 08:55 libcudart.so -> libcudart.so.7.5*
lrwxrwxrwx 1 root root     19 Aug 15 08:55 libcudart.so.7.5 -> libcudart.so.7.5.18*
-rwxr-xr-x 1 root root 383336 Aug 15 08:55 libcudart.so.7.5.18*
philglau@gpu-workstation:/usr/local/cuda-7.5/lib64$

I tried make with CUDA prefix set to both /usr/local/cuda and /usr/local/cuda-7.5 but that didn't make a difference. (I didn't think it would)

Any thoughts on what I should try to do to resolve this error? Thank you in advance for your help.
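One check that may help narrow this down, offered as a sketch rather than a confirmed diagnosis: ask the dynamic loader, from the same Python interpreter that PyCharm launches, whether it can open libcudart.so.7.5 at all. The helper name can_load is made up for illustration; it relies only on ctypes.CDLL from the standard library.

```python
import ctypes


def can_load(library_name):
    """Return (True, None) if the dynamic loader can open the library,
    otherwise (False, error_message)."""
    try:
        ctypes.CDLL(library_name)
    except OSError as exc:
        return False, str(exc)
    return True, None


if __name__ == "__main__":
    # Library name taken from the ImportError above.
    ok, error = can_load("libcudart.so.7.5")
    print("loadable" if ok else "not loadable: %s" % error)
```

If this succeeds in a terminal but fails when run from the IDE, the two processes probably see different environments (for example, LD_LIBRARY_PATH not containing /usr/local/cuda-7.5/lib64 in the IDE). That is an assumption to verify, not something the traceback proves.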

Issues with installing/running CUDA back-end

Hi, I've been stuck on this for a while (I'm new to this).
I have installed CUDA from the NVIDIA website. I'm trying to use CUDArray to run your neural_artistic_style algorithm on my GPU. On the CPU it works fine (super slow, but it works).

While installing CUDArray I set the environment variables, but when I run make I get this message:

nvcc -gencode arch=compute_20,code=sm_20 -gencode arch=compute_20,code=compute_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_30,code=compute_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_35,code=compute_35 -O3 --compiler-options '-DCUDNN_ENABLED -O3 -fPIC -Wall -Wfatal-errors' --ftz=true --prec-div=false -prec-sqrt=false --fmad=true -I./include -I/usr/local/cuda/include -c -o src/nnet/pool_b01.o src/nnet/pool_b01.cu
make: nvcc: No such file or directory
make: *** [src/nnet/pool_b01.o] Error 1

I have found the nvcc file at /usr/local/cuda/bin/. I've also tried setting LD_LIBRARY_PATH='/usr/local/cuda/bin/' before running make, but that didn't help either.
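For context, executables such as nvcc are looked up through PATH, while LD_LIBRARY_PATH only affects where shared libraries are searched. A small sketch to see whether nvcc becomes visible once the CUDA bin directory is on the search path; find_executable is a hypothetical helper name, and /usr/local/cuda/bin is simply the directory mentioned in this report.

```python
import os
import shutil


def find_executable(name, extra_dir=None):
    """Look up an executable on PATH, optionally with one extra directory appended."""
    search_path = os.environ.get("PATH", "")
    if extra_dir:
        search_path = search_path + os.pathsep + extra_dir
    return shutil.which(name, path=search_path)


if __name__ == "__main__":
    print("nvcc on current PATH:", find_executable("nvcc"))
    # Directory taken from the report above.
    print("nvcc with CUDA bin dir added:",
          find_executable("nvcc", "/usr/local/cuda/bin"))
```

If only the second lookup finds nvcc, adding that directory to PATH in the shell that runs make is the likely next step.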

Any tips? What am I doing wrong?

Thanks a lot!

python setup.py install error

python setup.py install
error in cudarray setup command: CUDA back-end wants to be able to remove cudarray.cudarray_wrap, but the distribution doesn't contain any packages or modules under cudarray.cudarray_wrap

How can I solve this? Thank you very much.

Max operation

Hello, I read in the docs that max over an array is supported, but I cannot find a way to use it. Could you give an example? Thank you!

setup.py install problem

Greetings, great project here; trying to get this to work with your fantastic neural artistic style project so the generation of images there doesn't take days (on a CPU).

As I don't have a GPU, I set up an Amazon GPU instance provided by Nvidia called, "Amazon Linux AMI with NVIDIA GRID GPU Driver", and downloaded the cudnn package from Nvidia's website.

Now when I try to install your package I get the following error. I pasted all the way from the make command:

[ec2-user@ip-172-31-3-36 cudarray-master]$ make
g++ -DCUDNN_ENABLED -O3 -fPIC -Wall -Wfatal-errors -I./include -I/opt/nvidia/cuda/include -c -o src/nnet/conv_bc01_matmul.o src/nnet/conv_bc01_matmul.cpp
nvcc -gencode arch=compute_20,code=sm_20 -gencode arch=compute_20,code=compute_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_30,code=compute_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_35,code=compute_35 -O3 --compiler-options '-DCUDNN_ENABLED -O3 -fPIC -Wall -Wfatal-errors' --ftz=true --prec-div=false -prec-sqrt=false --fmad=true -I./include -I/opt/nvidia/cuda/include -c -o src/nnet/pool_b01.o src/nnet/pool_b01.cu
g++ -DCUDNN_ENABLED -O3 -fPIC -Wall -Wfatal-errors -I./include -I/opt/nvidia/cuda/include -c -o src/nnet/cudnn.o src/nnet/cudnn.cpp
nvcc -gencode arch=compute_20,code=sm_20 -gencode arch=compute_20,code=compute_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_30,code=compute_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_35,code=compute_35 -O3 --compiler-options '-DCUDNN_ENABLED -O3 -fPIC -Wall -Wfatal-errors' --ftz=true --prec-div=false -prec-sqrt=false --fmad=true -I./include -I/opt/nvidia/cuda/include -c -o src/array_ops.o src/array_ops.cu
nvcc -gencode arch=compute_20,code=sm_20 -gencode arch=compute_20,code=compute_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_30,code=compute_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_35,code=compute_35 -O3 --compiler-options '-DCUDNN_ENABLED -O3 -fPIC -Wall -Wfatal-errors' --ftz=true --prec-div=false -prec-sqrt=false --fmad=true -I./include -I/opt/nvidia/cuda/include -c -o src/elementwise.o src/elementwise.cu
nvcc -gencode arch=compute_20,code=sm_20 -gencode arch=compute_20,code=compute_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_30,code=compute_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_35,code=compute_35 -O3 --compiler-options '-DCUDNN_ENABLED -O3 -fPIC -Wall -Wfatal-errors' --ftz=true --prec-div=false -prec-sqrt=false --fmad=true -I./include -I/opt/nvidia/cuda/include -c -o src/reduction.o src/reduction.cu
g++ -DCUDNN_ENABLED -O3 -fPIC -Wall -Wfatal-errors -I./include -I/opt/nvidia/cuda/include -c -o src/blas.o src/blas.cpp
nvcc -gencode arch=compute_20,code=sm_20 -gencode arch=compute_20,code=compute_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_30,code=compute_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_35,code=compute_35 -O3 --compiler-options '-DCUDNN_ENABLED -O3 -fPIC -Wall -Wfatal-errors' --ftz=true --prec-div=false -prec-sqrt=false --fmad=true -I./include -I/opt/nvidia/cuda/include -c -o src/random.o src/random.cu
nvcc -gencode arch=compute_20,code=sm_20 -gencode arch=compute_20,code=compute_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_30,code=compute_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_35,code=compute_35 -O3 --compiler-options '-DCUDNN_ENABLED -O3 -fPIC -Wall -Wfatal-errors' --ftz=true --prec-div=false -prec-sqrt=false --fmad=true -I./include -I/opt/nvidia/cuda/include -c -o src/image/img2win.o src/image/img2win.cu
nvcc -gencode arch=compute_20,code=sm_20 -gencode arch=compute_20,code=compute_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_30,code=compute_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_35,code=compute_35 -O3 --compiler-options '-DCUDNN_ENABLED -O3 -fPIC -Wall -Wfatal-errors' --ftz=true --prec-div=false -prec-sqrt=false --fmad=true -I./include -I/opt/nvidia/cuda/include -c -o src/nnet/one_hot.o src/nnet/one_hot.cu
mkdir -p ./build
g++ -shared -DCUDNN_ENABLED -O3 -fPIC -Wall -Wfatal-errors -o build/libcudarray.so src/nnet/conv_bc01_matmul.o src/nnet/pool_b01.o src/nnet/cudnn.o src/array_ops.o src/elementwise.o src/reduction.o src/blas.o src/random.o src/image/img2win.o src/nnet/one_hot.o -L/opt/nvidia/cuda/lib64 -L/opt/nvidia/cuda/lib -lcudart -lcublas -lcufft -lcurand -lcudnn
[ec2-user@ip-172-31-3-36 cudarray-master]$ sudo make install
cp ./build/libcudarray.so /usr/local/lib/libcudarray.so
[ec2-user@ip-172-31-3-36 cudarray-master]$ sudo python setup.py install
Compiling cudarray/numpy_backend/nnet/conv_bc01.pyx because it changed.
Compiling cudarray/numpy_backend/nnet/pool_bc01.pyx because it changed.
Compiling cudarray/numpy_backend/nnet/lrnorm_bc01.pyx because it changed.
[1/3] Cythonizing cudarray/numpy_backend/nnet/conv_bc01.pyx
[2/3] Cythonizing cudarray/numpy_backend/nnet/lrnorm_bc01.pyx
[3/3] Cythonizing cudarray/numpy_backend/nnet/pool_bc01.pyx
/usr/lib/python2.7/dist-packages/setuptools/dist.py:282: UserWarning: Normalizing '0.1.dev' to '0.1.dev0'
normalized_version,
running install
running bdist_egg
running egg_info
creating cudarray.egg-info
writing requirements to cudarray.egg-info/requires.txt
writing cudarray.egg-info/PKG-INFO
writing top-level names to cudarray.egg-info/top_level.txt
writing dependency_links to cudarray.egg-info/dependency_links.txt
writing manifest file 'cudarray.egg-info/SOURCES.txt'
reading manifest file 'cudarray.egg-info/SOURCES.txt'
writing manifest file 'cudarray.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
creating build/lib.linux-x86_64-2.7
creating build/lib.linux-x86_64-2.7/cudarray
copying cudarray/elementwise.py -> build/lib.linux-x86_64-2.7/cudarray
copying cudarray/reduction.py -> build/lib.linux-x86_64-2.7/cudarray
copying cudarray/base.py -> build/lib.linux-x86_64-2.7/cudarray
copying cudarray/random.py -> build/lib.linux-x86_64-2.7/cudarray
copying cudarray/cudarray.py -> build/lib.linux-x86_64-2.7/cudarray
copying cudarray/helpers.py -> build/lib.linux-x86_64-2.7/cudarray
copying cudarray/linalg.py -> build/lib.linux-x86_64-2.7/cudarray
copying cudarray/__init__.py -> build/lib.linux-x86_64-2.7/cudarray
creating build/lib.linux-x86_64-2.7/cudarray/wrap
copying cudarray/wrap/__init__.py -> build/lib.linux-x86_64-2.7/cudarray/wrap
creating build/lib.linux-x86_64-2.7/cudarray/numpy_backend
copying cudarray/numpy_backend/__init__.py -> build/lib.linux-x86_64-2.7/cudarray/numpy_backend
creating build/lib.linux-x86_64-2.7/cudarray/nnet
copying cudarray/nnet/special.py -> build/lib.linux-x86_64-2.7/cudarray/nnet
copying cudarray/nnet/math.py -> build/lib.linux-x86_64-2.7/cudarray/nnet
copying cudarray/nnet/pool.py -> build/lib.linux-x86_64-2.7/cudarray/nnet
copying cudarray/nnet/__init__.py -> build/lib.linux-x86_64-2.7/cudarray/nnet
copying cudarray/nnet/conv.py -> build/lib.linux-x86_64-2.7/cudarray/nnet
creating build/lib.linux-x86_64-2.7/cudarray/batch
copying cudarray/batch/linalg.py -> build/lib.linux-x86_64-2.7/cudarray/batch
copying cudarray/batch/__init__.py -> build/lib.linux-x86_64-2.7/cudarray/batch
creating build/lib.linux-x86_64-2.7/cudarray/numpy_backend/nnet
copying cudarray/numpy_backend/nnet/special.py -> build/lib.linux-x86_64-2.7/cudarray/numpy_backend/nnet
copying cudarray/numpy_backend/nnet/activations.py -> build/lib.linux-x86_64-2.7/cudarray/numpy_backend/nnet
copying cudarray/numpy_backend/nnet/pool.py -> build/lib.linux-x86_64-2.7/cudarray/numpy_backend/nnet
copying cudarray/numpy_backend/nnet/__init__.py -> build/lib.linux-x86_64-2.7/cudarray/numpy_backend/nnet
copying cudarray/numpy_backend/nnet/conv.py -> build/lib.linux-x86_64-2.7/cudarray/numpy_backend/nnet
running build_ext
building 'cudarray.numpy_backend.nnet.conv_bc01' extension
creating build/temp.linux-x86_64-2.7
creating build/temp.linux-x86_64-2.7/cudarray
creating build/temp.linux-x86_64-2.7/cudarray/numpy_backend
creating build/temp.linux-x86_64-2.7/cudarray/numpy_backend/nnet
gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/usr/local/lib64/python2.7/site-packages/numpy/core/include -I/usr/include/python2.7 -c cudarray/numpy_backend/nnet/conv_bc01.c -o build/temp.linux-x86_64-2.7/cudarray/numpy_backend/nnet/conv_bc01.o
In file included from /usr/local/lib64/python2.7/site-packages/numpy/core/include/numpy/ndarraytypes.h:1804:0,
from /usr/local/lib64/python2.7/site-packages/numpy/core/include/numpy/ndarrayobject.h:17,
from /usr/local/lib64/python2.7/site-packages/numpy/core/include/numpy/arrayobject.h:4,
from cudarray/numpy_backend/nnet/conv_bc01.c:250:
/usr/local/lib64/python2.7/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:15:2: warning: #warning "Using deprecated NumPy API, disable it by " "#defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
#warning "Using deprecated NumPy API, disable it by "
^
In file included from /usr/local/lib64/python2.7/site-packages/numpy/core/include/numpy/ndarrayobject.h:26:0,
from /usr/local/lib64/python2.7/site-packages/numpy/core/include/numpy/arrayobject.h:4,
from cudarray/numpy_backend/nnet/conv_bc01.c:250:
/usr/local/lib64/python2.7/site-packages/numpy/core/include/numpy/__multiarray_api.h:1629:1: warning: ‘_import_array’ defined but not used [-Wunused-function]
_import_array(void)
^
In file included from /usr/local/lib64/python2.7/site-packages/numpy/core/include/numpy/ufuncobject.h:317:0,
from cudarray/numpy_backend/nnet/conv_bc01.c:251:
/usr/local/lib64/python2.7/site-packages/numpy/core/include/numpy/__ufunc_api.h:241:1: warning: ‘_import_umath’ defined but not used [-Wunused-function]
_import_umath(void)
^
gcc -pthread -shared build/temp.linux-x86_64-2.7/cudarray/numpy_backend/nnet/conv_bc01.o -L/usr/lib64 -lpython2.7 -o build/lib.linux-x86_64-2.7/cudarray/numpy_backend/nnet/conv_bc01.so
building 'cudarray.numpy_backend.nnet.pool_bc01' extension
gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/usr/local/lib64/python2.7/site-packages/numpy/core/include -I/usr/include/python2.7 -c cudarray/numpy_backend/nnet/pool_bc01.c -o build/temp.linux-x86_64-2.7/cudarray/numpy_backend/nnet/pool_bc01.o
In file included from /usr/local/lib64/python2.7/site-packages/numpy/core/include/numpy/ndarraytypes.h:1804:0,
from /usr/local/lib64/python2.7/site-packages/numpy/core/include/numpy/ndarrayobject.h:17,
from /usr/local/lib64/python2.7/site-packages/numpy/core/include/numpy/arrayobject.h:4,
from cudarray/numpy_backend/nnet/pool_bc01.c:250:
/usr/local/lib64/python2.7/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:15:2: warning: #warning "Using deprecated NumPy API, disable it by " "#defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
#warning "Using deprecated NumPy API, disable it by "
^
In file included from /usr/local/lib64/python2.7/site-packages/numpy/core/include/numpy/ndarrayobject.h:26:0,
from /usr/local/lib64/python2.7/site-packages/numpy/core/include/numpy/arrayobject.h:4,
from cudarray/numpy_backend/nnet/pool_bc01.c:250:
/usr/local/lib64/python2.7/site-packages/numpy/core/include/numpy/__multiarray_api.h:1629:1: warning: ‘_import_array’ defined but not used [-Wunused-function]
_import_array(void)
^
In file included from /usr/local/lib64/python2.7/site-packages/numpy/core/include/numpy/ufuncobject.h:317:0,
from cudarray/numpy_backend/nnet/pool_bc01.c:251:
/usr/local/lib64/python2.7/site-packages/numpy/core/include/numpy/__ufunc_api.h:241:1: warning: ‘_import_umath’ defined but not used [-Wunused-function]
_import_umath(void)
^
gcc -pthread -shared build/temp.linux-x86_64-2.7/cudarray/numpy_backend/nnet/pool_bc01.o -L/usr/lib64 -lpython2.7 -o build/lib.linux-x86_64-2.7/cudarray/numpy_backend/nnet/pool_bc01.so
building 'cudarray.numpy_backend.nnet.lrnorm_bc01' extension
gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/usr/local/lib64/python2.7/site-packages/numpy/core/include -I/usr/include/python2.7 -c cudarray/numpy_backend/nnet/lrnorm_bc01.c -o build/temp.linux-x86_64-2.7/cudarray/numpy_backend/nnet/lrnorm_bc01.o
In file included from /usr/local/lib64/python2.7/site-packages/numpy/core/include/numpy/ndarraytypes.h:1804:0,
from /usr/local/lib64/python2.7/site-packages/numpy/core/include/numpy/ndarrayobject.h:17,
from /usr/local/lib64/python2.7/site-packages/numpy/core/include/numpy/arrayobject.h:4,
from cudarray/numpy_backend/nnet/lrnorm_bc01.c:250:
/usr/local/lib64/python2.7/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:15:2: warning: #warning "Using deprecated NumPy API, disable it by " "#defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
#warning "Using deprecated NumPy API, disable it by "
^
In file included from /usr/local/lib64/python2.7/site-packages/numpy/core/include/numpy/ndarrayobject.h:26:0,
from /usr/local/lib64/python2.7/site-packages/numpy/core/include/numpy/arrayobject.h:4,
from cudarray/numpy_backend/nnet/lrnorm_bc01.c:250:
/usr/local/lib64/python2.7/site-packages/numpy/core/include/numpy/__multiarray_api.h:1629:1: warning: ‘_import_array’ defined but not used [-Wunused-function]
_import_array(void)
^
In file included from /usr/local/lib64/python2.7/site-packages/numpy/core/include/numpy/ufuncobject.h:317:0,
from cudarray/numpy_backend/nnet/lrnorm_bc01.c:251:
/usr/local/lib64/python2.7/site-packages/numpy/core/include/numpy/__ufunc_api.h:241:1: warning: ‘_import_umath’ defined but not used [-Wunused-function]
_import_umath(void)
^
gcc -pthread -shared build/temp.linux-x86_64-2.7/cudarray/numpy_backend/nnet/lrnorm_bc01.o -L/usr/lib64 -lpython2.7 -o build/lib.linux-x86_64-2.7/cudarray/numpy_backend/nnet/lrnorm_bc01.so
cythoning ./cudarray/wrap/cudart.pyx to ./cudarray/wrap/cudart.cpp
building 'cudarray.wrap.cudart' extension
creating build/temp.linux-x86_64-2.7/cudarray/wrap
gcc -pthread -fno-strict-aliasing -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/usr/local/cuda/include -I./include -I/usr/local/lib64/python2.7/site-packages/numpy/core/include -I/usr/local/lib64/python2.7/site-packages/numpy/core/include -I/usr/include/python2.7 -c ./cudarray/wrap/cudart.cpp -o build/temp.linux-x86_64-2.7/./cudarray/wrap/cudart.o -O3 -fPIC -Wall -Wfatal-errors
./cudarray/wrap/cudart.cpp:248:26: fatal error: driver_types.h: No such file or directory
#include "driver_types.h"
^
compilation terminated.
error: command 'gcc' failed with exit status 1

Any help would be greatly appreciated; am excited to get the artistic style program working at a reasonable speed.
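One detail in the log above may be relevant: make compiled with -I/opt/nvidia/cuda/include, while the setup.py step compiled with -I/usr/local/cuda/include. A possible explanation, stated as an assumption, is that CUDA_PREFIX was not visible to the setup.py process (sudo often does not pass the caller's environment through). The sketch below checks which candidate prefixes actually provide driver_types.h; prefixes_with_header is a hypothetical helper name.

```python
import os


def prefixes_with_header(prefixes, header="driver_types.h"):
    """Return the prefixes whose include/ directory contains the given header."""
    return [p for p in prefixes
            if os.path.isfile(os.path.join(p, "include", header))]


if __name__ == "__main__":
    # Both paths appear in the build log above.
    candidates = ["/opt/nvidia/cuda", "/usr/local/cuda"]
    env_prefix = os.environ.get("CUDA_PREFIX")
    if env_prefix:
        candidates.insert(0, env_prefix)
    print("CUDA_PREFIX =", env_prefix)
    print("prefixes providing driver_types.h:",
          prefixes_with_header(candidates))
```

Running it once normally and once under sudo shows whether the variable survives, and which prefix the setup.py step would need to point at.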

installation errors

Hi there,

Thanks for making such a great library. I am trying to install it but failed. Below are the errors:

in the 'make' step:
g++ -O3 -fPIC -Wall -Wfatal-errors -I./include -I/opt/cuda-5.0.35/include -c -o src/blas.o src/blas.cpp
src/blas.cpp: In function ‘const char* cudarray::cublas_message(cublasStatus_t)’:
src/blas.cpp:168: error: ‘CUBLAS_STATUS_NOT_SUPPORTED’ was not declared in this scope
compilation terminated due to -Wfatal-errors.
make: *** [src/blas.o] Error 1

g++ -O3 -fPIC -Wall -Wfatal-errors -I./include -I/opt/cuda-5.0.35/include -c -o src/blas.o src/blas.cpp
src/blas.cpp: In function ‘const char* cudarray::cublas_message(cublasStatus_t)’:
src/blas.cpp:170: error: ‘CUBLAS_STATUS_LICENSE_ERROR’ was not declared in this scope
compilation terminated due to -Wfatal-errors.
make: *** [src/blas.o] Error 1

To get past this step, I commented out these four lines in blas.cpp:
// case CUBLAS_STATUS_NOT_SUPPORTED:
// return "The functionnality requested is not supported.";
// case CUBLAS_STATUS_LICENSE_ERROR:
// return "The functionality requested requires some license.";

Then I can compile the library successfully.

In the python setup.py install --user step (I don't have sudo permission):
gcc: cudarray/numpy_backend/nnet/conv_bc01.c: No such file or directory
gcc: no input files
error: command 'gcc' failed with exit status 1
I have no idea how to solve this.

Could you please help with the errors above?

Thanks.

Fast way to resize 3D matrices

I'm trying to scale up 3D matrices from a deeppy convolutional neural network, for example from (512,64,64) to (512,128,128). Currently I'm doing this by going via a numpy array, then iterating and using scipy.misc.imresize with bilinear filtering which is very slow, but works.

Is there a way to do this on CUDA? If so, is this exposed to cudarray?

The alternative (not quite as good) would be to scale the same matrices down, for example using the same code that does pooling. It'd take one matrix that's (512,128,128) and return (512,64,64). I presume there is a way I can do this as a function call on an array rather than within a deeppy layer?

Thanks!

force the back-end to CUDA

I'm trying to force the back-end to CUDA using the steps in your installation guide and get this:

ImportError: dlopen(/path/to/anaconda/lib/python2.7/site-packages/cudarray-0.1.dev0-py2.7-macosx-10.5-x86_64.egg/cudarray/wrap/array_data.so, 2): Library not loaded: @rpath/libcudart.7.5.dylib
  Referenced from: /path/to/anaconda/lib/python2.7/site-packages/cudarray-0.1.dev0-py2.7-macosx-10.5-x86_64.egg/cudarray/wrap/array_data.so
  Reason: image not found

any ideas? Thanks again for your help.

Bounds check ?

It looks like I can access values that are not in the array.

t.shape
(81920, 10)

and no errors when doing t[:,10] or t[:,1000].
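For comparison, NumPy raises IndexError for an integer index outside the axis length. Below is a minimal sketch of the kind of check an indexing wrapper could run before touching device memory; check_index is a hypothetical helper, not existing cudarray code.

```python
def check_index(index, size, axis):
    """Validate an integer index against an axis length, NumPy-style.

    Negative indices count from the end. Returns the normalized index or
    raises IndexError when it falls outside [-size, size).
    """
    if not -size <= index < size:
        raise IndexError(
            "index %d is out of bounds for axis %d with size %d"
            % (index, axis, size))
    return index + size if index < 0 else index


if __name__ == "__main__":
    shape = (81920, 10)  # shape from the report above
    print(check_index(9, shape[1], axis=1))   # last valid column
    print(check_index(-1, shape[1], axis=1))  # same column, counted from the end
    try:
        check_index(10, shape[1], axis=1)
    except IndexError as exc:
        print("rejected:", exc)
```

The check runs on the host and costs one comparison per indexed axis, so it would not require any extra device work.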

memory leak in src/blas.cpp line 90

Memory allocated with new[] should be released with delete[].

delete ptrs_host;

should be changed to:

delete [] ptrs_host;

ImportError: libcudarray.so: cannot open shared object file

Hi,
first of all: amazing work you are doing with cudarray! I like the concept quite a lot. The installation runs smoothly (after updating Cython), except that I can't use the CUDA backend. Every time I try, I get the following error message,

ImportError                               Traceback (most recent call last)
<ipython-input-4-342f252c3bce> in <module>()
----> 1 import cudarray as ca
      2 a = ca.random.uniform(size=(100, 10))
      3 for _ in range(1000):
      4     ca.softmax(a)

/usr/local/lib/python2.7/dist-packages/cudarray-0.1-py2.7-linux-x86_64.egg/cudarray/__init__.py in <module>()
     28         try:
     29             from .base import *
---> 30             from .cudarray import *
     31             from .linalg import *
     32             from .elementwise import *

/usr/local/lib/python2.7/dist-packages/cudarray-0.1-py2.7-linux-x86_64.egg/cudarray/cudarray.py in <module>()
      1 import numpy as np
      2 from .wrap.array_data import ArrayData
----> 3 from . import elementwise
      4 from . import base
      5 from . import helpers

/usr/local/lib/python2.7/dist-packages/cudarray-0.1-py2.7-linux-x86_64.egg/cudarray/elementwise.py in <module>()
      1 import numpy as np
      2 import cudarray
----> 3 from .wrap import elementwise
      4 from . import helpers
      5 

ImportError: libcudarray.so: cannot open shared object file: No such file or directory

The file exists at /usr/local/lib but adding it to the standard search paths didn't help. Any ideas?
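A quick way to see what the loader has to work with, as a sketch under assumptions: on Linux, shared libraries are resolved through LD_LIBRARY_PATH and the ldconfig cache, so a file sitting in /usr/local/lib is not necessarily found until one of the two knows about that directory. The helper dirs_containing is a made-up name for illustration.

```python
import os


def dirs_containing(filename, search_path):
    """Return the directories in an os.pathsep-separated search path
    that contain the given file."""
    hits = []
    for directory in search_path.split(os.pathsep):
        if directory and os.path.isfile(os.path.join(directory, filename)):
            hits.append(directory)
    return hits


if __name__ == "__main__":
    ld_path = os.environ.get("LD_LIBRARY_PATH", "")
    print("LD_LIBRARY_PATH =", ld_path or "(unset)")
    print("entries providing libcudarray.so:",
          dirs_containing("libcudarray.so", ld_path))
    print("present in /usr/local/lib:",
          os.path.isfile("/usr/local/lib/libcudarray.so"))
```

If the file is present but no LD_LIBRARY_PATH entry provides it, either exporting the directory in the shell that starts Python or refreshing the ldconfig cache may be worth trying. Note that the variable must be set in the environment of the Python (or IPython) process itself.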

And while we are at it, just a short question on cudarray: all operations are done on the GPU, so there is no back-and-forth copy between host and device, right?

Thanks a lot in advance!
Wieland

fatal error: 'numpy/arrayobject.h' file not found

I ran setup.py without cuda.

python setup.py --without-cuda install

getting error:

cudarray/numpy_backend/nnet/conv_bc01.c:250:10: fatal error: 'numpy/arrayobject.h' file not found
#include "numpy/arrayobject.h"
         ^
1 error generated.
error: command 'clang' failed with exit status 1

I think this is a setup.py issue. I do see

    return cythonize(cython_srcs, include_path=[numpy.get_include()])

is used. I also see people discussing a similar issue at https://www.reddit.com/r/MachineLearning/comments/2lv8n3/cudabased_neural_networks_in_python/cpjm44c, but I followed the "python setup.py build_ext --inplace" suggestion and got the same error.
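A possible explanation, offered as a sketch: the include_path argument of cythonize tells Cython where to look for .pxd and .pxi files, whereas the C compiler's header search path comes from each Extension's include_dirs. If that is what is happening here, appending the NumPy include directory to the extensions returned by cythonize should help. The helper add_include_dir is hypothetical and not part of cudarray's setup.py.

```python
def add_include_dir(extensions, include_dir):
    """Append include_dir to every extension's C include directories (in place)."""
    for ext in extensions:
        if include_dir not in ext.include_dirs:
            ext.include_dirs.append(include_dir)
    return extensions


# Intended use inside setup.py (not executed here; needs Cython and NumPy):
#
#     import numpy
#     from Cython.Build import cythonize
#     extensions = cythonize(cython_srcs, include_path=[numpy.get_include()])
#     add_include_dir(extensions, numpy.get_include())
```

The helper only touches the include_dirs attribute, so it works on any object that exposes one as a list.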

The function requires a feature absent from the GPU

Hey,
just wondering if anybody came across this problem:

terminate called after throwing an instance of 'std::runtime_error'
  what():  ./include/cudarray/nnet/cudnn.hpp:107: The function requires a feature absent from the GPU
Aborted (core dumped)

I've traced it down to be coming from here: https://github.com/andersbll/neural_artistic_style/blob/master/style_network.py#L99

I'm on Ubuntu 14.04 with a fairly dated GeForce GTX 560, which I suppose is the reason. A new card is not an option for me at the moment, so I am wondering whether it's possible to avoid the one feature that's absent from the GPU, though I can't tell which one it is. Do you have more insight into this?

Thanks already,
Marcel

Cudnn 5 Support for GRU and LSTM cells

cuDNN 5 now supports GRU and LSTM cells natively, with speedups of up to 6 times compared to a cuBLAS implementation of GRUs and LSTMs. It would be great if cudarray could support this.

https://devblogs.nvidia.com/parallelforall/optimizing-recurrent-neural-networks-cudnn-5/

install codecs issue . UnicodeDecodeError: 'ascii' codec can't decode byte 0xe7 in position 14: ordinal not in range(128)

Traceback (most recent call last):
File "setup.py", line 154, in <module>
ext_modules=numpy_extensions(),
File "setup.py", line 79, in numpy_extensions
return cythonize(cython_srcs, include_path=[numpy.get_include()])
File "/usr/local/lib/python2.7/site-packages/Cython/Build/Dependencies.py", line 758, in cythonize
aliases=aliases)
File "/usr/local/lib/python2.7/site-packages/Cython/Build/Dependencies.py", line 664, in create_extension_list
kwds = deps.distutils_info(file, aliases, base).values
File "/usr/local/lib/python2.7/site-packages/Cython/Build/Dependencies.py", line 564, in distutils_info
return (self.transitive_merge(filename, self.distutils_info0, DistutilsInfo.merge)
File "/usr/local/lib/python2.7/site-packages/Cython/Build/Dependencies.py", line 574, in transitive_merge
node, extract, merge, seen, {}, self.cimported_files)[0]
File "/usr/local/lib/python2.7/site-packages/Cython/Build/Dependencies.py", line 585, in transitive_merge_helper
for next in outgoing(node):
File "/usr/local/lib/python2.7/site-packages/Cython/Utils.py", line 43, in wrapper
res = cache[args] = f(self, *args)
File "/usr/local/lib/python2.7/site-packages/Cython/Build/Dependencies.py", line 513, in cimported_files
pxd_file = self.find_pxd(module, filename)
File "/usr/local/lib/python2.7/site-packages/Cython/Utils.py", line 43, in wrapper
res = cache[args] = f(self, *args)
File "/usr/local/lib/python2.7/site-packages/Cython/Build/Dependencies.py", line 497, in find_pxd
pxd = self.context.find_pxd_file(relative, None)
File "/usr/local/lib/python2.7/site-packages/Cython/Compiler/Main.py", line 239, in find_pxd_file
pxd = self.search_include_directories(qualified_name, ".pxd", pos, sys_path=sys_path)
File "/usr/local/lib/python2.7/site-packages/Cython/Compiler/Main.py", line 280, in search_include_directories
tuple(self.include_directories), qualified_name, suffix, pos, include, sys_path)
File "/usr/local/lib/python2.7/site-packages/Cython/Utils.py", line 29, in wrapper
res = cache[args] = f(*args)
File "/usr/local/lib/python2.7/site-packages/Cython/Utils.py", line 119, in search_include_directories
path = os.path.join(dir, dotted_filename)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/posixpath.py", line 80, in join
path += '/' + b
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe7 in position 14: ordinal not in range(128)
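The traceback ends in posixpath.join, and byte 0xe7 at position 14 hints that one of the directory paths being joined contains a non-ASCII character (for instance in the project or home directory name). That reading is an assumption. The sketch below, written for Python 3, lists the path components that are not plain ASCII; non_ascii_components is a made-up helper name.

```python
import os


def non_ascii_components(path):
    """Return the components of a path that contain non-ASCII characters."""
    parts = os.path.normpath(path).split(os.sep)
    return [part for part in parts
            if any(ord(ch) > 127 for ch in part)]


if __name__ == "__main__":
    for label, path in [("current directory", os.getcwd()),
                        ("home directory", os.path.expanduser("~"))]:
        print(label, "->", non_ascii_components(path) or "ASCII only")
```

If the source checkout turns out to live under such a directory, building from an ASCII-only path is a low-effort way to test the hypothesis.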

Cuda Streams

It would be great if cudarray supported CUDA streams to maximize GPU utilization.

Something like,

ca.dot(a,b, stream = 1, out = c )
ca.dot(d,e, stream = 2, out = f )


