Comments (16)
Definitely needed!
from halide.
This is in process.
from halide.
Any progress on OpenCL support? I would be happy to beta test it.
from halide.
+1
from halide.
+2
from halide.
+3
from halide.
As discussed at GTC, it might be difficult to target any portable IR for OpenCL. So an ideal solution may be to use clang to generate OpenCL kernel source.
from halide.
Wait, there is OpenCL SPIR, which is pretty similar to LLVM IR; clang can even generate SPIR via:
clang -x cl -fno-builtin -target spir -c -emit-llvm
and it seems to be coming soon for testing, at least in AMD's OpenCL drivers.
from halide.
Yes, but SPIR is not (yet) supported by any GPU vendor, and as far as I know none have plans for it in the next year.
(In reply to oscarbg's comment of Apr 4, 2013, 3:25 AM.)
from halide.
SPIR has been released recently.
Has Halide made progress on an OpenCL backend, with or without SPIR?
from halide.
@dsharlet-intel has been making steady progress on both the SPIR and OpenCL C-based backends. I believe they're starting to pass most/all of the tests, as of (the very recent) 2e3222b commit.
from halide.
I tried the basic apps on OS X 10.9 and they all segfault with OpenCL.
-- Mike
(In reply to Jonathan Ragan-Kelley's comment of Dec 11, 2013, 7:52 AM.)
from halide.
Do you know which OpenCL device you’re targeting? One really annoying gotcha of Apple’s implementations is that their x86 OpenCL backend only supports 1D kernel launches.
Do the tests pass?
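For a backend limited to 1D kernel launches, one common workaround is to launch a single flat range of width * height work-items and recover the 2D coordinate from the global id. The following is a minimal sketch of that index arithmetic in plain C, not something this thread says Halide does; `Index2D`, `flat_launch_size`, and `unflatten` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type; not part of Halide or the OpenCL API. */
typedef struct {
    size_t x;
    size_t y;
} Index2D;

/* Number of work-items a 1D launch needs to cover a width x height grid. */
size_t flat_launch_size(size_t width, size_t height) {
    return width * height;
}

/* Recover the 2D coordinate from a 1D global id, assuming row-major order. */
Index2D unflatten(size_t global_id, size_t width) {
    assert(width > 0); /* a zero-width grid has no valid coordinates */
    Index2D idx;
    idx.x = global_id % width;
    idx.y = global_id / width;
    return idx;
}
```

Inside a real kernel the same modulo and division would be applied to the 1D global id, at the cost of an extra integer divide per work-item.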
from halide.
I'm mainly targeting embedded GPUs, but before porting there I test on a desktop machine: a Retina MacBook Pro running OS X 10.9 with an NVIDIA GPU. There the CUDA/PTX backend works perfectly.
I'll try the tests and try to pinpoint the issue.
-- Mike
(In reply to Jonathan Ragan-Kelley's comment of Dec 11, 2013, 10:18 AM.)
from halide.
I appreciate any information you can share from running the tests, but you should know that I've only just started looking at the apps since the last commit cited by jrk. The apps use a slightly different mechanism to run the generated code, which I didn't realize until recently.
In addition to the issue jrk mentioned regarding Apple's x86 OpenCL implementation, Apple's implementation also expects slightly different behavior when creating the OpenCL context. I will need to get an Apple machine to make sure this still works on Apple platforms in addition to Linux/Windows.
from halide.
Running make run_tests with HL_TARGET=opencl:
clang++ -O3 test/correctness/argmax.cpp -Iinclude -Lbin -lHalide -lpthread -ldl -o bin/test_argmax
cd tmp ; DYLD_LIBRARY_PATH=../bin LD_LIBRARY_PATH=../bin ../bin/test_argmax
OpenCL device codegen init_module
Error: Failed to build program executable! err = -11
Build Log:
No kernels or only kernel prototypes found.
Error: err == CL_SUCCESS
make: *** [test_argmax] Error 1
Same error with test_internal:
cd tmp ; DYLD_LIBRARY_PATH=../bin LD_LIBRARY_PATH=../bin ../bin/test_internal
IRPrinter test passed
CodeGen_C test passed
Simplify test passed
Bounds test passed
Lowering test passed
OpenCL device codegen init_module
Error: Failed to build program executable! err = -11
Build Log:
No kernels or only kernel prototypes found.
Error: err == CL_SUCCESS
make: *** [test_internal] Error 1
The problem is the kernel being generated:
/*OpenCL C*/
float nan_f32() { return NAN; }
float neg_inf_f32() { return -INFINITY; }
float inf_f32() { return INFINITY; }
float sqrt_f32(float x) { return sqrt(x); }
float sin_f32(float x) { return sin(x); }
float cos_f32(float x) { return cos(x); }
float exp_f32(float x) { return exp(x); }
float log_f32(float x) { return log(x); }
float abs_f32(float x) { return x < 0.0f ? -x : x; }
float floor_f32(float x) { return floor(x); }
float ceil_f32(float x) { return ceil(x); }
float round_f32(float x) { return round(x); }
float pow_f32(float x, float y) { return pow(x, y); }
float asin_f32(float x) { return asin(x); }
float acos_f32(float x) { return acos(x); }
float tan_f32(float x) { return tan(x); }
float atan_f32(float x) { return atan(x); }
float atan2_f32(float y, float x) { return atan2(y, x); }
float sinh_f32(float x) { return sinh(x); }
float asinh_f32(float x) { return asinh(x); }
float cosh_f32(float x) { return cosh(x); }
float acosh_f32(float x) { return acosh(x); }
float tanh_f32(float x) { return tanh(x); }
float atanh_f32(float x) { return atanh(x); }
There is no __kernel function!
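The build log above ("No kernels or only kernel prototypes found", with err = -11, which is CL_BUILD_PROGRAM_FAILURE in the OpenCL headers) is consistent with the listing: it contains only helper functions and no __kernel entry point. As a rough sketch of a sanity check that could run before the source is handed to the driver, the plain C function below looks for the qualifier in the generated text. `source_mentions_kernel` is a hypothetical name, not a Halide or OpenCL function, and the check is deliberately crude: it is a substring search, so it would also match the word inside a comment, and it does not look for the unprefixed `kernel` spelling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pre-flight check; not part of Halide or the OpenCL API.
 * Returns true if the source text contains the "__kernel" qualifier anywhere.
 * This is a plain substring search, so a match inside a comment or a string
 * literal also counts. */
bool source_mentions_kernel(const char *source) {
    assert(source != NULL); /* passing NULL is treated as a programming error */
    return strstr(source, "__kernel") != NULL;
}
```

Run against the helper-only listing above, this check would return false, which would flag the problem before the driver reports a build failure.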
Misc stuff:
On OS X, don't default CXX to g++; use clang++ instead. The default g++ is way too old.
With HL_TARGET=host or cuda, all tests succeed.
-- Mike
(In reply to dsharlet-intel's comment of Wed, Dec 11, 2013, 10:53 AM.)
from halide.