Giter VIP home page Giter VIP logo

Comments (10)

blueberry avatar blueberry commented on July 25, 2024

As far as I remember, pointers to Java arrays will work from native code, but it is not guaranteed. It all depends on the JVM implementation and the memory resources available at each particular moment. That also makes it difficult for testing, because it will work - most of the time. But, that can fail anytime, since JVM is free to rearrange Java arrays in memory as it pleases. There are methods to tell it not to do it, but that makes JVM less effective at its job.

All in all, the safest bet is to use DirectByteBuffer.

from jocl.

gpu avatar gpu commented on July 25, 2024

They will not work directly. You can obtain a pointer to a Java array in JNI, using GetPrimitiveArrayCritical. This is already done in JOCL, for blocking operations. The pseudocode is

void data* = GetPrimitiveArrayCritical(someJavaArray);
clEnqueueWriteBuffer(..., /*blocking=*/ true, ... data);
ReleasePrimitiveArrayCritical(data);

But as the name suggests, this imposes a "critical section". Particularly, it is not possible to do GetPrimitiveArrayCritical in one JNI function, and ReleasePrimitiveArrayCritical in another.

Now, when calling clCreateBuffer with CL_MEM_USE_HOST_PTR, and a pointer to a Java array, even if it fetches the pointer using GetPrimitiveArrayCritical, it will release it after the function call.

And the crucial point is: Nobody knows what exactly the OpenCL implementation is doing with this pointer when CL_MEM_USE_HOST_PTR is given. It's likely that it will store this pointer in some way, to later update it with the data from the cl_mem object. But at this point in time, this pointer is already invalid. This could cause nasty crashes. (In fact, this issue is based on a bug report that I received via mail, and where this might have happened).

Usually, I try to support Java arrays whereever possible, because always doing everything with direct buffers is cumbersome. But I'm afraid that CL_MEM_USE_HOST_PTR will likely be one of the few cases where I'll have to add something like

if (flagsContain(CL_MEM_USE_HOST_PTR) && isNoDirectBufferPointer(pointer)) {
    throw new NoThatIsNotPossibleException("Sorry...");
}

I'm not sure when I'll find the time to investigate this in more detail, but wanted to summarize it here, so that people know that there likely is an issue with this usage pattern, and ... that it does not get lost somewhere at the bottom of my TODO list....

from jocl.

TPolzer avatar TPolzer commented on July 25, 2024

Doesn't this also apply to other non blocking memory operations to arrays? So e.g.

clEnqueueWriteBuffer(q, input, false, 0, Sizeof.cl_double * DATA_SIZE, Pointer.to(inputdata), 0, null, null);

should be considered unsafe?

from jocl.

gpu avatar gpu commented on July 25, 2024

@TPolzer In this case, JOCL will internally keep a reference to the input data until the write operation finished. So the input data will not be garbage collected in the meantime. (I'm not sure whether a similar concept could be applied to the CL_MEM_USE_HOST_PTR, but that's one option that I considered).

But you indirectly raised a point that is related (or structurally similar) to this issue: Although the data may not be garbage collected, it might still be relocated by the JVM when the "critical section" that is opened with GetPrimitiveArrayCritical was closed, and for non-blocking write operations, this will be done before the write operation finished. Although the exact behavior will depend on the GC and the JVM (and I never encountered any issues here), I'll try to investigate this here as well. I'm afraid that, in the worst case, and to be on the safe side, I might have to drop the support of non-blocking operations for primitive arrays altogether. There is no way of preventing a relocation of a Java array.

(So this issue just bubbled up to the top of my TODO list)

from jocl.

TPolzer avatar TPolzer commented on July 25, 2024

It is likely almost never a real concern, because humongous objects are only compacted in full gc cycles, which should rarely occur.

from jocl.

javagl avatar javagl commented on July 25, 2024

Again, I have never encountered any issues here. But this may have different reasons:

  • Doing reliable tests for this is not so trivial, due to the black-box nature of GCs. One can try to provoke errors, using some random object creations, System.gc() and Thread.sleep(x) trickery, but this always feels hackish and unreliable. Playing with some -XX:... parameters may increase the "coverage" and "likeliness of certain events" here, though
  • The Oracle VM may have a resilient behavior here. Other VMs might be more sensitive for this.

But, in doubt, even if I cannot provoke an error, and/or even if it only causes problems in full GCs and these happen "rarely", this case has to be covered. The worst thing that could happen is that it "rarely" causes a problem in a production system. Such a "problem" would probably be a JVM crash, and I consider this as totally unacceptable - even if it happens "rarely". It may never crash *.


* "never" here means: It may never crash when it is used according to the OpenCL spec. But this finds its limits in the underlying implementation. For example, a recent commit was remotely related to that: 5c6e44f - unfortunately, the underlying OpenCL implementations (by AMD and NVIDIA) also don't behave very graciously here: When calling clWaitForEvents with a cl_event that was already released, one might expect them to return CL_INVALID_EVENT - but instead, they crash the whole program, painfully. However, this is not something that I can sensibly prevent in the (thin) JOCL layer. The question of whether Java arrays may be used in async operations is in the responsibility of JOCL, and it either has to work (always), or has to be prevented.

from jocl.

gpu avatar gpu commented on July 25, 2024

^ This comment was by me. I was logged in with the wrong account.

from jocl.

gpu avatar gpu commented on July 25, 2024

I did some general tests, particularly regarding the secondary issue.

From the first test, it indeed seems that a relocation of arrays occurs only during a full GC, and the likeliness of a real relocation decreases with a larger array size. All this is to be taken with a grain of salt, as it is mainly based on some Unsafe/verbose:gc/System.gc() trickery.

But undoubtedly, an array may be relocated by the GC, at "arbitrary" points in time.

However, it's really hard to provoke the error that suspectedly may happen during a non-blocking clEnqueueWriteBuffer with a Java array. I played with different configurations, but I cannot imagine a way to deterministically cause a GC with a relocation exactly in the short time where it would have to happen to cause a problem. (This would be between the CL_PROFILING_COMMAND_START and CL_PROFILING_COMMAND_END time of the event that is associated with clEnqueueWriteBuffer - and this is only a few ms even for ridiculously large arrays).

Nevertheless, right now, I assume that I'll have to drop the support of non-blocking operations on Java arrays, because it is impossible to guarantee that this will work. This will likely be a change in JOCL 2.1 (which is pending anyhow). Unfortunately, this will affect backward compatibility, I'll still have to figure out how exactly this change will be implemented.

All this refers to the secondary issue. The primary one, namely the use of CL_MEM_USE_HOST_PTR will likely be a similar case, but still has to be investigated.

from jocl.

blueberry avatar blueberry commented on July 25, 2024

@gpu In your place, I won't be that concerned with backward compatibility, since it seems that all users are early adopters and our primary goal is to get the best possible technical solution. Please just provide clear and precise documentation (and warnings :) about changes and best ways to use the library, and don't worry. Thank you for the great effort you put into this!

Also, if you could release the tools you use for creating this as open source, that would be great, since we could try to help you more :)

from jocl.

gpu avatar gpu commented on July 25, 2024

@blueberry Some off-topic (I'm not sure how much I should elaborate this here) : Eight years ago I started this "code generation" with some crude hacks, to generate the code of JCublas. Later, I generalized this into a "proper" code generator, but went waaay to far with that (it became an "abstraction hell"). Later, I concretized it a bit, and tried to bring it into a shape that could actually be published, but it's still far from perfect. It is mainly used for parts of the libraries like JCublas or JOCLBlast, but there are still manual steps involved to integrate the generated code (and it's hard to convey the necessary assembly steps for someone who is not already familiar with it). So I think that publishing it now would be of very limited use for externals. (I would, however, provide it for anyhone who is interested in it (via mail). One should not expect it to be "easy to use" or run "out of the box", but ... maybe some feedback would help to bring it into a really publishable shape).

But more importantly, and leading to this issue: The code of JOCL itself is not generated with this code generator. One of the reasons is exactly related to this issue: There are many subtleties that might get lost when "blindly" doing a header-to-JNI-code-generation.


Now, specifically regarding this issue:

I consider backward compatibility as an important point. If I change the behavior here, this would mean that someone who used

clEnqueueWriteBuffer(..., /*blocking=*/ false, Pointer.to(someJavaArray), ...);

will receive an exception in the next version -

new CLException("Non-blocking writes with Java arrays are not possible");

And this could annoy people. I'm aware of this, and if this is changed, I will prominently point it out in the release notes.

However, I think that this change is still reasonable, and the reasons are:

  • Not changing it may lead to unpredicatable behavior, and IMHO this is a very strong argument
  • It will be easy to fix for the user, by changing blocking=false to blocking=true in this call
  • It seems very unlikely that non-blocking operations on Java arrays are actually widely used
  • The benefits of using non-blocking operations in these perticular cases seem to be negligible (but I'm currently in the process of validating this with some basic benchmarks, tests and event profiling)

from jocl.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.