opencl-cts's Introduction

OpenCL Conformance Test Suite (CTS)

This is the OpenCL CTS for all versions of the Khronos OpenCL standard.

Building the CTS

The CTS supports Linux, Windows, macOS, and Android platforms. In particular, GitHub Actions CI builds against Ubuntu 20.04, Windows-latest, and macos-latest.

Compiling the CTS requires the following CMake configuration options to be set:

  • CL_INCLUDE_DIR Points to the unified OpenCL-Headers.
  • CL_LIB_DIR Directory containing the OpenCL library to build against.
  • OPENCL_LIBRARIES Name of the OpenCL library to link.

It is advised that the OpenCL ICD-Loader is used as the OpenCL library to build against, in which case CL_LIB_DIR points to a build of the ICD loader and OPENCL_LIBRARIES is "OpenCL".

Example Build

Steps on a Linux platform to clone dependencies from GitHub sources, configure a build, and compile.

git clone https://github.com/KhronosGroup/OpenCL-CTS.git
git clone https://github.com/KhronosGroup/OpenCL-Headers.git
git clone https://github.com/KhronosGroup/OpenCL-ICD-Loader.git

mkdir OpenCL-ICD-Loader/build
cmake -S OpenCL-ICD-Loader -B OpenCL-ICD-Loader/build \
      -DOPENCL_ICD_LOADER_HEADERS_DIR=$PWD/OpenCL-Headers
cmake --build ./OpenCL-ICD-Loader/build --config Release

mkdir OpenCL-CTS/build
cmake -S OpenCL-CTS -B OpenCL-CTS/build \
      -DCL_INCLUDE_DIR=$PWD/OpenCL-Headers \
      -DCL_LIB_DIR=$PWD/OpenCL-ICD-Loader/build \
      -DOPENCL_LIBRARIES=OpenCL
cmake --build OpenCL-CTS/build --config Release

Running the CTS

A build of the CTS contains multiple executables representing the directories in the test_conformance folder. Each of these executables contains sub-tests, and possibly smaller granularities of testing within the sub-tests.

See the --help output on each executable for the list of sub-tests available, as well as other options for configuring execution.

If the OpenCL library built against is the ICD Loader, and the vendor library to be tested is not registered in the default ICD Loader location then the OCL_ICD_FILENAMES environment variable will need to be set for the ICD Loader to detect the OpenCL library to use at runtime. For example, to run the basic tests on a Linux platform:

OCL_ICD_FILENAMES=/path/to/vendor_lib.so ./test_basic

Offline Compilation

OpenCL drivers that do not have a runtime compiler can be tested using additional command-line arguments, which the test harness provides for tests that require compilation:

  • --compilation-mode Selects whether OpenCL-C source code should be compiled using an external tool before being passed to the OpenCL driver in that form for testing. online is the default mode; the option also accepts the values spir-v and binary.

  • --compilation-cache-mode Controls how the compiled OpenCL-C source code should be cached on disk.

  • --compilation-cache-path Accepts a path to a directory where the compiled binary cache should be stored on disk.

  • --compilation-program Accepts a path to an executable (default: cl_offline_compiler) invoked by the test harness to perform offline compilation of OpenCL-C source code. This executable must match the interface description.

Generating a Conformance Report

The Khronos Conformance Process Document details the steps required for a conformance submission. In this repository, opencl_conformance_tests_full.csv defines the full list of tests which must be run for conformance. The output log of that run must be included alongside a filled-in submission details template.

The utility script run_conformance.py can be used to help generate the submission log, although it is not required.

Git tags are used to define the version of the repository conformance submissions are made against.

Contributing

Contributions are welcome to the project from Khronos members and non-members alike via GitHub Pull Requests (PR). Alternatively, if you've found a bug or have a question please file an issue in the GitHub project. First time contributors will be required to sign the Khronos Contributor License Agreement (CLA) before their PR can be merged.

PRs to the repository are required to be clang-format clean to pass CI. Developers can either use the git-clang-format tool locally to verify this before contributing, or update their PR based on the diff provided by a failing CI job.

opencl-cts's People

Contributors

aarongreig, ahmedamraniakdi, alycm, b-sumner, bashbaug, chemis01, ellnor01, ewanc, franklandjack, gwawiork, james-morrissey-arm, jeremy-kemp, jlewis-austin, johnkesapidesarm, jrprice, kamil-goras-mobica, kbenzie, kpet, lakshmih, mantognini, nikhiljnv, niranjanjoshi121, paulfradgley, pierremoreau, pj87, rjodinchr, shajder, stuartdbrady, svenvh, wenju-he

opencl-cts's Issues

integer tests are likely to miss zero

The integer tests use genrand_int32 to fill the input buffers used by the tests. This is generally not a problem, except that for the larger scalar types, int and long, the probability that the relatively few samples that are tested contain zero is low. This is particularly unfortunate for functions such as popcount(), clz(), and ctz() where getting the correct result for a zero input is especially important. We should have a way to ensure that 0 and other interesting values are guaranteed to be tested.
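
One possible shape for such a mechanism is to place a fixed set of interesting values at the start of each input buffer and fill the remainder randomly. The sketch below is illustrative only: make_int_inputs is a hypothetical helper, the list of edge cases is an assumption, and std::mt19937 stands in for the harness's genrand_int32.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <random>
#include <vector>

// Hypothetical helper: build `count` 32-bit inputs that always begin with a
// fixed set of edge cases (as many as fit), followed by pseudo-random values.
// std::mt19937 is a stand-in for genrand_int32 in this sketch.
std::vector<int32_t> make_int_inputs(std::size_t count, uint32_t seed)
{
    const int32_t edge_cases[] = {
        0, 1, -1,
        std::numeric_limits<int32_t>::min(),
        std::numeric_limits<int32_t>::max(),
    };
    const std::size_t num_edges = sizeof(edge_cases) / sizeof(edge_cases[0]);

    std::mt19937 rng(seed);
    std::uniform_int_distribution<int32_t> dist(
        std::numeric_limits<int32_t>::min(),
        std::numeric_limits<int32_t>::max());

    std::vector<int32_t> inputs(count);
    for (std::size_t i = 0; i < count; ++i)
    {
        // Edge cases first, random data afterwards.
        inputs[i] = (i < num_edges) ? edge_cases[i] : dist(rng);
    }
    return inputs;
}
```

With this layout, a test of popcount(), clz(), or ctz() would see a zero input on every run regardless of the seed, provided the buffer holds at least one element.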

cl20 cl21 cl22 device_execution/enqueue_flags data race

The kernels of at least four tests contain a data race:

  • enqueue_flags_wait_work_group_simple
  • enqueue_flags_wait_work_group_event
  • enqueue_flags_wait_work_group_local
  • enqueue_flags_wait_work_group_event_local

res[gid] is written by get_num_groups()/get_local_size() threads. If one of these threads writes 2 to the array cell, another thread may still overwrite it with 1. Thus, the test doesn't really test anything.

Several tests aren't meaningful in SPIR-V mode (2.1)

They use online compilation and completely bypass the SPIR-V offline path. A few examples:

  • basic/int*
  • basic/createkernelsinprogram
  • api/null_buffer_arg
  • api/kernels
  • api/create_kernels
  • generic_address_space/advanced_tests
  • ...

Exit codes potentially bogus

It appears that many main functions return the result from runTestHarness directly. https://github.com/KhronosGroup/OpenCL-CTS/search?q=runTestHarness&unscoped_q=runTestHarness

Although this is not documented, runTestHarness returns the number of failed tests. This number is not constrained to a specific domain (e.g. [0, 255]).

Because this number is returned from main functions, this leads to undefined behaviour when this number is outside the domain specified by the C or C++ standard. In practice, this means that on some Linux machines a returned value such as 256 could be truncated to 0 and therefore lead the test infrastructure to think everything went okay.

Instead, to avoid any undefined behaviour, the main function should probably return either EXIT_SUCCESS or EXIT_FAILURE to err on the side of caution.
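
A minimal sketch of that suggestion follows. Both function names are hypothetical. The second function only models the common Linux behaviour in which the parent process sees the low 8 bits of the exit status. It is not part of the harness.

```cpp
#include <cstdlib>

// Map a failure count (what the issue says runTestHarness returns) onto the
// two portable exit statuses.
int exit_status_from_failures(int failed_tests)
{
    return failed_tests == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}

// Model of what a parent process typically observes on Linux when the raw
// count is returned from main: only the low 8 bits survive.
int truncated_exit_status(int failed_tests)
{
    return failed_tests & 0xFF;
}
```

For example, a run with 256 failures maps to EXIT_FAILURE through the first function, while the truncation model shows the same count collapsing to 0.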

clEnqueueFillImage fill_color data type

I have a question about the clEnqueueFillImage tests. It looks like the test is always passing in floating point types for the fill_color, but it uses a variety of different cl_channel_type enums.

According to the spec:

"The fill color is a four component RGBA floating-point color value if
the image channel data type is not an unnormalized signed and unsigned integer type, is a four
component signed integer value if the image channel data type is an unnormalized signed integer
type and is a four component unsigned integer value if the image channel data type is an
unnormalized unsigned integer type. The fill color will be converted to the appropriate image
channel format and order associated with image"

So as far as I can tell, the driver isn't expected to perform a conversion from float to anything else, but that appears to be what the test expects. Can somebody please point me to an explanation for the behavior, or flag this as a bug in the test? I am looking at the CL 1.2 branch. Thanks in advance.

Integrate the SPIR-V validator

There are a number of places where we should make use of the SPIR-V validator:

  • to check the output of the offline compilation flow
  • to check that the checked-in SPIR-V binaries are correct as part of the continuous integration.

cl21 - The subgroups test is not aligned with OpenCL 2.1 spec

Internal MR (102) should be moved to open source to align subgroups testing with the OpenCL 2.1 specification.
IFP stands for independent forward progress.

  • The current version appears to target only platforms supporting OpenCL 2.0.
  • Condition for test execution: if the implementation exposes the cl_khr_subgroups extension, use the OpenCL 2.0 subgroups; if the device is OpenCL 2.X or higher, core subgroups should be used.
  • In the OpenCL 2.0 version the cl_khr_subgroups extension is required and IFP is expected to be supported, so the IFP test should be executed.
  • In core subgroups, functions and enums with KHR suffixes should not be used (they are appropriate only for OpenCL 2.0).
  • The core subgroups version requires no extension, because this is a core feature, and IFP is not mandatory. IFP testing should start only if the capability CL_DEVICE_SUB_GROUP_INDEPENDENT_FORWARD_PROGRESS is set to 1.

cl12 test_api_min_max.c inconsistently uses ints for parameter types in test_min_max_parameter_size

The code in the test_min_max_parameter_size inconsistently assumes that gEmbeddedProfile implies !gHasLong. In the code snippet below line 999 will calculate the numberOfIntParametersToTry assuming the kernels will have int arguments. However, line 1054 will create kernel files with long arguments if longs are supported. Therefore in the case where there is an Embedded Profile device that supports the cles_khr_int64 extension the test will overrun CL_DEVICE_MAX_PARAMETER_SIZE.

/* The embedded profile does not require longs, so use ints */
if(gIsEmbedded)
    numberOfIntParametersToTry = numberExpected = (maxSize-sizeof(cl_mem))/sizeof(cl_int);
else
    numberOfIntParametersToTry = numberExpected = (maxSize-sizeof(cl_mem))/sizeof(cl_long);
decrement = (size_t)(numberOfIntParametersToTry/8);
if (decrement < 1)
    decrement = 1;
log_info("Reported max parameter size of %d bytes.\n", (int)maxSize);
while (numberOfIntParametersToTry > 0) {
    // These need to be inside to be deallocated automatically on each loop iteration.
    clProgramWrapper program;
    clMemWrapper mem;
    clKernelWrapper kernel;
    if(gIsEmbedded)
    {
        log_info("Trying a kernel with %ld int arguments (%ld bytes) and one cl_mem (%ld bytes) for %ld bytes total.\n",
                 numberOfIntParametersToTry, sizeof(cl_int)*numberOfIntParametersToTry, sizeof(cl_mem),
                 sizeof(cl_mem)+numberOfIntParametersToTry*sizeof(cl_int));
    }
    else
    {
        log_info("Trying a kernel with %ld long arguments (%ld bytes) and one cl_mem (%ld bytes) for %ld bytes total.\n",
                 numberOfIntParametersToTry, sizeof(cl_long)*numberOfIntParametersToTry, sizeof(cl_mem),
                 sizeof(cl_mem)+numberOfIntParametersToTry*sizeof(cl_long));
    }
    // Allocate memory for the program storage
    data = malloc(sizeof(cl_long)*numberOfIntParametersToTry);
    argumentLine = (char*)malloc(sizeof(char)*numberOfIntParametersToTry*32);
    codeLines = (char*)malloc(sizeof(char)*numberOfIntParametersToTry*32);
    programSrc = (char*)malloc(sizeof(char)*(numberOfIntParametersToTry*64+1024));
    argumentLine[0] = '\0';
    codeLines[0] = '\0';
    programSrc[0] = '\0';
    // Generate our results
    expectedResult = 0;
    for (i=0; i<(int)numberOfIntParametersToTry; i++)
    {
        if( gHasLong )
        {
            ((cl_long *)data)[i] = i;
            expectedResult += i;
        }
        else
        {
            ((cl_int *)data)[i] = i;
            expectedResult += i;
        }
    }
    // Build the program
    if( gHasLong)
        sprintf(argumentLine, "%s", "long arg0");
    else
        sprintf(argumentLine, "%s", "int arg0");
    sprintf(codeLines, "%s", "result[0] += arg0;");
    for (i=1; i<(int)numberOfIntParametersToTry; i++)
    {
        if( gHasLong)
            sprintf(argumentLine + strlen( argumentLine), ", long arg%d", i);
        else
            sprintf(argumentLine + strlen( argumentLine), ", int arg%d", i);

The fix would be to change the if statements on the following lines to check gHasLong instead of gIsEmbedded.
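
The effect of keying the calculation on long support rather than on the profile can be sketched as below. scalar_args_to_try is a hypothetical function and not the test's code. The literal sizes 4 and 8 stand in for sizeof(cl_int) and sizeof(cl_long), and the buffer argument size is passed in rather than taken from sizeof(cl_mem).

```cpp
#include <cstddef>

// Hypothetical helper: how many scalar kernel arguments fit in
// max_parameter_size next to one buffer argument of mem_arg_size bytes.
// has_long mirrors gHasLong: when longs are available the generated kernel
// uses long arguments, so the count must be based on the 8-byte size.
std::size_t scalar_args_to_try(std::size_t max_parameter_size,
                               std::size_t mem_arg_size,
                               bool has_long)
{
    const std::size_t scalar_size = has_long ? 8 : 4; // cl_long : cl_int
    return (max_parameter_size - mem_arg_size) / scalar_size;
}
```

With a reported maximum of 1024 bytes and an 8-byte buffer argument, this gives 127 long arguments or 254 int arguments. Using the int count for a kernel built with long arguments would require 254 * 8 + 8 = 2040 bytes, which is the overrun described above.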


Several subgroup features aren't tested

Here's a (possibly non-exhaustive) list of built-in functions that aren't tested:

  • get_kernel_sub_group_count_for_ndrange
  • get_kernel_max_sub_group_size_for_ndrange

A few other features aren't tested at all:

  • memory_scope_sub_group

Half pointer kernel args must be tested even if cl_khr_fp16 is not supported

From what I can see in the code, the half data type is tested only if the "cl_khr_fp16" extension is supported.

If I understand correctly, a few test cases should run even if the device does not support the "cl_khr_fp16" extension.

From OpenCL C spec ver. 2.0 rev. 33:

The half data type can only be used to declare a pointer to a buffer that contains half values. A few valid examples are given below:

void
bar (__global half *p)
{
   ...
}
__kernel void
foo (__global half *pg, __local half *pl)
{
    __global half *ptr;
    int offset;
    ptr = pg + offset;
    bar(ptr);
}

So, the half_scalar_p and half_scalar_p2 test cases listed here should always be tested. Is my understanding right?

If so, I can submit a PR: which branches should it affect?

cl12 cl20 cl21 cl22 - User defined size while querying clGetDeviceInfo and CL_DEVICE_EXTENSIONS

This can cause test errors when the extension string is larger than the test expects, in which case CL_INVALID_VALUE is returned.
The following tests are affected:

  • all spir tests
    char extensions[1024] = {0};
    size_t size;
    // Querying the device for its supported extensions
    cl_int errcode = clGetDeviceInfo(devId, CL_DEVICE_EXTENSIONS, sizeof(extensions), extensions, &size);
  • computeinfo
    For example, the computeinfo tests define a config_data union in which one of the fields is:
    char string[1024];
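
A common way to avoid a fixed-size buffer is to query the required size first and then allocate exactly that much. The sketch below shows the pattern without depending on the OpenCL headers. query_string and its QueryFn parameter are hypothetical: the callable is assumed to behave like clGetDeviceInfo with the device and parameter name already bound, returning 0 on success.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper. `query` is assumed to have the shape
//   int query(size_t buf_size, void *buf, size_t *size_ret)
// and to return 0 on success, modelled on clGetDeviceInfo.
template <typename QueryFn>
bool query_string(QueryFn query, std::string &out)
{
    std::size_t size = 0;
    // First call: ask only for the number of bytes required.
    if (query(0, nullptr, &size) != 0) return false;
    if (size == 0)
    {
        out.clear();
        return true;
    }

    // Second call: fetch the data into a buffer of exactly that size.
    std::vector<char> buffer(size);
    if (query(buffer.size(), buffer.data(), nullptr) != 0) return false;

    // Drop the trailing NUL terminator(s) that string queries include.
    out.assign(buffer.begin(), buffer.end());
    while (!out.empty() && out.back() == '\0') out.pop_back();
    return true;
}
```

In a real test the callable could be a lambda that forwards to clGetDeviceInfo(devId, CL_DEVICE_EXTENSIONS, buf_size, buf, size_ret), so an extension string longer than 1024 bytes no longer produces CL_INVALID_VALUE.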

Conformance tests should include specific, representative inputs

The goal here is to make tests more robust and trustworthy.

Currently, the conformance tests generate inputs randomly. For example, for ternary functions such as mad_sat the inputs are generated in test_three_param_integer_kernel as follows:

/* Generate some streams */
generate_random_data( vecAType, vecSize * TEST_SIZE, d, inDataA );
generate_random_data( vecBType, vecSize * TEST_SIZE, d, inDataB );
generate_random_data( vecCType, vecSize * TEST_SIZE, d, inDataC );

While, theoretically, running tests many, many times with different PRNG seeds can give a good confidence in the correctness of the implementation, in practice this is not feasible for large input spaces. E.g. mad_sat on long inputs has more than 6 octodecillion possible inputs to be tested, leaving plenty of room for bugs to hide.

In addition to random testing, the conformance test should probably provide mechanisms to check specific, representative inputs. Those inputs would then test different behaviours/corner cases.

Here are a few examples of inputs for mad_sat(x, y, z), which for reference purposes is clamp(x * y + z, domain_lower_bound, domain_upper_bound) if we assume integers capable of representing arbitrarily large values:

  • x, y, z s.t. x * y + z doesn't saturate;
  • x, y, z s.t. x * y saturates but x * y + z doesn't;
  • x, y, z s.t. x * y doesn't saturate but x * y + z does;
  • x, y, z s.t. x * y is {positive, negative} but x * y + z is {negative, positive};
  • ...
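
The categories above can be turned into a small table of fixed inputs. The sketch below does this for 32-bit int only, where the exact value of x * y + z fits in a 64-bit intermediate. mad_sat_ref, MadSatCase, and kMadSatCases are hypothetical names, and the specific values are illustrative choices rather than inputs taken from the CTS.

```cpp
#include <cstdint>
#include <limits>

const int32_t kIntMin = std::numeric_limits<int32_t>::min();
const int32_t kIntMax = std::numeric_limits<int32_t>::max();

// Reference mad_sat for 32-bit int: compute x * y + z exactly in 64 bits
// (the result always fits), then clamp to the int range.
int32_t mad_sat_ref(int32_t x, int32_t y, int32_t z)
{
    const int64_t exact = static_cast<int64_t>(x) * y + z;
    if (exact > kIntMax) return kIntMax;
    if (exact < kIntMin) return kIntMin;
    return static_cast<int32_t>(exact);
}

struct MadSatCase
{
    int32_t x, y, z, expected;
};

// One fixed input per category from the list above.
const MadSatCase kMadSatCases[] = {
    { 3, 4, 5, 17 },                      // nothing saturates
    { kIntMax, 2, kIntMin, kIntMax - 1 }, // x * y out of range, sum in range
    { kIntMax, 1, 1, kIntMax },           // x * y in range, sum saturates high
    { kIntMin, 2, 0, kIntMin },           // saturates low
    { 2, 3, -10, -4 },                    // positive product, negative sum
    { -2, 3, 10, 4 },                     // negative product, positive sum
};
```

Such a table could be run before the random pass, so every build exercises each saturation path at least once. The long variant would need a wider intermediate than this sketch uses.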

test_printf - parallel executed tests fail

test_printf uses the file /tmp/tmpfile or tmpfile to capture OpenCL kernel output, which is compared with the correct results. If multiple printf tests run in parallel, they fail because they write to the same file, and one test can read another's output.
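
One way to avoid the collision on POSIX systems is to give each run its own capture file through mkstemp. This is a sketch of the idea only: make_unique_capture_file and the test_printf_ name prefix are hypothetical, and a Windows build would need a different mechanism.

```cpp
#include <stdlib.h> // mkstemp (POSIX)
#include <unistd.h> // close
#include <string>

// Hypothetical helper: create a uniquely named, empty capture file under
// /tmp and return its path. Returns an empty string on failure.
std::string make_unique_capture_file()
{
    char path[] = "/tmp/test_printf_XXXXXX";
    // mkstemp replaces the XXXXXX suffix and creates the file atomically.
    const int fd = mkstemp(path);
    if (fd == -1) return std::string();
    close(fd);
    return std::string(path);
}
```

Each test process would then redirect its output to the returned path and remove the file when done, so parallel runs never share a file.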

OpenCL 2.2 CTS: can spirv10_2015.11.25.zip be provided to non-Khronos members?

OpenCL-CTS/readme-spir-v-binaries.txt says:

To run the 2.2 conformance tests test suite for the C++ features you need need 
SPIR-V binaries.
can be picked from:
https://cvs.khronos.org/svn/repos/OpenCL/trunk/Khronos/spirv/spirv10_2015.11.25.zip

That link cannot be accessed by non-Khronos users, so could this file be uploaded?
I ask because the text file also says:
Alternatively you can check out and build all of the below repositories.
The problem is that all of these links are also restricted to Khronos members, and although these projects exist on GitHub, they are missing the required branches. Note that I have also asked for the
SPIRV-LLVM (KhronosGroup/SPIRV-LLVM#218)
and
libclcxx (KhronosGroup/libclcxx#15)
projects on GitHub to be updated.

cl12 - half suite - vload_half function violates OpenCL's type-based aliasing rules

The root cause seems to be a violation in this test of OpenCL's type-based aliasing rules:

Consider the following kernel derived from Test_vLoadHalf.c:

__kernel void test( const __global half *p, __global float2 *f ) {
    __local int data[8/2];
    __local half* hdata_p = (__local half*) data;
    __global int* i_p = (__global int*)p;
    size_t i = get_global_id(0);
    size_t lid = get_local_id(0);
    int k;
    for (k=0; k<2/2; k++)
        data[lid*2/2+k] = i_p[i*2/2+k];
    f[i] = vload_half2( lid, hdata_p );
}

The kernel stores to the local int array but loads half values from it, effectively accessing it using two distinct data types. This seems to go against C99’s aliasing rules which OpenCL complies with (section 6.1.8 in spec 1.2) and can lead compilers to generate incorrect code by relying on type-based alias analysis.

Such tests can be legalized by converting them to use (aligned) ‘char’ instead of ‘short’, ‘int’ etc. for the local buffers as converting to ‘char’ is allowed, e.g.:

__kernel void test( const __global half *p, __global float2 *f )
{
    __attribute__ ((aligned (4))) __local char data[4*8/2];
    __local half* hdata_p = (__local half*) data;
    __global char* i_p = (__global char*)p;
    size_t i = get_global_id(0);
    size_t lid = get_local_id(0);
    int k;
    for (k=0; k<4*2/2; k++)
        data[4*lid*2/2+k] = i_p[4*i*2/2+k];
    f[i] = vload_half2( lid, hdata_p );
}

cl12 cl20 cl21 cl22 image_streams write tests don't allow flushing float denormals

The write tests use a normalized error fabsf( ( expected[ j ] - actual[ j ] ) / expected[ j ] ) for non-zero expected results. For denormals that get flushed to zero, this will compute a failing error value (e.g. 1 for CL_R images).

Additional issues in the error computation (limited here to CL_FLOAT tests in test_write_image_1D()):

  1. CL_FLOAT images are tested against a relative error of 0.005f, which has no correlation to spec requirements.
  2. The computed "err" is averaged over all channels, which is not provided for in the spec.
  3. NaN/Inf are not considered in the relative error calculation, and can lead to both false fails and false passes.

The overall test can still pass due to #27
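
A per-channel comparison that tolerates flushed denormals and treats NaN/Inf explicitly might look like the sketch below. channel_matches is a hypothetical function, and the tolerance parameter and the handling of a zero expected value are assumptions made for illustration. None of it is derived from the spec's actual error bounds.

```cpp
#include <cmath>

// Hypothetical per-channel check. `tolerance` is a caller-supplied relative
// error bound (an assumption, not a spec value).
bool channel_matches(float expected, float actual, float tolerance)
{
    // NaN and Inf never go through the relative-error formula.
    if (std::isnan(expected)) return std::isnan(actual);
    if (std::isinf(expected)) return actual == expected;
    if (std::isnan(actual) || std::isinf(actual)) return false;

    // A denormal expected value may legitimately be flushed to zero.
    if (std::fpclassify(expected) == FP_SUBNORMAL && actual == 0.0f)
        return true;

    // Avoid dividing by zero: fall back to an absolute comparison.
    if (expected == 0.0f) return std::fabs(actual) <= tolerance;

    return std::fabs((expected - actual) / expected) <= tolerance;
}
```

Checking each channel separately, instead of averaging, also keeps one badly wrong channel from being hidden by three accurate ones.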

Handling of in-tree SPIR-V assembly/binaries

A few issues to discuss:

  • Do we really want to commit the binaries? Shouldn't we just generate them from the assembly as part of the build system?
  • The assembly doesn't use the same syntax as the SPIR-V disassembler which makes it hard to check consistency between the binaries and assembly or check that PRs modify the binaries and assembly in the same manner.

Bad addressing in error report from async strided copy test

The basic async_strided_copy test contains the following:
200 for (int i=0; i<(int)globalBufferSize; i+=(int)elementSize*(int)stride)
201 {
202     if (memcmp( ((char *)inBuffer)+i, ((char *)outBuffer)+i, typeSize) != 0 )
203     {
204         unsigned char * inchar = (unsigned char*)inBuffer + i;
205         unsigned char * outchar = (unsigned char*)outBuffer + i;
206         char values[4096];
207         values[0] = 0;
208
209         log_error( "ERROR: Results of copy did not validate!\n" );
210         sprintf(values + strlen( values), "%d -> [", i);
211         for (int j=0; j<(int)elementSize; j++)
212             sprintf(values + strlen( values), "%2x ", inchar[i*elementSize+j]);
213         sprintf(values + strlen(values), "] != [");
214         for (int j=0; j<(int)elementSize; j++)
215             sprintf(values + strlen( values), "%2x ", outchar[i*elementSize+j]);
216         sprintf(values + strlen(values), "]");
217         log_error("%s\n", values);
218
219         return -1;
220     }
221 }

Lines 212 and 215 can access memory out of bounds, since inchar and outchar have already been offset by "i". The simple fix is to remove the "i*elementSize" term from the subscript expressions. This problem appears in all branches I checked.
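
The corrected indexing can be sketched as a small formatting helper. format_element_bytes is a hypothetical function and not code from the test. It takes a pointer that has already been advanced to the failing element, as inchar and outchar are, and indexes only with j.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: format element_size bytes starting at `element`,
// which must already point at the element being reported.
std::string format_element_bytes(const unsigned char *element,
                                 std::size_t element_size)
{
    std::string text = "[";
    char byte_text[8];
    for (std::size_t j = 0; j < element_size; ++j)
    {
        // Index with j only: the base pointer already includes the offset.
        std::snprintf(byte_text, sizeof(byte_text), "%2x ",
                      static_cast<unsigned>(element[j]));
        text += byte_text;
    }
    text += "]";
    return text;
}
```

Calling it as format_element_bytes(inchar, elementSize) and format_element_bytes(outchar, elementSize) would print the mismatching element without reading past the end of either buffer.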
