Comments (4)
The CUDA Math library supports all C99 standard float and double math functions. Therefore, I have now implemented a wrapper for each of those functions that either promotes integral types to `float` when the function is called in device code, or calls the normal C++ function (which is already overloaded for integral types, effectively casting the argument to `double`) when called in host code. For any function `func` it looks like this:
```cpp
template <typename T>
__host__ __device__
float _brian_func(T value)
{
#if (defined(__CUDA_ARCH__) && (__CUDA_ARCH__ > 0))
    // device code: promote integral types to float and call the
    // single precision CUDA math function
    return funcf((float)value);
#else
    // host code: the C++ overload casts integral arguments to double
    return func(value);
#endif
}

inline __host__ __device__
double _brian_func(double value)
{
    return func(value);
}
```
Tests for all functions are written as well (for both host and device code) and are passing.
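As a rough illustration of how the two code paths can be compared, here is a self-contained sketch following the pattern above; `_brian_sqrt` and the kernel name are hypothetical, and this is not the actual brian2cuda test code:

```cpp
// Hypothetical sketch (not the brian2cuda test suite): compare the host and
// device paths of one wrapper instance for an integral argument.
#include <cmath>
#include <cstdio>
#include <cuda_runtime.h>

template <typename T>
__host__ __device__
float _brian_sqrt(T value)
{
#if (defined(__CUDA_ARCH__) && (__CUDA_ARCH__ > 0))
    return sqrtf((float)value);   // device: promote the integral argument to float
#else
    return std::sqrt(value);      // host: the C++ overload casts it to double
#endif
}

__global__ void device_sqrt_int(float* out)
{
    *out = _brian_sqrt(2);        // integral argument in device code
}

int main()
{
    float host_result = _brian_sqrt(2);   // integral argument in host code

    float* d_out;
    float device_result;
    cudaMalloc(&d_out, sizeof(float));
    device_sqrt_int<<<1, 1>>>(d_out);
    cudaMemcpy(&device_result, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_out);

    printf("host:   %f\n", host_result);
    printf("device: %f\n", device_result);
    return 0;
}
```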
And it turns out `long double` is always treated as `double` in device code anyway, so there is no need to take care of that.

Done. Closing.
---
Are integer types the only values that are used for the template parameter in instances of these code snippets?

Additionally: why not cast `int` to (64-bit) `double` as in normal C++ (assuming what you write is correct), but to (32-bit) `float`, which is not accurate enough for (32-bit) `int` values due to the limited size of the mantissa? I therefore suggest using `double`.

Finally: for integer types with more than 32 bits, e.g. 64 bits, even (64-bit) `double` might be too small. Therefore, `long int` occurrences should give rise to a warning that these values might be cast with loss of precision.
---
> Are integer types the only values that are used for the template parameter in instances of these code snippets?
These snippets are used whenever one of the default functions is used in any string expression. These examples cover some possible use cases at the moment:
```python
G = NeuronGroup(3, 'dv/dt = sin(v)')  # will use the double version of sin, since v is double by default
G.v[0] = 'sin(i)'    # i is int32_t type
G.v[1] = 'sin(1)'    # 1 is int type
G.v[2] = 'sin(1.0)'  # 1.0 is double type
```
> Additionally: why not cast `int` to (64-bit) `double` as in normal C++ (assuming what you write is correct), but to (32-bit) `float`, which is not accurate enough for (32-bit) `int` values due to the limited size of the mantissa? I therefore suggest using `double`.
Yeah right, I didn't think of the fact that we can lose information when promoting integral to floating point types. I used `float` because of the faster single precision arithmetic on the GPU. But I think in many use cases we won't lose information when casting integral types to `float` (the integer would have to be > 2^23 = 8388608, given the 23-bit mantissa of 32-bit floats). So maybe we should cast to `double` by default and add an extra preference when we add support for single precision data types (#37)?
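To make the loss concrete, here is a minimal host-side illustration (plain C++, not brian2cuda code) showing that a large 32-bit `int` survives a round trip through `double` but not through `float`:

```cpp
// Minimal demonstration of the precision loss discussed above.
#include <cstdio>

int main()
{
    int big = 123456789;             // needs 27 significant bits
    float  f = (float)big;           // float mantissa is too small: rounded to 123456792
    double d = (double)big;          // double represents it exactly

    printf("int:    %d\n", big);
    printf("float:  %.1f\n", (double)f);
    printf("double: %.1f\n", d);
    return 0;
}
```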
> Finally: for integer types with more than 32 bits, e.g. 64 bits, even (64-bit) `double` might be too small. Therefore, `long int` occurrences should give rise to a warning that these values might be cast with loss of precision.
As far as I can tell, this case can only happen if the user specifically specifies a `numpy.int64` or `numpy.uint64` data type in the model. So we could create a warning with the Brian2 logging system every time a user specifies a 64-bit int type (easy to do). If we want a warning only when int64 types are used inside default function calls, I would have to dig a little to see where to check for that. Another option would be to create template specialisations for the 64-bit integer types in the code snippets and create compiler warnings there using `#warning` or `#pragma message`. I guess I'd prefer the first option.
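One caveat with the compiler-warning route: `#warning` and `#pragma message` are emitted when the file is preprocessed, regardless of whether the specialisation is ever used, so a warning tied to actual call sites would need something like the `[[deprecated]]` attribute (C++14) instead. A hypothetical sketch of that variant, shown for `sin` only and not the brian2cuda implementation:

```cpp
// Hypothetical sketch: warn at every call site that passes an int64_t.
// The __CUDA_ARCH__ promotion branching from the snippets above is omitted
// here for brevity.
#include <cstdint>
#include <cmath>

template <typename T>
__host__ __device__
double _brian_sin(T value)
{
    return sin((double)value);
}

// Overload for 64-bit integers: using it emits a compiler warning.
[[deprecated("int64_t argument to _brian_sin may lose precision when cast to double")]]
inline __host__ __device__
double _brian_sin(int64_t value)
{
    return sin((double)value);
}

__global__ void kernel(double* out, int64_t i)
{
    out[0] = _brian_sin(i);   // deprecation warning at compile time
}
```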
So what about:
```cpp
template <typename T>
__host__ __device__
double _brian_func(T value)
{
#if (defined(__CUDA_ARCH__) && (__CUDA_ARCH__ > 0))
    return func((double)value);
#else
    return func(value);
#endif
}

inline __host__ __device__
float _brian_func(float value)
{
    return funcf(value);
}
```
and later we can just switch the `double` and `float` specifiers depending on some user preference (e.g. something like `prefs.devices.cuda_standalone.integer_promotion_to = 'float'/'double'`). And once we have the option for single precision floating point (#37), these would then always use the faster single precision arithmetic.
---
I added a preference:

`prefs.codegen.generators.cuda.default_functions_integral_convertion`

The default is `'double_precision'`, converting integral types to `double` (as shown in the code snippet in my last comment). Setting it to `'single_precision'` exchanges the `float` and `double` keywords in the functions, converting integral types to `float`.
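For illustration, and assuming the swap works exactly as described above, the `'single_precision'` variant of the wrapper would look roughly like this (`func` again stands for the respective math function, as in the earlier snippets):

```cpp
// Sketch of the assumed 'single_precision' variant: integral types are
// promoted to float, doubles keep using the double precision function.
template <typename T>
__host__ __device__
float _brian_func(T value)
{
#if (defined(__CUDA_ARCH__) && (__CUDA_ARCH__ > 0))
    return func((float)value);
#else
    return func(value);
#endif
}

inline __host__ __device__
double _brian_func(double value)
{
    return func(value);
}
```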
I have also added warnings. Just warning whenever a 64-bit integer type is used doesn't work, since the `Clock` class always uses them. So instead the `cuda_generator` checks for the occurrence of one of the relevant default functions together with an integer type in the same generated code line and gives a warning using the brian2 logging system. We can't parse only the arguments of the function with regex because of possibly nested parentheses (e.g. `_brian_sin((1+i)*5)`), and using another parsing package just for this seems unnecessary. So we also warn for a line like `v = i * _brian_sin(1);` if `i` is `int64_t`, but I'd say that's fine.
We warn in two cases:

- `int64` types
- `int32` types when `prefs.codegen.generators.cuda.default_functions_integral_convertion == 'single_precision'`
I also added tests for setting the preference and the correct warnings. So, trying to close this again :)