flamegpu / flamegpu

License: MIT License

XSLT 6.83% C++ 40.15% C 21.91% Batchfile 0.04% CMake 0.04% Shell 0.06% Python 0.41% Cuda 30.22% Makefile 0.34%

flamegpu's Issues

Modify Archive creation to include all source + binaries (anything other than temporary files) so that only a single download is required when users wish to run examples without compilation, and immediately follow development, rather than having to also download the source zip.

This will also mark the next release 1.4.3.

Bug with optional messaging when no messages are output

Spotted by Alcione

Sort agents before message input functions for spatial partitioned messages

Would improve cache coherence at the cost of sort. @Robadob thoughts?

Linux builds report missing glu.h for boids example

ubuntu@ip-10-63-219-58:~/FLAMEGPU/examples$ make
*********************************************************************
*  Copyright 2016 University of Sheffield.  All rights reserved.    *
*********************************************************************
make all -> Processes XML model and builds default modes per example*
           ------------------------------------------------         *
make build -> builds all executables in either release or debug     *
              only use this, if you already have all the .cu files  *
           ------------------------------------------------         *
All scripts are stored in bin/x64. To run, simple exectue the script*
*********************************************************************
ubuntu@ip-10-63-219-58:~/FLAMEGPU/examples$ make all
make[1]: Entering directory `/home/ubuntu/FLAMEGPU/examples/Boids_BruteForce'
xmllint --noout src/model/XMLModelFile.xml --schema ../../FLAMEGPU/schemas/XMMLGPU.xsd
src/model/XMLModelFile.xml validates
xsltproc ../../FLAMEGPU/templates/header.xslt  src/model/XMLModelFile.xml> src/dynamic/header.h
xsltproc ../../FLAMEGPU/templates/FLAMEGPU_kernals.xslt src/model/XMLModelFile.xml > src/dynamic/FLAMEGPU_kernals.cu
xsltproc ../../FLAMEGPU/templates/io.xslt src/model/XMLModelFile.xml > src/dynamic/io.cu
xsltproc ../../FLAMEGPU/templates/simulation.xslt src/model/XMLModelFile.xml > src/dynamic/simulation.cu
xsltproc ../../FLAMEGPU/templates/main.xslt src/model/XMLModelFile.xml > src/dynamic/main.cu
xsltproc ../../FLAMEGPU/templates/visualisation.xslt src/model/XMLModelFile.xml > src/dynamic/visualisation.cu
"/usr/local/cuda-7.5"/bin/nvcc -ccbin g++ -I../../common/inc  -m64     -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_52,code=compute_52 -I ../../include/ -I src/model/ -I src/dynamic/ -I src/visualisation/ -I ../../include//GL/ -I../../lib/ -o io.o -c src/dynamic/io.cu
"/usr/local/cuda-7.5"/bin/nvcc -ccbin g++ -I../../common/inc  -m64     -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_52,code=compute_52 -I ../../include/ -I src/model/ -I src/dynamic/ -I src/visualisation/ -I ../../include//GL/ -I../../lib/ -o simulation.o -c src/dynamic/simulation.cu
"/usr/local/cuda-7.5"/bin/nvcc -ccbin g++ -I../../common/inc  -m64     -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_52,code=compute_52 -I ../../include/ -I src/model/ -I src/dynamic/ -I src/visualisation/ -I ../../include//GL/ -I../../lib/ -o main_.o -c -DVISUALISATION src/dynamic/main.cu
In file included from src/dynamic/main.cu:21:0:
../../include/GL/glew.h:1142:20: fatal error: GL/glu.h: No such file or directory
 #include <GL/glu.h>
                    ^
compilation terminated.
make[1]: *** [main_.o] Error 1
make[1]: Leaving directory `/home/ubuntu/FLAMEGPU/examples/Boids_BruteForce'
make: *** [Boids_BruteForce/Makefile.ph_build] Error 2

Update FLAME GPU Documentation

The documentation needs to be updated with recent features and developments including:

Instrumentation constants
Limitations of statically allocated memory (2GB limit on Windows), which affects agent populations, message list size and the number of bins available for spatially partitioned messaging.
The current Linux build process
Step Functions
Analytics functions such as reductions
- Including that these are not available for agent variable arrays
Upcoming host-based agent creation (init and step functions)
PAUSE_ON_START in visualisation.h

etc.

https://github.com/FLAMEGPU/FLAMEGPU_TechnicalReport

Linux Binary tracked bin/x64/Release_Console

An executable binary for linux has been tracked in the bin/x64/Release_Console directory. This should be removed and ideally purged from history.

https://github.com/FLAMEGPU/FLAMEGPU/blob/master/bin/x64/Release_Console/CirclesBruteForce_float_console

Purge Binaries which should not be tracked from history
Purge XML files which should not be tracked from history
Purge shell scripts which should not be tracked from history
Update .gitignore to prevent these issues in the future / make people have to use git add -f to force them in (which becomes a conscious decision)

Potentially incorrect velocity matching in boids model

In https://github.com/FLAMEGPU/FLAMEGPU/blob/master/examples/Boids_Partitioning/src/model/functions.c#L148 the global velocity is normalized over the collision_count. However, the velocity is added for all boids in the interaction radius rather than separation radius, so this should probably be using the global_centre_count.

Additionally https://github.com/FLAMEGPU/FLAMEGPU/blob/master/examples/Boids_Partitioning/src/model/functions.c#L152 assigns match_velocity = match_velocity * MATCH_SCALE, where probably global_velocity was intended on the right-hand-side. Right now match_velocity will always stay zero.
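A minimal sketch of the two suggested corrections, not the code from the repository. `Vec3` and `match_velocity_term` are hypothetical stand-ins, and the `MATCH_SCALE` value is illustrative only; `global_velocity`, `global_centre_count` and `match_velocity` follow the names used in the issue.

```cpp
#include <cassert>

// Hypothetical stand-in for the vector type used by the boids example.
struct Vec3 { float x, y, z; };

// Illustrative value only; the real MATCH_SCALE is defined by the model.
constexpr float MATCH_SCALE = 0.5f;

// Averages the summed neighbour velocity over the number of boids that
// contributed to it (global_centre_count), then scales the result.
Vec3 match_velocity_term(Vec3 global_velocity, int global_centre_count) {
    assert(global_centre_count >= 0);
    Vec3 match_velocity{0.0f, 0.0f, 0.0f};
    if (global_centre_count > 0) {
        const float inv = 1.0f / static_cast<float>(global_centre_count);
        // Scale the averaged global velocity rather than match_velocity
        // itself, which would leave the result at zero.
        match_velocity.x = global_velocity.x * inv * MATCH_SCALE;
        match_velocity.y = global_velocity.y * inv * MATCH_SCALE;
        match_velocity.z = global_velocity.z * inv * MATCH_SCALE;
    }
    return match_velocity;
}
```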

Updating the model diagram generation code

The model diagram generation code (model2dot.py) was previously written by Alcione Oliveira. The generated diagram should be able to include init, step, and exit functions.

Invalid XML Model File handling

When an XML Model File is incomplete, there is not necessarily a relevant error message produced for the user.

For instance:
If a message partitioning scheme is not specified for a message, the code is generated but will not compile, due to undefined variable message_<message_name>_count.

Is the model file validated at all or is it up to the end user to ensure the model is valid?
If not, should the default of <gpu:partitioningNone></gpu:partitioningNone> be implied?

FLAME GPU makefile problem with Iceberg

FLAME GPU makefiles do not work on Iceberg (the Sheffield Uni HPC facility). The issue revolves around the call to nvcc.

The make file includes the CUDA_PATH so that calls to nvcc end up looking like /usr/local/cuda-7.5/bin/nvcc rather than just nvcc. I think that the former changes the location of where nvcc looks for files to the bin directory rather than the makefile location so that the cuda input files are not found. E.g.

/usr/local/cuda-7.5/bin/nvcc -ccbin g++ -I../../common/inc -m64 -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_52,code=compute_52 -I ../../include/ -I src/model/ -I src/dynamic/ -I src/visualisation/ -I ../../include/GL/ -o io.o -c src/dynamic/io.cu
bash: /usr/local/cuda-7.5/bin/nvcc: No such file or directory

fails when run from the makefile, whereas modifying the call to

nvcc -ccbin g++ -I../../common/inc -m64 -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_52,code=compute_52 -I ../../include/ -I src/model/ -I src/dynamic/ -I src/visualisation/ -I ../../include/GL/ -o io.o -c src/dynamic/io.cu

works correctly.

Linux Build Warnings

Warnings are generated during compilation under linux, which should be resolved where possible.

In file included from src/visualisation/CustomVisualisation.cpp:26:0:
src/visualisation/CustomVisualisation.h:50:8: warning: extra tokens at end of #endif directive
 #endif __VISUALISATION
        ^
In file included from src/visualisation/CustomVisualisation.cpp:27:0:
src/visualisation/GLUTInputController.h:55:8: warning: extra tokens at end of #endif directive
 #endif __GLUT_INPUT_CONTROLLER

...

 src/visualisation/CustomVisualisation.cpp: In function ‘void initVisualisation()’:
src/visualisation/CustomVisualisation.cpp:65:45: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
     char *argv[] = {"GLUT application", NULL};

...

src/visualisation/MenuDisplay.cpp: In function ‘void drawInfoDisplay(int, int)’:
src/visualisation/MenuDisplay.cpp:206:63: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
   printInfoLine("********** Simulation Information **********");
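The usual fixes for both kinds of warning can be sketched as follows. This is an illustration rather than a patch: the guard name `EXAMPLE_VISUALISATION_H` and the helpers `info_line_length` and `first_glut_argument` are hypothetical, and the real signature of printInfoLine is not shown in the log.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

#ifndef EXAMPLE_VISUALISATION_H
#define EXAMPLE_VISUALISATION_H
// ... declarations would go here ...
#endif // EXAMPLE_VISUALISATION_H (macro name as a comment, not a bare token)

// Accepting const char* lets callers pass string literals without a
// -Wwrite-strings warning. Hypothetical helper, not a FLAME GPU function.
std::size_t info_line_length(const char* line) {
    assert(line != nullptr);
    return std::strlen(line);
}

// When an API needs a mutable char* argv (as glutInit does), back it with a
// writable array instead of pointing at a string literal.
const char* first_glut_argument() {
    static char app_name[] = "GLUT application";
    static char* argv[] = {app_name, nullptr};
    return argv[0];
}
```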

Not possible to run generic code between simulation steps

This is required in some models where rules are based on statistical data from the population. For example, birth rate may be a function of the total population density, rather than the perceived local density. Such calculations could be performed by each agent, but it makes sense to have host functions which are able to perform things like reductions on agent variables.

Tidy up branches

Some of these have additional examples which should be merged. Branches which are not merged should have updated descriptions explaining what they are.

Bug on discrete messaging

Alcione has discovered a bug which causes occasional launch failures for discrete message input (of discrete agents). Code sent to Paul for testing.

Getting more done in GitHub with ZenHub

Hola! @mondus has created a ZenHub account for the FLAMEGPU organization. ZenHub is the only project management tool integrated natively in GitHub – created specifically for fast-moving, software-driven teams.

How do I use ZenHub?

To get set up with ZenHub, all you have to do is download the browser extension and log in with your GitHub account. Once you do, you’ll get access to ZenHub’s complete feature-set immediately.

What can ZenHub do?

ZenHub adds a series of enhancements directly inside the GitHub UI:

Real-time, customizable task boards for GitHub issues;
Multi-Repository burndown charts, estimates, and velocity tracking based on GitHub Milestones;
Personal to-do lists and task prioritization;
Time-saving shortcuts – like a quick repo switcher, a “Move issue” button, and much more.

Add ZenHub to GitHub

Still curious? See more ZenHub features or read user reviews. This issue was written by your friendly ZenHub bot, posted by request from @mondus.

Poor visualisation performance for Sugarscape Example

The Sugarscape visualisation performance is very poor, and does not render agents in the expected 2D square grid.

Sugarscape agents are discrete and have the agent variable location_id for visualisation purposes, rather than x/y.
x and y are calculated using location_id and population_width during output_agent_agent_to_VBO in visualisation.cu.

int population_width = (int)floor(sqrt((float)get_agent_agent_default_count()));
....
vbo[index].x = (agents->location_id[index] % population_width) - centralise.x;
vbo[index].y = floor((float)agents->location_id[index] / (float)population_width) - centralise.y;

get_agent_agent_default_count() returns xmachine_memory_agent_MAX defined at compile time

#define xmachine_memory_agent_MAX 1048576

There are 65536 agents in examples/Sugarscape/iterations/0.xml, not 1048576.

2 solutions:

Adjust the model file to only allow 65k agents, fixing the visualisation for when the number of agents is the maximum defined at compile time
Fix the root issue of using the maximum number of agents not the actual number of agents for the visualisation.
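The second option can be sketched as below: derive the grid width from the number of agents actually present instead of the compile-time maximum. The function names and the `GridPos` struct are hypothetical; only the square-root and modulo arithmetic comes from the snippet quoted in the issue.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 2D grid coordinate for a discrete agent.
struct GridPos { int x; int y; };

// Width of the square grid implied by the number of agents actually loaded.
int population_width_from_count(int agent_count) {
    assert(agent_count > 0);
    return static_cast<int>(std::floor(std::sqrt(static_cast<float>(agent_count))));
}

// Converts a linear location_id into grid coordinates for a given width.
GridPos grid_position(int location_id, int population_width) {
    assert(population_width > 0);
    return GridPos{location_id % population_width, location_id / population_width};
}
```

With 65536 agents the width is 256, so location_id 257 maps to (1, 1). Using the compile-time maximum of 1048576 gives a width of 1024 and the same agent lands at (257, 0), which is consistent with the agents not forming the expected square.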

[question] FLAMEGPU cluster configuration

Hi,
Is it possible to run FLAMEGPU on a GPU cluster environment (nodes connected via InfiniBand)?

Upgrade projects + makefiles to use CUDA 8.0 by default

CUDA 8.0 was announced over a year ago, and provides significantly improved compilation time compared to 7.5.

Additionally CUDA 9.0 has been announced (although with no public release date, likely Q3/Q4 2017 to support Volta GPUs in DGX-1v/DGX Station etc)

Visual studio project files, and linux Makefiles should be modified to default to 8.0, or the latest installed version if possible (Makefiles).

Issues with negative timestep on Pedestrian Navigation

Move to glm for math vectors

The initial change is on the glm-switch feature branch. I've fixed the bugs that appeared when compiling a few different projects (GLM required switching to CPP from C and a double support function which requires internal use of int2). Leaving the whole solution to batch build overnight.

Project creation / Empty Project

Creating a new model generally involves taking an existing model and stripping out the details.

Providing an empty example would simplify this process.

Another option would be to add the relevant VS commands to create a new project of the sort.

Projects outside of the examples folder for independent git repos would be nice too.

Limitations on Spatially Partitioned Messaging dimensions

Windows imposes limits on the size of statically allocated data. For both 32-bit and 64-bit Windows this is limited to 2GB (source).

The number of bins required for spatially partitioned messaging depends on the bounds and radius. If the combination of environment bounds and radius results in a very large number of bins being required (approx 536,870,912 bins, or roughly an 812*812*812 cube) then the application will not compile.
This should not be a problem for FLAMEGPU2 as dynamically allocated memory can be up to 8TB on Windows x64.

Furthermore, the use of cudaBindTexture in spatially partitioned messaging imposes an additional, device-dependent limit at runtime.
The total number of elements in a linear address range cannot exceed cudaDeviceProp::maxTexture1DLinear[0] (source), which for a Geforce GTX Titan X is 134217728 texels (source).
In practice this limits the number of bins in spatially partitioned messaging to 512*512*512.
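The figures above can be checked with a few lines of arithmetic. The sketch assumes one 4-byte entry per bin for the static allocation case, which is an assumption chosen because it reproduces the quoted 536,870,912 figure; the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Largest number of fixed-size elements that fit in a byte budget.
std::uint64_t max_elements(std::uint64_t byte_limit, std::uint64_t bytes_per_element) {
    assert(bytes_per_element > 0);
    return byte_limit / bytes_per_element;
}

// Largest integer side length n such that n*n*n <= elements.
std::uint64_t max_cube_side(std::uint64_t elements) {
    std::uint64_t n = 0;
    while ((n + 1) * (n + 1) * (n + 1) <= elements) {
        ++n;
    }
    return n;
}
```

Under that assumption, 2GB / 4 bytes gives 536,870,912 entries and a largest cube side of 812, while the 134,217,728 texel limit corresponds exactly to a 512*512*512 grid.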

Support loading of environment constants from initial states file

To avoid users having to create init functions, which set the initial value of environment constants either to a fixed value or by loading from disk manually, it would be beneficial to have this as loading from the initial states XML file (0.xml).

All required information to handle this is available at compile time from XMLModelFile.xml and can be built into io.cu.
This would be much simpler to support using @Robadob 's updated XML handling branch mentioned in #16

Add FLAME GPU documentation source files + build script(s) to repository

Add the source files for the FLAME GPU documentation to the repository to enable collaborative editing.

We will likely use pandoc to convert a markdown file to pdf, or possibly just latex. Build script(s) for windows/linux will also be required.

Pointer swaps for agent xmachine_memory_*_lists

@mondus suggested that under certain conditions it should be possible to avoid using append_<agentname>_Agents kernels to move agents from one state to another during agent functions.

If currentState == nextState, it should be safe to do a simple pointer swap.
- If there are no function conditions and reallocate==false
- Global conditions should be fine.
If there is a state change (currentState != nextState), and the population of the nextState == 0 then a pointer swap should be possible
- If there are no function conditions and reallocate == false.
- Possible edge case: multiple agent functions in the same layer have the same nextState
If the
When reallocate == true, it may be possible to still do the pointer swap
- It depends on the reallocation process, needs to be looked into.
The case of functions which create agents needs to be considered.
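A rough sketch of the conditions listed above, using host-side pointers only. `AgentList` and `try_pointer_swap` are hypothetical stand-ins for the generated xmachine_memory_*_list structures and do not cover the open cases (reallocation, agent creation, several functions sharing a nextState).

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for a per-state agent list.
struct AgentList { std::vector<int> ids; };

// Swaps the current and next state lists instead of appending, but only when
// the listed conditions hold: no function condition, reallocate == false,
// and the destination list is empty. Returns true if the swap happened.
bool try_pointer_swap(AgentList*& current, AgentList*& next,
                      bool has_function_condition, bool reallocate) {
    assert(current != nullptr && next != nullptr);
    if (has_function_condition || reallocate) {
        return false;  // an append/scatter step would still be required
    }
    if (!next->ids.empty()) {
        return false;  // destination already holds agents; cannot overwrite
    }
    std::swap(current, next);
    return true;
}
```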

The cost of the append function is relatively small compared to typical agent functions. However, for large agent populations with large numbers of agent parameters the cost will not be insignificant.
For the keratinocyte model with only 1024 agents, the append kernel takes only ~4μs per invocation plus any kernel sync overheads; however, this is a very small agent population.

Support dynamic graphs with multiple new messaging types

Following on from static graph support a more general case of dynamic graphs can be implemented.
This will allow agents to represent vertices and edges, with additional messaging techniques for edge/vertex based messaging etc.

network-as-agents.txt

@Robadob brought up that we should also consider directed/undirected graphs.

Generate Step/Exit functions in functions.xslt

In the Analytics branch, outlines for the step/exit functions should be generated, i.e.

__FLAME_GPU_STEP_FUNC__ void .... {
//...
}

Visualisation does not seem to work for multiple agent types

Only the first agent type is rendered. Disabling the draw call for the first agent type enables the second agent type to be seen.

Generic functions required for reducing agent variables

Required for calculating properties about agents for analysis. Required functions would be min, max, histogram etc.

Multiple Spatial Partitions unsupported in agent_function

If an agent function requires access to two spatially partitioned message lists, both parameters are named partition_matrix within header.h, meaning that compilation fails.

Noticed by Mozhgan.

Should be a quick fix;

a) Remove identifiers from the method prototypes
b) append _<xsl-select....name()> to the identifiers

The rarely used template for functions.c will likely also require updating.

Support static graphs for on-network communication

To support on-network communication a static network should be defined as a part of the environment and a new message partitioning technique.

Only IDs will be used in user facing locations, with indices used internally, as supporting work for dynamic networks. This will require lookups between IDs and indices, however if the ID == index for the entire network structure this lookup could be skipped.

An Example XMLModelFile is included

Stable Marriage Example init file is invalid

The Stable Marriage init file incorrectly provides Woman agents with preferred_woman properties. Assuming the data is as intended, this should be preferred_man, as specified in the XMLModelFile.

Generated functions.c commented out code formatting

The functions.c template generates commented-out code for each function - i.e. agent output functions or message processing code.

This code uses docblock style comments rather than normal multi-line comments which is unnecessary and results in extra work for the end user.

i.e.

/* //Template for agent output functions
*
* int id = 0;
* int integer_value = 0;
* float float_value = 0;
* add_agent(agents, int id, int integer_value, float float_value);
*/

should be replaced with the following (or similar)

/* 
// Template for agent output functions
int id = 0;
int integer_value = 0;
float float_value = 0;
add_agent(agents, id, integer_value, float_value);
*/

printf & comment typos/errors

Several printf statements (and comments) contain errors / inconsistencies which can be corrected (i.e. memeory).

Incorrect error message with invalid device ID

When specifying a device ID on launch of a simulation, if an invalid device is selected an appropriate error message should be shown.

The wrong error is generated when a user selects the first device which does not exist. I.e. on a system with 2 GPUs, setting the device id parameter to 2 generates the error message

Error setting CUDA device!

rather than

Error selecting CUDA device! Device id '2' is not found?
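The behaviour described suggests a bounds check that lets `id == device_count` through, although the issue does not show the code. A host-only sketch of the intended check follows, with the hypothetical helper `device_selection_error`; in the real code the device count would come from the CUDA runtime.

```cpp
#include <cassert>
#include <string>

// Returns an empty string when requested_id is a valid device index,
// otherwise the descriptive error message quoted in the issue.
std::string device_selection_error(int requested_id, int device_count) {
    assert(device_count >= 0);
    // Valid ids are 0 .. device_count - 1, so id == device_count is invalid.
    if (requested_id >= 0 && requested_id < device_count) {
        return std::string();
    }
    return "Error selecting CUDA device! Device id '" +
           std::to_string(requested_id) + "' is not found?";
}
```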

Reductions across agent variable arrays

Reductions across agent arrays seem incorrect, as the number of elements in the array is not considered.
i.e.

float reduce_Agent_default_example_array_variable(){
    //reduce in default stream
    return thrust::reduce(thrust::device_pointer_cast(d_Agents_default->example_array),  thrust::device_pointer_cast(d_Agents_default->example_array) + h_xmachine_memory_Agent_default_count);
}

Where each example_array is 4 elements long per agent.

As the array is strided (h_Agents_default->example_array[(j*xmachine_memory_Agent_MAX)+i]), the reduction is only applied to the 0th element of each agent array.
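The effect of the strided layout can be reproduced on the host. The sketch below uses plain loops over a `std::vector` instead of thrust, and the function names are hypothetical; only the indexing `(j * agent_max) + i` is taken from the issue.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Mimics the reported behaviour: sums the first agent_count contiguous
// values, which covers only element 0 of each agent's array.
float reduce_first_elements_only(const std::vector<float>& data, std::size_t agent_count) {
    assert(agent_count <= data.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < agent_count; ++i) {
        sum += data[i];
    }
    return sum;
}

// Sums every array element of every live agent, where element j of agent i
// is stored at index (j * agent_max) + i.
float reduce_strided_array(const std::vector<float>& data, std::size_t agent_max,
                           std::size_t agent_count, std::size_t array_length) {
    assert(agent_count <= agent_max);
    assert(data.size() >= agent_max * array_length);
    float sum = 0.0f;
    for (std::size_t j = 0; j < array_length; ++j) {
        for (std::size_t i = 0; i < agent_count; ++i) {
            sum += data[j * agent_max + i];
        }
    }
    return sum;
}
```

On the device, one possible equivalent would be one reduction per array element j over the range starting at offset `j * agent_max`, with the partial results combined afterwards.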

Instrumentation: Population from disk

When OUTPUT_POPULATION_PER_ITERATION is set, it would be good to output the population after any INIT functions and/or after the initial set of agents have been loaded from disk.

Agents located outside of env bounds read from some bins twice during spatial partitioning.

If an agent is located outside of the environment bounds (min <= x < max), its grid location is wrapped. Because the wrapping only replaces out-of-bounds locations with the opposite bound, an agent that is out of bounds can search cells that are even further out of bounds, and both cells are then 'wrapped' to the same cell.

Hence the agent reads that cell twice (double dips).

Potential fixes:

Do proper modular wrapping
Throw an assertion error when agents are located out of bounds
Increase env max by +radius (and potentially decrease min by the same)
- If you only increase env max by 1, the number of bins increases, however they are shared across the env width, meaning that the interaction radius is subsequently reduced.
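The first option, proper modular wrapping, can be sketched for a single grid dimension as below. `wrap_to_opposite_bound` is a guess at the current behaviour based only on the description above, and both function names are hypothetical.

```cpp
#include <cassert>

// Assumed current behaviour: any out-of-range cell is replaced by the
// opposite bound, so distinct out-of-range cells collapse onto one cell.
int wrap_to_opposite_bound(int cell, int dim) {
    assert(dim > 0);
    if (cell < 0) return dim - 1;
    if (cell >= dim) return 0;
    return cell;
}

// True modular wrapping: distinct cells stay distinct as long as they are
// less than dim apart.
int wrap_cell(int cell, int dim) {
    assert(dim > 0);
    const int r = cell % dim;  // may be negative for negative cell
    return r < 0 ? r + dim : r;
}
```

With 10 cells, cells -1 and -2 both map to 9 under the assumed current scheme, whereas modular wrapping maps them to 9 and 8.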

Agent creation from host in INIT and STEP functions

Allow users to create agents from the host, within INIT and STEP functions.

Reduce repository file size

The repository includes files which are not well suited to being tracked by git, nor should be tracked by git.

These files should be removed, and purged from the history of the repository to greatly reduce file size, while all the required files and tools to generate them will still be available.

Executable files should be provided as an archive with each tagged release, removing the need for them to be updated for all commits.

Files to purge from history and add to .gitignore if not already present:

/bin/* other than *.bat files
/examples/*/src/dynamic/*

Optionally a script should be added to the /tools/ directory which compiles all examples and produces an archive of the binary folder to be distributed with a tagged release.

Texture binding doesn't support doubles correctly.

When binding spatially partitioned messages, the allocated texture memory is sized as count*sizeof(int). This should be count*sizeof(<xsl:value-of select="xmml:type"/>), otherwise non-4-byte values (e.g. doubles) are handled incorrectly, leading to bad message parsing (i.e. a bunch of zeroed messages if in the 2nd half).

Simulation.xslt::746

Is:
gpuErrchk( cudaBindTexture(&tex_xmachine_message_<xsl:value-of select="../../xmml:name"/>_<xsl:value-of select="xmml:name"/>_byte_offset, tex_xmachine_message_<xsl:value-of select="../../xmml:name"/>_<xsl:value-of select="xmml:name"/>, d_<xsl:value-of select="../../xmml:name"/>s-><xsl:value-of select="xmml:name"/>, sizeof(int)*xmachine_message_<xsl:value-of select="../../xmml:name"/>_MAX));

Should be:
gpuErrchk( cudaBindTexture(&tex_xmachine_message_<xsl:value-of select="../../xmml:name"/>_<xsl:value-of select="xmml:name"/>_byte_offset, tex_xmachine_message_<xsl:value-of select="../../xmml:name"/>_<xsl:value-of select="xmml:name"/>, d_<xsl:value-of select="../../xmml:name"/>s-><xsl:value-of select="xmml:name"/>, sizeof(<xsl:value-of select="xmml:type"/>)*xmachine_message_<xsl:value-of select="../../xmml:name"/>_MAX));

Notice the change of sizeof(int) on the start of the last wrapped line.

Incorrect FPS reporter

The average FPS reported using average/(millis/1000.0f) does not match with the actual FPS seen. Can fix this by adding a variable frame_time in global visualisation.cu memory and including frame_time += millis; each frame as well as setting to zero during if(frame_count == average){ ... average/(frame_time/1000.0f); frame_time = 0.0f; }

Also possible to include is the frame time, which can be calculated every average frames with millis for that frame. I.e. the entire window title becomes
sprintf(title, "Execution & Rendering Total: %f (FPS), %f milliseconds per frame", average/(frame_time/1000.0f), millis);
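The proposed accumulation can be sketched as a small host-side class. `FpsCounter` is a hypothetical name; `frame_time`, `frame_count`, `average` and `millis` correspond to the variables named in the issue.

```cpp
#include <cassert>

// Accumulates per-frame times and reports the average FPS once every
// `average` frames, using the total elapsed time rather than the time of
// the most recent frame only.
class FpsCounter {
public:
    explicit FpsCounter(int average) : average_(average) { assert(average > 0); }

    // Records one frame; returns true when a new FPS value is available.
    bool add_frame(float millis) {
        frame_time_ += millis;
        ++frame_count_;
        if (frame_count_ != average_) {
            return false;
        }
        fps_ = static_cast<float>(average_) / (frame_time_ / 1000.0f);
        frame_time_ = 0.0f;
        frame_count_ = 0;
        return true;
    }

    float fps() const { return fps_; }

private:
    int average_;
    int frame_count_ = 0;
    float frame_time_ = 0.0f;
    float fps_ = 0.0f;
};
```

For two frames of 250 ms and 750 ms this reports 2 FPS, whereas dividing by the last frame's time alone would report roughly 2.67.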

Discrete agents can not read continuous messages (either non partitioned or spatial)

No examples in the SDK use this functionality. Issue spotted by Alcione.

Linux Compilation

With the majority of national HPC facilities and most research Universities running Linux on their production servers (with GPUs), it would be great if FlameGPU was developed with a multi-platform focus such that compilation/execution was supported on both Windows and Linux (Ubuntu/CentOS). My understanding is that the new NVIDIA DGX system will run a custom version of Ubuntu Linux and the majority of GPU accelerated servers in US University compute centers (such as mine at ND) run CentOS/RedHat.

Possible bug in circles model

Just noticed that on Line 71 of circles/functions.c the value of xmemory->fx is being overwritten with the count of the number of iterations at the end of the kernel. This seems out of place, as xmemory->fx and xmemory->fy have been adjusted within the neighbourhood search and nothing done with the result (so why would they be treated differently?).

https://github.com/FLAMEGPU/FLAMEGPU/blob/master/examples/CirclesBruteForce_float/src/model/functions.c#L71

Environment constant getter functions do not compile for arrays

The constant getter functions added in 09c4c67 such as const int* get_CYCLE_LENGTH() from the keratinocyte example do not compile.

As these are fixed size arrays, (i.e. int h_env_CYCLE_LENGTH[5]) this is not simple, and merely changing the prototype to const int** get_CYCLE_LENGTH() is not sufficient.

Return/Exit Codes

Currently, exit codes (the return value of main()) are generally exit(0) whether or not the operation succeeded. The exceptions are CUDA errors, where the CUDA error code is used as the exit value, and main, which returns 1 when the device could not be reset.

It would be better to make use of EXIT_FAILURE and EXIT_SUCCESS from <stdlib.h> where appropriate.
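One way this could look is to map internal outcomes onto the two standard codes in a single place. The `RunStatus` enum and `to_exit_code` are hypothetical names used only for illustration.

```cpp
#include <cstdlib>

// Hypothetical set of outcomes a simulation run might report.
enum class RunStatus { Ok, BadArguments, CudaError };

// Maps any outcome to the portable exit codes from <cstdlib>.
constexpr int to_exit_code(RunStatus status) {
    return status == RunStatus::Ok ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

main() would then end with `return to_exit_code(status);` rather than calling exit(0) on every path.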

XML Upgrade

Working in the rapidxml branch, I've updated the io.xslt template to use the rapidxml header library for importing initialisation files. This makes the generated code far more approachable and maintainable (example). However, I decided to change io.cu into io.cpp, as the CUDA compiler produces a lot of garbage warnings when including rapidxml.hpp.

In doing so, I added support for a <nowarn\> flag in the root of an init file to suppress warnings about missing agent properties and new support for setting environmental constants based on the init file as visible in the Keratinocyte init file (this was something visible in a few places, but never implemented with most examples using init functions or other custom code).

I've tested that it still loads things correctly with a few existing models (e.g. the pedestrian visualisation) and breakpointing to check var values (e.g. Keratinocyte constants). Just running a full rebuild to ensure nothing else has been broken, will merge after @mondus has had a chance to approve next week.

Split into multiple compilation units

When FLAMEGPU was first set up the 'relocatable device code' option wasn't available, so device functions had to be in the same object as their calling function. This meant that functions.c was made to be included from simulation.cu/fgpu_kernals.cu.

This is a bad idea because it means that any preprocessor macros defined within functions.c leak into the remainder of simulation.cu. It also makes partial compilation slower, as the whole of simulation/fgpu_kernals/functions.c must be recompiled if you change one line in functions.c. Also, because functions.c is an include marked as do-not-compile, IDEs can fail to detect when a recompilation is required.

@mozhgan-kch had this issue this afternoon, and short of noticing the highlighting is off (which only works in an IDE) or working it out yourself, it's a pain to diagnose. In this case the CUDA constant symbol was being replaced with a literal at compile time, compiling fine and then throwing a runtime error with invalid device symbol.

It might be feasible to rename functions.c to functions.cu and enable the relocatable device code option, however I expect there will be other minor changes.
