flamegpu / flamegpu
License: MIT License
Noticed a few typos in the makefiles.
Modify Archive creation to include all source + binaries (anything other than temporary files) so that only a single download is required when users wish to run examples without compilation, and immediately follow development, rather than having to also download the source zip.
This will also mark the next release, 1.4.3.
Spotted by Alcione
Would improve cache coherence at the cost of sort. @Robadob thoughts?
ubuntu@ip-10-63-219-58:~/FLAMEGPU/examples$ make
*********************************************************************
* Copyright 2016 University of Sheffield. All rights reserved. *
*********************************************************************
make all -> Processes XML model and builds default modes per example*
------------------------------------------------ *
make build -> builds all executables in either release or debug *
only use this, if you already have all the .cu files *
------------------------------------------------ *
All scripts are stored in bin/x64. To run, simple exectue the script*
*********************************************************************
ubuntu@ip-10-63-219-58:~/FLAMEGPU/examples$ make all
make[1]: Entering directory `/home/ubuntu/FLAMEGPU/examples/Boids_BruteForce'
xmllint --noout src/model/XMLModelFile.xml --schema ../../FLAMEGPU/schemas/XMMLGPU.xsd
src/model/XMLModelFile.xml validates
xsltproc ../../FLAMEGPU/templates/header.xslt src/model/XMLModelFile.xml> src/dynamic/header.h
xsltproc ../../FLAMEGPU/templates/FLAMEGPU_kernals.xslt src/model/XMLModelFile.xml > src/dynamic/FLAMEGPU_kernals.cu
xsltproc ../../FLAMEGPU/templates/io.xslt src/model/XMLModelFile.xml > src/dynamic/io.cu
xsltproc ../../FLAMEGPU/templates/simulation.xslt src/model/XMLModelFile.xml > src/dynamic/simulation.cu
xsltproc ../../FLAMEGPU/templates/main.xslt src/model/XMLModelFile.xml > src/dynamic/main.cu
xsltproc ../../FLAMEGPU/templates/visualisation.xslt src/model/XMLModelFile.xml > src/dynamic/visualisation.cu
"/usr/local/cuda-7.5"/bin/nvcc -ccbin g++ -I../../common/inc -m64 -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_52,code=compute_52 -I ../../include/ -I src/model/ -I src/dynamic/ -I src/visualisation/ -I ../../include//GL/ -I../../lib/ -o io.o -c src/dynamic/io.cu
"/usr/local/cuda-7.5"/bin/nvcc -ccbin g++ -I../../common/inc -m64 -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_52,code=compute_52 -I ../../include/ -I src/model/ -I src/dynamic/ -I src/visualisation/ -I ../../include//GL/ -I../../lib/ -o simulation.o -c src/dynamic/simulation.cu
"/usr/local/cuda-7.5"/bin/nvcc -ccbin g++ -I../../common/inc -m64 -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_52,code=compute_52 -I ../../include/ -I src/model/ -I src/dynamic/ -I src/visualisation/ -I ../../include//GL/ -I../../lib/ -o main_.o -c -DVISUALISATION src/dynamic/main.cu
In file included from src/dynamic/main.cu:21:0:
../../include/GL/glew.h:1142:20: fatal error: GL/glu.h: No such file or directory
#include <GL/glu.h>
^
compilation terminated.
make[1]: *** [main_.o] Error 1
make[1]: Leaving directory `/home/ubuntu/FLAMEGPU/examples/Boids_BruteForce'
make: *** [Boids_BruteForce/Makefile.ph_build] Error 2
The documentation needs to be updated with recent features and developments, including PAUSE_ON_START in visualisation.h, etc.
An executable binary for Linux has been tracked in the bin/x64/Release_Console directory. This should be removed and ideally purged from history.
Such files should also be added to .gitignore to prevent these issues in the future, so that people have to use git add -f to force them in (which becomes a conscious decision).
In https://github.com/FLAMEGPU/FLAMEGPU/blob/master/examples/Boids_Partitioning/src/model/functions.c#L148 the global velocity is normalized over the collision_count. However, the velocity is added for all boids in the interaction radius rather than the separation radius, so this should probably be using the global_centre_count.
Additionally, https://github.com/FLAMEGPU/FLAMEGPU/blob/master/examples/Boids_Partitioning/src/model/functions.c#L152 assigns match_velocity = match_velocity * MATCH_SCALE, where global_velocity was probably intended on the right-hand side. Right now match_velocity will always stay zero.
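A minimal sketch of the calculation this issue appears to be asking for, not the code in functions.c. The names Vec3 and compute_match_velocity are hypothetical, and it assumes the summed neighbour velocity should be averaged over global_centre_count before scaling:

```cpp
#include <cassert>

// Hypothetical 3-component vector, standing in for the float3 used in the example.
struct Vec3 { float x, y, z; };

// Averages the summed neighbour velocity over the number of boids within the
// interaction radius (global_centre_count), then applies the match scale.
// Scaling the averaged global velocity, rather than match_velocity itself,
// means the result is no longer stuck at zero.
Vec3 compute_match_velocity(Vec3 global_velocity_sum, int global_centre_count, float match_scale) {
    assert(global_centre_count >= 0);
    if (global_centre_count == 0) {
        return Vec3{0.0f, 0.0f, 0.0f};  // no neighbours: nothing to match
    }
    const float scale = match_scale / static_cast<float>(global_centre_count);
    return Vec3{global_velocity_sum.x * scale,
                global_velocity_sum.y * scale,
                global_velocity_sum.z * scale};
}
```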
The model diagram generation code (model2dot.py) was previously written by Alcione Oliveira. The generated diagram should be able to include init, step, and exit functions.
When an XML Model File is incomplete, there is not necessarily a relevant error message produced for the user.
For instance:
If a message partitioning scheme is not specified for a message, the code is generated but will not compile, due to the undefined variable message_<message_name>_count.
Is the model file validated at all or is it up to the end user to ensure the model is valid?
If not, should the default of <gpu:partitioningNone></gpu:partitioningNone> be implied?
FLAME GPU makefiles do not work on Iceberg (the Sheffield Uni HPC facility). The issue revolves around the call to nvcc.
The makefile includes the CUDA_PATH so that calls to nvcc end up looking like /usr/local/cuda-7.5/bin/nvcc rather than just nvcc. I think that the former changes the location where nvcc looks for files to the bin directory rather than the makefile location, so that the CUDA input files are not found. E.g.
/usr/local/cuda-7.5/bin/nvcc -ccbin g++ -I../../common/inc -m64 -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_52,code=compute_52 -I ../../include/ -I src/model/ -I src/dynamic/ -I src/visualisation/ -I ../../include/GL/ -o io.o -c src/dynamic/io.cu
bash: /usr/local/cuda-7.5/bin/nvcc: No such file or directory
fails when run from the makefile, whereas modifying the call to
nvcc -ccbin g++ -I../../common/inc -m64 -gencode arch=compute_20,code=sm_20 -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_52,code=compute_52 -I ../../include/ -I src/model/ -I src/dynamic/ -I src/visualisation/ -I ../../include/GL/ -o io.o -c src/dynamic/io.cu
works correctly.
Warnings are generated during compilation under linux, which should be resolved where possible.
In file included from src/visualisation/CustomVisualisation.cpp:26:0:
src/visualisation/CustomVisualisation.h:50:8: warning: extra tokens at end of #endif directive
#endif __VISUALISATION
^
In file included from src/visualisation/CustomVisualisation.cpp:27:0:
src/visualisation/GLUTInputController.h:55:8: warning: extra tokens at end of #endif directive
#endif __GLUT_INPUT_CONTROLLER
...
src/visualisation/CustomVisualisation.cpp: In function ‘void initVisualisation()’:
src/visualisation/CustomVisualisation.cpp:65:45: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
char *argv[] = {"GLUT application", NULL};
...
src/visualisation/MenuDisplay.cpp: In function ‘void drawInfoDisplay(int, int)’:
src/visualisation/MenuDisplay.cpp:206:63: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
printInfoLine("********** Simulation Information **********");
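Both kinds of warning have standard fixes, sketched below under assumptions: the guard name EXAMPLE_VISUALISATION_H and the helpers print_info_line_length and make_argv are hypothetical and do not come from the FLAME GPU sources. Any text after #endif goes in a comment, string literals are accepted as const char*, and a mutable argv is built from a writable char array.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

#ifndef EXAMPLE_VISUALISATION_H
#define EXAMPLE_VISUALISATION_H

// Taking const char* lets callers pass string literals without triggering
// the -Wwrite-strings "deprecated conversion" warning.
inline std::size_t print_info_line_length(const char* line) {
    assert(line != nullptr);
    return std::strlen(line);
}

#endif // EXAMPLE_VISUALISATION_H (guard name kept as a comment, not a bare token)

// Fills a two-entry argv from a writable buffer instead of a string literal,
// for APIs that expect char** rather than const char**. Returns argc.
inline int make_argv(char* app_name, char* argv[2]) {
    argv[0] = app_name;
    argv[1] = nullptr;
    return 1;
}
```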
This is required in some models where rules are based on statistical data from the population. For example, birth rate may be a function of the total population density, rather than the perceived local density. Such calculations could be performed by each agent, but it makes sense to have host functions which are able to perform things like reductions on agent variables.
Some of these have additional examples which should be merged. Branches which are not merged should have updated descriptions explaining what they are.
Alcione has discovered a bug which causes occasional launch failures for discrete message input (of discrete agents). Code sent to Paul for testing.
Hola! @mondus has created a ZenHub account for the FLAMEGPU organization. ZenHub is the only project management tool integrated natively in GitHub – created specifically for fast-moving, software-driven teams.
To get set up with ZenHub, all you have to do is download the browser extension and log in with your GitHub account. Once you do, you’ll get access to ZenHub’s complete feature-set immediately.
ZenHub adds a series of enhancements directly inside the GitHub UI:
Still curious? See more ZenHub features or read user reviews. This issue was written by your friendly ZenHub bot, posted by request from @mondus.
The Sugarscape visualisation performance is very poor, and does not render agents in the expected 2D square grid.
Sugarscape agents are discrete and have the agent variable location_id for visualisation purposes, rather than x/y.
x and y are calculated using location_id and population_width during output_agent_agent_to_VBO in visualisation.cu.
int population_width = (int)floor(sqrt((float)get_agent_agent_default_count()));
....
vbo[index].x = (agents->location_id[index] % population_width) - centralise.x;
vbo[index].y = floor((float)agents->location_id[index] / (float)population_width) - centralise.y;
get_agent_agent_default_count() returns xmachine_memory_agent_MAX, which is defined at compile time:
#define xmachine_memory_agent_MAX 1048576
There are 65536 agents in examples/Sugarscape/iterations/0.xml, not 1048576.
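To illustrate the mismatch, here is a small host-side sketch. The helper location_to_xy is hypothetical; it mirrors the arithmetic quoted above but takes the agent count as a parameter. With 65536 agents the grid width is 256, whereas deriving it from 1048576 gives 1024, so the same location_id lands on a different cell.

```cpp
#include <cassert>
#include <cmath>
#include <utility>

// Hypothetical helper: converts a discrete agent's location_id into (x, y)
// grid coordinates, deriving the grid width from the supplied agent count
// in the same way as the visualisation code quoted above.
std::pair<int, int> location_to_xy(int location_id, int agent_count) {
    assert(agent_count > 0);
    const int width = static_cast<int>(
        std::floor(std::sqrt(static_cast<float>(agent_count))));
    return {location_id % width, location_id / width};
}
```

With the actual population of 65536, location_id 256 is the first cell of the second row; with the compile-time maximum of 1048576 it is placed in the first row instead.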
2 solutions:
Hi,
Is it possible to run FLAMEGPU on a GPU cluster environment (nodes connected via InfiniBand)?
CUDA 8.0 was announced over a year ago, and provides significantly improved compilation time compared to 7.5.
Additionally, CUDA 9.0 has been announced (although with no public release date, likely Q3/Q4 2017 to support Volta GPUs in DGX-1v/DGX Station etc).
Visual Studio project files and Linux Makefiles should be modified to default to 8.0, or to the latest installed version if possible (Makefiles).
Initial change is at the glm-switch feature branch. I've fixed the bugs that appeared when compiling a few different projects (GLM required switching to CPP from C and a double support function which requires internal use of int2). Leaving the whole solution to batch build overnight.
Creating a new model generally involves taking an existing model and stripping out the details.
Providing an empty example would simplify this process.
Another option would be to add the relevant VS commands to create a new project of the sort.
Projects outside of the examples folder for independent git repos would be nice too.
Windows imposes limits on the size of statically allocated data. For both 32-bit and 64-bit Windows this is limited to 2GB (source).
The number of bins required for spatially partitioned messaging is dependent upon the bounds and radius. If the combination of environment bounds and radius results in a significant number of bins being required (approx 536,870,912 bins, or roughly an 812*812*812 cube) then the application will not compile.
This should not be a problem for FLAMEGPU2 as dynamically allocated memory can be up to 8TB on Windows x64.
Furthermore, the use of cudaBindTexture in spatially partitioned messaging imposes an additional (device dependent) limit at runtime.
The total number of elements in a linear address range cannot exceed cudaDeviceProp::maxTexture1DLinear[0] (source), which for a GeForce GTX Titan X is 134217728 texels (source).
In practice this limits the number of bins in spatially partitioned messaging to 512*512*512.
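The figures above can be reproduced with a short calculation. This sketch assumes each bin costs 4 bytes of static storage, which is consistent with the quoted bin count but is not stated in the issue; max_bins and max_cube_edge are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

// Number of bins that fit within a byte limit, given the bytes used per bin.
std::uint64_t max_bins(std::uint64_t limit_bytes, std::uint64_t bytes_per_bin) {
    assert(bytes_per_bin > 0);
    return limit_bytes / bytes_per_bin;
}

// Largest integer edge length such that edge * edge * edge <= bins.
std::uint64_t max_cube_edge(std::uint64_t bins) {
    std::uint64_t edge = 0;
    while ((edge + 1) * (edge + 1) * (edge + 1) <= bins) {
        ++edge;
    }
    return edge;
}
```

Under that assumption, 2GB of static data holds 536,870,912 bins, giving a cube edge of 812, while the 134217728 texel limit corresponds exactly to a 512 cube.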
To avoid users having to create init functions, which set the initial value of environment constants either to a fixed value or by loading from disk manually, it would be beneficial to support loading them from the initial states XML file (0.xml).
All required information to handle this is available at compile time from XMLModelFile.xml and can be built into io.cu.
This would be much simpler to support using @Robadob's updated XML handling branch mentioned in #16.
Add the source files for the FLAME GPU documentation to the repository to enable collaborative editing.
We will likely use pandoc to convert a markdown file to pdf, or possibly just latex. Build script(s) for Windows/Linux will also be required.
@mondus suggested that under certain conditions it should be possible to avoid using append_<agentname>_Agents kernels to move agents from one state to another during agent functions.
If currentState == nextState, it should be safe to do a simple pointer swap (reallocate == false).
If there is a state change (currentState != nextState), and the population of the nextState == 0, then a pointer swap should be possible (reallocate == false).
When reallocate == true, it may still be possible to do the pointer swap.
The case of functions which create agents needs to be considered.
The cost of the append function is relatively small compared to typical agent functions; however, for large agent populations with large numbers of agent parameters the cost will not be insignificant.
For the keratinocyte model with only 1024 agents, the append kernel only takes ~4μs per invocation plus any kernel sync overheads; however, this is a very small agent population.
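A host-side sketch of the idea, not the generated FLAME GPU code. AgentList and move_to_state are hypothetical, the list holds a single variable, and the reallocate flag and agent-creating functions are deliberately left out. When the destination state is empty the two list pointers are swapped; otherwise the agents are appended.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for one state's agent list (a single variable only).
struct AgentList {
    std::vector<float> x;
};

// Moves every agent in `working` into `next_state`.
// Returns true if a pointer swap was used, false if agents were appended.
bool move_to_state(AgentList*& working, AgentList*& next_state) {
    assert(working != nullptr && next_state != nullptr);
    assert(working != next_state);  // the sketch assumes two distinct lists
    if (next_state->x.empty()) {
        // Destination is empty: swapping pointers moves the agents for free.
        std::swap(working, next_state);
        return true;
    }
    // Destination already holds agents: fall back to appending.
    next_state->x.insert(next_state->x.end(), working->x.begin(), working->x.end());
    working->x.clear();
    return false;
}
```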
Following on from static graph support a more general case of dynamic graphs can be implemented.
This will allow agents to represent vertices and edges, with additional messaging techniques for edge/vertex based messaging etc.
@Robadob brought up that we should also consider directed/undirected graphs.
In the Analytics branch the step/exit function outlines should be generated, i.e.
__FLAME_GPU_STEP_FUNC__ void .... {
//...
}
Only the first agent type is rendered. Disabling the draw call for the first agent type enables the second agent type to be seen.
Required for calculating properties about agents for analysis. Required functions would be min, max, histogram etc.
If an agent function requires access to two spatially partitioned message lists, both parameters are named partition_matrix within header.h, meaning that compilation fails.
Noticed by Mozhgan.
Should be a quick fix:
a) Remove identifiers from the method prototypes
b) Append _<xsl-select....name()> to the identifiers
The rarely used template for functions.c will likely also require updating.
To support on-network communication a static network should be defined as a part of the environment and a new message partitioning technique.
Only IDs will be used in user facing locations, with indices used internally, as supporting work for dynamic networks. This will require lookups between IDs and indices, however if the ID == index for the entire network structure this lookup could be skipped.
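One way the ID-to-index lookup could be arranged is sketched below. VertexIndex is a hypothetical class, not part of FLAME GPU: it builds a map from user-facing IDs to internal indices and skips the map entirely when every ID already equals its index.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical lookup from user-facing vertex IDs to internal indices.
class VertexIndex {
public:
    explicit VertexIndex(const std::vector<unsigned int>& ids) : identity_(true) {
        for (std::size_t i = 0; i < ids.size(); ++i) {
            if (ids[i] != i) {
                identity_ = false;  // at least one ID differs from its index
            }
            lookup_[ids[i]] = i;
        }
    }

    // True when ID == index for the whole structure, so no lookup is needed.
    bool is_identity() const { return identity_; }

    // Returns the internal index for an ID; throws std::out_of_range if the
    // ID is unknown and the mapping is not the identity.
    std::size_t index_of(unsigned int id) const {
        return identity_ ? static_cast<std::size_t>(id) : lookup_.at(id);
    }

private:
    bool identity_;
    std::unordered_map<unsigned int, std::size_t> lookup_;
};
```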
The Stable Marriage init file incorrectly provides Woman agents with preferred_woman properties; assuming the data is as intended, this should be preferred_man as specified in the xmlmodelfile.
The functions.c template generates commented-out code for each function - i.e. agent output functions or message processing code.
This code uses docblock style comments rather than normal multi-line comments which is unnecessary and results in extra work for the end user.
i.e.
/* //Template for agent output functions
*
* int id = 0;
* int integer_value = 0;
* float float_value = 0;
* add_agent(agents, int id, int integer_value, float float_value);
*/
should be replaced with the following (or similar)
/*
// Template for agent output functions
int id = 0;
int integer_value = 0;
float float_value = 0;
add_agent(agents, id, integer_value, float_value);
*/
Several printf statements (and comments) contain errors / inconsistencies which can be corrected (i.e. memeory).
When specifying a device ID on launch of a simulation, if an invalid device is selected an appropriate error message should be shown.
The wrong error is generated when a user selects the first device which does not exist. I.e. on a system with 2 GPUs, setting the device id parameter to 2 generates the error message
Error setting CUDA device!
rather than
Error selecting CUDA device! Device id '2' is not found?
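The check being asked for amounts to a range test against the number of devices. In this sketch check_device_id is a hypothetical helper and device_count is passed in as a plain integer; in real code it would presumably come from cudaGetDeviceCount. Valid IDs run from 0 to device_count - 1, so on a 2-GPU system ID 2 is the first invalid one.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns an empty string when device_id is valid,
// otherwise the error message suggested in the issue.
std::string check_device_id(int device_id, int device_count) {
    assert(device_count >= 0);
    if (device_id < 0 || device_id >= device_count) {
        return "Error selecting CUDA device! Device id '" +
               std::to_string(device_id) + "' is not found?";
    }
    return "";
}
```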
Reductions across agent arrays seem incorrect, as the number of elements in the array is not considered.
i.e.
float reduce_Agent_default_example_array_variable(){
//reduce in default stream
return thrust::reduce(thrust::device_pointer_cast(d_Agents_default->example_array), thrust::device_pointer_cast(d_Agents_default->example_array) + h_xmachine_memory_Agent_default_count);
}
Where each example_array is 4 elements long per agent.
As the array is strided (h_Agents_default->example_array[(j*xmachine_memory_Agent_MAX)+i]), the reduction is only applied to the 0th element of each agent array.
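A host-side sketch of a reduction that respects the strided layout. It uses std::accumulate in place of thrust::reduce so that it runs without a GPU, and it assumes the intended result is the sum over every element of every agent's array, which the issue does not confirm. reduce_array_variable and its parameters are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Sums all elements of a per-agent array variable stored in strided layout,
// where element j of agent i lives at data[j * max_agents + i].
float reduce_array_variable(const std::vector<float>& data,
                            std::size_t max_agents,
                            std::size_t agent_count,
                            std::size_t array_length) {
    assert(agent_count <= max_agents);
    assert(data.size() >= array_length * max_agents);
    float total = 0.0f;
    for (std::size_t j = 0; j < array_length; ++j) {
        // Each array element occupies its own contiguous run of agent_count values.
        const auto first = data.begin() + static_cast<std::ptrdiff_t>(j * max_agents);
        total = std::accumulate(first, first + static_cast<std::ptrdiff_t>(agent_count), total);
    }
    return total;
}
```

Reducing only the first agent_count values, as in the generated function quoted above, corresponds to the j == 0 pass alone.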
When OUTPUT_POPULATION_PER_ITERATION is set, it would be good to output the population after any INIT functions and/or after the initial set of agents have been loaded from disk.
If an agent is located outside of the environmental bounds min <= x < max, their grid location is wrapped. Because the wrapping only replaces out-of-bounds locations with the opposite bound, an agent that is out of bounds can search cells that are even further out of bounds, causing both cells to be 'wrapped' to the same cell.
This causes the agent to double dip.
Potential fixes:
+radius
(and potentially decrease min by the same)
Allow users to create agents from the host, within INIT and STEP functions.
The repository includes files which are not well suited to being tracked by git, nor should be tracked by git.
These files should be removed, and purged from the history of the repository to greatly reduce file size, while all the required files and tools to generate them will still be available.
Executable files should be provided as an archive with each tagged release, removing the need for them to be updated for all commits.
Files to purge from history and add to .gitignore if not already present:
/bin/* other than *.bat files
/examples/*/src/dynamic/*
Optionally a script should be added to the /tools/ directory which compiles all examples and produces an archive of the binary folder to be distributed with a tagged release.
When binding spatially partitioned messages, the allocated texture memory is sized as count*sizeof(int); this should be count*sizeof(<xsl:value-of select="xmml:type"/>), otherwise non-4-byte values (e.g. doubles) are handled incorrectly, leading to bad message parsing (aka a bunch of 0'd messages if in the 2nd half).
Simulation.xslt::746
Is:
gpuErrchk( cudaBindTexture(&tex_xmachine_message_<xsl:value-of select="../../xmml:name"/>_<xsl:value-of select="xmml:name"/>_byte_offset, tex_xmachine_message_<xsl:value-of select="../../xmml:name"/>_<xsl:value-of select="xmml:name"/>, d_<xsl:value-of select="../../xmml:name"/>s-><xsl:value-of select="xmml:name"/>, sizeof(int)*xmachine_message_<xsl:value-of select="../../xmml:name"/>_MAX));
Should be:
gpuErrchk( cudaBindTexture(&tex_xmachine_message_<xsl:value-of select="../../xmml:name"/>_<xsl:value-of select="xmml:name"/>_byte_offset, tex_xmachine_message_<xsl:value-of select="../../xmml:name"/>_<xsl:value-of select="xmml:name"/>, d_<xsl:value-of select="../../xmml:name"/>s-><xsl:value-of select="xmml:name"/>, sizeof(<xsl:value-of select="xmml:type"/>)*xmachine_message_<xsl:value-of select="../../xmml:name"/>_MAX));
Notice the change of sizeof(int) at the start of the last wrapped line.
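The size mismatch can be seen with a small calculation. message_buffer_bytes is a hypothetical helper, not part of the template: it sizes the buffer from the variable's own type rather than assuming int.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: bytes needed to hold max_messages values of type T.
template <typename T>
constexpr std::size_t message_buffer_bytes(std::size_t max_messages) {
    return sizeof(T) * max_messages;
}
```

On platforms where int is 4 bytes and double is 8 bytes, a buffer sized with sizeof(int) covers only the first half of a double-typed message variable, which matches the zeroed second half described above.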
The average FPS reported using average/(millis/1000.0f) does not match the actual FPS seen. This can be fixed by adding a variable frame_time in global visualisation.cu memory, adding frame_time += millis; each frame, and resetting it to zero inside if(frame_count == average){ ... average/(frame_time/1000.0f); frame_time = 0.0f; }
The frame time could also be included, calculated every average frames using millis for that frame. I.e. the entire window title becomes
sprintf(title, "Execution & Rendering Total: %f (FPS), %f milliseconds per frame", average/(frame_time/1000.0f), millis);
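The accumulation described above can be sketched as a small self-contained counter. FpsCounter and its members are hypothetical names; only frame_time, frame_count, average and millis come from the issue.

```cpp
#include <cassert>

// Hypothetical counter that averages FPS over a window of `average` frames
// by accumulating the time of every frame, not just the last one.
struct FpsCounter {
    int average;              // number of frames per averaging window
    int frame_count = 0;      // frames seen in the current window
    float frame_time = 0.0f;  // accumulated milliseconds in the current window
    float last_fps = 0.0f;    // most recently computed average FPS

    explicit FpsCounter(int window) : average(window) { assert(window > 0); }

    // Records one frame; returns true when a new average has been computed.
    bool add_frame(float millis) {
        frame_time += millis;
        ++frame_count;
        if (frame_count == average) {
            last_fps = average / (frame_time / 1000.0f);
            frame_count = 0;
            frame_time = 0.0f;
            return true;
        }
        return false;
    }
};
```

For a window of two frames taking 10 ms and 30 ms, this reports roughly 50 FPS, whereas using only the last frame's millis would report about 67.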
No examples in the SDK use this functionality. Issue spotted by Alcione.
With the majority of national HPC facilities and most research Universities running Linux on their production servers (with GPUs), it would be great if FlameGPU was developed with a multi-platform focus such that compilation/execution was supported on both Windows and Linux (Ubuntu/CentOS). My understanding is that the new NVIDIA DGX system will run a custom version of Ubuntu Linux and the majority of GPU accelerated servers in US University compute centers (such as mine at ND) run CentOS/RedHat.
Just noticed that on Line 71 of circles/functions.c the value of xmemory->fx is being overwritten with the count of the number of iterations at the end of the kernel. This seems out of place, as xmemory->fx and xmemory->fy have been adjusted within the neighbourhood search and nothing done with the result (so why would they be treated differently?).
The constant getter functions added in 09c4c67, such as const int* get_CYCLE_LENGTH() from the keratinocyte example, do not compile.
As these are fixed size arrays (i.e. int h_env_CYCLE_LENGTH[5]) this is not simple, and merely changing the prototype to const int** get_CYCLE_LENGTH() is not sufficient.
Currently exit codes / the return of main() are generally all exit(0), whether or not the operation was successful. The exceptions are that when a CUDA error occurs the CUDA error code is used as the exit value, and that main returns 1 when the device could not be reset.
It would be better to make use of EXIT_FAILURE and EXIT_SUCCESS from <stdlib.h> where appropriate.
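A minimal sketch of the suggested convention. exit_code_for and its two flags are hypothetical; the only facts relied on are that EXIT_SUCCESS and EXIT_FAILURE are defined by <cstdlib> (<stdlib.h> in C).

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper: maps the outcome of a run onto the standard exit codes
// instead of returning 0 unconditionally.
int exit_code_for(bool simulation_ok, bool device_reset_ok) {
    return (simulation_ok && device_reset_ok) ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

A main() could then end with return exit_code_for(simulation_ok, device_reset_ok); so that scripts and batch systems can detect failed runs.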
Working in the rapidxml branch, I've updated the io.xslt template to use the rapidxml header lib for importing initialisation files. This makes the generated code far more approachable and maintainable (example). However, I decided to change io.cu into io.cpp, as the CUDA compiler produces a lot of garbage warnings when including rapidxml.hpp.
In doing so, I added support for a <nowarn/> flag in the root of an init file to suppress warnings about missing agent properties, and new support for setting environmental constants based on the init file, as visible in the Keratinocyte init file (this was something visible in a few places, but never implemented, with most examples using init functions or other custom code).
I've tested that it still loads things correctly with a few existing models (e.g. the pedestrian visualisation) and by breakpointing to check var values (e.g. Keratinocyte constants). Just running a full rebuild to ensure nothing else has been broken; will merge after @mondus has had a chance to approve next week.
When FLAMEGPU was first set up the 'relocatable device code' option wasn't available, so device functions had to be in the same object as their calling function. This meant that functions.c was made to be included from simulation.cu/fgpu_kernals.cu.
This is a bad idea because it means that any pre-processor macros defined within functions.c leak into the remainder of simulation.cu to be compiled. It also makes partial compilation slower, as the entirety of simulation/fgpu_kernals/functions.c must be recompiled if you change 1 line in functions.c. Also, being an include that is marked as do not compile can confuse IDEs, making them not detect when a recompilation is required.
@mozhgan-kch had this issue this afternoon, and short of noticing the highlighting is off (only going to work in an IDE), or working it out yourself, it's a pain to diagnose. In this case the CUDA constant symbol was being replaced with a literal at compile time, compiling fine and then throwing a runtime error with invalid device symbol.
It might be feasible to rename functions.c to functions.cu and enable the relocatable device code option; however, I expect there will be other minor changes.