
dphpc-project

The goal of this project was to implement various parallel 2D convex hull solver algorithms and benchmark them against each other. For further information, see the report PDF in the Report folder.

How-To

  1. cd <project_root_folder>
  2. Create a build folder (mkdir build), compile with CMake, and place the resulting executable, named dphpc_project, into the build folder.
  3. Make sure your desired input image (.png or .jpg) is placed in the Input folder (it is used for point generation). A default sample_image.png is provided with the repository.
  4. Set the desired parameters in run_config.py in the project root folder.
    Important: exe_dir_name has to match the name of the folder containing your dphpc_project executable!
  5. Start the runs by running dphpcpython/scripts/grid_run_exe.py from the project root folder; you may specify a different run config.
  6. Run benchmark_postprocessing.py in the dphpcpython/scripts/ folder to create runtime and speedup plots.

Parameters

  • run_name: the run will appear in the Output folder as <run_name>_mmhh_ssssss
  • exe_dir_name: the name of the folder containing the dphpc_project executable file
  • save_png_input: whether to save the generated input points as an image for inspection
  • n_cores: the core counts to run on, as a list, e.g. [1, 2, 4] runs every configuration on 1, 2 and 4 cores
  • n_points: the numbers of input points, as a list, e.g. [5000, 10000, 100000]
  • img_files: the names of the images to run on, as a list; the files have to be in the Input folder
  • algorithms: the names of the algorithms to run; all available algorithms are listed at the bottom of this page
  • sub_size: the sub sizes on which to run the algorithms
  • use_sub_sizes: whether to use the sub_size feature
  • n_iterations: number of times to repeat each configuration to gather statistics
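For orientation, a run_config.py could look like the sketch below. The parameter names are the ones documented above; every value (and the exact file layout) is an illustrative assumption, not the repository's actual config.

```python
# Hypothetical run_config.py -- parameter names from the README, values are examples only.

run_name = "benchmark_run"            # output folder becomes <run_name>_mmhh_ssssss
exe_dir_name = "build"                # must match the folder holding dphpc_project
save_png_input = True                 # save generated input points as an image

n_cores = [1, 2, 4, 8]                # core counts to benchmark
n_points = [5000, 10000, 100000]      # input sizes
img_files = ["sample_image.png"]      # images located in the Input folder
algorithms = ["graham", "quickhull"]  # names must match the algorithm list below
sub_size = [500, 1000]                # sub sizes for the algorithms
use_sub_sizes = False                 # toggle the sub_size feature
n_iterations = 10                     # repetitions per configuration for statistics
```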

Output

The output folder in Output/<run_name>_mmhh_ssssss will contain

  • input_data folder: holds the input images and data files for created input points
  • sub_<run_id> folders: for every run configuration one sub folder holding hull points data, json param file and timing file
  • an all_params.json file holding all configurations
  • a config.json file holding the run config in json format

Input Creation

In case you just want to create input data files from images without doing a full parameter grid run: from the dphpcpython/scripts/ directory, run create_points_from_img.py (parameters are specified at the top of the file) to create point distributions from images. The specified input image has to be located in the Input folder (.png or .jpg are both fine).
The input data will be stored in a .dat file in the Input folder.
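For illustration, intensity-weighted point generation from an image could look roughly like the sketch below. The sampling logic, the function names, and the plain "x y per line" .dat format are all assumptions; consult create_points_from_img.py for the real behaviour.

```python
import random

def points_from_intensity(img, n_points, seed=0):
    """Sample 2D points with probability proportional to pixel darkness.

    `img` is a row-major grid of grayscale values in [0, 255]; darker
    pixels attract more points. This is only a guess at what
    create_points_from_img.py does, not the repository's code.
    """
    rng = random.Random(seed)
    h, w = len(img), len(img[0])
    # one sampling weight per pixel: black (0) -> 255, white (255) -> 0
    weights = [255 - img[r][c] for r in range(h) for c in range(w)]
    cells = rng.choices(range(h * w), weights=weights, k=n_points)
    # jitter each point uniformly inside its pixel, normalised to [0, 1]^2
    return [((c % w + rng.random()) / w, (c // w + rng.random()) / h)
            for c in cells]

def write_dat(points, path):
    """Write points in an assumed plain 'x y' per-line .dat format."""
    with open(path, "w") as f:
        for x, y in points:
            f.write(f"{x} {y}\n")
```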

Visualization

Run the animation_wrapper.py script in the dphpcpython/scripts/ folder, after specifying the desired parameters at the top of the script. Note that some tweaking may be needed: the script can fail if some expected output files were not written.

dphpc-project's People

Contributors

pollymath, polly95

Watchers

James Cloos, Matthäus Heer

Forkers

f4z3r

dphpc-project's Issues

Parallelise the sub graham scans

  • Parallelise the sub Graham scans using either MPI or OpenMP.
  • Build a simple environment where we can run scaling against number of cores.
  • Create python script which reads in the produced data and creates strong and weak scaling plots.
  • Think about how we could further parallelise.
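The scaling-plot bullet above boils down to computing speedups and efficiencies from measured runtimes. A minimal sketch of that step, with the input data layout (a mapping from core count to runtime) assumed for illustration:

```python
def strong_scaling(runtimes):
    """Compute speedup and parallel efficiency from {n_cores: runtime}.

    Strong scaling fixes the problem size and varies the core count:
    speedup(p) = T(1) / T(p), efficiency(p) = speedup(p) / p.
    The {n_cores: runtime} layout is an assumption for this sketch.
    """
    t1 = runtimes[1]                                    # single-core baseline
    speedup = {p: t1 / t for p, t in runtimes.items()}
    efficiency = {p: s / p for p, s in speedup.items()}
    return speedup, efficiency
```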

Fix the FileWriter

  • have an output folder to store all output files; create it before every run if it does not exist
  • delete all output files inside it before every run

Improve SplitVector

@leper
The input data should be split such that the subsets are not just of near-equal size but cover different regions of the 2D points, so that the various subhulls can be distinguished. For n parts, split the data into n regions, with some overlap of points at the region boundaries.
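One way to realise this is to split by x-coordinate into vertical strips whose boundaries are padded so neighbouring parts share border points. A hypothetical illustration of the idea, not the repository's SplitVector:

```python
def split_with_overlap(points, n_parts, overlap=0.05):
    """Split 2D points into n_parts vertical strips with overlapping borders.

    Points are assigned by x-coordinate so each subset covers a distinct
    region; each strip is widened by `overlap` (a fraction of the x-range)
    so neighbouring parts share points near the boundary. A sketch only.
    """
    xs = [p[0] for p in points]
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / n_parts
    pad = overlap * (hi - lo)
    parts = []
    for i in range(n_parts):
        # pad only interior boundaries, not the outer edges
        left = lo + i * width - (pad if i > 0 else 0)
        right = lo + (i + 1) * width + (pad if i < n_parts - 1 else 0)
        parts.append([p for p in points if left <= p[0] <= right])
    return parts
```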

Implement data loader for grid runs output

Based on run_config.py, the JSON param files of the single runs, and the timing files of the subruns, write a generic data loader class that collects all this information into one dictionary, which can then be passed to the post-processing module to produce plots.
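A minimal sketch of such a loader, assuming the folder layout from the Output section above and hypothetical per-run file names (params.json, timings.txt):

```python
import json
from pathlib import Path

def load_grid_run(run_dir):
    """Collect a grid run's output into one dictionary.

    Assumes the layout from the Output section: a top-level config.json
    plus one sub_<run_id> folder per configuration holding a params.json
    and a timings file with one runtime per line. File names are assumed.
    """
    run_dir = Path(run_dir)
    data = {"config": json.loads((run_dir / "config.json").read_text()),
            "runs": {}}
    for sub in sorted(run_dir.glob("sub_*")):
        params = json.loads((sub / "params.json").read_text())
        timings = [float(t) for t in (sub / "timings.txt").read_text().split()]
        data["runs"][sub.name] = {"params": params, "timings": timings}
    return data
```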

Research convex hull performances

As stated in the email, we should compare against existing work. Could someone do research on the topic and collect runtime and speedup comparisons, noting what the authors did so that we can set up comparable runs?

Fix hull merge algorithm

Graham scan works just fine. The total hull around all Graham subhulls, however, leaves out some of the points: we need to check the tangent function. @jakobbeckmann
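A common source of this symptom is the orientation test inside the tangent search mishandling collinear points. A minimal cross-product orientation helper with the collinear case made explicit (a generic sketch, not this repository's tangent function):

```python
def orientation(o, a, b):
    """Cross product of vectors OA and OB.

    > 0: counter-clockwise turn, < 0: clockwise, == 0: collinear.
    Tangent searches that treat the collinear case (== 0) as a strict
    turn can silently drop hull points -- a typical cause of a merged
    hull missing vertices of its subhulls.
    """
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
```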

Algo variations

Note that the parallel variations do not require much new code: they consist of a few function calls to the appropriate algorithms plus some data passing, which is the same for all variations.

Serial

type | status
Jarvis | OK
Graham | OK
Quickhull | OK
All parallel algos as serial | OK

Parallel (all subversions of Chan)

These follow the format algo1-algo2, where algo1 is the algorithm run in parallel on different nodes/processors and algo2 is the algorithm used on the hulls obtained from algo1.

type | status
Jarvis - binary Jarvis | OK
Jarvis - Jarvis | OK
Jarvis - Graham | OK
Jarvis - Quickhull | OK
Graham - binary Jarvis (traditional Chan) | OK (both full merge and binary-tree merge)
Graham - Jarvis | OK
Graham - Graham | OK
Graham - Quickhull | OK
Quickhull - binary Jarvis | OK
Quickhull - Jarvis | OK
Quickhull - Graham | OK
Quickhull - Quickhull | OK

Notice all of the above come in two versions:

  1. Full merge: the merge combines all hulls from algo1 in one go.
  2. Binary-tree merge: merges are performed 2-by-2. If the number of subhulls is odd, the last hull is merged last with the combination of all others.
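The binary-tree merge strategy can be sketched generically, with merge_two standing in for whichever pairwise merge algorithm (e.g. binary Jarvis) is used; this is an illustration of the scheme described above, not the repository's implementation:

```python
def binary_tree_merge(subhulls, merge_two):
    """Merge a list of subhulls pairwise, binary-tree style.

    merge_two(h1, h2) merges two hulls. With an odd number of hulls the
    last one is carried over to the next round, so it ends up merged last
    with the combination of all others, as described above.
    """
    hulls = list(subhulls)
    while len(hulls) > 1:
        merged = [merge_two(hulls[i], hulls[i + 1])
                  for i in range(0, len(hulls) - 1, 2)]
        if len(hulls) % 2 == 1:        # odd count: carry the last hull over
            merged.append(hulls[-1])
        hulls = merged
    return hulls[0]
```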

Reflections

  1. Technically, unless used for 2-by-2 merging, binary Jarvis will always be faster than regular Jarvis for merging. Is it therefore useful to build merges with regular Jarvis at all?
  2. Binary-tree merging, if not scheduled based on finish times (not implemented right now), will nearly always be slower than a regular merge. This is definitely the case for variations with Graham as algo1, since all Graham scans performed in parallel take about the same time (whereas this can differ significantly for Quickhull). Is binary-tree merging useful, or should it be kept as a proof of concept?

Implement and test MPI

@polly95

For divide-and-conquer-based algorithms:

  • Divide the input data into chunks and send those to different nodes
  • On each node, run the algorithm using OpenMP and merge
  • Send the merged subhull back to the main node and merge all subhulls
  • Test on Euler
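The scatter/compute/gather flow described above can be illustrated with a thread pool standing in for MPI ranks. This is a sketch only; the real implementation would use MPI collectives for distribution and OpenMP on each node, and all names here are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor

def scatter_compute_gather(points, n_ranks, local_hull, merge):
    """Sketch of the divide-and-conquer flow, threads standing in for ranks.

    Rough MPI mapping (assumed, not this repository's code):
      - chunking + map      ~ scattering the input points to the nodes
      - local_hull per task ~ the per-node (OpenMP) hull algorithm
      - collecting results  ~ gathering the subhulls on the main node
      - merge on the root   ~ the final merge of all subhulls
    """
    step = (len(points) + n_ranks - 1) // n_ranks
    chunks = [points[i:i + step] for i in range(0, len(points), step)]
    with ThreadPoolExecutor(max_workers=n_ranks) as pool:
        subhulls = list(pool.map(local_hull, chunks))   # "gather"
    return merge(subhulls)                              # main-node merge
```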

Post processing module

Think about what we want to show and implement. E.g.

  • runtime plots
  • strong scaling
  • runtime bar plot comparison for different input images
  • weak scaling
  • error bars for timing
  • check data normality
  • put all speedup plots into one figure with appropriate legends
  • runtime plots for different n cores depending on the part size

randomPointsGenerator class

Write a 2D random point distribution class. Keep all points in the range [0, 1] and save them in a vector.
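A Python sketch of the requested class (the actual implementation would be a C++ class; the names here are illustrative):

```python
import random

class RandomPointsGenerator:
    """Generate n uniform 2D points in [0, 1]^2.

    Python sketch of the C++ class requested above; a fixed seed makes
    runs reproducible for benchmarking.
    """

    def __init__(self, seed=None):
        self._rng = random.Random(seed)

    def generate(self, n):
        return [(self._rng.random(), self._rng.random()) for _ in range(n)]
```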

Chan2DVisualizer

  1. step
  • Write all initial points AND the points which end up in the hull to a file (in chan2d.cpp).
  • Read into python module and do final static visualization.
  1. step
  • Write all interim steps of the algorithm in the file and label them
  • Do animated python visualization

Check output correctness

Write hull output files for run iterations into indexed files and cross check for correctness.
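A sketch of the cross-check, assuming one hull file per iteration and a caller-supplied reader; hulls are compared as point sets, since different iterations may list the vertices starting from different points:

```python
def cross_check(hull_files, read_points):
    """Check that all iteration outputs describe the same hull.

    read_points(path) -> list of (x, y) tuples; the file layout is an
    assumption. Returns the files whose point set differs from the first
    file's -- an empty list means all iterations agree.
    """
    hulls = [frozenset(read_points(f)) for f in hull_files]
    reference = hulls[0]
    return [f for f, h in zip(hull_files, hulls) if h != reference]
```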

Grid run controller in c++

Since reading the input file takes the longest time (1M points: 13 s, 500K points: 6 s, ...), all runs which use a common input file should be done by a grid run controller in C++. All the outer controls, like param files, folders, etc., would still be handled by GridRunHandler.py, but the C++ loop controller fills the folders appropriately. This should save a lot of time, especially for debugging.
