ogl's People

Contributors

chihta-wang, greole, hendriceh, zbinkz

ogl's Issues

Improve exporting of the underlying system

Currently, only A, x, and b are exported, but for debugging purposes other information might be valuable, e.g. residual norms over the iterations.

Thus, when the export flag is set, a folder named export should be created with a subfolder for each timestep and a JSON file for the residuals.
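
A minimal sketch of the proposed layout, under assumptions: the helper names `exportDir` and `residualsToJson` are hypothetical, and the JSON schema (a single `residuals` array) is only one possible choice, since the issue does not specify file names or a schema.

```cpp
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: creates (if needed) and returns <root>/export/<timeName>.
std::filesystem::path exportDir(const std::filesystem::path& root,
                                const std::string& timeName)
{
    assert(!timeName.empty());
    std::filesystem::path dir = root / "export" / timeName;
    std::filesystem::create_directories(dir);  // no-op if it already exists
    return dir;
}

// Hypothetical helper: serialises the residual norm of each iteration as
// {"residuals": [r0, r1, ...]}. The schema is an assumption of this sketch.
std::string residualsToJson(const std::vector<double>& residuals)
{
    std::ostringstream os;
    os.precision(17);  // enough digits to round-trip a double
    os << "{\"residuals\": [";
    for (std::size_t i = 0; i < residuals.size(); ++i) {
        if (i > 0) {
            os << ", ";
        }
        os << residuals[i];
    }
    os << "]}";
    return os.str();
}
```

With this layout, the exported A, x, and b of a timestep would sit next to each other in `export/<timeName>`, and the residual file could be written once per solve.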

Implement a device version of the normfactor calculation.

Currently, OGL relies on the OpenFOAM calculation of the normfactor. This should be implemented as a separate device function. The implementation has two parts:

  • an l1 norm in Ginkgo
  • computing the sum of the system matrix column vectors
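
For orientation, the two parts can be sketched on the host in plain C++. This is a reference sketch, not device code and not the Ginkgo API: the `Csr` struct and all function names are hypothetical, and the formula (l1 norms of `Ax - sumA * avg(x)` and `b - sumA * avg(x)` plus a small constant) is an assumption about the form OpenFOAM's lduMatrix solvers use.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Minimal CSR matrix, used only for this illustration.
struct Csr {
    std::size_t rows;
    std::vector<std::size_t> rowPtr;  // size rows + 1
    std::vector<std::size_t> col;     // size nnz
    std::vector<double> val;          // size nnz
};

// Sum of all column vectors of A; entry i equals the sum over row i.
std::vector<double> sumOfColumns(const Csr& A)
{
    assert(A.rowPtr.size() == A.rows + 1);
    std::vector<double> s(A.rows, 0.0);
    for (std::size_t i = 0; i < A.rows; ++i) {
        for (std::size_t k = A.rowPtr[i]; k < A.rowPtr[i + 1]; ++k) {
            s[i] += A.val[k];
        }
    }
    return s;
}

// y = A x
std::vector<double> multiply(const Csr& A, const std::vector<double>& x)
{
    std::vector<double> y(A.rows, 0.0);
    for (std::size_t i = 0; i < A.rows; ++i) {
        for (std::size_t k = A.rowPtr[i]; k < A.rowPtr[i + 1]; ++k) {
            y[i] += A.val[k] * x[A.col[k]];
        }
    }
    return y;
}

// l1 norm of a vector.
double norm1(const std::vector<double>& v)
{
    double n = 0.0;
    for (double e : v) {
        n += std::fabs(e);
    }
    return n;
}

// Assumed form: ||Ax - sumA*avg(x)||_1 + ||b - sumA*avg(x)||_1 + small.
double normFactor(const Csr& A, const std::vector<double>& x,
                  const std::vector<double>& b)
{
    assert(!x.empty() && x.size() == A.rows && b.size() == A.rows);
    const double xAvg = std::accumulate(x.begin(), x.end(), 0.0) /
                        static_cast<double>(x.size());
    const std::vector<double> sumA = sumOfColumns(A);
    const std::vector<double> Ax = multiply(A, x);
    std::vector<double> r1(A.rows), r2(A.rows);
    for (std::size_t i = 0; i < A.rows; ++i) {
        const double ref = sumA[i] * xAvg;
        r1[i] = Ax[i] - ref;
        r2[i] = b[i] - ref;
    }
    return norm1(r1) + norm1(r2) + 1e-20;  // small constant is an assumption
}
```

A device version would replace the loops in `sumOfColumns` and `norm1` with the corresponding device kernels or reductions.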

Solver crashes with symbol lookup error -> undefined symbol: _ZNK4Foam5UListINS_8UPstream11commsStructEEixEi

This might be connected to the 64-bit label size build of OpenFOAM used in this case.

Courant Number mean: 0.112964631622221 max: 0.131831232657451
Time = 0.2

PIMPLE: iteration 1
Turbulent DFSEM patch: ii_southwest seeded 1047 eddies with total volume 1923959974.27959
DILUPBiCGStab:  Solving for Ux, Initial residual = 0.000597809484897753, Final residual = 1.82425475850801e-10, No Iterations 1
DILUPBiCGStab:  Solving for Uy, Initial residual = 0.000493708964616313, Final residual = 1.67047487048995e-10, No Iterations 1
DILUPBiCGStab:  Solving for Uz, Initial residual = 0.386410738136328, Final residual = 1.08809247844095e-07, No Iterations 1
DILUPBiCGStab:  Solving for T, Initial residual = 9.23515453836181e-08, Final residual = 5.88366283166232e-14, No Iterations 1
[OGL LOG][lduLduBase.H:119] Initialising OGL
        OGL commit: v0.5.2 6c561ab+
        Branch: dev
        Build type: Release
        Ginkgo version: 1.8.0 ( develop)
        Ginkgo commit: fc86d48b78cebd2b2c5833a2dcf0fe40f615cf19
        MPI is GPU aware: 1
        Forces host buffer based communication: 0
        CPU ranks per GPU: 1
        Matrix format: Coo
        End OGL_INFO
buoyantBoussinesqPimpleFoam: symbol lookup error: /home/hk-project-exasim/hgf_tnn2411/OpenFOAM/hgf_tnn2411-v2212/platforms/linux64GccDPInt64Opt/lib/libOGL.so: undefined symbol: _ZNK4Foam5UListINS_8UPstream11commsStructEEixEi
buoyantBoussinesqPimpleFoam: symbol lookup error: /home/hk-project-exasim/hgf_tnn2411/OpenFOAM/hgf_tnn2411-v2212/platforms/linux64GccDPInt64Opt/lib/libOGL.so: undefined symbol: _ZNK4Foam5UListINS_8UPstream11commsStructEEixEi

Improve backend selection stability

Currently, it is very easy to select a GPU backend at runtime that was never compiled. Probably the best solution is to throw an error if the ExecutorHandler cannot instantiate the corresponding executor.
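
A sketch of the fail-fast behaviour, assuming hypothetical compile-time switches (`SKETCH_HAS_CUDA`, `SKETCH_HAS_HIP`) and a hypothetical `selectBackend` function. It does not use the Ginkgo or ExecutorHandler API; it only illustrates rejecting a backend that was not built.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical compile-time switches; a real build would set them from CMake.
#ifndef SKETCH_HAS_CUDA
#define SKETCH_HAS_CUDA 0
#endif
#ifndef SKETCH_HAS_HIP
#define SKETCH_HAS_HIP 0
#endif

enum class Backend { reference, omp, cuda, hip };

// True if the backend was compiled into this (hypothetical) build.
bool isCompiled(Backend b)
{
    switch (b) {
    case Backend::reference:
    case Backend::omp:
        return true;
    case Backend::cuda:
        return SKETCH_HAS_CUDA != 0;
    case Backend::hip:
        return SKETCH_HAS_HIP != 0;
    }
    return false;
}

// Maps the user-supplied executor name to a backend and throws instead of
// silently falling back when the backend is unknown or was not compiled.
Backend selectBackend(const std::string& name)
{
    Backend b;
    if (name == "reference") {
        b = Backend::reference;
    } else if (name == "omp") {
        b = Backend::omp;
    } else if (name == "cuda") {
        b = Backend::cuda;
    } else if (name == "hip") {
        b = Backend::hip;
    } else {
        throw std::invalid_argument("unknown executor: " + name);
    }
    if (!isCompiled(b)) {
        throw std::runtime_error("executor '" + name +
                                 "' was requested but not compiled");
    }
    return b;
}
```

Throwing at selection time surfaces the misconfiguration before any matrix data is copied, rather than during the first solve.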

Make devicePersistent MPI aware

Once #49 (which fixes #45) is implemented, devicePersistentFields, i.e. devicePersistentData<gko::Array<T>>, need some way of handling parallel execution. This includes:

  1. For local data, storing the MPI rank to which the data corresponds.
  2. Additionally, some way to handle gathered/scattered global fields. If the sparsity pattern and matrix values are not stored directly as devicePersistentField and instead the CSR matrix is kept, only the initial values and rhs are devicePersistentFields.

Data            Local       Global      Location  Updated
lduCsrMapping   persistent  not needed  host      constant
initial guess   overwrite   persistent  device    constant
rhs             overwrite   overwrite   device    updated
matrix values   overwrite   overwrite   both      updated
globalIndex*    persistent  -           host      constant

Here, overwrite indicates that fields don't need to be stored, but storing them would avoid reallocation. Global data is (for now) obtained only by gathering the local data. Thus, if global data is marked as persistent (update false), a lookup in the objectRegistry is performed.

Cmake fails when -DOGL_USE_EXTERNAL_GINKGO=On is set

See title: building with cmake fails when -DOGL_USE_EXTERNAL_GINKGO=On is set. The relevant error message is as follows:

CMake Error at /home/go/data/code/ginkgo/cmake/Modules/FindHWLOC.cmake:124 (add_library):
  add_library cannot create imported target "hwloc" because another target
  with the same name already exists.
Call Stack (most recent call first):
  /usr/local/lib/cmake/Ginkgo/GinkgoConfig.cmake:185 (find_package)
  CMakeLists.txt:27 (find_package)

Add stopping criterion for distributed solver

The current implementation of the stopping criterion relies on vectors of type gko::matrix::Dense<> and matrices of type gko::matrix::Csr<> to compute the normfactor. For the distributed gko version, a normfactor computation for distributed vectors/matrices has to be implemented.

Avoid raw pointers

Currently, a lot of raw pointers are used to hold IO classes returned from the OF objectRegistry. This causes several indirect memory leaks.

Clean up cmake files

The cmake files contain unneeded and unmaintained code and should be cleaned up.

Installation errors related to "does not support platform specification, but platform"

I got the following error related to the AMDDeviceLibs_DIR when using:

cmake -DGINKGO_BUILD_HIP=ON -DGINKGO_BUILD_OMP=ON

CMake Warning:
  No source or binary directory provided.  Both will be assumed to be the
  same as the current working directory, but note that this warning will
  become a fatal error in future CMake releases.


-- The C compiler identification is GNU 11.4.1
-- The CXX compiler identification is GNU 11.4.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test OGL_C_COVERAGE_SUPPORTED
-- Performing Test OGL_C_COVERAGE_SUPPORTED - Success
-- Performing Test OGL_C_TSAN_SUPPORTED
-- Performing Test OGL_C_TSAN_SUPPORTED - Failed
-- Performing Test OGL_C_ASAN_SUPPORTED
-- Performing Test OGL_C_ASAN_SUPPORTED - Failed
-- Performing Test OGL_C_LSAN_SUPPORTED
-- Performing Test OGL_C_LSAN_SUPPORTED - Failed
-- Performing Test OGL_C_UBSAN_SUPPORTED
-- Performing Test OGL_C_UBSAN_SUPPORTED - Failed
-- Performing Test OGL_CXX_COVERAGE_SUPPORTED
-- Performing Test OGL_CXX_COVERAGE_SUPPORTED - Success
-- Performing Test OGL_CXX_TSAN_SUPPORTED
-- Performing Test OGL_CXX_TSAN_SUPPORTED - Failed
-- Performing Test OGL_CXX_ASAN_SUPPORTED
-- Performing Test OGL_CXX_ASAN_SUPPORTED - Failed
-- Performing Test OGL_CXX_LSAN_SUPPORTED
-- Performing Test OGL_CXX_LSAN_SUPPORTED - Failed
-- Performing Test OGL_CXX_UBSAN_SUPPORTED
-- Performing Test OGL_CXX_UBSAN_SUPPORTED - Failed
-- Performing Test OGL_HIP_COVERAGE_SUPPORTED
-- Performing Test OGL_HIP_COVERAGE_SUPPORTED - Success
-- Performing Test OGL_HIP_TSAN_SUPPORTED
-- Performing Test OGL_HIP_TSAN_SUPPORTED - Failed
-- Performing Test OGL_HIP_ASAN_SUPPORTED
-- Performing Test OGL_HIP_ASAN_SUPPORTED - Failed
-- Performing Test OGL_HIP_LSAN_SUPPORTED
-- Performing Test OGL_HIP_LSAN_SUPPORTED - Failed
-- Performing Test OGL_HIP_UBSAN_SUPPORTED
-- Performing Test OGL_HIP_UBSAN_SUPPORTED - Failed
-- Looking for C++ include cxxabi.h
-- Looking for C++ include cxxabi.h - found
-- The C compiler identification is GNU 11.4.1
-- The CXX compiler identification is GNU 11.4.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Configuring done
-- Generating done
-- Build files have been written to: /home/klaus/OpenFOAM/ThirdParty-v2206/OGL-dev/third_party/ginkgo/download
[ 11%] Creating directories for 'ginkgo_external'
[ 22%] Performing download step (git clone) for 'ginkgo_external'
Cloning into 'src'...
Branch 'sparse-communicator' set up to track 'origin/sparse-communicator'.
Switched to a new branch 'sparse-communicator'
[ 33%] Performing update step for 'ginkgo_external'
HEAD is now at 672caab3d remove overloaded to avoid c++17 dependency
[ 44%] No patch step for 'ginkgo_external'
[ 55%] Performing configure step for 'ginkgo_external'
-- The C compiler identification is GNU 11.4.1
-- The CXX compiler identification is GNU 11.4.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found hipconfig: /opt/rocm/hip/bin/hipconfig
-- Could NOT find PAPI (missing: PAPI_LIBRARY PAPI_INCLUDE_DIR sde) (Required is at least version "7.0.1.0")
-- HIP platform set to amd
-- Found HIP: /opt/rocm/hip (found version "5.7.0-0") 
CMake Error at /usr/share/cmake/Modules/CMakeFindDependencyMacro.cmake:47 (find_package):
  By not providing "FindAMDDeviceLibs.cmake" in CMAKE_MODULE_PATH this
  project has asked CMake to find a package configuration file provided by
  "AMDDeviceLibs", but CMake did not find one.

  Could not find a package configuration file provided by "AMDDeviceLibs"
  with any of the following names:

    AMDDeviceLibsConfig.cmake
    amddevicelibs-config.cmake

  Add the installation prefix of "AMDDeviceLibs" to CMAKE_PREFIX_PATH or set
  "AMDDeviceLibs_DIR" to a directory containing one of the above files.  If
  "AMDDeviceLibs" provides a separate development package or SDK, be sure it
  has been installed.
Call Stack (most recent call first):
  /opt/rocm-5.7.0/lib/cmake/hip/hip-config-amd.cmake:67 (find_dependency)
  /opt/rocm/hip/lib/cmake/hip/hip-config.cmake:150 (include)
  /usr/share/cmake/Modules/CMakeFindDependencyMacro.cmake:47 (find_package)
  /opt/rocm/hipblas/lib/cmake/hipblas-config.cmake:90 (find_dependency)
  cmake/hip.cmake:167 (find_package)
  CMakeLists.txt:104 (include)


-- Configuring incomplete, errors occurred!
See also "/home/klaus/OpenFOAM/ThirdParty-v2206/OGL-dev/third_party/ginkgo/build/CMakeFiles/CMakeOutput.log".
gmake[2]: *** [CMakeFiles/ginkgo_external.dir/build.make:93: ginkgo_external-prefix/src/ginkgo_external-stamp/ginkgo_external-configure] Error 1
gmake[1]: *** [CMakeFiles/Makefile2:83: CMakeFiles/ginkgo_external.dir/all] Error 2
gmake: *** [Makefile:91: all] Error 2
CMake Error at cmake/package_helpers.cmake:46 (message):
  Build step for ginkgo_external/download failed: 2
Call Stack (most recent call first):
  third_party/ginkgo/CMakeLists.txt:2 (ginkgo_load_git_package)


-- Configuring incomplete, errors occurred!
See also "/home/klaus/OpenFOAM/ThirdParty-v2206/OGL-dev/CMakeFiles/CMakeOutput.log".
See also "/home/klaus/OpenFOAM/ThirdParty-v2206/OGL-dev/CMakeFiles/CMakeError.log".

I tried to set the AMDDeviceLibs_DIR to /opt/rocm/lib/cmake/AMDDeviceLibs using:

cmake -DGINKGO_BUILD_HIP=ON -DGINKGO_BUILD_OMP=ON -AMDDeviceLibs_DIR=/opt/rocm/lib/cmake/AMDDeviceLibs which seemed to work but results in another error:

CMake Warning:
  No source or binary directory provided.  Both will be assumed to be the
  same as the current working directory, but note that this warning will
  become a fatal error in future CMake releases.


CMake Error: Error: generator platform: MDDeviceLibs_DIR=/opt/rocm/lib/cmake/AMDDeviceLibs
Does not match the platform used previously: 
Either remove the CMakeCache.txt file and CMakeFiles directory or choose a different binary directory.
[klaus@localhost OGL-dev]$ cmake -DGINKGO_BUILD_HIP=ON -DGINKGO_BUILD_OMP=ON -AMDDeviceLibs_DIR=/opt/rocm/lib/cmake/AMDDeviceLibs
CMake Warning:
  No source or binary directory provided.  Both will be assumed to be the
  same as the current working directory, but note that this warning will
  become a fatal error in future CMake releases.


CMake Error at CMakeLists.txt:3 (project):
  Generator

    Unix Makefiles

  does not support platform specification, but platform

    MDDeviceLibs_DIR=/opt/rocm/lib/cmake/AMDDeviceLibs

  was specified.


CMake Error: CMAKE_C_COMPILER not set, after EnableLanguage
CMake Error: CMAKE_CXX_COMPILER not set, after EnableLanguage
-- Configuring incomplete, errors occurred!
See also "/home/klaus/OpenFOAM/ThirdParty-v2206/OGL-dev/CMakeFiles/CMakeOutput.log".
See also "/home/klaus/OpenFOAM/ThirdParty-v2206/OGL-dev/CMakeFiles/CMakeError.log".

Can this be fixed? I am using ROCm 5.7.1 for RHEL and a gfx1031.

Unify symmetric and asymmetric solvers

Instead of having, for example, several BiCGStab implementations for the ldu and Ldu base classes, the base class should be specified as a template parameter.
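
A minimal sketch of the idea, with hypothetical stand-ins (`SymmetricBase`, `AsymmetricBase`, `BiCGStabSketch`) for the actual ldu and Ldu base classes, whose interfaces are not shown in this issue:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the symmetric (ldu) and asymmetric (Ldu) bases.
struct SymmetricBase {
    static constexpr bool symmetric = true;
};
struct AsymmetricBase {
    static constexpr bool symmetric = false;
};

// One solver implementation; the base class is a template parameter, so the
// solver logic is written once for both matrix kinds.
template <class Base>
class BiCGStabSketch : public Base {
public:
    std::string describe() const
    {
        return std::string("BiCGStab on ") +
               (Base::symmetric ? "symmetric" : "asymmetric") + " matrix";
    }
};

using SymmetricBiCGStab = BiCGStabSketch<SymmetricBase>;
using AsymmetricBiCGStab = BiCGStabSketch<AsymmetricBase>;
```

The two aliases would then be the only places where the matrix kind is named.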

bad_typeid issues

When running with forceHostBuffer true; Ginkgo terminates with the following error:

terminate called after throwing an instance of 'std::bad_typeid'

Basic scattering

After solving on the device, the solution vector has to be communicated back to all ranks.

  1. The simplest solution for now would be to have a global array on all ranks to which the solution vector is scattered.
  2. In a second step, only the rank-specific data should be copied to the respective ranks.
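
The second step can be illustrated without MPI by slicing the global solution by per-rank sizes, which is the partitioning an `MPI_Scatterv` call would deliver. The function name `sliceForRanks` and the `counts` argument are hypothetical; this is a host-only sketch of the data layout, not the communication itself.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Splits the global solution vector into contiguous per-rank parts.
// counts[r] is the number of local cells owned by rank r (an assumption of
// this sketch: ranks own contiguous, non-overlapping index ranges).
std::vector<std::vector<double>> sliceForRanks(
    const std::vector<double>& global, const std::vector<std::size_t>& counts)
{
    const std::size_t total =
        std::accumulate(counts.begin(), counts.end(), std::size_t{0});
    assert(total == global.size());
    (void)total;  // only used by the assertion

    std::vector<std::vector<double>> perRank;
    perRank.reserve(counts.size());
    std::size_t offset = 0;
    for (std::size_t n : counts) {
        const auto first = global.begin() + static_cast<std::ptrdiff_t>(offset);
        perRank.emplace_back(first, first + static_cast<std::ptrdiff_t>(n));
        offset += n;
    }
    return perRank;
}
```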

Add cmake workflow preset

Since we have a configure and a build preset, a workflow preset should also be available. However, this might force users to use a more recent version of cmake.

Create a base class for device persistent data

At several places in the OGL code base the following pattern occurs.

  • Given a name, the object registry is consulted to check whether this object was already created.
  • If an object with the given name does not exist, create one, add it to the object registry, and return the pointer to it.
  • If it exists and no update was requested, the pointer is returned and nothing else happens.
  • If it exists and an update was requested, call an update routine.

This could be all wrapped into a base class and used for:

  • the sparsity pattern,
  • the matrix values,
  • the global addressing

and probably other IO classes like the IOExecPtr. It should also address rank specific and global data for parallel runs.
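
The get-or-create pattern described above can be sketched as follows. `PersistentRegistry` is a hypothetical stand-in for the OpenFOAM object registry, and the `init`/`update` callbacks are assumptions of this sketch rather than the interface proposed for OGL.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical registry keyed by object name.
template <class T>
class PersistentRegistry {
public:
    using Init = std::function<T()>;
    using Update = std::function<void(T&)>;

    // Creates the object on first request; afterwards returns the stored
    // object and calls update only if an update was requested.
    std::shared_ptr<T> getOrCreate(const std::string& name, const Init& init,
                                   const Update& update, bool requestUpdate)
    {
        auto it = objects_.find(name);
        if (it == objects_.end()) {
            auto obj = std::make_shared<T>(init());
            objects_.emplace(name, obj);
            return obj;
        }
        if (requestUpdate) {
            update(*it->second);
        }
        return it->second;
    }

    std::size_t size() const { return objects_.size(); }

private:
    std::map<std::string, std::shared_ptr<T>> objects_;
};
```

The sparsity pattern, the matrix values, and the global addressing would each supply their own `init` and `update` callables and share this lookup logic.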

Refactor lduLduBase

The current structure of the lduLduBase class is not ideal and several changes should be considered:

  1. Rename the class to make the name more descriptive; a candidate would be SegregatetedCoupledSolverBase.
  2. Instead of deriving from HostMatrix and IOGKOMatrixHandler, an intermediate class LduCsrWrapper should be implemented.

The new SegregatetedCoupledSolverBase class has the following responsibilities:

Data members:

  • vector of DevicePersistent initial guesses and rhs

Methods:

  • generate_solver
  • generate_preconditioner
  • initialisation of initial guesses and rhs
  • solve_impl

The new intermediate LduCsrWrapper class has the following responsibilities:

Data members:

  • DevicePersistent sparsity pattern
  • DevicePersistent LduCsrMapping
  • DevicePersistent values

Methods:

  • init/update sparsity pattern
  • init/update lduCsrMapping
  • init/update values

Most of the functionality could be implemented as a mixin which provides the needed update and init functions.

This depends on the implementation of the DevicePersistent class, #49.

Offload matrix value reordering to device.

Currently, the matrix values are reordered in serial on the host. This should be avoided and offloaded to the device.

TODO:

  • Test whether it is more expensive to create COO on the device and convert to CSR
  • or keep sorting the elements before creating COO
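
As a host-side reference for the two options, the sketch below sorts COO entries by row and column and then derives the CSR row pointers with a count and prefix sum. The names `CooEntry` and `sortAndBuildRowPtr` are hypothetical; a device comparison would time the equivalent sort against a device-side COO to CSR conversion.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// One matrix entry in coordinate (COO) format.
struct CooEntry {
    std::size_t row;
    std::size_t col;
    double val;
};

// Sorts the entries by (row, col) in place and returns the CSR row pointers.
std::vector<std::size_t> sortAndBuildRowPtr(std::vector<CooEntry>& coo,
                                            std::size_t rows)
{
    std::sort(coo.begin(), coo.end(),
              [](const CooEntry& a, const CooEntry& b) {
                  return std::tie(a.row, a.col) < std::tie(b.row, b.col);
              });

    // Count entries per row, shifted by one, then take the prefix sum.
    std::vector<std::size_t> rowPtr(rows + 1, 0);
    for (const CooEntry& e : coo) {
        assert(e.row < rows);
        ++rowPtr[e.row + 1];
    }
    for (std::size_t i = 0; i < rows; ++i) {
        rowPtr[i + 1] += rowPtr[i];
    }
    return rowPtr;
}
```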

Multiple GPUs

Once global addressing (#43) is implemented and #42 is merged, single GPU per rank should be implemented.

Add test case to CI/CD pipeline

The CI/CD pipeline should include a dnsBoxturb16 test case to ensure that the basic solver and the omp/reference executors work.

Rename dpcpp executor to sycl

Rename dpcpp executor to sycl. Since dpcpp has been rebranded to sycl we should also reflect this. The executor selection takes place in ExecutorHandler.H. Ideally, we would add a deprecation warning for dpcpp.

IR support

Currently, the iterative refinement (IR) solver is excluded from compilation. It should be re-added including integration tests.

Compiling against single precision OpenFOAM leads to undefined symbol errors

Currently, we only support DP scalars. Compiling against SP OpenFOAM leads to the following error.

v2312/platforms/linux64GccSPInt32Opt/lib/libOGL.so: undefined symbol: _ZNK4Foam9lduMatrix6solver11scalarSolveERNS_5FieldIdEERKS3_h

A workaround is to modify the CMakeLists.txt and set

  target_compile_definitions(OGL PUBLIC WM_LABEL_SIZE=32 WM_ARCH_OPTION=32
              NoRepository WM_SP)

Here, the important bit is to set WM_SP, since this is used with #ifdef macros to set the scalar type in scalar.H. Eventually, this should be done automatically by cmake, by reading $WM_LABEL_SIZE and $WM_PRECISION_OPTION.
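
The undefined symbol above contains `FieldIdE`, which is the mangled form of `Field<double>`, so the library was built for double-precision scalars while the SP OpenFOAM build uses float. The mechanism can be sketched as below; `scalar_sketch` and `scalarBytes` are hypothetical names, and only the `WM_SP` macro is taken from the issue.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Selects the scalar type from a preprocessor definition, in the spirit of
// the #ifdef logic the issue attributes to scalar.H. Without WM_SP the
// sketch falls back to double.
#if defined(WM_SP)
using scalar_sketch = float;
#else
using scalar_sketch = double;
#endif

static_assert(std::is_floating_point<scalar_sketch>::value,
              "scalar_sketch must be a floating point type");

// Size of the selected scalar type in bytes.
constexpr std::size_t scalarBytes() { return sizeof(scalar_sketch); }
```

Every translation unit that includes OpenFOAM headers has to see the same definition, which is why the workaround sets it with PUBLIC visibility on the target.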

Attempt to cast type * to type processor Error

The current version of OGL (v0.5.1) fails with the following error if interfaces other than processor interfaces are present.

[OGL LOG][Proc: 0]: global_index_init: 0.305 [ms]
[0] 
[0] 
[0] --> FOAM FATAL ERROR: (openfoam-2306)
[0] Attempt to cast type cyclicACMI to type processor
[0] 
[0]   From Type& Foam::refCast(U&) [with Type = const Foam::processorFvPatch; U = const Foam::lduInterface]
[0]   in file /home/greole/OpenFOAM/openfoam/src/OpenFOAM/lnInclude/typeInfo.H at line 154.
[0] 
FOAM parallel run aborting
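
One way to avoid an unconditional cast is to check the dynamic type first and skip interfaces that are not processor interfaces. The sketch below uses plain `dynamic_cast` and hypothetical classes (`Interface`, `ProcessorInterface`, `CyclicInterface`) in place of the OpenFOAM types; whether skipping is the right treatment for cyclicACMI patches in OGL is not established here.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-ins for the OpenFOAM interface hierarchy.
struct Interface {
    virtual ~Interface() = default;
};
struct ProcessorInterface : Interface {
    explicit ProcessorInterface(int rank) : neighbRank(rank) {}
    int neighbRank;
};
struct CyclicInterface : Interface {};

// Collects neighbour ranks from processor interfaces only. Other interface
// types are skipped instead of triggering a failing cast.
std::vector<int> neighbourRanks(
    const std::vector<std::unique_ptr<Interface>>& interfaces)
{
    std::vector<int> ranks;
    for (const auto& iface : interfaces) {
        assert(iface != nullptr);
        if (const auto* proc =
                dynamic_cast<const ProcessorInterface*>(iface.get())) {
            ranks.push_back(proc->neighbRank);
        }
    }
    return ranks;
}
```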

Compilation issue

Hi, I have a compilation issue: Ginkgo compiles fine, but the compilation of OGL produces errors. I tried different OpenFOAM versions (OF 6, OF 8, OF10, OFv2106, OF2206). It always produces similar error messages; the full log is in the attachment:

/root/OpenFOAM/-8/platforms/linux64GccDPInt32Opt/include/ginkgo/core/base/utils_helper.hpp:215:58: error: static assertion failed: p must be an rvalue for this function to work
  215 |     static_assert(std::is_rvalue_reference<decltype(p)>::value,
      |                                                          ^~~~~

I tried both GCC and CLANG compilers.
makelog.txt

Incorporate global addressing

Whenever matrix indices are handled, a clear way is needed to distinguish local and global addressing. This includes, for example, the HostMatrix class.

Use OF solver cache instead of own device persistency

Currently, device persistency is guaranteed by deriving from DevicePersistent<>; however, OF allows caching solvers. This should be sufficient to avoid re-instantiation of the full solver. This should be explored in more detail to remove the need for DevicePersistent<> altogether.

Improve logging capabilities.

Currently, there are only two options: logging on or off. It would be helpful to add a verbosity level in the following manner:

  • 0 important information giving a high-level overview, like timings
  • 1 main program flow
  • 2 information that allows monitoring allocations and gives a detailed overview of the program flow
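
A minimal sketch of such a level filter, assuming a hypothetical `VerbosityLogger` class; the `[OGL LOG]` prefix follows the log lines quoted in other issues, everything else is illustrative.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Prints a message only if its level does not exceed the configured
// verbosity (0: overview, 1: main program flow, 2: detailed flow).
class VerbosityLogger {
public:
    VerbosityLogger(int verbosity, std::ostream& out)
        : verbosity_(verbosity), out_(out)
    {
        assert(verbosity >= 0);
    }

    void log(int level, const std::string& msg) const
    {
        if (level <= verbosity_) {
            out_ << "[OGL LOG] " << msg << '\n';
        }
    }

private:
    int verbosity_;
    std::ostream& out_;
};
```

With verbosity 1, messages of level 0 and 1 are printed and level 2 messages are dropped.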

Permute on device

Previously, it was possible to reorder matrix coefficients on the device. This has been temporarily deactivated in favor of repartitioning. Since repartitioning should ideally be handled on the Ginkgo side, and in any case should be possible on the device, an optional reordering on the device should be reintroduced.

Add pragma once include guards.

We should replace the long include guards

#ifndef OGL_lduLduBase_INCLUDED_H
#define OGL_lduLduBase_INCLUDED_H
...
#endif

by #pragma once. Also, some .H files might not even have an include guard.
