matcha's People

Contributors

davytorres, dominick99, everythingfunctional, jerrymiah12, rouson

Forkers

jeffhammond

matcha's Issues

Object-Oriented branch issue

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
Running Tests

Test that
    a t_cell_collection
        is constructed with positions in the specified domain

A total of 1 test cases

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0  0x7f2b5353b692 in ???
#1  0x7f2b5353a865 in ???
#2  0x7f2b531d551f in ???
	at ./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
#0  0x7f864ccbe692 in ???
#1  0x7f864ccbd865 in ???
#2  0x7f864c95851f in ???
	at ./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
#3  0x7f2b5323847e in __GI___libc_free
	at ./malloc/malloc.c:3368
#4  0x403fe7 in check_constructed_domain
	at test/t_cell_collection_test.f90:34
#5  0x45d95a in __garden_simple_test_case_m_MOD_run_without_input
	at build/dependencies/garden/src/garden/simple_test_case_m.f90:111
#6  0x475181 in __garden_test_item_m_MOD_run_without_input
	at build/dependencies/garden/src/garden/test_item_m.f90:86
#7  0x466e4a in __garden_simple_test_collection_m_MOD_run_without_input
	at build/dependencies/garden/src/garden/simple_test_collection_m.f90:138
#8  0x475181 in __garden_test_item_m_MOD_run_without_input
	at build/dependencies/garden/src/garden/test_item_m.f90:86
#9  0x466e4a in __garden_simple_test_collection_m_MOD_run_without_input
	at build/dependencies/garden/src/garden/simple_test_collection_m.f90:138
#10  0x475181 in __garden_test_item_m_MOD_run_without_input
	at build/dependencies/garden/src/garden/test_item_m.f90:86
#11  0x407778 in __garden_run_tests_m_MOD_run_tests
	at build/dependencies/garden/src/garden/run_tests_m.f90:63
#12  0x403a57 in run
	at test/main.f90:22
#13  0x403c68 in MAIN__
	at test/main.f90:5
#14  0x403ccd in main
	at test/main.f90:5
#3  0x7f864c9bb47e in __GI___libc_free
	at ./malloc/malloc.c:3368
#4  0x403fe7 in check_constructed_domain
	at test/t_cell_collection_test.f90:34
#5  0x45d95a in __garden_simple_test_case_m_MOD_run_without_input
	at build/dependencies/garden/src/garden/simple_test_case_m.f90:111
#6  0x475181 in __garden_test_item_m_MOD_run_without_input
	at build/dependencies/garden/src/garden/test_item_m.f90:86
#7  0x466e4a in __garden_simple_test_collection_m_MOD_run_without_input
	at build/dependencies/garden/src/garden/simple_test_collection_m.f90:138
#8  0x475181 in __garden_test_item_m_MOD_run_without_input
	at build/dependencies/garden/src/garden/test_item_m.f90:86
#9  0x466e4a in __garden_simple_test_collection_m_MOD_run_without_input
	at build/dependencies/garden/src/garden/simple_test_collection_m.f90:138
#10  0x475181 in __garden_test_item_m_MOD_run_without_input
	at build/dependencies/garden/src/garden/test_item_m.f90:86
#11  0x407778 in __garden_run_tests_m_MOD_run_tests
	at build/dependencies/garden/src/garden/run_tests_m.f90:63
#12  0x403a57 in run
	at test/main.f90:22
#13  0x403c68 in MAIN__
	at test/main.f90:5
#14  0x403ccd in main
	at test/main.f90:5
--------------------------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpiexec noticed that process rank 0 with PID 0 on node dominick-VirtualBox exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
Error: Command:
   `/home/linuxbrew/.linuxbrew/bin/mpiexec -n 2 --oversubscribe build/caf_3E83561A255DFE60/test/matcha-test`
failed to run.
<ERROR> Execution failed for object " matcha-test "
<ERROR>*cmd_run*:stopping due to failed executions
STOP 1

Separate interfaces from implementations

Define procedure interfaces in modules and procedure definitions in submodules:

  • initialize_positions (Damian)
  • create_distribution (Dominick)
  • move_tcells (Jeremiah)

Make cell motion random

The current algorithm for moving T cells doesn't use the generated random numbers. These need to be added as multiplicative factors in the time step algorithm inside the move_tcells subroutine.
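One possible shape for this change is sketched below. Everything in it is hypothetical: the module name move_sketch_m, the function advance, its argument names, and the choice to apply the random number as a factor on each cell's speed are assumptions for illustration, not the existing move_tcells interface.

```fortran
module move_sketch_m
  ! Hypothetical sketch: names and array shapes are illustrative, not Matcha's actual API.
  implicit none
  private
  public :: advance

contains

  pure function advance(position, direction, speed, random_factor, dt) result(new_position)
    ! Advance every cell by one time step, scaling each cell's speed by its random factor.
    double precision, intent(in) :: position(:,:)     ! assumed shape [ncells, 3]
    double precision, intent(in) :: direction(:,:)    ! assumed unit vectors, shape [ncells, 3]
    double precision, intent(in) :: speed(:)          ! nominal speed of each cell
    double precision, intent(in) :: random_factor(:)  ! one random number per cell, e.g. from random_number
    double precision, intent(in) :: dt                ! time step
    double precision, allocatable :: new_position(:,:)
    integer d

    allocate(new_position, mold=position)
    do d = 1, size(position, 2)
      ! The random number enters as a multiplicative factor in the update.
      new_position(:,d) = position(:,d) + dt*random_factor*speed*direction(:,d)
    end do
  end function

end module
```

With a factor of zero a cell stays put, and with a factor of one it moves at its nominal speed, which gives a simple property to check in a unit test.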

Multi-image (parallel) execution using OpenCoarrays crashes.

Steps to reproduce error:

1.) Check out main and pull the latest changes: git checkout main && git pull

2.) Run the setup script: ./setup.sh

3.) Run command for parallel execution: ./build/run-fpm.sh run --compiler caf --runner "cafrun -n 1"

dominick@dominick-VirtualBox:~/matcha$ ./setup.sh
Setting FC=/home/linuxbrew/.linuxbrew/bin/gfortran-12
Setting CC=/home/linuxbrew/.linuxbrew/bin/gcc-12
Setting CXX=/home/linuxbrew/.linuxbrew/bin/g++-12
PREFIX=/home/dominick/.local
PKG_CONFIG_PATH=/home/dominick/.local/lib/pkgconfig
Project is up to date

________________ Caffeine has been dispensed! ________________

To rebuild or to run tests or examples via the Fortran Package
Manager (fpm) with the required compiler/linker flags, pass a
fpm command to the build/run-fpm.sh script. For example, run
the program example/hello.f90 as follows:

./build/run-fpm.sh run --example hello
/home/dominick/matcha
PKG_CONFIG_PATH=/home/dominick/.local/lib/pkgconfig

________________ Matcha has been poured! ________________

To rebuild or to run tests or examples via the Fortran Package
Manager (fpm) with the required compiler/linker flags, pass a
fpm command to the build/run-fpm.sh script. For example, to
run the Matcha test suite in a single image execute the
following command in a Linux or macOS shell:

./build/run-fpm.sh test --flag -fcoarray=single

For parallel execution instructions, please see the README.md file.
./build/run-fpm.sh run --compiler caf --runner "cafrun -n 1"
input_m.f90                            compiling...
[  0%] Compiling...Fortran runtime error: EXECUTE_COMMAND_LINE: Invalid command line

Error termination. Backtrace:
#0  0x7fb0b9e259f2 in ???
#1  0x7fb0b9e264e9 in ???
#2  0x7fb0b9e2694c in ???
#3  0x7fb0ba088723 in ???
#4  0x7fb0ba088891 in ???
#5  0x55ab65268eaf in __fpm_filesystem_MOD_run
	at ./src/fpm_filesystem.F90:910
#6  0x55ab65263319 in __fpm_compiler_MOD_compile_fortran
	at ./src/fpm_compiler.f90:888
#7  0x55ab6524eac0 in build_target
	at ./src/fpm_backend.F90:321
#8  0x55ab6524eac0 in __fpm_backend_MOD_build_package
	at ./src/fpm_backend.F90:128
#9  0x55ab652488f0 in __fpm_MOD_cmd_run
	at ./src/fpm.f90:451
#10  0x55ab65241ebc in MAIN__
	at app/main.f90:74
#11  0x55ab6524143e in main
	at app/main.f90:11

Evaluate performance impacts of reducing copying

Motivation

At the heart of our current T-cell tracking algorithm inside the matcha function, we allocate a history array of t_cell_collection_t objects at line 52. We then invoke the t_cell_collection_t constructor function on the right-hand-side (RHS) of line 56 to create a new object at each time step. We then use an intrinsic assignment to copy each newly constructed object into successive elements of the history array. As we begin to study and tune Matcha's performance, let's plan to study the current code's performance relative to versions that eliminate the aforementioned object copying.

Two options for avoiding the copies are

  1. Using a subroutine constructor instead of a function constructor.
  2. Replacing the intrinsic assignment with a pointer assignment.

Option 1 is a much more common style in Fortran. Option 2 addresses a common criticism of functional programming: the frequent need for temporary objects such as the right-hand-side function result in line 56.

Hypothesis

I hypothesize that options 1 and 2 will have only minimal impact on execution times and scalability and will have associated costs in code clarity and robustness relative to the functional programming pattern currently exhibited in Matcha.

Subroutine Constructor

I recommend writing a generic binding named after the type to make it clear that the purpose of the corresponding subroutine is to serve as a constructor.

module t_cell_collection_m

! ...

type t_cell_collection_t
   ! ...
contains
   procedure :: define
   generic :: t_cell_collection_t => define
end type

where ! ... indicates code that was omitted for brevity. With this approach, we would also write a corresponding pure module subroutine interface body in the same module and a procedure definition in the corresponding submodule. The pure subroutine passed-object dummy argument (often named self) will have the intent(out) attribute. All other arguments will have the intent(in) attribute.
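A fuller sketch of the subroutine-constructor pattern described above might look like the following. The components positions_ and time_, the accessor functions, and the dummy argument names are assumptions made for illustration; only the define binding and the generic binding named after the type come from the snippet above.

```fortran
module t_cell_collection_m
  implicit none
  private
  public :: t_cell_collection_t

  type t_cell_collection_t
    private
    double precision, allocatable :: positions_(:,:)  ! assumed component
    double precision :: time_ = 0.d0                  ! assumed component
  contains
    procedure :: define
    generic :: t_cell_collection_t => define          ! generic binding named after the type
    procedure :: positions
    procedure :: time
  end type

  interface

    pure module subroutine define(self, new_positions, new_time)
      implicit none
      class(t_cell_collection_t), intent(out) :: self  ! passed-object dummy argument
      double precision, intent(in) :: new_positions(:,:), new_time
    end subroutine

    pure module function positions(self) result(my_positions)
      implicit none
      class(t_cell_collection_t), intent(in) :: self
      double precision, allocatable :: my_positions(:,:)
    end function

    pure module function time(self) result(my_time)
      implicit none
      class(t_cell_collection_t), intent(in) :: self
      double precision my_time
    end function

  end interface

end module

submodule(t_cell_collection_m) t_cell_collection_s
  implicit none
contains

  module procedure define
    ! intent(out) resets self, so every component is (re)defined here
    self%positions_ = new_positions
    self%time_ = new_time
  end procedure

  module procedure positions
    my_positions = self%positions_
  end procedure

  module procedure time
    my_time = self%time_
  end procedure

end submodule
```

Under these assumptions, client code would construct an element in place with something like call history(i)%t_cell_collection_t(new_positions, new_time), so no function result needs to be copied into the array element.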

Pointer Assignment

The code below provides an example of what this approach will look like. For simplicity, t_cell_collection_t has just one scalar component in this example. The basic idea is to define a sister t_cell_collection_ptr_t type, which facilitates creating an array of pointers (history below). With this approach, we can still use a function constructor, but we replace the intrinsic assignment of line 56 with the following pointer assignment:

history(i)%t_cell_collection => t_cell_collection_t(i)

which eliminates the copy that motivated this approach.

Demonstration pointer code

module t_cell_collection_m
  implicit none

  private
  public :: t_cell_collection_t, t_cell_collection_ptr_t

  type t_cell_collection_t
    private
    real x_
  contains
    procedure :: x
  end type

  type t_cell_collection_ptr_t
    type(t_cell_collection_t), pointer :: t_cell_collection=>null()
  end type

  interface t_cell_collection_t

    pure module function construct(j) result(t_cell_collection)
      implicit none
      integer, intent(in) :: j
      type(t_cell_collection_t), pointer :: t_cell_collection
    end function

  end interface

  interface

    pure module function x(self) result(self_x)
      implicit none
      class(t_cell_collection_t), intent(in) :: self
      real self_x
    end function

  end interface

end module

submodule(t_cell_collection_m) t_cell_collection_s
  implicit none
contains

  module procedure construct
    allocate(t_cell_collection)
    t_cell_collection%x_ = real(j)
  end procedure 

  module procedure x
    self_x = self%x_
  end procedure 

end submodule

program main
  use t_cell_collection_m, only : t_cell_collection_t, t_cell_collection_ptr_t
  implicit none

  type(t_cell_collection_ptr_t), allocatable :: history(:)
  integer i

  allocate(history(10))

  do i =1,size(history)
    history(i)%t_cell_collection => t_cell_collection_t(i)
  end do

  do i =1,size(history)
    print *,history(i)%t_cell_collection%x()
  end do

  do i =1,size(history)
    if (associated(history(i)%t_cell_collection)) deallocate(history(i)%t_cell_collection)
  end do
end program main

Distribute cells across images

  • Write a unit test that checks that the number of cells summed across images is the total number of cells.
  • Add the Sourcery library as a dependency in the fpm manifest (fpm.toml).
  • Use the Sourcery library's data_partition_t class to distribute the cells.
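As a rough illustration of the property the unit test above would check, the sketch below computes a block distribution by hand. It deliberately does not use Sourcery's data_partition_t, whose interface is not shown here; the module name and the functions first_cell and last_cell are hypothetical stand-ins.

```fortran
module block_distribution_sketch_m
  ! Hypothetical stand-in for a data partition: not Sourcery's data_partition_t.
  implicit none
  private
  public :: first_cell, last_cell

contains

  pure function first_cell(num_cells, image, num_images) result(first)
    ! Index of the first cell owned by the given image (images are numbered from 1).
    integer, intent(in) :: num_cells, image, num_images
    integer first
    first = (image-1)*(num_cells/num_images) + min(image-1, mod(num_cells, num_images)) + 1
  end function

  pure function last_cell(num_cells, image, num_images) result(last)
    ! Index of the last cell owned by the given image; leftover cells go to the lowest-numbered images.
    integer, intent(in) :: num_cells, image, num_images
    integer last
    last = image*(num_cells/num_images) + min(image, mod(num_cells, num_images))
  end function

end module
```

In a multi-image run, each image would pass this_image() and num_images() to these functions, and the test could sum last - first + 1 across images (for example with co_sum) and compare the result with the total number of cells.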

Single-image crash in ifx libc

Steps to reproduce the runtime error:

git clone -b use_ifx https://github.com/berkeleylab/matcha
cd matcha
cp templates/fpm.toml-template ./fpm.toml
export FOR_COARRAY_NUM_IMAGES=1
$ fpm run --compiler ifx --flag "-coarray=shared"
 + mkdir -p build/dependencies
Initialized empty Git repository in /storage/users/rouson/tmp/matcha/build/dependencies/assert/.git/
remote: Enumerating objects: 29, done.
remote: Counting objects: 100% (29/29), done.
remote: Compressing objects: 100% (27/27), done.
remote: Total 29 (delta 0), reused 16 (delta 0), pack-reused 0
Unpacking objects: 100% (29/29), 13.94 KiB | 528.00 KiB/s, done.
From https://github.com/sourceryinstitute/assert
 * branch            a3065a9dffaedf085fbd262c6bf31b309aa43a4a -> FETCH_HEAD
distribution_m.f90                     done.
input_m.f90                            done.
assert_m.F90                           done.
characterizable_m.f90                  done.
data_partition_m.f90                   done.
input_s.f90                            done.
t_cell_collection_m.f90                done.
assert_s.F90                           done.
intrinsic_array_m.F90                  done.
data_partition_s.F90                   done.
do_concurrent_m.f90                    done.
output_m.f90                           done.
t_cell_collection_s.F90                done.
intrinsic_array_s.F90                  done.
matcha_m.f90                           done.
distribution_s.F90                     done.
do_concurrent_s.f90                    done.
output_s.f90                           done.
matcha_s.F90                           done.
main.F90                               done.
libmatcha.a                            done.
matcha                                 done.
[100%] Project compiled successfully.
[jupiter:2521117:0:2521117] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x7f6ade4e1498)
==== backtrace (tid:2521117) ====
 0  /lib/libucs.so.0(ucs_handle_error+0x2e4) [0x7f6b4c42be74]
 1  /lib/libucs.so.0(+0x3008f) [0x7f6b4c42c08f]
 2  /lib/libucs.so.0(+0x303c4) [0x7f6b4c42c3c4]
 3  /lib/x86_64-linux-gnu/libpthread.so.0(+0x14420) [0x7f6b5104a420]
 4  build/ifx_C4BBCE17D21A365D/app/matcha() [0x40b43f]
 5  build/ifx_C4BBCE17D21A365D/app/matcha() [0x40a308]
 6  build/ifx_C4BBCE17D21A365D/app/matcha() [0x407469]
 7  build/ifx_C4BBCE17D21A365D/app/matcha() [0x40552e]
 8  build/ifx_C4BBCE17D21A365D/app/matcha() [0x40527d]
 9  /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x7f6b50e68083]
10  build/ifx_C4BBCE17D21A365D/app/matcha() [0x40519e]
=================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 0 PID 2521117 RUNNING AT jupiter
=   KILLED BY SIGNAL: 11 (Segmentation fault)
===================================================================================
<ERROR> Execution failed for object " matcha "
<ERROR>*cmd_run*:stopping due to failed executions
STOP 1

Apply license and copyright

  • Add the code-blocked text below in a new LICENSE.txt file at the top level of the source tree.
  • Add a brief statement referencing the LICENSE.txt file at the top of each source file in the app, src, and test subdirectories.

For an example of the brief reference statement, see lines 1-2 of the source files in Caffeine.

**********************************
** Matcha Terms and Conditions **
**********************************

All files in this directory and all subdirectories (except where otherwise noted)
are subject to the following copyright and licensing terms:

*** Copyright Notice ***

Motility Analysis of T-Cell Histories in Activation (Matcha) Copyright (c) 2022, 
The Regents of the University of California, through Lawrence Berkeley National 
Laboratory (subject to receipt of any required approvals from the U.S. Dept. of 
Energy) and Northern New Mexico College. All rights reserved.

If you have questions about your rights to use or distribute this software,
please contact Berkeley Lab's Intellectual Property Office at
[[email protected]](mailto:[email protected]).

NOTICE.  This Software was developed under funding from the U.S. Department
of Energy and the U.S. Government consequently retains certain rights.  As
such, the U.S. Government has been granted for itself and others acting on
its behalf a paid-up, nonexclusive, irrevocable, worldwide license in the
Software to reproduce, distribute copies to the public, prepare derivative 
works, and perform publicly and display publicly, and to permit others to do so.


****************************

*** License Agreement ***


Motility Analysis of T-Cell Histories in Activation (Matcha) Copyright (c) 2022,
The Regents of the University of California, through Lawrence Berkeley National
Laboratory (subject to receipt of any required approvals from the U.S. Dept. of
Energy) and Northern New Mexico College. All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

(1) Redistributions of source code must retain the above copyright notice,
this list of conditions and the following disclaimer.

(2) Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.

(3) Neither the name of the University of California, Lawrence Berkeley
National Laboratory, U.S. Dept. of Energy, Northern New Mexico College 
nor the names of its contributors may be used to endorse or promote 
products derived from this software without specific prior written permission.


THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
POSSIBILITY OF SUCH DAMAGE.

You are under no obligation whatsoever to provide any bug fixes, patches,
or upgrades to the features, functionality or performance of the source
code ("Enhancements") to anyone; however, if you choose to make your
Enhancements available either publicly, or directly to Lawrence Berkeley
National Laboratory, without imposing a separate written license agreement
for such Enhancements, then you hereby grant the following license: a
non-exclusive, royalty-free perpetual license to install, use, modify,
prepare derivative works, incorporate into other computer software,
distribute, and sublicense such enhancements or derivative works thereof,
in binary and source code form.

Declare array dummy arguments as assumed-shape

Current array dummy arguments are automatic explicit-shape arrays as defined by Note 2 in clause 8.5.8.1 of the Fortran 2018 standard. This approach was common in Fortran 77, but it poses a consistency issue: the array dimensions are passed as separate dummy arguments, so incorrect dimensions could be passed. Switching to the more modern (Fortran 90 and beyond) assumed-shape array dummy arguments makes the code more concise (fewer arguments) and more robust. This is both a stylistic best practice and a prelude to parallelization because eliminating the dimensions makes it more obvious that the loop over the elements can be replaced with an array statement. Let's switch

  • random_positions in the initialize_positions subroutine
  • cumulative_distribution and vel in the create_distribution subroutine (David) (no longer arguments)
  • x, y, z, vel, randome_speeds, random_direction in the move_tcells subroutine (Dominick, Damian)
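The contrast between the two styles can be sketched as follows. The module and procedure names (assumed_shape_sketch_m, shift_explicit, shift_assumed) are made up for this illustration and do not correspond to procedures in the code.

```fortran
module assumed_shape_sketch_m
  ! Hypothetical procedures contrasting the two dummy-argument styles.
  implicit none
  private
  public :: shift_explicit, shift_assumed

contains

  subroutine shift_explicit(n, x, dx)
    ! Explicit-shape style: the caller must pass an extent n that matches the actual arrays.
    integer, intent(in) :: n
    double precision, intent(inout) :: x(n)
    double precision, intent(in) :: dx(n)
    integer i
    do i = 1, n
      x(i) = x(i) + dx(i)
    end do
  end subroutine

  pure subroutine shift_assumed(x, dx)
    ! Assumed-shape style: the extents travel with the arrays, so no size argument is needed
    ! and the loop collapses to an array statement.
    double precision, intent(inout) :: x(:)
    double precision, intent(in) :: dx(:)
    x = x + dx
  end subroutine

end module
```

Assumed-shape dummy arguments require an explicit interface, which placing the procedures in a module provides automatically.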

Define an integration test

  • Convert the original main program into a function that returns a result for comparison to the new code. The result would probably be declared something like double precision, allocatable :: positions(:,:,:), which would be allocated to shape [ncells, npositions, 3], where 3 is the number of spatial directions and where positions(:,:,1) holds x values, positions(:,:,2) holds y values, and positions(:,:,3) holds z values.
  • TBD: we can discuss in an upcoming call how best to compare the function result with that of the new code.
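A minimal sketch of the function-result layout described in the first bullet is shown below. The function name legacy_positions and its arguments are hypothetical, and the body only allocates and zero-fills the result where the original main program's computation would go.

```fortran
module integration_sketch_m
  ! Hypothetical wrapper illustrating only the shape of the proposed function result.
  implicit none
  private
  public :: legacy_positions

contains

  function legacy_positions(ncells, npositions) result(positions)
    integer, intent(in) :: ncells, npositions
    double precision, allocatable :: positions(:,:,:)

    ! positions(:,:,1) holds x values, positions(:,:,2) y values, positions(:,:,3) z values
    allocate(positions(ncells, npositions, 3))
    positions = 0.d0  ! placeholder: the original main program's computation would fill this array
  end function

end module
```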

Name magic numbers

  • main: name the velocity_space_dimension (4)
  • move_m: name the velocity_space_dimension (4)
  • t_cell_s: name the domain scaling factor (100)
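For illustration, the named constants could be declared as below. The name velocity_space_dimension comes from the list above; domain_scaling_factor, its double precision kind, and the scaled function showing how it might be applied are assumptions.

```fortran
module named_constants_sketch_m
  ! Hypothetical module: only velocity_space_dimension is a name taken from the issue text.
  implicit none
  private
  public :: velocity_space_dimension, domain_scaling_factor, scaled

  integer, parameter :: velocity_space_dimension = 4
  double precision, parameter :: domain_scaling_factor = 100.d0  ! assumed name and kind

contains

  elemental function scaled(unit_coordinate) result(coordinate)
    ! Assumed usage: map a coordinate in the unit interval onto the simulation domain.
    double precision, intent(in) :: unit_coordinate
    double precision coordinate
    coordinate = domain_scaling_factor*unit_coordinate
  end function

end module
```

Defining velocity_space_dimension once and importing it in both main and move_m would also keep the two uses from drifting apart.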

GPU offloading

Implement GPU offloading using one of the following:

  • Automatic offloading via the Intel ifx compiler running at University of Oregon
  • OpenMP offloading
  • OpenACC

Dynamically allocate cell data arrays

Rationale

The original version of TCelSim fixes the number of cells as the compile-time constant value ncells. In parallel execution, the set of cells will be distributed across images. Because the number of images will not necessarily be known at compile time, we need to allow for the number of cells in a given image to be determined at runtime. For this reason, arrays that have ncells as one dimension's extent need to be dynamically allocated.

Steps

  • Add the allocatable attribute to the following array declarations in the main program: random_number_table, x, y, and z.
  • Add an allocate statement in the main program for each of the above arrays.
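The steps above can be sketched as follows. The rank-2 shapes, the npositions argument, and the wrapper subroutine allocate_cell_arrays are assumptions for illustration (random_number_table is omitted because its shape is not described here); in the main program the allocate statement would appear directly.

```fortran
module allocation_sketch_m
  ! Hypothetical helper: in the main program, the allocate statement would appear inline.
  implicit none
  private
  public :: allocate_cell_arrays

contains

  subroutine allocate_cell_arrays(ncells, npositions, x, y, z)
    ! ncells is now a runtime value, e.g. the number of cells owned by the executing image.
    integer, intent(in) :: ncells, npositions
    double precision, allocatable, intent(out) :: x(:,:), y(:,:), z(:,:)  ! assumed rank and kind

    allocate(x(ncells, npositions), y(ncells, npositions), z(ncells, npositions))
  end subroutine

end module
```

Because the arrays are allocatable, they are deallocated automatically when they go out of scope, so no matching deallocate statements are required.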

Enable the use of Caffeine as an alternative to OpenCoarrays

  • Add Caffeine as a dependency in the fpm manifest (fpm.toml)
  • Change the extension in the name of files that will need preprocessing: .f90 -> .F90
  • Add use caffeine_m statements using the C preprocessor
  • Condition the compilation of the new use statements in Matcha on a C preprocessor macro such as USE_CAFFEINE. For example, the matcha_m module will need
#ifdef USE_CAFFEINE
    use caffeine_m, only : this_image => caf_this_image
#endif
  • Repeat the 2nd through 4th steps above for the Caffeine dependency Sourcery, focusing on the data_partition class. If many additional parts of Sourcery need similar revision, we might consider splitting data_partition_t off from Sourcery to become its own library.
  • Also repeat the 2nd through 4th steps above for the Sourcery dependency Assert.
  • Add instructions to the README.md for parallel execution with Caffeine: probably something like fpm test --flag "-DUSE_CAFFEINE"

Associate statements cause 'forrtl: severe (174): SIGSEGV, segmentation fault occurred' error on ifx

The following associate statements appear to cause an error:

do_concurrent_s.f90: Line 14
do_concurrent_s.f90: Line 84
distribution_s.F90: Line 58
matcha_s.F90: Line 46
matcha_s.F90: Line 50

After replacing them, matcha can be run on the ifx compiler using:
fpm run --compiler ifx --flag "-g -O0 -traceback -coarray"

However, using:
fpm test --compiler ifx --flag "-g -O0 -traceback -coarray"

results in the following error:

build/dependencies/iso_varying_string/src/iso_varying_string.f90(1316): error #5623: **Internal compiler error: internal abort** Please report this error along with the circumstances in which it occurred in a Software Problem Report.  Note: File and line given may not be explicit cause of this error.

        string_iachar = iachar(char(c))

------------------------^

compilation aborted for build/dependencies/iso_varying_string/src/iso_varying_string.f90 (code 3)

<ERROR> Compilation failed for object " build_dependencies_iso_varying_string_src_iso_varying_string.f90.o "

<ERROR>stopping due to failed compilation

STOP 1

Diagnose test failure on sync-images branch

As of commit 30c8096, running fpm test on the sync-images branch yields one test failure:

 A subdomain_t
    passes on computing a correctly shaped Laplacian for a 2D flat-topped, step-like plateau.
    passes on reaching the correct steady state solution.
    FAILS on functional pattern results matching procedural results.
 2 of 3 tests pass.
