This repository provides simple Python implementations of 2D forward modeling for acoustic media. The implementations are based on the Devito and Dask libraries. The h5py and segyio packages are also used, along with the 2D Marmousi model for testing. The main goal here is to monitor the memory usage during the execution of the modeling using the memory_profiler tool.
In a nutshell, these are the Python packages required:
- devito
- segyio
- h5py
- memory_profiler
- PyYAML
- dask-jobqueue
To install Devito, follow the instructions from the Devito documentation. I recommend setting up a Python environment with conda, as also suggested on the installation web page:
git clone https://github.com/devitocodes/devito.git
cd devito
conda env create -f environment-dev.yml
source activate devito
pip install -e .
All other packages (e.g., dask-jobqueue, segyio) can be easily installed via pip
once you have activated the devito environment. To install them, run the following:
pip install segyio
pip install dask-jobqueue --upgrade
pip install h5py
pip install PyYAML
pip install memory-profiler
You can generate synthetic seismic data in a parallel or a sequential way. Additionally, you can choose between using the `AcousticWaveSolver` class from Devito's `examples` folder or a Devito `Operator`. Thus, there are four possible options for generating the shot gathers (a small solver-based sketch follows the list):
- Forward modeling for all the shots sequentially using the `AcousticWaveSolver` class
- Forward modeling for all the shots sequentially using a Devito `Operator` (such as that in tutorial 1)
- Forward modeling for all the shots in parallel on a Dask cluster using the `AcousticWaveSolver` class
- Forward modeling for all the shots in parallel on a Dask cluster using a Devito `Operator` (such as that in tutorial 1)
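As a rough illustration of the solver-based route, the sketch below models a single shot with the `AcousticWaveSolver` from Devito's `examples.seismic` package. The toy model, acquisition geometry and parameter values are placeholders, not the Marmousi setup used in this repository.

```python
import numpy as np
from examples.seismic import Model, AcquisitionGeometry
from examples.seismic.acoustic import AcousticWaveSolver

# Toy homogeneous velocity model (placeholder, not the Marmousi grid used here)
vp = np.full((101, 101), 2.5, dtype=np.float32)            # km/s
model = Model(vp=vp, origin=(0., 0.), spacing=(10., 10.),
              shape=vp.shape, space_order=8, nbl=40, bcs="damp")

# One source and a line of receivers near the surface
src_coords = np.array([[500., 20.]])
rec_coords = np.stack([np.linspace(0., 1000., 101),
                       np.full(101, 20.)], axis=1)
geometry = AcquisitionGeometry(model, rec_coords, src_coords,
                               t0=0.0, tn=1000.0, f0=0.010, src_type='Ricker')

# Solver-based forward modeling (the use_solver: true path)
solver = AcousticWaveSolver(model, geometry, space_order=8)
rec, u, _ = solver.forward()
print(rec.data.shape)   # (time samples, receivers)
```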
The parameters used in the modeling are defined in a YAML file (`config.yaml`) in the `config` folder. It includes parameters for configuring the geometry and the absolute paths of the folders where the physical parameters (in HDF5 format) will be read from and the generated shots (in SEG-Y format) will be stored. These paths are set in the variables `parfile_path` and `shotfile_path` of the dictionary `solver_params`. Further control parameters are specified in the dictionary, including the minimum and maximum recording times `t0` and `tn`, the sampling rate `dt`, the peak frequency `f0`, the order of accuracy of the finite-difference scheme for spatial derivatives `space_order`, the number of layers of the absorbing boundary condition `nbl`, and the floating-point precision to be used by Devito in code generation. In these settings you can also decide whether or not to use an instance of the `AcousticWaveSolver` to produce the seismic data: specify a boolean value (true/false) in the `use_solver` key, according to the case.
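For illustration, the parameters could be read with PyYAML roughly as follows; the exact layout of `config.yaml` (e.g., `solver_params` being a top-level key) is an assumption here.

```python
import yaml

# Load the modeling parameters (file layout assumed for illustration)
with open("config/config.yaml") as f:
    config = yaml.safe_load(f)

solver_params = config["solver_params"]
t0, tn, dt = solver_params["t0"], solver_params["tn"], solver_params["dt"]
f0 = solver_params["f0"]
space_order = solver_params["space_order"]
nbl = solver_params["nbl"]
use_solver = solver_params["use_solver"]        # true -> AcousticWaveSolver, false -> Operator
parfile_path = solver_params["parfile_path"]    # where the HDF5 physical parameters are read from
shotfile_path = solver_params["shotfile_path"]  # where the SEG-Y shot gathers are written
```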
To run in parallel you need to set up the Dask cluster in `config.yaml`. Depending on the available hardware, you can launch a local Dask cluster (on your laptop or a single computing node) or a non-local one (on a distributed HPC system with a job scheduler installed). In the latter case I only support an HPC cluster that uses the SLURM scheduler, but you can change this easily. Set `use_local_cluster` to `true` or `false`, depending on the hardware. Here are the settings for running the forward modeling of 184 shots of the Marmousi model on a quad-core laptop (i.e., Intel(R) Core(TM) i7-8565U CPU @ 1.80GHz):
# dask cluster
use_local_cluster: True
queue: "GPUlongB"
project: "cenpes-lde"
n_workers: 4
cores: 1
processes: 1
memory: 2
shot_batch_size: 4
extra_job:
- '-e slurm-%j.err'
- '-o slurm-%j.out'
- '--time=72:00:00'
- '--requeue'
- '--job-name="dask-modeling"'
To optimize the processing of a large number of shots, we divided them into batches. The number of shots per batch is set in `shot_batch_size` and is equal to the number of workers `n_workers`. If the local cluster configuration is used, the `memory` key represents the memory limit (in GB) per worker process; otherwise, it represents the total amount of memory (in GB) that the code could possibly use (not the memory per worker). For the local cluster configuration, only the `n_workers` and `memory` parameters are of interest. The keys `queue`, `project` and `extra_job` are meaningful only in the case of non-local cluster usage.
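A minimal sketch of how these keys might be mapped onto a Dask cluster is shown below; the actual construction in the code may differ, and newer dask-jobqueue releases rename `project`/`job_extra` to `account`/`job_extra_directives`.

```python
from dask.distributed import Client, LocalCluster

def make_cluster(cfg):
    """Build a Dask client from the 'dask cluster' section of config.yaml (key names assumed)."""
    if cfg["use_local_cluster"]:
        cluster = LocalCluster(n_workers=cfg["n_workers"],
                               memory_limit=f'{cfg["memory"]}GB')
    else:
        from dask_jobqueue import SLURMCluster
        cluster = SLURMCluster(queue=cfg["queue"],
                               project=cfg["project"],
                               cores=cfg["cores"],
                               processes=cfg["processes"],
                               memory=f'{cfg["memory"]}GB',
                               job_extra=cfg["extra_job"])
        cluster.scale(cfg["n_workers"])
    return Client(cluster)
```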
After the configuration is completed, you can profile each of the options in the list above using GNU Make. To do so, run `make` with the specific target for each option. For the first and second options, use
make serial.solver
and
make serial.operator
In the case of the third and fourth options, use
make dask.solver
and
make dask.operator
In each case, check the `shotfile_path` folder to see the generated SEG-Y files. After finishing the modeling process for each option, you can plot the recorded memory usage by running the `mprof plot` command on the generated files (named with a date-time stamp). You should obtain figures such as the following:
For the serial case I initially found the results somewhat curious. When the `Operator` was used, the memory remained almost constant over time, but when an `AcousticWaveSolver` instance was created and its `forward` method was called repeatedly, the memory level increased significantly. Since the memory consumption was not approximately the same in both cases, I considered the possibility that the `AcousticWaveSolver` class was not working as expected. After some experiments I came to realize that the creation of `(Sparse)TimeFunction` objects by the `forward` method was the cause of the memory growth (it is more than likely related to issue #845). Fortunately, I sorted out the problem by reusing those objects and arrived at the same memory consumption obtained with the `Operator`.
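A sketch of that fix, assuming the `forward` method accepts pre-allocated wavefield and receiver objects as keyword arguments (as the solver in the Devito examples does), and reusing the `model`, `geometry` and `solver` objects from the earlier sketch, could look like this:

```python
from devito import TimeFunction
from examples.seismic import Receiver

# Allocate the wavefield and receiver container once, outside the shot loop
u = TimeFunction(name="u", grid=model.grid, time_order=2,
                 space_order=solver.space_order)
rec = Receiver(name="rec", grid=model.grid,
               time_range=geometry.time_axis,
               coordinates=geometry.rec_positions)

for src in sources:          # hypothetical list of per-shot sources
    u.data[:] = 0.           # reset the wavefield between shots
    solver.forward(src=src, rec=rec, u=u)
    # ... write rec.data to SEG-Y before moving on to the next shot
```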
In the parallel case it is possible to see a remarkable increase in memory whether the `Operator` or the `AcousticWaveSolver` class is used. Although each Dask task (dedicated to the modeling of one shot) creates its own `Operator` or solver according to the case, I did not expect such an increase, because in theory the memory should be released after each function call. The memory growth is probably related to the same issue mentioned above. It is also important to note that the creation of a new `Operator` or solver for each source slows down the whole code. Added to this, the `sleep` function from the `time` module is used to wait (10 seconds) for the cluster, which ends up producing a longer running time compared to the serial versions. In the broadcast_solver branch the `scatter` function is used to send the `Operator` or solver to all Dask tasks. In addition, a new parallelism strategy is used to try to keep the memory consumption as stable as possible.
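A rough sketch of that scatter-based strategy, with hypothetical helper and variable names (`model_shot`, `shot_coords`, an existing `client` and `solver`), might look like:

```python
from dask.distributed import wait

# Broadcast a single solver instance to every worker instead of rebuilding it per task
solver_future = client.scatter(solver, broadcast=True)

def model_shot(solver, src_coords):
    # Hypothetical per-shot task: move the source and run forward modeling
    solver.geometry.src_positions[:] = src_coords
    rec, _, _ = solver.forward()
    return rec.data

# shot_coords: list of source coordinates, one entry per shot (hypothetical)
futures = client.map(model_shot, [solver_future] * len(shot_coords), shot_coords)
wait(futures)
results = client.gather(futures)
```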