
Turbulent Flow Simulation using Autoregressive Conditional Diffusion Models (ACDM)

This repository contains the source code for the paper Turbulent Flow Simulation using Autoregressive Conditional Diffusion Models by Georg Kohl, Liwei Chen, and Nils Thuerey.

Our work targets the prediction of turbulent flow fields from an initial condition using autoregressive conditional diffusion models (ACDMs). Our method relies on the DDPM approach, a class of generative models based on a parameterized Markov chain, which can be trained to learn the conditional distribution of a target variable given a conditioning input. In our case, the target variable is the flow field at the next time step, and the conditioning is the flow field at the current time step, i.e., the simulation trajectory is created via autoregressive unrolling of the model. We showed that ACDMs can accurately and probabilistically predict turbulent flow fields, and that the resulting trajectories align with the statistics of the underlying physics. Furthermore, ACDMs can generalize to flow parameters beyond the training regime and exhibit high temporal rollout stability, without compromising the quality of generated samples.
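To make the unrolling scheme concrete, below is a minimal conceptual sketch of DDPM-based next-step sampling and autoregressive rollout. It is not the repository's actual API: the noise-prediction network eps_model, the conditioning via channel concatenation, and the handling of the beta schedule are illustrative assumptions.

import torch

def sample_next_step(eps_model, cond, betas):
    # one full reverse diffusion chain, conditioned on the current flow state
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)
    x = torch.randn_like(cond)  # start the chain from pure Gaussian noise
    for t in reversed(range(len(betas))):
        eps = eps_model(torch.cat([x, cond], dim=1), t)  # predicted noise
        x = (x - betas[t] / torch.sqrt(1.0 - alpha_bars[t]) * eps) / torch.sqrt(alphas[t])
        if t > 0:  # no noise is added at the final denoising step
            x = x + torch.sqrt(betas[t]) * torch.randn_like(x)
    return x

def rollout(eps_model, x0, betas, num_steps):
    # autoregressive unrolling: each sampled step becomes the next conditioning
    states = [x0]
    for _ in range(num_steps):
        states.append(sample_next_step(eps_model, states[-1], betas))
    return torch.stack(states)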

Further information about this work can also be found on our project website. Feel free to contact us if you have questions or suggestions regarding our work or the source code provided here.


Simple Demonstration

Open In Colab

Using the button above, you can run a simple example notebook in Google Colab that demonstrates ACDM (a Google account is required). Alternatively, it is possible to run the provided acdm-demo.ipynb file locally, by following the installation instructions in the next section and then running the notebook inside the created conda environment.

Installation

In the following, Linux is assumed as the operating system, but the installation on Windows should be similar.

We recommend installing the required Python packages (see requirements.yml) via a conda environment (e.g., using miniconda), but it should also be possible to install them with pip (e.g., via venv for a separate environment).

conda env create -f requirements.yml
conda activate ACDM

In the following, all commands should be run from the root directory of this source code. Running the training or sampling code in the src directory requires generating the data sets as described below.

Basic Usage

Once the data sets are generated, models can be trained using the scripts src/training_*.py, trained models can be sampled with src/sample_models_*.py, and model predictions can be evaluated and visualized with src/plot_*.py. Each script contains various configuration options at the beginning of the file. All files should be run according to the following pattern:

python src/training_*.py
python src/sample_models_*.py
python src/plot_*.py

Directory Structure

The directories of this source code are organized as follows:

  • src/turbpred: the general code base that the training and sampling scripts rely on.

  • src/lsim: the LSiM metric that is used for evaluations.

  • data: data generation scripts; downloaded or generated data sets should end up here as well.

  • runs: trained models that are loaded by the sampling scripts, as well as further checkpoints and log files.

  • results: output from the sampling, evaluation, and plotting scripts. Sampled model predictions are written to this directory as compressed numpy arrays, which are read by the plotting scripts; the resulting plots are written to the same directory.

Training Monitoring with Tensorboard

During training, various values, statistics, and plots are logged to TensorBoard to monitor the training progress. To start TensorBoard, use the following command:

tensorboard --logdir=runs --port=6006

and open http://localhost:6006/ in your browser to inspect the logged data.


Data Generation, Download, and Processing

Downloading our Data

Our simulated data sets will be available for download soon! Until then, all data sets have to be generated locally as described below.

Generation with PhiFlow: Incompressible Wake Flow (Inc)

[Figure: vorticity plot from the Inc data set]

To generate data with the fluid solver PhiFlow, perform the following steps:

  1. Download the PhiFlow source code and follow the installation instructions. We use the PyTorch backend, which should work out of the box with a correct installation of PhiFlow. Our scripts assume the usage of version 2.0.3 at commit abc82af2! Substantially newer versions might not work.
  2. Ensure that the packages numpy, matplotlib, and imageio are installed in the python environment used for PhiFlow.
  3. Add our data generation scripts that handle the solver setup and data export to the PhiFlow installation by copying all files from the data/generation_scripts/PhiFlow directory to the demos directory in your PhiFlow directory.
  4. The copied files contain the PhiFlow scene for the Inc training and test data set (.py files), which can be run in the same way as the other example PhiFlow scene files in the demos directory. The corresponding batch generation scripts (.sh files) simply run the scene multiple times with different parameters to build the full data set (a conceptual sketch of this is shown after this list).
  5. Adjust paths and settings in the python generation file if necessary, and run it or alternatively the batch script to generate the data.
  6. Copy or move the generated data set directory to the data directory of this source code for training. Make sure to follow the data set structure described below.
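For reference, the batch generation conceptually boils down to the following rough Python sketch. The scene file name inc_scene.py and its command line flags are hypothetical placeholders; check the copied .py and .sh files for the actual names and arguments.

import subprocess

# example parameter sweep: run the scene once per Reynolds number
for sim_id, reynolds in enumerate([100, 200, 400, 800]):
    subprocess.run(
        ["python", "demos/inc_scene.py",  # placeholder scene file name
         "--re", str(reynolds), "--simId", str(sim_id)],  # placeholder flags
        check=True,  # abort the sweep if one simulation fails
    )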

Generation with SU2: Transonic Cylinder Flow (Tra)

[Figure: pressure plot from the Tra data set]

To generate data with the fluid solver SU2, perform the following steps:

  1. Follow the general SU2 installation instructions, including the Python modules. We ran the generation on SU2 version 7.3.1 (Blackbird)! Substantially newer versions might not work. Make sure that you have access to an MPI implementation on your system as well, since our generation script runs SU2 via the mpiexec command. Try running the provided SU2 test cases to ensure the installation was successful.
  2. Ensure that the packages numpy, matplotlib, and scipy are installed in the python environment used for SU2.
  3. Add our data generation scripts that handle the solver setup and data export to the SU2 installation by copying all files from the data/generation_scripts/SU2 directory to a new directory in the root of your SU2 installation (for example called SU2_raw). These include the main generation script data_generation.py, the Python helper file convert_data.py that converts the data to compressed numpy arrays, as well as the mesh file grid_quad_2d.su2 for all simulations. Furthermore, the SU2 configuration files for the three consecutive simulations run by the Python script are included: 1. the steady simulation used as initialization (steady.cfg), 2. the unsteady warmup (unsteady_2d_initial.cfg), and 3. the actual simulation for data generation (unsteady_2d_lowDissipation.cfg). A conceptual sketch of this call sequence is shown after this list.
  4. Adjust paths and settings in the python generation file if necessary, and run it with the following command to generate the data (from the root directory of the SU2 installation):
python SU2_raw/data_generation.py [Thread count] [Reynolds number] [List of Mach numbers] [List of corresponding simulation folder IDs] [Restart iteration]

For example, to create three simulations at Mach numbers 0.6, 0.7, and 0.8 with Reynolds number 10000 using 112 threads, run the following command:

python SU2_raw/data_generation.py 112 10000 0.60,0.70,0.80 0,1,2 -1
  5. Copy or move the generated data set directory to the data directory of this source code for training.
  6. Post-process the data set directory structure with the src/convert_SU2_structure.py script (adjust script settings if necessary), which also extracts some information from the auxiliary simulation files. Make sure that the converted data directory follows the data set structure described below.
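Conceptually, data_generation.py chains the three SU2 runs roughly as in the following sketch. Running SU2_CFD under mpiexec is the standard parallel invocation, but the exact call sequence inside the script may differ.

import subprocess

def run_su2(cfg, threads):
    # standard parallel SU2 invocation: mpiexec -n <threads> SU2_CFD <config>
    subprocess.run(["mpiexec", "-n", str(threads), "SU2_CFD", cfg], check=True)

run_su2("SU2_raw/steady.cfg", 112)                      # 1. steady initialization
run_su2("SU2_raw/unsteady_2d_initial.cfg", 112)         # 2. unsteady warmup
run_su2("SU2_raw/unsteady_2d_lowDissipation.cfg", 112)  # 3. data generation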

Download from the Johns Hopkins Turbulence Database: Isotropic Turbulence (Iso)

[Figure: vorticity plot from the Iso data set]

To extract sequences from the Johns Hopkins Turbulence Database, the required steps are:

  1. Install the pyJHTDB package for local usage, and make sure numpy is available (a minimal query example is sketched after this list).
  2. Request an authorization token to ensure access to the full database, and add it to the script data/generation_scripts/JHTDB/get_JHTDB.py.
  3. Adjust the paths and settings in the script file if necessary, and run the script to download and convert the corresponding regions of the DNS data. Alternatively, the script data/generation_scripts/JHTDB/get_JHTDB_scheduler.py can be run instead; it automatically reconnects to the database in case the connection is unstable or otherwise interrupted, and resumes the download.
  4. Copy or move the downloaded data set directory to the data directory of this source code for training if necessary. Make sure to follow the data set structure described below.
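As a minimal illustration of querying the database with pyJHTDB (the query points, time, and data set name are example assumptions; the actual download logic lives in get_JHTDB.py):

import numpy as np
import pyJHTDB

lJHTDB = pyJHTDB.libJHTDB()
lJHTDB.initialize()
lJHTDB.add_token("YOUR-AUTH-TOKEN")  # authorization token requested from JHTDB

# query interpolated velocities at a few example points of the isotropic DNS
points = np.array([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]], dtype=np.float32)
velocity = lJHTDB.getData(0.0, points,
                          data_set="isotropic1024coarse",
                          getFunction="getVelocity")
print(velocity.shape)  # one 3D velocity vector per query point

lJHTDB.finalize()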

Data Set Structure

The data set folder structure resulting from the data generation has to be the following for the data sets to load correctly: data/[datasetName]/sim_[simNr]/[field]_[timestep].npz. Here datasetName is any string, but it has to be adjusted accordingly when creating data set objects. The simulation folder numbers should be integers with a fixed width of six digits that increase continuously. Similarly, the timestep numbering should consist of integers with a fixed width of six digits that increase continuously. Start and end points for both can be configured when creating Dataset objects. Fields should be strings that describe the physical quantity, such as pressure, density, or velocity. Velocity components are typically stored in a single array, apart from the Iso case where the velocity z-component is stored separately as velocityZ. For example, a density snapshot at timestep zero from the Tra data set is referenced as data/128_tra/sim_000000/density_000000.npz.
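As a concrete illustration, the snippet below writes and reads one snapshot in this layout. The field shape and the default arr_0 key used by numpy are assumptions for this sketch; the actual arrays may be shaped or keyed differently.

import os
import numpy as np

path = "data/128_tra/sim_000000"
os.makedirs(path, exist_ok=True)

density = np.zeros((128, 64), dtype=np.float32)  # placeholder field data
np.savez_compressed(os.path.join(path, "density_000000.npz"), density)

snapshot = np.load(os.path.join(path, "density_000000.npz"))
print(snapshot["arr_0"].shape)  # (128, 64)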

General Data Post-Processing

src/copy_data_lowres.py can be used to downsample the generation resolution of 256x128 to the training and evaluation resolution of 128x64 for the simulated data sets. It processes all .npz data files and creates copies of all supplementary files in the input directory. src/compute_data_mean_std.py computes the mean and standard deviation statistics that are used to normalize the data to a standard normal distribution.
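A rough sketch of such a normalization pass (the actual aggregation across fields and simulations in src/compute_data_mean_std.py may differ):

import glob
import numpy as np

# gather all snapshots of one field across all simulations of a data set
files = sorted(glob.glob("data/128_tra/sim_*/density_*.npz"))
fields = np.stack([np.load(f)["arr_0"] for f in files])

mean, std = fields.mean(), fields.std()
normalized = (fields - mean) / std  # zero mean, unit variance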


Citation

If you use the source code or data sets provided here, please consider citing our work:

@article{kohl2023_acdm,
  author = {Georg Kohl and Li{-}Wei Chen and Nils Thuerey},
  title = {Turbulent Flow Simulation using Autoregressive Conditional Diffusion Models},
  journal = {arXiv},
  year = {2023},
  eprint = {2309.01745},
  primaryclass = {cs},
  publisher = {arXiv},
  url = {https://doi.org/10.48550/arXiv.2309.01745},
  doi = {10.48550/arXiv.2309.01745},
  archiveprefix = {arxiv}
}

Acknowledgements

This work was supported by the ERC Consolidator Grant SpaTe (CoG-2019-863850).
