How to Generate Data Products

First, create a directory in configs/

mkdir configs/MW_mass

Next, run the following commands to diagnose any early issues (replacing the /oak/stanford/orgs/kipac/users/ycwang19/ZEUS/MWmass_new/Halo* substring with one that accesses all your halo directories)

ls /oak/stanford/orgs/kipac/users/ycwang19/ZEUS/MWmass_new/Halo*/output/rockstar/trees/ > configs/MW_mass/tree_locations.txt
du -h /oak/stanford/orgs/kipac/users/ycwang19/ZEUS/MWmass_new/Halo*/output/rockstar/trees/ > configs/MW_mass/tree_sizes.txt
head -n 1 /oak/stanford/orgs/kipac/users/ycwang19/ZEUS/MWmass_new/Halo*/output/rockstar/trees/tree_0_0_0.dat > MW_mass/tree_headers.txt

Note that if you have very complicated, non uniform directory structure, you can pipe arguments out of config files and around with awk and xargs like this:

cat config.txt | awk '{print $7}' | xargs ls > tree_locations.txt

This will create three files that you can browse to diagnose problems. Look through tree_locations.txt and see if there are any simulations which don't have tree files yet: these would be jobs that haven't finished running. Look through tree_sizes.txt to get a sense for which simulations are going to be time intensive, and check that none of the tree files are suspisciously small. Look through tree_headers.txt and confirm that all your simulations used the same version of Rockstar/consistent-trees.

The code currently asusmes that your header looks like this:

#scale(0) id(1) desc_scale(2) desc_id(3) num_prog(4) pid(5) upid(6) desc_pid(7) phantom(8) sam_Mvir(9) Mvir(10) Rvir(11) rs(12) vrms(13) mmp?(14) scale_of_last_MM(15) vmax(16) x(17) y(18) z(19) vx(20) vy(21) vz(22) Jx(23) Jy(24) Jz(25) Spin(26) Breadth_first_ID(27) Depth_first_ID(28) Tree_root_ID(29) Orig_halo_ID(30) Snap_idx(31) Next_coprogenitor_depthfirst_ID(32) Last_progenitor_depthfirst_ID(33) Last_mainleaf_depthfirst_ID(34) Tidal_Force(35) Tidal_ID(36) Rs_Klypin Mvir_all M200b M200c M500c M2500c Xoff Voff Spin_Bullock b_to_a c_to_a A[x] A[y] A[z] b_to_a(500c) c_to_a(500c) Ax Ay Az T/|U| M_pe_Behroozi M_pe_Diemer Halfmass_Radius RVmax

If you have any headers that don't look like this, contact me (Phil) and I'll change the code accordingly.

Next, copy the Python script print_config.py into your config folder

cp print_config.py configs/MW_mass

Open the file and edit all the lines which have a comment in front of them to match your simulation specifics. You'll need to be familiar with "%s"-style printf formatting, but might be able to pattern-match off the examples if you aren't. You'll also need to manually list the halo names: I'd recommend having your tree_locations.txt file open in another tab list while you do this. Next, run the Python script and pipe the output to the main config file.

python configs/MW_mass/print_config.py > configs/MW_mass/config.txt

Now you're ready to start running the pipeline. You run a series of commands as

go run <file_name>.go path/to/config.txt

You'll probably want to do test runs in an interactive session to make sure everything is working okay. If so, you cna specify the index of the halo you want to look at as

go run <file_name>.go path/to/config.txt <index>

I'd recommend using the halo with the smallest tree files.

The steps are the following:

write_binary_tree.go - Converts text consistent-trees files into binary depth-first files.
write_tree_header.go - Identifies all main branches and annotates them with tons of useful information.
write_subhalo_files.go - Collects major subhalos of the central halo into a single, easy to access file.
tag_particles.go - Assigns particles to each subhalo and tags them with their infall times and other useful information
xv.go - Extracts x and v for all major subhalos.
phi.go - Computes potentials for all major subhalos.

The first three are responsible for building up files related to merger trees and the last three are reponsible for tracking paritcles over time. The tree scripts don't require any particle access and are vastly cheaper to run, so you may be interested in running them on their own. phi.go is by far the most expensive and most niche, so you may not want to run it even if you're doing particle tracking.

For the particle tracking files, you'll want to check what level your highest-reoslution particles are in your simulations (Symphony has them in Gadget's 1-indexed level, so that's the default).0 If you don't still have the original IC configuraiton files lying around, you can check by opening up the gadget files and seeing which level is the first to have any particles (using, say, github.com/phil-mansfield/read_gadget.) Once you know that, change HRLevel at the top of those files, if needed.

You may also want to change the number of files that particles are split across. By default, this number is 8. But you can increase it if this would lead to particle files that are too large to comfortably hold in RAM at one time.

Making data downloadable

If you haven't already, you'll want to add your suite to list of objects that symlib can analyze. There are enough special cases that this has to be done manually, I'm afraid.

Add an entry to symlib.paramter_table corresponding to your suite. (symlib/lib.py)
Add your simulation to symlib.SUITE_NAMES (symlib/util.py)
Add an entry symlib.DEAFULT_HALO_NAMES (symlib/util.py)
Generate scale_factors.py. To do this, call go run scale_factor_table . This reads in the scale factors of every snapshot of every halo, so you might not want to do this on a login node. Once you've done this for your suite, add a line to symlib/download_tables/generate_scale_factor_table.py s main function corresponding to your suite.

(TODO: add packing and auto-generating instructions)

bhadnott / symphony_pipeline Goto Github PK

symphony_pipeline's Introduction

How to Generate Data Products

Making data downloadable

symphony_pipeline's People

Contributors

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent