marbl-ecosys / cesm2-marbl
Documentation of MARBL in CESM2
I'll fix this in #42
Globally integrated NPP, POC flux, air-sea CO2 flux time series -- full historical and RCPs; use the forcing_iron_flux notebook as a sample. Maybe a single RCP per variable? An alternative would be ensemble mean / spread with all RCPs in the same plot.
Two related questions:
What rows do we want in the table that eventually gets published? (i.e. what variables should we include, and in what order)
What columns do we want in the table that eventually gets published? (i.e. what model(s) / configurations should we include?)
Additionally, I want to make sure that we are using the same time periods for the historical and preindustrial averages as @klindsay28 uses in his paper -- I think we've settled on 1990-2014 for the historical, with CESM1 using 1990-2004 as well as 2005-2014 from RCP8.5. @klindsay28 is going to do a little analysis to determine the best preindustrial slice(s), and then I'll update this notebook.
notebooks/forcing_iron_flux.ipynb hangs up on this command:
dq = cesm2.search(experiment=['historical'], variable='IRON_FLUX').to_xarray(chunks={'time': 48})
I get the following error. It's possible that one ensemble member has a different file; I haven't diagnosed it yet. First we need to diagnose the problem, and we might need a mechanism in intake_esm to work around it.
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-5-67f3a48652c5> in <module>
----> 1 dq = cesm2.search(experiment=['historical'], variable='IRON_FLUX').to_xarray(chunks={'time': 48})
2 _, ds2 = dq.popitem()
3 ds2 = ds2.drop([v for v in ds2.variables if v not in keep_vars])
4 ds2
/glade/work/mclong/miniconda3/envs/cesm2-marbl/lib/python3.7/site-packages/intake_esm/source.py in to_xarray(self, **kwargs)
157 _kwargs.update(kwargs)
158 self.kwargs = _kwargs
--> 159 return self.to_dask()
160
161 def _get_schema(self):
/glade/work/mclong/miniconda3/envs/cesm2-marbl/lib/python3.7/site-packages/intake_xarray/base.py in to_dask(self)
67 def to_dask(self):
68 """Return xarray object where variables are dask arrays"""
---> 69 return self.read_chunked()
70
71 def close(self):
/glade/work/mclong/miniconda3/envs/cesm2-marbl/lib/python3.7/site-packages/intake_xarray/base.py in read_chunked(self)
42 def read_chunked(self):
43 """Return xarray object (which will have chunks)"""
---> 44 self._load_metadata()
45 return self._ds
46
/glade/work/mclong/miniconda3/envs/cesm2-marbl/lib/python3.7/site-packages/intake/source/base.py in _load_metadata(self)
115 """load metadata only if needed"""
116 if self._schema is None:
--> 117 self._schema = self._get_schema()
118 self.datashape = self._schema.datashape
119 self.dtype = self._schema.dtype
/glade/work/mclong/miniconda3/envs/cesm2-marbl/lib/python3.7/site-packages/intake_esm/source.py in _get_schema(self)
166
167 if self._ds is None:
--> 168 self._open_dataset()
169 metadata = {}
170
/glade/work/mclong/miniconda3/envs/cesm2-marbl/lib/python3.7/site-packages/intake_esm/cesm.py in _open_dataset(self)
151 member_column_name='member_id',
152 variable_column_name='variable',
--> 153 file_fullpath_column_name='file_fullpath',
154 )
/glade/work/mclong/miniconda3/envs/cesm2-marbl/lib/python3.7/site-packages/intake_esm/source.py in _open_dataset_groups(self, dataset_fields, member_column_name, variable_column_name, file_fullpath_column_name, file_basename_column_name)
134 dsets,
135 time_coord_name_default=kwargs['time_coord_name'],
--> 136 override_coords=kwargs['override_coords'],
137 )
138 var_dsets.append(var_dset_i)
/glade/work/mclong/miniconda3/envs/cesm2-marbl/lib/python3.7/site-packages/intake_esm/aggregate.py in concat_time_levels(dsets, time_coord_name_default, restore_non_dim_coords, override_coords)
181 objs_to_concat = [first] + rest
182
--> 183 ds = xr.concat(objs_to_concat, dim=time_coord_name, coords='minimal')
184
185 new_history = f"\n{datetime.now()} xarray.concat(<ALL_TIMESTEPS>, dim='{time_coord_name}', coords='minimal')"
/glade/work/mclong/miniconda3/envs/cesm2-marbl/lib/python3.7/site-packages/xarray/core/concat.py in concat(objs, dim, data_vars, coords, compat, positions, fill_value, join)
131 "objects, got %s" % type(first_obj)
132 )
--> 133 return f(objs, dim, data_vars, coords, compat, positions, fill_value, join)
134
135
/glade/work/mclong/miniconda3/envs/cesm2-marbl/lib/python3.7/site-packages/xarray/core/concat.py in _dataset_concat(datasets, dim, data_vars, coords, compat, positions, fill_value, join)
321 raise ValueError(
322 "variables %r are present in some datasets but not others. "
--> 323 % absent_merge_vars
324 )
325
ValueError: variables {'DXU', 'latent_heat_fusion_mks', 'nsurface_u', 'HU', 'rho_sw', 'UAREA', 'radius', 'TAREA', 'HUS', 'vonkar', 'latent_heat_fusion', 'salinity_factor', 'dzw', 'DXT', 'hflux_factor', 'moc_components', 'salt_to_ppt', 'T0_Kelvin', 'HUW', 'sound', 'nsurface_t', 'HTN', 'ANGLE', 'DYU', 'rho_fw', 'salt_to_mmday', 'KMT', 'omega', 'TLAT', 'sea_ice_salinity', 'ULONG', 'KMU', 'DYT', 'momentum_factor', 'stefan_boltzmann', 'HT', 'dz', 'latent_heat_vapor', 'ULAT', 'cp_sw', 'rho_air', 'ocn_ref_salinity', 'transport_regions', 'ppt_to_salt', 'fwflux_factor', 'heat_to_PW', 'cp_air', 'transport_components', 'mass_to_Sv', 'REGION_MASK', 'grav', 'HTE', 'TLONG', 'ANGLET', 'salt_to_Svppt', 'days_in_norm_year', 'sflux_factor'} are present in some datasets but not others.
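One way to narrow down which ensemble member is the odd one out, before touching intake_esm: collect the set of variable names in each file and compare against the majority. The sketch below (helper name and toy file names are hypothetical) shows the logic with plain Python sets; with real data you would populate `var_sets` by opening each file with netCDF4 or xarray.

```python
# Hypothetical diagnostic: find files whose variable set differs from the rest.
from collections import Counter

def find_odd_files(var_sets):
    """var_sets: {path: set of variable names}. Returns the files whose
    variable set differs from the most common one, with the symmetric
    difference showing which variables are missing/extra."""
    signatures = Counter(frozenset(v) for v in var_sets.values())
    common = signatures.most_common(1)[0][0]
    return {path: set(common) ^ set(vs)
            for path, vs in var_sets.items()
            if frozenset(vs) != common}

# Toy example standing in for per-member history files:
var_sets = {
    'member1.nc': {'IRON_FLUX', 'TLAT', 'TLONG', 'TAREA'},
    'member2.nc': {'IRON_FLUX', 'TLAT', 'TLONG', 'TAREA'},
    'member3.nc': {'IRON_FLUX', 'TLAT', 'TLONG'},  # missing TAREA
}
print(find_odd_files(var_sets))  # -> {'member3.nc': {'TAREA'}}
```

If one member's files lack the grid/scalar variables listed in the error, dropping those variables (or that member) before concatenation should get past the `xr.concat` failure.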
I added code to plot surface nutrients for the paper, but for some reason, the PO4 map from the model is messed up (panel D).
This code (no projection)
cf = plt.contourf(ds_surf_plot.TLONG, ds_surf_plot.TLAT, ds_surf_plot['PO4'], **contour_spec['PO4'])
plt.colorbar(cf)
yields this (the right answer):
However, adding the projection seems to screw things up:
ax = plt.axes(projection=prj)
cf = ax.contourf(ds_surf_plot.TLONG, ds_surf_plot.TLAT, ds_surf_plot['PO4'],
**contour_spec['PO4'],
transform=ccrs.PlateCarree())
plt.colorbar(cf)
The other nutrients seem to plot fine; something about the PO4 field is screwy...but the obs and bias plots seem ok. I am at a loss.
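This is a guess, not a confirmed diagnosis: one common culprit with contourf plus a cartopy transform on a curvilinear grid is non-finite or fill values in the field, which the unprojected plot tolerates but the projected one does not. A quick check is to mask them explicitly before plotting (helper name and fill value below are illustrative).

```python
import numpy as np

# Mask NaNs (and an optional fill value) before handing the field to contourf.
def mask_field(field, fill_value=None):
    """Return a masked array with non-finite values (and fill_value) masked."""
    masked = np.ma.masked_invalid(field)
    if fill_value is not None:
        masked = np.ma.masked_values(masked, fill_value)
    return masked

field = np.array([[0.5, np.nan], [1e36, 1.2]])
masked = mask_field(field, fill_value=1e36)
print(masked.mask.sum())  # -> 2 masked cells
```

Then plot with `ax.contourf(ds_surf_plot.TLONG, ds_surf_plot.TLAT, masked, transform=ccrs.PlateCarree(), **contour_spec['PO4'])` and see whether panel D recovers.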
From zulip, @matt-long pointed out
I would have expected greater coverage at high northern latitudes in JJA. At some point, would be good to confirm that we're not missing something there.
Two panel plot: (a) N fixation (b) column integral of denitrification. Include global integrals in title or caption.
I would like to include zonal section plots of ABIO_DIC14, comparable to the nutrient sections. I think the best observational dataset is GLODAPv1.
These data are available here,
https://www.nodc.noaa.gov/ocads/oceans/glodap/GlopDV.html
though my recollection is there used to be netCDF files available...but those seem to have disappeared.
Fortunately, the data are available on disk at /glade/p/cesm/bgcwg/GLODAP
@klindsay28 pointed out there's a discrepancy in how NPP is computed for CESM1 and CESM2 data. CESM1 contains 3D fields for productivity by each autotroph, but those fields are only 150m deep. CESM2 contains pre-computed integrals for 100m and full-depth.
Right now I'm computing the 150m integral of the CESM1 data and reading in the full-depth integral from CESM2. Keith's recommendation would be to compute the 100m integral from CESM1 data and read in the 100m integral from CESM2 as that's the only available apples-to-apples comparison.
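For the CESM1 side of that comparison, the 100 m integral can be built from the 3D productivity fields by weighting each layer by its thickness and truncating the layer that straddles 100 m. A minimal numpy sketch, with hypothetical names and POP's cm depth convention assumed:

```python
import numpy as np

def integrate_upper(field, dz_cm, depth_limit_cm=100e2):
    """Vertically integrate field (nz, ...) over the top depth_limit_cm,
    using layer thicknesses dz_cm (cm) and partial-layer truncation."""
    z_bot = np.cumsum(dz_cm)          # bottom depth of each layer
    z_top = z_bot - dz_cm             # top depth of each layer
    # thickness of each layer lying above the depth limit
    w = np.clip(depth_limit_cm - z_top, 0.0, dz_cm)
    return np.tensordot(w, field, axes=(0, 0))

# toy column: four 30 m (3000 cm) layers, uniform rate of 1 per cm
field = np.ones((4, 2, 2))
dz = np.full(4, 3000.0)
print(integrate_upper(field, dz)[0, 0])  # -> 10000.0 (top 100 m only)
```

The same helper applied with the real `dz` from the CESM1 grid would give the 100 m integral to match CESM2's precomputed diagnostic.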
I'll update this in #42 since we want the updated data anyway
Want to download the data, interpolate to the gx1v7 grid, and then generate an intake-esm catalog for it. (Interpolation should result in something similar to /glade/work/mclong/woa2013v2/POP_gx1v7, perhaps with a data structure more akin to /glade/p/cesm/bgcwg/obgc_diag/data/obs_data/WOA2005?)
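The catalog half of that is mostly bookkeeping: an intake-esm datastore is a CSV of file paths plus attribute columns, with a JSON description pointing at it. A minimal sketch with pandas; the column names and paths below are illustrative, not the project's actual schema:

```python
import io
import pandas as pd

# Hypothetical catalog rows for regridded GLODAP files.
records = [
    {'experiment': 'obs', 'variable': 'PO4',
     'path': '/glade/scratch/me/glodap_gx1v7/PO4.nc'},
    {'experiment': 'obs', 'variable': 'NO3',
     'path': '/glade/scratch/me/glodap_gx1v7/NO3.nc'},
]
df = pd.DataFrame(records)
buf = io.StringIO()            # in practice, write a .csv next to the JSON
df.to_csv(buf, index=False)
print(list(df.columns))  # -> ['experiment', 'variable', 'path']
```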
We don't want to include oxygen results in the table because they cannot be trusted.
I am thinking about my OMWG talk...so this list is coming out of me right now. I think my refactor of the nutrient plotting tries to move towards some of these principles.
pop_tools.get_grid(...) reads CESM INPUTDATA files via web protocol and returns xarray datasets
xarray.Dataset objects encapsulate data and metadata
To be continued....
pop-tools provides a mechanism to divvy the Southern Ocean among the Atlantic, Pacific, and Indian Oceans so that zonal means for those three extend beyond 40°S. We want that in the nutrient plots (and @kristenkrumhardt would like that feature for her work as well).
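For reference, the underlying idea can be sketched without pop-tools: assign unclassified Southern Ocean points to a basin by longitude band. This is illustrative only, not the pop-tools API, and the band edges below are rough conventions.

```python
import numpy as np

ATL, PAC, IND = 1, 2, 3  # hypothetical basin mask values

def extend_basins(basin, tlong, tlat, so_value=0):
    """Reassign Southern Ocean points (mask == so_value, south of ~34S)
    to Atlantic/Pacific/Indian by longitude band."""
    out = basin.copy()
    so = (basin == so_value) & (tlat < -34.0)
    lon = np.mod(tlong, 360.0)
    out[so & ((lon >= 290) | (lon < 20))] = ATL   # ~70W to 20E
    out[so & (lon >= 20) & (lon < 147)] = IND     # ~20E to 147E
    out[so & (lon >= 147) & (lon < 290)] = PAC    # ~147E to 70W
    return out

tlat = np.array([-50.0, -50.0, -50.0])
tlong = np.array([0.0, 90.0, 200.0])
basin = np.zeros(3, dtype=int)
print(extend_basins(basin, tlong, tlat))  # -> [1 3 2]
```

With the extended mask, the usual masked zonal-mean machinery then produces basin means that reach past 40°S.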
Is there a reason for using an older version of xESMF (v0.1.1) instead of the more recent Pangeo-led releases (v0.5+)?
I would like to include a table of model parameters in the paper.
In the Latex document we have a table:
\begin{table}[p!]
\caption{Model parameters (from pandas).}
\centering
\begin{tabular}{lrll}
\toprule
Parameter & Value & Units & Description \\
\midrule
$\mu_{\mathrm{ref},\mathrm{sp}}$ & 5.0 & $\mathrm{d^{-1}}$ & Maximum C-spec growth rate at Tref \\
$\mu_{\mathrm{ref},\mathrm{diat}}$ & 5.0 & $\mathrm{d^{-1}}$ & Maximum C-spec growth rate at Tref \\
$\mu_{\mathrm{ref},\mathrm{diaz}}$ & 2.5 & $\mathrm{d^{-1}}$ & Maximum C-spec growth rate at Tref \\
\bottomrule
\end{tabular}
\end{table}
The "from pandas" leads me to believe this was auto generated. Can we revive this workflow and expand to include other parameters?
2090s for CESM2 is an SSP run, not RCP8.5 (slightly different forcings). I'll fix this in #42
ann_avg_utils has an unnecessary input parameter: experiment_longnames.
Also, the user pulls things like units out of the module, then puts them back into the model via this API. I would prefer that the routine return a dictionary of xarray datasets with the units conversion already applied. I am applying the unit conversion post facto:
for var in variables:
    for exp in cesm2_exp:
        ds = dsets[var][exp]
        var_units = ds[var].attrs['units']
        var_data_new_units = (
            ds[var].data * pint_units[var_units]
        ).to(final_units[var]).magnitude
        ds[var].values = var_data_new_units
        new_units = final_units[var]
        if new_units in pretty_units:
            new_units = pretty_units[new_units]
        ds[var].attrs['units'] = new_units
This routine doesn't seem to actually be computing timeseries, but rather constructing cache file names. I would prefer that the computation happens, if necessary, here.
Must we do any concatenation inside this function? It seems like mission creep.
We should make
Which chlorophyll product?
We could use a http://wiki-measures.eri.ucsb.edu/GSM product or SeaWiFS.
I should have submitted a PR for the changes I pushed today.
I split nutrient data.ipynb into
_data-nutrient-plots.ipynb: produces zarr archives of map, zonal-mean, and global profile datasets
nutrient-plots.ipynb: makes map, zonal-mean, and global profile plots
The plots are nearly there, but I would like to refine the contour levels for the surface maps. For the zonal sections, I used the levels from Sarmiento & Gruber (2006), so I think we're good there.
I think we should aim for a number of levels that yields clear delineations on the colorbar.
I would also like to add contour lines to all color plots. Labels preferably included. Might need a stride > 1 for color levels.
Using a 200-year mean from the piControl runs, it would be useful to also get the standard deviation among the 20 decade-long chunks.
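That statistic is simple to compute once the annual series is in hand; a sketch (function name hypothetical, assuming a length-200 array of annual means):

```python
import numpy as np

def decade_stats(annual_series):
    """From a 200-year annual-mean series, return the overall mean and the
    sample standard deviation among the 20 decade-long chunk means."""
    decades = np.asarray(annual_series).reshape(20, 10).mean(axis=1)
    return decades.mean(), decades.std(ddof=1)

# deterministic toy series: a linear ramp, so the numbers are easy to check
series = np.arange(200.0)
mean, spread = decade_stats(series)
print(mean)  # -> 99.5
```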
Sign wrong in https://github.com/marbl-ecosys/cesm2-marbl/blob/master/notebooks/ann_avg_utils.py#L386 -- should be adding river flux, not subtracting it.
I'll fix in #42 since we want this table to use latest SSP data
There is a ton of code that I simply copied from one notebook to another (or copied and then made a minor tweak to). Once all the notebooks are up and running, I'd like to go back and see what should be pulled into a .py file or two.
It took ~6 minutes to re-run the computation-intensive cell in forcing_iron_flux_notebook.ipynb, and it would be nice if the results of that were dumped to netCDF.
Compute biomass-weighted-mean limitation terms in the upper ocean (i.e., top 150 m).
Make 3 panel plot: maps of most limiting nutrient for each phytoplankton taxa (diatom, small phyto, diazotrophs).
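The biomass-weighted mean in the first step can be sketched as follows (variable names are hypothetical; POP's cm depth convention is assumed): weight the limitation term by biomass times layer thickness over the top 150 m.

```python
import numpy as np

def biomass_weighted_mean(lim, biomass, dz, z_bot, depth_limit=150e2):
    """lim, biomass: (nz, ny, nx); dz, z_bot: (nz,) in cm.
    Biomass-weighted mean of lim over layers whose bottom is above
    depth_limit."""
    in_range = (z_bot <= depth_limit)[:, None, None]
    w = biomass * dz[:, None, None] * in_range
    return (lim * w).sum(axis=0) / w.sum(axis=0)

# toy column: two 50 m layers over a 1x1 horizontal grid
lim = np.array([[[0.2]], [[0.8]]])
biomass = np.array([[[3.0]], [[1.0]]])
dz = np.array([5000.0, 5000.0])
z_bot = np.cumsum(dz)
print(biomass_weighted_mean(lim, biomass, dz, z_bot)[0, 0])  # -> 0.35
```

The most-limiting-nutrient map in the second step is then just an argmin across the per-nutrient weighted means for each taxon.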
Something in utils.py breaks watermark. I get the following error from the watermark magic.
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-173-35b364ad726b> in <module>
12
13 get_ipython().run_line_magic('load_ext', 'watermark')
---> 14 get_ipython().run_line_magic('watermark', '-d -iv -m -g -h')
/glade/work/mclong/miniconda3/envs/cesm2-marbl/lib/python3.7/site-packages/IPython/core/interactiveshell.py in run_line_magic(self, magic_name, line, _stack_depth)
2305 kwargs['local_ns'] = sys._getframe(stack_depth).f_locals
2306 with self.builtin_trap:
-> 2307 result = fn(*args, **kwargs)
2308 return result
2309
</glade/work/mclong/miniconda3/envs/cesm2-marbl/lib/python3.7/site-packages/decorator.py:decorator-gen-126> in watermark(self, line)
/glade/work/mclong/miniconda3/envs/cesm2-marbl/lib/python3.7/site-packages/IPython/core/magic.py in <lambda>(f, *a, **k)
185 # but it's overkill for just that one bit of state.
186 def magic_deco(arg):
--> 187 call = lambda f, *a, **k: f(*a, **k)
188
189 if callable(arg):
/glade/work/mclong/miniconda3/envs/cesm2-marbl/lib/python3.7/site-packages/watermark/watermark.py in watermark(self, line)
134 self._get_git_branch(bool(args.machine))
135 if args.iversions:
--> 136 self._print_all_import_versions(self.shell.user_ns)
137 if args.watermark:
138 if self.out:
/glade/work/mclong/miniconda3/envs/cesm2-marbl/lib/python3.7/site-packages/watermark/watermark.py in _print_all_import_versions(vars)
248 longest = max([len(i[0]) for i in to_print] + [0]) + 1
249 for entry in to_print:
--> 250 print(('%s' % entry[0]).ljust(longest) + entry[1])
251
252
TypeError: can only concatenate str (not "NoneType") to str
We should have the option to apply a drift correction computed from the control integration.
This is made more complicated because the historical runs have different branch years.
catalog = intake.open_esm_datastore('data/campaign-cesm2-cmip6-timeseries.json', sep=':')
catalog.search(experiment='historical').df.ctrl_branch_year.unique()
array([601, 631, 661, 501, 691, 721, 751, 781, 811, 841, 871])
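One possible correction (an assumed approach, not an agreed-upon method for this repo): fit a linear trend to the control series and remove it from each historical member, using that member's ctrl_branch_year to line the member up with the right segment of the control.

```python
import numpy as np

def drift_correct(member, member_years, ctrl, ctrl_years, branch_year,
                  hist_start_year=1850):
    """Subtract the control's linear trend, anchored at the branch year."""
    slope, _ = np.polyfit(ctrl_years, ctrl, 1)
    # control-model year corresponding to each historical year
    ctrl_equiv = branch_year + (np.asarray(member_years) - hist_start_year)
    return np.asarray(member) - slope * (ctrl_equiv - branch_year)

# toy check: control drifts at 0.01/yr; a member carrying the same drift
# should come back flat after correction
ctrl_years = np.arange(500, 900)
ctrl = 2.0 + 0.01 * ctrl_years
member_years = np.arange(1850, 1860)
member = 10.0 + 0.01 * (member_years - 1850)
corrected = drift_correct(member, member_years, ctrl, ctrl_years, 601)
print(np.allclose(corrected, 10.0))  # -> True
```

A refinement would be to fit the trend only over each member's branch window rather than the whole control, since the drift need not be linear over the full integration.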
We need to cache intermediate results: e.g., in the forcing_iron_flux notebook, it would be much faster to dump the annual means (and the global annual means) to disk.
Should there be a tool that precomputes annual means and climatologies and creates an intake
catalog? Then the analysis notebooks wouldn't need to repeat the esmlab
calls every time they are run.
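The core of that precompute step is cheap to express; a sketch of the monthly-to-annual reduction (a no-leap calendar is assumed here, and esmlab handles the time weighting more generally):

```python
import numpy as np

DAYS = np.array([31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31], float)

def annual_means(monthly):
    """monthly: (12 * nyears, ...) array of monthly means -> (nyears, ...),
    weighting each month by its length."""
    nyears = monthly.shape[0] // 12
    m = monthly[:12 * nyears].reshape(nyears, 12, *monthly.shape[1:])
    return np.tensordot(m, DAYS / DAYS.sum(), axes=(1, 0))

monthly = np.ones((24, 3, 2))  # two years of a toy 3x2 field
print(annual_means(monthly).shape)  # -> (2, 3, 2)
```

Writing these reduced arrays to netCDF/zarr once, with an intake catalog over the outputs, would let every analysis notebook skip the expensive pass over the raw monthly history files.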
attn: @matt-long (this issue cropped up in a conversation I had with @klindsay28)
What time period should we use when computing global averages from historical runs for our table? There are several factors at play, some of which conflict with others:
@klindsay28 thinks there might be guidance from the IPCC on this (e.g. a rule of thumb like "use 1990-2014, and for CMIP5 runs the 2005-2014 data should come from RCP")