
epsproc's People

Contributors

dependabot[bot], phockett, stevenjoezhang

epsproc's Issues

Symmetry and matrix element handling

For handling matrix elements, esp. for fitting from experimental data, need some proper symmetrization routines. Started this previously, see esp. circa 2016 N2 AF fitting work, code in symm_coeffs_ePS.m on code distro https://figshare.com/articles/Bootstrapping_to_the_Molecular_Frame_with_Time-domain_Photoionization_Interferometry/4480349

To investigate:

Xarray attrs (attributes) nested case propagation

Noticed that xr.copy() isn't deep-copying nested dicts (note xr.copy(deep=True) is default, see https://docs.xarray.dev/en/stable/generated/xarray.DataArray.copy.html).

Fixed for sphRealConvert() in e2ad5eb, but may need to propagate elsewhere.

E.g.

    import copy

    dataCalc = dataIn.copy()   # Copies the base attrs dict, but nested dicts are still shared.
    dataCalc.attrs = copy.deepcopy(dataIn.attrs)  # Deep-copies attrs, so this also works for the nested dict case.

TODO: may have also fixed elsewhere? See PEMtk and IO codes for more .attrs handling methods.

TODO: check Xarray versions, tested in 2022.3.0 which is not latest.

See also pydata/xarray#2835

UPDATE: this is an issue in 2022.3.0, but not in 2022.6.0.
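
The workaround above can be wrapped in a small helper so it is applied consistently. This is a minimal sketch, assuming only that the object exposes `.copy()` and a dict-like `.attrs` (as Xarray objects do). The names `copy_with_attrs` and `FakeArray` are hypothetical; `FakeArray` is a stand-in that deliberately mimics the shallow attrs copy described above, so the example runs without Xarray installed.

```python
import copy


def copy_with_attrs(obj):
    """Copy obj, then replace its attrs with an independent deep copy."""
    new = obj.copy()
    new.attrs = copy.deepcopy(obj.attrs)
    return new


class FakeArray:
    """Hypothetical stand-in: copy() only shallow-copies the attrs dict."""

    def __init__(self, values, attrs=None):
        self.values = list(values)
        self.attrs = dict(attrs or {})  # top-level copy only

    def copy(self):
        # Nested dicts inside attrs are still shared with the original.
        return FakeArray(self.values, self.attrs)


data_in = FakeArray([1.0, 2.0], {"jobInfo": {"mol": "N2"}})
plain = data_in.copy()
safe = copy_with_attrs(data_in)

# Editing a nested entry on the plain copy also changes data_in,
# while the deep-copied version is unaffected.
plain.attrs["jobInfo"]["mol"] = "changed"
```

With a real DataArray the same helper should be harmless on versions where the nested copy already works, since it only adds a redundant deep copy of the attrs.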

Data structures: Xarray datasets and class-based methods

To consider, esp. for geomFunc calcs, which are getting a bit overloaded.

Can use Xarray attrs for variables, but these may be dropped during some calcs. What about datasets to consolidate matrix elements and associated properties?

Could also consider a full class to consolidate data + methods.
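
As a rough illustration of the class option, the sketch below keeps metadata alongside the data and carries it through each calculation, so it cannot be dropped the way attrs sometimes are. Everything here is hypothetical (the `MatEleRecord` name, its fields, and the `apply` method are not part of ePSproc), and plain lists stand in for Xarray objects.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class MatEleRecord:
    """Hypothetical container: data plus the metadata that must travel with it."""

    data: Any
    job_info: Dict[str, Any] = field(default_factory=dict)
    dim_labels: Dict[str, str] = field(default_factory=dict)

    def apply(self, func, **kwargs):
        """Run func on the data and return a new record with metadata copied over."""
        return MatEleRecord(
            func(self.data, **kwargs),
            dict(self.job_info),
            dict(self.dim_labels),
        )


rec = MatEleRecord([1.0, -2.0, 3.0], job_info={"mol": "N2"})
scaled = rec.apply(lambda d, factor: [factor * x for x in d], factor=2.0)
```

An Xarray Dataset could play a similar role for grouping matrix elements with associated properties; the class route mainly adds a place to hang methods.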

Plotting routines

To do:

  • Better line plots - Holoviews with Bokeh back end? See tests in geometric_method_dev_low-level_E-fields_200320.ipynb, plus XeF2 test notebooks & old data processing notebooks.
  • Consolidate improved plotting routines for all Xarray objects.
  • Update modified Seaborn routines used in lmPlot(), this currently needs Seaborn==0.9.0

Surface and volumetric plotting

Currently have some basics in place, and tried a few methods, for surface plots (see docs, also test local version of notebook).

Data types:

  • PADs, single surfaces or stacked.
  • Wavefns, volumetric cart or sph coords.
  • VMI type data, volumetric 3D cart plus projections.

To try/implement:

  • pyVista (new, looks best - VTK/ITK on backend, has Jupyter support)
  • yt (Briefly tested for wavefn plots)
  • VTK
  • Paraview/paraview glance (Some testing of Paraview desktop performed, but not yet tested via API/python).

To revisit/improve:

  • Plotly - PAD surface plots, work in Notebook, but not in HTML export currently.
  • Holoviews - for plots with widgets? Volumetric plotting support?
  • Mayavi - surface and volumetric plots. (Briefly tried for wavefn plots.)

Molecule data handling

Things to add...

  • Parse molecule data from ePS file.
  • Display and plot.
  • More sophisticated data handling... with comp chem libraries?
  • Read electronic structure file & plot.

Spherical functions

Currently implemented using Moble's spherical_functions or Scipy & Sympy.

  • Should make these optional, since they're not required for basic post-processing.
  • Switch from spherical_functions (now deprecated) to the spherical package, which supersedes it (see https://github.com/moble/spherical).
  • Fix Numba-based import routines too, specifically in geomFunc.w3jVecMethods, currently throwing errors in dev env after Numba update (to 0.53.1, although likely just local env inconsistencies - this is for epsdev on bemo, although pre-update version epsdev-030821 is OK).

Code structure, things to tidy up

  • move core functions to subpackages (started March 2020 with geometric funcs, currently on dev branch only)
  • rationalise/move/tidy some general parts, e.g. blmXarray() appears in a couple of places.
  • implement classes for some data-structures? (See issue 25.)
  • (April 2023) consistency in data array attrs - have gradually been adding to class routines, but may not be totally consistent. See also PD conversion routines, epsproc.classes._IO.matEtoPD(), which may be missing in some cases too (or use inconsistent settings).

GeomFunc todo & tidy-up

Basics now working. Some outstanding todo items:

  • Verify further AF test cases, may have phase issue with x/y pol geoms? (POSSIBLE BUG) (27/07/21 fixed in 5b5fbcf)
  • #38
  • Fix cross-sections (XC) names once testing complete.
  • E-field methods to add. (Test also p!=0 cases.)
  • Handling of sel/sum dims, for latter should have option to add dims here rather than always list/pass all (for class wrapper, started to add this with global class settings, still in progress).
  • Sph fns. (Use existing, try SHtools, consolidate.) Note current issue with BLM dim labels (L,M or l,m in old fns.)
  • Time-dependent calcs & plotting.
  • Speed/parallelism.
  • Xarray version issues... see notes below. (fixed April 2022 in #51)

Method development: see https://trello.com/c/7czxUutK/16-theory-development

QNs and dim labelling for Xarrays

Currently listed in util.py, but also hard-coded in some functions.

Adding functionality is creating conflicts with dim names, so need to consolidate this and debug.

Propagation of setPolGeoms() labels is also patchy, e.g. missing from wDcalc().

Wavefunction class and plotting

Basics now in place, but a few things still to address...

  • More robust handling of plotting options.
  • Implement more PyVista methods.
  • Better logic for selecting allowed plot types/methods for various PyVista plotter options.

... and various TODO items in the source.

See 4f53ecb for more.

Fix Sphinx-RTD build chain

Failing since last week (commit 084227c) for reasons unknown, but seems to be something to do with maths formatting in geomFunc, even though this was working in the prior commit. The build fails at the TeX stage.

Edit: actually HTML does seem to be updating, but build is failing at pdflatex stage (even without PDF output set, strangely).

(That makes 100 errors; please try again.) ! ==> Fatal error occurred, no output PDF file produced!

MFPADs with polarization calculations

Notes

Issues

  • In MF-BLM testing, low cross-sections may in some cases result in unphysical β_{L,M} results. These currently require manual checking of the cross-sections, but should be automatically flagged/filtered in future.
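
One possible form for the automatic flagging is a relative threshold on the cross-section. This is a sketch only: the function name `flag_low_xs`, the default `rel_thres` value, and the numbers in the example are all illustrative assumptions, not values from ePSproc or from any calculation.

```python
import math


def flag_low_xs(xs, betas, rel_thres=1e-3):
    """Mask betas (as NaN) wherever the cross-section is below rel_thres * max(xs).

    Returns the masked betas and a list of boolean flags (True = masked).
    """
    cutoff = rel_thres * max(xs)
    flags = [x < cutoff for x in xs]
    masked = [math.nan if f else b for f, b in zip(flags, betas)]
    return masked, flags


xs = [2.5, 1.0e-6, 0.8]      # illustrative cross-sections (arbitrary units)
betas = [0.4, 37.2, -0.1]    # illustrative betas; the second is unphysical
masked, flags = flag_low_xs(xs, betas)
```

Returning the flags as well as the masked values keeps the filtering visible to the user, rather than silently dropping points.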

Main IO routines

Possibly ongoing issues with checkDims() and subselectDims() and Xarray assumptions/versions causing main IO routines to break.

  • Just fixed bug in 5cfb04f, arose due to changes in refDims (see code note: "# 21/07/22 - actually this breaks main file IO for singleton items to xr.sel() for MultiIndex cases (see subselectDims() below).").
    • This may, however, break new codes! May need to change to an optional flag, or additional output.
    • UPDATE: now reinstated and fixed with .copy() in dc91f32.
  • General rethink/tidy-up of old IO routines is worthwhile at some point too, esp. Xarray stacking routines should be improved.

ePSdata interface

Basic interface now in place, see https://epsproc.readthedocs.io/en/dev/demos/ePSdata_download_demo_300720.html

To do:

  • Hash checking for downloads.
  • Better archive handling to allow for just ePS .out file (or other specified file) extraction.
  • Consider general data handling and class structure - should be able to implement this for general ePSproc case (see #25).
  • Separate file-parsing functions to allow for parsing without unzip (currently embedded in unzip functionality).
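
For the hash-checking item, a minimal sketch using only the standard library is given below. It assumes a reference checksum is available from the data source; the choice of SHA-256 and the function names `sha256_of_file` and `verify_download` are assumptions for illustration, not part of the existing interface.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_hex):
    """True if the file's digest matches the expected (hex) checksum."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

If the data source publishes a different digest type, `hashlib.new(name)` could be swapped in for `hashlib.sha256()`.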

Segment types

Just added EDCS segment support to python code (dev branch).

To do:

  • List supported data types (util function?).
  • Add other data types (DCS).
  • Tidy & generalise IO functions based on type selection.

Xarray IO code and stacking

In XR v2022.6 some base IO code is broken, specifically issue(s) with restacking routines (raw data to XR) and/or dropping selected dims?

To do:

  • Retest with XR v2022.3 (OK)
  • Test with XR v2022.9 (FAILS)
  • Test routines more carefully.
  • Rewrite some IO code? Needs a tidy-up in any case. (See also recently added R-matrix IO for possibly better methods.)
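
Until the IO code is reworked, a simple version gate could warn users at import time. The sketch below records only the statuses reported in this issue (2022.3 OK; 2022.6 and 2022.9 failing) and treats everything else as untested. The names `parse_version`, `REPORTED_STATUS` and `io_status` are hypothetical.

```python
import re

# Statuses as reported in this issue only; other versions are not assessed.
REPORTED_STATUS = {
    (2022, 3): "ok",
    (2022, 6): "broken",
    (2022, 9): "broken",
}


def parse_version(text):
    """Extract up to three leading integers, e.g. '2022.6.0' -> (2022, 6, 0)."""
    return tuple(int(p) for p in re.findall(r"\d+", text)[:3])


def io_status(version_text):
    """Look up the reported IO status for a (year, month) Xarray version."""
    return REPORTED_STATUS.get(parse_version(version_text)[:2], "untested")
```

In use this would be called as `io_status(xr.__version__)`, with a warning issued for anything other than "ok".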

Data formats consistency

(March 2020) adding geomFunc sub-module. This has handling for various output datatypes, which should be consolidated and implemented in older functions (e.g. sphCalc() functions).

BLM calculations (python)

Things to do:

  • Normalisation
  • Vectorization/parallelization
  • Verification
  • Saving of Xarrays
  • Tensor formalism
  • Matrix formalism
  • AF-BLM calculations
  • ADM (and other, esp. external values) normalisation options/routines.

lmPlot overloading labels

For default setting, labels all items on x-axis... which is potentially problematic for new high-res jobs.
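
One simple option is to thin the labels while leaving the tick positions alone. A minimal sketch, with the hypothetical helper name `thin_labels` and an arbitrary default of 20 labels:

```python
def thin_labels(labels, max_labels=20):
    """Blank out labels so that at most max_labels remain, evenly spaced.

    The list length is unchanged, so tick positions stay aligned with the data.
    """
    step = max(1, -(-len(labels) // max_labels))  # ceiling division
    return [lab if i % step == 0 else "" for i, lab in enumerate(labels)]


labels = [str(i) for i in range(100)]
thinned = thin_labels(labels, max_labels=10)
```

The thinned list could then be passed to Matplotlib via `ax.set_xticklabels(thinned)`; short label lists pass through unchanged.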

Global plotting style

Currently using Matplotlib defaults mainly, but setting Seaborn styling for lmPlot(). These are reset at end of routine, although some settings are sticky.

Should decide on this and set globally at init.

XR selection routines

Some selection routines broken in XR 2022.3.0 (and presumably more recent versions...?).

Issue seems to be with slicing on float indexes, produces key or type errors, but used to work.

For example, slicing ADMs by t-index:

    # And for the ADMs...
    # SLICE version - was working, but not working July 2022, not sure if it's data types or Xarray version issue? Just get KeyErrors on slice.
    # data.selOpts['ADM'] = {}   #{'thres': 0.01, 'inds': {'Type':'L', 'Eke':1.1}}
    # data.setSubset(dataKey = 'ADM', dataType = 'ADM', sliceParams = {'t':[38, 44, 4]}) 

    #********** HACKS/DEBUG
    # Inds/mask version - seems more robust?
    # trange=[38, 44]  # Set range in ps for calc
    # tStep=4  # Set tStep for downsampling
    # tMask = (data.data['ADM']['ADM'].t>trange[0]) & (data.data['ADM']['ADM'].t<trange[1])
    # data.data[data.subKey]['ADM'] = data.data['ADM']['ADM'][:,tMask][:,::tStep]  # Set and update
    # print(f"ADMs: Selecting {data.data['subset']['ADM'].t.size} points from {data.data['ADM']['ADM'].t.size}")
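
The mask-based approach in the commented code above can be written as a standalone function. The sketch below uses plain lists so it runs without Xarray; the function name `select_range` and the synthetic t and ADM values are assumptions for illustration. It mirrors the same logic: a strict range test on t, followed by downsampling with a step.

```python
def select_range(t, values, t_range, step=1):
    """Keep points with t_range[0] < t < t_range[1], then every step-th one."""
    picked = [(ti, vi) for ti, vi in zip(t, values) if t_range[0] < ti < t_range[1]]
    picked = picked[::step]
    return [p[0] for p in picked], [p[1] for p in picked]


t = [37.0 + 0.5 * i for i in range(20)]       # synthetic time axis, 37.0 to 46.5 ps
adm = [round(0.1 * i, 1) for i in range(20)]  # synthetic ADM values
t_sub, adm_sub = select_range(t, adm, t_range=(38, 44), step=4)
```

Comparisons on the coordinate values avoid label-based slicing, which is the step reported above as raising key or type errors.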

Robust handling of missing data

Usually due to issues with E points missing in file output (essentially due to text buffer overflow). Needs to be handled in IO functions elegantly - currently just crashes IO.
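
One way to avoid the crash is to align whatever was parsed onto the expected energy grid and fill gaps with NaN, reporting the missing points. This is a sketch under assumptions: the function name `fill_missing_energies`, the dict-of-energies input format, and the tolerance are all hypothetical and do not reflect the current IO code.

```python
import math


def fill_missing_energies(expected_E, parsed, tol=1e-6):
    """Align parsed {E: value} records to the expected grid.

    Returns (filled, missing): values in grid order with NaN for absent
    energies, and the list of energies that were not found.
    """
    filled, missing = [], []
    for E in expected_E:
        match = next((v for Ep, v in parsed.items() if abs(Ep - E) <= tol), None)
        if match is None:
            missing.append(E)
            filled.append(math.nan)
        else:
            filled.append(match)
    return filled, missing


expected = [1.0, 2.0, 3.0]
parsed = {1.0: 0.5, 3.0: 0.7}   # E = 2.0 missing from the (hypothetical) file output
filled, missing = fill_missing_energies(expected, parsed)
```

The `missing` list gives something concrete to log or warn about, while the NaN-padded output keeps array shapes consistent downstream.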

NO2 ePS demo file issues (Matlab)

Some OS/Matlab versions are producing errors at file read for the supplied NO2 demo file (no2_demo_ePS.out). This seems to be an issue with the file encoding leading to cascading read errors, and the exact cause remains to be determined.

Compatibility with recent ePS output files (https://osf.io/psjxt/) seems unaffected.

Verify/fix degenerate state handling.

Handling of degenerate states (it variable) needs some general thought.

Currently:

  • Select or keep it dim working OK in general, but may lead to accidental dropping or neglect of degenerate components.
  • For AFBLMs using tensor formalism (see #26), may have phase issues with it>1, resulting in -ve cross-sections (or null values if summed over), although betas seem correct (only tested for N2 orb6/pig case so far). Not yet sure if this is an issue in the formalism (missing phase or rotation?), assumptions or numerics. Quick fix might be to just set degeneracy factor here?
