
qmflows's People

Contributors

bvb93, dependabot[bot], felipez, h4nsu, jhidding, ridderl, scmtestadf, sergio-lopez


qmflows's Issues

Case sensitivity of functional names in qmworks

Dear All,

Qmworks gives an error when we type the name of a functional in upper-case letters (e.g. BLYP). It only works when we use all lower-case letters to define functionals (e.g. blyp instead of BLYP). We should make them case-insensitive; see the sketch below.

Kind regards,
Satesh
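
A minimal sketch of a possible fix (the helper function and the point where it would be called are assumptions, not the actual qmworks code): normalize the functional name once, where the generic user settings are translated.

def normalize_functional(settings):
    # Hypothetical helper: accept BLYP, Blyp, blyp, ... by converting
    # the generic 'functional' keyword to lower case before translation.
    if 'functional' in settings:
        settings.functional = settings.functional.lower()
    return settings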

First Release

We have been developing QMWORKS for almost a year and it is time to do a release. Note that this will not be a full-fledged release, but the package is gaining momentum and users need several features,
the most important of which is a manual. So I will start doing some code review together with an initial manual. I will also open issues about the things we must do urgently in order to accomplish our first release.

Parsing output files

The properties.json files should have the following format:

{"<property>": {
    "parser": "awk" or "python" or "kfreader",
    "file_ext": "out" or "dat" or "hess", etc.,
    "function": "some awk script" or "name_of_a_function_in_parser.py" or ["section", "property"] for t21 files
}}
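
For concreteness, a hypothetical entry in this format plus a small lookup sketch (the "dipole" entry and the awk script are made up for illustration):

{"dipole": {
    "parser": "awk",
    "file_ext": "out",
    "function": "/Dipole/ {print $2, $3, $4}"
}}

import json

def get_parser_info(properties_file, prop):
    # Return (parser, file_ext, function) for a given property name.
    with open(properties_file) as f:
        table = json.load(f)
    entry = table[prop]
    return entry["parser"], entry["file_ext"], entry["function"]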

Settings lower case

In the Settings definition in the master branch we are explicitly converting the names to lower case, while in the develop branch something different is going on. Are we going to enforce the conversion to lower case, or should the user be aware of potential pitfalls due to the case-insensitive nature of the Settings?
The overlay test is failing in the develop branch because, if a keyword is upper-case in the template and the user redefines it using lower case, the result is two new branches with the same name but different case.
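
One way to avoid the duplicate branches would be to normalize keys on insertion. A minimal sketch, assuming Settings behaves like a dict subclass (the real qmworks class is more elaborate):

class CaseInsensitiveSettings(dict):
    # Sketch: store every key in lower case so 'XC' and 'xc'
    # end up in the same branch.
    def __setitem__(self, key, value):
        super(CaseInsensitiveSettings, self).__setitem__(key.lower(), value)

    def __getitem__(self, key):
        return super(CaseInsensitiveSettings, self).__getitem__(key.lower())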

Plams fails to read Geometries

I have added a new test called test_ethene (this test was originally created by a student). The workflow fails with the following error:

  File "$HOME/escience/src/qmworks/qmworks/packages/packages.py", line 61, in __call__
    result = self.run_job(job_settings, mol, **kwargs)
  File "$HOME/escience/src/qmworks/qmworks/packages/SCM.py", line 47, in run_job
    settings=adf_settings).run()
  File "$HOME/miniconda/envs/qmworks/lib/python3.5/site-packages/plams/basejob.py", line 112, in run
    jobrunner._run_job(self, jobmanager)
  File "$HOME/miniconda/envs/qmworks/lib/python3.5/site-packages/plams/jobrunner.py", line 33, in wrapper
    func(self, *args, **kwargs)
  File "$HOME/miniconda/envs/qmworks/lib/python3.5/site-packages/plams/jobrunner.py", line 109, in _run_job
    if job._prepare(jobmanager):
  File "$HOME/miniconda/envs/qmworks/lib/python3.5/site-packages/plams/basejob.py", line 182, in _prepare
    prev = jobmanager._check_hash(self)
  File "$HOME/miniconda/envs/qmworks/lib/python3.5/site-packages/plams/jobmanager.py", line 100, in _check_hash
    h = job.hash()
  File "$HOME/miniconda/envs/qmworks/lib/python3.5/site-packages/plams/basejob.py", line 327, in hash
    h.update(self.get_input().encode())
  File "$HOME/miniconda/envs/qmworks/lib/python3.5/site-packages/plams/scmjob.py", line 182, in get_input
    self._parsemol()
  File "$HOME/miniconda/envs/qmworks/lib/python3.5/site-packages/plams/scmjob.py", line 272, in _parsemol
    self.settings.input.atoms['_'+str(i+1)] = ('%5i'%(i+1)) + atom.str(symbol=smb, suffix=suffix)
AttributeError: 'str' object has no attribute 'atoms'

The workflow correctly creates the structures and optimizes them, but it fails when PLAMS calls the function _parsemol to create a new input using the optimized geometries.
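
The traceback suggests that by the time _parsemol runs, job.settings.input has been replaced by a plain string. A hypothetical guard at the start of run_job (job_settings is the name used in the traceback; the check itself is not in the current code) would localize where the overwrite happens:

from plams import Settings

# Sketch: fail early with a clear message instead of deep inside plams.
assert isinstance(job_settings.input, Settings), \
    "settings.input was overwritten with a %s" % type(job_settings.input)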

Storing and Retrieving results

Dear Johan,

I have the following problem with our current TS script.

script

Look at below example:

[14:59:46] Executing ac01_r2_DFTB.run
[14:59:57] Execution of ac01_r2_DFTB.run finished with returncode 154
[14:59:57] WARNING: Job ac01_r2_DFTB finished with nonzero return code
[14:59:57] Job ac01_r2_DFTB finished with status 'crashed'

So ac01_r2_DFTB crashed because it exceeded the maximum number of geometry optimizations. This is purely a DFTB issue, nothing to do with qmworks. But look at the lines below:

Job et01_r2_DFTB started
[15:05:08] Starting et01_r2_DFTB.prerun()
[15:05:08] et01_r2_DFTB.prerun() finished
[15:05:08] Job et01_r2_DFTB previously run as ac01_r2_DFTB, using old results
[15:05:08] Copying results of ac01_r2_DFTB failed because of the following error: Using Results associated with crashed or failed job

To run et01_r2_DFTB, qmworks reuses ac01_r2_DFTB, since the structure is the same in both reactions. If either of them fails, qmworks still automatically retrieves that information for the second job. Because the job failed in the first run, it is simply not re-run in the second job, which leads to the loss of all files (except the .dill file) in the second job folder.

Kind regards,
Satesh

components

There are several components modules that are not exported by the package. Are they operational?

hdf5 headers

When calling h5py, one user received the following error:

Warning! _HDF5 library version mismatched error_
The HDF5 header files used to compile this application do not match
the version used by the HDF5 library to which this application is linked.
Data corruption or segmentation faults may occur if the application continues.
This can happen when an application was compiled by one version of HDF5 but
linked with a different version of static or shared HDF5 library.
You should recompile the application or check your shared library related
settings such as 'LD_LIBRARY_PATH'.
You can, at your own risk, disable this warning by setting the environment
variable 'HDF5_DISABLE_VERSION_CHECK' to a value of '1'.
Setting it to 2 or higher will suppress the warning messages totally.
Headers are 1.8.10, library is 1.8.16
SUMMARY OF THE HDF5 CONFIGURATION
=================================

General Information:

     HDF5 Version: 1.8.16
    Configured on: Mon Apr  4 16:08:17 CDT 2016
    Configured by: [email protected]
   Configure mode: production
      Host system: x86_64-unknown-linux-gnu
    Uname information: Linux centos5x64.corp.continuum.io 2.6.18-400.1.1.el5 #1 SMP Thu Dec 18 00:59:53 EST 2014 x86_64 x86_64 x86_64 GNU/Linux
         Byte sex: little-endian
        Libraries: shared
   Installation point: /opt/anaconda1anaconda2anaconda3

Compiling Options:

           Compilation Mode: production
                 C Compiler: /usr/bin/gcc ( gcc (GCC) 4.4.7 20120313 )
                     CFLAGS: 
                  H5_CFLAGS: -std=c99 -pedantic -Wall -Wextra -Wundef -Wshadow -Wpointer-arith -Wbad-function-cast -Wcast-qual -Wcast-align -Wwrite-strings -Wconversion -Waggregate-return -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wredundant-decls -Wnested-externs -Winline -Wno-long-long -Wfloat-equal -Wmissing-format-attribute -Wmissing-noreturn -Wpacked -Wdisabled-optimization -Wformat=2 -Wunreachable-code -Wendif-labels -Wdeclaration-after-statement -Wold-style-definition -Winvalid-pch -Wvariadic-macros -Wnonnull -Winit-self -Wmissing-include-dirs -Wswitch-default -Wswitch-enum -Wunused-macros -Wunsafe-loop-optimizations -Wc++-compat -Wstrict-overflow -Wlogical-op -Wlarger-than=2048 -Wvla -Wsync-nand -Wframe-larger-than=16384 -Wpacked-bitfield-compat -O3
                  AM_CFLAGS: 
                   CPPFLAGS: 
                H5_CPPFLAGS: -D_GNU_SOURCE -D_POSIX_C_SOURCE=200112L   -DNDEBUG -UH5_DEBUG_API
                AM_CPPFLAGS: -D_LARGEFILE64_SOURCE -D_LARGEFILE_SOURCE 
           Shared C Library: yes
           Static C Library: no

Statically Linked Executables: no
LDFLAGS:
H5_LDFLAGS:
AM_LDFLAGS:
Extra libraries: -lrt -lz -ldl -lm
Archiver: ar
Ranlib: ranlib
Debugged Packages:
API Tracing: no

Languages:

                    Fortran: no

                        C++: yes
               C++ Compiler: /usr/bin/g++ ( g++ (GCC) 4.4.7 20120313 )
                  C++ Flags: 
               H5 C++ Flags:  
               AM C++ Flags: 
         Shared C++ Library: yes
         Static C++ Library: no

Features:

              Parallel HDF5: no
         High Level library: yes
               Threadsafety: no
        Default API Mapping: v18

With Deprecated Public Symbols: yes
I/O filters (external): deflate(zlib)
MPE: no
Direct VFD: no
dmalloc: no
Clear file buffers before write: yes
Using memory checker: no
Function Stack Tracing: no
Strict File Format Checks: no
Optimization Instrumentation: no
Bye...
Aborted (core dumped)
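
A quick way to check which versions a given Python environment actually combines (these are standard h5py attributes):

import h5py

print(h5py.version.version)       # version of h5py itself
print(h5py.version.hdf5_version)  # HDF5 library h5py was compiled against

If the library version disagrees with the headers reported above, reinstalling h5py against the matching HDF5 library should resolve the warning.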

Job Name in the Result objects

Currently we can name input jobs using the job_name keyword, but there is no such label in the result object. Users may want to do further processing of the result object using the job_name label; a sketch follows.
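
A minimal sketch of the idea (the constructor signature and the energy property name are assumptions about the qmworks Result class):

# Sketch: propagate the label given at submission time into the Result.
result = Result(settings, molecule, job_name=job_name)

# Downstream, users could then build tables keyed by the label:
table = {r.job_name: r.energy for r in results}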

Serialize numpy arrays or store in HDF5

I tried to run this script:

h2o = rdkitTools.smiles2plams('O')
h2o_freq = gamess(templates.freq, h2o, job_name = "freq", work_dir = "/home/lars/scr").hessian
s = Settings()
s.inithess = h2o_freq
h2o_opt = adf(templates.geometry.overlay(s), h2o, job_name = "opt")

It crashed with:

....
  File "/home/lars/workspace/workflowengine/noodles/serial/registry.py", line 324, in encode
    raise NotImplementedError(msg)
NotImplementedError: Cannot encode [  5.83587488e-01  -3.38289452e-03   9.93129190e-14  -2.97726953e-01
  -2.30248008e-01  -1.47776756e-14  -2.85862141e-01   2.33630549e-01
  -3.73399228e-14  -3.38289452e-03   3.62691600e-01   2.51534904e-14
  -1.53536080e-01  -1.75418715e-01  -2.25427316e-13   1.56919114e-01
  -1.87275345e-01   1.28022593e-13   9.93129190e-14   2.51534904e-14
  -1.58563114e-02  -5.85902854e-14  -5.06539255e-14   7.92743537e-03
  -4.07660017e-14   2.55871713e-14   7.92968842e-03  -2.97726953e-01
  -1.53536080e-01  -5.85902854e-14   3.32297741e-01   1.91303251e-01
   2.97938757e-14  -3.45700387e-02  -3.77674987e-02  -4.11996826e-15
  -2.30248008e-01  -1.75418715e-01  -5.06539255e-14   1.91303251e-01
   1.71549718e-01   1.16616786e-13   3.89449502e-02   3.87034274e-03
  -4.84855212e-14  -1.47776756e-14  -2.25427316e-13   7.92743537e-03
   2.97938757e-14   1.16616786e-13  -6.96379742e-03  -1.50053581e-14
   1.08680426e-13  -9.64044822e-04  -2.85862141e-01   1.56919114e-01
  -4.07660017e-14  -3.45700387e-02   3.89449502e-02  -1.50053581e-14
   3.20433037e-01  -1.95863384e-01   4.13297868e-14   2.33630549e-01
  -1.87275345e-01   2.55871713e-14  -3.77674987e-02   3.87034274e-03
   1.08680426e-13  -1.95863384e-01   1.83406115e-01  -7.96238075e-14
  -3.73399228e-14   1.28022593e-13   7.92968842e-03  -4.11996826e-15
  -4.84855212e-14  -9.64044822e-04   4.13297868e-14  -7.96238075e-14
  -6.96604910e-03]: encoder for type `ndarray` is not implemented.

So we now run into the issue of serializing the hessian.
Here we are back to our discussion on how to handle potentially big data arrays and their storage in HDF5.

How about the following solution:
We could add a serializer in packages.py for numpy arrays. This serializer saves the numpy array to an HDF5 file and returns the path of that file; see the sketch below.
What do you think?
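
A minimal sketch of such a serializer, following the same Serialiser pattern used in packages.py (see the registry snippet in "Is the registry still used this way by Noodles?" below); the file layout and the content-based key are assumptions:

import hashlib

import h5py
import numpy as np
from noodles.serial import Serialiser

class SerNDArray(Serialiser):
    # Sketch: store the array in an HDF5 file and encode only a reference.
    def __init__(self, path='arrays.hdf5'):
        super(SerNDArray, self).__init__(np.ndarray)
        self.path = path

    def encode(self, obj, make_rec):
        key = hashlib.md5(obj.tobytes()).hexdigest()  # content-based key
        with h5py.File(self.path, 'a') as f:
            if key not in f:
                f.create_dataset(key, data=obj)
        return make_rec({'path': self.path, 'key': key})

    def decode(self, cls, data):
        with h5py.File(data['path'], 'r') as f:
            return f[data['key']][()]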

Gamess crashes when symmetry is not defined

When a gamess job is started without a symmetry definition, plams selects C1 symmetry by default. However, the job then crashes with the following error:

 THE POINT GROUP OF THE MOLECULE IS C1      
 THE ORDER OF THE PRINCIPAL AXIS IS     0


 *** ERROR!
 BLANK CARD FOUND WHILE TRYING TO READ INPUT ATOM    1
 POSSIBLE ERRORS INCLUDE:
 1. C1 GROUP SHOULD NOT HAVE A BLANK CARD AFTER IT.

Plams puts an empty line after the line specifying the symmetry. Apparently this is wrong with C1, but it is fine with other symmetries like Cs. Pretty weird behavior of Gamess, but confirmed on slide 26 of http://www.msg.ameslab.gov/tutorials/gamessintro.pdf

This needs to be solved in plams/gamessjob.py, in the function print_molecule; a sketch follows.
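
A minimal sketch of the kind of fix (the actual print_molecule in PLAMS is organized differently; this only shows the conditional blank card):

def write_symmetry(f, symmetry):
    # GAMESS wants a blank card after the symmetry line for every
    # point group except C1 (see the slide linked above).
    f.write(' %s\n' % symmetry)
    if symmetry.strip().upper() != 'C1':
        f.write('\n')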

Generic keywords

Are we happy with the format of the generic2package.json files?

  • The interpretation of the json relies on the qmworks Settings subclass, which interprets ["key1.key2"] as ["key1"]["key2"]. If possible it would be great if we could stick to the plams Settings somehow. But I don't know how to implement, e.g., the "basis becomes basis.type" modification using just the plams Settings.
  • Currently, if a value in the dictionary is a list (of two items), the first item is the specific keyword and the second item is a specific value. If the second item is a dictionary, the dictionary defines how to translate the user-specified generic value to a package-specific value (see https://github.com/SCM-NV/qmworks/blob/master/qmworks/data/dictionaries/generic2ADF.json). This is not very intuitive.
  • One improvement could be to use a "key" and "value" attribute, so:
"generic_key": {
    "key": "specific_key",
    "value": "specific_value"}
  • Then specific_value could still be a dictionary defining how a user-provided generic value should be translated into a package-specific value; see the illustration after this list.

Any ideas?
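
Spelled out, a generic2ADF.json entry in the proposed format could look like this (a hypothetical illustration, not the current file contents):

"functional": {
    "key": "xc",
    "value": {
        "pbe":  {"gga": "pbe"},
        "blyp": {"gga": "blyp"},
        "lda":  {"lda": ""}
    }
}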

Suspending a calculation if a warning is issued

There are some packages, including CP2K, that report a normal termination even though the SCF did not converge.
For example

 SCF WAVEFUNCTION OPTIMIZATION

  Step     Update method      Time    Convergence         Total energy    Change
  ------------------------------------------------------------------------------
..............................
   196 P_Mix/Diag. 0.40E+00   11.4   165.68819872      4330.1476633861  2.12E+03
   197 P_Mix/Diag. 0.40E+00   11.4    99.38644504       192.6044412835 -4.14E+03
   198 P_Mix/Diag. 0.40E+00   11.4    94.46161095      2939.4526751424  2.75E+03
   199 P_Mix/Diag. 0.40E+00   11.4   322.70007533      4201.4125916074  1.26E+03
   200 P_Mix/Diag. 0.40E+00   11.4   165.15054963      3928.7731404706 -2.73E+02

  Leaving inner SCF loop after reaching   200 steps.

  Electronic density on regular grids:      -2111.9999999764        0.0000000236
  Core density on regular grids:             2111.9999997104       -0.0000002896
  Total charge density on r-space grids:       -0.0000002660
  Total charge density g-space grids:          -0.0000002660

  Overlap energy of the core charge distribution:               0.00003969715808
  Self energy of the core charge distribution:              -9931.63094211405405
  Core Hamiltonian energy:                                   4332.57900394370881
  Hartree energy:                                           10407.72552904863733
  Exchange-correlation energy:                               -879.90049010482494
  Total energy:                                              3928.77314047062464

 *** WARNING in qs_scf.F:479 :: SCF run NOT converged ***

These results are useless, but QMWorks keeps on running because CP2K reports a normal termination message.

The question, then, is who is responsible for terminating the calculation: the programmer of the workflow, or QMWorks itself?
Any suggestions on how to implement this? A sketch of a possible check follows.
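
One possible direction is a post-run check on the output file; a minimal sketch (the warning string is taken verbatim from the output above):

def scf_converged(output_file):
    # Treat a CP2K job as failed when the output contains the
    # 'SCF run NOT converged' warning, even if termination was normal.
    with open(output_file) as f:
        return not any('SCF run NOT converged' in line for line in f)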

Status Change

Dear All,

I am querying for the status to be successful (status == successful) in my workflow before printing results. But I found that, for some reason, the status is 'copied' instead of 'successful' after workflow execution, so it obviously does not print the results for me.

Many thanks in advance,
Satesh
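
Until this is clarified, a workaround in the workflow script could accept both values ('copied' being the status reported when results are reused from a previous run); a sketch, where print_results stands for the user's own reporting function:

if result.status in ('successful', 'copied'):
    print_results(result)  # hypothetical user function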

Settings

User Settings should take preference over default Settings. But currently both the user and the default settings are printed in the case of functionals. For example, look at the XC section of the input generated by the following example script, where both PBE and LDA appear; only PBE (the user-preferred functional) should be kept. This should be fixed, as more people have now started using our code.

# Default imports
from qmworks import Settings, templates, run
from noodles import gather
from plams import Molecule

from qmworks.packages.SCM import adf

import plams


plams.init()

import os
import sys
import fnmatch
from os.path import join

path = "/home/ganga/Workflows/Workflows3/Ayers/WithoutDFTBHessian/Moleculeswithmul2"
files = os.listdir(path)
xyzFiles = filter(lambda x: fnmatch.fnmatch(x,"*.xyz"), files)
pathsXYZ = map(lambda x: join(path,x), xyzFiles)
molecules = [Molecule(name, 'xyz') for name in pathsXYZ]

settings = Settings()
settings.functional = "pbe"
settings.basis = "TZ2P"
settings.specific.adf.charge = "0 2"
settings.specific.adf.unrestricted = ""

job_list = []
for m in molecules:
    ts = adf(templates.ts.overlay(settings), m)
    job_list.append(adf(templates.freq.overlay(settings), ts.molecule))

wf = gather(*job_list)

results = run(wf, n_processes = 1)

The generated ADF input then contains both functionals in the XC block:

Scf
  Converge 1e-06
  Iterations 100
End

Xc
    Gga PBE
    Lda
End

End Input
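
Until the overlay precedence is fixed, a possible workaround is to remove the default functional from the template before overlaying. A sketch; the exact key path of the default LDA entry is an assumption:

t = templates.ts.copy()             # Settings.copy() as in plams
if 'lda' in t.specific.adf.xc:      # assumed location of the default
    del t.specific.adf.xc['lda']    # drop LDA so only the user's PBE remains
ts = adf(t.overlay(settings), m)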

issue-10

QMWORKS fails with an error similar to:

Internal error encountered. Contact the developers:
<class 'NotImplementedError'> Cannot encode <noodles.files.path.Path object at
0x2aaac81ba6a0>: encoder for type Path is not implemented.

This error is raised by the noodles serialiser; a sketch of a possible Path serialiser follows.
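
A fix along the lines of the other serialisers in packages.py would register an encoder for Path as well. A sketch; the module path is taken from the error message, while the attribute holding the file location is an assumption:

from noodles.files.path import Path
from noodles.serial import Serialiser

class SerPath(Serialiser):
    # Sketch: encode a noodles Path as its plain string location.
    def __init__(self):
        super(SerPath, self).__init__(Path)

    def encode(self, obj, make_rec):
        return make_rec(obj.path)   # assumed attribute name

    def decode(self, cls, data):
        return Path(data)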

Conditional execution

Noodles has a new feature that allows for easier conditional execution. Having code like this in QMworks would allow you to run several codes until one is successful.

from noodles import schedule, quote, unquote


def find_first(pred, lst):
    """Receive a predicate (non-scheduled) and a list of promised objects.
    Promises are executed until we find one for which the predicate
    returns true."""
    if lst:
        return s_find_first(pred, lst[0], [quote(l) for l in lst[1:]])
    else:
        return None


@schedule
def s_find_first(pred, first, lst):
    if pred(first):
        return first
    elif lst:
        return s_find_first(pred, unquote(lst[0]), lst[1:])
    else:
        return None

Improve Error reporting

Right now the error-handling mechanism is triggered when a property is requested from a Result object. If the library cannot get the queried property, it may raise an error that is difficult for the user to follow.

See the get_property implementation; a sketch of friendlier reporting follows.
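
One direction (a sketch; the internal _properties store and the job_name attribute are assumptions about the Result class) is to wrap the lookup and re-raise with context:

class PropertyNotFoundError(Exception):
    """Raised when a queried property cannot be retrieved from a Result."""

def get_property(self, prop):
    try:
        return self._properties[prop]   # hypothetical internal store
    except KeyError as e:
        raise PropertyNotFoundError(
            "Cannot retrieve '{}' for job '{}'; check that the job "
            "finished successfully and that the property is defined "
            "in properties.json".format(prop, self.job_name)) from e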

Implement generic input keywords for Orca

The translation from generic to specific keywords is not implemented in handle_special_keywords for the Orca class.
The first two keywords to translate should be the following (a sketch appears after the list):

  • basis
  • functional
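
A minimal sketch; the signature is modeled on the other packages, and the specific ORCA key paths are assumptions:

def handle_special_keywords(self, settings, key, value, mol):
    # Translate the two generic keywords into ORCA-specific settings.
    if key == 'basis':
        settings.specific.orca.basis.basis = value
    elif key == 'functional':
        settings.specific.orca.method.functional = value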

Add a proper logger to QMWorks

Currently we print everything (info, warnings and errors) to the standard output. A proper logger is required; a minimal setup with the standard logging module is sketched below.
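
A minimal setup with the standard library (the logger name and format are just a suggestion):

import logging
import sys

logger = logging.getLogger('qmworks')
handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(
    logging.Formatter('[%(asctime)s] %(levelname)s: %(message)s'))
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info('job started')           # instead of bare print calls
logger.warning('SCF not converged')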

Querying for properties

Dear Johan and Lars,

I was running a workflow for a TS search with DFTB frequencies (test1_withDFTBfreq.py) and it finished successfully with results (slurm-2471538.out).

I made a small modification to the workflow: I removed the DFTB frequency calculation in the middle of the workflow (test1_withoutDFTBfreq.py).

I ran the second script in the same folder as the previous script. The first few steps in both workflows are identical, so their results were retrieved from cache.json from the previous run; the remaining jobs were run and their results saved into a new plams folder.

When I try to print the table in the second workflow, it seems that it cannot find the results of the first steps, because those results live in the first plams folder.

I believe noodles treats these as two different workflows, regardless of the common jobs. Is there any way to copy the retrieved information from cache.json to the new plams folder so that all the results of the new workflow stay together, or am I missing something?

Could one of you please help me solve this issue?

Kind regards,
Satesh

Packages documentation

The docs folder contains all the tutorials and manuals that we are developing. Please feel free to change things and add new documentation, especially to the packages file.
Some examples of how to run each of the packages would also be nice; for instance, something like the sketch below.
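
A minimal ADF example might look like this (a sketch using the generic keywords that appear elsewhere in this tracker; water.xyz is a placeholder file):

from plams import Molecule
from qmworks import Settings, templates, run
from qmworks.packages.SCM import adf

mol = Molecule('water.xyz')   # placeholder input structure

s = Settings()
s.functional = 'pbe'
s.basis = 'DZP'

opt = adf(templates.geometry.overlay(s), mol, job_name='water_opt')
result = run(opt)
print(result.molecule)        # optimized geometry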

Is the registry still used this way by Noodles?

Inside the packages module there are several functions related to the noodles serialisation procedure, and it is not clear whether they are actively used by Noodles. Can someone please comment on the following functions:

import plams
from noodles import serial
from noodles.serial import AsDict, Registry, Serialiser
# Package, Result, Settings and SerAutoStorable come from qmworks itself.


class SerMolecule(Serialiser):
    """Serialise a plams Molecule via its dictionary representation."""
    def __init__(self):
        super(SerMolecule, self).__init__(plams.Molecule)

    def encode(self, obj, make_rec):
        return make_rec(obj.as_dict())

    def decode(self, cls, data):
        return plams.Molecule.from_dict(**data)


class SerSettings(Serialiser):
    """Serialise a qmworks Settings object via its dictionary representation."""
    def __init__(self):
        super(SerSettings, self).__init__(Settings)

    def encode(self, obj, make_rec):
        return make_rec(obj.as_dict())

    def decode(self, cls, data):
        return Settings(data)


def registry():
    """Build the Registry that tells noodles how to (de)serialise our types."""
    return Registry(
        parent=serial.base(),
        types={
            Package: AsDict(Package),
            plams.Molecule: SerMolecule(),
            Result: SerAutoStorable(Result),
            Settings: SerSettings()})
