dpdata's People

Contributors

amcadmus, angusezhang, chentao168, dmh1998dmh, ericwang6, felix5572, haidi-ustc, hongritianqi, huangjiameng, iprozd, liu-rx, liuliping0315, marian-code, njzjz, njzjz-bot, panxiang126, pkufjh, pre-commit-ci[bot], pxlxingliang, robinzyb, shigeandtomo, silvia-liu, starinthesky72, tuoping, vibsteamer, wanghan-iapcm, wardlt, yi-fanli, yuliu98, zezhong-zhang

dpdata's Issues

box.raw file not found with nopbc file

I installed dpdata 0.2.1 via conda.

When I tried to read the nopbc example data with dpdata, an error was raised.
Here is the command I ran:
x = dpdata.LabeledSystem("mypath/deepmd-kit/examples/nopbc/data/C1H4O2", fmt='deepmd/npy')
Here is the error:
FileNotFoundError: [Errno 2] No such file or directory: '/mypath/deepmd-kit/examples/nopbc/data/C1H4O2/set.000/box.npy'
Here is what the documentation says: "If one needs to train a non-periodic system, an empty nopbc file should be put under the system directory. box.raw is not necessary is a non-periodic system."
In fact, the example directory contains no box.raw or virial file.

[BUG] __add__ function of system.py is not working

Summary

When I changed

self.system_1.append(self.system_2)

to

self.system_3=self.system_1+self.system_2

in tests/test_system_append.py, a pile of errors were reported.

..............[[[4.3485389  4.20903041 5.2       ]
  [2.30878039 6.27327007 1.13      ]
  [4.64061163 3.49272294 4.58      ]
  [3.97070725 3.80408719 6.03      ]
  [1.38402421 6.25106647 1.5       ]
  [2.30624337 5.88874931 0.21      ]]]
.EEEEEEEEEEEEEE
======================================================================
ERROR: test_add_func (__main__.TestVaspXmlAppend)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_system_append.py", line 35, in setUp
    self.system_1 = self.system_1.sub_system([0, 12, 4, 16, 8])
  File "/home/tuoping/dpdata/dpdata/system.py", line 966, in sub_system
    tmp_sys.data = System.sub_system(self, f_idx).data
  File "/home/tuoping/dpdata/dpdata/system.py", line 286, in sub_system
    tmp.data['cells'] = self.data['cells'][f_idx].reshape(-1, 3, 3)
IndexError: index 12 is out of bounds for axis 0 with size 10

======================================================================
ERROR: test_atom_names (__main__.TestVaspXmlAppend)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_system_append.py", line 35, in setUp
    self.system_1 = self.system_1.sub_system([0, 12, 4, 16, 8])
  File "/home/tuoping/dpdata/dpdata/system.py", line 966, in sub_system
    tmp_sys.data = System.sub_system(self, f_idx).data
  File "/home/tuoping/dpdata/dpdata/system.py", line 286, in sub_system
    tmp.data['cells'] = self.data['cells'][f_idx].reshape(-1, 3, 3)
IndexError: index 12 is out of bounds for axis 0 with size 10

======================================================================
ERROR: test_atom_numbs (__main__.TestVaspXmlAppend)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_system_append.py", line 35, in setUp
    self.system_1 = self.system_1.sub_system([0, 12, 4, 16, 8])
  File "/home/tuoping/dpdata/dpdata/system.py", line 966, in sub_system
    tmp_sys.data = System.sub_system(self, f_idx).data
  File "/home/tuoping/dpdata/dpdata/system.py", line 286, in sub_system
    tmp.data['cells'] = self.data['cells'][f_idx].reshape(-1, 3, 3)
IndexError: index 12 is out of bounds for axis 0 with size 10

======================================================================
ERROR: test_atom_types (__main__.TestVaspXmlAppend)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_system_append.py", line 35, in setUp
    self.system_1 = self.system_1.sub_system([0, 12, 4, 16, 8])
  File "/home/tuoping/dpdata/dpdata/system.py", line 966, in sub_system
    tmp_sys.data = System.sub_system(self, f_idx).data
  File "/home/tuoping/dpdata/dpdata/system.py", line 286, in sub_system
    tmp.data['cells'] = self.data['cells'][f_idx].reshape(-1, 3, 3)
IndexError: index 12 is out of bounds for axis 0 with size 10

======================================================================
ERROR: test_cell (__main__.TestVaspXmlAppend)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_system_append.py", line 35, in setUp
    self.system_1 = self.system_1.sub_system([0, 12, 4, 16, 8])
  File "/home/tuoping/dpdata/dpdata/system.py", line 966, in sub_system
    tmp_sys.data = System.sub_system(self, f_idx).data
  File "/home/tuoping/dpdata/dpdata/system.py", line 286, in sub_system
    tmp.data['cells'] = self.data['cells'][f_idx].reshape(-1, 3, 3)
IndexError: index 12 is out of bounds for axis 0 with size 10

======================================================================
ERROR: test_coord (__main__.TestVaspXmlAppend)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_system_append.py", line 35, in setUp
    self.system_1 = self.system_1.sub_system([0, 12, 4, 16, 8])
  File "/home/tuoping/dpdata/dpdata/system.py", line 966, in sub_system
    tmp_sys.data = System.sub_system(self, f_idx).data
  File "/home/tuoping/dpdata/dpdata/system.py", line 286, in sub_system
    tmp.data['cells'] = self.data['cells'][f_idx].reshape(-1, 3, 3)
IndexError: index 12 is out of bounds for axis 0 with size 10

======================================================================
ERROR: test_energy (__main__.TestVaspXmlAppend)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_system_append.py", line 35, in setUp
    self.system_1 = self.system_1.sub_system([0, 12, 4, 16, 8])
  File "/home/tuoping/dpdata/dpdata/system.py", line 966, in sub_system
    tmp_sys.data = System.sub_system(self, f_idx).data
  File "/home/tuoping/dpdata/dpdata/system.py", line 286, in sub_system
    tmp.data['cells'] = self.data['cells'][f_idx].reshape(-1, 3, 3)
IndexError: index 12 is out of bounds for axis 0 with size 10

======================================================================
ERROR: test_force (__main__.TestVaspXmlAppend)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_system_append.py", line 35, in setUp
    self.system_1 = self.system_1.sub_system([0, 12, 4, 16, 8])
  File "/home/tuoping/dpdata/dpdata/system.py", line 966, in sub_system
    tmp_sys.data = System.sub_system(self, f_idx).data
  File "/home/tuoping/dpdata/dpdata/system.py", line 286, in sub_system
    tmp.data['cells'] = self.data['cells'][f_idx].reshape(-1, 3, 3)
IndexError: index 12 is out of bounds for axis 0 with size 10

======================================================================
ERROR: test_is_pbc (__main__.TestVaspXmlAppend)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_system_append.py", line 35, in setUp
    self.system_1 = self.system_1.sub_system([0, 12, 4, 16, 8])
  File "/home/tuoping/dpdata/dpdata/system.py", line 966, in sub_system
    tmp_sys.data = System.sub_system(self, f_idx).data
  File "/home/tuoping/dpdata/dpdata/system.py", line 286, in sub_system
    tmp.data['cells'] = self.data['cells'][f_idx].reshape(-1, 3, 3)
IndexError: index 12 is out of bounds for axis 0 with size 10

======================================================================
ERROR: test_len_func (__main__.TestVaspXmlAppend)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_system_append.py", line 35, in setUp
    self.system_1 = self.system_1.sub_system([0, 12, 4, 16, 8])
  File "/home/tuoping/dpdata/dpdata/system.py", line 966, in sub_system
    tmp_sys.data = System.sub_system(self, f_idx).data
  File "/home/tuoping/dpdata/dpdata/system.py", line 286, in sub_system
    tmp.data['cells'] = self.data['cells'][f_idx].reshape(-1, 3, 3)
IndexError: index 12 is out of bounds for axis 0 with size 10

======================================================================
ERROR: test_nframs (__main__.TestVaspXmlAppend)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_system_append.py", line 35, in setUp
    self.system_1 = self.system_1.sub_system([0, 12, 4, 16, 8])
  File "/home/tuoping/dpdata/dpdata/system.py", line 966, in sub_system
    tmp_sys.data = System.sub_system(self, f_idx).data
  File "/home/tuoping/dpdata/dpdata/system.py", line 286, in sub_system
    tmp.data['cells'] = self.data['cells'][f_idx].reshape(-1, 3, 3)
IndexError: index 12 is out of bounds for axis 0 with size 10

======================================================================
ERROR: test_nopbc (__main__.TestVaspXmlAppend)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_system_append.py", line 35, in setUp
    self.system_1 = self.system_1.sub_system([0, 12, 4, 16, 8])
  File "/home/tuoping/dpdata/dpdata/system.py", line 966, in sub_system
    tmp_sys.data = System.sub_system(self, f_idx).data
  File "/home/tuoping/dpdata/dpdata/system.py", line 286, in sub_system
    tmp.data['cells'] = self.data['cells'][f_idx].reshape(-1, 3, 3)
IndexError: index 12 is out of bounds for axis 0 with size 10

======================================================================
ERROR: test_orig (__main__.TestVaspXmlAppend)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_system_append.py", line 35, in setUp
    self.system_1 = self.system_1.sub_system([0, 12, 4, 16, 8])
  File "/home/tuoping/dpdata/dpdata/system.py", line 966, in sub_system
    tmp_sys.data = System.sub_system(self, f_idx).data
  File "/home/tuoping/dpdata/dpdata/system.py", line 286, in sub_system
    tmp.data['cells'] = self.data['cells'][f_idx].reshape(-1, 3, 3)
IndexError: index 12 is out of bounds for axis 0 with size 10

======================================================================
ERROR: test_virial (__main__.TestVaspXmlAppend)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_system_append.py", line 35, in setUp
    self.system_1 = self.system_1.sub_system([0, 12, 4, 16, 8])
  File "/home/tuoping/dpdata/dpdata/system.py", line 966, in sub_system
    tmp_sys.data = System.sub_system(self, f_idx).data
  File "/home/tuoping/dpdata/dpdata/system.py", line 286, in sub_system
    tmp.data['cells'] = self.data['cells'][f_idx].reshape(-1, 3, 3)
IndexError: index 12 is out of bounds for axis 0 with size 10

----------------------------------------------------------------------
Ran 29 tests in 0.310s

FAILED (errors=14)

Steps to Reproduce

Further Information, Files, and Links

[BUG] Failed to post_fp

After Gaussian has calculated all the cluster candidates, the run terminates in step 8 (post_fp).
run.log reports the error ValueError: could not convert string to float: '219.870882556-1030.493229556'.

If other documents are needed, I will provide them in time.

run.log
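
A string such as '219.870882556-1030.493229556' looks like two numbers from adjacent fixed-width columns that ran together because the second one is negative. That reading is an assumption based only on the error message, not on run.log itself. Under that assumption, a minimal sketch of splitting such a token could look like this (split_fused_floats is a hypothetical helper, not part of dpdata):

```python
import re

# Matches a signed decimal number, optionally followed by an exponent.
_NUMBER = re.compile(r"[+-]?\d+(?:\.\d*)?(?:[eE][+-]?\d+)?")


def split_fused_floats(token):
    """Split a token such as '1.5-2.25' into [1.5, -2.25]."""
    return [float(match) for match in _NUMBER.findall(token)]


values = split_fused_floats("219.870882556-1030.493229556")
print(values)
```

A sign that follows an exponent marker (as in 1.0e-5) stays attached to its number, so scientific notation is not split by mistake.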

[BUG] OUTCAR transformation

Bug summary

I encountered an error when converting OUTCAR data with dpdata:

IndexError: list index out of range

DeePMD-kit Version

2.0.4

TensorFlow Version

2.8.2

How did you download the software?

conda

Input Files, Running Commands, Error Log, etc.

Running Commands:

import dpdata
dsys = dpdata.LabeledSystem('OUTCAR')

Error Log:
Traceback (most recent call last):
  File "", line 1, in
  File "/home/wangchenyang/anaconda3/envs/deepmd/lib/python3.9/site-packages/dpdata/system.py", line 227, in __init__
    self.from_fmt(file_name, fmt, type_map=type_map, begin= begin, step=step, **kwargs)
  File "/home/wangchenyang/anaconda3/envs/deepmd/lib/python3.9/site-packages/dpdata/system.py", line 253, in from_fmt
    return self.from_fmt_obj(load_format(fmt), file_name, **kwargs)
  File "/home/wangchenyang/anaconda3/envs/deepmd/lib/python3.9/site-packages/dpdata/system.py", line 1013, in from_fmt_obj
    data = fmtobj.from_labeled_system(file_name, **kwargs)
  File "/home/wangchenyang/anaconda3/envs/deepmd/lib/python3.9/site-packages/dpdata/plugins/vasp.py", line 68, in from_labeled_system
    = dpdata.vasp.outcar.get_frames(file_name, begin=begin, step=step, ml=ml)
  File "/home/wangchenyang/anaconda3/envs/deepmd/lib/python3.9/site-packages/dpdata/vasp/outcar.py", line 71, in get_frames
    coord, cell, energy, force, virial, is_converge = analyze_block(blk, ntot, nelm, ml)
  File "/home/wangchenyang/anaconda3/envs/deepmd/lib/python3.9/site-packages/dpdata/vasp/outcar.py", line 134, in analyze_block
    virial[0][2] = tmp_v[5]
IndexError: list index out of range

Steps to Reproduce

There are no steps.

Further Information, Files, and Links

No response

[BUG] fmt="cp2k/aimd_output" unable to read CP2K9.1 log file

Summary
Using dpdata with fmt="cp2k/aimd_output" returns the error:
  File "<stdin>", line 1, in <module>
  File "/home/jxzhang/deepmd-kit/lib/python3.10/site-packages/dpdata/system.py", line 136, in to
    return self.to_fmt_obj(load_format(fmt), *args, **kwargs)
  File "/home/jxzhang/deepmd-kit/lib/python3.10/site-packages/dpdata/system.py", line 903, in to_fmt_obj
    return fmtobj.to_labeled_system(self.data, *args, **kwargs)
  File "/home/jxzhang/deepmd-kit/lib/python3.10/site-packages/dpdata/format.py", line 77, in to_labeled_system
    return self.to_system(data, *args, **kwargs)
  File "/home/jxzhang/deepmd-kit/lib/python3.10/site-packages/dpdata/plugins/deepmd.py", line 52, in to_system
    dpdata.deepmd.comp.dump(
  File "/home/jxzhang/deepmd-kit/lib/python3.10/site-packages/dpdata/deepmd/comp.py", line 83, in dump
    nframes = data['cells'].shape[0]
AttributeError: 'list' object has no attribute 'shape'

Further Information, Files, and Links
cp2k.zip

The trouble encountered in extracting data from 'OUTCAR'

I have run into trouble extracting data from an OUTCAR (AIMD) using this script:
#script#
from dpdata import LabeledSystem,MultiSystems
from glob import glob
"""
process multi systems
"""
fs=glob('OUTCAR') # remember to change here !!!
ms=MultiSystems()
for f in fs:
    try:
        ls=LabeledSystem(f)
    except:
        print(f)
    if len(ls)>0:
        ms.append(ls)

ms.to_deepmd_raw('deepmd')
ms.to_deepmd_npy('deepmd',set_size=2000)

My OUTCAR contains three kinds of atoms and the errors are reported as follows:
#Error Report#
Traceback (most recent call last):
  File "script.py", line 13, in
    if len(ls)>0:
NameError: name 'ls' is not defined

The above script does not report an error when dealing with a monatomic system.
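
The NameError itself comes from the script rather than from the OUTCAR: when LabeledSystem(f) raises, the bare except only prints the file name, ls is never assigned, and the following len(ls) fails. The real parsing error is hidden. A minimal sketch of a loop that reports the underlying exception and skips the failed file is shown below. The names load_all and fake_loader are hypothetical; in real use the loader argument would be LabeledSystem.

```python
def load_all(paths, loader):
    """Load every path with `loader`, reporting failures instead of hiding them."""
    loaded, failed = [], []
    for path in paths:
        try:
            system = loader(path)
        except Exception as err:
            # Show the real error so the parsing problem can be diagnosed.
            print(f"{path}: {type(err).__name__}: {err}")
            failed.append(path)
            continue  # skip this file; `system` was never assigned
        if len(system) > 0:
            loaded.append(system)
    return loaded, failed


def fake_loader(path):
    """Stand-in for LabeledSystem, used only for this demonstration."""
    if path.endswith(".bad"):
        raise IndexError("list index out of range")
    return [path]


loaded, failed = load_all(["OUTCAR", "OUTCAR.bad"], fake_loader)
```

With this structure the message printed for a failing file is the parser's own exception, which is the information needed to report the actual bug.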

Failed to read cp2k aimd files from restarted files

When reading cp2k aimd outputs from restarted files, dpdata fails with "AssertionError: (array([], dtype=float32), array([], dtype=float32), 'There may be errors in the file')"
The cause is that when a cp2k aimd run starts from scratch, the initial (0th) step is written to both the xyz and log files, but when a run is restarted from a former aimd run, the initial step is not written to the xyz file.
As a result, dpdata fails to match the first energy in the log and xyz files from a restarted aimd run.

[BUG] Bug in reading lammps trajectory files with random type id

dpdata assumes that the atom types are the same in every frame of a file, which is not true in some cases. For example, the following trajectory file cannot be loaded correctly:

ITEM: TIMESTEP
0
ITEM: NUMBER OF ATOMS
3
ITEM: BOX BOUNDS xy xz yz pp pp pp
0.0000000000000000e+00 6.8043376809999998e+00 2.5385198599999999e-02
0.0000000000000000e+00 6.7821075796999999e+00 1.8630761460000000e-01
0.0000000000000000e+00 6.6801861338000004e+00 6.5204177000000002e-02
ITEM: ATOMS id type x y z
1 1 3.48873 0.0697213 6.67774
2 2 3.38621 0.033338 3.34239
3 3 1.79424 1.7281 5.01015
ITEM: TIMESTEP
10
ITEM: NUMBER OF ATOMS
3
ITEM: BOX BOUNDS xy xz yz pp pp pp
3.0951719137647604e-02 6.7713982144168243e+00 2.5146837349522749e-02
3.1535098850918430e-02 6.7499602284333751e+00 1.8455822840494820e-01
3.1362715442244227e-02 6.6488234183577575e+00 6.4591924584292706e-02
ITEM: ATOMS id type x y z
1 3 6.63593 3.49936 3.46086
2 2 3.44881 6.57204 3.4593
3 1 1.85117 5.11268 4.96295
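
In the dump above, atom id 1 has type 1 in the first frame and type 3 in the second. A minimal sketch of detecting this and bringing every frame into a common type order follows. It assumes the column layout "id type x y z" shown above, and the helper names are hypothetical and not part of dpdata.

```python
DUMP = """ITEM: TIMESTEP
0
ITEM: NUMBER OF ATOMS
3
ITEM: ATOMS id type x y z
1 1 3.48873 0.0697213 6.67774
2 2 3.38621 0.033338 3.34239
3 3 1.79424 1.7281 5.01015
ITEM: TIMESTEP
10
ITEM: NUMBER OF ATOMS
3
ITEM: ATOMS id type x y z
1 3 6.63593 3.49936 3.46086
2 2 3.44881 6.57204 3.4593
3 1 1.85117 5.11268 4.96295
"""


def parse_dump_frames(text):
    """Return one list of (atom_id, atom_type, x, y, z) tuples per frame."""
    lines = text.strip().splitlines()
    frames, natoms, i = [], 0, 0
    while i < len(lines):
        line = lines[i].strip()
        if line == "ITEM: NUMBER OF ATOMS":
            natoms = int(lines[i + 1])
            i += 2
        elif line.startswith("ITEM: ATOMS"):
            atoms = []
            for row in lines[i + 1 : i + 1 + natoms]:
                atom_id, atom_type, x, y, z = row.split()[:5]
                atoms.append((int(atom_id), int(atom_type),
                              float(x), float(y), float(z)))
            frames.append(atoms)
            i += 1 + natoms
        else:
            i += 1  # box bounds, timestep values, and other lines
    return frames


def types_by_id(frame):
    """Atom types listed in order of increasing atom id."""
    return [atom[1] for atom in sorted(frame)]


def sort_by_type(frame):
    """Reorder atoms by (type, id) so the type column matches across frames."""
    return sorted(frame, key=lambda atom: (atom[1], atom[0]))


frames = parse_dump_frames(DUMP)
consistent = all(types_by_id(f) == types_by_id(frames[0]) for f in frames)
```

Sorting by type only gives matching frames when the composition is the same in every frame, which holds for this example but is not guaranteed in general.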

xyz data mistaken as containing 'virials'

When using dpdata I found that MultiSystems kept thinking I had 'virials' in my data, when actually I didn't.

system.py line 1292 Object of type Nonetype has no len()

I circumvented this issue by adding a data['virials'] is not None check every time a related error was reported.

Would someone like to explain where (and why) 'virials' is added as a key in self.data by default?
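
The workaround described above can be collected into one small guard instead of repeating the check at every call site. This is only a sketch of that workaround: has_label is a hypothetical helper and says nothing about where dpdata itself adds the key.

```python
def has_label(data, key):
    """True only if `key` is present and holds a non-empty value."""
    value = data.get(key)
    return value is not None and len(value) > 0


example = {"energies": [-1.0, -2.0], "virials": None}
usable_virials = has_label(example, "virials")
usable_energies = has_label(example, "energies")
```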

The cp2k module gives incorrect result

The default unit for coordinates in cp2k output is angstrom, but dpdata treats them as bohr and applies a conversion.
See lines 193-195 in dpdata/cp2k/output.py:
coords_list.append([float(line_list[1])*AU_TO_ANG,
                    float(line_list[2])*AU_TO_ANG,
                    float(line_list[3])*AU_TO_ANG])

[Feature Request] ASE Support

Dear DP Team,

I'm hoping to use ASE's xyz output format with dpdata.

Please let me know if this feature is already available.

Thank you,

Yulie

type_map specified element order with MultiSystem does not work

"How to generate deepmd/raw in a specified element order with MultiSystems?"

I use the following code to generate the raw and npy files, but it generates the output data in the order "C F H Li N O S".

from dpdata import LabeledSystem,MultiSystems
from glob import glob
"""
process multi systems
"""
fs=glob('./*/OUTCAR') # remember to change here !!!
ms=MultiSystems()
for f in fs:
    try:
        ls=LabeledSystem(f,type_map=['N', 'S', 'O','C','F','Li','H'])
    except:
        print(f)
    if len(ls)>0:
        ms.append(ls)

ms.to_deepmd_raw('deepmd')
ms.to_deepmd_npy('deepmd')
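
To make the expected result concrete, here is a minimal sketch of what applying a type map means for the stored data: the element list becomes the requested order, and each atom's type index is rewritten to point into it. remap_types is a hypothetical helper for illustration, not dpdata's implementation.

```python
def remap_types(atom_names, atom_types, type_map):
    """Re-express atom_types (indices into atom_names) as indices into type_map."""
    missing = [name for name in atom_names if name not in type_map]
    if missing:
        raise ValueError(f"type_map lacks elements: {missing}")
    new_index = {name: type_map.index(name) for name in atom_names}
    return list(type_map), [new_index[atom_names[t]] for t in atom_types]


names, types = remap_types(
    ["C", "F", "H"],                       # alphabetical order, as in the output above
    [0, 0, 1, 2, 2],                       # two C, one F, two H
    ["N", "S", "O", "C", "F", "Li", "H"],  # requested order
)
```

Here C, F and H end up with indices 3, 4 and 6, their positions in the requested order.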

[BUG] Read `qe/cp/traj` with unit `angstrom` without `.cel` file

Summary

When dpdata reads 'qe/cp/traj' data with unit angstrom and no '.cel' file, it generates a wrong result.

dpdata 0.2.6

Steps to Reproduce

Files list without cp.cel

cp.in
cp.pos

cp.in

...
CELL_PARAMETERS { angstrom }
       19.7299995422         0.0000000000         0.0000000000
        0.0000000000        19.7299995422         0.0000000000
        0.0000000000         0.0000000000        19.7299995422
...

command

dp_sys = dpdata.System(
    file_name = 'cp',
    fmt = 'qe/cp/traj',
    )
print(dp_sys['cells'][0])

output

[[10.44066613  0.          0.        ]
 [ 0.         10.44066613  0.        ]
 [ 0.          0.         10.44066613]]

When I add the cp.cel file to the file list, the output is correct:

cp.in
cp.pos
cp.cel
[[19.72999963  0.          0.        ]
 [ 0.         19.72999963  0.        ]
 [ 0.          0.         19.72999963]]
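
The wrong value is numerically consistent with the angstrom cell length from cp.in being treated as bohr and converted to angstrom a second time. This is an inference from the arithmetic alone, not a confirmed diagnosis of the dpdata code. A quick check:

```python
BOHR_TO_ANGSTROM = 0.529177210903  # CODATA 2018 Bohr radius in angstrom

cell_from_input = 19.7299995422  # CELL_PARAMETERS { angstrom } in cp.in
reported_wrong = 10.44066613     # value printed without cp.cel

# Treat the angstrom value as if it were bohr and convert it again.
double_converted = cell_from_input * BOHR_TO_ANGSTROM
matches_wrong_output = abs(double_converted - reported_wrong) < 1e-6
print(double_converted, matches_wrong_output)
```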

Further Information, Files, and Links

not working predictions with v2

Summary

The predict method is not working with the v2 version. I get an error concerning the cell array reshape.

I am running version 0.2.1 installed with pip on linux

Steps to Reproduce

from dpdata import LabeledSystem
s = LabeledSystem(".", fmt="deepmd/raw")
s.predict()
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-5-9f4569986ead> in <module>
----> 1 s.predict("../../../../../selective_train3/gen5/train5_5/ge_all_s5_5.pb")

~/Raid/conda_envs/dpmd_gpu_v2.0/lib/python3.9/site-packages/dpdata/system.py in predict(self, dp)
    986             else:
    987                 cell = None
--> 988             e, f, v = dp.eval(coord, cell, atype)
    989             data = ss.data
    990             data['energies'] = e.reshape((1, 1))

~/Raid/conda_envs/dpmd_gpu_v2.0/lib/python3.9/site-packages/deepmd/infer/deep_pot.py in eval(self, coords, cells, atom_types, atomic, fparam, aparam, efield)
    244         else :
    245             if self.auto_batch_size is not None:
--> 246                 e, f, v = self.auto_batch_size.execute_all(self._eval_inner, numb_test, natoms,
    247                               coords, cells, atom_types, fparam = fparam, aparam = aparam, atomic = atomic, efield = efield)
    248             else:

~/Raid/conda_envs/dpmd_gpu_v2.0/lib/python3.9/site-packages/deepmd/utils/batch_size.py in execute_all(self, callable, total_size, natoms, *args, **kwargs)
    114         results = []
    115         while index < total_size:
--> 116             n_batch, result = self.execute(execute_with_batch_size, index, natoms)
    117             if not isinstance(result, tuple):
    118                 result = (result,)

~/Raid/conda_envs/dpmd_gpu_v2.0/lib/python3.9/site-packages/deepmd/utils/batch_size.py in execute(self, callable, start_index, natoms)
     64         """
     65         try:
---> 66             n_batch, result = callable(max(self.current_batch_size // natoms, 1), start_index)
     67         except OutOfMemoryError as e:
     68             # TODO: it's very slow to catch OOM error; I don't know what TF is doing here

~/Raid/conda_envs/dpmd_gpu_v2.0/lib/python3.9/site-packages/deepmd/utils/batch_size.py in execute_with_batch_size(batch_size, start_index)
    106             end_index = start_index + batch_size
    107             end_index = min(end_index, total_size)
--> 108             return (end_index - start_index), callable(
    109                 *[(vv[start_index:end_index] if isinstance(vv, np.ndarray) and vv.ndim > 1 else vv) for vv in args],
    110                 **{kk: (vv[start_index:end_index] if isinstance(vv, np.ndarray) and vv.ndim > 1 else vv) for kk, vv in kwargs.items()},

~/Raid/conda_envs/dpmd_gpu_v2.0/lib/python3.9/site-packages/deepmd/infer/deep_pot.py in _eval_inner(self, coords, cells, atom_types, fparam, aparam, atomic, efield)
    276         else:
    277             pbc = True
--> 278             cells = np.array(cells).reshape([nframes, 9])
    279 
    280         if self.has_fparam :

ValueError: cannot reshape array of size 1 into shape (1,9)

The problem is here I think:

cell = ss['cells'].reshape((-1,1))

The v2 version seems to require the cell vector transposed. If I swap the dimensions, everything works fine:

cell = ss['cells'].reshape((1,-1))
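
The traceback shows the batching code slicing each array along its first axis (vv[start_index:end_index]), which suggests the frame index has to come first. The plain-Python sketch below uses nested lists as stand-ins for NumPy arrays to show why a (9, 1) layout leaves a single value after slicing one frame, while a (1, 9) layout keeps all nine. It is an illustration of the shapes only, not code from dpdata or DeePMD-kit.

```python
def count_entries(nested):
    """Count the scalar entries of a nested list (the 'size' of the array)."""
    if isinstance(nested, list):
        return sum(count_entries(item) for item in nested)
    return 1


cell = [float(i) for i in range(9)]  # nine cell components of one frame

as_column = [[x] for x in cell]  # shape (9, 1), like reshape((-1, 1))
as_row = [cell]                  # shape (1, 9), like reshape((1, -1))

# Batched evaluation takes frames from the first axis: array[0:1]
column_size = count_entries(as_column[0:1])  # only one scalar survives
row_size = count_entries(as_row[0:1])        # all nine components survive
```

A size of 1 is exactly what the error message "cannot reshape array of size 1 into shape (1,9)" reports.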

[BUG] When I use a script to convert OUTCAR to npy format, I encounter the following error:

dp.zip

Traceback (most recent call last):
  File "/work/wq/zcx/dtwistNiN2/0/dp/cov-outcar2dp.py", line 4, in
    dsys.to('deepmd/npy', 'deepmd_data', set_size = dsys.get_nframes())
  File "/home/wqt/anaconda2/envs/deepmd/lib/python3.9/site-packages/dpdata/system.py", line 136, in to
    return self.to_fmt_obj(load_format(fmt), *args, **kwargs)
  File "/home/wqt/anaconda2/envs/deepmd/lib/python3.9/site-packages/dpdata/system.py", line 903, in to_fmt_obj
    return fmtobj.to_labeled_system(self.data, *args, **kwargs)
  File "/home/wqt/anaconda2/envs/deepmd/lib/python3.9/site-packages/dpdata/format.py", line 77, in to_labeled_system
    return self.to_system(data, *args, **kwargs)
  File "/home/wqt/anaconda2/envs/deepmd/lib/python3.9/site-packages/dpdata/plugins/deepmd.py", line 52, in to_system
    dpdata.deepmd.comp.dump(
  File "/home/wqt/anaconda2/envs/deepmd/lib/python3.9/site-packages/dpdata/deepmd/comp.py", line 85, in dump
    coords = np.reshape(data['coords'], [nframes, -1]).astype(comp_prec)
  File "<array_function internals>", line 5, in reshape
  File "/home/wqt/anaconda2/envs/deepmd/lib/python3.9/site-packages/numpy/core/fromnumeric.py", line 298, in reshape
    return _wrapfunc(a, 'reshape', newshape, order=order)
  File "/home/wqt/anaconda2/envs/deepmd/lib/python3.9/site-packages/numpy/core/fromnumeric.py", line 57, in _wrapfunc
    return bound(*args, **kwds)
ValueError: cannot reshape array of size 0 into shape (0,newaxis)

[BUG] `virials` issue in vasprun.xml

Summary

When using dpdata to process vasprun.xml, an error will occur regarding virials: IndexError: index 0 is out of bounds for axis 0 with size 0.

Suggested solution (it works)
Two files need to be modified:
(1) ./dpdata/vasp/xml.py: delete the last output variable all_strs (around line 101).
(2) ./dpdata/system.py (around line 1284): comment out the following code, as shown below:

# for ii in range (self.get_nframes()) :
#     vol = np.linalg.det(np.reshape(self.data['cells'][ii], [3,3]))
#     self.data['virials'][ii] *= v_pref * vol

I don't know whether this will affect anything else (for other calculations), but so far it works for me.

Best,
Zhengda

[Feature Request] Converting OUTCAR not fully successful when NWRITE is specified in INCAR

Summary

When a user specifies NWRITE in INCAR before starting a VASP task, VASP may not write enough information to OUTCAR, i.e. some keywords are missing. As a result, the converted type_map.raw will be empty.

Detailed Description

In my case, for instance, as a beginner I manually specified NWRITE=1, which suppressed some vital information, including the keyword TITEL, from being written to OUTCAR. This prevented dpdata from recognizing the elements (C, Zr and W) via the keyword TITEL.
So in the end I had to manually add rows like "TITEL = PAW_PBE C 08Apr2002" as a workaround...
An excerpt of my OUTCAR (a C-Zr-W system) is attached at the end. I suggest that users be informed of this in an enhanced version of dpdata's documentation.

Further Information, Files, and Links
...
INCAR:
POTCAR: PAW_PBE C 08Apr2002
POTCAR: PAW_PBE Zr 08Apr2002
POTCAR: PAW_PBE W 08Apr2002
....
POTCAR: PAW_PBE C 08Apr2002
<Missing lots of keywords from POTCAR, including the most important TITEL>
local pseudopotential read in
partial core-charges read in
partial kinetic energy density read in
atomic valenz-charges read in
non local Contribution for L= 0 read in
real space projection operators read in
non local Contribution for L= 0 read in
real space projection operators read in
non local Contribution for L= 1 read in
real space projection operators read in
non local Contribution for L= 1 read in
real space projection operators read in
PAW grid and wavefunctions read in

number of l-projection operators is LMAX = 4
number of lm-projection operators is LMMAX = 8
...
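
Since the excerpt still contains "POTCAR:" lines that name each element, one possible fallback when TITEL is missing is to read the element symbols from those lines. The sketch below assumes the line shape "POTCAR: <functional> <element> <date>" seen in the excerpt. elements_from_potcar_lines is a hypothetical helper and does not describe how dpdata parses OUTCAR.

```python
EXCERPT = """INCAR:
POTCAR: PAW_PBE C 08Apr2002
POTCAR: PAW_PBE Zr 08Apr2002
POTCAR: PAW_PBE W 08Apr2002
POTCAR: PAW_PBE C 08Apr2002
local pseudopotential read in
"""


def elements_from_potcar_lines(outcar_text):
    """Collect element symbols from 'POTCAR:' lines, keeping first-seen order."""
    elements = []
    for line in outcar_text.splitlines():
        fields = line.split()
        # Expected shape: POTCAR: <functional> <element[_suffix]> <date>
        if len(fields) >= 3 and fields[0] == "POTCAR:":
            symbol = fields[2].split("_")[0]  # drop suffixes such as _sv
            if symbol not in elements:
                elements.append(symbol)
    return elements


elements = elements_from_potcar_lines(EXCERPT)
```

Dropping repeated symbols is a simplification: it would merge two separate species of the same element, so the result should be checked against the atom counts in the same file.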

[BUG] LabeledSystem from OUTCAR not working

Summary

The generation of a LabeledSystem from an OUTCAR file is not working (anymore, since version 0.2.6).

Running on CentOS 8 with python 3.7.9.

Steps to Reproduce

dsys = dpdata.LabeledSystem('OUTCAR.out', fmt='vasp/outcar')

The command works up to version 0.2.5; with versions >= 0.2.6 it raises an error, shown here for version 0.2.8:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-2-d285c09cc64a> in <module>
----> 1 dsys = dpdata.LabeledSystem('OUTCAR.out', fmt='vasp/outcar')

/opt/python/python-3.7.9/lib/python3.7/site-packages/dpdata/system.py in __init__(self, file_name, fmt, type_map, begin, step, data, **kwargs)
    225         if file_name is None :
    226             return
--> 227         self.from_fmt(file_name, fmt, type_map=type_map, begin= begin, step=step, **kwargs)
    228 
    229         if type_map is not None:

/opt/python/python-3.7.9/lib/python3.7/site-packages/dpdata/system.py in from_fmt(self, file_name, fmt, **kwargs)
    251         if fmt == 'auto':
    252             fmt = os.path.basename(file_name).split('.')[-1].lower()
--> 253         return self.from_fmt_obj(load_format(fmt), file_name, **kwargs)
    254 
    255     def from_fmt_obj(self, fmtobj, file_name, **kwargs):

/opt/python/python-3.7.9/lib/python3.7/site-packages/dpdata/system.py in from_fmt_obj(self, fmtobj, file_name, **kwargs)
   1011 
   1012     def from_fmt_obj(self, fmtobj, file_name, **kwargs):
-> 1013         data = fmtobj.from_labeled_system(file_name, **kwargs)
   1014         if data:
   1015             if isinstance(data, (list, tuple)):

/opt/python/python-3.7.9/lib/python3.7/site-packages/dpdata/plugins/vasp.py in from_labeled_system(self, file_name, begin, step, **kwargs)
     75                 vol = np.linalg.det(np.reshape(data['cells'][ii], [3, 3]))
     76                 data['virials'][ii] *= v_pref * vol
---> 77         data = uniq_atom_names(data)
     78         return data
     79 

/opt/python/python-3.7.9/lib/python3.7/site-packages/dpdata/utils.py in uniq_atom_names(data)
     87     data['atom_names'] = unames
     88     tmp_type = list(data['atom_types']).copy()
---> 89     data['atom_types'] = np.array([uidxmap[jj] for jj in tmp_type], dtype=int)
     90     data['atom_numbs'] = [sum( ii == data['atom_types'] ) for ii in range(len(data['atom_names'])) ]
     91     return data

/opt/python/python-3.7.9/lib/python3.7/site-packages/dpdata/utils.py in <listcomp>(.0)
     87     data['atom_names'] = unames
     88     tmp_type = list(data['atom_types']).copy()
---> 89     data['atom_types'] = np.array([uidxmap[jj] for jj in tmp_type], dtype=int)
     90     data['atom_numbs'] = [sum( ii == data['atom_types'] ) for ii in range(len(data['atom_names'])) ]
     91     return data

IndexError: list index out of range

Converting cp2k data to dp data

Hello,

I have some simulations run on cp2k and was trying to train a DP model from the AIMD data. When I tried converting the pos.xyz and *.log files to dp files, I got this error:
  File "cp2k-to-dp.py", line 2, in <module>
    g_out = dpdata.LabeledSystem("*", fmt = 'cp2k/aimd_output')
  File "/ihome/kjohnson/ska31/.virtualenvs/tensorflow/lib/python3.6/site-packages/dpdata-0.1.15-py3.6.egg/dpdata/system.py", line 945, in __init__
    self.from_fmt(file_name, fmt, type_map=type_map, begin= begin, step=step)
  File "/ihome/kjohnson/ska31/.virtualenvs/tensorflow/lib/python3.6/site-packages/dpdata-0.1.15-py3.6.egg/dpdata/system.py", line 134, in from_fmt
    func(self, file_name, **kwargs)
  File "/ihome/kjohnson/ska31/.virtualenvs/tensorflow/lib/python3.6/site-packages/dpdata-0.1.15-py3.6.egg/dpdata/system.py", line 999, in from_cp2k_aimd_output
    xyz_file=glob.glob("{}/*pos*.xyz".format(file_dir))[0]

Is this a bug, or was this code written for a certain version of cp2k?
The version that I am using is 6.1.

I did end up making changes to the source code and managed to convert these files to dp, however, there were no raw files created for forces. So I wasn't sure if my cp2k input scripts were correct.

It would be great if you could:

  1. Mention the version of cp2k to use (if there is no bug in the code)
  2. Probably provide sample examples for cp2k (where the forces are accounted for)

Best,
Sid

[BUG] !!!The newest dpdata cannot read vasp6.3.0 machine learning molecular dynamic OUTCAR to make a train set!!!

Summary

The vasp6.3.0 machine learning module changes the format of the OUTCAR file. The outcar.py of dpdata cannot read this OUTCAR, and can only read some data in vasprun.xml that is not calculated by the machine learning module.
(We use vasp6.3.0 because it greatly speeds up the calculation and generation of our training set, from a few days to a few hours. Combined with deepmd, we can quickly get dynamics results for large-scale systems.)

Deepmd-kit version=2.0.3
dpdata version =0.25

Installed via pip. The run is a very common NVT ensemble room-temperature dynamics. I think dpdata has not been adapted to the new output file format.

Code:
(dpdatanew) [js_wangyl@login1 dpdata]$ python OUTCAR1.py
/fs08/home/js_wangyl/.conda/envs/dpdatanew/lib/python3.9/site-packages/dpdata/vasp/outcar.py:87: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
  return atom_names, atom_numbs, atom_types, np.array(all_cells), np.array(all_coords), np.array(all_energies), np.array(all_forces), all_virials
OUTCAR
Traceback (most recent call last):
  File "/fs08/home/js_wangyl/work/test/dpdata/OUTCAR1.py", line 14, in <module>
    if len(ls)>0:
NameError: name 'ls' is not defined
(dpdatanew) [js_wangyl@login1 dpdata]$

platform :
CENTOS 7

We attached an OUTCAR and a vasprun.xml generated by vasp6.3.0 (using machine learning):
OUTCAR and vasprunxml.zip

Failed to read cp2k aimd files - energy difference between xyz and log

Hello,
when reading cp2k aimd outputs ( dpdata.LabeledSystem('test_read_cp2k', fmt='cp2k/aimd_output')) I found this error:

assert log_info_dict['energies']==xyz_info_dict['energies'], (log_info_dict['energies'][0], xyz_info_dict['energies'][0],'There may be errors in the file')
AssertionError: (-10087.061, -10087.062, 'There may be errors in the file')

I checked the error and it seems to be a numerical error caused by the conversion from a.u. to eV and/or by the use of floats with different precision, since the .xyz file reports -370.6925172920 and the .log file reports -370.692517291966794.

I'm running dpdata 0.2.1 installed with pip, with python 3.9.7 and numpy 1.21.2.

Thanks,
Lorenzo
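
A tolerance-based comparison would avoid this kind of failure. The sketch below is not dpdata's code: the function name energies_match and the tolerance value are assumptions chosen for illustration, and the two energies are the ones quoted in the assertion message above.

```python
import math


def energies_match(log_energies, xyz_energies, abs_tol=5e-3):
    """Return True if both energy lists agree frame by frame within abs_tol (eV).

    abs_tol is a hypothetical tolerance, picked to be larger than the
    rounding difference seen in the report.
    """
    if len(log_energies) != len(xyz_energies):
        return False
    return all(
        math.isclose(a, b, rel_tol=0.0, abs_tol=abs_tol)
        for a, b in zip(log_energies, xyz_energies)
    )


# Values taken from the AssertionError above.
log_e = [-10087.061]
xyz_e = [-10087.062]
print(log_e == xyz_e)                # exact comparison, as in the failing assert
print(energies_match(log_e, xyz_e))  # comparison with a tolerance
```

The tolerance has to stay well below the energy differences that matter for training, so the value used here is only a starting point.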

Problem reading multiple OUTCAR files

Summary
When reading multiple OUTCAR files having different numbers of frames, dpdata does not read all the files.

Detailed Description
I'm analyzing historical data and only OUTCAR files are available. When I try to read a group of seven OUTCAR files representing a single lengthy trajectory, only the first three files (each having the same number of frames) are read. The remaining OUTCAR files (having differing numbers of frames) appear to be ignored. Is there a fix for this in the current release, or would it be possible to generalize the current code to allow it to read multiple OUTCAR files each having a different number of frames? Thanks much.

Further Information, Files, and Links
Here is a portion of the script I'm using:

from dpdata import LabeledSystem,MultiSystems
from glob import glob

fs=glob('../OUTCAR.swf0.*')
ms=MultiSystems()
for f in fs:
    try:
        ls=LabeledSystem(f,fmt = 'vasp/outcar')
    except:
        print(f)
    if len(ls)>0:
        ms.append(ls)

ms.to_deepmd_raw('deepmd')
ms.to_deepmd_npy('deepmd')

This script generates the following files & directories:
box.raw coord.raw energy.raw force.raw set.000 set.001 set.002 type_map.raw type.raw virial.raw

Apparently, only the first three OUTCAR files (each having the same number of frames) were read. The last four OUTCAR files (having different numbers of frames) were apparently not read.

Each 'set' directory looks like this:
box.npy coord.npy energy.npy force.npy virial.npy
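
One thing worth checking in the loop above: when LabeledSystem fails for a file, the bare except only prints the file name, and the following len(ls) either reuses the system from the previous file or raises NameError if no file has loaded yet. The sketch below shows a loop that keeps the failures visible. It is written against a generic load callable so that it runs without dpdata; the names collect_systems and load are assumptions, not part of dpdata.

```python
def collect_systems(paths, load):
    """Load every path with `load`, keeping successes and failures apart.

    `load` is any callable that returns an object supporting len();
    with dpdata one might pass
    lambda p: LabeledSystem(p, fmt='vasp/outcar').
    """
    loaded, failed = [], []
    for path in sorted(paths):
        try:
            system = load(path)
        except Exception as exc:
            # Keep the real error instead of hiding it behind a print.
            failed.append((path, exc))
            continue  # never fall through with a stale or undefined system
        if len(system) > 0:
            loaded.append((path, system))
    return loaded, failed
```

Each system in loaded could then be passed to ms.append, and the exceptions in failed show why a given OUTCAR was skipped.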

[BUG] Failure in Gaussian's parser

After Gaussian has calculated all the cluster candidates, it terminates in step 8.
run.log reports an error ValueError: could not convert string to float: '219.870882556-1030.493229556'.

If other documents are needed, I can provide them in time.
run.log
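
The string in the error message looks like two numbers printed without a separating space. A minimal sketch of splitting such a fused token is given below; it is not the parser used by dpdata, and the helper name split_fused_floats is an assumption made for illustration.

```python
import re

# A float with a mandatory decimal point and an optional exponent.
_FLOAT = re.compile(r"[-+]?\d+\.\d*(?:[eEdD][-+]?\d+)?")


def split_fused_floats(token):
    """Split a token such as '1.5-2.5' into [1.5, -2.5]."""
    pieces = _FLOAT.findall(token)
    # Refuse anything that is not made up entirely of floats.
    if not pieces or "".join(pieces) != token:
        raise ValueError(f"cannot split {token!r} into floats")
    # Fortran-style 'D' exponents are mapped to 'E' before conversion.
    return [float(p.replace("D", "E").replace("d", "e")) for p in pieces]


print(split_fused_floats("219.870882556-1030.493229556"))
```

This only works when every number carries a decimal point, which is the case for the value in the report.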

[Feature Request] Add unit tests

Summary

Add unit tests for current modules. Some of them do not have packages installed.

Detailed Description

According to Codecov, the following code doesn't have unit tests.
system:

  • System.__str__
  • System.dump
  • System.extend
  • System.predict (#277)
  • LabeledSystem.__str__
  • MultiSystems.__len__ (fixed by #172)
  • MultiSystems.__repr__
  • MultiSystems.__str__ (fixed by #172)
  • MultiSystems.from_dir
  • MultiSystems.get_nframes (fixed by #172)
  • MultiSystems.predict
  • MultiSystems.pick_atom_idx

formats:

  • amber/md: from_system
  • ase/structure: to_system (fixed by #171)
  • ase/structure: to_labeled_system
  • pymatgen/structure: to_system (fixed by #171)
  • pymatgen/structure: to_labeled_system (fixed by #171)
  • mol: to_bond_order_system
  • sdf: to_bond_order_system (#188)
  • siesta/output: from_system
  • siesta/aimd_output: from_system

[BUG] fmt="cp2k/output" unable to read CP2K log file

Summary
Using dpdata with fmt="cp2k/output" returns the error:

Traceback (most recent call last):
  File "/scratch3/usr/felix/deepmd-wat-dcm/train-128w-36dcm-wif-interfacial-virial-240NN-manual-sel/test-lammps/36dcm/0/dcm-recal/50/generate-data.py", line 4, in <module>
    d_cp2klog.to("deepmd/raw", "dpmd_raw")
  File "/home/felix/.conda/envs/dpdata/lib/python3.10/site-packages/dpdata/system.py", line 281, in to
    return self.to_fmt_obj(load_format(fmt), *args, **kwargs)
  File "/home/felix/.conda/envs/dpdata/lib/python3.10/site-packages/dpdata/system.py", line 1026, in to_fmt_obj
    return fmtobj.to_labeled_system(self.data, *args, **kwargs)
  File "/home/felix/.conda/envs/dpdata/lib/python3.10/site-packages/dpdata/format.py", line 77, in to_labeled_system
    return self.to_system(data, *args, **kwargs)
  File "/home/felix/.conda/envs/dpdata/lib/python3.10/site-packages/dpdata/plugins/deepmd.py", line 22, in to_system
    dpdata.deepmd.raw.dump(file_name, data)
  File "/home/felix/.conda/envs/dpdata/lib/python3.10/site-packages/dpdata/deepmd/raw.py", line 64, in dump
    nframes = data['cells'].shape[0]
AttributeError: 'list' object has no attribute 'shape'

However, using fmt="cp2k/aimd_output" works well without any issue.

CP2K version: 7.1
dpdata version: 0.2.7

Steps to Reproduce
In the attached zip file:

  1. python generate-data.py uses dpdata with fmt="cp2k/output"
  2. python aimd-generate-data.py uses dpdata with fmt="cp2k/aimd_output"

Further Information, Files, and Links
50.zip

dpdata does not support vasp6.2.0 for making a data set

Summary
If I use dpdata to collect the data set from an OUTCAR calculated with vasp6.2.0, I get:
Traceback (most recent call last):
  File "outtdp.py", line 13, in <module>
    if len(ls)>0:
NameError: name 'ls' is not defined

The script dp.py I used:

from dpdata import LabeledSystem,MultiSystems
from glob import glob
"""
process multi systems
"""
fs=glob('./OUTCAR') # remember to change here !!!
ms=MultiSystems()
for f in fs:
    try:
        ls=LabeledSystem(f)
    except:
        print(f)
    if len(ls)>0:
        ms.append(ls)

ms.to_deepmd_raw('deepmd')
ms.to_deepmd_npy('deepmd')

Attached are the OUTCAR files calculated with vasp5.4.4 and vasp6.2.0. How can this be fixed?

OUTCAR5.4.4.log
OUTCAR6.2.0.log

Difference in virial data from cp2k `.log` file and `.stress` file

Summary

The virial outputs in the .log file and the .stress file written by CP2K are different, but dpdata (fmt="cp2k/output") only reads the .log file. CP2K seems to output pv_virial in the .log file but prints pv_total in the .stress file. This might be problematic for generating virial data using dpdata.

Details

STRESS TENSOR FROM .log FILE OF ONE SINGLE FRAME CALCULATION

  STRESS TENSOR [GPa]

            X               Y               Z
  X      -3.22841220      0.03875193      0.35205443
  Y       0.03875193     -2.99491119     -0.26930668
  Z       0.35205443     -0.26930668     -3.16014758

  1/3 Trace(stress tensor):  -3.12782366E+00

  Det(stress tensor)      :  -2.99521219E+01


.stress FILE OF THE SAME CALCULATION

#   Step   Time [fs]            xx [bar]            xy [bar]            xz [bar]            yx [bar]            yy [bar]            yz [bar]            zx [bar]            zy [bar]            zz [bar]
       0       0.000   -27535.1132302270      252.2330873873     3489.2605728168      252.2330873873   -25741.2101685947    -2388.1540627668     3489.2605728168    -2388.1540627668   -26927.8388906513
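
To make the mismatch concrete, the two tensors quoted above can be put on the same unit (1 GPa = 10^4 bar) and subtracted. This is only an arithmetic check of the reported numbers, not dpdata code; the variable names are chosen for illustration.

```python
GPA_TO_BAR = 1.0e4  # 1 GPa = 10^4 bar

# Stress tensor from the .log file, in GPa.
log_stress_gpa = [
    [-3.22841220, 0.03875193, 0.35205443],
    [0.03875193, -2.99491119, -0.26930668],
    [0.35205443, -0.26930668, -3.16014758],
]

# Stress tensor from the .stress file, in bar (xx xy xz / yx yy yz / zx zy zz).
stress_file_bar = [
    [-27535.1132302270, 252.2330873873, 3489.2605728168],
    [252.2330873873, -25741.2101685947, -2388.1540627668],
    [3489.2605728168, -2388.1540627668, -26927.8388906513],
]

log_stress_bar = [[v * GPA_TO_BAR for v in row] for row in log_stress_gpa]

# Element-wise difference (.stress minus .log), in bar.
difference_bar = [
    [s - l for s, l in zip(s_row, l_row)]
    for s_row, l_row in zip(stress_file_bar, log_stress_bar)
]

for row in difference_bar:
    print("  ".join(f"{v:12.3f}" for v in row))
```

The diagonal elements differ by several thousand bar after the unit conversion, so the difference between the two files is not a matter of units.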

After loading cp2k data, the result is empty

I am trying to use cp2k to obtain the data, and I have some problems that I can't fix.

  1. I prepared the pos.xyz and .log files under the output folder, and used out = dpdata.LabeledSystem('output', fmt='cp2k/aimd_output') to load them. But the result is empty; for example, print(out['atom_numbs']) gives [].

Here is 1 of 4 frames of H2O-pos-1.xyz:

       3
 i =        0, time =        0.000, E =       -17.1635848163

  O        12.2353220000        1.3766420000       10.8698800000
  H        12.4151390000        2.2331250000       11.2576110000
  H        11.9224760000        1.5737990000        9.9869940000

I attach the log file and the .inp file (I added a .md suffix so that it could be uploaded; rename it and you can open it):
H2O.log
H2O.inp.md

What should I do to fix this problem and convert it to the raw format for next deepmd training?

  2. What's the difference between fmt='cp2k/output' and fmt='cp2k/aimd_output', and which should I use?

Code format

Summary

Do we have a general Python format guideline (e.g. Black)? I noticed some format differences between files. If I use a format plugin in vscode, the actual code changes get mixed with the format changes, which can be confusing. Maybe we can fix the format issues in one commit and use a format workflow for all code submitted afterwards.

Converting gromacs data to raw files

Hello!

I am using dpdata for converting gromacs output files to raw files for deepmd-kit. I was able to get atom, atom types and coordinates from .gro file, but I was wondering how could I use dpdata to extract forces and energies? I have forces and energies in .trr & .xvg formats respectively from gromacs.

My questions are:

  1. Do I need a specific output format from gromacs to use dpdata and dump it to deepmd_raw files?
  2. If not, how would I use dpdata to do this?

Thanks,
Nisarg

Error with new version of CP2K package

Dear developers,

I'm having problems using dpdata to import files from the new version of the CP2K package. Is this an issue, or is there something I'm missing?

I reran the calculation in the ~/dpdata/tests/cp2k/aimd directory with the new CP2K version, and after that I got the problem.

Best regards,
Thank you
Filipe

VASP virial data processing

Hello.

According to the link below, the virial.raw file should contain virials in units of eV, not the virial pressure in kBar.
deepmodeling/deepmd-kit#230

Since the quantity in the VASP OUTCAR file is a pressure in units of kBar,
shouldn't it be multiplied by the volume of the box to make the 'virial.raw' file?

But I couldn't find any part of the code that multiplies by the volume.
Please let me know if I have misunderstood.

thx!
Sincerely, YJ Choi
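
The unit arithmetic behind this question can be sketched as follows: 1 kBar·Å^3 = 10^8 Pa × 10^-30 m^3 = 10^-22 J, which is about 6.2415e-4 eV. The helper below is an illustration only and is not taken from dpdata; the function names are assumptions, and the sign convention between stress and virial differs between codes and is left out here.

```python
EV_IN_JOULE = 1.602176634e-19   # exact by the 2019 SI definition
KBAR_IN_PASCAL = 1.0e8
ANGSTROM3_IN_M3 = 1.0e-30

# Energy in eV corresponding to 1 kBar * 1 Angstrom^3.
KBAR_A3_TO_EV = KBAR_IN_PASCAL * ANGSTROM3_IN_M3 / EV_IN_JOULE


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def stress_kbar_to_virial_ev(stress_kbar, cell):
    """Multiply a stress tensor (kBar) by the cell volume (Angstrom^3), giving eV."""
    volume = abs(det3(cell))
    return [[s * volume * KBAR_A3_TO_EV for s in row] for row in stress_kbar]


# Example: 1 kBar on the diagonal in a 10 Angstrom cubic box.
cell = [[10.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 10.0]]
stress = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(stress_kbar_to_virial_ev(stress, cell)[0][0])
```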

A more efficient way of reading MD trajectory

In the workflow, we do not need to read every frame of a trajectory, only the frames we want. So we should first build the following dict, which maps each trajectory to the frames needed from it:

frames_dict = {
  Trajectory0: [23, 56, 78],
  Trajectory1: [22],
  ...
}

Then, reading each trajectory:

for traj, f_idx in frames_dict.items():
    traj.read(f_idx)

For a LAMMPS trajectory or other raw text files, the read should be

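import itertools

def read(self, f_idx: list[int]):
    # Walk through the file in blocks of self.nlines lines, one block per frame.
    with open(self.fname) as f:
        for ii, lines in enumerate(itertools.zip_longest(*[f] * self.nlines)):
            # Skip frames that were not requested.
            if ii not in f_idx:
                continue
            self.process_block(lines)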

where nlines is the number of lines in each block, which should be determined in the very beginning. Usually, every frame has the same number of lines.

The process_block method should convert a LAMMPS frame to dpdata.
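
A standalone version of this idea, assuming every frame occupies exactly nlines lines, might look like the sketch below. The function name read_selected_blocks and its return format are assumptions made for illustration; it returns the raw lines of each requested frame and stops reading once the last requested frame has been seen.

```python
import itertools


def read_selected_blocks(fname, nlines, frame_indices):
    """Return {frame_index: list_of_lines} for the requested frames only."""
    wanted = set(frame_indices)
    blocks = {}
    if not wanted:
        return blocks
    last = max(wanted)
    with open(fname) as f:
        for ii in itertools.count():
            if ii > last:
                break  # nothing requested beyond this point
            block = list(itertools.islice(f, nlines))
            if len(block) < nlines:
                break  # end of file or incomplete trailing frame
            if ii in wanted:
                blocks[ii] = block
    return blocks
```

Compared with zip_longest over the whole file, this stops early and never pads an incomplete last block with None values. Frames before the last requested one are still read line by line, since a text file offers no index to jump to.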

[Feature Request] Format plugin system

Summary

It's not a good idea to put all specific format functions into the main class. We should implement a plugin system like ASE's calculator or so on, where the main class can register the plugin class.
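
As a rough illustration of what such a registration mechanism could look like, here is a minimal sketch. All names in it (register_format, load_format, Format, ToyXYZFormat, the "toy/xyz" key) are hypothetical and are not meant to describe an existing dpdata API.

```python
_FORMAT_REGISTRY = {}


def register_format(name):
    """Class decorator that registers a format plugin under `name`."""
    def decorator(cls):
        if name in _FORMAT_REGISTRY:
            raise ValueError(f"format {name!r} is already registered")
        _FORMAT_REGISTRY[name] = cls
        return cls
    return decorator


def load_format(name):
    """Instantiate the plugin registered under `name`."""
    try:
        return _FORMAT_REGISTRY[name]()
    except KeyError:
        raise ValueError(f"unknown format {name!r}") from None


class Format:
    """Base class: plugins override only the conversions they support."""

    def from_system(self, file_name, **kwargs):
        raise NotImplementedError


@register_format("toy/xyz")
class ToyXYZFormat(Format):
    def from_system(self, file_name, **kwargs):
        # Read a single xyz frame: atom count, comment line, then atoms.
        with open(file_name) as f:
            lines = f.read().splitlines()
        natoms = int(lines[0])
        atoms = [line.split() for line in lines[2:2 + natoms]]
        return {
            "atom_names": [a[0] for a in atoms],
            "coords": [[float(x) for x in a[1:4]] for a in atoms],
        }
```

With this layout the main class only needs to call load_format(fmt).from_system(...), and a new format is added by writing one decorated class.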

Detailed Description

Further Information, Files, and Links

[BUG] The order of elements affects the judgment of molecular formula

Summary

dpdata changes the order of the elements in the molecular formula, so a KeyError is raised. Formulas that differ only in element order should be recognized as the same.

Steps to Reproduce
script.py

import dpdata
import os
sys_entire = dpdata.MultiSystems().from_deepmd_npy(os.path.join("data.rest"), labeled = False)
print(i for i in sys_entire)
print(sys_entire.systems)
subsys = sys_entire['I12Pb4C4N4H24'][0]
(base) ➜ test_dpdata python3 script.py
<generator object <genexpr> at 0x7fb43b5ecb30>
{'C4H24I12N4Pb4': Data Summary
Unlabeled System
-------------------
Frame Numbers     : 21316
Atom Numbers      : 48
Element List      :
-------------------
C  H  I  N  Pb
4  24  12  4  4, 'C8H48I24N8Pb8': Data Summary
Unlabeled System
-------------------
Frame Numbers     : 1887
Atom Numbers      : 96
Element List      :
-------------------
C  H  I  N  Pb
8  48  24  8  8}
Traceback (most recent call last):
  File "/root/test_simplify/test_dpdata/script.py", line 6, in <module>
    subsys = sys_entire['I12Pb4C4N4H24'][0]
  File "/opt/anaconda3/lib/python3.9/site-packages/dpdata-0.2.8.dev12+g3968d3d.d20220706-py3.9.egg/dpdata/system.py", line 1158, in __getitem__
    return self.systems[key]
KeyError: 'I12Pb4C4N4H24'

data is too large to attach
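
One way to make the lookup independent of element order would be to normalize the key before indexing. The sketch below assumes that keys are element symbols sorted alphabetically with explicit counts, as in the 'C4H24I12N4Pb4' key printed above; the helper name canonical_formula is an assumption and not a dpdata function.

```python
import re

# An element symbol followed by an optional count, e.g. 'Pb4' or 'C'.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def canonical_formula(formula):
    """Return the formula with elements sorted alphabetically and explicit counts."""
    tokens = _TOKEN.findall(formula)
    # Reject strings that are not a plain sequence of symbol/count pairs.
    if not tokens or "".join(sym + num for sym, num in tokens) != formula:
        raise ValueError(f"cannot parse formula {formula!r}")
    counts = {}
    for sym, num in tokens:
        counts[sym] = counts.get(sym, 0) + (int(num) if num else 1)
    return "".join(f"{sym}{counts[sym]}" for sym in sorted(counts))


print(canonical_formula("I12Pb4C4N4H24"))
```

A lookup such as sys_entire[canonical_formula('I12Pb4C4N4H24')] would then use the same key that appears in sys_entire.systems, provided the assumption about the key format holds.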
