
cpptraj's People

Contributors

acruzpr, amit56r, ctlee, drroe, ericchen521, ex-rzr, fwaibl, gosldorf, hainm, halx, hanatok, jokr91, mjw99, multiplemonomials, rmcgibbo, rosswalker, slochower, swails


cpptraj's Issues

better (and nicer) data label?

Currently pytraj uses cpptraj's DataSet legends as keys for a Python dictionary.

For example:

In [3]: d = pt.multidihedral(traj, resrange=[0, 3 ,5])

In [4]: d.keys()
Out[4]:
['psi:1',
 'phi:4',
 'psi:4',
 'chip:4',
 'omega:4',
 'phi:6',
 'psi:6',
 'chip:6',
 'omega:6']

It's nice that when converting to a pandas DataFrame, we can access each key as a class attribute:

In [5]: df = pt.multidihedral(traj, resrange=[0, 3 ,5], dtype='dataframe')

In [6]: df.psi_1
Out[6]:
0    176.615564
1    166.821296
2    168.795100
3    167.425619
4    151.183350
5    134.176110
6    160.992079
7    165.112697
8    147.943321
9    145.429014
Name: psi_1, dtype: float64

So my question to discuss is:

Should we spend a bit of time renaming the labels to make them nicer? Or should I just go ahead and change them in pytraj to what I want, while cpptraj keeps its naming convention? If Dan agrees that we can change the cpptraj side, I can spend more time making suggestions about naming (although I know that @swails is really good at naming stuff). :D

For example, hbond in cpptraj uses 'UU' and 'UV' for legends. In my view, those are hard to understand and remember, especially since cpptraj prints them to the output.

And to use pandas as above, I need to replace ':' with '_' ('psi:1' to 'psi_1') and '-' with '_' in Python.
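
A minimal sketch of that pytraj-side renaming, assuming cpptraj keeps its current legend convention (the helper name is hypothetical):

import re

def sanitize_legend(legend):
    # replace ':' and '-' with '_' so the key is a valid Python identifier,
    # e.g. 'psi:1' -> 'psi_1'
    return re.sub(r'[:\-]', '_', legend)

keys = ['psi:1', 'phi:4', 'chip:4']
print([sanitize_legend(k) for k in keys])  # ['psi_1', 'phi_4', 'chip_4']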

multidihedral: supposed to calculate all supported dihedral types, but only a few are computed

hi,

this is my input

parm G5.pdb
trajin G5.pdb
multidihedral out test_all.out
multidihedral delta out test_delta.out

For the multidihedral out test_all.out line, cpptraj said

[multidihedral out test_all.out]
    MULTIDIHEDRAL: Calculating phi psi chip omega alpha beta gamma delta epsilon 
zeta nu1 nu2 chin chin dihedrals for all solute residues.
    Output to test_all.out
    Output range is -180 to 180 degrees.

But this is all I got

#Frame        gamma:1      delta:1        nu1:1        nu2:1       chin:1
       1      60.6213      96.8920     -35.6519      35.4184    -176.7765

To make sure I can get the delta values, I explicitly added multidihedral delta out test_delta.out and got:

#Frame        delta:1
       1      96.8920

Is this your intention?

Full output is here: https://gist.github.com/hainm/54883c68d13f3d0cb3a7
pdb file is here: https://github.com/pytraj/pytraj/blob/master/tests/data/Test_NAstruct/G5.pdb

thanks

Hai

"Active reference" not passed to COORDS data sets

When a COORDS data set is created, it makes a separate copy of the associated Topology. However, this means that any distance-based selection that relies on the COORDS internal Topology's reference coordinates will fail, since these are never updated. It may be time to rework the "internal reference" coordinates of Topology files used for distance-based mask selection.

vmd style for atom selection?

hi @swails @Mojyt

what do you guys think about adding VMD-style AtomMask parsing ("water and CA", ...) to cpptraj/pytraj/ParmEd? I guess both of you have already thought about this, and there must be a reason not to do it in AMBER (a much more concise syntax?).

Hai

PCA flip sign

In the process of satisfying my curiosity, I compared PCA results between cpptraj and sklearn.

Both give the same absolute result for the projection values. However, the sign is opposite.

Please check the plots: [attached images: pca_cpptraj, pca]

Is this intended? (I already tried googling "PCA flip sign cpptraj".)
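
If the only difference is an overall sign, that is consistent with the fact that eigenvectors (and hence PCA projections) are defined only up to sign. A minimal numpy check, with placeholder arrays proj_cpptraj and proj_sklearn standing in for the two projection results:

import numpy as np

# placeholder values; in practice these would be the projections from
# cpptraj and sklearn for one principal component
proj_cpptraj = np.array([1.2, -0.7, 0.3])
proj_sklearn = -proj_cpptraj

# compare the two projections up to an overall sign
sign = np.sign(np.dot(proj_cpptraj, proj_sklearn))
assert np.allclose(proj_cpptraj, sign * proj_sklearn)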

segmentation fault for TRJ dataset

This is directly related to this issue in pytraj: Amber-MD/pytraj#807 (comment)

pytraj is using DataSet_Coords_TRJ as the engine for TrajectoryIterator. I double-checked the pytraj code and don't see anything obviously wrong yet.

I tried with cpptraj and got a segmentation fault:

parm system.prmtop
loadtraj ./trunc.nc name test
crdaction test radgyr out test.dat
cpptraj -i water_issue807.in 

CPPTRAJ: Trajectory Analysis. V16.00b
    ___  ___  ___  ___
     | \/ | \/ | \/ | 
    _|_/\_|_/\_|_/\_|_

| Date/time: 09/16/15  18:26:38
| Available memory: 1686.41 MB

INPUT: Reading Input from file water_issue807.in
  [parm system.prmtop]
    Reading 'system.prmtop' as Amber Topology
  [loadtraj ./trunc.nc name test]
    Reading './trunc.nc' as Amber NetCDF
  [crdaction test radgyr out test.dat]
    Using set 'test'
    RADGYR: Calculating for atoms in mask *.
    * (5808 atoms).
 0% Segmentation fault

Note: it's OK with a CRD data set and the regular cpptraj workflow:

parm system.prmtop
trajin trunc.nc
radgyr out test.dat

arbitrarily iterate Trajectory with given frame indices: parallel benefit

Dan,

AFAIK, cpptraj does not have an option to iterate a trajectory with given frame indices (it only supports start, stop, stride).

I am thinking about parallelizing CpptrajState for simple actions.

For example

parm 2koc.parm7
trajin md0.nc 0 1000 10
trajin md1.nc 0 1000 10
trajin md2.nc 0 1000 10
....
ref restart.nc
autoimage
rms reference @CA
distance :1 :2

In pytraj, the above script will look like this:

traj = pt.iterload('md*.nc', '2koc.parm7', frame_slice=[(0, 1000, 10),]*3)
data = pt.load_batch(traj, '''
ref restart.nc
autoimage
rms reference @CA
distance :1 :2
''', n_cores=8)

So my wish is to have frame_indices = range(traj.n_frames): if using n_cores=8, pytraj will create 8 CpptrajStates, one per core, and let each state perform the calculation on a chunk of frame indices.

What's your advice here?

My idea is quite simple: just make the parallel run more abstract (a rough sketch follows the list below):

  • load a bunch of files by iterload with arbitrary frames
  • tell pytraj/cpptraj to run the job in N nodes
  • take the data
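
A rough sketch of the chunking side of this idea, assuming a hypothetical worker_run function that builds its own CpptrajState in each process:

import numpy as np

n_frames = 1000
n_cores = 8
# one chunk of frame indices per core
chunks = np.array_split(np.arange(n_frames), n_cores)

def worker_run(frame_indices):
    # hypothetical: create a CpptrajState here, run the requested actions
    # only on these frame indices, and return the resulting datasets
    pass

# e.g. results = multiprocessing.Pool(n_cores).map(worker_run, chunks)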

turn off verbose output for ensemble runs

I am playing with REMD stuff and CpptrajState.

For the trajin keyword, I get no verbose output (since pytraj turns it off):

state = pt.datafiles.load_cpptraj_state('''
        parm ala2.99sb.mbondi2.parm7
        trajin rem.nc.000
        rms
        ''')

state.run()
print(state.data)

--> output <pytraj.datasets.DatasetList - 1 datasets>

but when using ensemble, I get a bunch of lines.
Input code:

state = pt.datafiles.load_cpptraj_state('''
        parm ala2.99sb.mbondi2.parm7
        ensemble rem.nc.000 remdtraj remdtrajtemp 300.
        rms
        ''')
state.run()
print(state.data)

output:

TIME: Run Initialization took 0.0001 seconds.

BEGIN ENSEMBLE PROCESSING:
        ENSEMBLE: OPENING 4 REMD TRAJECTORIES
.....................................................
ACTION SETUP FOR PARM 'ala2.99sb.mbondi2.parm7' (1 actions):
  0: [rms]
        Target mask: [*](22)
        Reference mask: [*](22)
----- rem.nc.000 (1-10, 1) -----
 0% 11% 22% 33% 44% 56% 67% 78% 89% 100% Complete.

Read 10 frames and processed 10 frames.
TIME: Trajectory processing: 0.0008 s
TIME: Avg. throughput= 12626.2626 frames / second.

ENSEMBLE ACTION OUTPUT:

DATASETS:
  4 data sets:
        RMSD_00000%0 "RMSD_00000" (double, rms), size is 10
        RMSD_00001%1 "RMSD_00001%1" (double, rms), size is 10
        RMSD_00002%2 "RMSD_00002%2" (double, rms), size is 10
        RMSD_00003%3 "RMSD_00003%3" (double, rms), size is 10
---------- RUN END ---------------------------------------------------
---------- RUN BEGIN -------------------------------------------------
Warning: No actions/output trajectories specified.

DATASETS:
  4 data sets:
        RMSD_00000%0 "RMSD_00000" (double, rms), size is 10
        RMSD_00001%1 "RMSD_00001%1" (double, rms), size is 10
        RMSD_00002%2 "RMSD_00002%2" (double, rms), size is 10
        RMSD_00003%3 "RMSD_00003%3" (double, rms), size is 10
---------- RUN END ---------------------------------------------------
<pytraj.datasets.DatasetList - 4 datasets>

add `parm` to cpptraj input for testing

I would prefer cpptraj's test inputs to have consistent syntax; it's easier to parse.

For example, this is a good traj.in that pytraj can use to compare data with cpptraj:

cat traj.in

parm 2koc.parm7 # need to have parm here
trajin traj.x
....

pytraj will use the above input to create a CpptrajState:

(dir: cpptraj/test/Test_DRMSD)

In [8]: state = pt.load_cpptraj_file('./drmsd.in')

In [9]: state.run()
Out[9]: 0

In [10]: state.datasetlist.values
Out[10]:
array([[  0.00000000e+00,   2.95729211e+00,   4.49201771e+00, ...,
          5.18656700e+00,   5.37325298e+00,   4.82324321e+00],
       [  0.00000000e+00,   4.02238772e+00,   6.57384971e+00, ...,
          9.72083983e+00,   1.02589014e+01,   9.38592381e+00],
       [  2.43182129e-07,   4.01623189e+00,   6.41421043e+00, ...,
          8.27504991e+00,   8.19405473e+00,   7.77917637e+00],
       [  0.00000000e+00,   2.95729211e+00,   4.49201771e+00, ...,
          5.18656700e+00,   5.37325298e+00,   4.82324321e+00]])

For example:

  • this test does not have parm in its input: cpptraj/test/Test_IRED
  • this one does: cpptraj/test/Test_DRMSD

If you think this is an OK proposal, I will remind you in the future when you add more test folders.
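
To illustrate why the consistent syntax helps, a minimal sketch (the function name is hypothetical) of the kind of check pytraj could run over a cpptraj test input:

def has_parm_line(path):
    # True if the cpptraj input declares its own topology, so pytraj can
    # replay it directly via load_cpptraj_file / CpptrajState
    with open(path) as fh:
        return any(line.split()[:1] == ['parm'] for line in fh)

# e.g. has_parm_line('cpptraj/test/Test_DRMSD/drmsd.in') should be True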

sync with Amber master?

Hi @Mojyt

can you please sync the GitHub version with Amber master? I guess you have made significant changes (like the Trajout class, ...) but I have not seen them on GitHub yet. I need to follow your changes too.

Hai

could not load pdb from `reduce` program

both cpptraj and ParmEd could not open the SAM_addH.pdb file (output from the reduce program). I already opened an issue in the ParmEd repo and am opening a new one here (with the message changed a bit).

cpptraj -p SAM_addH.pdb

CPPTRAJ: Trajectory Analysis. V16.00b
    ___  ___  ___  ___
     | \/ | \/ | \/ |
    _|_/\_|_/\_|_/\_|_

| Date/time: 08/17/15  19:10:37
| Available memory: 2196.01 MB

Error: Could not determine format of topology 'SAM_addH.pdb'
Error: Could not open topology 'SAM_addH.pdb'

This is how I got the pdb file.

import parmed as pmd

p = pmd.download_PDB('3gx5')
p[':SAM'].save('SAM.pdb')

then using reduce to get SAM_addH.pdb

$AMBERHOME/bin/reduce $pdbname.pdb > ${pdbname}_addH.pdb

all files are here: https://github.com/hainm/fftools/tree/master/sam

Note: both cpptraj and ParmEd don't recognize the pdb file, but VMD can.

add set_frame_pointer too?

I am trying this issue (#44) from your branch.
So far it works great. So I am re-designing the in-memory Trajectory as a wrapper around a numpy array (instead of a vector of Frames). The iteration time is still a bit slow compared to iterating the raw numpy array:

In [58]: t0
Out[58]: <pytraj.api.Trajectory with 1000 frames, 17443 atoms>

In [59]: %timeit  for frame in t0: pass
100 loops, best of 3: 8 ms per loop

In [60]: %timeit for xyz in t0.xyz: pass
1000 loops, best of 3: 315 µs per loop

In [61]: 8000 / 315.
Out[61]: 25.396825396825395

This is OK for a single loop (for frame in traj), but quite slow for a nested loop (for frame0 in traj: for frame1 in traj), about 8 seconds for the above example.

This is a bit slow because pytraj needs to create a new Frame (as a view) for each iteration, I think.

So I propose adding set_frame_pointer (or any name you prefer):

void Frame::set_frame_pointer(double* ptr) {
    // make this Frame a view into externally owned coordinate memory
    X_ = ptr;
    memIsExternal_ = true;
}

With this method, pytraj would only need to allocate a Frame once:

xyz0 = traj.xyz
# create a frame view pointing to the 1st element of the memory block
frame = Frame(n_atoms, xyz0[0])
for xyz in traj.xyz:
    frame.set_frame_pointer(xyz)  # re-point the view; no new allocation

What do you think?

add more test for cpptraj

I (again) encourage uploading more of cpptraj's tests.

We could make an 'Amber-MD/CpptrajAdditionalTestSuite' repo (or any other interesting name)

and add this to the .travis.yml file:

git clone https://github.com/Amber-MD/CpptrajAdditionalTestSuite
cd CpptrajAdditionalTestSuite && make test

Benefit

  • pytraj can pull and test
  • minimize breaking cpptraj's code when someone makes a PR
  • travis does all the tests, so we don't 'forget'

Updated
Some of the tests I would like to have on GitHub:

  • lifetime

Action_Density, write to DataSets

I am working on this Action to write data to a DataSet. Any suggestions on which DataSet type I should use?

(It's OK to just say "figure it out yourself".)

Action_Mask: dataset

What do you think about adding Datasets for this Action? If you think it's ok, I can spend time exploring. :D

Improve SPAM functionality

Current functionality can be improved. From @swails: "The version included in AmberTools 14 and 15 is more recent, but it is still not well-automated -- it's work in progress. The suggested method for running SPAM calculations are to use the "volmap" command in cpptraj to identify peaks in solvent density, then feed those peaks to the "spam" command which can be used to reorganize water molecule indices (so the same water appears in the same site every frame) and compute an energy-shifted electrostatic and van der Waals interaction for that water.

It then dumps a time series of energies for each water site, which you need to post-process (using numpy and scipy, for instance) into a free energy (from which you can back-out the enthalpy and entropy). The code in SPAM.py provides code for turning this energy time-series into free energies, although SPAM.py relies on a patched cpptraj from AmberTools12. You can use that as a template. Hopefully SPAM.py can be updated to work with the latest version of AmberTools and automate the part of the procedure that cpptraj doesn't yet."

simple pytraj test for cpptraj

Hi @Mojyt

I've just finished updating pytraj to catch up with cpptraj-dev. Please follow these steps to run the tests:

  1. export CPPTRAJHOME (pytraj will find the header files and libcpptraj in $CPPTRAJHOME/src and $CPPTRAJHOME/lib/)
  2. Install pytraj (works well with Python 2.7, 3.3, 3.4)
  3. add libcpptraj to LD_LIBRARY_PATH before running pytraj
  4. Run simple test
    • from anywhere: python -c 'import pytraj; pytraj.run_tests()'
    • from pytraj root folder:
      • cd tests
      • python ./run_simple_test.py

Note: pytraj's compile time (linking to libcpptraj.so) is currently a few times slower than compiling libcpptraj itself. :D

Let me know if you have any questions. Thanks.

Hai

wrong number of residues when reading a pdb file from babel

pdb file is here: https://github.com/hainm/pytraj/blob/master/tests/data/A.pdb

> resinfo
#Res  Name First  Last Natom #Orig  #Mol
    1 A        1     1     1     1     1 A
    2 A        2    35    34     1     2 A
    3 A       36    36     1     1     3 A

full output

$ cpptraj -p A.pdb

CPPTRAJ: Trajectory Analysis. V16.00b
    ___  ___  ___  ___
     | \/ | \/ | \/ | 
    _|_/\_|_/\_|_/\_|_

| Date/time: 08/05/15  11:27:42
| Available memory: 688.141 MB

    Reading 'A.pdb' as PDB File
Warning: Malformed CONECT record: CONECT    3                                                           
Warning: Malformed CONECT record: CONECT    7                                                           
Warning: Malformed CONECT record: CONECT    8                                                           
Warning: Malformed CONECT record: CONECT   10                                                           
Warning: Malformed CONECT record: CONECT   12                                                           
Warning: Malformed CONECT record: CONECT   14                                                           
    A.pdb: determining bond info from distances.
Warning: A.pdb: Determining default bond distances from element types.
Warning: 2 or more molecules share residue numbers.
Warning:   Either residue information is incorrect or molecule determination was inaccurate.
Warning:   Basing residue information on molecules.
Warning:   Old # residues= 1, new # residues = 3

Note: ParmEd detects it correctly.

In [2]: p = pmd.load_file('./data/A.pdb')

In [3]: p.residues
Out[3]: 
ResidueList([
    <Residue A[1]; chain=A>
])

conda build for libcpptraj on travis

What do you think about adding a 2nd build on Travis for libcpptraj?

My idea is to build libcpptraj whenever you successfully merge into the upstream repo, then push it to https://anaconda.org/ambermd.

Whenever I build pytraj, I would only need to conda install -c ambermd libcpptraj. pytraj will be pushed to https://anaconda.org/ambermd too.

If you're OK with this, I can make a PR. This is a sample build script in pytraj for libcpptraj, but I would prefer to have it in cpptraj:

https://github.com/Amber-MD/pytraj/tree/master/devtools/conda_recipe/libcpptraj

Let me know if anything is unclear.

trajout: int Trajout_Single::SetupTrajWrite

Dan,

previously, with this method in cpptraj,

int Trajout_Single::SetupTrajWrite(Topology* tparmIn)

pytraj was able to open a trajout by calling SetupTrajWrite, then iterate over Frames and write them, then close, without knowing ahead of time how many frames there would be.

Is there any reason to change this behavior?

Internal Error: Adding DataSet test copy to invalid list.

So I am having trouble adding an Analysis class to a CpptrajState.

I just created a State and added a MatrixDouble data set with the name 'mat':

In [105]: s2
Out[105]:
CpptrajState, include:
<datasetlist: 1 datasets>

In [106]: s2.datasetlist[0]
Out[106]: <pytraj.datasets.DatasetMatrixDouble: size=666, key=mat>

I just want to add an Analysis_Matrix class to analyze the matrix:

In [109]: s2.add_analysis(Analysis_Matrix(), ArgList('matrix mat name test'))
Internal Error: Adding DataSet test copy to invalid list.
Error: Could not setup analysis [matrix]
Out[109]: 1

It's clearly that I don't understand how cpptraj works at all. Can you point me where I can read? thanks.

reorganize cpptraj folder

I got lost when starting to look at cpptraj's code in the beginning, mostly because there are so many files in a single folder. I think it's a good idea to reorganize it. Just to give an example: plumed, which has a similar coding style to cpptraj.

Another advantage is that it would be much easier to browse the code on a phone (it took very long thumb swiping for me to read the Trajout_Single.cpp file, near the end).

PS: to avoid issues falling off your (Dan's) radar, you can create different labels for different topics, just like in the pandas repo. In my opinion (and Jason's too?), issues on GitHub are much more convenient than a Google Doc.

AddSet(DataSet*): how can I get this right?

hi @Mojyt,

I am playing with the AddSet methods of the DataSetList class in pytraj. I have three tests here and expected to get size=1 for the DataSetList after using the AddSet(DataSet*) method (where size is the number of DataSets in the DataSetList). However, the first two tests resulted in size=0. I'm lost here. Can you tell me about this? Thanks.

def test_0(self):
    dset_traj = DataSet_Coords_TRJ()
    dslist = DataSetList()

    # wrapper of "AddSet(DataSet*)
    dslist.add_existing_set(dset_traj)
    print (dslist.size) # = 0, but I expected "=1"

def test_1(self):
    dset_traj = DataSet_MatrixDbl()
    dslist = DataSetList()

    # wrapper of "AddSet(DataSet*)
    dslist.add_existing_set(dset_traj)
    print (dslist.size) # = 0, but I expected "=1"

def test_2(self):
    dslist = DataSetList()

    # wrapper of "AddSet(DataType, name, default_name)"
    dslist.add_set("coords", "name", "funny_name")
    print (dslist[0])
    print (dslist.size) # = 1 (is what I expected)

(You can pull Amber master, recompile pytraj, and go to the tests folder to run python ./test_DataSetList_add_set_question.py.)

how to tag ensemble run for Dispatch?

So I have two situations with identical input that give different results.

input

parm ala2.99sb.mbondi2.parm7
trajin rem.nc.000 remdtraj remdtrajtemp 300.
rms out test.dat

If I use Command::ProcessInput(CpptrajState&, std::string const&) to load the input into a CpptrajState, I get what I expect (an rmsd calculation for only the 300 K frames).

However, if I use Command::Dispatch to dispatch each line of the above input (with the CpptrajState), I get the rmsd for all of rem.nc.000 instead.

Looking at cpptraj's code, the TrajinList needs to be tagged as 'ENSEMBLE' to get the correct result.

So my questions are:

  • how to tag it with the current CpptrajState?
  • is it possible to make RunEnsemble so we don't need to tag at all?
  • what's your best solution?

in-memory reference

like we discussed a while ago about reference structures (via email), currently cpptraj uses references loaded from disk.

But it would be more convenient to use an in-memory reference (especially in pytraj). For example, we could create a reference structure on the fly from any source (by loading xyz coordinates, e.g. from Gaussian output) and use that reference for extra actions.

(PS: I am raising this issue again after looking at Dan's example:

parm ../tz2.parm7
reference ../tz2.rst7
trajin pp2.rst7.save
# Apply backbone dihedrals from reference structure residues 1-13 to residues 1-13
makestructure "ref:1-13:tz2.rst7"

PS2: @Dan, you can search for "how to use 'reference' without loading from file" in your email.)
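
A purely hypothetical sketch of the desired usage (pt.make_reference and the ref= plumbing are illustrative names, not existing API):

import numpy as np
import pytraj as pt

xyz = np.loadtxt('gaussian_coords.dat')     # assumed (n_atoms, 3) coordinates
traj = pt.iterload('md.nc', 'my.parm7')     # placeholder file names

ref = pt.make_reference(xyz, top=traj.top)  # hypothetical in-memory reference
data = pt.rmsd(traj, ref=ref, mask='@CA')   # use it like a normal reference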

new Frame constructor: wrong rmsd result for ActionList

Dan,

I am using ActionList to calculate rmsd for a series of masks. It works fine (I get what I expect) with the immutable TrajectoryIterator, but I get 0.0 results when using the mutable Trajectory (numpy-based). Any idea why?

Note that if I use Action_Rmsd directly, the result is correct.

Here is my Python code.

This gives correct results for both the mutable and immutable Trajectory:

        def test_rmsd(input_traj):
            from pytraj.actions.CpptrajActions import Action_Rmsd
            from pytraj.datasets import DataSetList
            dslist = DataSetList()
            act = Action_Rmsd()
            act.read_input('first @CA', top=input_traj.top, dslist=dslist)
            act.process(input_traj.top)

            for frame in input_traj:
                act.do_action(frame)
            print(dslist.values)

This gives correct results for the immutable Trajectory but wrong results for the mutable Trajectory:

        def test_rmsd_actlist(input_traj):
            from pytraj.actions.CpptrajActions import Action_Rmsd
            from pytraj.core.ActionList import ActionList
            from pytraj.datasets import DataSetList

            alist = ActionList()
            dslist = DataSetList()
            act = Action_Rmsd()
            alist.add_action(act, 'first @CA', top=input_traj.top, dslist=dslist)

            for frame in input_traj:
                alist.do_actions(frame)
            print(dslist.values)

wrapping Trajin or DataSet_Coord_TRJ

Dan,

pytraj uses cpptraj's Trajin class for its TrajectoryIterator. Now I have had time to look more closely at DataSet_Coords_TRJ, and I find it more interesting:

  • it can hold multiple trajectories and can randomly access frames.
    • so in pytraj, I just need to create:
traj = DataSet_Coords_TRJ()
traj.top = io.load("./myparm.top")
traj.load(a_list_of_file_names)
  • it has "void GetFrame(int idx, Frame& fIn, AtomMask const& mIn);" so I can move mask selection from the pytraj level to the cpptraj level for the frame iterator (for frame in traj(mask='@CA')).

What do you think about using DataSet_Coords_TRJ for TrajectoryIterator? Are there any technical issues I should expect if I implement this?

Hai

remlog command

Can we try to make the remlog command work with pH-REMD style remlogs?

Expand mask syntax

This is related to ParmEd/ParmEd#209.

I would like to expand the mask parser to be able to select molecules, chain IDs, etc.

Current proposals from @swails

A couple of ideas:

  1. Do what Chimera does (see https://www.cgl.ucsf.edu/chimera/docs/UsersGuide/midas/frameatom_spec.html).
  2. Use :: for chain IDs (in the same way that @% indicates atom type names).

An example of 1:

# Select residues 1-10 in chain A
:1-10.A
# Select residues 1-10 in chains A, B, and C
:1-10.A-C # I *think*; not sure if you can use - for string range, maybe .A,B,C
# etc...

An example of 2:

# Select chain A
::A
# Select residues 1-10 of chains A, B, and C
::A,B,C:1-10
# Select protein backbone atoms of residues 2-20 in chains A and B
::A,B:2-20@CA,C,O,N

Another idea for flexibility is to say that chains are strings, and use the same "chain" syntax to denote molecules when the chain is given a numerical value.

My Thoughts

I do like 2) for chain IDs - I think we could potentially get into trouble with the . syntax. I don't know why, but for some reason I feel I've come across atom names with . in them, although nothing comes to mind right now.

For molecule selection, personally I like the idea of having a separate character that can represent molecules, the way there is currently @ for atoms and : for residues. Based on what is currently used in the mask parser, I think that the most likely candidates for "molecule" are ; and %. Both have drawbacks: ; looks enough like : that things could easily get confused (and it's on the same key), while % is already used in a subset of @ to denote "type", so it would have different meanings in different contexts.

Whatever is decided, it should be consistent between ParmEd, Cpptraj, and the rest of Amber.
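
For concreteness, a toy Python sketch of how a mask in the proposed '::' chain syntax (idea 2 above) could be split into chain/residue/atom parts; this is illustrative only, not how cpptraj's mask parser actually works:

import re

# optional '::chains', optional ':residues', optional '@atoms'
MASK_RE = re.compile(r'(?:::(?P<chains>[^:@]+))?(?::(?P<residues>[^@]+))?(?:@(?P<atoms>.+))?$')

def split_mask(mask):
    parts = MASK_RE.match(mask).groupdict()
    return {k: (v.split(',') if v else []) for k, v in parts.items()}

print(split_mask('::A,B:2-20@CA,C,O,N'))
# {'chains': ['A', 'B'], 'residues': ['2-20'], 'atoms': ['CA', 'C', 'O', 'N']}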

lifetime stdout

Dan,

I have implemented lifetime (from cpptraj's analyses) in pytraj now. It seems that the statistics are printed to stdout:

In [14]: pdb = pt.load_pdb_rcsb("1l2y")

In [15]: dslist = pdb.search_hbonds()

In [22]: pt.common_actions.lifetime(dslist[1])
#Set       Nlifetimes      MaxLT      AvgLT  TotFrames SetName
         0          2          1     1.0000          2 d0
Out[22]: array([ 1.])

it seems that cpptraj's manual does not have info about dumping these statistics to a data set.

In [17]: pt.info("lifetime")
        [out <filename>] <dsetarg0> [ <dsetarg1> ... ]
        [window <windowsize> [name <setname>]] [averageonly]
        [cumulative] [delta] [cut <cutoff>] [greater | less] [rawcurve]
        [fuzz <fuzzcut>] [nosort]
  Calculate lifetimes for specified data set(s), i.e. time that data is
  either greater than or less than <cutoff> (default: > 0.5). If <windowsize>
  is given calculate lifetimes over windows of given size.

SetupX and SetX

do you think it's a good idea to have better names for these two classes of routines?

this is from cpptraj's doc in the Frame class:

  * In addition to the constructors, there are two classes of routine that
  * can be used to set up Frames. The SetupX routines do any memory allocation,
  * and assign masses, and the SetX routines assign coordinates/velocities. The
  * SetX routines will dynamically adjust the size of the frame up to maxnatom,
  * but no reallocation will occur so the frame should be set up for the largest
  * possible # of atoms it will hold. This avoids expensive reallocations.
  * The representation of coordinates (X) and velocities (V) are double*
  * instead of STL vectors so as to easily interface with the FileIO routines
  * which tend to be much faster than iostream ops.

This is only a minor thing since they are used internally.

git stuff

So you (Dan) are on GitHub now. Here are several things you might want to know (if you don't already).

basically, you can write code nicely like this:

print("hello world")

[attached image: amberpic_gpu_smooth]

adding --shared and libcpptraj

I am going to release pycpptraj-0.1 and it will be included in AMBER in the near future. It would be really great if you could add an option to install libcpptraj for dynamic linking.

Thanks

Hai

cpptraj/pytraj todolist

I created this issue to take note of new code/features that should be in cpptraj for pytraj's convenience. Hopefully Dan doesn't mind.

CpptrajState.h

add GetTrajinList(), GetActionList(), GetAnalysisList(), GetTrajoutList

turn off verbose for clustering and my questions about clustering

hi,

this is what pytraj gets when calling kmeans in cpptraj:

In [4]: from pytraj.cluster import kmeans

In [5]: kmeans(traj, n_clusters=3)
#Clustering: 3 clusters 10 frames
#Cluster 0 has average-distance-to-centroid 5.643003
#Cluster 1 has average-distance-to-centroid 5.645873
#Cluster 2 has average-distance-to-centroid 4.141970
#DBI: 1.298084
#pSF: 3.236341
#Algorithm: Kmeans nclusters 3 maxit 100
#Representative frames: 9 5 1

Out[5]: array([2, 2, 1, 1, 1, 1, 0, 0, 0, 0], dtype=int32)

Although pytraj turns off cpptraj's verbose output, I still get the stdout output above.

  • Can we turn it off? (If yes, I can make a PR, or it's OK if it's quicker for you to do it.)
  • Do you intend to include clustering info (average distance to centroid, rep frame number, ...) in a DataSet? (Currently they are only printed to stdout.)
  • I have an idea to try delaying the writing of cluster trajectories to disk (only doing it when Print is called). What do you think about the performance?
  • I would like to construct a pytraj workflow like this:
    • load a traj
    • perform clustering, getting the frame indices for each cluster and the rep frame number
    • after playing with other stuff, call write_cluster for specific clusters (for example, the top 5)

What do you think?

Example code: mostly I just need to write to pdb files to view in VMD. So for rep frames, I can write them all to a single file like this (a fuller sketch follows below):

rep_frame_indices = get_from_clustering...
traj[rep_frame_indices].save('reps.pdb', mode='model')
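
A rough sketch of that workflow, assembled only from calls that already appear on this page (file names are placeholders; writing to disk happens only at the end):

import pytraj as pt

traj = pt.iterload('md*.nc', 'my.parm7')           # load
labels = pt.cluster.kmeans(traj, n_clusters=3)     # per-frame cluster labels

# frame indices belonging to cluster 0 (e.g. one of the top clusters)
cluster0_indices = [i for i, c in enumerate(labels) if c == 0]

# write those frames to a single multi-model pdb for viewing in VMD
traj[cluster0_indices].save('cluster0.pdb', mode='model')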
