Comments (6)
Thanks for noticing these issues. The `M END` line seems to be missing indeed.
I'm a bit confused by your second point. When connectivity information is present in the IOData instance, it should be dumped. Can you give an example showing this problem? That may clarify the issue.
Yes, good point; I didn't make it clear. When connectivity information is not available, as in the XYZ file format, no such information is dumped to the SDF file. So I think we need to figure out a good way of generating the connectivity. I know Open Babel handles this very well.
It's a bit of mission creep... We can dump connectivity when we have it, but defining it would be a utility external to IOData, I think. It's implicit in GOpt, and we could use that to define connectivity to the extent we need it. Perhaps that would require splitting off a connectivity utility from GOpt.
I agree with Paul. IOData does (at least for now) not attempt to guess where bonds are because it goes beyond the original scope of reading and writing data.
If we decide to extend the scope, there should also be some discussion on how far we'd like to go. I'll try to make a few guesses. Just detecting connectivity (without trying to guess the types of bonds) can be done with relatively little code (~15 lines) and a table of covalent radii. For PDB files, that would be fine. For the SDF format it would not, because SDF also describes the type of each bond in order to represent a Lewis structure. Trying to guess a Lewis structure from the connectivity is quite complex, and existing algorithms tend to break on exotic molecules. (Even humans don't always agree.) Such an algorithm would go quite far beyond the scope of IOData. Openbabel, RDKit and OpenEye have advanced solutions for this. You can also try to use variations in bond length to detect the bond order, but that would require well-optimized geometries. Effects from level of theory, basis set or just internal strain may be enough to break the algorithm.
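To make the "~15 lines and a table of covalent radii" remark concrete, here is a minimal sketch. It is not part of IOData; the function name `guess_connectivity`, the `scale` tolerance of 1.2 and the small radii table (approximate values in angstrom for a handful of elements) are all assumptions for illustration. Coordinates are assumed to be in angstrom, so anything stored in bohr would need converting first.

```python
from itertools import combinations
from math import dist

# Approximate covalent radii in angstrom, keyed by atomic number.
# Illustrative subset only; a real utility would need a full table.
COVALENT_RADII = {1: 0.31, 6: 0.76, 7: 0.71, 8: 0.66, 9: 0.57, 16: 1.05, 17: 1.02}


def guess_connectivity(atnums, coords, scale=1.2):
    """Return (i, j) pairs with i < j whose distance is below scale * (r_i + r_j).

    Only says which atoms are bonded; it makes no attempt at bond orders.
    """
    bonds = []
    for (i, pos_i), (j, pos_j) in combinations(enumerate(coords), 2):
        cutoff = scale * (COVALENT_RADII[atnums[i]] + COVALENT_RADII[atnums[j]])
        if dist(pos_i, pos_j) < cutoff:
            bonds.append((i, j))
    return bonds


# Water: O-H distances of about 0.96 angstrom, H-H about 1.51 angstrom.
water_atnums = [8, 1, 1]
water_coords = [(0.0, 0.0, 0.1173), (0.0, 0.7572, -0.4692), (0.0, -0.7572, -0.4692)]
print(guess_connectivity(water_atnums, water_coords))  # [(0, 1), (0, 2)]
```

This only answers "which atoms are connected"; it says nothing about bond types, which is exactly the part that SDF needs and that is hard to guess.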
In any case, I'd suggest fixing one thing at a time. If you can make a PR fixing the `M END` issue, that would already be very welcome, irrespective of the connectivity discussion.
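For whoever picks up that fix, a small check like the following could help confirm it. It is only a sketch: `records_missing_m_end` is a hypothetical helper, not an IOData function. It relies on two facts about the format: SDF records are separated by a `$$$$` line, and the molfile block inside each record ends with a line written as `M  END` (two spaces between `M` and `END`).

```python
def records_missing_m_end(sdf_text):
    """Return the zero-based indices of SDF records that have no 'M  END' line.

    Records are taken to end at a '$$$$' delimiter line; trailing text
    without a closing delimiter is ignored.
    """
    missing = []
    record_lines = []
    index = 0
    for line in sdf_text.splitlines():
        if line.strip() == "$$$$":
            # End of one record: look for the molfile terminator in it.
            if not any(item.rstrip() == "M  END" for item in record_lines):
                missing.append(index)
            record_lines = []
            index += 1
        else:
            record_lines.append(line)
    return missing


# Two toy records: the first has the terminator, the second does not.
good = "title\n\n\n  0  0  0  0  0  0  0  0  0  0999 V2000\nM  END\n$$$$\n"
bad = "title\n\n\n  0  0  0  0  0  0  0  0  0  0999 V2000\n$$$$\n"
print(records_missing_m_end(good + bad))  # [1]
```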
So to be clear, I wouldn't be averse to having a stand-alone utility that had the functionality:
- Input: `iodata` instance without connectivity.
- Output: `iodata` instance with connectivity generated by `RDKit`, `OpenBabel`, etc. Or even just covalent radii and interatomic distances.
I wouldn't want to include this in `iodata` (except maybe the last one) because it adds major external dependencies and goes beyond the simple mandate of `iodata`, which is averse to "internal computation" and focused on actual input/output. Once one starts trying to duplicate RDKit and OpenBabel, one has gone down a whole different (very interesting, but very difficult) rabbit hole.
It is a fascinating problem, though. I thought a little bit about the problem of generating atom/bond types this morning (for fun) and, wow, what a mess. Especially as we are interested in structures that are not necessarily equilibrium structures, coming up with anything sensible would be very difficult, except maybe for relatively simple organic compounds and inorganic molecules involving only elements from Groups 1, 2, 16, 17, 18. Even in such easy cases, what does one do with things like sulfur hexafluoride? One would almost need to run a semiempirical calculation (or minimal basis set HF) and then post-process the data to be reliable, and then one is really truly in the `HORTON` landscape, not merely `IOData`.
Also, (for now) `iodata` is mostly (not exclusively, obviously) focussed on ab initio quantum calculations, and having atom/bond types ends up being mostly useful for molecular mechanics and some types of semiempirical calculations.
Thanks for the comments! @PaulWAyers @tovrstra
Given that this problem is beyond the scope of our `IOData`, let's leave this for `RDKit` or `OpenBabel`.
That makes things very clear to me for now. I will fix the missing tag issues shortly and make a new PR.