
nistchempy's People

Contributors

ivanchernyshov

nistchempy's Issues

Synonyms split incorrectly

For aspirin (ID = C50782), instead of

['Benzoic acid, 2-(acetyloxy)-', 'Salicylic acid acetate']

the synonyms are returned as:

['Benzoic', 'acid,', '2-(acetyloxy)-', 'Salicylic', 'acid', 'acetate']
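
The output above looks like the result of splitting the raw synonym string on whitespace. A minimal sketch of a delimiter-based split follows. It assumes the synonyms arrive as one semicolon-separated string; `split_synonyms` is a hypothetical helper, not part of NistChemPy.

```python
def split_synonyms(raw):
    """Split a raw synonym string on ';' (assumed delimiter).

    Spaces and commas inside a single name are preserved.
    """
    return [name.strip() for name in raw.split(';') if name.strip()]


# Illustrative input: the two expected synonyms joined by the assumed delimiter.
raw = 'Benzoic acid, 2-(acetyloxy)-; Salicylic acid acetate'
print(split_synonyms(raw))
```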

Add support for all available properties

NistChemPy 0.1.1 processes only mol2d, mol3d, ir, ms, and uvvis. It would be nice to support the other properties as well, each with a dedicated short key in the data_refs dictionary.
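
One way to picture this is a lookup table from WebBook section titles to short keys. The sketch below is illustrative only: the section titles are ones that show up as data_refs keys, while the short keys (`thermo_gas` and so on) and the `shorten_keys` helper are invented for the example.

```python
# Hypothetical short keys; only the section titles on the left are real.
SHORT_KEYS = {
    'Gas phase thermochemistry data': 'thermo_gas',
    'Reaction thermochemistry data': 'thermo_react',
    'Ion clustering data': 'ion_cluster',
    'Vibrational and/or electronic energy levels': 'electronic_spec',
}


def shorten_keys(data_refs):
    """Rename known section titles to short keys; keep unknown keys unchanged."""
    return {SHORT_KEYS.get(key, key): url for key, url in data_refs.items()}
```

Keys that already are short (mol2d, mol3d, ir, ms, uvvis) pass through untouched because they are not in the table.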

Reduce requirements

Change the code so that, when an optional Python library is missing, only the functionality that depends on it is unavailable: e.g. no parsed search without pandas, no substructure search without RDKit, and so on.

The final hard requirements should be requests, BeautifulSoup, and importlib-resources, due to problems with the Python version.
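
A common pattern for this is a guarded import plus a check at the point of use. The sketch below shows the idea under stated assumptions: `optional_import`, `require`, and `parsed_search_table` are hypothetical names, not NistChemPy functions.

```python
import importlib


def optional_import(name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


def require(module, package, feature):
    """Raise a descriptive error when an optional dependency is missing."""
    if module is None:
        raise ImportError(f'{feature} requires the optional package "{package}"')
    return module


pd = optional_import('pandas')  # None when pandas is absent


def parsed_search_table(rows):
    """Hypothetical feature that only works when pandas is installed."""
    return require(pd, 'pandas', 'parsed search').DataFrame(rows)
```

With this layout the package still imports without pandas; only a call to the pandas-dependent feature raises.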

Reaction thermochemistry data extracted incorrectly

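In NistChemPy 0.1.1, the "Reaction thermochemistry data" key is missing from data_refs if a compound has more than 50 reactions; the paginated links ('reactions 1 to 50', 'reactions 51 to 51') appear as keys instead: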

>>> import nistchempy as nist
>>> x = nist.Compound('C13968086')
>>> x.data_refs
{'mol2d': 'https://webbook.nist.gov/cgi/cbook.cgi?Str2File=C13968086',
 'mol3d': 'https://webbook.nist.gov/cgi/cbook.cgi?Str3File=C13968086',
 'Gas phase thermochemistry data': 'https://webbook.nist.gov/cgi/cbook.cgi?ID=C13968086&Units=SI&Mask=1#Thermo-Gas',
 'reactions 1 to 50': 'https://webbook.nist.gov/cgi/cbook.cgi?ID=C13968086&Units=SI&Mask=8#Thermo-React',
 'reactions 51 to 51': 'https://webbook.nist.gov/cgi/cbook.cgi?ID=C13968086&Units=SI&Mask=8&StartReac=50#Thermo-React',
 'Ion clustering data': 'https://webbook.nist.gov/cgi/cbook.cgi?ID=C13968086&Units=SI&Mask=40#Ion-Cluster',
 'Vibrational and/or electronic energy levels': 'https://webbook.nist.gov/cgi/cbook.cgi?ID=C13968086&Units=SI&Mask=800#Electronic-Spec'}
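
One possible fix is to collect the paginated links under the expected key. The following is a sketch, not the library's implementation: `merge_reaction_pages` is a hypothetical helper, and it assumes the page keys always have the form 'reactions N to M'.

```python
import re

# Matches the paginated keys seen above, e.g. 'reactions 1 to 50'.
_REACTION_PAGE = re.compile(r'^reactions \d+ to \d+$')


def merge_reaction_pages(data_refs, key='Reaction thermochemistry data'):
    """Gather all 'reactions N to M' URLs into a list stored under `key`."""
    merged = {}
    pages = []
    for name, url in data_refs.items():
        if _REACTION_PAGE.match(name):
            pages.append(url)
        else:
            merged[name] = url
    if pages:
        merged[key] = pages
    return merged
```

Storing a list keeps every page reachable; a caller that expects a single URL per key would need to handle this case separately.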

Implement search using parsed compound data

Instead of just returning a dataframe with compound info, create a search object built on this dataframe. To do this:

  1. Reparse synonyms from the NIST database and update the dataframe
  2. Implement the name search
  3. Clean up chemical formulas and implement the formula search
  4. Use RDKit to implement the substructure search
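
For step 3, a formula search could compare element counts rather than raw strings, so that differently ordered formulas still match. This is a minimal sketch with hypothetical helpers (`parse_formula`, `formula_matches`); it assumes plain formulas such as C9H8O4 and does not handle charges, parentheses, or isotope labels.

```python
import re
from collections import Counter

# An element symbol (one uppercase letter, optional lowercase) and an optional count.
_TOKEN = re.compile(r'([A-Z][a-z]?)(\d*)')


def parse_formula(formula):
    """Convert a plain formula string into a dict of element counts."""
    counts = Counter()
    for element, number in _TOKEN.findall(formula.replace(' ', '')):
        counts[element] += int(number) if number else 1
    return dict(counts)


def formula_matches(query, candidate):
    """True when both formulas have the same elemental composition."""
    return parse_formula(query) == parse_formula(candidate)
```

Normalized counts could be stored in the dataframe once, so each search only parses the query.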

Search by an InChI that is not in WebBook

A search by InChI always returns a compound-like HTML page, even if WebBook does not contain the compound, so find_compounds fails:

>>> import nistchempy as nist
>>> S = nist.Search()
>>> S.find_compounds('InChI=1S/Al.H/i;1+1')

Traceback (most recent call last):

  File "C:\Users\Grim\AppData\Local\Temp/ipykernel_9400/3310466150.py", line 1, in <module>
    S.find_compounds(inchis[0], 'inchi')

  File "C:\Programming\Anaconda3\envs\csd\lib\site-packages\nistchempy.py", line 427, in find_compounds
    refs = soup.find('ol').findChildren('a', href = re.compile(self._COMP_ID))

AttributeError: 'NoneType' object has no attribute 'findChildren'
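
The AttributeError comes from calling a method on the None that soup.find('ol') returns when the page has no result list. A minimal guard might look like the sketch below; `extract_compound_links` is a hypothetical helper, `soup` is expected to be a BeautifulSoup object, and `id_pattern` stands in for the compiled regex used in the traceback. It uses find_all, of which findChildren is an older alias.

```python
def extract_compound_links(soup, id_pattern):
    """Return matching <a> tags from the result list, or [] if there is no list."""
    listing = soup.find('ol')
    if listing is None:
        # No <ol> on the page: treat it as "no search results" instead of crashing.
        return []
    return listing.find_all('a', href=id_pattern)
```

This only prevents the crash. Deciding whether the compound-like page is a real hit or a placeholder for a missing compound still needs a separate check.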

Check "incorrect" web searches

Searches with strict requirements on compounds (e.g. having all possible properties) return a list of compounds that possess only some of the required properties. There may be a way to detect that the search was unsuccessful (the "3FFF" mask?).
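
Until such a check exists, results could be verified on the client side by comparing each compound's data_refs keys with the requested properties. The sketch below is only an illustration of that workaround: `missing_properties` and `filter_complete` are hypothetical helpers, and it assumes results are available as a mapping from compound ID to a data_refs dictionary.

```python
def missing_properties(data_refs, required):
    """List the required keys that are absent from a data_refs dictionary."""
    return [key for key in required if key not in data_refs]


def filter_complete(results, required):
    """Keep only compounds whose data_refs contain every required key.

    `results` maps a compound ID to its data_refs dictionary.
    """
    return {
        cid: refs
        for cid, refs in results.items()
        if not missing_properties(refs, required)
    }
```

This costs one page load per compound, so it is a fallback rather than a replacement for detecting the failed search itself.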
