Comments (12)
Yes, definitely, and I agree it's not a big ask :) I think the hull value is a sensible addition. We can probably add that regardless, but I would just like to establish a longer-term strategy too rather than adding things on an ad-hoc basis.
Perhaps we can establish some conventions and pitch them at the next workshop.
from emmet.
Hi @mkhorton, just thought I'd ping this again as the workshop has now flown by!
We've made quite a few changes to optimade-python-tools that should make the custom field process much easier (see this bit of the docs - adding a custom field is now config-only, provided it already exists in your underlying database), and also some fixes for issues that affect the current MP-OPTIMADE implementation (ones caused by me, not you!). For example, we finally have a workaround that lets us use the latest (slightly scuffed) version of FastAPI, so depending on optimade won't be holding you back.
Whilst there wasn't really any discussion of a hull distance field at the workshop (unfortunately very few of the databases that would use it were able to fully attend this year), it still seems like an obvious and very useful addition to me (it is currently stopping us from using MP-OPTIMADE in a data-driven project for live/in situ auto-XRD). We did also merge the fabled property definitions PR, so the mechanism is now in place for providers to come together and align property definitions without needing consensus from the whole consortium.
from emmet.
@janosh I would be happy to take as much time as needed to give guidance if you are willing to take this on. @tschaume is right, though. It needs me to first get the optimade structures incorporated into our automated builds. I will move it up on my list and try and get it done in the next week. Shouldn't be too heavy of a lift.
from emmet.
Hi @ml-evs, thanks for creating this issue. We have discussed internally but haven't reached conclusions yet.
The most important metadata we need to add, I think, is our database version. Currently, our OPTIMADE endpoint is not giving information from our latest database version, which can cause a lot of confusion. In future, they should stay in sync, but I think this is still critical to communicate.
In terms of other properties, it's difficult to know where to start/stop. Hull values are probably among the most useful, but there are many others as well. I think we were hedging and waiting for official prefixes before going to a lot of effort to add these keys and then have to change them later.
from emmet.
Fair enough! Unfortunately the "official" prefixes will only really happen if there is demand/support from the databases that would implement them (maybe at the next workshop).
I think a bespoke _mp_hull_distance would still be super useful, provided it is not too much effort on top of your existing implementation (it should only require config changes once the data is in the corresponding document itself).
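To illustrate what a client could do if such a field were exposed, here is a minimal sketch that builds an OPTIMADE structures query filtering on it. The field name `_mp_hull_distance` is the one proposed above and is not served yet; the base URL, the 0.05 eV/atom threshold, and the function name are assumptions for illustration only.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def hull_distance_query_url(base_url, field="_mp_hull_distance", max_ev_per_atom=0.05):
    """Build a /structures URL selecting entries at or below a hull distance.

    `field` is the provider-specific name proposed in this thread (hypothetical
    until the provider actually serves it); the threshold is arbitrary.
    """
    params = {
        # OPTIMADE filter: numeric comparison on a provider-specific field.
        "filter": f"{field} <= {max_ev_per_atom}",
        # Only request the fields we need.
        "response_fields": f"id,chemical_formula_reduced,{field}",
    }
    return f"{base_url.rstrip('/')}/v1/structures?{urlencode(params)}"


# Assumed base URL, shown only to demonstrate the shape of the request.
url = hull_distance_query_url("https://optimade.materialsproject.org")
decoded = parse_qs(urlsplit(url).query)
print(decoded["filter"][0])
```

The URL could then be fetched with any HTTP client; whether the filter is accepted depends on the provider declaring the field as queryable.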
from emmet.
Awesome! Happy to help out with the use of the optimade package (just ping me on here). One ergonomics thing I am adding to the next python-tools release is the ability to add provider fields to the info endpoint using just the config file (Materials-Consortia/optimade-python-tools#1096), which might be useful here.
from emmet.
Thanks for the update @ml-evs! Tagging @munrojm and @tschaume here to make sure they see it.
> wasn't really any discussion of a hull distance field at the workshop [...] it is currently stopping us from using MP-OPTIMADE in a data-driven project for live/in situ auto-XRD
This is interesting to hear. Is there a standard field name people are coalescing on yet? Do you have an example energy above hull property definition we could adopt via optimade-python-tools?
from emmet.
Sorry for missing this at the time @mkhorton! I think the only examples we have so far are OQMD, which uses _oqmd_stability, and odbx (i.e., me), which uses _odbx_thermodynamics->hull_distance to provide the hull distance in eV/atom (and similarly _oqmd_delta_e and _odbx_thermodynamics->formation_energy for formation energy per atom).
In terms of optimade-python-tools, you can just add the following to your o-p-t config:
    "provider_fields": {
        "structures": [
            {
                "name": "energy_above_hull",
                "description": "The distance this structure lies above the convex hull of the relevant phase diagram spanned by its constituent elements, with a given set of computational parameters.",
                "unit": "eV/atom",
                "type": "float"
            }
        ]
    }
This will add this field metadata to the /info/structures endpoint of your API as _mp_energy_above_hull. If you are serving your OPTIMADE API off the same MongoDB as your main API, and you can access the underlying value with a flat (i.e., no $lookup or other aggregations) query on the document, then you can just add the alias:
    "aliases": {
        "structures": {
            "energy_above_hull": "<database_field_name_for_hull_distance>"
        }
    }
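As a rough illustration of how the two config snippets fit together, the sketch below derives the prefixed public field name and resolves it to a database field through the alias. This is not how optimade-python-tools implements it internally; the helper names, the `mp` prefix constant, and the database field name `e_above_hull_db_field` are all hypothetical stand-ins.

```python
PROVIDER_PREFIX = "mp"  # assumed provider prefix, giving names like _mp_<field>

CONFIG = {
    "provider_fields": {
        "structures": [
            {"name": "energy_above_hull", "unit": "eV/atom", "type": "float"}
        ]
    },
    "aliases": {
        # Hypothetical database field name standing in for the placeholder above.
        "structures": {"energy_above_hull": "e_above_hull_db_field"}
    },
}


def public_field_names(config, entry_type, prefix):
    """Names a client would see, e.g. under /info/structures."""
    fields = config["provider_fields"].get(entry_type, [])
    return [f"_{prefix}_{field['name']}" for field in fields]


def database_field(config, entry_type, prefix, public_name):
    """Strip the provider prefix, then apply the alias if one is configured."""
    marker = f"_{prefix}_"
    bare = public_name[len(marker):] if public_name.startswith(marker) else public_name
    return config["aliases"].get(entry_type, {}).get(bare, bare)


print(public_field_names(CONFIG, "structures", PROVIDER_PREFIX))
print(database_field(CONFIG, "structures", PROVIDER_PREFIX, "_mp_energy_above_hull"))
```

A field without an alias simply falls through to its bare name, which matches the "flat query on the document" case described above.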
If the database access is more complicated than that, then you'd either have to write a custom MongoTransformer to do the appropriate plumbing, or inline the hull distance values in whatever collection you run the OPTIMADE API from, but based on the SummaryDoc stuff in the new MP API I would have thought this should be straightforward enough. If not, we can discuss adding support for aggregations based on the alias config in o-p-t (e.g., provide a field name and a collection name in which to do a $lookup).
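For reference, the kind of $lookup plumbing mentioned above might look roughly like the following MongoDB aggregation pipeline, built here as plain Python data. This is only a sketch of the idea, not an existing o-p-t feature; the collection name `thermo`, the join key `material_id`, and the field names are hypothetical.

```python
def build_hull_lookup_pipeline(lookup_collection, join_key, source_field, alias):
    """Return aggregation stages that copy `source_field` from another collection.

    All arguments are illustrative; real collection and field names depend on
    the provider's database layout.
    """
    joined = "_joined_docs"  # temporary array field holding the matched documents
    return [
        # Join documents from the other collection that share the same key.
        {
            "$lookup": {
                "from": lookup_collection,
                "localField": join_key,
                "foreignField": join_key,
                "as": joined,
            }
        },
        # Take the value from the first matched document and expose it as `alias`.
        {"$addFields": {alias: {"$arrayElemAt": [f"${joined}.{source_field}", 0]}}},
        # Drop the temporary array so it is not served to clients.
        {"$project": {joined: 0}},
    ]


pipeline = build_hull_lookup_pipeline(
    "thermo", "material_id", "energy_above_hull", "energy_above_hull"
)
for stage in pipeline:
    print(stage)
```

With pymongo, a list like this could be passed to `collection.aggregate(pipeline)`; inlining the values in the served collection avoids the join cost on every request.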
In terms of other providers agreeing on a definition and field name, I guess that providers have to be seen as actually wanting to serve this data through OPTIMADE first, and I really hope they will, as many applications hinge on it!
from emmet.
@mkhorton I'd be happy to take this on. No time pressure of course if there are higher priorities atm. But if @munrojm or @tschaume can give me some guidance on what would need to be done, I think it'd be great to expose MP stability data through OPTIMADE.
from emmet.
@janosh I agree it would be a nice addition. In terms of optimade, however, we should first bump optimade to the latest version (currently still running 0.13.3) which requires we work on adding optimade structures to our build stages first.
from emmet.
@tschaume @munrojm Thanks for the fast replies! Sounds great. Happy to take care of the merge conflicts on https://github.com/materialsproject/devops/pull/594 (if someone gives me push access).
from emmet.
Great to see progress on this and more than happy to help out from our side, let me know if you run into any issues!
from emmet.