acesuit / acepotentials.jl Goto Github PK
View Code? Open in Web Editor NEWMachine Learning Interatomic Potentials with the Atomic Cluster Expansion
License: MIT License
Machine Learning Interatomic Potentials with the Atomic Cluster Expansion
License: MIT License
This line https://github.com/ACEsuit/ACE1pack.jl/blob/b1505b8d8e5ac500fcce8e8216b4c3fd41dfe223/src/read_params.jl#L20 seems to use the same logical test for the initial if and the final elseif, which seems wrong (unless julia had rather different elseif syntax compared to other languages, in which case feel free to close this issue).
@cortner, would you please tag v0.0.6?
The changes include updated compatibility bounds for IPFitting (to ensure we get the newest version with its own updated version bounds), as well as a test for the ace_fit.jl
script.
ace_fit.jl
uses ArgParse
which needs to be add
ed by the users. What would be the best way to just have it as ACE1pack's dependency and not need for the users to install it alongside ACE1pack?
I would have ArgParse be re-exported by ACE1pack and in the script call using ACE1pack.ArgParse
. Any better ideas? @WillBaldwin0, @cortner, @wcwitt
There is a bug in "First example (Julia)" section of the tutorial.
Running, as instructed by the tutorial
using ACE1pack
import Random
using LinearAlgebra: norm, Diagonal
r0 = rnn(:Si)
basis = ace_basis(;
species = :Si,
N = 3, # correlation order = body-order - 1
maxdeg = 12, # polynomial degree
D = SparsePSHDegree(; wL=1.5, csp=1.0),
r0 = r0, # estimate for NN distance
rin = 0.65*r0, rcut = 5.0, # domain for radial basis (cf documentation)
pin = 2)
function gen_dat()
sw = StillingerWeber()
n = rand(2:3)
at = rattle!(bulk(:Si, cubic=true) * n, 0.3)
set_data!(at, "energy", energy(sw,at))
set_data!(at, "forces", forces(sw,at))
return ACE1pack.AtomsData(at, "energy", "forces")
end
Random.seed!(0)
train = [gen_dat() for _=1:20];
Generates following error:
ERROR: MethodError: no method matching ACE1pack.AtomsData(::Atoms{Float64}, ::String, ::String)
Closest candidates are:
ACE1pack.AtomsData(::Atoms, ::Any, ::Any, ::Any, ::Any, ::Any) at ~/.julia/packages/ACE1pack/rq4n0/src/acefit.jl:12
Stacktrace:
[1] gen_dat()
@ Main ./REPL[7]:7
[2] (::var"#1#2")(#unused#::Int64)
@ Main ./none:0
[3] iterate
@ ./generator.jl:47 [inlined]
[4] collect(itr::Base.Generator{UnitRange{Int64}, var"#1#2"})
@ Base ./array.jl:787
[5] top-level scope
@ REPL[9]:1
On a machine with:
julia> versioninfo()
Julia Version 1.8.2
Commit 36034abf260 (2022-09-29 15:21 UTC)
Platform Info:
OS: macOS (arm64-apple-darwin21.3.0)
CPU: 8 × Apple M1
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-13.0.1 (ORCJIT, apple-m1)
Threads: 1 on 4 virtual cores
Just running ] add ACE1pack
complains about the following, is this expected? I'm not sure... It's at least unexpected to me, but if this is the correct behaviour then apologies for the false alarm.
(apologies for the screenshot, apparently copy + paste messes up the formatting on Windows... ask Bill...)
Things like
IP = ACE1pack.JuLIP.FIO.read_dict(ACE1pack.ACE1.load_dict("./ace_potential.json")["IP"])
shouldn't be necessary. As a first step we should add the option (perhaps by default) to export yace
files when fitting a potential. More generally we should try to get rid of any instances where the user needs to call JuLIP or ACE1 via ACE1pack, replacing them with small ACE1pack functions when needed..
Any interest in pymatgen imports? Here is a little script we just wrote for our own purposes; somebody could incorporate this into a suitable ACE1 package?
using PyCall
ase = pyimport("ase")
pmt = pyimport("pymatgen.core")
pmt2ase = pyimport("pymatgen.io.ase")
json = pyimport("json")
f = open(@__DIR__() * "/training.json", "r")
data = json.load(f)
Xase = []
for d in data
curr_structure = pmt.Structure.from_dict(d["structure"])
at = pmt2ase.AseAtomsAdaptor.get_atoms(curr_structure)
push!(Xase, at)
end
using ASE, JuLIP
X = []
for x in Xase
curr_at = Atoms(ASEAtoms(x))
push!(X, curr_at)
end
interface to per order degrees + related
I don't think we are there yet, but not so far away so I'm asking the question and people can think about it : how do we feel about moving ACE1.jl, ACE1pack.jl + all related things to the General
registry so that most users don't need to work with the MolSim
registry?
I have added detailed documentation for ACE1pack to have something there, but still left the overall documentation a bit all over the place. Some suggested changes, discussion points & additions I gathered from various places & people:
using IPFitting
, etc, because it is not formally "added".Any comments, anything else to add?
It would be useful to be able to set weights per-config, or even better per-atom (for forces, obviously). This would then require an interface via the ACE1pack
dictionaries.
If the dict containing all the fitting params, which is passed to ace_fit.jl
, is stored in a JSON, dict keys are all strings, and it appears that integer keys, which can be needed in Dd
, for example, are not interpreted properly. I haven't gone into the code, but from the resulting basis size it appears that they are simply ignored: I tried to do a 4,8 + 3,16 basis fit, and it was a really big basis, so I think it ended up doing 4,16 and ignored the limit indicated by 4: 8
dict entry.
I worked around it by saving the dict as YAML rather than JSON, and that worked fine, but since the code already finds tuples that are encoded as strings and converts them to tuples, perhaps dict keys that can be interpreted as integers should be converted to integers? I guess that would require deciding that there's never a real reason to have a string key that contains only an integer.
Also, it would be nice if there was an error, rather than (apparently) silently ignoring the dict key.
CC @wcwitt
Would it be helpful, or just annoyingly complicated, if ace_fit.jl
set all the known BLAS env vars for multithreading, i.e. MKL_NUM_THREADS
, OPENBLAS_NUM_THREADS
, maybe also plain OMP_NUM_THREADS
, since some of the optimizers (namely the ones in scikit-learn
) use those instead of the julia-internal value?
closed by #72
Quick summary of a Slack discussion from 16-18 July 2022.
"before I fit a potential I would like to just evaluate the basis for an atomic structure, i.e. get an array of basis values for each site. Can someone please put such an example into this tutorial?"
Options mentioned
site_energy(B, at, 1)
where 1 is atom index, at is atoms object and B is ACE basisevaluate_basis(B, at)
There was some disagreement about whether site_energy(B, at, 1)
is adequate or confusing. It provides the energy basis evaluated for a site, and the total site energy for a potential IP
is something like
site_energy(IP, at, 1) = sum(coeff .* site_energy(B, at, 1))
One thought: Would site_descriptor(B, at, 1)
be more clear than site_energy(B, at, 1)
? More generally, we need to discuss, decide whether to make any changes, and document.
Modify ACE1pack docs to strongly encourage users to have upper version bounds on all packages in their projects.
Since the most recent PR (and several before) the docs website has been refusing to update. I've noticed there are two 'annotations' under the actions tab for that PR(50). One is to do with a bad link, and the other is complaining about the doc string of fill_defaults
I believe.
Any ideas @cortner @gelzinyte?
Currently, to access the command line fitting script, users would have to execute
julia $HOME/.julia/packages/ACE1pack/ChRvA/scripts/ace_fit.jl -p params.yml
But afaik users shouldn't need to fiddle with the package directories and there might be multiple corresponding to multiple versions of the package.
Is there a way to make ace_fit.jl
more visible or accessible?
Python has "console scripts" where you can give a name to an executable which is a short-hand for executing a script. For example instead of
python $HOME/workflow/wfl/cli/cli.py --help
can just do wfl --help
.
I couldn't find a Julia equivalent?
Alternatively, maybe there's a way to, when adding ACE1pack, export the path to scripts directory somehow? So that julia ace_fit.jl -p params.yml
works?
There's PackageCompiler, which might be useful so that users-only don't even need to install julia & packages, but for this simple problem it seems an overkill.
Is it possible to integrate ACE2 into ACE1pack without clashes?
I'm running the workflow's ace fit test, and I'm getting the following error
ERROR: LoadError: UndefVarError: read_data not defined
Stacktrace:
[1] get_num_observations(d::Dict{String, String})
@ Main ~/.julia/packages/ACE1pack/rq4n0/scripts/ace_fit.jl:21
[2] save_dry_run_info(fit_params::Dict{String, Any})
@ Main ~/.julia/packages/ACE1pack/rq4n0/scripts/ace_fit.jl:31
[3] top-level scope
@ ~/.julia/packages/ACE1pack/rq4n0/scripts/ace_fit.jl:56
in expression starting at /home/cluster2/bernstei/.julia/packages/ACE1pack/rq4n0/scripts/ace_fit.jl:55
I've updated the packages in julia, and it's reporting
Status `~/.julia/environments/v1.7/Project.toml`
[e3f9bc04] ACE1 v0.9.2
[8c4e8d19] ACE1pack v0.2.0
[51974c44] ASE v0.5.4
[c7e460c6] ArgParse v1.1.4
[945c410c] JuLIP v0.11.5
The actual run line is julia /home/cluster2/bernstei/.julia/packages/ACE1pack/rq4n0/scripts/ace_fit.jl --params run_dir/fit_params_ACE.B_test.json > run_dir/ACE.B_test.stdout 2> run_dir/ACE.B_test.stderr
, but since this error appears to be internal to ace_fit.jl
, I'm not sure those details matters.
@gelzinyte said she thinks the fix just involves updating the julia package's requested github release, because the bug has been fixed there.
If you follow the one liner in the ACE1pack docs the julia Pkg status shows
(@v1.7) pkg> st
Status `~/.julia/environments/v1.7/Project.toml`
[8c4e8d19] ACE1pack v0.0.2
[51974c44] ASE v0.5.4
[c7e460c6] ArgParse v1.1.4
[945c410c] JuLIP v0.11.5
If you try to create an ACE with read_dict
it complains about not having an implementation for ACE1_PolyPairPot
, e.g. However, iff you do ]add ACE1
, then read_dict
finds the ACE1_*
symbols and it works.
Because the software engineering is quite complex, it's very hard for a casual user to figure out what's allowed/required in each dictionary. As a result, documenting those (possible fields and allowed values) is even more important.
The documentation section that points to stable version is not working. Because it point to an ancient version. This is an issue because it is the version that is linked from the main GitHub page. For new people starting with ACE this is big issue, as it can cause a lot of confusion with non working documentation.
There are two possible fixes
Option 2. is probably the best way, as it does not need anything else than a tag in the repo.
An error of UndefVarError: artifact not defined
occurred after running the code in TiAl tutorial .
using ACE1pack, ACE1, IPFitting
data_param_dict = data_params(
fname = joinpath(ACE1pack.artifact("TiAl_tutorial"), "TiAl_tutorial.xyz"),
energy_key = "energy",
force_key = "force",
virial_key = "virial")
It looks like there is new descriptor functionality in ACE1: ACEsuit/ACE1.jl@2470490.
We should update the corresponding ACE1pack tutorial: https://github.com/ACEsuit/ACE1pack.jl/blob/main/tutorials/descriptor.jl.
Rename to reflect that what we currently call "preconditioners" are actually priors.
I believe that the ACEfit datastructures use nothing
for missing data, but Julia has a different type for that called Missing
with the singleton instance missing
.
This will replace read_params.jl
.
Since read_dict
converts loaded dictionaries into the appropriate composite type, I think, the code will need to be rewritten so that what are currently *_params()
functions are turned into struct
s. Does that sound correct? What was the reason to use dictionaries and not structures in the first place?
A question mainly for @gelzinyte, possibly @WillBaldwin0. At some point along the way, when integrating ACEfit
, I changed all the parameters of the form lsqr_damp
to just damp
For example:
OLD:
params["solver"]["type"] = "lsqr"
params["solver"]["lsqr_damp"] = 2e-2
params["solver"]["lsqr_conlim"] = 1e12
params["solver"]["lsqr_atol"] = 1e-7
params["solver"]["lsqr_maxiter"] = 100000
NEW:
params["solver"]["type"] = "lsqr"
params["solver"]["damp"] = 2e-2
params["solver"]["conlim"] = 1e12
params["solver"]["atol"] = 1e-7
params["solver"]["maxiter"] = 100000
We probably should have discussed this - I'm happy to change things back if you think there it's important to have the solver prefix.
This argument type does not seem to be documented anywhere or to exist. Is it now called rad_basis
?
Docs page https://acesuit.github.io/ACE1pack.jl/stable/ACE1pack/ace1pack_overview/ refers to save_fit
with a link that doesn't mention it.
Nowhere, so far as I can tell, is it documented how to load a saved json potential.
I started a documentation page with a reading list. I haven't yet filled in the actual references. Would be helpful to get some ideas on what people feel should be added and or dropped.
Add an ACE1pack interface to the verbose=True
arguments of the sklearn brr/ard code (or something)
and IPFitting for now?
there are some important tings which need to be fixed in the docs:
JulIP
rather than JuLIP
even though I think we've fixed that. I presume this will go away when we push another change to the installation file?We need this in ACE1pack, although I don't know how it works in detail yet.
@cortner, would you please tag v0.1.0 using the 0.1.x branch (https://github.com/ACEsuit/ACE1pack.jl/tree/0.1.x). We can use this to preserve compatibility with IPFitting.
Right now, the 0.1.x branch is even with 0.0.4 (except I've updated Project.toml). Once v0.1.0 exists, I'll backport relevant commits from the current master, and the result can become v0.1.1.
This line https://github.com/ACEsuit/ACE1pack.jl/blob/b1505b8d8e5ac500fcce8e8216b4c3fd41dfe223/src/basis.jl#L154 suggests to me that the rad_basis
dict is not supposed to require that "type" : "rad"
be set manually. However, if you pass a dict like
{ 'weights': {'ideal': { 'E': 10, 'F': 50, 'V': 10} },
'basis': { 'rpi': {'type': 'rpi', 'species': ['Cs', 'Pb', 'Br'], 'r0': 3.0, 'rad_basis': {'rin': 2.0, 'rcut': 7.0}},
'pair': {'type': 'pair', 'species': ['Cs', 'Pb', 'Br'], 'r0': 3.0, 'rcut': 7.0, 'maxdeg': 3} },
'solver': {} }
to fill_defaults!
, you get an error where it complains that the rad_basis
dict has nothing for type
. Adding a "type": "rad"
to the rad_basis
dict fixes this, but seems like it shouldn't be necessary. Here's the error stack trace
ERROR: LoadError: AssertionError: !(isnothing(type))
Stacktrace:
[1] basis_params(; type::Nothing, kwargs::Base.Pairs{Symbol, Float64, Tuple{Symbol, Symbol}, NamedTuple{(:rin, :rcut), Tuple{Float64, Float64}}})
@ ACE1pack ~/.julia/packages/ACE1pack/QCojp/src/basis.jl:13
[2] _fill_default(d::Dict{String, Any}, key::String)
@ ACE1pack ~/.julia/packages/ACE1pack/QCojp/src/read_params.jl:33
[3] fill_defaults!(params::Dict{String, Any}; param_key::String)
@ ACE1pack ~/.julia/packages/ACE1pack/QCojp/src/read_params.jl:12
[4] fill_defaults!(params::Dict{String, Any}; param_key::String)
@ ACE1pack ~/.julia/packages/ACE1pack/QCojp/src/read_params.jl:19
[5] fill_defaults!(params::Dict{String, Any}; param_key::String)
@ ACE1pack ~/.julia/packages/ACE1pack/QCojp/src/read_params.jl:16
[6] fill_defaults!(params::Dict{String, Any})
@ ACE1pack ~/.julia/packages/ACE1pack/QCojp/src/read_params.jl:12
[7] top-level scope
@ ~/src/work/workflow_devel/wfl/scripts/ace_fit.jl:46
in expression starting at /home/cluster2/bernstei/src/work/workflow_devel/wfl/scripts/ace_fit.jl:46
Is this a bug, or are we just using fill_defaults!
wrong (in ace_fit)
@cortner's suggestion:
Ok, I think I know how to do some of this:
# this function just shows the kwargs so you can see how the code works.
f(; kwargs...) = (@show kwargs)
e.g.,
julia> f(;a = 1, b = 2)
kwargs = Base.Pairs(:a => 1, :b => 2)
pairs(::NamedTuple) with 2 entries:
:a => 1
:b => 2
In principle you can splat a dictionary using D...
to turn it into kwargs, but the problem is that the keys are strings and not symbols. E.g.,
julia> D = Dict("a" => 1, "b" => 2); f(; D...)
ERROR: TypeError: in typeassert, expected Symbol, got a value of type String
Stacktrace:
[1] merge(a::NamedTuple{(), Tuple{}}, itr::Dict{String, Int64})
@ Base ./namedtuple.jl:293
[2] top-level scope
@ REPL[49]:1
But we can just convert them (recursively):
_makesymbol(x) = x # default
_makesymbol(p::Pair) = (Symbol(p.first) => _makesymbol(p.second)) # recursion
_makesymbol(D::Dict) = Dict(_makesymbol.([D...])...) # special case dictionary
e.g.
julia> _makesymbol(Dict("a" => 1, "b" => 2))
Dict{Symbol, Int64} with 2 entries:
:a => 1
:b => 2
and another example with nested dictionaries
julia> _makesymbol(Dict("a" => 1, "b" => 2, "c" => Dict("d" => 3)))
Dict{Symbol, Any} with 3 entries:
:a => 1
:b => 2
:c => Dict(:d=>3)
And now we can splat the kwargs:
julia> D = Dict("a" => 1, "b" => 2); f(; _makesymbol(D)...)
kwargs = Base.Pairs(:a => 1, :b => 2)
pairs(::NamedTuple) with 2 entries:
:a => 1
:b => 2
Originally posted by @cortner in https://github.com/ACEsuit/ACE1pack.jl/pull/1#r799650497
It'd be nice to be able to fit repulsive cores using ACE1pack. I think this can be easily implemented by adding an optional keyword to fit_params
where we'd give it a dictionary say R
with the element tuples as keys and (ri,e0)
as values which would then create a Vrep
which we set to IP.components[2]
.
The NiAl tutorial mentions implementing Laplacian preconditioning with
P = Diagonal(vcat(ACE1.scaling.(dB.basis.BB, rlap_scal)...))
but while "P"
is just a dict field, the quantities needed to evaluate it are not available outside of ACE1pack ace_fit()
. It would be helpful if there was something that could implement this preconditioning so it can be used with ACE1pack ace_fit()
interface for multi-transforms
This bug is in Fitting a TiAl potential (Julia) section of the tutorial. This same issue is also present in TiAl potential (ACE1pack-julia) tutorial.
using ACE1pack
data_file = joinpath(ACE1pack.artifact("TiAl_tutorial"), "TiAl_tutorial.xyz")
data = JuLIP.read_extxyz(data_file)
train = data[1:5:end];
r0 = 2.88
ACE_B = ace_basis(species = [:Ti, :Al],
N = 3,
r0 = r0,
rin = 0.6 * r0,
rcut = 5.5,
maxdeg = 6);
Bpair = pair_basis(species = [:Ti, :Al],
r0 = r0,
maxdeg = 6,
rcut = 7.0,
pcut = 1, # this is not a reasonable default?
pin = 0)
B = JuLIP.MLIPs.IPSuperBasis([Bpair, ACE_B]);
Vref = OneBody(:Ti => -1586.0195, :Al => -105.5954)
weights = Dict(
"FLD_TiAl" => Dict("E" => 30.0, "F" => 1.0 , "V" => 1.0 ),
"TiAl_T5000" => Dict("E" => 5.0, "F" => 1.0 , "V" => 1.0 ))
train = [ACE1pack.AtomsData(t,"energy","force","virial",weights,Vref) for t in train]
Produces
ERROR: KeyError: key "default" not found
Stacktrace:
[1] getindex
@ ./dict.jl:498 [inlined]
[2] ACE1pack.AtomsData(atoms::Atoms{Float64}, energy_key::String, force_key::String, virial_key::String, weights::Dict{String, Dict{String, Float64}}, vref::OneBody{Float64})
@ ACE1pack ~/.julia/packages/ACE1pack/rq4n0/src/acefit.jl:30
[3] (::var"#5#6")(t::Atoms{Float64})
@ Main ./none:0
[4] iterate
@ ./generator.jl:47 [inlined]
[5] collect(itr::Base.Generator{Vector{Atoms}, var"#5#6"})
@ Base ./array.jl:787
[6] top-level scope
@ ~/Work/ace/tutorial-1.jl:32
On a machine with
julia> versioninfo()
Julia Version 1.8.3
Commit 0434deb161e (2022-11-14 20:14 UTC)
Platform Info:
OS: macOS (arm64-apple-darwin21.3.0)
CPU: 8 × Apple M1
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-13.0.1 (ORCJIT, apple-m1)
Threads: 4 on 4 virtual cores
Environment:
JULIA_EDITOR = code
JULIA_NUM_THREADS = 4
... is now available at
https://github.com/ACEsuit/ACEregistry
This should now be used instead of the MolSim
registry. Can somebody please update the docs?
It appears the per-species or per-pair length scales are useful (e.g @WillBaldwin0 's perovskite phonon results), and I suspect my results would benefit. It'd be very useful to figure out how to give access to such functionality (via Agnesi transform or some other tool) into ACE1pack.
I've just tagged IPFitting 0.10.0. ACE1pack should now be adjusted to work with that please. @casv2
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.