juliamolsim / atomsbase.jl Goto Github PK

View Code? Open in Web Editor NEW

81.0 13.0 16.0 926 KB

A Julian abstract interface for atomic structures.

Home Page: https://juliamolsim.github.io/AtomsBase.jl/

License: MIT License

Julia 100.00%

atomsbase.jl's Introduction

AtomsBase

A Julian abstract interface for atomic structures.

AtomsBase is an abstract interface for representation of atomic geometries in Julia. It aims to be a lightweight means of facilitating interoperability between various tools including ...

Chemical simulation engines:
- DFTK.jl (density-functional theory)
- Molly.jl (molecular dynamics)
Integration with third party-libraries:
- ASEconvert.jl (integration with the Atomistic Simulation Environment)
I/O with standard file formats (.cif, .xyz, ...)
- E.g. AtomIO.jl
automatic differentiation and machine learning systems
- ChemistryFeaturization.jl (featurization of atomic systems)
numerical tools: sampling, integration schemes, etc.
visualization (e.g. plot recipes)

Currently, the design philosophy is to be as lightweight as possible with a small set of required function dispatches to make adopting the interface easy. We also provide a couple of standard flexible implementations of the interface that we envision to be broadly applicable. If features beyond these are required we encourage developers to open PRs or provide their own implementations. For more on how to use the package, see the documentation.

Packages using AtomsBase

The following (not all yet-registered) packages currently make use of AtomsBase:

ASEPotential
ASEconvert
AtomIO: I/O for atomic structures, also wraps some ASE functionality
Atomistic: integrated workflow for MD simulations, part of CESMIX
AutoBZCore.jl: Brillouin-zone integration
BFPIS
ChemistryFeaturization: Interface for featurization of atomic structures for input into machine learning models, part of Chemellia
DFTK: density functional theory simulations
ExtXYZ: Parser for extended XYZ files
InteratomicPotentials: implementations of a variety of interatomic potentials, also part of CESMIX
Molly: molecular dynamics simulations
Xtals: I/O and structure representation for crystals

atomsbase.jl's People

Contributors

Stargazers

Watchers

Forkers

gregstrq rashidrafeek sailfish009 jamesthesnake leticia-maria chemicalfiend stjordanis jameskermode jrdegreeff ruibin-liu jty-computation liozou ejmeitz lxvm singularitti dhairyalgandhi

atomsbase.jl's Issues

make sure AtomsBase plays nice with Crystalline.jl/Bravais.jl

https://github.com/thchr/Crystalline.jl

because this would just be really cool for geometry building etc.

...depending on how in-depth this goes, it could easily end up as a separate package...this issue is mainly here as a reminder...

AtomView and system getindex

How am I intended to implement getindex(v.system, v.index, x) so that getindex(v::AtomView, x::Symbol) works without calling get()? If I use get() there is a cyclic reference and I don't think you can use getfield() with an AtomView.

Couldn't find a library that uses the AtomView approach to AtomsBase besides Molly. Maybe this is an issue with Molly's implementation, but we do not have arrays for mass, charge, atomic number etc stored as part of the system object so those are not accessible directly via getfield(v.system,x)[i] or something along those lines. I can access specific properties by name (e.g. atomic_mass(v.system, 1)) but not for a generic property passed as a symbol to getindex().

Base.getindex(v::AtomView, x::Symbol) = getindex(v.system, v.index, x)
function Base.get(v::AtomView, x::Symbol, default)
    hasatomkey(v.system, x) ? v[x] : default
end

Visualizing molecules and simulations

Hi, I felt as though the AtomsBase package had a lot of potential to be a great way for all Julia chemistry and materials science folks to outsource their structures (and simulation trajectories) visualization to. I wanted to get the ball rolling on this since I was working with a visualization issue on Molly.jl. Molly uses Makie pretty well to get the simulation trajectory output as an mp4 file.

I was looking into some other projects as well and found Alchemy.jl by @gustavojra for quantum mechanics (but can't seem to get it running for some reason) and this for single molecule visualization. However I think we could probably come up with a whole pure-Julia visualization suite for molecular simulation, similar to VMD or Ovito but free and open source and fast on Julia! Let me know what you think!

subsystem function?

From Zoom discussion today, it came up that having a function to return a subsystem would be useful. Currently, indexing/iteration operates at the level of atoms (or AtomView objects), and I think this makes sense, but there is also utility for getting back out another system object of the same type that contains only a subset of the atoms.

Potential discussion point: should we suggest/attempt to enforce whether things like boundary conditions/box size can or cannot change from system to subsystem level?

I know @tjjarvinen has more thoughts on this, I'll let him chime in with more.

Avoid use of getters with symbols

I'm nervous about the use of getters using the symbols describing the fieldnames. To my mind this defeats the entire point of an abstract interface. I appreciate the elegance those with which one can access properties of a system. I therefore want to propose to replace calls such as

sys[inds, :position]

with

sys[inds, position]

This can be formally equivalent to position(sys[inds]) but internally something more efficient can be done. Because position has its own type this means it can be done in a type-stable way, where as using a Symbol one relies on constant-propagation. (I don't know how reliable this is?)

But the biggest advantage for me is that this way one can change the internals without changing the way we access.

A prototype of this is already in #101

Introduce Cell types

This is a proposal based on the zoom call on 15 / 5 / 2024, that replaces #97 .

(1) remove bounding_box and boundary_conditions from the system and replace it with a cell object that encodes the same information.
(2) Implement a type Cell or ParaCell that contains fields cell_vectors and pbc (true / false)
(3) At the same time I would like to replace the AbstractVector with Vector in order to remove that type instability as well.

It would look something like this:

struct FlexibleSystem{D, TPART, TCELL} <: AbstractSystem{D}
    particles::Vector{TPART}
    cell::TCELL
    data::Dict{Symbol, Any}
end

struct ParaCell{D, T}
    cell_vectors::NTuple{D, SVector{3, T}}
    pbc::NTuple{D, Bool}
end

I think this will do two things:

make the FlexibleSystem type stable
increase flexibility

I'll start on the PR if I get tentative agreement from a few people.

test_approx_eq fails for isolated molecules

As tittle says.

Working example

using AtomsBase
using AtomsBaseTesting
using Unitful

hydrogen = isolated_system([
    :H => [0, 0, 0.]u"Å",
    :H => [0, 0, 1.]u"Å"
])

test_approx_eq(hydrogen, hydrogen)  # error

See for libAtoms/ExtXYZ.jl#48

Supporting "arbitrary" properties

This has come up a couple of times now. Examples could be:

a charge associated with every atom (atom-wise property, scalar)
a dipole moment associated with a system (system-level property, vector)
...and plenty of others.

@mfherbst, I recall a discussion of a relatively flexible/generic way to do this at some point but now can't find it, or recall the details of the proposal, but since this came up again recently over at FermiQC/Fermi.jl#120 (comment), I figured I'd file an issue here for discussion.

Improve convenience constructors for included implementations

They're a bit too rigid right now and should ideally be flexible to anything (different types of vectors, tuples, etc.) that might be thrown at them to specify systems.

Consider extending `==` on `Atom`, `FlexibleSystem` with `StructEquality.jl`?

I was adding support to this package in my code, where in the future, the conversions between FlexibleSystem and my Cell types used in CrystallographyBase.jl and Crystallography.jl is possible.

However, when I was testing this conversion, I found that Atom & FlexibleSystem contains mutable types like Dict, therefore their == are not automatically true since:

Value types are intended for compact, immutable objects. They are stored on the stack, passed by value, and the default hash and equality are based on the literal bits in memory.
Record types are allocated on the heap, are passed by reference, and the default hash and equality are based on the pointer value (the data address).
When you embed a record type in a value type, then the pointer to the record type becomes part of the value type, and so is included in equality and hash.

Which caused comparisons between these types feeling wierd:

julia> using WhyNotEqual

julia> a = Atom(:H, [0, 0, 1.0]u"bohr")
Atom(H, atomic_number = 1, atomic_mass = 1.008 u):
    position          : [0,0,1]u"a₀"


julia> b = Atom(:H, [0, 0, 1.0]u"bohr")
Atom(H, atomic_number = 1, atomic_mass = 1.008 u):
    position          : [0,0,1]u"a₀"


julia> a == b
false

julia> whynot(==, a, b)
DifferentButSameChildren: When applying `lens` to both objects, we get `obj1` and `obj2`.
obj1 and obj2 are different, but their children are all the same.
lens: identity
obj1: Atom(H,  [       0,        0,        1]u"a₀")
obj2: Atom(H,  [       0,        0,        1]u"a₀"

julia> box = [[10.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 10.0]]u"Å"  # Note the unit!;

julia> bc = [Periodic(), Periodic(), Periodic()];

julia> hydrogen = FlexibleSystem(
           [Atom(:H, [0, 0, 1.0]u"bohr"), Atom(:H, [0, 0, 3.0]u"bohr")], box, bc
       );

julia> hydrogen2 = FlexibleSystem(
           [Atom(:H, [0, 0, 1.0]u"bohr"), Atom(:H, [0, 0, 3.0]u"bohr")], box, bc
       );

julia> hydrogen == hydrogen2
false

julia> using WhyNotEqual

julia> whynot(==, hydrogen, hydrogen2)
DifferentButSameChildren: When applying `lens` to both objects, we get `obj1` and `obj2`.
obj1 and obj2 are different, but their children are all the same.
lens: (@optic _.particles[1])
obj1: Atom(H,  [       0,        0,        1]u"a₀")
obj2: Atom(H,  [       0,        0,        1]u"a₀")

As you can see, their fields are equal, but they are not equal.

This might be surprising when you want to compare them, and this is what I encountered earlier: I have to write more code to do simple tests.

One solution is to use StructEquality.@struct_hash_equal_isequal:

using StructEquality

@struct_hash_equal_isequal struct Atom{D, L<:Unitful.Length, V<:Unitful.Velocity, M<:Unitful.Mass}
    position::SVector{D, L}
    velocity::SVector{D, V}
    atomic_symbol::Symbol
    atomic_number::Int
    atomic_mass::M
    data::Dict{Symbol, Any}  # Store arbitrary data about the atom.
end

@struct_hash_equal_isequal struct FlexibleSystem{D,S,L<:Unitful.Length} <: AbstractSystem{D}
    particles::AbstractVector{S}
    bounding_box::SVector{D,SVector{D,L}}
    boundary_conditions::SVector{D,BoundaryCondition}
    data::Dict{Symbol,Any}  # Store arbitrary data about the atom.
end

Then the above tests will all be trues.

In fact, I use @struct_hash_equal_isequal in all my packages:

@struct_hash_equal_isequal struct SpglibCell{L,P,T,M} <: AbstractCell
    lattice::Lattice{L}
    positions::Vector{MVector{3,P}}
    atoms::Vector{T}
    magmoms::Vector{M}
end

Similar packages are:

Convenient functions for extending / dropping dimensions

Since most libraries only support 3D structures, it would be convenient to have functions that attach "empty dimensions" or strip them off in a generic way in AtomsBase.

Organizing boundary conditions

Right now, the boundary conditions are specified as AbstractVector{BoundaryCondition}.

I suggest we make it NTuple{D, BoundaryCondition} (D is the dimensionality of the system):

Tuple is covariant in its type parameters: Tuple{Periodic, Periodic, DirichletZero}is a subtype ofNTuple{3, BoundaryCondition}`.
Widely used variants of boundary condition, such as Periodic, DirichletZero, are singleton types (they don't have any fields). So, if the boundary conditions are specified as for example Tuple{Periodic,Periodic,DirichletZero}, you don't actually need to store this tuple, you can directly dispatch on its type.
If we also define the system as AbstractSystem{D}, than we enforce that the size of the tuple of boundary conditions equals to the dimensionality of the system.

`visualize_ascii` on unit test case is broken

(potentially related to #83)

If I just copy/paste this code from test/interface.jl into the REPL:

box = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]u"m"
bcs = [Periodic(), Periodic(), DirichletZero()]
positions = [[0.25, 0.25, 0.25], [0.75, 0.75, 0.75]]u"m"
elements = [:C, :C]
atoms = [Atom(elements[i], positions[i]) for i in 1:2]

flexible = FlexibleSystem(atoms, box, bcs)

I get:

Error showing value of type FlexibleSystem{3, Atom{3, Quantity{Float64, 𝐋, Unitful.FreeUnits{(m,), 𝐋, nothing}}, Quantity{Float64, 𝐋 𝐓⁻¹, Unitful.FreeUnits{(a₀, s⁻¹), 𝐋 𝐓⁻¹, nothing}}, Quantity{Float64, 𝐌, Unitful.FreeUnits{(u,), 𝐌, nothing}}}, Quantity{Int64, 𝐋, Unitful.FreeUnits{(m,), 𝐋, nothing}}}:
ERROR: ArgumentError: invalid Array dimensions

I haven't tried to track down exactly what is causing this, but it doesn't seem to affect any functionality (or cause any tests to fail, since tests only test show and not display, which is what is called in the REPL if you just put the variable name), I'm wondering if we should just dispatch display to show by default. Is there any foreseeable issue with that approach?

should there be a `distance` method?

I'm imagining a primary syntax something like distance(sys::AbstractSystem, atom_ind_1, atom_ind_2) that would return the distance between two particles in the structure, and maybe also single-index and no-index options that would return distances from one atom to every other one, and a matrix of distances, respectively?

Similarly, should there be a vector version of a similar kind of thing? (i.e. returns the vector in the bounding_box basis rather than just its length)

Personally, I could imagine these having broad enough use (interatomic potentials, geometry optimization, etc.) that it would be worth having, but curious for others' thoughts.

(And of course, as with the beauty of any interface, anyone could dispatch the function on whatever performant method implementation they want)

Introduce `mass`

I think the default mass of a particle should be mass and not atomic_mass.

First, this does not create any ambiguities: The prefix atomic_ is to emphasize it is the mass of an atom. If the particle of which we are getting the mass is an atom then that information is already encoded in the type of the particle object. In other words, mass(x::Atom) already encodes all the information, whereas atomic_mass(x::Atom) duplicates it.

Secondly, it will generalize the terminology to other particles that are not necessarily atoms.

One could introduce mass side by side with atomic_mass or it could entirely replace atomic_mass.

Energies, Forces, Etc

I propose that we stop ~~dragging our feet~~ sleeping on this and implement prototypes for

potential_energy
forces
virial

We can add more if desired e.g.

stress
hessian

come to mind. potential_energy could also be called just energy, I don't mind either way.

This will make an ultra-lightweight interface. If needed it can be moved out of AtomsBase into a separate package, but I see no harm having it here for now. If Molly were the only thing that does simulations then it could live there, but it's not.

If people agree please :thumbsup and I'll make a PR.

CC @jgreener64 @rkurchin @mfherbst @tjjarvinen

UPDATE: after discussion the current proposal is the following:

potential_energy(system, calculator; kwargs...)
forces(system, calculator; kwargs...)
forces!(F, system, calculator; kwargs...)
virial(system, calculator; kwargs...)
energy_forces_virial(system, calculator; kwargs...)
energy_forces_virial!(F, system, calculator; kwargs...)

best names for `FastSystem` and `FlexibleSystem`?

They were originally SoASystem and AoSSystem and were changed to the present names. Thoughts?

Do we really need separate `AbstractParticle` and `AbstractElement` types?

Right now we have two types describing our particles: AbstractParticle and AbstractElement.
AbstractElement contains the physical information about the element, besides its coordinates and possibly velocities.

At the same time AbstractParticle by definition is simply a container which stores AbstractElement, vector of coordinates and possibly vector of velocities. Since AbstractParticle is simply a container, it does not add anything new in terms of functionality. Moreover, since it may or may not contain velocities, we need to create at least two concrete subtypes: one without velocities and one with them, which again leads to the redundancy in the interface (we need to implement he functions that deal with AbstractElement and coordinates for both of these subtypes).

In these regards, Rachel mentioned this dicsussion of the indexing in Xtals package.

So, I propose that we remove AbstractParticle type. We can rename AbstractElement to AbstractParticle although, but it would still correspond to element data without coordinates and velocities. And getindex would then simply return a tuple (Element, coordinate) or (Element, coordinate, velocity).

Generating bicrystals using CSL for bounding box

I'm curious to know if being able to create bicrystals by generating the coincident site lattice boundaries for simulation cell with PBC would be within the scope of AtomsBase.jl. This capability is missing in ASE build module and would be a nice feature in AtomsBase.jl in my opinion. I can start implementing it for an orthorhombic pbc simulation cell if this is the right home for it.

Library of examples

Now that AtomsBase is starting to get wider usage, I think a library of examples in the docs would be a good thing to have. Starting this issue to brainstorm possible thing to include...

Interface Update v0.4.x

The following document summarizes a proposed interface update for 0.4.x. Please provide comments by 8 July, after which I will start a new PR that implements this. If you plan to provide comments but after 8 July just let me know here please.

https://hackmd.io/@TQNVbm-8R6mxyDAnh7mzAA/HJgCF-xP0

Short summary

Rework boundary conditions interface, introduce a cell type, remove all usage of infinities. #99, #97
Introduce a "chemical element" / "atom id" type that is isbits and therefore computationally efficient but sufficiently flexible for most use cases; #49
Slim down the core interface, in particular make dynamic getters such as system[:position, i] part of an extended interface.
Introduce setters as a second extension to the interface.
Move the reference implementations to an unexported submodule.
#100, #28, #43, #78, #5

EDIT - just to be clear - we are only talking about 0.4.x. Further evolutions are likely. This need not be the final iteration.

some kind of `wrap` function for PBC's?

Would it make sense to have this be part of the interface? If so, we would need to be able to dispatch on boundary condition types, which might require rejiggering the type parameters of AbstractSystem...

Chemical Element

At the moment, the Atom type is defined as follows:

struct Atom{D, L<:Unitful.Length, V<:Unitful.Velocity, M<:Unitful.Mass}
    position::SVector{D, L}
    velocity::SVector{D, V}
    atomic_symbol::Symbol
    atomic_number::Int
    atomic_mass::M
    data::Dict{Symbol, Any}  # Store arbitrary data about the atom.
end

I notice two things: why do we need atomic symbol AND atomic number? Doesn't it define the same thing? Secondly, Symbol is not an isbbits type (which really tripped me up a little while ago).

I therefore propose to implement a (could be named something else of course)

struct ChemicalSpecies
    Z::Int 
end

which is now isbits, and could be displayed as the symbol for readability.

`position` conflicts with `Base.position`

I am not sure this was known when get_position was changed to position in #4. But having an export conflicting with Base doesn't seem like a good idea.

julia> using AtomsBase

julia> position
WARNING: both AtomsBase and Base export "position"; uses of it in module Main must be qualified
ERROR: UndefVarError: position not defined

This means that whenever position needs to be called , it would need to be written as AtomsBase.position unless we add a line using AtomsBase: position

It seems that exporting the name in Base was a mistake, but it won't be changed until julia 2.0 (See JuliaLang/julia#33799)

`AbstractSystem` as a subtype of `AbstractArray`

Right now, AbstractSystem is defined as a subtype of AbstractVector. In case of a static D-dimensional lattice it is more convenient to use cartesian indices instead of linear indices, so this subtyping does not make much sense in this particular case.

Instead, AbstractSystem could subtype AbstractArray ( or not subtype anything).
In either case, we keep the functions Base.getindex and Base.size. I also propose to add Base.length to the list of interface functions because Base.size and Base.length have different meaning in the context of multidimensional arrays.

`add_atom`/`remove_atom` functions, example with mutability?

Nothing should stop someone from having a subtype of AbstractSystem that's mutable, and it could be useful for setting up geometries for simulations, especially if they're rather large and the overhead of rebuilding a copy with one atom added/removed could be large.

I was thinking a few related things:

It would be nice to have an example (see #50) for using AtomsBase for building geometries for, e.g., MD or DFT simulations.
For that purpose, having functions like add_atom and remove_atom that just return a copy of the system with an atom added/removed, would be useful. Or, if we make an example subtype of AbstractSystem that's actually itself mutable, we could also have add_atom! and remove_atom!, which could be nice for a larger system where the overhead of making the copy might be large.
This would also serve as another example of how easy interoperability is, since you'd presumably want to convert back to some immutable and package-specific system type to run your actual calculation.

I'm happy to mock up a basic version of this, but thought I'd open an issue for discussion in case people had other thoughts about it.

My main question is, do add_atom/remove_atom belong in AtomsBase itself, or should they (like, e.g. visualization-related functionality that's being worked on in Sclera) rather live in a separate package that people can depend on if they need it?

Element can't be found when using String for AtomID

AtomsBase.jl/src/interface.jl

Line 91 in 652f916

element(id::Union{Symbol,AbstractString,Integer}) = PeriodicTable.elements[id]

MWE:

Atom("Si", zeros(3)u"\AA")

Export optional system and atoms property keys and their units

The tables about the optional atomic and system properties from https://juliamolsim.github.io/AtomsBase.jl/stable/tutorial/ should be available for downstream codes in the form of constants in AtomsBase. This came up in libAtoms/ExtXYZ.jl#29.

Interfacing extended XYZ readers and writers for AtomsBase

Hello! I'd like to contribute extended XYZ readers and writers to AtomsBase. I have an existing Julia package, ExtXYZ.jl, which calls a C library containing parser that is auto-generated from a formal grammar for the extended XYZ format that we've been working on recently, as well as a C writer.

There are two things I'd like to discuss before starting work ob this:

Package Dependencies. ExtXYZ.jl is already interfaced to @cortner's JuLIP.jl. There we made JuLIP depend on ExtXYZ, but since it seems AtomsBase is designed to contain the basic interface only I'm wondering if it would be better for me to make my ExtXYZ package depend on AtomsBase instead? Or perhaps there should be a new package which uses both of them?
Arbitrary Data. Extended XYZ format allows for arbitrary per-atom and per-config data. This is useful e.g. to store (perhaps multiple sets of) energies and forces associated with a structure when training machine learning potentials, or for visualisation of per-atom quantities with OVITO which can also read extxyz format. In ASE these are stored in the Atoms.arrays and Atoms.info dictionaries, respectively. In JuLIP we did something similar, with a single data dictionary. I see there is a data dict in the Atom type, but from what I can tell storing additional data associated with each structure is not included in the abstract AbstractSystem API. Is this something you've discussed already and ruled out? If so should I add interface routines which convert to/from (AtomsBase.AbstractSystem, Dict) pairs and the dictionaries that ExtXYZ works with? Or would it be better to define my own concrete subtype of AbstractSystem which includes e.g. per_atom_properties::Dict{Symbol,Any} and per_structure_properties::Dict{Symbol,Any} dictionaries, or similar?

What type parameters should `AbstractSystem` have?

Currently we have one D for number of dimensions and one S for the species type. There could be more (e.g. adding Unitful types to specify units for length, velocity, etc.) or fewer, i.e. removing one/both of the current ones. Discuss!

add some more tests

another "note to self" issue of something we should do before registering...probably just making sure the example files run is sufficient for now, and maybe also just making sure every interface function gets run on the concrete implementations we export

Adding support for bonds

Just wanted to start a discussion about potential ways to store bond information in the current AtomsBase object as well as how trajectories could be standardized.

Bonds could just be added as an optional keyword with format up to the user or the format could be dictated. Either way there should be the option to include this data I think.
- Trajectories I'm a little less certain how to include as they are dynamic and AtomsBase fundamentally is not. That said a lot of analysis functions act on atomic trajectories not systems and it could be nice to unify trajectory formats as well.

Support Cartesian indexing

In lattice-based systems such as those @Gregstrq works on, it will eventually be desirable to support Cartesian and not just linear indexing into AbstractSystem objects.

Boundary Conditions

PROPOSAL

Replace

boundary_conditions::SVector{D, BoundaryCondition}

with one of

# option 1  (strong preference for this)
pbc::NTuple{D, Bool}

# option 2
struct PBC{D}
   vals::NTuple{D, Bool}
end 

boundary_condition::PBC{D}

and to clarify in the documentation the difference between boundary conditions and constraints.

This would be a breaking change and affect downstream packages.

ORIGINAL POST:

The boundary conditions currently cannot take two values without type instability:

boundary_conditions::SVector{D, BoundaryCondition}

If I want PBC in x, y but open in z direction? A possible solution is to allow Periodic(true) and Periodic(false). But e.g. Periodic(), Open() will not work. As I mentioned on Zulip, I think for particle systems there is only one type of boundary condition: periodic {true, false}. Everything else is to be implemented as a contraint. I propose to remove boundary_conditions entirely and replace it by a pbc::NTuple{D, Bool}.

ZeroDirichlet or anythig like that should be implemented as a contraint, not a boundary conditions. (this is different from continuum mechanics)

There are some more complicated cases that I'm not 100% certain how to address. E.g. reflections. (quadrupole method) but I suspect those are easiest to implement by just storing a larger system + the information that this systems has a certain symmetry

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!

Working Unitless

It appears that Atomsbase won't let me work unitless. This seems unnecessarily restrictive. What is the intention?

FlexibleSystem is missing a propertynames implementation

I think in #53 we missed adding a propertynames implementation for FlexibleSystem.

@JanHab Do you want to take care of this?

What is the scope of this package?

The README says

AtomsBase is an abstract interface for representation of atomic geometries in Julia.

and

Currently, the design philosophy is to be as lightweight as possible, with only a small set of required function dispatches.

So what all functions are part of these "required function dispatches". Are operations like getting distances, angles, dihedral angles etc. planned to be implemented here. Are more convenience functions like those which exist in ASE to create and transform structures (like creating supercell etc.) intended to be a part of this. Or are these functionalities to be implemented in new packages making use of the interfaces defined here?

Should we enforce Unitful coordinates?

This is the place to discuss whether we want to enforce specific types for the coordinates and, by extension, for the velocities.

Right now, the proposed way is to assume that the coordinates must have a unitful type with dimension of the length, while the velocities must have a unitful type with the corresponding dimension of the velocity. As a consequence, all of the realizations of the abstract interface need to enforce this assumption.

What is the opinion of the community members about this restriction?

My take on it is that this is an unnecessary restriction. Although I agree that the majority of use cases pertain to the particles in real space, I can imagine doing dynamical simulations with the particles living in some abstract space. For example, in momentum space. As a result, I propose to relax the restriction and allow the coordinates and velocities to have arbitrary types. Instead, I propose that we should enforce the "homogeneity" of the used types: one should not be able to mix the unitless and unitful coordinates while describing the same system (one should not be able to use unitful quantities with different dimensions as well).

getting positions/velocities on the whole system vs. one particle

Just going to file an issue for discussing this since @Gregstrq and I were just going back and forth about it on Slack.

Currently, the setup is (I'm going to write it out just for positions, but it follows exactly analogously for velocities) that there's just one function position, which behaves this way:

position(::AbstractParticle) --> position of that particle (this is obvious)
position(::AbstractSystem) --> positions over every particle in the system

Grigory (assuming I was understanding correctly) was suggesting that we could instead do something like

position(::AbstractParticle) --> position of that particle (this is obvious)
position(::AbstractSystem, index) --> position of particle at an index
positions(::AbstractSystem) --> new function for getting all the positions

Personally, I don't feel it's crucial to add a new function name as I think the behavior as it exists now is fairly intuitive, though I would be fine with adding the indexed version.

Anyway, discuss.

Should slicing work? If so, what should it do?

Currently, you can only use single indices. Intuitively, I would say slicing should return a new system with the same bounding box, boundary conditions, etc. but only containing the sliced indices. However, one could also imagine it returning a list of the actual particle objects referenced by the indices...also, either of these becomes a bit fraught in the case of a lattice with Cartesian indexing because the bounding box is more "intimately" connected to the specific particles/sites...so maybe not supporting slicing at the top level is fine?

Display is too rich as default

I appreciate this is largely personal preference:

The extremely detailed and rich text-based summary of the Atoms object is nice, but as a default I think it gives far too much information for my taste. I normally, want to see a very brief summary (cf Lux layers) and then the option to for a more detail. E.g. there could be a rich_display function added to AtomsBase.

How do others feel about this?

we should add docstrings

Just leaving this here so that we don't forget to do it before actually registering the package 😆

Should `get_` be removed from the getter functions

At the minute the getter functions such as get_position, get_box etc. are preceded by get_.

I can see the advantage of keeping common variable names free, but I lean towards the Julia convention of not using get_ (and adding ! for any corresponding setters).

Do others have opinions on this?

Is `AbstractElement` the best name?

It could be somewhat confusing terminology in the case where the identifier isn't actually an element. Would something like AbstractIdentifier or AbstractParticleIdentifier be too vague? Do people have other possibilities to suggest?

Dynamic properties for `StaticAtom`

While playing a bit with AtomsBase support for DFTK I quickly ran into the issue that I need to attach extra "properties" to an atom for our internal bookkeeping (pseudos, magnetisation etc) and that the element which was stored inside the struct turned out to be not all that useful.

So how about we avoid storing the element at all (can always be recovered from the atomic_symbol) and allow appending/storing arbitrary properties using a Dict? Along the lines that got me thinking if might not be a better approach to just integrate with the standard Julia properties interface instead of having atomic_property at all. What I have in mind is something similar to https://github.com/JuliaMolSim/DFTK.jl/blob/master/src/Energies.jl#L40 where the keys are the just "by convention". Naturally one could fill the Dict with some default data from a lookup into elements on atoms construction.

I don't think it makes sense to do this for some special properties like species, velocities, positions to be part of the Dict to ensure type stability on these, so those I would still keep these separate.

Quick sketch what I have in mind:

struct Atom{D,L<:Unitful.Length}
    position::SVector{D,L}
    atomic_symbol::Symbol
    data::Dict{Symbol,Any}
end

function Atom(position, symbol::Union{Integer,AbstractString,Symbol})
    el = elements[symbol]
    data[:atomic_mass] = at.element.atomic_mass
    data[:atomic_number] = at.element.number    
    Atom(position, el.symbol, data)
end

Base.propertynames(at::Atom, private=false) = private ? append!(keys(data), fieldnames(at)) : keys(data)
Base.getproperty(at::Atom, x::Symbol) = x in keys(data) ?  getindex(data, x) : getfield(at, x)
Base.setproperty!(at::Atom, x::Symbol, v) = x in keys(data) ? setindex!(data, x, v) : setfield!(at, x, v)

# Maybe?
element(at::Atom) = elements[at.atomic_symbol]

Note that this is quite flexible and also future-proof. If we decide that a property is so important that it deserve to be a true field of Atom we just make it one and effectively "move it out" of the dict without any further code changes downstream.

Thoughts?

Utility to access optional AtomsBase keys (e.g. charge)

This issue is in reference to the solution to #74 and possible ways to improve the use of optional keys with other parts of the AtomsBase interface.

The original problem: Defining a getindex(sys::AbstractSystem, i::Integer, x::Symbol) for libraries (like Molly) that make use of the AtomView object can be challenging especially when optional atomkeys like charge are used.
The current solution to implement the getindex function in Molly is pasted below. The use of optional keys makes implementing this function tricky. For example, Molly now has to maintain the custom_keys list and also define functions that mimic the AtomsBase interface to retrieve the property specified by x if x is optional in the AtomsBase interface. For mandatory keys like atomic_number, AtomsBase requires the definition of AtomsBase.atomic_number but that is not the case for optional keys (e.g. you cannot define AtomsBase.charge). In general, it would be nice to have a method to access optional keys through the AtomsBase interface instead of relying on code custom to the library implementing AtomsBase.
One thought I had was maybe add all of the optional keys to the interface but set the methods to nothing. Not sure if this does what I think it does, but I think this would make the function optional to overload with your own definition and if you did not overload it gets compiled to a no-op.
Something like: charge(sys::AbstractSystem) = nothing

Might be missing something from this explanation so feel free to add anything @mfherbst.

function Base.getindex(system::Union{System, ReplicaSystem}, i, x::Symbol)
    atomsbase_keys = (:position, :velocity, :atomic_mass, :atomic_number)
    custom_keys = (:charge, )
    if hasatomkey(system, x)
        if x ∈ atomsbase_keys
            return getproperty(AtomsBase, x)(system, i)
        elseif x ∈ custom_keys
            return getproperty(Molly, x)(system, i)
        end
    else
      throw(KeyError("Key $(x) not present in system."))
    end
end

Should we add `mass` and `momentum` functions?

@cortner suggested this...TBD if it belongs at the top level of the interface...