
Comments (10)

JaGeo avatar JaGeo commented on June 2, 2024 2

@stefsmeets I just wanted to mention that this might be problematic for very large structures: there are only 26 letters in the alphabet. Apart from this drawback, such a solution would be nice.

from pymatgen.

stefsmeets avatar stefsmeets commented on June 2, 2024 2

I'd be happy to have a go at this


fxcoudert avatar fxcoudert commented on June 2, 2024 1

One of the reasons labels should be unique: they are used to indicate bonding (and angles, torsions, etc). If they are not unique, this information is useless. (And I am working on a PR to add that capability to pymatgen.)


Andrew-S-Rosen avatar Andrew-S-Rosen commented on June 2, 2024

I can confirm that this is likely a bug (in terms of having agreement with the CIF spec).


stefsmeets avatar stefsmeets commented on June 2, 2024

The bug may not necessarily be in the CifWriter itself. Pymatgen expands the structure to P1 when reading a CIF file, so the symmetry information is lost. This expansion also duplicates the labels.


fxcoudert avatar fxcoudert commented on June 2, 2024

Does pymatgen require labels to be unique? I do not know. If so, the bug is in the symmetry expansion. But what is sure is that labels in a CIF file should be unique (whether pymatgen accepts duplicate labels or not).


stefsmeets avatar stefsmeets commented on June 2, 2024

No, pymatgen does not require unique labels. The duplication of labels happens when the symmetry is expanded in the CifParser:

pymatgen/pymatgen/io/cif.py

Lines 1003 to 1011 in 0e57abf

if not match:
    coord_to_species[coord] = comp
    coord_to_magmoms[coord] = magmom
    labels[coord] = label
else:
    coord_to_species[match] += comp
    # disordered magnetic not currently supported
    coord_to_magmoms[match] = None
    labels[match] = label

One way to solve this would be to ensure that labels are suffixed with 'a', 'b', 'c', ... when the symmetry is expanded there, although this could also be handled as a check on the CifWriter side.
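
As a rough illustration of the suffixing idea, here is a minimal sketch in plain Python that works on a list of label strings rather than on pymatgen objects. The helper names `letter_suffix` and `suffix_duplicate_labels` are hypothetical and are not part of pymatgen. Spreadsheet-style suffixes ('a' to 'z', then 'aa', 'ab', ...) are one possible way around the 26-letter limit raised elsewhere in this thread. The sketch assumes that a suffixed label does not collide with a label that already exists.

```python
from collections import Counter


def letter_suffix(index: int) -> str:
    """Return 'a'..'z' for 0..25, then 'aa', 'ab', ... (hypothetical helper)."""
    letters = ""
    index += 1  # switch to 1-based counting for bijective base 26
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("a") + remainder) + letters
    return letters


def suffix_duplicate_labels(labels: list[str]) -> list[str]:
    """Append a letter suffix to every label that occurs more than once.

    Labels that are already unique are left untouched.
    Assumption: the suffixed result does not clash with an existing label.
    """
    totals = Counter(labels)
    seen: Counter = Counter()
    result = []
    for label in labels:
        if totals[label] == 1:
            result.append(label)
        else:
            result.append(label + letter_suffix(seen[label]))
            seen[label] += 1
    return result


print(suffix_duplicate_labels(["Fe1", "Fe1", "O1", "Fe1"]))
```

Leaving unique labels untouched keeps the output close to what the user wrote, which matters if unexpected relabelling is a concern.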


JaGeo avatar JaGeo commented on June 2, 2024

@stefsmeets sounds great. (cc @janosh)


stefsmeets avatar stefsmeets commented on June 2, 2024

Some options:

  1. CifWriter raises ValueError if Structure contains duplicate labels -> user is forced to correct it
  2. CifWriter emits UserWarning if Structure contains duplicate labels -> user can choose to correct it
  3. CifWriter overwrites labels to fix the issue -> less hassle for the users, but can lead to unexpected labels in CIF file
  4. CifParser.parse_structures() ensures unique labels when reading CIF -> if the user modifies the sites, they may still end up with duplicate labels
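
Options 1 and 2 differ only in how strictly a duplicate is reported. A minimal sketch of such a check, assuming the labels are available as a plain list of strings; the function name `check_unique_labels` and its `strict` flag are hypothetical and do not exist in pymatgen:

```python
import warnings
from collections import Counter


def check_unique_labels(labels: list[str], strict: bool = False) -> list[str]:
    """Report duplicate labels (hypothetical helper, not pymatgen API).

    strict=True  -> raise ValueError (option 1)
    strict=False -> emit a UserWarning and return the duplicates (option 2)
    """
    duplicates = sorted(label for label, n in Counter(labels).items() if n > 1)
    if not duplicates:
        return []
    message = f"Duplicate site labels: {', '.join(duplicates)}"
    if strict:
        raise ValueError(message)
    warnings.warn(message, UserWarning, stacklevel=2)
    return duplicates


# Option 2 behaviour: warns, then returns the offending labels.
check_unique_labels(["Fe1", "Fe1", "O1"])
```
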

I want to avoid aggressively or automatically relabelling, because I know that this can lead to unexpected results, which can be frustrating. So I'm leaning towards 1 or 2.

To help with 1 or 2, I want to add a method: Structure.relabel_sites() to give users some room to quickly relabel a large number of sites and ensure uniqueness. I think this can be useful to add regardless.
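
To make the relabelling idea concrete, here is a minimal sketch of one possible scheme: number the sites per element in order of appearance. It takes a plain list of element symbols instead of a Structure, and the name `relabel_by_element` is hypothetical. It is not a description of how Structure.relabel_sites() would actually be implemented.

```python
from collections import defaultdict


def relabel_by_element(symbols: list[str]) -> list[str]:
    """Build unique labels such as 'Fe1', 'Fe2', 'O1' from element symbols.

    Hypothetical helper: each symbol gets its own running counter, so the
    resulting labels are unique as long as the symbols are purely alphabetic.
    """
    counters: defaultdict[str, int] = defaultdict(int)
    labels = []
    for symbol in symbols:
        counters[symbol] += 1
        labels.append(f"{symbol}{counters[symbol]}")
    return labels


print(relabel_by_element(["Fe", "Fe", "O", "Fe", "O"]))
```

Unlike a letter suffix, a numeric counter has no alphabet limit, but it discards whatever labels the sites had before.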

Let me know what you think.


fxcoudert avatar fxcoudert commented on June 2, 2024

  • Option 1: sadly, there are CIF files (mostly from computational tools) that have this problem, so I think that is not viable.
  • Option 2: sounds nice, but does not account for situations where pymatgen itself creates duplicate labels (when lowering symmetry). How about adding an option to CifWriter to make labels unique?

