Comments (8)

kiryph commented on August 26, 2024

Thank you for your recent updates including 4120397.

  1. However, I think there is a regression:
❯ cat my.bib
@article{Koch:1984,
  author  = {Koch, E.},
  doi     = {10.1107/S0108767384001227},
  issn    = {0108-7673},
  journal = {Acta Crystallographica Section A Foundations of Crystallography},
  month   = {sep},
  number  = {5},
  pages   = {593--600},
  title   = {{The implications of normalizers on group–subgroup relations between space groups}},
  volume  = {40},
  year    = {1984}
}
@Article{Dove:1997,
  author  = {Dove, Martin T.},
  title   = {{Theory of displacive phase transitions in minerals}},
  doi     = {10.2138/am-1997-3-401},
  issn    = {0003-004X},
  journal = {American Mineralogist},
  month   = {apr},
  number  = {3-4},
  pages   = {213--244},
  url     = {https://pubs.geoscienceworld.org/ammin/article/82/3-4/213-244/43266},
  volume  = {82},
  year    = {1997}
}
@article{Koch:1984,
  author  = {Koch, E.},
  doi     = {10.1107/S0108767384001227},
  issn    = {0108-7673},
  journal = {Acta Crystallographica Section A Foundations of Crystallography},
  month   = {sep},
  number  = {5},
  pages   = {593--600},
  title   = {{The implications of normalizers on group–subgroup relations between space groups}},
  volume  = {40},
  year    = {1984}
}

❯ cat bibtoolrsc
check.double = on
sort         = on
sort.format  = { doi }

❯ /usr/local/Cellar/bib-tool/HEAD-4120397/bin/bibtool -r bibtoolrsc ~/my.bib
*** BibTool WARNING (line 26 in /Users/kiryph/my.bib): Possible double entry discovered to (line 13 in /Users/kiryph/my.bib) `dove:1997'
*** BibTool WARNING (line 26 in /Users/kiryph/my.bib): Possible double entry discovered to (line 1 in /Users/kiryph/my.bib) `koch:1984'
*** BibTool WARNING (line 13 in /Users/kiryph/my.bib): Possible double entry discovered to (line 1 in /Users/kiryph/my.bib) `koch:1984'

@Article{	  koch:1984,
  author	= {Koch, E.},
  doi		= {10.1107/S0108767384001227},
  issn		= {0108-7673},
  journal	= {Acta Crystallographica Section A Foundations of
		  Crystallography},
  month		= {sep},
  number	= {5},
  pages		= {593--600},
  title		= {{The implications of normalizers on group–subgroup
		  relations between space groups}},
  volume	= {40},
  year		= {1984}
}

###Article{	  dove:1997,
  author	= {Dove, Martin T.},
  title		= {{Theory of displacive phase transitions in minerals}},
  doi		= {10.2138/am-1997-3-401},
  issn		= {0003-004X},
  journal	= {American Mineralogist},
  month		= {apr},
  number	= {3-4},
  pages		= {213--244},
  url		= {https://pubs.geoscienceworld.org/ammin/article/82/3-4/213-244/43266},
  volume	= {82},
  year		= {1997}
}

###Article{	  koch:1984,
  author	= {Koch, E.},
  doi		= {10.1107/S0108767384001227},
  issn		= {0108-7673},
  journal	= {Acta Crystallographica Section A Foundations of
		  Crystallography},
  month		= {sep},
  number	= {5},
  pages		= {593--600},
  title		= {{The implications of normalizers on group–subgroup
		  relations between space groups}},
  volume	= {40},
  year		= {1984}
}

The entry dove:1997 should not be detected as a duplicate (different doi and bibkey).
Maybe the message could mention the reason for the duplicate detection.

  2. I am also wondering whether it is possible to sort according to doi to get a warning for duplicate dois, but also an error for duplicate bibkeys?

from bibtool.

ge-ne commented on August 26, 2024

sort.format = { doi }

This means that every record gets the same constant sort key, namely the literal string "doi". If you want to compare by the value of the field, use something like sort.format = { %s(doi) }
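
To make the difference concrete, here is a small Python sketch. It is only an illustration of the idea and not BibTool's implementation; the helper names (constant_sort_key, field_sort_key, count_equal_pairs) are made up for this example. It compares the three records from the report pairwise, once with a constant key and once with the value of the doi field.

```python
# Illustrative sketch only; this is NOT how BibTool is implemented.
# All function names below are hypothetical.

entries = [
    {"key": "Koch:1984", "doi": "10.1107/S0108767384001227"},
    {"key": "Dove:1997", "doi": "10.2138/am-1997-3-401"},
    {"key": "Koch:1984", "doi": "10.1107/S0108767384001227"},
]


def constant_sort_key(entry):
    # Analogue of sort.format = { doi }: the literal text "doi" for every record.
    return "doi"


def field_sort_key(entry):
    # Analogue of sort.format = { %s(doi) }: the value stored in the doi field.
    return entry["doi"]


def count_equal_pairs(records, make_key):
    # Count how many pairs of records end up with the same sort key.
    keys = [make_key(record) for record in records]
    return sum(
        1
        for i in range(len(keys))
        for j in range(i + 1, len(keys))
        if keys[i] == keys[j]
    )


constant_pairs = count_equal_pairs(entries, constant_sort_key)
field_pairs = count_equal_pairs(entries, field_sort_key)
print(constant_pairs, field_pairs)
```

With the constant key every pair of records compares equal, which matches the three warnings in the report; with the field value only the two Koch:1984 records coincide.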

I am also wondering whether it is possible to sort according to doi to get a warning for duplicate dois, but also an error for duplicate bibkeys?

I am thinking about having several formats for different equality criteria.

kiryph commented on August 26, 2024

This means that every record gets the same constant sort key, namely the literal string "doi". If you want to compare by the value of the field, use something like sort.format = { %s(doi) }

Aaah, I see. So my fault 😞. With sort.format = { %s(doi) } it works as expected.

I am thinking about having several formats for different equality criteria.

Thanks for considering it.

ge-ne commented on August 26, 2024

The resource unique.field has been introduced for this purpose.
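
The request behind this, an error for duplicate bibkeys plus a warning for duplicate dois, amounts to applying two independent equality criteria to the same records. The Python sketch below illustrates that idea only. It does not reproduce the behaviour or message format of unique.field, and the names group_by and check as well as the message texts are assumptions made for this example.

```python
# Illustrative sketch only; it does NOT reproduce BibTool's unique.field.
# group_by, check, and the message wording are hypothetical.
from collections import defaultdict


def group_by(records, field):
    # Map each value of `field` to the keys of the records that carry it,
    # keeping only values that occur more than once.
    groups = defaultdict(list)
    for record in records:
        value = record.get(field)
        if value is not None:
            groups[value].append(record["key"])
    return {value: keys for value, keys in groups.items() if len(keys) > 1}


def check(records):
    # Criterion 1: a repeated citation key is treated as an error.
    # Criterion 2: a repeated doi is only worth a warning.
    messages = []
    for key, keys in group_by(records, "key").items():
        messages.append(f"ERROR: key {key} used {len(keys)} times")
    for doi, keys in group_by(records, "doi").items():
        messages.append(f"WARNING: doi {doi} shared by {', '.join(keys)}")
    return messages


entries = [
    {"key": "koch:1984", "doi": "10.1107/S0108767384001227"},
    {"key": "dove:1997", "doi": "10.2138/am-1997-3-401"},
    {"key": "koch:1984", "doi": "10.1107/S0108767384001227"},
]

messages = check(entries)
for message in messages:
    print(message)
```

Keeping the two criteria separate means that dove:1997 is reported by neither check, while the repeated koch:1984 record is caught by both.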

kiryph commented on August 26, 2024

I have one thought about the last sentence of the following item, recently added to Changes.tex (4120397), regarding the speed of bibtool:

  • The behaviour of the resource \rsc{check.double} has been
    generalized. The requirement that double entries to be adjacent
    has been dropped. This has the impact that the processing is
    slightly slower.

I think when bibtool is run with the option -q (quiet), the runtime should not be affected by the extended duplicate check.

ge-ne commented on August 26, 2024

I think when bibtool is run with the option -q (quiet), the runtime should not be affected by the extended duplicate check.

Unfortunately this is not true. The processing of duplicates (deleting or marking) has to happen in any case.

The complexity of the algorithm was O(n) and is now something like O(n log(n)). But I am not worried. The processing speed of up-to-date computers should be sufficient for realistic databases.
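
For readers wondering where the extra log factor could come from, here is a Python sketch of the two strategies. Only the complexity figures come from this thread; the algorithms shown are an assumption about one plausible way to get them and not BibTool's actual code, and the function names are made up.

```python
# Illustrative sketch only; NOT BibTool's implementation.
# adjacent_duplicates and all_duplicates are hypothetical names.


def adjacent_duplicates(keys):
    # One pass that compares each key with its predecessor: O(n).
    # Duplicates that are not next to each other go unnoticed.
    return [i for i in range(1, len(keys)) if keys[i] == keys[i - 1]]


def all_duplicates(keys):
    # Sort the positions by key (O(n log n)), then scan neighbours in
    # sorted order. Finds duplicates wherever they sit in the input.
    order = sorted(range(len(keys)), key=lambda i: keys[i])
    duplicates = []
    for previous, current in zip(order, order[1:]):
        if keys[previous] == keys[current]:
            duplicates.append(current)
    return sorted(duplicates)


keys = ["a", "b", "a", "c", "c"]
adjacent = adjacent_duplicates(keys)
everywhere = all_duplicates(keys)
print(adjacent, everywhere)
```

The adjacent-only pass misses the second "a" because a "b" sits in between, whereas the sort-based variant reports both repeated positions at the cost of the sort.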

kiryph commented on August 26, 2024

The processing of duplicates (deleting or marking) has to happen in any case.

I think I was not specific enough: I meant when run without -d. Is this still true?

However, I agree that the speed of bibtool is nothing I am worried about either. People writing articles, theses, or books should have fewer than 10,000 entries and should not notice anything. It might matter to people with different usage scenarios, such as web-scraping millions of bibliographic records (possibly with bib files running into gigabytes; in that case, however, they should no longer be using plain-text files).

kiryph commented on August 26, 2024

The newly added feature

resource unique.field

is really what I had in mind. Thank you very much for adding it (f02e496).
