Comments (9)

goerz commented on July 19, 2024

Related issue (pretty old): Humans-of-Julia/Bibliography.jl#9

from documentercitations.jl.

goerz commented on July 19, 2024

Yeah, there are some issues with DocumenterCitations, or rather with the underlying Bibliography.jl/BibParser being finicky about the .bib content it can process. People have some wild syntax in their .bib files!

Luckily, you can just clean up your .bib file to make it work 😉. None of these extra braces should actually be there (there is no capitalization, like acronyms, to preserve here), and the double braces like {{Introduction}} definitely shouldn't be there. In fact, {{Introduction}} arguably is handled correctly: with the double braces, you're indicating that you want to keep the inner braces. So Zotero should improve their export.

We do have some heuristics in place for handling braces, but they don't include braces over multiple words, which is why they're not removed from "Smooth Manifolds" and "Cambridge University Press". Maybe we can improve that, but it would actually be much better if this was handled transparently by the underlying Bibliography.jl.
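To illustrate why a word-level heuristic leaves multi-word groups alone, here is a minimal Python sketch. The regex and the function name `strip_single_word_braces` are hypothetical and only mimic the behavior described above; this is not the actual DocumenterCitations code.

```python
import re

# Hypothetical heuristic: only matches braces wrapped around exactly one word.
SINGLE_WORD_BRACES = re.compile(r"\{(\w+)\}")


def strip_single_word_braces(text: str) -> str:
    """Remove braces that wrap a single word; leave all other braces alone."""
    return SINGLE_WORD_BRACES.sub(r"\1", text)


# Single-word groups are unwrapped, multi-word groups are left untouched.
print(strip_single_word_braces("An {Introduction} to {Smooth Manifolds}"))
# Only one level is removed from doubled braces.
print(strip_single_word_braces("{{Introduction}}"))
```

A pattern like this cannot see across the space in "{Smooth Manifolds}", and it removes only one level from "{{Introduction}}", which matches the symptoms discussed in this thread.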

goerz commented on July 19, 2024

There's a limit to what we can do here, since there's no LaTeX parser for Julia. If someone were to implement one, this would be an entirely different story. Without that, there are always going to be cases where you have to tweak the .bib file to get the same result as you would get in LaTeX.

kellertuer commented on July 19, 2024

Sadly this is not “wild syntax” but curly braces are the official way to keep Capitalisation, especially for names and such.

The problem with “cleaning up” is that the Zotero export is an automatic one that is run when I add a reference (since I want to use it in my documentation), so that would mean I have to clean up my whole reference list whenever I want to add something?

Double braces are what Zenodo (even the one called “Better bib”) does.

kellertuer commented on July 19, 2024

But of course you are right: there are limitations, both in “fixing other programs' silly exports” and in your personal efforts (time-wise) in general. I can totally understand that, and I am happy that this package is a very good step forward for having citations in docs :)

goerz commented on July 19, 2024

> Sadly this is not “wild syntax” but curly braces are the official way to keep Capitalisation, especially for names and such.

I'm sorry, I didn't mean to imply that this was a particular example of "wild syntax". And I agree that we should handle braces better.

One area where people go wild is accented characters. BibTeX used to be ASCII-only, and the official way to do accented characters is described at https://www.bibtex.org/SpecialSymbols/. But there are other variations like {\'o} that work, and weirder stuff with multiple levels of braces. None of that is necessary anymore: I haven't come across a LaTeX installation that doesn't handle Unicode BibTeX out of the box in a long time.
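For anyone who wants to convert such escapes to Unicode before handing a .bib file to the parser, the following is a rough Python sketch. It is an assumption-laden illustration, not part of DocumenterCitations or Bibliography.jl: the function name `latex_accents_to_unicode` is made up, and only five accent commands applied to plain ASCII letters are covered.

```python
import re
import unicodedata

# Combining marks for a few common LaTeX accent commands (not exhaustive).
COMBINING = {
    '"': "\u0308",  # diaeresis / umlaut
    "'": "\u0301",  # acute
    "`": "\u0300",  # grave
    "^": "\u0302",  # circumflex
    "~": "\u0303",  # tilde
}

ACC = r"""\\(["'`^~])"""  # an accent command such as \" or \'
LET = r"([A-Za-z])"       # the letter the accent applies to

# Try the most heavily braced spelling first so that braces stay balanced.
ACCENT = re.compile(
    r"\{" + ACC + r"\{" + LET + r"\}\}"   # {\"{a}}
    + "|" + r"\{" + ACC + LET + r"\}"     # {\"a}
    + "|" + ACC + r"\{" + LET + r"\}"     # \"{a}
    + "|" + ACC + LET                     # \"a
)


def _replace(match: re.Match) -> str:
    # Exactly one alternative matched, so exactly two groups are set.
    accent, letter = [g for g in match.groups() if g is not None]
    return unicodedata.normalize("NFC", letter + COMBINING[accent])


def latex_accents_to_unicode(text: str) -> str:
    """Replace simple LaTeX accent escapes with precomposed Unicode letters."""
    return ACCENT.sub(_replace, text)


print(latex_accents_to_unicode(r'Sch{\"a}fer and P{\'o}lya'))
```

Anything outside this small table (for example \H, \c, or the dotless \i) is left unchanged, so the output may still need a manual pass.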

Fundamentally, though (as far as I understand), BibTeX handles one level of braces to protect capitalization, but after that the fields are simply passed to LaTeX. I'm pretty sure that's what happens with the double braces: the first level of braces is stripped out by BibTeX and the second level is then just ignored by LaTeX itself. Beyond that, I'm sure there's someone who has actual macro calls in their BibTeX fields (like \textit, siunitx macros), and this is not something we'll be able to handle. What happens in DocumenterCitations is that we can process the raw field text a bit (stripping out braces, for example), but after that, the string is just plain text, or markdown to be more precise. I should probably clarify this a bit more in the documentation. Capitalization in DocumenterCitations will always be preserved, so there's generally no need to use braces (although it's still a good feature to have, just for compatibility). You're also strongly encouraged to use Unicode as much as possible instead of LaTeX commands or escape sequences.

As I said, processing the raw (LaTeX) text should be improved. I'll just have to do a proper letter-by-letter parser. Nested braces is something that you famously can't do with a simple regex (which is what we have now). The parser will also have to be able to recognize inline math, since that's something we want to perfectly preserve (including braces). Everywhere else, we should just strip out braces. I wouldn't mind if that processing was done in Bibliography.jl, but I'll have to get in touch with the maintainer to see what they think.
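A character-by-character pass of the kind described here could look roughly like the Python sketch below. It is only an illustration under simplifying assumptions (inline math is delimited by single `$` signs, and a backslash escapes exactly one following character); the function name `strip_braces` is hypothetical, and the real implementation would be in Julia.

```python
def strip_braces(text: str) -> str:
    """Drop grouping braces outside inline math; keep $...$ spans verbatim."""
    out = []
    in_math = False
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            # Keep the backslash and the next character (\{, \}, \$, \", ...).
            out.append(text[i:i + 2])
            i += 2
            continue
        if ch == "$":
            # Toggle math mode; the delimiter itself is preserved.
            in_math = not in_math
            out.append(ch)
        elif ch in "{}" and not in_math:
            pass  # grouping brace outside math: drop it at any nesting depth
        else:
            out.append(ch)
        i += 1
    return "".join(out)


print(strip_braces("{{Introduction}} to {Smooth {Manifolds}}"))
print(strip_braces("Decay of $T_{1}$ in {NV} centers"))
```

Because the loop tracks state instead of matching patterns, nesting depth does not matter. Display math (`$$...$$` or `\(...\)`) and macro arguments are not handled by this sketch.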

> The problem with “cleaning up” is that the Zotero export is an automatic one that is run when I add a reference (since I want to use it in my documentation), so that would mean I have to clean up my whole reference list whenever I want to add something?

Oh, definitely not. What I had in mind is that you keep a .bib file for the Manopt documentation, separate from the .bib file that Zotero puts out. If you want to add a reference, add it in Zotero, open the .bib file that Zotero generates in a text editor, and copy the entry to the .bib file for the Manopt documentation. Then you can "clean" it there, once.
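The copy step could also be scripted. The following Python sketch pulls a single entry out of an exported .bib text by its citation key, using simple brace counting. The function name `extract_entry`, the sample entries, and the keys are all made up for illustration, and escaped braces inside fields are not handled.

```python
def extract_entry(bib_text: str, key: str) -> str:
    """Return the raw text of the entry with the given citation key."""
    marker = "{" + key + ","
    key_pos = bib_text.find(marker)
    if key_pos == -1:
        raise KeyError(key)
    # The entry starts at the closest "@" before the key, e.g. "@book".
    start = bib_text.rfind("@", 0, key_pos)
    if start == -1:
        raise ValueError("no entry type found before key " + key)
    depth = 0
    for i in range(key_pos, len(bib_text)):
        if bib_text[i] == "{":
            depth += 1
        elif bib_text[i] == "}":
            depth -= 1
            if depth == 0:
                # Matching close of the entry's opening brace.
                return bib_text[start:i + 1]
    raise ValueError("unbalanced braces in entry " + key)


# Made-up export with two entries, for demonstration only.
EXPORT = """@book{example2024a,
  title = {{Introduction} to {Something}},
  publisher = {Some Press}
}

@article{example2024b,
  title = {Another {Entry}}
}
"""

print(extract_entry(EXPORT, "example2024b"))
```

The extracted text can then be appended to the project's own .bib file and cleaned there once.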

> Double braces are what Zenodo (even the one called “Better bib”) does.

I've seen even pretty widely used tools write out pretty questionable .bib code. And don't even get me started on the garbage that publishers let you download off their website! That's something I personally handle with a getbibtex script that I'm also advertising in the documentation.

My general recommendation (and what I do) is to copy the bib code exported by your reference manager, and put that in a separate .bib file for any specific paper (or software project, in this case), and to clean it up manually in a text editor.

kellertuer commented on July 19, 2024

Concerning wild characters (being German, like you I think, ü for example; living in Norway, also something like ø): these are resolved by BibLaTeX (which is my favourite bib format anyway), which we do not have a parser for.
With the current parser, something like {\"a} is kept as is and is also shown like this in the HTML output.
Sure, I use UTF-8 by now as well.

All your arguments are valid; if I wrote my BibTeX entries by hand, I would follow all those rules. The problem is really that I am planning to use some software to keep one archive of literature (and export to paper folders, documentation, ...), and then Zotero (sorry, not Zenodo) does this strange stuff.

I have some scripts here as well from a previous group I was in (e.g. some doi2bib things).

The only point where I do not agree is “clean it up manually”, because that means I either have to remember “ah, I had this entry cleaned for that paper and can copy it from there” (that's what I do for now for the docs) or clean entries multiple times. I would prefer to do neither.

goerz commented on July 19, 2024

I sympathize. Just be aware that there are some other plain bugs in the underlying BibParser: https://github.com/Humans-of-Julia/BibParser.jl/issues

Even in the example .bib file I had to work around that:

% https://github.com/Humans-of-Julia/BibParser.jl/issues/32
@string{XXXclearparser = ""}

Hopefully, this situation will improve over time.

kellertuer commented on July 19, 2024

You are right, this is probably more an issue for that package (if you look closely, there is an issue about BibLaTeX there, actually opened by me).
I think we could close this one, and maybe I will open the curly-brace issue there.
