Comments (5)
Hi,
I don't understand your point. You want an option to surround/protect some parts by braces? My understanding is that the BibTeX entry itself should be protected...
from pylatexenc.
Sorry if this wasn't completely clear. I am trying to create BibTeX files programmatically. Depending on the bibliography style, BibTeX may render titles in "titlecaps" or "sentence case" irrespective of the capitalisation used in the BibTeX file. To keep this from interfering with mandatory capitalisation, title parts that must stay capitalised (e.g. acronyms) should be protected by braces in BibTeX files.
Let's consider this example title
AET: An exposé of titles
In the bibliography of a compiled LaTeX document, this might be displayed as
AET: An Exposé of Titles
The corresponding part of the bibtex file should be
title={{AET}: An expos{\'e} of titles}
Note that wrapping the whole title in double braces would prevent BibTeX from applying the title capitalisation of the bibliography style, and is thus not wanted:
title={{AET: An expos{\'e} of titles}}
would lead to "AET: An exposé of titles" in the PDF rather than "AET: An Exposé of Titles".
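To illustrate why the braces matter, here is a rough sketch of the lowercasing step that a sentence-case bibliography style applies to a title. It is a simplified model, not BibTeX's actual implementation: the function name `sentence_case` is made up for this example, and the model ignores BibTeX's special handling of accent groups such as `{\'e}`. It assumes only that text at brace depth greater than zero, the very first character, and the first character after a colon followed by whitespace are left untouched.

```python
def sentence_case(title):
    """Simplified model of a sentence-case style (hypothetical helper).

    Lowercases every character at brace depth 0, except the very first
    character and the first character after a colon plus whitespace.
    """
    out = []
    depth = 0
    keep_next = True     # the very first character is left alone
    prev_colon = False   # True while we are just after a ':' (and spaces)
    for ch in title:
        if ch == '{':
            depth += 1
        if depth > 0:
            # Braces and anything inside them are copied unchanged
            out.append(ch)
            if ch == '}':
                depth -= 1
            keep_next = False
            prev_colon = False
        elif ch.isspace():
            out.append(ch)
            if prev_colon:
                keep_next = True  # next letter starts a subtitle
        else:
            out.append(ch if keep_next else ch.lower())
            keep_next = False
            prev_colon = (ch == ':')
    return ''.join(out)


print(sentence_case("AET: An Exposé of Titles"))
print(sentence_case("{AET}: An Expos{\\'e} of Titles"))
print(sentence_case("{AET: An Expos{\\'e} of Titles}"))
```

Under this model the unprotected acronym is lowercased to "Aet", the single-braced `{AET}` survives while the rest of the title is still recased, and the fully braced title is not touched at all.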
Below is a small Python script to help explore this question:
#!/usr/bin/env python3

## Import statements
import sys
import re
from pylatexenc.latexencode import utf8tolatex

## Function to surround acronyms with braces
def capitalize_title(title):
    capitalization_regex = re.compile('[A-Z]{2,}')
    # Split on non-word characters, keeping the separators in the list
    words = re.split(r'(\W)', title)
    for idx, word in enumerate(words):
        m = capitalization_regex.search(word)
        if m:
            new_word = '{' + word[m.start():m.end()] + '}'
            words[idx] = words[idx].replace(word[m.start():m.end()], new_word)
    return ''.join(words)

def utf8tobibtex_title(title):
    # Encode the function argument (not the global loop variable)
    return capitalize_title(utf8tolatex(title))

orig_titles = [ "AET: An Exposé of Titles", "AET: An exposé of titles" ]
# Extra titles can be passed on the command line
for cmd_line_arg in sys.argv[1:]:
    orig_titles.append(cmd_line_arg)

for orig_title in orig_titles:
    print("===")
    print("orig_title\n" + orig_title + "\n")
    print("utf8tolatex(orig_title)\n" + utf8tolatex(orig_title) + "\n")
    print("utf8tobibtex_title(orig_title)\n" + utf8tobibtex_title(orig_title) + "\n")
    print("Title in bibtex context")
    print("title={" + utf8tobibtex_title(orig_title) + "},\n")
Thanks for your feedback. My impression is that the functionality that you're suggesting is a bit orthogonal to the purpose of pylatexenc.latexencode, which is meant to provide a lightweight and straightforward conversion of non-ASCII chars into corresponding LaTeX encoding sequences. It sounds like your suggestion would only target a rather specific use case, namely the protection of acronyms in the generation of BibTeX entries.
However, I've been meaning to improve utf8tolatex() so that it can be extended to perform some smarter encodings, like transforming "..." (three dots) into "\ldots". The idea would be to have a way to specify custom rules, in the same spirit as the MacroDef's in pylatexenc.latex2text. I think this would be a good way to resolve your problem: you could specify a rule where a word with two or more capital letters gets output with surrounding protective braces.
Hopefully I'll be able to get to this soon.
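The custom-rule idea described above can be sketched with the standard library alone. This is not pylatexenc's API: the names `RULES` and `encode_with_rules` are hypothetical, and the sketch assumes a simple "first matching rule wins at each position" strategy. It only shows how an ordered list of regex rules could turn three dots into `\ldots` and wrap words starting with two or more capital letters in braces.

```python
import re

# Hypothetical rule table: (compiled pattern, replacement template) pairs,
# tried in order at each position of the input string.
RULES = [
    (re.compile(r'\.\.\.'), r'\\ldots'),               # "..." -> \ldots
    (re.compile(r'\b([A-Z]{2,}\w*)\b'), r'{\1}'),      # AET   -> {AET}
]

def encode_with_rules(text, rules=RULES):
    """Apply the first matching rule at each position; copy other chars."""
    out = []
    pos = 0
    while pos < len(text):
        for pattern, template in rules:
            m = pattern.match(text, pos)
            # Skip empty matches so the scan always makes progress
            if m and m.end() > pos:
                out.append(m.expand(template))
                pos = m.end()
                break
        else:
            # No rule matched here: copy the character unchanged
            out.append(text[pos])
            pos += 1
    return ''.join(out)


print(encode_with_rules("AET: wait... for titles"))
```

Because the rules are plain data, a caller could prepend or append rules of their own without touching the scanning loop, which is the flexibility the comment above is asking for.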
OK thanks. As this is not on the roadmap, I will close this ticket.
As an off-topic side note, I also tried pylatex.utils.escape_latex and saw that it encoded line breaks as \newline, which was not what I wanted this time but is probably useful in some contexts.
Hi again. I'm working on a pylatexenc 2.0 release that would allow you to do what you were suggesting. Could you test the new version and see if it meets your needs? I'm happy to hear your feedback.
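import re
from pylatexenc import latexencode
from pylatexenc.latexencode import UnicodeToLatexEncoder

u = UnicodeToLatexEncoder(
    conversion_rules=[
        latexencode.UnicodeToLatexConversionRule(
            latexencode.RULE_REGEX,
            [
                (re.compile(r'([{}])'), r'\1'),  # keep existing braces
                (re.compile(r'\b([A-Z]{2,}\w*)\b'), r'{\1}'),  # wrap acronyms in braces
            ],
        ),
    ] + latexencode.get_builtin_conversion_rules('defaults')
)
input_string = "AET: An exposé of titles"  # example title from earlier in this thread
result = u.unicode_to_latex(input_string)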
See updated doc: https://pylatexenc.readthedocs.io/en/latest/latexencode/
To install the development version, clone the git repo, then in the cloned directory run the commands:
python setup.py sdist
pip install dist/pylatexenc-2.0b0.tar.gz