Comments (3)
Your approach is correct! Your code also captures the macro `\mu`, which is not explicitly declared to the `latexwalker`. The default implementation relies on the default behavior for such macros, which is to keep them as a macro node with no arguments. (The macro is separately declared for `latex2text` as representing the unicode "μ" symbol.)
I realize it's a bit of a weakness of the API for now that the `parse_args()` method is not given information about the macro/environment that is currently being parsed. This is usually not a problem in typical settings where you assign a `MacroSpec` or `EnvironmentSpec` to specific macros, since in such cases the parser is tailored to that specific macro/environment. A possible approach to display the unknown macro name is to hook directly into the `LatexContextDb` object. I also realize that these objects don't expose a simple way of doing this, but the following code achieves the desired behavior:
```python
from pylatexenc import latexwalker, macrospec, latex2text


class UnknownMacroArgsParser(macrospec.MacroStandardArgsParser):
    """Args parser that reports the name of the unknown macro it was created for."""

    def __init__(self, macroname):
        super().__init__()
        self.macroname = macroname

    def parse_args(self, w, pos, parsing_state=None):
        print("Unknown macro `\\{}' at {}".format(self.macroname, pos))
        return super().parse_args(w, pos, parsing_state=parsing_state)


class CustomLatexContextDb(macrospec.LatexContextDb):
    """Copy of an existing context db that reports macros it has no spec for."""

    def __init__(self, db):
        super().__init__()
        # Copy all specs from the given db, category by category.
        for cat in db.categories():
            self.add_context_category(
                cat,
                macros=db.iter_macro_specs([cat]),
                environments=db.iter_environment_specs([cat]),
                specials=db.iter_specials_specs([cat]),
            )

    def get_macro_spec(self, macroname):
        mspec = super().get_macro_spec(macroname)
        if mspec is not None:
            return mspec
        # Unknown macro: return a spec whose args parser knows the macro name.
        return macrospec.MacroSpec(macroname, args_parser=UnknownMacroArgsParser(macroname))


walker_context = CustomLatexContextDb(latexwalker.get_default_latex_context_db())

# second example
output = latex2text.LatexNodes2Text().latex_to_text(
    r"""start
$\mu $
\foo
\foobar
""", latex_context=walker_context)
print(output)
# prints:
#
# Unknown macro `\mu' at 11
# Unknown macro `\foo' at 18
# Unknown macro `\foobar' at 26
# start
# μ
```
It's not a particularly elegant solution, and I'll look into how to make this easier in future versions of pylatexenc.
Regarding macros that are considered unknown to `latexwalker` but are known to `latex2text`, you could consider emitting a warning only after performing a search in the `latex2text` context db object (call `l2tcontext.get_macro_spec(macroname)` and check if it is `None`, where `l2tcontext` is the context-db object used by `latex2text`). I hope this helps.
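As a rough sketch of that check (not a definitive implementation): the helper below calls `get_macro_spec()` on the `latex2text` context db and treats a `None` result as "unknown". The helper name `is_unknown_to_latex2text` and the stand-in class `_StubContextDb` are made up for illustration; the stub is only there so the snippet also runs where pylatexenc is not installed, and it assumes that `\mu` is known and `\foobar` is not.

```python
def is_unknown_to_latex2text(l2tcontext, macroname):
    """Return True if the latex2text context db has no spec for `macroname`."""
    return l2tcontext.get_macro_spec(macroname) is None


try:
    from pylatexenc import latex2text
    # Context db that latex2text uses by default.
    l2tcontext = latex2text.get_default_latex_context_db()
except ImportError:
    # Hypothetical stand-in, not part of pylatexenc: mimics only the
    # get_macro_spec() lookup so the sketch runs without the library.
    class _StubContextDb:
        _known = {"mu": object()}

        def get_macro_spec(self, macroname):
            return self._known.get(macroname)

    l2tcontext = _StubContextDb()


for name in ("mu", "foobar"):
    if is_unknown_to_latex2text(l2tcontext, name):
        print("Unknown macro `\\{}'".format(name))
```

In the code above, this check could be placed in `UnknownMacroArgsParser.parse_args()` so that the message is printed only for macros that `latex2text` does not know about either.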
I'm going to change the issue title to reflect that the desired improvement to pylatexenc is that unknown macro/environment/specials handlers be given more information about what macro/environment/specials was encountered.
from pylatexenc.
Actually, I realize that issue #32 already asked a very similar question. If you care about converting to text, not necessarily about obtaining the argument structure, you can plug into `latex2text`'s context db to issue warnings for unknown macros. See my comment in issue #32.
Thank you for the clarifications.
Yes, #32 is better for my use case (sorry I didn't spot it by myself).