wichert / lingua

Translation toolkit for Python

License: Other

Python 100.00%

lingua's Issues

multi-line python statements in tal:define

The following example results in an error when parsing the template using pot-create:

<div tal:define="structural field.widget.hidden or
                            field.widget.category == 'structural'">
</div>

The error message is:

Aborting due to Python syntax error in %s[%d]: %s ./test.pt 1 field.widget.hidden or
                            field.widget.category == 'structural'

The example does work when parsed by chameleon.

Possible workarounds are adding a \ after the or operator, or removing the line break.
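One way an extractor could accept such expressions is to wrap them in parentheses before handing them to Python's parser, since newlines inside parentheses are not significant. The following is a minimal sketch under that assumption; parse_expression is a hypothetical helper and not lingua's actual code.

```python
import ast


def parse_expression(source):
    """Parse a template expression that may span several lines.

    Wrapping the source in parentheses makes embedded newlines
    insignificant to the Python parser (hypothetical helper).
    """
    return ast.parse("(" + source.strip() + ")", mode="eval")


expr = "field.widget.hidden or\n    field.widget.category == 'structural'"

# Parsing the raw expression fails because the first line ends in 'or'.
try:
    ast.parse(expr, mode="eval")
    plain_ok = True
except SyntaxError:
    plain_ok = False

# The parenthesised form parses as a single boolean expression.
tree = parse_expression(expr)
```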

Some messages skipped in the extraction process for i18n

After running the command:

python setup.py extract_messages

It appears that some messages are not extracted from my pt template:

<html xmlns="http://www.w3.org/1999/xhtml"
xmlns:tal="http://xml.zope.org/namespaces/tal"
xmlns:i18n="http://xml.zope.org/namespaces/i18n"
i18n:domain="nursery">
    <body>
        <form method="post" tal:attributes="action url">
            <p>
                <label i18n:translate="login-label">Login:</label>
                <input type="text" name="login" tal:attributes="value login">
            </p>
            <p>
                <label i18n:translate="password-label">Password:</label>
                <input type="password" name="password" tal:attributes="value password">
            </p>
        </form>
    </body>
</html>

Indeed, I only have the message login-label extracted in the .pot file:

#. Default: Login:
#: nursery/templates/login2.pt:8
msgid "login-label"
msgstr ""

I also noted that if I remove the html p elements and/or if I add a slash / character just before the closing > character of the html input elements, then the extraction process works perfectly.

Crash in Python extractor when using a custom translation function

After upgrading from Lingua 1.6 to 3.9, and trying pot-create with an existing project, I came across a bug in the Python extractor. Some of the files of the projects had been processed by Modernize (https://pypi.python.org/pypi/modernize), and Unicode strings had been rewritten as six.u("...").

If a custom translation function is used on such "strings" (e.g. _other(...) instead of _(...)), the extractor will choke with the following error message:

Traceback (most recent call last):
  File "/.../bin/pot-create", line 11, in <module>
    sys.exit(main())
  File "/.../lib/python2.7/site-packages/lingua/extract.py", line 262, in main
    for message in extractor(real_filename, options):
  File ".../lib/python2.7/site-packages/lingua/extractors/python.py", line 90, in _extract_python
    msg = parse_keyword(node, KEYWORDS[node.func.id])
  File ".../lib/python2.7/site-packages/lingua/extractors/python.py", line 31, in parse_keyword
    msgid = node.args[keyword.msgid_param - 1].s
AttributeError: 'Call' object has no attribute 's'

The error can be reproduced by running pot-create --keyword=_other x.py on the following Python code. It will correctly handle lines 1-3, but choke on line 4:

_(u"This is extracted")
_(foo("This is ignored"))
_other(u"This is extracted, too")
_other(foo("Bug!"))
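The traceback suggests that the first argument is assumed to be a string literal. A defensive extractor could check the node type before reading it. The sketch below uses the standard ast module (Python 3.8 or later for ast.Constant); extract_msgids is a hypothetical name and this is not lingua's implementation.

```python
import ast


def extract_msgids(source, keywords=("_",)):
    """Return (line, msgid) pairs for calls whose first argument is a string literal."""
    msgids = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        if not isinstance(node.func, ast.Name) or node.func.id not in keywords:
            continue
        if not node.args:
            continue
        first = node.args[0]
        # Only plain string literals are extractable; anything else
        # (such as a nested call) is skipped instead of raising.
        if isinstance(first, ast.Constant) and isinstance(first.value, str):
            msgids.append((node.lineno, first.value))
    return sorted(msgids)


SAMPLE = (
    '_(u"This is extracted")\n'
    '_(foo("This is ignored"))\n'
    '_other(u"This is extracted, too")\n'
    '_other(foo("Bug!"))\n'
)
found = extract_msgids(SAMPLE, keywords=("_", "_other"))
```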

error when parsing mako template

I've tried parsing my mako files with the mako extractor and I get:

Aborting due to parse error in ./mainserver/templates/menu.mako[24]: else: pass

That seems to be coming from the lingua/extractors/python.py as that's the only one giving a line number. I have no idea where the "else: pass" is coming from as here's that section of that template (line 24 is ${name}\):

% if text is not None:
${text}\
% else:
${name}\
% endif

Is this extractor going over the compiled version of the mako template or is somehow using the wrong extractor? My config is:

[extensions]
.mako = mako
.pt = chameleon
.py = python

[extractor:babel-mako]
comment-tags = TRANSLATORS:

I tried with the babel-mako and I get:

$ env/bin/pot-create -c lingua.cfg -d mainserver -o mainserver/locale/mainserver.pot mainserver
Traceback (most recent call last):
  File "/sites/metrics_dev/env/lib/python3.4/site-packages/mako/lexer.py", line 201, in decode_raw_stream
    text = text.decode(parsed_encoding)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 3891: ordinal not in range(128)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "env/bin/pot-create", line 11, in <module>
    sys.exit(main())
  File "/sites/metrics_dev/env/lib/python3.4/site-packages/lingua/extract.py", line 262, in main
    for message in extractor(real_filename, options):
  File "/sites/metrics_dev/env/lib/python3.4/site-packages/lingua/extractors/babel.py", line 35, in __call__
    comment_tags, self.config):
  File "/sites/metrics_dev/env/lib/python3.4/site-packages/mako/ext/babelplugin.py", line 48, in extract
    for message in extractor(fileobj):
  File "/sites/metrics_dev/env/lib/python3.4/site-packages/mako/ext/extract.py", line 11, in process_file
    input_encoding=self.config['encoding']).parse()
  File "/sites/metrics_dev/env/lib/python3.4/site-packages/mako/lexer.py", line 215, in parse
    self.filename,)
  File "/sites/metrics_dev/env/lib/python3.4/site-packages/mako/lexer.py", line 207, in decode_raw_stream
    0, 0, filename)
mako.exceptions.CompileException: Unicode decode operation of encoding 'ascii' failed at line: 0 char: 0

helper script i18n.sh could notify about fuzzy translations

I have been using docs/examples/i18n.sh for a couple of days. It works like a charm. Everything is working fine and the script output looks like this:

Extract messages
Update translations
....... done.
....... done. 
Compile message catalogs

But sometimes a translation is not showing up. After a bit of investigation I almost always find that small word fuzzy in the translation files (*.po).

After running the script I would like to be informed about fuzzy translations. It was pretty easy to change that. Just add --statistics to the call to msgfmt:

msgfmt --statistics -o "${po%.*}.mo" "$po"

Now a very useful summary with quantitative information is appended:

Extract messages
Update translations
....... done.
....... done.
Compile message catalogs
39 translated messages.
34 translated message, 1 fuzzy translation, 4 untranslated messages.

With that little bit of extra information I know exactly when there are fuzzy translations and can fix them immediately if I want.

Temp file can not be renamed on Windows

I updated lingua from version 4.6 to 4.9 (Python 2.7 on Windows 7) and since then it is not working correctly anymore.

I always get this error:

WindowsError: [Error 32] The process cannot access the file because it is being used by another process.

Apparently, the tmpfile is not correctly closed and the system still thinks the file is being used, thus os.rename (https://github.com/wichert/lingua/blob/master/src/lingua/extract.py#L235) throws this error.

Moreover, just before os.rename I tried this:

f = open(tmpfile)
f.close()

And I get exactly the same error, but for the f = open(tmpfile) command...
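On Windows a file generally cannot be renamed while a handle to it is still open, so one common pattern is to close the temporary file before moving it into place. The sketch below illustrates that pattern with the standard library (os.replace needs Python 3.3 or later); save_atomically is a hypothetical helper, not the code in extract.py.

```python
import os
import tempfile


def save_atomically(path, text):
    """Write text to path via a temporary file in the same directory."""
    directory = os.path.dirname(os.path.abspath(path))
    # delete=False keeps the file after the with-block; leaving the block
    # closes the handle, which Windows requires before a rename.
    with tempfile.NamedTemporaryFile(
        "w", dir=directory, delete=False, encoding="utf-8"
    ) as tmp:
        tmp.write(text)
        tmp_name = tmp.name
    os.replace(tmp_name, path)  # also overwrites an existing target
```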

Domain not respected when extracting strings marked with translationstring package.

It looks like strings marked with translationstring package:

from translationstring import TranslationStringFactory
_ = TranslationStringFactory('domainX')
a = _(u'b')

are still extracted when a different domain is set (pot-create -d domainY (...)).

Is it even possible to take the domain into account in such a scenario?

This is analogous to issue #66.

Updating existing .pot files

Can lingua detect that an existing .pot file does not need to be changed, and skip writing it?
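One possible approach, sketched here under the assumption that only the POT-Creation-Date header changes between otherwise identical runs, is to compare the old and new content with that line ignored. write_if_changed is a hypothetical helper, not an existing lingua function.

```python
import os

# Header lines that change on every run and should not count as a change.
VOLATILE_PREFIXES = ('"POT-Creation-Date:',)


def _stable_lines(text):
    return [line for line in text.splitlines()
            if not line.startswith(VOLATILE_PREFIXES)]


def write_if_changed(path, new_text):
    """Write new_text unless the file already matches it apart from the date line."""
    if os.path.exists(path):
        with open(path, encoding="utf-8") as handle:
            if _stable_lines(handle.read()) == _stable_lines(new_text):
                return False
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(new_text)
    return True
```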

Incorrect use of python-format and c-format

As far as I can tell Lingua is marking extracted strings with the wrong formatting specifier flags.

Given the following input file:

print(_('Hello World!'))

print(_('Hello {name}!').format(name))

print(_('Hello %s!') % name)

And running the command:
pot-create hello_world.py -o hello.pot

I get the following .pot file:

# ... snip ...
#, fuzzy
msgid ""
msgstr ""
# ... snip ...
"Generated-By: Lingua 4.8.1\n"

#: ./hello_world.py:1
msgid "Hello World!"
msgstr ""

#: ./hello_world.py:3
#, python-format
msgid "Hello {name}!"
msgstr ""

#: ./hello_world.py:5
#, c-format
msgid "Hello %s!"
msgstr ""

Looking at the gettext source code, I am under the impression that the message from source line 3, which was written with the python-format flag, should in fact use the python-brace-format flag, and that the message from source line 5, which was written with the c-format flag, should in fact use the python-format flag.

In particular looking at the gettext source files defining the different formats:

format-python.c
The comment block towards the top of that file describes % (old) style string formatting. Specifically:

Any string or Unicode string can act as format string via the '%' operator, implemented in stringobject.c and unicodeobject.c.

I believe the String Formatting Operations section referred to in that comment can now be found at https://docs.python.org/2/library/stdtypes.html#string-formatting-operations.

format-python-brace.c
The comment block towards the top of that file describes {} (new) style formatting. Specifically:

Python brace format strings are defined by PEP3101 together with 'format' method of string class.

format-c.c
This is for C format strings, which are similar to, but not exactly like, old style Python format strings. One example difference is the conversion type r (see String Formatting Operations), which formats a value using repr() and is not available in C's printf. It is therefore probably not a good idea to use this flag for old style Python format strings.

Using the incorrect format specifier flags means that gettext's msgfmt command's --check option provides incorrect output. Given the following 'translation' of the above .pot file:

#, fuzzy
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\n"
"Generated-By: Lingua 4.8.1\n"

#: ./hello_world.py:1
msgid "Hello World!"
msgstr ""

#: ./hello_world.py:3
#, python-format
msgid "Hello {name}!"
msgstr "100% {name}!"

#: ./hello_world.py:5
#, c-format
msgid "Hello %s!"
msgstr "Hello %r!"

Running msgfmt produces the following output:

msgfmt hello.po --check-format
hello.po:14: 'msgstr' is not a valid Python format string, unlike 'msgid'. Reason: In the directive number 1, the character '{' is not a valid conversion specifier.
hello.po:19: 'msgstr' is not a valid C format string, unlike 'msgid'. Reason: In the directive number 1, the character 'r' is not a valid conversion specifier.
/usr/local/Cellar/gettext/0.19.7/bin/msgfmt: found 2 fatal errors

Changing the file to use the correct format specifier flags:

#, fuzzy
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\n"
"Generated-By: Lingua 4.8.1\n"

#: ./hello_world.py:1
msgid "Hello World!"
msgstr ""

#: ./hello_world.py:3
#, python-brace-format
msgid "Hello {name}!"
msgstr "100% {name}!"

#: ./hello_world.py:5
#, python-format
msgid "Hello %s!"
msgstr "Hello %r!"

will, given the same msgfmt command as above, produce no errors.
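To make the proposed mapping concrete, here is a rough heuristic for choosing a flag from a msgid. It is only a sketch: the regular expression approximates old style % directives, guess_format_flag is a hypothetical name, and neither is taken from lingua or gettext.

```python
import re
import string

# Approximation of old-style % directives, e.g. %s, %(name)s, %5.2f, %r.
PERCENT_RE = re.compile(
    r"%(?:\([^)]*\))?[#0\- +]*(?:\*|\d+)?(?:\.(?:\*|\d+))?[hlL]?[diouxXeEfFgGcrsa]"
)


def guess_format_flag(msgid):
    """Return 'python-format', 'python-brace-format' or None for a msgid."""
    if PERCENT_RE.search(msgid):
        return "python-format"
    try:
        fields = [name for _, name, _, _ in string.Formatter().parse(msgid)
                  if name is not None]
    except ValueError:  # unbalanced braces: not a valid brace format string
        fields = []
    return "python-brace-format" if fields else None
```

With this heuristic the three messages from hello_world.py get no flag, python-brace-format and python-format respectively.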

I would be happy to supply a patch for this if there is agreement that the format string flags should be corrected.

Cheers,
Christian

xls-to-po doesn't merge translations with trailing blank spaces (lingua 1.3)

If a translation includes a trailing blank space, when you merge back changes, they are ignored, e.g.:

msgid "Welcome to the "

msgid " Program"

They are correctly extracted by po-to-xls but not vice versa.

Support Jinja2 Template Engine

It would be great if lingua could support Jinja2 directly. The Babel project looks inactive, so as a Pyramid lover who uses Jinja2, I would really appreciate it.

Translation extractor does not strip trailing whitespace off message-ids

    def test_translate_stripExtraWhitespaceAfterText(self):
        snippet="""\
                <html xmlns:i18n="http://xml.zope.org/namespaces/i18n" i18n:domain="lingua">
                  <dummy i18n:translate="">
                      Dummy text
                  </dummy>
                </html>
                """
        self.assertEqual(self.extract(snippet),
                [(2, None, u"Dummy text", [])])
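The behaviour this test expects could be obtained by normalising whitespace before a message is recorded. A minimal sketch follows; whether internal runs of whitespace should also be collapsed is an assumption here, and normalise_msgid is a hypothetical helper.

```python
def normalise_msgid(text):
    """Collapse runs of whitespace and strip both ends of a message."""
    return " ".join(text.split())


cleaned = normalise_msgid("\n                      Dummy text\n                  ")
```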

polint's output under Python 3 ugliness

Under Python 3, polint emits b'strings':

[...it/LC_MESSAGES/some.po] Translation:
b'        Attivit\xc3\xa0'
Used for 2 canonical texts:
b'1       Activity'
b'2       Activities'

Can it instead decode those strings using the PO file's encoding and emit the (visually) right thing?
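A sketch of what such decoding might look like, assuming the charset is read from the Content-Type line of the PO header. po_charset and display_text are hypothetical helpers, not part of polint.

```python
import re


def po_charset(header, default="utf-8"):
    """Pull the charset out of a PO header's Content-Type line."""
    match = re.search(r"charset=([\w.-]+)", header)
    return match.group(1) if match else default


def display_text(raw, header):
    """Decode bytes using the catalog's declared charset before printing."""
    if isinstance(raw, bytes):
        return raw.decode(po_charset(header), errors="replace")
    return raw


HEADER = "Content-Type: text/plain; charset=UTF-8\n"
shown = display_text(b"Attivit\xc3\xa0", HEADER)
```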

Multiple i18ndomains

Referring to collective/i18ndude#22 (comment)

At some point lingua had problems handling multiple i18n domains in one run, but maybe this has been solved in the meantime.

@wichert can you confirm this?

lingua 4.10 fails with Python3 due to non-existing unicode() function

In lingua 4.10 we find the new line output.write(unicode(catalog)) . This is not valid Python3 because the global unicode() function has been removed.

Traceback (most recent call last):
  File ".../bin/pot-create", line 11, in <module>
    sys.exit(main())
  File ".../lib/python3.4/site-packages/lingua/extract.py", line 359, in main
    save_catalog(catalog, options.output)
  File ".../lib/python3.4/site-packages/lingua/extract.py", line 236, in save_catalog
    output.write(unicode(catalog))
NameError: name 'unicode' is not defined
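A common way to keep such code working on both Python 2 and Python 3 is a small compatibility alias. The sketch below shows the idea; Catalog is a stand-in class for illustration only, and this is not necessarily how lingua resolved the issue.

```python
try:
    text_type = unicode  # Python 2
except NameError:
    text_type = str  # Python 3: str is already the text type


class Catalog(object):
    """Stand-in for a real catalog object (hypothetical)."""

    def __str__(self):
        return 'msgid ""\nmsgstr ""\n'


# Equivalent of output.write(unicode(catalog)) that runs on both versions.
rendered = text_type(Catalog())
```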

Gracefully handle undeclared namespaces

If an XML file uses an undefined namespace, the lingua XML extractor silently ignores it. For example:

<div xmlns:i18n="http://xml.zope.org/namespaces/i18n"
     i18n:domain="voipro.portal">
    <p tal:condition="has_title" i18n:translate="">Hello, world!</p>
</div>

Note how the tal namespace is not defined.

Issue with expat breaking lines not only on newline

Hello,

I'm trying to extract some strings from expressions in a chameleon template. Having found that some strings were missing, I debugged the xml parser a bit.

I've found that sometimes CharacterDataHandler (in parsers/xml.py) is called with partial lines, so the regular expression does not match and the line is ignored.

I've also found that setting parser.buffer_text to True helps by sending all the content to the CharacterDataHandler at once, but I suspect the problem can happen again once the buffer is full. The Python documentation for that buffer doesn't say whether it is expanded when needed.

I can provide the faulty file on request. I cannot make it public.

The workaround we have found is to wrap every expression in a fake tag:

<tal:s>${expression containing _()}</tal:s>
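An alternative that does not depend on buffer size is to collect every chunk passed to CharacterDataHandler and join the chunks when the next element boundary is reached. The sketch below uses xml.parsers.expat directly; TextCollector is a hypothetical class and not lingua's parser.

```python
from xml.parsers import expat


class TextCollector(object):
    """Join character data per text run, however expat chunks it."""

    def __init__(self):
        self.parser = expat.ParserCreate()
        self.parser.StartElementHandler = self._start
        self.parser.EndElementHandler = self._end
        self.parser.CharacterDataHandler = self._chars
        self._chunks = []
        self.texts = []

    def _start(self, name, attrs):
        self._flush()

    def _end(self, name):
        self._flush()

    def _chars(self, data):
        self._chunks.append(data)  # may be only a fragment of a line

    def _flush(self):
        text = "".join(self._chunks)
        self._chunks = []
        if text.strip():
            self.texts.append(text)

    def feed(self, data, final=False):
        self.parser.Parse(data, final)


collector = TextCollector()
# Feeding the document in two pieces splits the expression mid-way.
collector.feed("<p>before ${_('Hel")
collector.feed("lo')} after</p>", final=True)
```

Any regular expression for _() calls would then run on the joined text in collector.texts rather than on individual callbacks.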

tal:repeat does not support multiple assignment

Chameleon supports multiple assignment when performing iteration:

<li tal:repeat="(ix, item) items"></li>

However, in the xml extractor, value is split by the first whitespace character which means that value will contain item) items:

elif attribute[1] == 'repeat':
    (engine, value) = get_tales_engine(value.split(None, 1)[1])

And pot-create will fail:

Aborting due to Python syntax error in %s[%d]: %s ./portal/templates/sidebar.pt 7 item) items

Removing the space after the comma can function as a workaround.
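A split that is aware of a parenthesised target could look like the sketch below. It assumes tuple targets are not nested, and split_repeat is a hypothetical helper rather than the extractor's real code.

```python
def split_repeat(value):
    """Split a tal:repeat value into (target, expression)."""
    value = value.strip()
    if value.startswith("("):
        # Keep the whole "(a, b)" target together; nesting is not handled.
        end = value.index(")") + 1
        return value[:end], value[end:].strip()
    target, expression = value.split(None, 1)
    return target, expression
```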

Babel extraction plugins are gone?

I used to have something like this in setup.py:

              ('**.py', 'python', None),  # babel extractor supports plurals
              ('**.pt', 'lingua_xml', None),

Since lingua 2.0 "lingua_xml" extractor is not recognized. What's the correct setup now?

lingua_xml does not pick up all items to be translated

I just looked at Pylons/deform to add some translations. I had to change the message extractors to lingua, as chameleon_xml/python simply isn't available in chameleon2, as far as I remember.

After running bin/python setup.py extract_messages I noticed there were not more but fewer things to be translated, because some strings from small template snippets are not picked up. If I remember lingua's code right, it checks for the namespace declaration to be present. Here it isn't.

examples:

diff --git a/deform/locale/deform.pot b/deform/locale/deform.pot
index da3aa98..27210b1 100644
--- a/deform/locale/deform.pot
+++ b/deform/locale/deform.pot
@@ -6,9 +6,9 @@
#, fuzzy
msgid ""
msgstr ""
-"Project-Id-Version: deform 0.9.2\n"
+"Project-Id-Version: deform 0.9.3\n"
"Report-Msgid-Bugs-To: EMAIL@ADDRESS\n"
-"POT-Creation-Date: 2011-08-10 09:33+0200\n"
+"POT-Creation-Date: 2011-11-15 17:20+0100\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
@@ -53,66 +53,7 @@ msgstr ""
msgid "Add ${subitem_title}"
msgstr ""

-#: deform/widget.py:1235
+#: deform/widget.py:1226
msgid "Incomplete date"
msgstr ""

-#: deform/templates/checked_password.pt:10
-msgid "Confirm Password"
-msgstr ""

-#: deform/templates/checked_password.pt:5
-msgid "Password"
-msgstr ""

and an extract from that template (which uses the i18n namespace without declaring it):

Password

How should I go on from here?

Python 3.4 python extraction is totally broken.

The Python extractor is completely broken with Python 3.4 and lingua 4.3.1; it throws on any file with more than 0 bytes.

% ls
total 0
% pot-create .
No files scanned, aborting
% echo > foo.py 
% pot-create . 
Traceback (most recent call last):
  File ".../venv/bin/pot-create", line 11, in <module>
    sys.exit(main())
  File ".../venv/lib/python3.4/site-packages/lingua/extract.py", line 280, in main
    for message in extractor(real_filename, options):
  File ".../venv/lib/python3.4/site-packages/lingua/extractors/python.py", line 349, in __call__
    return parser(token_stream, options, filename, lineno)
  File ".../venv/lib/python3.4/site-packages/lingua/extractors/python.py", line 164, in __call__
    for (token_type, token, location, _) in token_stream:
  File ".../venv/lib/python3.4/site-packages/lingua/extractors/python.py", line 138, in next
    token = self._transform(next(self.queue))
  File ".../venv/lib/python3.4/tokenize.py", line 554, in _tokenize
    if line[pos] in '#\r\n':           # skip comments or blank lines
TypeError: 'in <string>' requires string as left operand, not int
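The TypeError suggests that lines of bytes reached a tokenizer code path that expects text: indexing bytes on Python 3 yields an int. The standard library offers two entry points, one per input type, sketched below. The helper names are hypothetical and this is not lingua's fix.

```python
import io
import tokenize


def tokens_from_text(source):
    """Tokenize already-decoded source text (readline returns str)."""
    return list(tokenize.generate_tokens(io.StringIO(source).readline))


def tokens_from_bytes(raw):
    """Tokenize raw bytes; tokenize() detects the encoding itself."""
    return list(tokenize.tokenize(io.BytesIO(raw).readline))


# A file containing only a newline, like the foo.py in the report.
blank = tokens_from_text("\n")
call = tokens_from_bytes(b"_('x')\n")
```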

i18n:attributes and html 5

Extracting texts from attributes in chameleon templates silently fails if the tag is not closed with a '/'.
I just tried this with the 3.1 release.

title attribute extracted:

<img src="element.png" title="Add a new page element" align="top" i18n:attributes="title"/>

not extracted:

<img src="element.png" title="Add a new page element" align="top" i18n:attributes="title">

Though I'm not sure if this is supported or not?

Feature Request: HTML inside .po Default value

Dear developing team,

I gratefully use lingua within my Pyramid project. While I was able to get everything to work, I stumbled upon the following issue that I was unable to resolve: I wanted to extract HTML tags from my Chameleon template into the Default value of my .po file.

For further details, please see my question asked over at stackoverflow (http://stackoverflow.com/questions/9559756/pot-file-with-tags-instead-of-dynamic-element) or similarly on the Google Pylons group (http://groups.google.com/group/pylons-discuss/browse_thread/thread/eb5ca27b1494cfd2/51c849bf27410215).

I don't think this is possible with Lingua so far. Do you think this feature could be implemented in future development? I'd very much appreciate this.

Thank you very much!
Lukas

Aborting due to parse error, with a dict as a method argument (in chameleon)

Creating a chameleon page template, with only this content:

${some_method(_('abc'), {'a':'b'})}

and running pot-create -o failure.pot minimal_failure.pt, results in an error:

Aborting due to parse error in ./minimal_failure.pt[2]: some_method(_('abc'), {'a':'b'

(Yes, the error stops right there, the closing } of the dict is not there.)

If we change our example to replace the dict with a dict(), like this:

${some_method(_('abc'), dict(a='b'))}

Then the parsing will continue successfully.
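The truncated expression in the error message suggests that the scan for the end of ${...} stops at the first closing brace. A scanner that tracks brace depth and string quotes avoids that. The following is a sketch of the technique only; find_expressions is a hypothetical function, not lingua's parser.

```python
def find_expressions(text):
    """Return the body of every ${...} expression, honouring nested braces."""
    results = []
    i = 0
    while True:
        start = text.find("${", i)
        if start == -1:
            break
        depth = 1
        quote = None
        j = start + 2
        while j < len(text) and depth:
            ch = text[j]
            if quote:
                if ch == "\\":
                    j += 1  # skip the escaped character
                elif ch == quote:
                    quote = None
            elif ch in "'\"":
                quote = ch
            elif ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
            j += 1
        if depth:
            break  # unterminated expression: stop scanning
        results.append(text[start + 2:j - 1])
        i = j
    return results
```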

(lingua==3.10, Chameleon==2.22, pyramid==1.6a2)

pot-create cannot work with babel in python 3.4

$ pot-create -c lingua.cfg i18n
Traceback (most recent call last):
  File "/home/william/venv/pyramid-1.5/bin/pot-create", line 9, in <module>
    load_entry_point('lingua==2.1', 'console_scripts', 'pot-create')()
  File "/home/william/venv/pyramid-1.5/lib/python3.4/site-packages/lingua/extract.py", line 232, in main
    for message in extractor(real_filename, options):
  File "/home/william/venv/pyramid-1.5/lib/python3.4/site-packages/lingua/extractors/babel.py", line 23, in wrapper
    for (lineno, _, msgid, comment) in extractor(fileobj, DEFAULT_KEYWORDS.keys(), (), None):
  File "/home/william/venv/pyramid-1.5/lib/python3.4/site-packages/jinja2/ext.py", line 582, in babel_extract
    for extension in options.get('extensions', '').split(','):
AttributeError: 'NoneType' object has no attribute 'get'

My env:

python 3.4.0
lingua 2.1
Babel 1.3
pyramid 1.5

lingua.cfg:

[extension:.jinja2]
plugin = babel-jinja2

'&' in xml extractor isn't picked up + aborts any further processing

It seems like an &-sign in any xml template causes the message extractor to ignore it. It works with the python extractor though.

Example:

Python extractor picks this up: _(u"Visible & works")
The XML extractor doesn't pick this up:

Nonvisible & not picked up

Also, the extractor seems to abort any further work below any statement containing an &-sign. There's no error message either. Any statement above the &-sign works as expected within the same template.

...while using '&' and not having msgids might not be smart, it should at least generate an error, I think :)

lingua skips translation strings if they contain an ampersand

I just noticed that lingua's xml extractor will skip translation strings if they contain a '&' (ampersand).

lingua will also skip all following translation strings in the same file.
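A bare '&' is not well-formed XML, which would explain why an expat-based extractor stops at that point. The sketch below shows the difference between the bare and the escaped form; is_well_formed is a hypothetical helper for illustration.

```python
from xml.parsers import expat


def is_well_formed(document):
    """Return True if expat can parse the document without error."""
    try:
        expat.ParserCreate().Parse(document, True)
    except expat.ExpatError:
        return False
    return True


bare = is_well_formed("<p>Nonvisible & not picked up</p>")
escaped = is_well_formed("<p>Nonvisible &amp; not picked up</p>")
```

Writing the ampersand as &amp; in the template keeps the file parseable, although a reported error would still be preferable to silent skipping.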

Global variables in python extractor

This is quite a design flaw in the Python extractor. The KEYWORDS variable is global, and that global dictionary is updated with new values from options on every call to the Python extractor, not the other way round. This makes it impossible to use two Python extractors at the same time with different keywords.
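One way to avoid the shared state is for each extractor instance to copy the defaults and merge its own options into the copy. A minimal sketch follows; the class name and the keyword specifications are made up for illustration and do not mirror lingua's data structures.

```python
import copy

# Module-level defaults (illustrative values, not lingua's real specs).
DEFAULT_KEYWORDS = {"_": None, "gettext": None, "ngettext": (1, 2)}


class PythonExtractorSketch(object):
    def __init__(self, extra_keywords=None):
        # Copy, never mutate, the module-level defaults.
        self.keywords = copy.deepcopy(DEFAULT_KEYWORDS)
        self.keywords.update(extra_keywords or {})


first = PythonExtractorSketch({"_ts": None})
second = PythonExtractorSketch()
```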

Support JS templates

I'm using underscore templates in a project. Lingua throws this error:

Aborting due to parse error in /opt/develop/privat/stuff/templates/home.pt: not well-formed (invalid token): line 4, column 24

The file looks like this:

<metal:use use-macro="view.helper.api['main'].macros['master']">
  <metal:fill fill-slot="content">
    <script type="text/html" id='alert'>
      <div class="alert <%= type %>"> 
        <button type="button" class="close" data-dismiss="alert">&times;</button><%= text %>
      </div>
    </script>
    ...
  </metal:fill>
</metal:use>

multiple root nodes

Expat stops extracting after reading the first root element.

    def test_multiple_root_nodes(self):
        snippet = """\
                <metal:foo>
                  <dummy i18n:translate="">Foo</dummy>
                  <dummy i18n:translate="">Foo</dummy>
                </metal:foo>
                <metal:bla>
                  <dummy i18n:translate="">Bla</dummy>
                  <dummy i18n:translate="">Bla</dummy>
                </metal:bla>
                """
        self.assertEqual(self.extract(snippet),
                [
                    (2, None, u"Foo", []),
                    (3, None, u"Foo", []),
                    (6, None, u"Bla", []),
                    (7, None, u"Bla", []),
                 ])
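XML allows only one root element, so expat reports an error after the first one closes. One possible workaround, sketched below, is to wrap the content in a synthetic root on the same lines so that line numbers stay unchanged. element_names and the wrapper tag are hypothetical, and the approach assumes the template has no XML declaration or doctype.

```python
from xml.parsers import expat


def element_names(document, wrap=False):
    """Return the names of all elements expat reports for document."""
    names = []
    parser = expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    if wrap:
        # The wrapper tags share lines with the original content,
        # so reported line numbers stay unchanged.
        document = "<wrapper>" + document + "</wrapper>"
    parser.Parse(document, True)
    return [name for name in names if name != "wrapper"]


snippet = (
    "<metal:foo><dummy>Foo</dummy></metal:foo>\n"
    "<metal:bla><dummy>Bla</dummy></metal:bla>\n"
)
wrapped_names = element_names(snippet, wrap=True)
```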

XML extractor doesn't report critical errors

https://github.com/wichert/lingua/blob/master/src/lingua/extractors/xml.py#L67

The try / except statement here swallows all critical errors, aborts processing of that template and never reports them. Since I, at least, make a lot of errors when I write templates, it would be great to know when I screw up :)

The best would be if errors were printed at the end of processing, or an optional switch to actually die during processing. But even a quick fix like this would be better than nothing:

    try:
        self.parser.ParseFile(fileobj)
    except expat.ExpatError as e:
        # Report the parse error instead of silently swallowing it.
        print(e)

Thanks a lot for working with this software :)
Regards,
Robin

Files-from parameter doesn't find files

If you use the --files-from parameter, pot-create doesn't find the file(s).

 $VENV/bin/pot-create --config "lang.conf" --files-from locale-files.txt --output /dev/null
 Can not find file /home/raspi/projects/test/foo/__init__.py

$ stat /home/raspi/projects/test/foo/__init__.py
  File: '/home/raspi/projects/test/foo/__init__.py'
  Size: 2071      	Blocks: 8          IO Block: 4096   regular file
Device: 801h/2049d	Inode: 4069600     Links: 1
Access: (0664/-rw-rw-r--)  Uid: ( 1000/   raspi)   Gid: ( 1000/   raspi)
Access: 2017-02-21 09:21:38.523260776 +0200
Modify: 2017-02-21 09:21:38.523260776 +0200
Change: 2017-02-21 09:21:38.523260776 +0200
 Birth: -

locale-files.txt:

/home/raspi/projects/test/foo/__init__.py
/home/raspi/projects/test/foo/lib/__init__.py
/home/raspi/projects/test/foo/models/__init__.py
/home/raspi/projects/test/foo/models/user.py
/home/raspi/projects/test/foo/include/layouts.py
/home/raspi/projects/test/foo/include/urli18n.py
/home/raspi/projects/test/foo/include/security.py
/home/raspi/projects/test/foo/include/__init__.py
/home/raspi/projects/test/foo/include/routes.py
/home/raspi/projects/test/foo/database/__init__.py
/home/raspi/projects/test/foo/database/loginlog.py
/home/raspi/projects/test/foo/database/user.py
/home/raspi/projects/test/foo/views/dberrors.py
/home/raspi/projects/test/foo/views/errors.py
/home/raspi/projects/test/foo/views/__init__.py
/home/raspi/projects/test/foo/views/views.py
/home/raspi/projects/test/foo/views/templates/form_login.pt
/home/raspi/projects/test/foo/views/templates/500.pt
/home/raspi/projects/test/foo/views/templates/languages.pt
/home/raspi/projects/test/foo/views/templates/footer.pt
/home/raspi/projects/test/foo/views/templates/home.pt
/home/raspi/projects/test/foo/views/templates/default_layout.pt
/home/raspi/projects/test/foo/views/templates/form_logout.pt
/home/raspi/projects/test/foo/views/templates/main_menu.pt
/home/raspi/projects/test/foo/views/templates/403.pt
/home/raspi/projects/test/foo/views/templates/logout.pt
/home/raspi/projects/test/foo/views/templates/error_database.pt
/home/raspi/projects/test/foo/views/templates/404.pt
/home/raspi/projects/test/foo/views/templates/login.pt

Extract messages from xhtml attributes

Hello,

I have the following in my login.mako template:

    <div class="notice">
        <h1>${_(u"ProprietarySystem")}</h1>
        <p>${_(u"AuthorizedUsersOnly")}</p>
        <p>${_(u"MonitoringWarning")}</p>
        <p>${_(u"MonitoringConsent")}</p>
    </div>
    <div id="loginForm">
        <form action="${url}" method="post">
            <div id="fields">
                <input type="hidden" name="came_from" value="${came_from}"/>
                <input type="text" name="${_(u"Login")}" value="${login}"/><br/>
                <input type="password" name="${_(u"Password")}" value="${password}"/><br/>
                <input type="submit" name="form.submitted" value=${_(u"SignIn")}/>
            </div>
        </form>   
        <a href="">${_(u"ForgotPassword")}</a><br/>
        <a href="">${_(u"ChangePassword")}</a>
    </div>

It seems that extraction is blocked when encountering a message within an attribute. The output contains up to "MonitoringConsent".

If I omit the 'fields' div of the form though, the rest of the strings are extracted normally.

Lingua version is 1.3
Babel version is 0.9.6

issue with spaces around pipe character ('|') in chameleon templates

<div tal:repeat="choice values | field.widget.values"> seems to choke the extractor, but <div tal:repeat="choice values|field.widget.values"> (with the spaces around the pipe removed) seems to be okay. Both forms are accepted by chameleon.

Exclude files and skip lines

In the current version pot-create outputs the following message in our project:

$ pot-create --keyword=_ts app -o app/locale/translations.pot
./app/utils.py[92]: Message argument must be a string
./app/utils.py[118]: Message argument must be a string
No translatable strings found, aborting

In line 92 we defined our translation function which is also used as keyword

def _ts(string, mapping=None):
    pass

I think the python extractor should ignore this in general.

And in line 118 we used this function with variables, which is intended.

It would be great to be able to exclude files from the directory scan and/or to skip lines with a simple comment hint such as "# I18N !skip".
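The requested comment hint could be implemented as a pre-pass that records which lines carry the marker, so the extractor can ignore messages and warnings from those lines. This is only a sketch of the proposal; the marker text comes from this request, and lines_to_skip is a hypothetical helper, not an existing lingua feature.

```python
SKIP_MARKER = "# I18N !skip"  # marker text proposed in this request


def lines_to_skip(source):
    """Return the 1-based numbers of lines that end with the skip marker."""
    return {number
            for number, line in enumerate(source.splitlines(), 1)
            if line.rstrip().endswith(SKIP_MARKER)}


SOURCE = (
    "def _ts(string, mapping=None):  # I18N !skip\n"
    "    pass\n"
    "_ts(variable)  # I18N !skip\n"
    "_ts('keep me')\n"
)
skipped = lines_to_skip(SOURCE)
```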

Directory parameter with quotes doesn't scan files

If you use the --directory parameter with quotes ("), pot-create doesn't scan any files.

$VENV/bin/pot-create --config "lang.conf" --directory "foo/*" --output /dev/null
No files scanned, aborting

$VENV/bin/pot-create --config "lang.conf" --directory foo/* --output /dev/null
PermissionError: [Errno 13] Permission denied: '/dev/tmptzuv_t65'

If you prepend the command with

strace -e trace=open $VENV/bin/pot-create ...

You can see that the first example doesn't open any files from the foo directory, while the latter does.

Could have a Javascript extractor in Lingua itself...

I've (sort of) written about half of it already.

Parse error in chameleon template that contains a "load:" statement in a tal:define

With lingua 2.1/python2.7...

test.pt:

<html tal:define="common load:common.pt"></html>

$ $env/bin/pot-create test.pt

Aborting due to parse error in ./test.pt[2]: load:common.pt

Leaving out the "load:" part works.
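The error suggests that the value after the variable name is handed to the Python parser even though it carries a non-Python TALES prefix. A sketch of detecting the prefix first is shown below; split_tales and should_parse_as_python are hypothetical helpers, and the simple pattern would misclassify a Python expression that begins with a name followed by a colon, such as a lambda.

```python
import re

# A TALES prefix is a short name followed by a colon, e.g. "load:".
TALES_PREFIX = re.compile(r"^\s*([A-Za-z][A-Za-z0-9_-]*):")


def split_tales(expression, default="python"):
    """Return (engine, body) for a TALES expression."""
    match = TALES_PREFIX.match(expression)
    if match is None:
        return default, expression.strip()
    return match.group(1), expression[match.end():].strip()


def should_parse_as_python(expression):
    """Only expressions for the python engine go to the Python parser."""
    return split_tales(expression)[0] == "python"
```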

Problem with custom TALES extension

Hi,
I'm using Lingua (3.8) with Python (3.4) on a Pyramid package using Chameleon templates.
I have created a custom "provider:" TALES expression, which works perfectly with Chameleon but generates an error in Lingua.
My PT code is as follows:

...
<tal:var content="structure provider:pyams.toolbar" />
...

Lingua error is: Aborting due to Python syntax error in ./src/templates/table.pt[11]: (python:pyams.toolbar)

Any idea?

Best regards,
Thierry

Expression in tal:define fails

I have this Chameleon template, which is perfectly accepted by Chameleon:

<form
  tal:define="css_class css_class|string:${field.widget.css_class or field.css_class or ''};"
  >
...
</form>

When I try to run pot-create, I get the following error message:

Aborting due to Python syntax error in %s[%d]: %s ./.../templates/form.pt 1 (css_class|string:${field.widget.css_class or field.css_class or ''})

Is it the ${...} expression which causes the problem?

keyword commandline argument is ignored for Python files

Using the --keyword argument to specify a custom function name does not work for the PythonExtractor.

Tested with version 4.5.1 and the current master.

Parse error in chameleon template with python expression that contains nested curly braces

Using a python2.7 virtualenv with lingua 2.1 installed, I'm trying to do the following on a chameleon template:

$ $ENV/bin/pot-create -c lingua.cfg -d test test.pt

Where lingua.cfg contains:

[extension:.pt]
plugin = xml

[extension:.py]
plugin = python

and test.pt contains:

<html xmlns:i18n="http://xml.zope.org/namespaces/i18n"
  i18n:domain="test">
  <head><title>tetst</title></head>
  <body>
    <a href="${request.route_url('set_locale', _query={'somevar': 'somevalue'})}">Test</a>
  </body>
</html>

This fails with the following error:

Aborting due to parse error in ./test.pt[9]: request.route_url('set_locale', _query={'somevar': 'somevalue'

I think it has to do with the nested curly braces, because running the same command on

<html xmlns:i18n="http://xml.zope.org/namespaces/i18n"
  i18n:domain="test">
  <head><title>tetst</title></head>
  <body>
    <a href="${request.route_url('set_locale')}">Test</a>
  </body>
</html>

works without error.
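
If the cause is that the expression is cut at the first closing brace, one way around it is to scan for the matching brace while tracking nesting depth and quoted strings. The following is only a minimal sketch of that idea, not lingua's actual parser; find_expression_end is a hypothetical helper name, and triple-quoted strings are not handled.

```python
def find_expression_end(text, start):
    """Return the index of the '}' that closes the '${' starting at `start`.

    Hypothetical helper, not part of lingua. It counts nested braces and
    skips over quoted strings, so a dict literal inside the expression
    does not end it early. Returns -1 if no matching brace is found.
    """
    assert text.startswith('${', start)
    depth = 0
    quote = None  # quote character of the string we are inside, if any
    i = start + 1  # position of the opening '{'
    while i < len(text):
        ch = text[i]
        if quote:
            if ch == '\\':
                i += 1  # skip the escaped character
            elif ch == quote:
                quote = None
        elif ch in ('"', "'"):
            quote = ch
        elif ch == '{':
            depth += 1
        elif ch == '}':
            depth -= 1
            if depth == 0:
                return i
        i += 1
    return -1


# The attribute value from the failing template above.
attr = "${request.route_url('set_locale', _query={'somevar': 'somevalue'})}"
end = find_expression_end(attr, 0)
expression = attr[2:end]
```

With this approach the extracted expression keeps both closing braces and the closing parenthesis, so it can be handed to the Python parser as a whole.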

python3 support

Maybe xlrd3 and xlwt3 could be used to support xls imports and exports.

improper .pot file generated?

pot-create generates the following from _('Logout ${login}', mapping={'login':request.user.login})

#: ./mainserver/templates/menu.mako:70
#, python-format
msgid "Logout ${login}"
msgid_plural "login"
msgstr ""

msginit then chokes on that with the message: "mainserver.pot:172:10: syntax error" (172 is the "msgstr" line above)

I'm not sure if this is a lingua or Babel issue, but it seems like it's incorrectly detecting the translation string as plural and then generating some .pot content that's not proper syntax. If I change line 172 to msgstr[0] "" the syntax error goes away, but obviously it's still not what I wanted.

Skip expression-only messages

It is possible to force translation of pure strings by using i18n:translate="". This can be combined with expressions and you get something like this:

<span i18n:translate="">${title}</span>

This is currently extracted by lingua and produces this entry in the POT file:

msgid "${title}"
msgstr ""

which is not very useful. Lingua should skip all messages which only consist of expressions.
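
A possible check for this, as a rough sketch only: treat a message as expression-only when it consists of nothing but ${...} or $name interpolations and whitespace. The function name is_expression_only is hypothetical, and the pattern assumes expressions do not contain a literal closing brace.

```python
import re

# Matches a message made up solely of ${...} or $name interpolations
# and whitespace. Assumption: no '}' inside an expression.
EXPRESSION_ONLY = re.compile(
    r'^(?:\s*(?:\$\{[^}]*\}|\$[A-Za-z_][A-Za-z0-9_]*))+\s*$')


def is_expression_only(msgid):
    """Return True if `msgid` has no translatable text of its own."""
    return bool(EXPRESSION_ONLY.match(msgid))


skip_title = is_expression_only('${title}')
keep_logout = is_expression_only('Logout ${login}')
```

An extractor could call such a check just before emitting a message and drop the ones for which it returns True.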

Domain not respected when extracting i18n:attributes in Chameleon template

$ cat test.pt
<ul i18n:domain="domX">
  <li tal:repeat="item items">
    <a title="Edit" i18n:attributes="title"></a>
  </li>
</ul>
$ pot-create -d domY -o out.pot test.pt
$ cat out.pot
#
# SOME DESCRIPTIVE TITLE
# This file is distributed under the same license as the PACKAGE package.
# FIRST AUTHOR <EMAIL@ADDRESS>, 2015.
#, fuzzy
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE 1.0\n"
"POT-Creation-Date: 2015-10-09 13:40+0200\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Language: \n"
"Generated-By: Lingua 3.11\n"

#: ./test.pt:3
msgid "Edit"
msgstr ""

I expect extraction to skip Edit as it's declared in context of domX domain while extraction is for domY domain.
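
To make the expected behaviour concrete, here is a small sketch of domain-aware extraction of i18n:attributes using only the standard library. It is not how lingua is implemented. The function name extract_attributes is hypothetical, namespace declarations were added to the template so that it parses as XML, and treating elements with no i18n:domain as matching is an assumption.

```python
import xml.etree.ElementTree as ET

I18N = 'http://xml.zope.org/namespaces/i18n'


def extract_attributes(element, wanted_domain, current_domain=None):
    """Yield (attribute, msgid) pairs for i18n:attributes in `wanted_domain`.

    Simplified sketch: the domain is inherited from the nearest ancestor
    with i18n:domain, and only the attribute name of each
    i18n:attributes entry is used (explicit message ids are ignored).
    """
    current_domain = element.get('{%s}domain' % I18N, current_domain)
    spec = element.get('{%s}attributes' % I18N, '')
    # Assumption: no declared domain (None) counts as a match.
    if current_domain in (None, wanted_domain):
        for item in spec.split(';'):
            parts = item.split()
            if parts and element.get(parts[0]):
                yield parts[0], element.get(parts[0])
    for child in element:
        yield from extract_attributes(child, wanted_domain, current_domain)


template = '''
<ul xmlns:tal="http://xml.zope.org/namespaces/tal"
    xmlns:i18n="http://xml.zope.org/namespaces/i18n"
    i18n:domain="domX">
  <li tal:repeat="item items">
    <a title="Edit" i18n:attributes="title"></a>
  </li>
</ul>
'''
root = ET.fromstring(template)
in_domx = list(extract_attributes(root, 'domX'))
in_domy = list(extract_attributes(root, 'domY'))
```

Here in_domx contains the Edit message while in_domy is empty, which is the result I would expect from pot-create -d domY.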

issue with using babel-mako extractor?

After updating to 2.5... (though it also didn't work in 2.4, but gave a different error)

I have a file lingua.cfg with contents:

[extension:.mako]
plugin = babel-mako

[extension:.pt]
plugin = xml

[extension:.py]
plugin = python

Then I try to use pot-create:

$env/bin/pot-create -c lingua.cfg -d mainserver -o mainserver/locale/mainserver.pot mainserver
Traceback (most recent call last):
  File "env/bin/pot-create", line 11, in <module>
    sys.exit(main())
  File "/sites/metrics_dev/env/lib/python3.4/site-packages/lingua/extract.py", line 218, in main
    read_config(options.config)
  File "/sites/metrics_dev/env/lib/python3.4/site-packages/lingua/extract.py", line 154, in read_config
    EXTENSIONS[extension] = EXTRACTORS[plugin](config)
TypeError: wrapper() missing 1 required positional argument: 'options'

prior to the update I got:

$env/bin/pot-create -c lingua.cfg -d mainserver -o mainserver/locale/mainserver.pot mainserver
Traceback (most recent call last):
  File "env/bin/pot-create", line 11, in <module>
    sys.exit(main())
  File "/sites/metrics_dev/env/lib/python3.4/site-packages/lingua/extract.py", line 250, in main
    catalog.save(options.output)
  File "/sites/metrics_dev/env/lib/python3.4/site-packages/polib.py", line 424, in save
    contents = getattr(self, repr_method)()
  File "/sites/metrics_dev/env/lib/python3.4/site-packages/polib.py", line 620, in __unicode__
    return ret + _BaseFile.__unicode__(self)
  File "/sites/metrics_dev/env/lib/python3.4/site-packages/polib.py", line 320, in __unicode__
    ret.append(entry.__unicode__(self.wrapwidth))
  File "/sites/metrics_dev/env/lib/python3.4/site-packages/polib.py", line 956, in __unicode__
    val = getattr(self, c[0])
  File "/sites/metrics_dev/env/lib/python3.4/site-packages/lingua/extract.py", line 45, in comment
    return u'\n'.join(self._comments)
TypeError: sequence item 0: expected str instance, list found

pgettext does not work from babel plugins

I am using babel-javascript, which handles pgettext correctly when I run it through pybabel or setup.py extract_messages. However, when it is used via pot-create, the pgettext arguments are interpreted as if they were ngettext arguments.

pgettext('Task', 'Type') in a .js file and processed with babel-javascript should result in

msgctxt "Task"
msgid "Type"
msgstr ""

But currently results in

msgid "Task"
msgid_plural "Type"
msgstr[0] ""
msgstr[1] ""
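
For illustration, Babel describes keywords with argument-position specs in which a (position, 'c') entry marks the context argument. The sketch below shows how such a spec could be applied so that pgettext and ngettext produce different fields. The KEYWORDS table is modelled on that convention and apply_keyword_spec is a hypothetical helper; neither is taken from lingua.

```python
# Keyword specs in the style Babel uses: 1-based argument positions,
# with (position, 'c') marking the message context argument.
KEYWORDS = {
    'gettext': (1,),
    'ngettext': (1, 2),
    'pgettext': ((1, 'c'), 2),
}


def apply_keyword_spec(function, args):
    """Map positional string arguments to msgctxt/msgid/msgid_plural."""
    result = {'msgctxt': None, 'msgid': None, 'msgid_plural': None}
    for entry in KEYWORDS[function]:
        if isinstance(entry, tuple):
            # Context entry: (position, 'c')
            position, _flag = entry
            result['msgctxt'] = args[position - 1]
        elif result['msgid'] is None:
            result['msgid'] = args[entry - 1]
        else:
            result['msgid_plural'] = args[entry - 1]
    return result


context_message = apply_keyword_spec('pgettext', ('Task', 'Type'))
plural_message = apply_keyword_spec('ngettext', ('Task', 'Tasks'))
```

A plugin that simply treats any two-string tuple as singular and plural would lose the distinction that the spec makes explicit.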

Issue loading babel-jinja2

I'm having an issue loading the babel-jinja2 plugin:

%> pot-create -c lingua.cfg proteolims
Traceback (most recent call last):
  File "/u/jpl/virtenvs/proteolims/bin/pot-create", line 9, in <module>
    load_entry_point('lingua==3.9', 'console_scripts', 'pot-create')()
  File "build/bdist.linux-x86_64/egg/lingua/extract.py", line 262, in main
  File "build/bdist.linux-x86_64/egg/lingua/extractors/babel.py", line 42, in __call__
  File "build/bdist.linux-x86_64/egg/lingua/extractors/__init__.py", line 42, in check_c_format

  File "/u/jpl/virtenvs/proteolims/lib64/python2.6/re.py", line 186, in finditer
    return _compile(pattern, flags).finditer(string)
TypeError: expected string or buffer

If I remove the -c argument, lingua runs fine but does not visit the jinja2 files.
Any idea what the source of this problem might be?

lingua.cfg:

[extensions]
.jinja2 = babel-jinja2

My extractors:

%> pot-create --list-extractors
babel-ignore      Pseudo extractor that does not actually extract anything, but simply
babel-javascript  Extract messages from JavaScript source code.
babel-jinja2      Babel extraction method for Jinja templates.
babel-mako        Extract messages from Mako templates.
babel-python      Extract messages from Python source code.
chameleon         Chameleon templates (defaults to Python expressions)
python            Python sources
xml               Chameleon templates (defaults to Python expressions)
zcml              Zope Configuration Markup Language (ZCML)
zope              Zope templates (defaults to TALES expressions)

BabelExtractor is inherently broken if function not in keywords

We recently upgraded to lingua 4.10 and stumbled upon the following stacktrace when running bin/pot-create with our custom extractor (which processes a csv file of our own format):

Traceback (most recent call last):
  File "bin/pot-create", line 9, in <module>
    load_entry_point('lingua==4.10', 'console_scripts', 'pot-create')()
  File "[...]/lib/python2.7/site-packages/lingua/extract.py", line 330, in main
    for message in extractor(real_filename, options):
  File "[...]/lib/python2.7/site-packages/lingua/extractors/babel.py", line 45, in __call__
    check_c_format(msgid, flags)
  File "[...]/lib/python2.7/site-packages/lingua/extractors/__init__.py", line 42, in check_c_format
    formats = list(re.finditer('%(?!%)', buf))
  File "[...]/lib/python2.7/re.py", line 190, in finditer
    return _compile(pattern, flags).finditer(string)
TypeError: expected string or buffer

Essentially, this is the same stacktrace as described in the closed issue #56.

After some investigation, the problem seems to be the following section in babel.py

if not isinstance(args, (list, tuple)):
    args = [args]
args = [(None, a, lineno) for a in args]
if function in self.keywords:
    (domain, msgctxt, msgid, msgid_plural, c) = parse_keyword(args, self.keywords[function], filename, lineno)
    if c:
        comment.append(c)
else:
    msgid = args[0]
    domain = msgid_plural = None

If there is no function set or the function isn't set in self.keywords, msgid will be set to args[0], which will always be a tuple instead of a string (because of the args = [(None, a, lineno) for a in args] statement before), causing both the check_c_format and the check_python_format function to fail. Even if that else: path wasn't broken, the final yield statement would also fail, given that msgctxt was never assigned.

I'm not involved enough with lingua to be able to really tell what you were trying to do here, but it seems to be quite broken. Any suggestions for adaptations to get it working again?
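
As a starting point for discussion, this is roughly what I would expect the fallback branch to do: unpack the wrapped argument so that msgid is a string again, and give every field a value before it is used. It is a sketch only. The function name unknown_function_fallback is hypothetical, and whether the first argument is the right message to pick is an assumption.

```python
import re


def unknown_function_fallback(args, lineno):
    """Mimic the quoted `else:` branch, but return a plain string msgid."""
    if not isinstance(args, (list, tuple)):
        args = [args]
    # Same wrapping as in the quoted section of babel.py.
    wrapped = [(None, a, lineno) for a in args]
    # Unpack the wrapper instead of using wrapped[0] directly ...
    _token, msgid, _lineno = wrapped[0]
    # ... and bind every remaining field, including msgctxt.
    domain = msgctxt = msgid_plural = None
    return domain, msgctxt, msgid, msgid_plural


domain, msgctxt, msgid, msgid_plural = unknown_function_fallback('Hello %s', 3)
# The same regular expression as in check_c_format now receives a string.
formats = list(re.finditer('%(?!%)', msgid))
```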
