mozilla / puente Goto Github PK
View Code? Open in Web Editor NEWUNMAINTAINED: Django/Jinja2 l10n extract/merge commands and things (Tower replacement)
License: BSD 3-Clause "New" or "Revised" License
UNMAINTAINED: Django/Jinja2 l10n extract/merge commands and things (Tower replacement)
License: BSD 3-Clause "New" or "Revised" License
Every time we land stuff, it should update the docs on readthedocs.
Tower and thus Puente have some code to mark all translated strings from gettext calls as safe and thus the values don't get escaped when used in a Jinja2 template.
Jinja2 defaults to everything being unsafe except parts which are explicitly marked as safe. Several things:
This issue covers ditching that code, putting a big backwards-compatability note in the changelog and writing up documentation walking developers through how they should be doing gettext in their templates in regards to |safe
.
The extract command is inconsistent in its error handling and exit codes.
FIX IT!
Puente has a merge
command and pybabel doesn't. It's possible pybabel's update
command is the same thing, but I don't know offhand.
This issue covers looking into that. Are they doing the same thing? If so, then we should convert Puente's merge to update and make it behave like pybabel's update. If not, then what's the pybabel equivalent?
The documentation for using gettext, ngettext, pgettext and npgettext in templates is pretty lacking. Amongst other things, we wanted to flesh out the various situations regarding escaping, variable interpolation, etc.
This issue covers that work.
Tower has a bunch of stuff for compendiums which I never ripped out of the extract/merge commands for Puente.
Does anyone use that? As far as I can tell, the answer is "no".
Does Django have something like that? As far as I can tell, the answer is still "no".
If those two things are really "no", then we should nix it.
SUMO has some infrastructure for extracting strings from the database and putting them in a Python file which then gets passed over for strings extraction.
Input is going to need that, too.
Can we add that to Puente?
We should add the rest of the details to the .pot file header with sensible defaults:
When using Jinja2 templates, assert method from Django's SimpleTestCase that relate to rendering don't work namely these:
Also, the response object has no context, so you can't test whether things were correctly in the context.
I wrote about it a bit in my blog post here:
I did some monkeypatch stuff in Fjord to deal with this. @robhudson is hitting the same problem in MDN. Maybe it makes sense to fix this issue in puente, while we/others figure out how to fix it upstream in Jinja2, django-jinja, Django, etc.
As of January 1 2019, Mozilla requires that all GitHub projects include this CODE_OF_CONDUCT.md file in the project root. The file has two parts:
If you have any questions about this file, or Code of Conduct policies and procedures, please see Mozilla-GitHub-Standards or email [email protected].
(Message COC001)
The merge command say that the locale/compendia
directory will be used by ./manage.py merge
:
puente/puente/management/commands/merge.py
Lines 21 to 26 in 7c04aad
During some recent tests (mdn/kuma#5302), compendia in this folder were not used to initialize a new .po
file from the template .pot
file. I suspect this functionality was removed, but the docstring wasn't cleaned up. I was able to write commands to get the desired behavior.
We need more tests for the merge command. First thing would be to look at the Tower code and see what's there. Second thing would be to add tests for things that aren't tested.
If someone wants to handle the first thing (which should be more straight-forward), I can do the analysis for the second thing.
We should implement a compile command or document how to use the babel one.
Add all the ones that we're currently missing.
Instead of converging on django's makemessages, we might want to converge instead on Babel's extract, init, update and compile. If we did this, then the amount of code we maintain drops by a lot.
We could do this now, but there are a few caveats:
First, Puente has its own PuenteI18nExtension
which fixes trans blocks to collapse whitespace and fixes gettext to output markup as safe. For extraction purposes, only the trans block fix is important. However, the Jinja2 Babel extractor does this goofy thing where it automatically adds Jinja2's i18n extension which overrides ours before doing extraction. Puente has an extract command that monkeypatches that first, then does the work. If we use Babel, we'll have to use our own extractor that does that, too.
We can fix this with our own extractor. Also, this problem goes away if when Jinja2 supports trimmed trans blocks.
Second, Puente pulls the Jinja2 extensions from the settings file and uses those to set up the environment for strings extraction. If we switch to Babel, we have to put those extensions in a babel.cfg
file which means we end up with two places we have to list the extensions.
I'm not sure what to about this one. We can live with it, but it's problematic. Maybe if we write our own extractor then it could pull the extensions and populate itself?
Third, if you're using vendor/
then you have to do all this stuff to make sure the pybabel script can access the libs being used. Further, if you have babel from source (i.e. a git submodule), then it's not installed and doesn't have the cdlr data.
This goes away when we stop doing the vendor thing.
Tests raise this warning:
tests/test_merge.py::TestMergecommand::test_basic
/home/travis/build/mozilla/puente/puente/commands.py:201: ResourceWarning: unclosed file <_io.TextIOWrapper name='/tmp/pytest-of-travis/pytest-0/test_basic0/locale/templates/LC_MESSAGES/django.pot' mode='r' encoding='UTF-8'>
domain_pot_file = open(domain_pot)
My reading of the code is that when the locale is en_US
, the file is left open:
Lines 201 to 223 in 4e22c22
It may be possible to eliminate the warning by closing domain_pot_file
before leaving the if
clause. A larger change would be to modify the code to use a context manager.
At some point P, we'll decide that this experiment is good and we should move forward. At that point, we should move this to the mozilla github org.
We should do that before doing any releases.
I had the best of intentions when setting silent=False automatically. However, I misread the Jinja2 code. I thought it'd just print out the errors and move on. I didn't realize it'd print out the error and then set itself on fire and die an agonizing death. That's totally not what we want.
Given that, we should nix it automatically getting added.
Add documentation and examples for using the extract and merge commands.
jingo 0.8.1 support django 1.7+. Given that, we should test puente with it and update the documentation where appropriate.
FYI: The following changes were made to this repository's wiki:
defacing spam has been removed
the wiki has been disabled, as it was not used
These were made as the result of a recent automated defacement of publically writeable wikis.
Pretty sure that we can drop collapsing whitespace in trans blocks in Jinja2 templates. We should verify that, update the Jinja2 requirement accordingly, and drop the hacky handling for it in Puente.
We need more tests for the extract command. First thing would be to look at the Tower code and see what's there. Second thing would be to add tests for things that aren't tested.
If someone wants to handle the first thing (which should be more straight-forward), I can do the analysis for the second thing.
Need to write up documentation on the following topics:
This is essentially a derivative of Tower and Fjord.
We need to double-check the license and copyright information through the files of the project.
Gah! We have no tests for ngettext!
Django has a makemessages command that does both extract and merge. Why did Tower separate the two steps? Can we merge the two commands into one thus getting us closer to vanilla Django?
We need a logo. Something that really evokes the underlying reason for existence for this project.
As soon as we come up with an idea, I'll draw it freehand in gimp.
A bunch of this code is derived from Tower. Ergo we should include that AUTHORS list with this one.
Need to add tox environments for python 3.3, 3.4 and 3.5 and then test Puente out with them all and fix the issues.
We inherited STANDALONE_DOMAINS
from Tower. The code is still in the extract and merge commands, but the setting isn't documented in the docs and the default is to merge all the extracted strings into one .pot
file regardless of what DOMAIN_METHODS
is which I don't think is a good default. A better default would be to have one .pot
file for each domain.
I checked Input and MDN and they're not using this feature. SUMO uses it, but doesn't need to--they could trivially tweak their configuration so they didn't need it anymore. I don't know of anyone else using it and can't think of a compelling reason to keep it around.
Given that, I think we should ditch the STANDALONE_DOMAINS
code from the extract and merge commands. Instead, each domain listed in DOMAIN_METHODS
should get its own .pot
file.
Templates have _
(alias for gettext), gettext
and ngettext
available. These are handled by Jinja2 contextfunctions which call the underlying installed gettext/ngettext callables, get strings back and then wrap in a Markup
.
https://github.com/mitsuhiko/jinja2/blob/master/jinja2/ext.py#L135
Yay! That's great!
However, there's no pgettext and no way that I can see to pass in a msgctxt.
This issue covers figuring out what to do about that. At the moment, I'm leaning towards adding it as a global to our PuenteI18nExtension
for the short-term.
Long-term, we should submit a PR for this issue: pallets/jinja#441
In the tower days, the mozilla sites all used vendor/
where they would have all the dependencies on the file system. This made it easy to specify extraction patterns so that strings from those dependencies would be included in the .pot
file.
Nowadays, mozilla sites aren't doing vendor/
. Instead, we've got requirements files and during deploys the dependencies are installed into a virtual environment.
The problem this creates is that there's no way to specify extraction for strings in a dependency. There's no way to know where that dependency's files are at the time the settings module is being loaded.
One possibility to figure this out is to do something like this:
from django.utils.functional import lazy
from django.utils.module_loading import import_string
import os
def get_pattern(module_name, pattern):
module = import_string(module_name)
path = os.path.dirname(module.__file__)
return path + pattern
get_path_lazy = lazy(get_path, str)
...
PUENTE = {
...
'DOMAIN_METHODS': {
'django': [
('**.py', 'python'),
('fjord/**/jinja2/**.html', 'jinja2'),
('fjord/**/templates/**.html', 'django'),
(get_path_lazy('django', '*.py'), 'python'),
(get_path_lazy('allauth', '**.html'), 'django')
],
...
}
}
Assuming that works (I haven't tested it), then is this something Puente should internalize? We can't let people import Puente things in their settings file--that's a world of hurt waiting to happen. We could provide a three-item tuple for specifying patterns. So then the above might be:
PUENTE = {
...
'DOMAIN_METHODS': {
'django': [
('**.py', 'python'),
('fjord/**/jinja2/**.html', 'jinja2'),
('fjord/**/templates/**.html', 'django'),
('django', '*.py', 'python'),
('allauth', '**.html', 'django')
],
...
}
}
Having three items would tip Puente off into knowing to import the module and use that as a root path for the pattern. Plus it's clear in the configuration what that pattern applies to.
Thoughts on other or better ways to solve this?
It'd be much easier for migrating if Puente worked with Django 1.7 and jingo. We should set up a tox environment for that and see how well it works.
In Django 1.8 option_list
was depracted in favour of implementing add_arguments
, and as such with Django's deprecation policy, the former way was removed in 1.10.
The recommended fix is to implement add_argument(self, parser)
in the management modules.
The prefix is "Translators:" for Django: https://docs.djangoproject.com/en/1.8/topics/i18n/translation/#comments-for-translators
@robhudson is having problems with puente and mdn.
First off, he needs to set extensions to ['jinja2.ext.i18n', 'puente.ext.PuenteI18nExtension', ...]
otherwise Jingo doesn't install null gettext stuff and then gettext isn't in the context for puente to find. Here's our discussion about that:
https://github.com/mozilla/kuma/pull/3613/files#r43560442
Including both extensions seems to solve that, but there's a new problem where it's erroring out with:
19:32:21 web.1 | [02/Nov/2015 11:32:21] "GET / HTTP/1.1" 301 0
19:32:23 web.1 | Traceback (most recent call last):
19:32:23 web.1 | File "/usr/lib/python2.7/wsgiref/handlers.py", line 85, in run
19:32:23 web.1 | self.result = application(self.environ, self.start_response)
19:32:23 web.1 | File "/home/vagrant/src/vendor/src/django/django/contrib/staticfiles/handlers.py", line 64, in __call__
19:32:23 web.1 | return self.application(environ, start_response)
19:32:23 web.1 | File "/home/vagrant/src/vendor/src/django/django/core/handlers/wsgi.py", line 187, in __call__
19:32:23 web.1 | response = self.get_response(request)
19:32:23 web.1 | File "/home/vagrant/src/vendor/src/django/django/core/handlers/base.py", line 199, in get_response
19:32:23 web.1 | response = self.handle_uncaught_exception(request, resolver, sys.exc_info())
19:32:23 web.1 | File "/home/vagrant/src/vendor/src/django/django/core/handlers/base.py", line 111, in get_response
19:32:23 web.1 | response = wrapped_callback(request, *callback_args, **callback_kwargs)
19:32:23 web.1 | File "/home/vagrant/src/vendor/src/django/django/db/transaction.py", line 394, in inner
19:32:23 web.1 | return func(*args, **kwargs)
19:32:23 web.1 | File "/home/vagrant/src/kuma/landing/views.py", line 31, in home
19:32:23 web.1 | return render(request, 'landing/homepage.html', context)
19:32:23 web.1 | File "/home/vagrant/src/vendor/src/django/django/shortcuts.py", line 50, in render
19:32:23 web.1 | return HttpResponse(loader.render_to_string(*args, **kwargs),
19:32:23 web.1 | File "/home/vagrant/src/vendor/src/django/django/template/loader.py", line 178, in render_to_string
19:32:23 web.1 | return t.render(context_instance)
19:32:23 web.1 | File "/home/vagrant/src/vendor/src/jingo/jingo/__init__.py", line 206, in render
19:32:23 web.1 | return super(Template, self).render(context_dict)
19:32:23 web.1 | File "/home/vagrant/env/local/lib/python2.7/site-packages/jinja2/environment.py", line 989, in render
19:32:23 web.1 | return self.environment.handle_exception(exc_info, True)
19:32:23 web.1 | File "/home/vagrant/env/local/lib/python2.7/site-packages/jinja2/environment.py", line 754, in handle_exception
19:32:23 web.1 | reraise(exc_type, exc_value, tb)
19:32:23 web.1 | File "/home/vagrant/src/kuma/landing/templates/landing/homepage.html", line 1, in top-level template code
19:32:23 web.1 | {% extends "base.html" %}
19:32:23 web.1 | File "/home/vagrant/src/templates/base.html", line 12, in top-level template code
19:32:23 web.1 | <title>{% block title %}{{ _('Mozilla Developer Network') }}{% endblock %}</title>
19:32:23 web.1 | File "/home/vagrant/src/templates/base.html", line 12, in block "title"
19:32:23 web.1 | <title>{% block title %}{{ _('Mozilla Developer Network') }}{% endblock %}</title>
19:32:23 web.1 | File "/home/vagrant/src/vendor/src/puente/puente/ext.py", line 11, in _gettext_alias
19:32:23 web.1 | context.resolve('gettext')(context, text, *args, **kwargs)
19:32:23 web.1 | TypeError: <lambda>() takes exactly 1 argument (2 given)
The lambda could be the null gettext. I'm not sure what else it could be.
I have a few thoughts:
This issue covers figuring all this out, implementing whatever needs implementing and updating the documentation accordingly.
Supporting most of the options should be pretty easy. Some of them we're going to want values in settings so they don't have to be specified on the command line.
However, there's one big difference between Puente extract and pybabel extract that I'm not sure what to do with. Puente takes an output directory and a domain, list of domains or 'all' and then generates all those .pot files and sticks them in the directory. pybabel takes a mapping file and an output file and extracts everything specified in the mapping file and dumps it in the output file. Those are two pretty different behaviors that don't work together.
Jinja2 trans tags don't support context. However, I'm not sure if gettext in Jinja2 templates can support context.
This issue covers figuring out whether they can or can't. If they can, we should add tests for it. If they can't, we should figure out what to do about that and possibly write up an issue in the Jinja2 issue tracker to add it.
We should be running tox for all environments on every pull request etc on Travis.
If you have a bunch of domains and can't squash them into
django.po(t)
andjavascript.po(t)
, then we should talk--open up an issue.
Sumo has several tools that extract strings. For example, there are tools in BuddyUp that generate a .pot file directly. We then use the tower merge command to merge those into the .po files. Tower's extract command doesn't do anything.
Options I can see:
Tower added special code to handle msgctxt. Since then both Django and Babel added support for msgctxt, so I removed the code from Tower that did the magic things.
In Django, this is covered by the pgettext routines:
https://docs.djangoproject.com/en/1.8/topics/i18n/translation/#contextual-markers
This issue covers adding tests for msgctxt and making sure that Puente handles this correctly. If it turns out it doesn't, then we need to add code to handle it.
There's a "Star" button in the upper left hand corner of the Puente docs on readthedocs. It's busted, though. We probably have to add something to docs/conf.py
to fix it.
According to https://www.gnu.org/software/gettext/manual/html_node/msgmerge-Invocation.html that should be --backup=off
to the msgmerge
tool.
When I do a extract/merge it creates a lot of churn in git b/c the line wrapping widths vary from Pontoon to our merge
command. It'd be great if there were an option to override puente's default of 200 with one that matches Pontoon's value of 78.
In a run of ./manage.py merge
in the kuma project:
Merging javascript.po for ka
............ done.
Merging javascript.po for ko
.............. done.
Can not find (/home/vagrant/src/locale/LC_MESSAGES/LC_MESSAGES/javascript.po). Creating...
msginit: cannot create output file "/home/vagrant/src/locale/LC_MESSAGES/LC_MESSAGES/javascript.po": No such file or directory
Merging javascript.po for LC_MESSAGES
msgmerge: error while opening "/home/vagrant/src/locale/LC_MESSAGES/LC_MESSAGES/javascript.po" for reading: No such file or directory
Merging javascript.po for ln
............. done.
Merging javascript.po for mg
............ done.
Merge should detect a locale name of LC_MESSAGES
and skip merging.
Currently, none of the extractors work for Django templates. You can sort of use the Jinja2 extractor, but if an error is thrown when parsing (and it's likely this happens), then parsing silently fails and no strings are extracted.
Maybe add a dependency for django-babel https://pypi.python.org/pypi/django-babel which adds a Babel extractor for Django templates.
This issue covers looking into this further and then figuring out the best fix.
Some projects are still using Puente. We should make sure it supports all currently supported versions of Django.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.