Comments (18)
I'm starting to wonder whether we should reimplement msgmerging in Perl directly. Sounds like a nightmare to do, but working around msgmerge issues is also demanding...
from po4a.
Could you please expand on how po4a makes your life more complex? Shouldn't this paragraph solve these issues?
Once setup, invoking po4a is enough to update both the translation PO files and translated documents. You may pass the "--no-translations" to po4a to not update the translations (thus only updating the PO files) or "--no-update" to not update the PO files (thus only updating the translations). This roughly corresponds to the individual po4a-updatepo and po4a-translate scripts which are now deprecated (see ``Why are the individual scripts deprecated'' in the FAQ below).
A link to your project would help me debug the situation, if possible.
from po4a.
Hi,
Could you please expand on how po4a makes your life more complex? Shouldn't this paragraph solve these issues?
The main 2 issues are:
- I need to re-generate the config file each time a new document is added.
- I cannot do incremental builds, as it does not allow me to process one file at a time.
For that reason I opted to use the deprecated scripts, which I was able to integrate in make. But I would be willing to change this if it fixes the problems I am having.
Once setup, invoking po4a is enough to update both the translation PO files and translated documents. You may pass the "--no-translations" to po4a to not update the translations (thus only updating the PO files) or "--no-update" to not update the PO files (thus only updating the translations). This roughly corresponds to the individual po4a-updatepo and po4a-translate scripts which are now deprecated (see ``Why are the individual scripts deprecated'' in the FAQ below).
A link to your project would help me debug the situation, if possible.
Sure, luckily the project was made public a couple of months ago: https://gitlab.com/securityinabox/securityinabox.gitlab.io/
You can find all the po4a stuff in the Makefile, but it is not very easy to read as I had to use macros.
from po4a.
The first issue seems related to #272, right? For the second one, I'll have to investigate your project a bit. No worries, I speak makefile :)
from po4a.
For the first issue: yes, globbing would solve that problem
For the second issue: thanks a lot, I really appreciate the help!
Now, this issue in particular (PO files changing format when updated).. is it a bug, or a result of me not using POT files? If it is not a bug, is there a way to do this properly with po4a-update?
from po4a.
I have no idea :)
from po4a.
I had some difficulty reproducing it, so here are some clearer steps:
$ cat >foo.md <<EOF
Hello World
===========
EOF
$ rm -f en.po && po4a-updatepo -f text -o markdown -m foo.md -p en.po
$ tail -n 5 en.po
#. type: Title =
#: foo.md:2
#, markdown-text, no-wrap
msgid "Hello World"
msgstr ""
Notice that the "Hello World" string has the markdown-text flag.
Now edit the file:
$ sed -i 's/World/world/' foo.md
$ po4a-updatepo -f text -o markdown -m foo.md -p en.po
$ tail -n 5 en.po
#. type: Title =
#: foo.md:2
#, no-wrap
msgid "Hello world"
msgstr ""
The markdown-text flag is gone.
po4a doesn't seem to add the flag at all.
The flag was introduced by #208, and it looks like a bug in both po4a and po4a-updatepo (when updating a file).
from po4a.
Digging a bit further: it is actually getting removed by msgmerge -U... I don't know if anything could be done on po4a's part about that...
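To see which entries lose flags in a merge, the `#,` lines of a freshly generated template can be compared against the merged file. This is only a minimal sketch using the standard library: it assumes single-line `msgid` entries like the ones above, and `read_flags` and `dropped_flags` are hypothetical helper names, not part of po4a or gettext.

```python
def read_flags(po_text):
    """Map each single-line msgid to the set of flags on its '#,' line."""
    flags_by_msgid = {}
    pending = set()
    for line in po_text.splitlines():
        line = line.strip()
        if line.startswith('#,'):
            # Flags are comma-separated after the '#,' marker.
            pending |= {f.strip() for f in line[2:].split(',') if f.strip()}
        elif line.startswith('msgid "') and line.endswith('"'):
            msgid = line[len('msgid "'):-1]
            flags_by_msgid[msgid] = pending
            pending = set()
    return flags_by_msgid


def dropped_flags(template_text, merged_text):
    """Return {msgid: flags in the template that the merged file lacks}."""
    template = read_flags(template_text)
    merged = read_flags(merged_text)
    result = {}
    for msgid, flags in template.items():
        if msgid not in merged:
            continue
        # 'fuzzy' is a translation state, not a format flag, so ignore it.
        lost = flags - merged[msgid] - {'fuzzy'}
        if lost:
            result[msgid] = lost
    return result


template = '''#. type: Title =
#: foo.md:2
#, markdown-text, no-wrap
msgid "Hello world"
msgstr ""
'''
merged = '''#. type: Title =
#: foo.md:2
#, no-wrap
msgid "Hello world"
msgstr ""
'''
print(dropped_flags(template, merged))  # {'Hello world': {'markdown-text'}}
```

Multi-line msgids and msgctxt are deliberately not handled here; a real check would need a proper PO parser.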
from po4a.
I wonder if anybody is actually maintaining GNU gettext. A bug about this has been open for more than 2 years with no replies, and there are reports on the mailing list as well:
- https://savannah.gnu.org/bugs/?60947
- https://lists.nongnu.org/archive/html/bug-gettext/2015-11/msg00011.html (from 2015!)
- https://lists.nongnu.org/archive/html/bug-gettext/2021-04/msg00002.html
On the flip side, pot2po from Template Toolkit seems to do this right:
$ cat >foo.md <<EOF
Hello World
===========
EOF
$ rm -f foo.pot && po4a-updatepo -f text -o markdown -m foo.md -p foo.pot
$ cp foo.pot foo.es.po
$ sed -i 's/msgstr ""/msgstr "Hola Mundo"/' foo.es.po
$ tail -n 5 foo.es.po
#. type: Title =
#: foo.md:2
#, markdown-text, no-wrap
msgid "Hello World"
msgstr "Hola Mundo"
$ sed -i 's/World/world/' foo.md
$ rm -f foo.pot && po4a-updatepo -f text -o markdown -m foo.md -p foo.pot
$ pot2po -t foo.es.po foo.pot foo.es.po
$ tail -n 5 foo.es.po
#. type: Title =
#: foo.md:2
#, fuzzy, markdown-text, no-wrap
msgid "Hello world"
msgstr "Hola Mundo"
from po4a.
I simply cannot find my way around the Template Toolkit source code. Could someone direct me to the right location where the fuzzy matching is done? Thanks in advance.
from po4a.
I am sorry, my memory betrayed me (I used to use Template::Toolkit a lot back in my Perl days 😂), I meant Translate Toolkit.
from po4a.
The conversion is done in this file: https://github.com/translate/translate/blob/master/translate/convert/pot2po.py but I am having trouble following the many layers of abstraction they use..
from po4a.
This is here: https://github.com/translate/translate/tree/master/translate/search
A Levenshtein distance is used, with some tricks to speed things up. I need to read that further to see if we too could do without msgmerge.
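For readers unfamiliar with the idea, fuzzy matching on a Levenshtein distance can be sketched roughly as below. This is not the Translate Toolkit code and has none of its speed-ups; `similarity`, `best_match` and the 0.75 threshold are assumptions made up for the illustration.

```python
def levenshtein(a, b):
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a, b):
    """Normalise the distance to a score between 0.0 and 1.0."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def best_match(new_msgid, old_msgids, threshold=0.75):
    """Return the most similar old msgid, or None if none reaches the threshold."""
    best, best_score = None, threshold
    for old in old_msgids:
        score = similarity(new_msgid, old)
        if score >= best_score:
            best, best_score = old, score
    return best


print(levenshtein("Hello World", "Hello world"))               # 1
print(best_match("Hello world", ["Hello World", "Goodbye"]))   # Hello World
```

A merge tool would then copy the old msgstr onto the new msgid and mark the entry fuzzy, which is the behaviour seen in the pot2po run above.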
from po4a.
The more I think about it, the less I think we should remove our dependency on gettext. Maybe we should fix gettext for others to enjoy it too.
from po4a.
This is here: https://github.com/translate/translate/tree/master/translate/search A Levenshtein distance is used, with some tricks to speed things up. I need to read that further to see if we too could do without msgmerge.
Yesterday I ran msgmerge and pot2po on a bunch of outdated PO files, and the merging was equivalent. But I found 2 other problems in pot2po:
- Does not allow changing the wrapping setting (the code is there, but no CLI option)
- Loses previous translation strings (#|)
The more I think about it, the less I think we should remove our dependency on gettext. Maybe we should fix gettext for others to enjoy it too.
It looks to me like Translate Toolkit has the potential to become the po2a replacement at some point, but it does not seem to be there yet. Fixing gettext would be the ideal solution, but from those bug reports I am not holding much hope. A workaround for now could be to re-add the flags in po4a after gettext runs?
from po4a.
I looked at the gettext code, and the fuzzying code is much more efficient and advanced than a simple Levenshtein distance. They use something about ngram which I don't quite understand, but which seems to be the state-of-the-art for fuzzy text matching.
Someone should fix gettext, that'd be so much better :(
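As a rough illustration of what n-gram matching means in general, here is a small sketch that scores two strings by their shared character trigrams (a Dice coefficient). It is emphatically not gettext's implementation, which I have not reproduced; `ngrams` and `ngram_similarity` are hypothetical names for this example only.

```python
from collections import Counter


def ngrams(text, n=3):
    """Split text into overlapping character n-grams, padded with spaces."""
    padded = ' ' * (n - 1) + text + ' ' * (n - 1)
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def ngram_similarity(a, b, n=3):
    """Dice coefficient over the multisets of n-grams of a and b."""
    grams_a = Counter(ngrams(a, n))
    grams_b = Counter(ngrams(b, n))
    # Counter intersection keeps the minimum count of each shared n-gram.
    shared = sum((grams_a & grams_b).values())
    total = sum(grams_a.values()) + sum(grams_b.values())
    return 2.0 * shared / total if total else 1.0


print(ngram_similarity("Hello World", "Hello world"))
print(ngram_similarity("Hello World", "Goodbye"))
```

The appeal of this family of methods is that n-gram sets can be indexed, so candidates can be shortlisted without comparing every pair of strings in full.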
from po4a.
FYI, I wrote a hacky script to copy the missing flags after running msgmerge. Of course, each tool does things slightly differently, so polib wraps things a bit differently, but it is usable:
#!/usr/bin/env python3
# PO fix-up: copy missing flags from POT and re-wrap.
import argparse

import polib


# Copy flags from POT file, as GNU gettext drops any custom flags.
def copy_flags(pot, po):
    potentries = {
        entry.msgid_with_context: entry for entry in pot
    }
    for poentry in po:
        potentry = potentries.get(poentry.msgid_with_context)
        if not potentry:
            continue
        # Compare flags without 'fuzzy', which is a state of the PO entry.
        potflags = set(potentry.flags) - {'fuzzy'}
        poflags = set(poentry.flags) - {'fuzzy'}
        missing = potflags - poflags
        if newflags := poflags - potflags:
            print('Unexpected new flags in file %s: %s' % (po.fpath, newflags))
        # Append only the missing flags, keeping the order used in the POT.
        poentry.flags.extend(
            flag for flag in potentry.flags if flag in missing)
    return po


def main():
    parser = argparse.ArgumentParser(
        description='Copy flags from POT file and re-wrap')
    parser.add_argument(
        'pofile', metavar='DEST',
        help='PO file to copy flags to')
    parser.add_argument(
        'potfile', metavar='TEMPLATE',
        help='template to copy flags from')
    parser.add_argument(
        '--wrap', metavar='N', type=int, default=77,
        help='wrap lines after N columns')
    args = parser.parse_args()
    pot = polib.pofile(args.potfile)
    # The wrap width is applied when the PO file is saved.
    po = polib.pofile(args.pofile, wrapwidth=args.wrap)
    copy_flags(pot, po)
    po.save()


if __name__ == '__main__':
    main()
from po4a.