
Comments (17)

osamuaoki avatar osamuaoki commented on July 18, 2024 1

from po4a.

osamuaoki avatar osamuaoki commented on July 18, 2024

I agree that adding a specific solution with a narrow scope may complicate the code without much benefit. That's not a good thing. But I think we can do better by adding a generic feature cleanly, without including such translation exclusion code within po4a. (There is such a need; see https://bugs.debian.org/607726 . As written there, the -o option may have some answer for XML, but it wasn't easy for me to implement.)

What po4a should offer is a way to specify, in po4a.cfg, 2 independent variants of the original English document: one to make the POT file, and another to make the translated text with the help of the PO file.

Both of these should be generated by an external program.

This approach allows us to include a lot of untranslatable content in many places. This is how I manage to include a lot of auto-generated statistical data in Debian Reference, using a convoluted manual Makefile. If po4a supported this kind of feature, I could clean up my Makefile :-)

For XML source, we can write an XSLT filter to exclude specific tag contents such as ... for use as the input for the POT. The final translated document can then be generated from the PO and the final English document, which still has tag contents such as ... .
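To make the XML idea concrete, here is a minimal sketch of such a prefilter, written in Python rather than XSLT. The translate="no" attribute is purely hypothetical (the actual tag is not named above), and nothing here is part of po4a itself. It drops every marked element so that the result can serve as the POT input, while the untouched file stays the master document.

```python
import xml.etree.ElementTree as ET


def strip_untranslatable(xml_text: str, attr: str = "translate", value: str = "no") -> str:
    """Return xml_text without the elements marked as untranslatable.

    The attribute name and value are assumptions made for this sketch.
    Note that text following a removed element (its "tail") is dropped too.
    """
    root = ET.fromstring(xml_text)
    # Visit every element as a potential parent and remove marked children.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.get(attr) == value:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    source = (
        "<book>"
        "<chapter><title>Manual text</title></chapter>"
        '<chapter translate="no"><title>Auto added text</title></chapter>'
        "</book>"
    )
    print(strip_untranslatable(source))
```

The filtered output would be written to a separate file and used only for POT generation.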

For non-XML source, we can use CPP preprocessor directives to enable similar things by pre-processing.

This approach should be non-invasive and clean, I think...

from po4a.

osamuaoki avatar osamuaoki commented on July 18, 2024

This is a follow-up to my post yesterday.

As for implementing 2 English base input files, specifying this in the current po4a/po4a.cfg syntax isn't trivial and is very confusing.

I think the most reasonable approach is to create optional entries in po4a/po4a.cfg to set up custom prefilter programs:

  • [pot_prefilter]: optional entry to set up a prefilter for input source text -> source text fed into the "po4a-gettextize -m" option input file (POT generation base file)
  • [translation_prefilter]: optional entry to set up a prefilter for input source text -> source text fed into the "po4a-translate -m" option input file (translation file generation base file)

This approach should be compatible with the existing syntax while adding very generic flexibility to the po4a infrastructure.

from po4a.

osamuaoki avatar osamuaoki commented on July 18, 2024

Hmmm... maybe adding options to the po4a command for these prefilters would be even better.

from po4a.

mquinson avatar mquinson commented on July 18, 2024

I like this idea of pre-filtering the input document before extracting the POT file. I think that this is a very appealing approach to solve this problem. Any help (or even better, patch) going in that direction would be really appreciated.

Thanks for the insight.

from po4a.

mquinson avatar mquinson commented on July 18, 2024

Hello there.
Actually, there is a preliminary implementation already in po4a :)

If you specify the pot_in for a given document, this is the file used to build the POT and PO files. We have an example in t-02-addendums/book-potin.conf (that I plan to rewrite as I do for all tests currently):

[po4a_langs] ja
[po4a_paths] tmp/book.pot ja:t-02-addendums/book.po.ja

[type:docbook] t-02-addendums/book-auto.xml \
        pot_in:t-02-addendums/book.xml \
        ja:tmp/book-auto.ja.xml \
        add_ja:t-02-addendums/book.addendum1 \
        opt:"-k 0 -o nodefault=\"<bookinfo> <author>\" \
                  -o break=\"<bookinfo> <author>\" \
                  -o untranslated=\"<bookinfo>\" \
                  -o translated=\"<author>\""

We have:

--- t-02-addendums/book-auto.xml        2020-04-09 00:23:24.801047067 +0200
+++ t-02-addendums/book.xml     2020-04-09 00:23:24.801047067 +0200
@@ -59,11 +59,6 @@
   </totalfake>
 </bogustag>
 </chapter>
-<chapter><title>Title: Auto add text</title>
-<para>
-This is to emulate auto added non-translated content.
-</para>
-</chapter>
 <appendix><title>Title: Optional Appendix</title>
 <para>
 Appendixes are optional.

As a result, these strings are not added to the pot, so their translation is not found in the po, so they remain unchanged. So it ... works.

But this is very cumbersome, because one has to implement the filtering externally, which kinda goes against the whole spirit of the po4a binary as opposed to the po4a-* tools.

I'd prefer to have a filter, as @osamuaoki proposed. I still need to think of how to express such a filter in the config file.

from po4a.

mquinson avatar mquinson commented on July 18, 2024

Hello @osamuaoki, thanks for the detailed answer.

I must however confess that I'm a bit lost here. You speak of the pot_in feature as something that would be desirable, but it's already implemented, right? I just pushed some tests to ensure that it will continue to work in the future.

So, maybe you mean that this bug can be closed because the filtering thing that I was suggesting is less useful? If so, I agree. I changed my mind in the meantime, and I think that it is much easier to keep the filtering out of the po4a program, which is already rather complex. I don't think that we can find a solution that fits all needs for specifying the filtering command line in the po4a.conf, so I take it back: pot_in is sufficient from my point of view, and we could close this issue.

What would be needed from your point of view to close this?

Thanks for your help,
Mt.

from po4a.

mquinson avatar mquinson commented on July 18, 2024

Hello @osamuaoki, could you please help me understand what remains to be done before closing this issue?

Thanks in advance,

from po4a.

erciccione avatar erciccione commented on July 18, 2024

I have some paragraphs in a text file (markdown) that I don't want to have translated since they mostly contain code. Ideally I would have a POT file with some paragraphs marked as "not to translate" that would be ignored during conversions, so as to keep them in English in the translated file.

I've been looking for ways to achieve that, but it's hard to find a solution. I have now found this issue, but it's still not clear to me whether it's now possible to mark some paragraphs as not for translation. Is there currently a native way to achieve this? Is there a workaround I'm missing?

from po4a.

mquinson avatar mquinson commented on July 18, 2024

Hello @erciccione, sorry for the delay.

Did you see https://po4a.org/man/man1/po4a.1.php#lbAN in the documentation?

If you've read the doc and it's not sufficient, could you please elaborate on your question? The idea is to produce a filtered file where the content you want to hide is removed. This filtered file should be used as pot_in.

Maybe your question is about how to produce that filtered file removing the content you want to hide? Well, this is not in the field of po4a: you have to filter it on your side, to produce the file that will be used as pot_in in po4a.

I'm not quite sure of how I'd do this for text files. In markdown, I'd use specific markers in comments to indicate the beginning and end of each area to hide, and then I'd come up with a small crude Perl script to do the actual filtering.
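For illustration, the marker-based filtering described above could look roughly like the following. This is a sketch in Python rather than Perl, and the two marker strings are invented for this example; po4a does not define them. Everything between a begin marker and the matching end marker, markers included, is left out of the filtered file.

```python
# Hypothetical markers, chosen only for this sketch (not a po4a convention).
BEGIN = "<!-- po4a: begin no-translate -->"
END = "<!-- po4a: end no-translate -->"


def filter_for_pot(text: str) -> str:
    """Return text without the regions enclosed by the BEGIN/END marker lines."""
    kept = []
    hidden = False
    for line in text.splitlines(keepends=True):
        stripped = line.strip()
        if stripped == BEGIN:
            hidden = True
            continue
        if stripped == END:
            hidden = False
            continue
        if not hidden:
            kept.append(line)
    return "".join(kept)


if __name__ == "__main__":
    master = "Intro paragraph.\n" + BEGIN + "\nsome code sample\n" + END + "\nClosing paragraph.\n"
    print(filter_for_pot(master), end="")
```

The filtered text would be saved to its own file and referenced as pot_in, while the unfiltered markdown remains the master document.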

from po4a.

jnavila avatar jnavila commented on July 18, 2024

I don't think a prefilter is a correct solution. If I understand correctly, po4a would not see the content that is tagged as not translated when generating the POT files, because it would simply be eliminated from the original content beforehand. But when po4a blends the translations, the eliminated parts would need to be present, and they would be counted as not translated, thus defeating the translation statistics of the file and the threshold logic.

from po4a.

mquinson avatar mquinson commented on July 18, 2024

Well, that's the currently implemented solution :) What would you propose as a replacement?

from po4a.

mquinson avatar mquinson commented on July 18, 2024

Just to be sure we are on the same page here, @jnavila: filtering has been implemented and integrated into po4a for several years already. If you want to update it to make it easier for users, be my guest, but it's already working. There are even some tests.

One thing we could do is improve Po.pm so that it does not count missing entries as untranslated. That should be rather easy to implement, but it could have bad side effects for people using the po4a-* subscripts in the wrong order. That's a drawback with which I could live, probably.
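The effect of that change on the statistics can be shown with a small sketch. This is not the code of Po.pm; the function name and the ignore_missing switch are made up for the illustration. It computes the percentage of master strings that have a translation, with and without counting strings that are absent from the PO file.

```python
def translated_percent(master_msgids, translations, ignore_missing=False):
    """Percentage of master strings that have a non-empty translation.

    translations maps msgid -> msgstr. With ignore_missing=True, strings
    that do not appear in the PO at all (for example because they were
    filtered out of pot_in) are left out of the computation.
    """
    considered = [
        msgid for msgid in master_msgids
        if not (ignore_missing and msgid not in translations)
    ]
    if not considered:
        return 100.0
    done = sum(1 for msgid in considered if translations.get(msgid))
    return 100.0 * done / len(considered)


if __name__ == "__main__":
    master = ["Title", "Intro", "Appendix", "auto-generated table"]
    po = {"Title": "T", "Intro": "I", "Appendix": "A"}  # filtered string is absent
    print(translated_percent(master, po))                       # 75.0
    print(translated_percent(master, po, ignore_missing=True))  # 100.0
```

In the first call the filtered string drags the ratio down, which is how a fully translated document could fall below a keep threshold; in the second call it is simply not counted.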

from po4a.

jnavila avatar jnavila commented on July 18, 2024

OK. Thank you for clearing up what's done and what could be enhanced. I cannot commit on changes right now.

from po4a.

osamuaoki avatar osamuaoki commented on July 18, 2024

As far as functional features are concerned, I think this is a done deal. Line matching rules for addenda can now be created more intuitively, too.

As for ease of use for end users who need filtering, we may need documentation on XML filtering by attribute, with an example XSLT + Makefile, since these are nontrivial for most people.

So let's rename this issue 77.

from po4a.

mquinson avatar mquinson commented on July 18, 2024

Hello,

reading the logs of this issue again, I come to the conclusion that although the feature may be implemented and documented, it is still very cumbersome to use. I very much like @erciccione's idea of suppression POT files: a POT file whose msgids get automatically marked as "not to translate". I think that it would be much easier for users to manage, as you just have to check your (usual) POT to search for the entries that shouldn't be there, and copy/paste them unchanged into your suppression file to have them automatically removed. We could probably even warn about unused entries in the suppression file to ease the maintenance of this file (probably, because I'm not sure about split settings, which could get in the way).

Internally, that shouldn't be too complex to implement, a bit like the po4a-gettextize internal behavior: after building the pot file from the master documents, just before writing it to disk, you load the suppression file in a new PO object, and then iterate over the entries of that PO object to remove those msgids from the master POT files.
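The steps above can be sketched as follows. This is only an illustration in Python with plain lists of msgid strings, not po4a's Perl internals, and the function name apply_suppression is hypothetical. It removes the suppressed msgids from the master POT entries and also reports suppression entries that matched nothing, which is what the proposed warning would rely on.

```python
def apply_suppression(pot_msgids, suppressed_msgids):
    """Drop suppressed msgids from the POT and list unused suppression entries.

    Returns (kept, unused): kept preserves the original POT order, unused
    holds the suppression entries that were not found in the POT, sorted.
    """
    suppressed = set(suppressed_msgids)
    kept = [msgid for msgid in pot_msgids if msgid not in suppressed]
    unused = sorted(suppressed - set(pot_msgids))
    return kept, unused


if __name__ == "__main__":
    pot = ["Title", "print('hello')", "Body text"]
    suppression = ["print('hello')", "stale entry"]
    kept, unused = apply_suppression(pot, suppression)
    print(kept)    # ['Title', 'Body text']
    for msgid in unused:
        print("warning: unused suppression entry:", msgid)
```

Exact string matching is assumed here; how wrapping or split settings would affect the comparison is left open, as noted in the comment above.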

Unfortunately, I'm not sure I'll have time to implement this before releasing the long overdue v0.70, so I'm writing this to (1) confirm with you guys that this new feature would be the right answer to your needs and (2) remember it the next time I find some time for po4a.

from po4a.

osamuaoki avatar osamuaoki commented on July 18, 2024

Since my target is XML, filtering by XML tag is easy. I basically use po4a in 2 stages: once on the filtered XML to create the template for the PO file, and a second time with the original XML to produce the final result. But for markdown, this strategy doesn't work.

I agree that creating a blocking POT file is a reasonable idea to address this need in a data-source-neutral way.

msguniq-like filtering is all you need to implement.

from po4a.
