leonelquinteros / gotext
Go (Golang) GNU gettext utilities package
License: Other
Improvement
Want to use the Go Modules feature released in Go 1.11
For example, the en_US header
"Plural-Forms: nplurals=2; plural=(n != 1);\n"
and the ru_RU header
"Plural-Forms: nplurals=3; plural=(n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2);\n"
are handled, but the zh_CN header is not supported:
"Plural-Forms: nplurals=INTEGER; plural=EXPRESSION;\n"
It panics with this error:
`panic: runtime error: index out of range [recovered]
panic: runtime error: index out of range
goroutine 6 [running]:
testing.tRunner.func1(0xc4200ce0f0)
/usr/local/Cellar/go/1.9.2/src/testing/testing.go:711 +0x2d2
panic(0x11473a0, 0x123aad0)
/usr/local/Cellar/go/1.9.2/src/runtime/panic.go:491 +0x283
git.xiaojukeji.com/soda-framework/go-polyglot/internal/gotext/plurals.tokenize(0x0, 0x0, 0xc420055990, 0x10, 0x10)
/Users/zava/golang/src/git.xiaojukeji.com/soda-framework/go-polyglot/internal/gotext/plurals/compiler.go:362 +0x529
git.xiaojukeji.com/soda-framework/go-polyglot/internal/gotext/plurals.compileTest(0x0, 0x0, 0x4, 0x0, 0x0, 0x0)
/Users/zava/golang/src/git.xiaojukeji.com/soda-framework/go-polyglot/internal/gotext/plurals/compiler.go:414 +0x4d
git.xiaojukeji.com/soda-framework/go-polyglot/internal/gotext/plurals.ternaryStruct.compile(0xc420010c80, 0x4, 0x4, 0x1173781, 0x1, 0xc420055a01, 0x2)
/Users/zava/golang/src/git.xiaojukeji.com/soda-framework/go-polyglot/internal/gotext/plurals/compiler.go:42 +0x147
git.xiaojukeji.com/soda-framework/go-polyglot/internal/gotext/plurals.compileExpression(0xc420014a50, 0xe, 0xa, 0x1173987, 0x4, 0xc420014a50)
/Users/zava/golang/src/git.xiaojukeji.com/soda-framework/go-polyglot/internal/gotext/plurals/compiler.go:407 +0xaa
git.xiaojukeji.com/soda-framework/go-polyglot/internal/gotext/plurals.Compile(0xc420016ca9, 0xa, 0xc420016ca2, 0x6, 0x2, 0xc42000d8c0)
/Users/zava/golang/src/git.xiaojukeji.com/soda-framework/go-polyglot/internal/gotext/plurals/compiler.go:390 +0x85
git.xiaojukeji.com/soda-framework/go-polyglot/internal/gotext.(*Po).parseHeaders(0xc42009e1b0)
/Users/zava/golang/src/git.xiaojukeji.com/soda-framework/go-polyglot/internal/gotext/po.go:350 +0x509
git.xiaojukeji.com/soda-framework/go-polyglot/internal/gotext.(*Po).Parse(0xc42009e1b0, 0xc4200c4a80, 0x728, 0x928)
/Users/zava/golang/src/git.xiaojukeji.com/soda-framework/go-polyglot/internal/gotext/po.go:178 +0x676
git.xiaojukeji.com/soda-framework/go-polyglot/internal/gotext.(*Po).ParseFile(0xc42009e1b0, 0xc420016ab0, 0x22)
/Users/zava/golang/src/git.xiaojukeji.com/soda-framework/go-polyglot/internal/gotext/po.go:105 +0xb8
git.xiaojukeji.com/soda-framework/go-polyglot/internal/gotext.(*Locale).AddDomain(0xc4200808c0, 0xc4200163d9, 0x8)
/Users/zava/golang/src/git.xiaojukeji.com/soda-framework/go-polyglot/internal/gotext/locale.go:108 +0xb8
git.xiaojukeji.com/soda-framework/go-polyglot.Init.func1()
/Users/zava/golang/src/git.xiaojukeji.com/soda-framework/go-polyglot/gettext.go:65 +0x18b`
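A minimal sketch of how a header like the one above could be rejected with an error before it ever reaches an expression compiler. The helper name parsePluralForms is hypothetical and not part of gotext; it only uses the Go standard library and assumes the header value has already been stripped of its "Plural-Forms:" prefix.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parsePluralForms splits a Plural-Forms header value into nplurals and the
// plural expression. Template placeholders such as INTEGER or EXPRESSION are
// reported as errors instead of being handed to a compiler.
// Hypothetical helper for illustration only.
func parsePluralForms(value string) (int, string, error) {
	nplurals, expr := -1, ""
	for _, part := range strings.Split(value, ";") {
		kv := strings.SplitN(strings.TrimSpace(part), "=", 2)
		if len(kv) != 2 {
			continue // empty trailing segment or malformed pair
		}
		switch strings.TrimSpace(kv[0]) {
		case "nplurals":
			n, err := strconv.Atoi(strings.TrimSpace(kv[1]))
			if err != nil || n < 1 {
				return 0, "", fmt.Errorf("invalid nplurals %q", kv[1])
			}
			nplurals = n
		case "plural":
			expr = strings.TrimSpace(kv[1])
		}
	}
	if nplurals < 0 {
		return 0, "", fmt.Errorf("missing nplurals in %q", value)
	}
	if expr == "" || expr == "EXPRESSION" {
		return 0, "", fmt.Errorf("missing or placeholder plural expression in %q", value)
	}
	return nplurals, expr, nil
}

func main() {
	for _, h := range []string{
		"nplurals=2; plural=(n != 1);",
		"nplurals=INTEGER; plural=EXPRESSION;",
	} {
		n, expr, err := parsePluralForms(h)
		fmt.Println(n, expr, err)
	}
}
```

With a check like this, a caller could fall back to a default rule when the header is still the unedited template, rather than panicking.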
This is a bug.
The Printf function used to format strings doesn't always invoke fmt.Sprintf. As a result, the expansion of %% sequences differs depending on whether there are arguments.
I expect that I can create a locale object, tr, and then do this:
a := tr.Get("My string with an escaped value: %%s")
fmt.Printf(a, "text")
And that it prints "My string with an escaped value: text". I happen to need this because I have a map of integers to format strings, where the format strings contain %% sequences.
Unfortunately, the behavior is inconsistent: strings are only passed through fmt.Sprintf if there are arguments. This makes the behavior hard to reason about.
If there were separate functions to read strings that were formatted and ones that were not, this wouldn't be a problem.
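The difference described in this report can be reproduced with the standard library alone. The two helpers below are hypothetical stand-ins, not gotext functions: formatIfArgs mimics the reported behavior, and formatAlways shows what happens when fmt.Sprintf is always called.

```go
package main

import "fmt"

// formatIfArgs mimics the reported behavior: the string only goes through
// fmt.Sprintf when at least one argument is supplied.
func formatIfArgs(s string, vars ...interface{}) string {
	if len(vars) == 0 {
		return s
	}
	return fmt.Sprintf(s, vars...)
}

// formatAlways always calls fmt.Sprintf, so "%%" collapses to "%"
// whether or not arguments are present.
func formatAlways(s string, vars ...interface{}) string {
	return fmt.Sprintf(s, vars...)
}

func main() {
	fmt.Println(formatIfArgs("100%% done"))     // escape left untouched
	fmt.Println(formatAlways("100%% done"))     // escape expanded
	fmt.Println(formatIfArgs("%d%% done", 50))  // escape expanded because args exist
}
```

One trade-off to keep in mind: always calling fmt.Sprintf means a translated string containing a stray verb such as %s with no argument is rendered as %!s(MISSING), which is presumably why the shortcut exists.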
Improvement
xgotext is too slow. Each packages.Load invocation is expensive, and xgotext loads every package separately, repeating the load when a package is imported from several files. On a large project it had not finished after 20 minutes.
xgotext could load the root package with all its imports recursively in a single packages.Load call with the packages.NeedImports and packages.NeedDeps flags, and then inspect only files from packages that import gotext. With that approach the job should take seconds.
I know there is a lot to implement if one wants to copy the complete xgettext cli but I would at least implement the behaviour with reading in files. I.e. that the last argument is a filelist to be read in. I already made a fork and would implement that there.
This would allow a tool like poedit to automatically extract strings from code.
The code in plurals package includes the following copyright header:
/*
* Copyright (c) 2018 DeineAgentur UG https://www.deineagentur.com. All rights reserved.
* Licensed under the MIT License. See LICENSE file in the project root for full license information.
*/
However, the code appears almost identical to that found in github.com/ojii/gettext.go/pluralforms, with some types, functions, and variables renamed. The code in the other repository was committed in 2016, so clearly predates this repo's copy.
The original code is released under the 3 clause BSD license, so reusing it should be fine provided it is correctly attributed and a copy of the original license text is included along side.
Implement Plural-Forms formula parser locally so we can remove Anko dependency
I installed gotext successfully, but every time I try to install the xgotext CLI by running
go install github.com/leonelquinteros/gotext/cli/xgotext
I get this message
can't load package: package github.com/leonelquinteros/gotext/cli/xgotext: module github.com/leonelquinteros/gotext@latest found (v1.4.0), but does not contain package github.com/leonelquinteros/gotext/cli/xgotext
When I try to run the xgotext command from the terminal, it reports "command not found".
My Go version is
go version go1.14.10 linux/386
The only documentation for this package is Godoc and the README.md file from this repo. We need to write better docs, some tutorial and use case examples to make package adoption easier.
Improvement
We don't have documentation. We want to have documentation
Decide between a Github page or the Wiki for this.
I have a small HTML template containing a short message that is sent via email.
The HTML template contains dynamic variables, as follows:
<p> Following up on your proposal on {{.ApplicationDate}} </p>
Can you guide me on how to generate translation messages for HTML templates?
A Locale should be able to use different backends to load domain Translators.
All methods should work as they do now, but we may need to change the AddDomain() method so we can pass either a pre-loaded Translator object, or a LoadTranslator() function so it knows how to load a Translator for a given domain.
It should be able to work with any other backend Translator, even if they're not implemented on this package.
Right now a Locale object only knows how to work with Po/Mo Translators.
Use the package xD
Now that we've implemented the Translator interface, we should rely on it to add domains to a Locale object, so it can even use different backends for different domains while still using the same Locale for a language.
Importing github.com/leonelquinteros/gotext does not provide the methods to check whether a key is translated, i.e. IsTranslated.
I'd like to request that you tag the latest state as v1.5.3.
Importing the repo as github.com/leonelquinteros/gotext v1.5.3-0.20231003122255-12a99145a351 solved my issue for now.
I'm very new to this package, so sorry if I'm misunderstanding something!
But it seems like the GetND and GetNDC methods in locale.go only return the plural default, not the singular when N=1.
I had a quick look through the issues, and it looks like the same problem in the po.go file was fixed in #8, but the corresponding methods in locale.go weren't fixed at the same time.
Currently, a translation that doesn't exist just defaults to the passed message ID.
It can be helpful to be able to catch these missing cases e.g. to save to a log, so it might be helpful to have variants of the functions that return a boolean "ok" or an error or something
Here's a PR #41 that tries to implement this.
What do you think of this approach?
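As a rough illustration of the "ok" variant idea, here is a sketch built on a plain map. The catalog type and the GetE method name are hypothetical and chosen only for this example; they are not the API proposed in the PR.

```go
package main

import "fmt"

// catalog is a stand-in for a loaded translation domain (msgid -> msgstr).
type catalog map[string]string

// GetE returns the translation and true when one exists; otherwise it
// returns the msgid itself and false, so callers can log the miss.
func (c catalog) GetE(msgid string) (string, bool) {
	if s, ok := c[msgid]; ok && s != "" {
		return s, true
	}
	return msgid, false
}

// Get keeps the existing behavior: a missing translation silently
// falls back to the msgid.
func (c catalog) Get(msgid string) string {
	s, _ := c.GetE(msgid)
	return s
}

func main() {
	c := catalog{"Hello": "Hola"}
	for _, id := range []string{"Hello", "Goodbye"} {
		if s, ok := c.GetE(id); ok {
			fmt.Println("translated:", s)
		} else {
			fmt.Println("missing translation for:", s)
		}
	}
}
```

Keeping Get as a thin wrapper over the boolean variant means existing callers are unaffected while new callers can opt in to detecting misses.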
an improvement
expected to add support for json format
Consider adding JSON format support? For the front-end project used. Reference: https://github.com/php-gettext/Json
This is a bug: checking whether there is a translation for a plural uses the actual amount instead of the result of the do.pluralForm(n) converter.
This causes anything above 1 to basically fail.
The expected behaviour would be to convert the actual number to the pluralization form.
I will create a PR that solves this issue.
Hey everyone! 🎉
I've kinda taken the reins on maintaining this branch of gotext over at https://github.com/donseba/gotext. I’ve been tidying up a few things, squashing bugs, and I'm totally open to any cool ideas you've got for new features. So, don't be shy; hit me up with those PRs!
I've also noticed that the code’s style is a bit all over the place. Thinking it might be time for a little makeover or even a full-on rewrite to keep things smooth and consistent. What do you think?
If you’ve been using gotext, want to contribute, or have some thoughts on the direction we should take, let's chat. Whether it's improving what's already there or brainstorming brand new features, your input is super welcome.
Let’s make gotext awesome together! 🚀
Due to the existing code structure, we sometimes have to call Get() with variables, and I just noticed that it returns the entire header when the ID is an empty string. This seems expected given that the header is always formatted as:
msgid ""
msgstr ""
"POT-Creation-Date: 2023-04-29 14:35-0800\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Language: en\n"
It is still a bit surprising, as it's not the expected outcome, so we have to check str before calling Get, which is a bit strange.
Function-wise it's not technically a bug, but from the user's perspective it is.
I would say we should just return "" in this case; it does not really make sense to have an empty string ID in the message file.
gotext crashes during the call to gotext.Configure().
No crash.
gotext crashes due to a SIGSEGV signal:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x658b22]
goroutine 1 [running]:
github.com/leonelquinteros/gotext.(*Locale).SetDomain(0x0, 0x73c4db, 0x3)
/home/morganamilo/git/yay/.go/src/github.com/leonelquinteros/gotext/locale.go:147 +0x22
github.com/leonelquinteros/gotext.Configure(0x73d0e7, 0x7, 0x73c324, 0x2, 0x73c4db, 0x3)
/home/morganamilo/git/yay/.go/src/github.com/leonelquinteros/gotext/gotext.go:150 +0xe2
main.initGotext()
/home/morganamilo/go/src/github.com/Jguer/yay/main.go:167 +0x60
main.main()
/home/morganamilo/go/src/github.com/Jguer/yay/main.go:175 +0x34
when calling:
gotext.Configure("locale/", "en", "yay")
Call gotext.Configure().
This started happening after cd46239, before then it worked as expected.
First off, thanks for all the work on this project. It's exactly what I need for my use case.
I get no string returned when localizing Arabic. I've tracked it down to the complex Pluralization rules of Arabic. In this case, I'm not trying to use plurals.
Bug
..
I've created a sample project that shows the behavior: https://github.com/chrisvaughn/ar_gotext
The key code being
enLocale := gotext.NewLocale("locale", "en_US")
enLocale.AddDomain("messages")
arLocale := gotext.NewLocale("locale", "ar_SA")
arLocale.AddDomain("messages")
po := new(gotext.Po)
po.ParseFile("locale/ar_SA/LC_MESSAGES/messages.po")
fmt.Printf("English With Just a Get: %s\n", enLocale.Get("This is a sample string"))
fmt.Printf("Arabic With Just a Get: %s\n", arLocale.Get("This is a sample string"))
fmt.Printf("Arabic Plural To 0: %s\n", arLocale.GetN("This is a sample string", "This is a sample string", 0))
fmt.Printf("Arabic Using po.Get: %s\n", po.Get("This is a sample string"))
the output of running this code is
English With Just a Get: This is a sample string
Arabic With Just a Get:
Arabic Plural To 0: هذه سلسلة عينة
Arabic Using po.Get: هذه سلسلة عينة
I would expect arLocale.Get("This is a sample string") and po.Get("This is a sample string") to return the same thing in this case.
...
I don't think it's safe to assume you can always call GetND under the hood with a default plural of 1. I'm happy to submit an MR to attempt to address this.
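To make the concern concrete, here is the six-form Arabic Plural-Forms expression from this report (n==0 ? 0 : n==1 ? 1 : n==2 ? 2 : ...) transcribed into a Go function. The function name is hypothetical. The point of the sketch is that n=1 maps to form index 1, so a non-plural entry that only has a plain msgstr (index 0) would not be found if Get is routed through the plural path with n=1.

```go
package main

import "fmt"

// arabicPluralIndex transcribes the Arabic rule:
// n==0 ? 0 : n==1 ? 1 : n==2 ? 2 : n%100>=3 && n%100<=10 ? 3 :
// n%100>=11 && n%100<=99 ? 4 : 5
func arabicPluralIndex(n int) int {
	switch {
	case n == 0:
		return 0
	case n == 1:
		return 1
	case n == 2:
		return 2
	case n%100 >= 3 && n%100 <= 10:
		return 3
	case n%100 >= 11 && n%100 <= 99:
		return 4
	default:
		return 5
	}
}

func main() {
	for _, n := range []int{0, 1, 2, 3, 11, 100} {
		fmt.Printf("n=%d -> form %d\n", n, arabicPluralIndex(n))
	}
}
```

This is consistent with the output shown above, where the plural call with n=0 finds the string (form 0) while the plain Get does not.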
I see that there is a test for Arabic from a previous issue but in that test the plural rules used are
"Plural-Forms: nplurals=2; plural=(n != 1);\n"
the actual plural rules for Arabic are
"Plural-Forms: nplurals=6; plural=(n==0 ? 0 : n==1 ? 1 : n==2 ? 2 : n%100>=3 && n%100<=10 ? 3 : n%100>=11 && n%100<=99 ? 4 : 5);\n"
Create a parsing method to work with MO file formats.
I'm trying to write a web service with language support; that is, the text to translate is located in HTML templates.
The documentation explains how to fetch string translations in templates by using a locale object. However, I couldn't find documentation on how to get the translatable strings from my templates into a .pot file in the first place. I tried xgotext, but it doesn't seem to search the templates. Are there solutions for this available yet?
All Translator and Locale objects should be serializable somehow, to allow caching.
Translator and Locale objects can't be cached outside "local memory" because they can't be serialized.
A Translator or Locale object should be serializable into []byte.
They can't.
Try to serialize and deserialize a Translator or Locale object.
We either implement a Serialize() method for them, or implement the most common serialization interfaces like json and gob.
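A minimal sketch of what a gob round trip into []byte could look like. The domain struct below is a hypothetical snapshot type with exported fields, invented for this example; it does not reflect gotext's internal layout, which would need its own exported representation for encoding/gob to work.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// domain is a hypothetical snapshot of a translator's state.
// gob only encodes exported fields, hence the capitalized names.
type domain struct {
	Language     string
	Translations map[string]string
}

// encode serializes a domain into a byte slice suitable for an external cache.
func encode(d domain) ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(d); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decode restores a domain from bytes produced by encode.
func decode(b []byte) (domain, error) {
	var d domain
	err := gob.NewDecoder(bytes.NewReader(b)).Decode(&d)
	return d, err
}

func main() {
	b, err := encode(domain{Language: "es", Translations: map[string]string{"Hello": "Hola"}})
	if err != nil {
		panic(err)
	}
	d, err := decode(b)
	fmt.Println(d.Language, d.Translations["Hello"], err)
}
```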
proposal
We have pseudo-localization support in the front end, and it would be great to have it for Go directly from the library. It can be as simple as character randomization (for English or other alphabetic languages, it can just map normal characters to accented characters).
Return singular or plural (by checking N) instead of plural form by default.
GetN and GetNC functions both have plural and singular strings available and the "n" to determine which one to return.
It doesn't seem natural to expect a default plural string when it wasn't possible to translate the string.
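The requested fallback is the same rule GNU ngettext applies when no translation is found: return the singular for n == 1 and the plural otherwise. A short sketch, with a hypothetical function name:

```go
package main

import "fmt"

// fallbackN picks the untranslated string to return when no translation
// exists: the singular msgid for n == 1, the plural msgid otherwise.
func fallbackN(singular, plural string, n int) string {
	if n == 1 {
		return singular
	}
	return plural
}

func main() {
	for _, n := range []int{0, 1, 2} {
		fmt.Println(n, fallbackN("One user online", "Several users online", n))
	}
}
```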
Hi!
I find your lib very attractive since it implements gettext in native Go and supports .po files, which is not the case for the other libs.
The WIP extractor tool in the CLI seems very useful, although it's not included in v1.4.0.
I wonder if the project is still maintained. The last stable version was released in 2018, whereas some cool features (such as the CLI) are in the pipeline.
Furthermore, it would be great to accept a File instead of a raw path to point to a .po file in Configure.
That would make it possible to embed a po file with the new embed API (since Go 1.16).
I'm open to contributing and sending a PR if necessary.
It seems after this commit, the PO header requires a different format than https://www.gnu.org/software/gettext/manual/html_node/PO-Files.html.
An empty untranslated-string is reserved to contain the header entry with the meta information (see Header Entry). This header entry should be the first entry of the file. The empty untranslated-string is reserved for this purpose and must not be used anywhere else.
A compliant header looks like:
msgid ""
msgstr "Project-Id-Version: %s\n"
"Report-Msgid-Bugs-To: %s\n"
When using the global package functions GetN() and GetNC() after setting a domain other than "default", these functions still load translations from the "default" domain.
The mentioned functions should load translations from the domain set by SetDomain().
They still load translations from the "default" domain.
Call SetDomain to use a different domain.
Call GetN or GetNC and check the returned values.
In v1.4.0 one could use a Po instance like this:
po := gotext.Po{}
po.Parse(b)
After upgrading to v1.5.0 this panics:
panic: NewPo() was not used to instantiate this object
and one has to use:
po := gotext.NewPo()
po.Parse(b)
instead.
This was introduced in 01a5eea
Following Semver this is a breaking change and would need to be released as v2.0.0.
I am aware Go modules make releasing a v2 very painful so I don't know if it's worth it. Could the previous behavior potentially be kept intact after the change?
The "migration" needed here is relatively simple and comes with a good error message, so I don't face any real problems here. The issue is mostly for making it visible for others who might run into the same problem.
We're trying to use this package to parse translations from PO files or export them into PO. When we use parsing, it's clear and works well. Now I'm trying to create a new PO and have managed to do it with a string format, but there are some troubles with plurals. I couldn't find any example of how to work with them. Instead of getting one key and several translation variants for it, I have several separate keys and translations. Also, most of the properties for working with objects are private and I can't use them.
So I'm doing something like this:
po := gotext.NewPo()
domain := po.GetDomain()
for i := range translations {
domain.SetN(key.Name, "some value", i, translations[i].Value)
}
As a result, I see these strings in the file:
msgid "Just one user online"
msgid_plural "Just one user online"
msgstr[0] "Il y a %d utilisateurs en ligne"
msgid "Just one user online"
msgid_plural "Just one user online"
msgstr[1] "Un seul utilisateur en ligne"
But it is not what I wanted, certainly. I could just concatenate some strings by myself, but we wanted to use this library and not work with bare hands.
Perhaps it's not a bug, just a lack of documentation. It would be great if the package were supplemented with similar examples.
When exporting from a po and extracting text from files, there are multiple issues with multi-line strings and backquote usage. This is fine until we use standard gettext tooling to compile the files to mo files.
For example, a backquoted string is exported as “`this is my string`”, which then can’t be compiled to mo files.
The loading is correct, though; this only impacts file generation. The generated files are therefore invalid and can’t be compiled to mo files.
The gettext specification can be found at https://www.gnu.org/software/gettext/manual/html_node/Normalizing.html.
From:
fmt.Println(gotext.Get("Hello World, end line\n"))
fmt.Println(gotext.Get(`A single line text`))
fmt.Println(gotext.Get("A single line text"))
fmt.Println(gotext.Get("\nService already exists and will be reconfigured\n"))
fmt.Println(gotext.Get(`
Service already exists and will be reconfigured 2
`))
fmt.Println(gotext.Get("\nService already exists and will be reconfigured 2\n"))
fmt.Println(gotext.Get("\nService already exists and will be reconfigured without EOL"))
Example of currently exported string:
#: main.go:22
msgid "\nService already exists and will be reconfigured\n"
msgstr ""
#: main.go:23
msgid `
Service already exists and will be reconfigured 2
`
msgstr ""
#: main.go:21
msgid `A single line text`
msgstr ""
#: main.go:22
msgid "Hello World, end line\n"
msgstr ""
#: main.go:30
msgid "\nService already exists and will be reconfigured without EOL"
msgstr ""
Expectation from the spec:
#: main.go:22
msgid ""
"\n"
"Service already exists and will be reconfigured\n"
msgstr ""
#: main.go:26
#: main.go:29
msgid ""
"\n"
"Service already exists and will be reconfigured 2\n"
msgstr ""
#: main.go:23
#: main.go:24
msgid "A single line text"
msgstr ""
#: main.go:22
msgid "Hello World, end line\n"
msgstr ""
#: main.go:30
msgid ""
"\n"
"Service already exists and will be reconfigured without EOL"
msgstr ""
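The expected layout above can be approximated with a small formatter. This is a sketch under the assumption that only backslashes, double quotes, newlines and tabs need escaping; the names poEscaper and formatEntry are hypothetical and not part of gotext or xgotext.

```go
package main

import (
	"fmt"
	"strings"
)

// poEscaper escapes the characters that would otherwise break a
// double-quoted PO string. It works in a single pass, so backslashes
// introduced by the escaping are not escaped again.
var poEscaper = strings.NewReplacer(`\`, `\\`, `"`, `\"`, "\n", `\n`, "\t", `\t`)

// formatEntry renders keyword (msgid, msgstr, ...) and s as PO text.
// Strings with at most one line stay on the keyword line; anything that
// continues after a newline is written as an empty first string followed
// by one quoted string per line.
func formatEntry(keyword, s string) string {
	parts := strings.SplitAfter(s, "\n")
	if parts[len(parts)-1] == "" {
		parts = parts[:len(parts)-1] // drop the empty tail after a final newline
	}
	if len(parts) <= 1 {
		return keyword + ` "` + poEscaper.Replace(s) + `"`
	}
	var b strings.Builder
	b.WriteString(keyword + ` ""`)
	for _, p := range parts {
		b.WriteString("\n\"" + poEscaper.Replace(p) + "\"")
	}
	return b.String()
}

func main() {
	fmt.Println(formatEntry("msgid", "A single line text"))
	fmt.Println(formatEntry("msgid", "Hello World, end line\n"))
	fmt.Println(formatEntry("msgid", "\nService already exists and will be reconfigured\n"))
}
```

Because a Go raw (backquoted) string and an interpreted string with the same content are the same value at run time, formatting from the string value rather than the source literal would also merge the backquoted and double-quoted variants into one entry.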
I'm using the library to apply some changes to some existing PO files (from some open source projects), but due to how the library keeps the PO file internally, it's pretty hard to be able to use it:
When "writing back" the po file to text, it reorders all translations, which makes it impractical to later submit an MR to the upstream project.
This simple code (with input and output files attached) demonstrates the problem.
func main() {
inPo := filepath.Join(projectpath.Root, "./in.po")
outPo := filepath.Join(projectpath.Root, "./out.po")
po := gotext.NewPo()
po.ParseFile(inPo)
data, _ := po.MarshalText()
os.WriteFile(outPo, data, 0644)
}
Is there any way to "reconstruct" the pofile to how it was before parsing it?
gotext.GetD does not reload the po file. I'm using code copied and modified from the README.
package main
import (
"fmt"
"github.com/leonelquinteros/gotext"
)
func main() {
// Configure package
gotext.Configure("../locale", "en_UK", "default")
// Translate text from default domain
fmt.Println(gotext.Get("My text on 'domain-name' domain"))
// Translate text from a different domain without reconfigure
fmt.Println(gotext.GetD("extras", "Another text on a different domain"))
}
gotext.Get worked as expected and printed the correct result to stdout. However, gotext.GetD didn't give the expected result, as shown below:
Translated text
Another text on a different domain # expected: "another domain"
And here is how the po files are organized:
$ tree .
.
└── en_UK
└── LC_MESSAGES
├── default.po
└── extras.po
2 directories, 2 files
$ cat en_UK/LC_MESSAGES/default.po
msgid "My text on 'domain-name' domain"
msgstr "Translated text"
msgid "Another text on a different domain"
msgstr "default domain"
$ cat en_UK/LC_MESSAGES/extras.po
msgid "Another text on a different domain"
msgstr "another domain"
Hello!
We want to install the tool from the vendor directory, but we cannot install cli/xgotext, since it is not included in the module source code:
The PO file format specifies a way to define a context for a given msgid so several equal msgid can be identified for different contexts.
Reference: https://www.gnu.org/software/gettext/manual/html_node/PO-Files.html
Unfortunately, it is not ideal for a multilingual web application if the files are reloaded for each request. It should be possible to use a cache for each locale, with an option (devel) and a function to allow a rebuild.
Therefore I consider it necessary to revise the modules.
I noticed that, with 50k simultaneous requests, some responses are delivered in the wrong language. It should instead be possible to determine the locale for each request and access it within that request, instead of using a global "memory".
Regards, Josef
It's a bug that xgotext emits backtick-delimited strings into .pot files when performing string extraction, because these can't be handled by most gettext-based tools, including gotext itself.
Test case:
package main
import (
"fmt"
"github.com/leonelquinteros/gotext"
)
var Tr = gotext.NewLocale("/usr/share/locale", "en")
func main() {
fmt.Print(Tr.Get(`This is a multiline string.
It should be formatted properly in a .pot file.`))
}
The expected behavior is that xgotext emits this as a double-quoted string with \n in it. The current behavior is that it emits it as a backtick-quoted string.
Fix golint warnings.
$ golint ./...
mo.go:23:2: exported const MoMagicLittleEndian should have comment (or a comment on this block) or be unexported
mo.go:198:3: var msgIdData should be msgIDData
po.go:369:10: if block ends with a return statement, so drop this else and outdent its block
translation.go:36:1: comment on exported method Translation.GetN should be of the form "GetN ..."
translator.go:8:6: exported type Translator should have comment or be unexported
plurals/compiler.go:6:1: package comment should not have leading space
plurals/tests.go:32:9: if block ends with a return statement, so drop this else and outdent its block
No warnings returned from golint
Warnings returned from golint
golint ./...
A proposal to expose lang from Locale.
We load the Locale for every supported language and store them in a global context at server start, then do a map lookup at run time. Currently we have to store the lang together with the Locale so we know what language it is, which is a bit inconvenient as the information is already available to the Locale.
So a GetLanguage method on Locale would be ideal.
I will be happy to make the PR if this is accepted.
Support more FileSystems like VFS, go-bindata, fileb0x etc.
Creation of an abstraction of file systems to be used. So that it is possible to use not only the system file system but also other virtual file systems.
Only the system file system is supported.
It should be possible to have an embedded filesystem, so that it is possible to have a binary only application. Without having to pass the locale to an external directory for each instance.
Plural form functions aren't GNU gettext compliant.
The Plural Forms header needs to be parsed and stored and the meaning for the N parameter on the "N" functions needs to be changed.
From the docs:
Therefore the solution implemented is to allow the translator to specify the rules of how to select the plural form. Since the formula varies with every language this is the only viable solution except for hardcoding the information in the code (which still would require the possibility of extensions to not prevent the use of new languages).
The information about the plural form selection has to be stored in the header entry of the PO file (the one with the empty msgid string). The plural form information looks like this:
Plural-Forms: nplurals=2; plural=n == 1 ? 0 : 1;
The nplurals value must be a decimal number which specifies how many different plural forms exist for this language. The string following plural is an expression which is using the C language syntax. Exceptions are that no negative numbers are allowed, numbers must be decimal, and the only variable allowed is n. Spaces are allowed in the expression, but backslash-newlines are not; in the examples below the backslash-newlines are present for formatting purposes only. This expression will be evaluated whenever one of the functions ngettext, dngettext, or dcngettext is called. The numeric value passed to these functions is then substituted for all uses of the variable n in the expression. The resulting value then must be greater or equal to zero and smaller than the value given as the value of nplurals.
The package cannot work with Arabic translations:
My ar/categories.po translation:
...
msgid "Alcohol & Tobacco"
msgstr "الكحول والتبغ"
...
Go example:
locale := gotext.NewLocale("i18n/", "ar")
locale.AddDomain("categories")
log.Println(locale.GetD("categories", "Alcohol & Tobacco", nil))
Output:
2018/08/14 17:33:18 %!(EXTRA <nil>)
Sometimes I have to deliver pictures or PDFs related to localization, so it would be a very good idea to save this data as "LC_RESSOURCES" and deliver it with the module.
The variable created by NewLocale(...), when querying a string through Get(str, vars...), for some reason refers to the domain from the globalConfig, not to the one configured on the variable. As a consequence, all logic breaks.
I add a domain via myLocale.AddDomain("hello"), and myLocale.Get(str, vars...) must use content from the "hello" domain in my local variable.
myLocale.Get(str, vars...) asks globalConfig for content from my "hello" domain, finds nothing, and uses "default".
See next parts:
Lines 90 to 104 in 92b69ff
(See other Get... functions which use l.domains)
Lines 108 to 110 in 92b69ff
Lines 76 to 82 in 92b69ff
If msgstr is empty (msgstr ""), shouldn't the original, untranslated string be returned?
After some tests, these look to me like the right checks to handle the empty case:
func (t *translation) get() string {
// Look for translation index 0
if _, ok := t.trs[0]; ok {
// handle empty msgstr
if len(t.trs[0]) == 0 {
return t.id
}
return t.trs[0]
}
// Return untranslated id by default
return t.id
}
func (t *translation) getN(n int) string {
// Look for translation index
if _, ok := t.trs[n]; ok {
// handle empty msgstr
if len(t.trs[n]) == 0 {
if len(t.pluralID) == 0 {
return t.id
}
return t.pluralID
}
return t.trs[n]
}
// Return untranslated plural by default
return t.pluralID
}
Hey!
I haven't used gettext in a while, and I was looking into a golang variant. Kind of too bad you can't do: _("the message") in golang, like you can everywhere else :/
In any case, IIRC (it's been a while) gettext tooling gives you a way to extract all the TO-BE-TRANSLATED strings into the .po files. That way you know what's left to do, and you can update these as you add new strings...
Do you have a script or command that does this? Maybe someone has done some quick golang lexer/parser/token hacking? Not sure how to proceed without it. Thanks!
There's a fair amount of duplicate code in po.go and mo.go. Running golangci-lint with dupl enabled yields:
mo.go:279: 279-330 lines are duplicate of po.go:307-358
(dupl)
po.go:307: 307-358 lines are duplicate of mo.go:279-330
(dupl)
mo.go:449: 449-472 lines are duplicate of po.go:478-501
(dupl)
po.go:478: 478-501 lines are duplicate of mo.go:449-472
(dupl)
I would propose some refactoring to move this common code into a new file, domain.go. I have done this successfully in a branch and it can be done with minimal changes to the public methods, though there may be enough change to warrant this being v2.
This will also allow cleaning up some duplicate tests.
gettext documentation 17.2.2 states that:
In the LANGUAGE environment variable, but not in the LANG environment variable, ‘ll_CC’ combinations can be abbreviated as ‘ll’ to denote the language’s main dialect. For example, ‘de’ is equivalent to ‘de_DE’ (German as spoken in Germany), and ‘pt’ to ‘pt_PT’ (Portuguese as spoken in Portugal) in this context.
Take a string
fmt.Println(gotext.Get("en_US"))
and translate it in the .po file:
# pt_PT.po
msgid "en_US"
msgstr "pt_PT"
# pt_BR.po
msgid "en_US"
msgstr "pt_BR"
According to the excerpt above, pt should be expanded to pt_PT when plain pt is not available. However, it is not:
$ LANGUAGE=en_US foo
en_US
$ LANGUAGE=pt_PT foo
pt_PT
$ LANGUAGE=pt_BR foo
pt_BR
$ LANGUAGE=pt foo
en_US
$ # ^ here
This is a bug/deviation from the gettext behavior. When only a language is specified, but the translation file is named language_country, the msgid is returned instead of the main dialect.
The main dialect version should be returned instead of the fallback string.
The library does not deal with environment variables. However, I believe this should be the default behavior; the developers using this library can handle differences between LANG and LANGUAGE themselves; supporting expansion for every language string should not break anyone, and can be handled easily in the application itself.
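The expansion described here could be sketched as a lookup that prefers an exact match and then falls back to the language's main dialect. The mainDialect table and resolveLang function are hypothetical; the two table entries are the ones named in the gettext excerpt quoted above, and a real table would need an entry per language.

```go
package main

import "fmt"

// mainDialect maps a bare language code to its main dialect.
// Hypothetical and deliberately tiny: only the two examples from the
// gettext documentation excerpt are listed.
var mainDialect = map[string]string{"de": "de_DE", "pt": "pt_PT"}

// resolveLang returns the catalog to load for lang: the exact match if it
// exists, otherwise the main dialect if that exists.
func resolveLang(lang string, available map[string]bool) (string, bool) {
	if available[lang] {
		return lang, true
	}
	if full, ok := mainDialect[lang]; ok && available[full] {
		return full, true
	}
	return "", false
}

func main() {
	available := map[string]bool{"pt_PT": true, "pt_BR": true}
	for _, lang := range []string{"pt", "pt_BR", "fr"} {
		name, ok := resolveLang(lang, available)
		fmt.Printf("%s -> %q (found=%v)\n", lang, name, ok)
	}
}
```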
Quotation marks in translations or source strings are not escaped, and can break the file format. Newlines are not converted correctly either.
It's a bug.
This string:
po := gotext.NewPo()
strings := map[string]string{"myid": "test string\nwith \"newline\""}
for k, v := range strings {
po.Set(k, v)
}
po_bytes, _ := po.MarshalText()
fmt.Println(string(po_bytes))
is converted to this:
msgid ""
msgstr ""
msgid "myid"
msgstr "test string
with "newline""
This breaks the format and causes xgettext to throw errors if the file is used. It should be:
[...]
msgid "myid"
msgstr "test string\n"
"with \"newline\""
Wrapping the items in this block with this function solves the problem:
func EscapeSpecialCharacters(s string) string {
s = regexp.MustCompile(`([^\\])(")`).ReplaceAllString(s, "$1\\\"") // Escape non-escaped double quotation marks
s = strings.ReplaceAll(s, "\n", "\\n\"\n\"") // Emit an escaped \n, then continue the string on a new quoted line
return s
}
GNU gettext contains support for priority lists. The LANGUAGE environment variable can contain multiple languages (e.g. sv:de or sk:cs:pl) to provide an alternative translation if the primary is not available, instead of falling back to the msgid.
The current implementation of gotext does not support this; when multiple languages are specified, only the first one is loaded, the rest of the locale string is discarded.
Update the config struct and methods in gotext.go (loadStorage, Configure, Get*) to handle multiple Locale objects instead of just one. Get* methods, instead of reaching into the only domain, would iterate over the loaded Locales and return the string in the first language in which it is translated, keeping the msgid as the last option.
LANGUAGE variable, not LC_* or LANG, this should be the new only implementation; the change is internal and would not alter the behavior for those who are using just one language.
IsTranslated* from PR #69 could be used to load the first available one. An alternative to this could be to pick a translation on load, and serve this string when requested, effectively a tradeoff between startup time and memory efficiency during runtime. I lean towards the former, as it should be easier to implement and maintain with the current structure of the code.
I can create a patch containing these changes.
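The lookup-time variant of this proposal can be sketched with plain maps standing in for loaded Locale objects. Everything here is hypothetical illustration (the lookup function and the nested-map catalogs are not gotext types): the colon-separated LANGUAGE value is walked in order and the first available translation wins, with the msgid as the last resort.

```go
package main

import (
	"fmt"
	"strings"
)

// lookup walks a LANGUAGE-style priority list (e.g. "sv:de") and returns
// the first non-empty translation found, or the msgid if none exists.
// catalogs maps a language code to its msgid -> msgstr table.
func lookup(language string, catalogs map[string]map[string]string, msgid string) string {
	for _, lang := range strings.Split(language, ":") {
		// Indexing a missing language yields a nil map, which is safe to read.
		if s, ok := catalogs[lang][msgid]; ok && s != "" {
			return s
		}
	}
	return msgid
}

func main() {
	catalogs := map[string]map[string]string{
		"sv": {"Goodbye": "Hej då"},
		"de": {"Hello": "Hallo", "Goodbye": "Auf Wiedersehen"},
	}
	fmt.Println(lookup("sv:de", catalogs, "Hello"))   // only de has it
	fmt.Println(lookup("sv:de", catalogs, "Goodbye")) // sv comes first
	fmt.Println(lookup("sv:de", catalogs, "Thanks"))  // nobody has it
}
```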