
python-docs-tx-translations's Introduction

python-docs-tx-translations


Scripts and procedures for maintaining the Python documentation translation infrastructure under the python-doc organization in Transifex.

Source strings are updated by a continuous integration workflow under .github/workflows. Details:

  • It runs weekly.
  • It runs for releases in beta, release candidate, stable, bugfixes and security-fixes status; alpha and EOL releases are excluded.
  • It DOES NOT store translations to be used by the published documentation.
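
The status rule above can be illustrated with a minimal sketch. It is not the actual workflow code: the function name and the sample release data are hypothetical, and only the status names come from the list above.

```python
# Statuses the workflow is described as tracking (alpha and EOL are left out).
TRACKED_STATUSES = {"beta", "release candidate", "stable", "bugfixes", "security-fixes"}


def branches_to_update(releases):
    """Return the branches whose status is tracked, preserving input order."""
    return [branch for branch, status in releases.items() if status in TRACKED_STATUSES]


# Hypothetical release data, for illustration only.
sample = {"3.13": "alpha", "3.12": "beta", "3.11": "bugfixes", "3.7": "EOL"}
print(branches_to_update(sample))  # ['3.12', '3.11']
```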

See the docs directory for more information on maintaining this project.

See Translating in the Python Developer's Guide for more information.


python-docs-tx-translations's Issues

Moving the repository to Python organization on GitHub

This repository keeps the source strings on Transifex up to date with the CPython branches. Soon it should also handle the propagation of translations between versions.

It would be good to have this repository in the Python organization, for discoverability and stability of the process. @JulienPalard could we initiate the process of transferring the repo ownership?

Lock fails because obsolete POT files are still tracked by git

lock-translations.py compares the resources present in Transifex with those listed in the local cpython/Doc/locales/.tx/config file. This .tx/config is created by sphinx-intl from the POT files present in cpython/Doc/locales/pot/. Because we track all the POT files with Git, and sphinx-intl always lists every one of them in .tx/config, lock-translations.py is led to consider everything OK when it is not.

E.g. install/index.pot was removed from CPython, but it is still present in our repo and in .tx/config, and therefore in the python-newest project.
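
One possible way to spot such leftovers is to compare the POT files tracked in Git with the ones a fresh build produces. The sketch below assumes both lists are already available as relative paths; the function name and the sample paths other than install/index.pot are illustrative.

```python
def obsolete_pot_files(tracked, generated):
    """Return tracked POT paths that a fresh build no longer produces."""
    return sorted(set(tracked) - set(generated))


# Illustrative data: install/index.pot is tracked but no longer generated.
tracked = ["install/index.pot", "library/codecs.pot", "library/re.pot"]
generated = ["library/codecs.pot", "library/re.pot"]
print(obsolete_pot_files(tracked, generated))  # ['install/index.pot']
```

Removing the reported files from Git before sphinx-intl regenerates .tx/config would keep the obsolete resources out of the comparison.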

Implementing the propagation automation between python-doc org's Transifex projects

We learned that Transifex stopped propagating translations between the projects of an organization. To enable it, the python-doc organization would have to sign up for a Premium plan ($$$$).

So right now, translations contributed to the main Python project (slug "python-newest") will not be automatically reused by the versioned projects in this same organization (slugs python-311, python-310, etc.).

How to solve this problem?

I'm discarding the option of duplicating or triplicating the effort by translating more than one project, because the Python documentation has about 24K strings (yes, twenty-four thousand). Let's talk about scripting, then.

A script has to be made to pull translations for the current translation version and merge them into older versions. It should be used by/in every team.

pomerge looks like an awesome tool for the job, as it allows merging translated strings into different branches.

git switch 3.12    # current translation branch
pomerge --from-files *.po **/*.po
git switch 3.11    # older translation branch
pomerge --to-files *.po **/*.po

pomerge also has a --no-overwrite flag to consider. When used, merging translations will not overwrite already translated strings.
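
The same sequence could be wrapped in a small Python helper. This is only a sketch: the function names are made up, it uses just the pomerge flags mentioned above, and the `run` and `find` parameters exist so the command sequence can be inspected without touching a real repository.

```python
import glob
import subprocess


def list_po_files():
    """Collect .po files below the current directory (re-evaluated after each switch)."""
    return sorted(glob.glob("**/*.po", recursive=True))


def propagate(source, target, no_overwrite=False, run=subprocess.run, find=list_po_files):
    """Learn translations on `source`, then apply them on `target`."""
    run(["git", "switch", source], check=True)
    run(["pomerge", "--from-files", *find()], check=True)
    run(["git", "switch", target], check=True)
    flags = ["--no-overwrite"] if no_overwrite else []
    run(["pomerge", *flags, "--to-files", *find()], check=True)


# Example call (would run git and pomerge for real):
# propagate("3.12", "3.11", no_overwrite=True)
```

Listing the files again after each `git switch` matters because the set of .po files can differ between branches.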

Who should run the propagation automation?

Handing this task of propagation automation to the python-doc organization administrator (me) would centralize the effort, making it easier for the translation teams, but pushing the merged translations back to Transifex would add my username as translator in the teams' translation files.

Leaving it to the teams to automate the propagation could result in different implementations and different bugs, which would multiply the effort of propagating translations. On the bright side, they could implement it in their own way, and would not have my username as translator.

For the second option, we could come up with a script that would make everyone happy and present it in an announcement of some kind.

Ideas are welcome.

Scriptize version bump

Bumping python-doc versions in Transifex requires a lot of work, but there are plenty of resources for automating it, such as the public API and the Python SDK. It would be nice to have scripts automating the steps (in CI whenever possible) so that only double-checking the result is required.

Macro idea

  • Merge python-docs-tx-translations with this repo, putting translations in another branch named 'translations' or 'locales'.
  • A Python script with argparse arguments that call functions, plus a CI workflow that manually runs each script argument (finishing and waiting for the results to be double-checked before moving on to the next step).
  • tx pull reviewed and only-translated files into different folders, so I can first push the reviewed ones (flagging them as reviewed!) and then push the translated ones, so that the versioned project honors the reviewed state?
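
The second bullet could look roughly like the sketch below: one argparse subcommand per step, each dispatching to a function, so a CI job can run a single step and stop for review. The step names and the placeholder functions are hypothetical and do not call Transifex.

```python
import argparse


def lock_resources(args):
    # Placeholder: a real implementation would call the Transifex API here.
    return f"would lock resources of {args.project}"


def create_project(args):
    # Placeholder: a real implementation would create the versioned project.
    return f"would create project {args.project}"


def build_parser():
    parser = argparse.ArgumentParser(description="Version bump helper (sketch)")
    steps = parser.add_subparsers(dest="step", required=True)
    for name, func in (("lock", lock_resources), ("create-project", create_project)):
        step = steps.add_parser(name)
        step.add_argument("project")
        step.set_defaults(func=func)
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    return args.func(args)


print(main(["lock", "python-newest"]))  # would lock resources of python-newest
```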

Step 1: Lock all resources and update translation files

  1. Lock python-newest resources
  2. Do tx pull python-newest (duration: more than 2 hours)

Step 2: Versioned project creation

  1. Create python-XXX project (e.g. python-311)
  2. Create resources for python-XXX before pushing translations (good for setting the proper name)
  3. Patch '.tx/config' from "p:python-newest" to "p:python-xxx"
  4. Do tx push sources to python-XXX (required!)
  5. Do tx push translations to python-XXX (duration: less than 30 minutes)
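
Item 3 of this step could be scripted with a plain text substitution. The sketch below assumes resource sections in .tx/config use identifiers like the ones shown in the logs on this page (o:python-doc:p:python-newest:r:...); the function name is made up.

```python
def retarget_config(config_text, old_slug, new_slug):
    """Point every resource section at another Transifex project slug."""
    # The trailing colon avoids touching slugs that merely start with old_slug.
    return config_text.replace(f"p:{old_slug}:", f"p:{new_slug}:")


sample = "[o:python-doc:p:python-newest:r:library--re]\n"
print(retarget_config(sample, "python-newest", "python-311"))
```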

Step 3: Version bump of python-newest

  1. Update ci.yml with new version
  2. Manually run ci.yml to:
    1. make pot for new python-newest
    2. create python-newest resources if not existent (good for setting the proper name)
    3. tx push sources (from pot)
    4. check unused resources: lock & delete procedure (frc-docs-translations style)
  3. Unlock python-newest resources, if locked (save API call when unlocked already)

3.11, failing to pull all strings

The pulling step fails for 3.11 (and, judging by the time taken, it is incomplete), possibly related to #15.

python-311.library--re [ar] - 404, not_found: Resource with id 'o:python-doc:p:python-311:r:library--re' does not have any content
python-311.library--re [ar_EG] - 404, not_found: Resource with id 'o:python-doc:p:python-311:r:library--re' does not have any content
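
To see which resources are affected, the pull log could be summarized with a small parser. This sketch assumes every failure line has the same shape as the two lines above; the function name is illustrative.

```python
import re

# Matches lines such as: python-311.library--re [ar] - 404, not_found: ...
NOT_FOUND = re.compile(r"^(?P<resource>\S+) \[(?P<lang>[^\]]+)\] - 404, not_found")


def empty_resources(log_lines):
    """Map each resource reported as 404 to the languages that failed."""
    found = {}
    for line in log_lines:
        match = NOT_FOUND.match(line)
        if match:
            found.setdefault(match["resource"], []).append(match["lang"])
    return found


log = [
    "python-311.library--re [ar] - 404, not_found: Resource with id 'o:python-doc:p:python-311:r:library--re' does not have any content",
    "python-311.library--re [ar_EG] - 404, not_found: Resource with id 'o:python-doc:p:python-311:r:library--re' does not have any content",
]
print(empty_resources(log))  # {'python-311.library--re': ['ar', 'ar_EG']}
```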

Idea: Propagate only changed translation files

Currently, we propagate the latest version's translations for all translation files, from all languages, into all other versioned branches.

A more optimized solution would be to identify the changed files and run pomerge only for the affected languages, for all versioned branches. E.g. if only Chinese files were translated this week, propagate only the Chinese translations instead of all languages.

Looks like scripts/manage_versions.py would need to be adapted for this.
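
A minimal sketch of the detection step, assuming the list of changed paths comes from something like `git diff --name-only` and that the first path component is the language code. That layout is an assumption made for illustration, not something confirmed for this repository.

```python
from pathlib import PurePosixPath


def changed_languages(paths):
    """Return the language codes that own at least one changed .po file."""
    return sorted({PurePosixPath(path).parts[0] for path in paths if path.endswith(".po")})


# Hypothetical changed paths for one weekly run.
changed = [
    "zh_CN/LC_MESSAGES/library/re.po",
    "zh_CN/LC_MESSAGES/glossary.po",
    "README.md",
]
print(changed_languages(changed))  # ['zh_CN']
```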

Illegal character breaks pushing source strings to resources

\N is parsed as line break by Transifex and should not be a source string. This happens in 3.12 and 3.11.

Pushing only keeps going because of --skip (ignore errors and continue pushing), but the resources below are basically broken.

2023-06-02T14:16:19.7928091Z python-newest.library--codecs - upload of resource 'o:python-doc:p:python-newest:r:library--codecs' failed - parse_error: null value in column "string" of relation "resources_translation" violates not-null constraint
2023-06-02T14:16:19.7929439Z DETAIL:  Failing row contains (1323078349, null, 44d0dc437936b13f7cea2f77053806bd, 5, 2023-06-02 14:16:18.774916+00, 2023-06-02 14:16:18.774916+00, 460434883, 20, 7302, 1, 2237910, API, f, null, eff3207b-359b-4931-a08e-b12329f89665, 181125, null, null, null, 44d0dc437936b13f7cea2f77053806bd, null, null, null).
2023-06-02T14:16:19.7930405Z CONTEXT:  COPY resources_translation, line 21: "2023-06-02 14:16:18.774916	\N	eff3207b-359b-4931-a08e-b12329f89665	20	2023-06-02 14:16:18.774916	\N	..."
2023-06-02T14:17:08.9828644Z python-newest.library--re - upload of resource 'o:python-doc:p:python-newest:r:library--re' failed - parse_error: null value in column "string" of relation "resources_translation" violates not-null constraint
2023-06-02T14:17:08.9830000Z DETAIL:  Failing row contains (1323078439, null, 44d0dc437936b13f7cea2f77053806bd, 5, 2023-06-02 14:17:08.051214+00, 2023-06-02 14:17:08.051214+00, 460434951, 20, 7302, 1, 2238076, API, f, null, ba1172d8-006c-43d5-afea-f5ee8384854c, 181125, null, null, null, 44d0dc437936b13f7cea2f77053806bd, null, null, null).
2023-06-02T14:17:08.9830940Z CONTEXT:  COPY resources_translation, line 48: "2023-06-02 14:17:08.051214	\N	ba1172d8-006c-43d5-afea-f5ee8384854c	20	2023-06-02 14:17:08.051214	\N	..."
2023-06-02T14:17:57.4107373Z python-newest.reference--lexical_analysis - upload of resource 'o:python-doc:p:python-newest:r:reference--lexical_analysis' failed - parse_error: null value in column "string" of relation "resources_translation" violates not-null constraint
2023-06-02T14:17:57.4108983Z DETAIL:  Failing row contains (1323081513, null, 44d0dc437936b13f7cea2f77053806bd, 5, 2023-06-02 14:17:56.720266+00, 2023-06-02 14:17:56.720266+00, 460435448, 20, 7302, 1, 2238198, API, f, null, 640fba53-eb10-430f-810e-ed849100f5c4, 181125, null, null, null, 44d0dc437936b13f7cea2f77053806bd, null, null, null).
2023-06-02T14:17:57.4110165Z CONTEXT:  COPY resources_translation, line 86: "2023-06-02 14:17:56.720266	\N	640fba53-eb10-430f-810e-ed849100f5c4	20	2023-06-02 14:17:56.720266	\N	..."
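
A pre-push check could flag such entries before they reach Transifex. The sketch below scans POT text for a single-line msgid whose whole content is \N (written as \\N inside a PO file, where backslashes are escaped). It is a simplification: multi-line msgids are not handled, and the function name is made up.

```python
def suspicious_msgids(pot_text):
    """Return 1-based line numbers of msgid entries that consist only of \\N."""
    hits = []
    for number, line in enumerate(pot_text.splitlines(), start=1):
        if line.strip() == r'msgid "\\N"':
            hits.append(number)
    return hits


# Illustrative POT fragment; the second entry is the problematic one.
sample = r'''msgid "Character name"
msgstr ""

msgid "\\N"
msgstr ""
'''
print(suspicious_msgids(sample))  # [4]
```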

Idea: A GitHub Action to pull translations from Transifex

Each team has its own solution to pull translations from Transifex. Setting up a solution can be an obstacle for new teams that would rather translate than keep reinventing the wheel.

How about a GitHub Action that performs all the necessary steps to pull translations?

Workflow idea:

   permissions:
     contents: write
   steps:
     - name: Checkout the repository
       uses: actions/checkout@v3
     - name: Pull translations
       uses: THIS_ACTION@main
       with:
         tx_token: ${{ secrets.TX_TOKEN }}
         # quoted so YAML does not read the branch as a float (3.10 would become 3.1)
         cpython_branch: "3.12"
         commit_and_push: true
  • python-docs-uk has an awesome script that uses the transifex-python package to pull translations from Transifex. It is simple and clean.
  • The README would explain how to set up the Transifex API token as a repository secret.
  • The commit_and_push argument in the above example avoids the need for a later commit and push, but leaving it false would allow customized commit and push tasks.
  • Note how cpython and its dependencies wouldn't have to be set up in the language's workflow.
  • A Docker-based or JavaScript-based GitHub Action? I've read that JavaScript-based is faster, but I don't know whether all these operations can be done in anything other than a Docker-based action.
  • This repository could document a suggestion for new teams to use this new action.
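
If the action were implemented in Python inside a Docker container, its entry point could read the inputs from the environment: GitHub Actions passes each `with:` input to the action as an environment variable named INPUT_ followed by the upper-cased input name. The sketch below covers only that parsing step, using the three inputs from the workflow idea above; the function name is hypothetical.

```python
import os


def read_inputs(environ=os.environ):
    """Parse the action inputs proposed in the workflow idea."""
    token = environ.get("INPUT_TX_TOKEN", "")
    if not token:
        raise SystemExit("tx_token input is required")
    return {
        "tx_token": token,
        "cpython_branch": environ.get("INPUT_CPYTHON_BRANCH", ""),
        # Inputs always arrive as strings, so the boolean is parsed explicitly.
        "commit_and_push": environ.get("INPUT_COMMIT_AND_PUSH", "false").lower() == "true",
    }


fake_env = {"INPUT_TX_TOKEN": "secret", "INPUT_CPYTHON_BRANCH": "3.12", "INPUT_COMMIT_AND_PUSH": "true"}
print(read_inputs(fake_env)["commit_and_push"])  # True
```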

Add and document placeholders

  • :py:const:
  • :monitoring-event:
  • :title-reference:
  • `text`_
  • :program:
  • |python_version_literal|
  • :pypi:
  • :cve:
  • :cwe:
  • Transifex custom placeholders seem to be case-sensitive now.
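
To check candidate patterns before documenting them, the placeholders listed above could be tried against sample strings with a regular expression. This is a sketch only: it assumes roles are followed by backquoted content, it is not the pattern syntax Transifex itself uses, and it is case-sensitive by default.

```python
import re

# Role names taken from the list above.
ROLES = ["py:const", "monitoring-event", "title-reference", "program", "pypi", "cve", "cwe"]

PLACEHOLDER = re.compile(
    r":(?:%s):`[^`]+`" % "|".join(map(re.escape, ROLES))  # :role:`content`
    + r"|`[^`]+`_"                                         # `text`_
    + r"|\|python_version_literal\|"                       # substitution
)


def find_placeholders(text):
    """Return every placeholder found in text, in order of appearance."""
    return PLACEHOLDER.findall(text)


sample = "Use :program:`python` on |python_version_literal|, see `docs`_ and :cve:`2023-0001`."
print(find_placeholders(sample))
```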
