backtranslation's Introduction

BackTranslation

BackTranslation is a Python library for back-translating text between any two languages. It uses the googletrans library and the Baidu Translation API to perform the translations.

Because of an error in the current version of googletrans, you should create only one instance and reuse it for all of your back-translation work. Creating several instances easily triggers a bug caused by multiple requests. Support for other translator libraries will be added soon.

If you run into a bug, please open an issue on GitHub.

Installation

You can install it from PyPI:

$ pip install BackTranslation

Usage

Backtranslation with googletrans

Translate the original text into another language and then back again, which increases the diversity of data for NLP research.

Parameters:

  • url: optional. A list of service URLs to use for translation. Default: translate.google.com.
  • proxies: optional. Proxy configuration. A dictionary mapping a protocol, or a protocol and host, to the URL of the proxy, e.g. proxies = {'http': '127.0.0.1:1234', 'http://host.name': '127.0.0.1:4012'}
  • text: required. The original text to back-translate.
  • src: optional. Language code of the original text. If this parameter is None, the method detects the language of the text automatically. (Default: None)
  • tmp: optional. Language code of the intermediate language. If this parameter is None, the method picks one of two languages, whichever differs from src.
  • sleeping: optional. A delay that throttles back-translation to avoid hitting Google's rate limit. (Default: 0)

Returns: a Translated object.

Attributes:

  • source_text: the original sentence
  • src: the language of the original sentence
  • tmp: the intermediate language
  • trans_text: the intermediate result
  • back_text: the back-translated result

from BackTranslation import BackTranslation

# Create a single instance and reuse it for every request.
trans = BackTranslation(
    url=[
        'translate.google.com',
        'translate.google.co.kr',
    ],
    proxies={'http': '127.0.0.1:1234', 'http://host.name': '127.0.0.1:4012'},
)
# Translate English -> Chinese (Simplified) -> English.
result = trans.translate('hello', src='en', tmp='zh-cn')
print(result.result_text)
# 'Hello there'

Note: Create only one instance of BackTranslation to avoid the issue in the current version of googletrans.
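
A minimal sketch of the single-instance pattern applied to a batch of sentences. The helper `augment` and the `EchoTranslator` stand-in are hypothetical and not part of the library; the stand-in only mimics the `translate(text, src=..., tmp=...)` call and the `result_text` attribute used in the example above, so the sketch runs offline. In real use you would pass one shared `BackTranslation()` instance instead.

```python
import time
from types import SimpleNamespace


class EchoTranslator:
    """Offline stand-in (hypothetical) that mimics the translate() call shown above."""

    def __init__(self):
        self.calls = 0

    def translate(self, text, src=None, tmp=None):
        self.calls += 1
        # A real back-translation would usually return a paraphrase of the input.
        return SimpleNamespace(source_text=text, result_text=text.capitalize())


def augment(translator, sentences, src="en", tmp="zh-cn", pause=0.0):
    """Back-translate every sentence with ONE shared translator instance."""
    augmented = []
    for sentence in sentences:
        result = translator.translate(sentence, src=src, tmp=tmp)
        # Keep only outputs that differ from the input; identical text adds no diversity.
        if result.result_text != sentence:
            augmented.append(result.result_text)
        time.sleep(pause)  # crude pause between sentences
    return augmented


shared = EchoTranslator()  # in real use: shared = BackTranslation()
new_sentences = augment(shared, ["hello there", "Good morning"])
print(new_sentences)  # ['Hello there']
```

The second sentence is dropped because the stand-in returns it unchanged, which is the case an augmentation pipeline would typically want to filter out.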

Search the language code

Use this method to look up a language code from the full language name.

Parameters:

  • language: required. A language name in English.

from BackTranslation import BackTranslation
trans = BackTranslation()
trans.searchLanguage('Chinese')
# {'chinese (simplified)': 'zh-cn', 'chinese (traditional)': 'zh-tw'}
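
One way such a lookup can work is a case-insensitive substring match over a name-to-code table. The table below is a small hand-written sample and `search_language` is a hypothetical helper; this is a sketch of the idea, not the library's actual implementation.

```python
# Small hand-written sample table (assumption); a real table covers many more languages.
LANGUAGE_CODES = {
    "chinese (simplified)": "zh-cn",
    "chinese (traditional)": "zh-tw",
    "english": "en",
    "french": "fr",
    "vietnamese": "vi",
}


def search_language(name):
    """Return every {language name: code} entry whose name contains `name`."""
    needle = name.strip().lower()
    return {full: code for full, code in LANGUAGE_CODES.items() if needle in full}


print(search_language("Chinese"))
# {'chinese (simplified)': 'zh-cn', 'chinese (traditional)': 'zh-tw'}
```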

Backtranslation_Baidu with Baidu Translation API

To use this more stable translator, you need to register for the Baidu Translation API to get your own appID and secret key. The free tier supports 2 million characters per day. Note: registration currently requires a Chinese phone number.

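from BackTranslation import BackTranslation_Baidu

# Replace the placeholders with the credentials from your Baidu account.
trans = BackTranslation_Baidu(appid='YOUR APPID', secretKey='YOUR SECRETKEY')
# 'auto' lets Baidu detect the source language; 'zh' is the intermediate language.
result = trans.translate('hello', src='auto', tmp='zh')
print(result.result_text)
# 'hello'
# Close the HTTP connection when you are done.
trans.closeHTTP()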

Search language code

Baidu uses its own set of language codes, so a lookup for them will be added soon.

Version Information

Version 0.3.1: fix some bugs for Baidu translator.

Version 0.2.2: fix the services url for Google Translator.

Version 0.2.1: fix a small bug. From this version on, the library uses googletrans 4.0.0rc1.

Version 0.2.0: support back-translation with Baidu API, and fix bugs

Version 0.1.0: support back-translation with googletrans library

Contribution

Contributions to the BackTranslation library are welcome!

backtranslation's People

Contributors

hhhwwwuuu

backtranslation's Issues

Proxy Support

Hello,
I think Google has banned my IP.
I want to send requests through a proxy.
How can I use a proxy with BackTranslation?

【Bug】Baidu Translation

Thanks for this library which is useful!

back_text = self._get_translatedText(self._sendRequest(text, tmp, src))

It should be back_text = self._get_translatedText(self._sendRequest(tran_text, tmp, src)) here. Otherwise, the tran_text is not used, which means the logic and result are wrong.

Besides, I recommend noting that the Baidu Translation API has a QPS (queries per second) limit. With the standard service you can submit only 1 request per second. The current version of this library does not catch this kind of failure, which may cause trouble for users (it did for me, at least).
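
Both points in this report can be sketched together: the second request must send the intermediate translation, and requests should be spaced out to respect a one-request-per-second limit. Everything here is hypothetical: `send_request` stands in for the library's private `_sendRequest`, and `Throttle` and `fake_send` exist only so the sketch runs offline.

```python
import time


class Throttle:
    """Block until at least `min_interval` seconds have passed since the last call."""

    def __init__(self, min_interval=1.0):
        self.min_interval = min_interval
        self._last = None

    def wait(self):
        now = time.monotonic()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                time.sleep(remaining)
        self._last = time.monotonic()


def back_translate(text, src, tmp, send_request, throttle):
    """Translate src -> tmp, then feed the INTERMEDIATE text into tmp -> src."""
    throttle.wait()
    tran_text = send_request(text, src, tmp)
    throttle.wait()
    back_text = send_request(tran_text, tmp, src)  # tran_text, not the original text
    return tran_text, back_text


def fake_send(text, from_lang, to_lang):
    """Offline stand-in for the HTTP call: tags the text with the target language."""
    return f"[{to_lang}] {text}"


# A short interval keeps the demo fast; 1.0 would match the standard-service limit.
tran, back = back_translate("hello", "en", "zh", fake_send, Throttle(min_interval=0.05))
print(tran)  # [zh] hello
print(back)  # [en] [zh] hello
```

The nested tag in the second line shows that the return trip really received the intermediate text rather than the original input.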

Wish

Can you provide a complete example, such as translating: "Anh ấy đã chữa khỏi cảm cúm bằng aspirin."

AttributeError: 'Translator' object has no attribute 'raise_Exception'

I'm doing backtranslation (English as the source, and French as the target language).

I got this AttributeError after translating about 150 sentences.

Traceback (most recent call last):
  File "backtranslate.py", line 51, in <module>
    backTranslation.translate()
  File "backtranslate.py", line 38, in translate
    result = self.translator.translate(src_text, src="en", tmp=self.tgt_language)
  File "/home/EC/izaskr/anaconda3/lib/python3.6/site-packages/BackTranslation/translation.py", line 67, in translate
    result_text = self.translator.translate(tran_text, src=tmp, dest=src)
  File "/home/EC/izaskr/anaconda3/lib/python3.6/site-packages/googletrans/client.py", line 194, in translate
    data, response = self._translate(text, dest, src)
  File "/home/EC/izaskr/anaconda3/lib/python3.6/site-packages/googletrans/client.py", line 122, in _translate
    if r.status_code != 200 and self.raise_Exception:
AttributeError: 'Translator' object has no attribute 'raise_Exception'
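
The last frame shows the attribute is only looked up when a response comes back with a non-200 status, so the error surfaces on a failed request rather than on every call. One possible workaround for long runs is to wrap each call in a retry with exponential backoff. This is a generic sketch, not a fix for the bug itself; `with_retries` and `flaky_translate` are hypothetical names, and the failure is simulated so the example runs offline.

```python
import time


def with_retries(func, attempts=3, base_delay=0.1, retry_on=(Exception,)):
    """Call func(); on failure wait base_delay * 2**n seconds and try again."""
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            time.sleep(base_delay * 2 ** attempt)


calls = {"count": 0}


def flaky_translate():
    """Simulated request that fails twice before succeeding."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise AttributeError("'Translator' object has no attribute 'raise_Exception'")
    return "bonjour"


print(with_retries(flaky_translate, base_delay=0.01))  # bonjour
```

In real use the callable could be something like `lambda: trans.translate(text, src='en', tmp='fr')`, with a `base_delay` of several seconds.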

{'error_code': '54003', 'error_msg': 'Invalid Access Limit'}

from BackTranslation import BackTranslation_Baidu
trans = BackTranslation_Baidu(appid='xxx', secretKey='xxx')
result = trans.translate('好吧,你可以理解他的想法,但我们不能用金钱来考验家庭的感情,', src='auto', tmp='en')
print(result.result_text)
trans.closeHTTP()

We used the code above, but it replies with "{'error_code': '54003', 'error_msg': 'Invalid Access Limit'}".
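
The reply is a plain dictionary, so a caller can at least detect it before treating the result as a translation. A minimal sketch, assuming error replies always carry the `error_code` and `error_msg` keys seen above; `BaiduAPIError` and `check_response` are hypothetical names, not part of the library. Baidu's documentation describes code 54003 as an access-frequency limit, which fits the QPS restriction mentioned in the earlier bug report.

```python
class BaiduAPIError(RuntimeError):
    """Raised when a reply contains an error code instead of a translation."""

    def __init__(self, code, message):
        super().__init__(f"Baidu API error {code}: {message}")
        self.code = code


def check_response(reply):
    """Return the reply unchanged, or raise if it reports an error."""
    if "error_code" in reply:
        raise BaiduAPIError(reply["error_code"], reply.get("error_msg", ""))
    return reply


rate_limited = False
try:
    check_response({"error_code": "54003", "error_msg": "Invalid Access Limit"})
except BaiduAPIError as err:
    print(err)  # Baidu API error 54003: Invalid Access Limit
    rate_limited = err.code == "54003"  # a signal to slow down and retry later
```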
