
language_tool_python's Introduction

language_tool_python: a grammar checker for Python 📝

Current LanguageTool version: 6.4

This is a Python wrapper for LanguageTool. LanguageTool is an open-source grammar tool, also known as the spellchecker for OpenOffice. This library allows you to detect grammar errors and spelling mistakes through a Python script or through a command-line interface.

Local and Remote Servers

By default, language_tool_python will download a LanguageTool server .jar and run that in the background to detect grammar errors locally. However, LanguageTool also offers a Public HTTP Proofreading API that is supported as well. Follow the link for rate limiting details. (Running locally won't have the same restrictions.)

Using language_tool_python locally

Local server is the default setting. To use this, just initialize a LanguageTool object:

import language_tool_python
tool = language_tool_python.LanguageTool('en-US')  # use a local server (automatically set up), language English

Using language_tool_python with the public LanguageTool remote server

There is also a built-in class for querying LanguageTool's public servers. Initialize it like this:

import language_tool_python
tool = language_tool_python.LanguageToolPublicAPI('es') # use the public API, language Spanish

Using language_tool_python with another remote server

Finally, you're able to pass in your own remote server as an argument to the LanguageTool class:

import language_tool_python
tool = language_tool_python.LanguageTool('ca-ES', remote_server='https://language-tool-api.mywebsite.net')  # use a remote server API, language Catalan

Apply a custom list of matches with utils.correct

If you want to decide which Match objects to apply to your text, use tool.check (to generate the list of matches) in conjunction with language_tool_python.utils.correct (to apply the list of matches to text). Here is an example of generating, filtering, and applying a list of matches. In this case, spell-checking suggestions for uppercase words are ignored:

>>> s = "Department of medicine Colombia University closed on August 1 Milinda Samuelli"
>>> is_bad_rule = lambda rule: rule.message == 'Possible spelling mistake found.' and len(rule.replacements) and rule.replacements[0][0].isupper()
>>> import language_tool_python
>>> tool = language_tool_python.LanguageTool('en-US')
>>> matches = tool.check(s)
>>> matches = [rule for rule in matches if not is_bad_rule(rule)]
>>> language_tool_python.utils.correct(s, matches)
'Department of medicine Colombia University closed on August 1 Melinda Sam'
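To make the "apply a list of matches" step concrete, here is a minimal sketch of how replacements can be applied by offset. It is not the library's implementation of utils.correct; apply_matches is a hypothetical helper, and the matches are plain dicts carrying the offset, errorLength, and replacements fields that Match objects expose.

```python
def apply_matches(text, matches):
    """Apply the first replacement of each match, working right to left."""
    result = text
    # Starting from the last match keeps the offsets of earlier matches valid.
    for m in sorted(matches, key=lambda m: m["offset"], reverse=True):
        if not m["replacements"]:
            continue  # nothing to substitute for this match
        start = m["offset"]
        end = start + m["errorLength"]
        result = result[:start] + m["replacements"][0] + result[end:]
    return result


# Hypothetical stand-ins for Match objects (only the fields used above).
text = "helo darknes my old frend"
matches = [
    {"offset": 0, "errorLength": 4, "replacements": ["Helo"]},
    {"offset": 5, "errorLength": 7, "replacements": ["darkness", "darkens"]},
    {"offset": 20, "errorLength": 5, "replacements": ["friend", "trend"]},
]
corrected = apply_matches(text, matches)
```

Filtering the list before applying it, as in the example above, simply means fewer spans get rewritten.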

Example usage

From the interpreter:

>>> import language_tool_python
>>> tool = language_tool_python.LanguageTool('en-US')
>>> text = 'A sentence with a error in the Hitchhiker’s Guide tot he Galaxy'
>>> matches = tool.check(text)
>>> len(matches)
2
...
>>> tool.close() # Call `close()` to shut off the server when you're done.

Check out some Match object attributes:

>>> matches[0].ruleId, matches[0].replacements # ('EN_A_VS_AN', ['an'])
('EN_A_VS_AN', ['an'])
>>> matches[1].ruleId, matches[1].replacements
('TOT_HE', ['to the'])
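
If you only need a compact overview of what was found, the two attributes shown above are enough. The following is a small sketch; summarize is a hypothetical helper, and the SimpleNamespace objects stand in for real Match objects so the snippet runs without a LanguageTool server.

```python
from types import SimpleNamespace


def summarize(matches):
    """Return (ruleId, first suggestion or None) for each match."""
    return [
        (m.ruleId, m.replacements[0] if m.replacements else None)
        for m in matches
    ]


# Stand-ins carrying the attribute values shown above.
fake_matches = [
    SimpleNamespace(ruleId="EN_A_VS_AN", replacements=["an"]),
    SimpleNamespace(ruleId="TOT_HE", replacements=["to the"]),
]
summary = summarize(fake_matches)
```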

Print a Match object:

>>> print(matches[1])
Line 1, column 51, Rule ID: TOT_HE[1]
Message: Did you mean 'to the'?
Suggestion: to the
...

Automatically apply suggestions to the text:

>>> tool.correct(text)
'A sentence with an error in the Hitchhiker’s Guide to the Galaxy'

From the command line:

$ echo 'This are bad.' > example.txt
$ language_tool_python example.txt
example.txt:1:1: THIS_NNS[3]: Did you mean 'these'?

Closing LanguageTool

language_tool_python runs a LanguageTool Java server in the background. It will shut the server off when garbage collected, for example when a created language_tool_python.LanguageTool object goes out of scope. However, if garbage collection takes a while, the process might not get deleted right away. If you're seeing lots of processes get spawned and not deleted, you can explicitly close them:

import language_tool_python
tool = language_tool_python.LanguageToolPublicAPI('de-DE') # starts a process
# do stuff with `tool`
tool.close() # explicitly shut off the LanguageTool

You can also use a context manager (with ... as) to explicitly control when the server is started and stopped:

import language_tool_python

with language_tool_python.LanguageToolPublicAPI('de-DE') as tool:
    pass  # do stuff with `tool`
# no need to call `close()`, as it will happen at the end of the with statement
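
When the tool has to be created inside a helper function, try/finally gives the same guarantee as the with statement. This is a sketch only: check_texts and FakeTool are hypothetical names, and FakeTool exists just so the snippet runs without Java.

```python
def check_texts(make_tool, texts):
    """Create a tool, check every text, and always close the tool."""
    tool = make_tool()
    try:
        return [tool.check(t) for t in texts]
    finally:
        tool.close()  # runs even if check() raises


class FakeTool:
    """Hypothetical stand-in exposing the check()/close() methods used above."""

    def __init__(self):
        self.closed = False

    def check(self, text):
        return []

    def close(self):
        self.closed = True


fake = FakeTool()
results = check_texts(lambda: fake, ["one", "two"])
```

With the real library, the factory would be something like lambda: language_tool_python.LanguageTool('en-US').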

Client-Server Model

You can run LanguageTool on one host and connect to it from another. This is useful in some distributed scenarios. Here's a simple example:

server

>>> import language_tool_python
>>> tool = language_tool_python.LanguageTool('en-US', host='0.0.0.0')
>>> tool._url
'http://0.0.0.0:8081/v2/'

client

>>> import language_tool_python
>>> lang_tool = language_tool_python.LanguageTool('en-US', remote_server='http://0.0.0.0:8081')
>>>
>>>
>>> lang_tool.check('helo darknes my old frend')
[Match({'ruleId': 'UPPERCASE_SENTENCE_START', 'message': 'This sentence does not start with an uppercase letter.', 'replacements': ['Helo'], 'offsetInContext': 0, 'context': 'helo darknes my old frend', 'offset': 0, 'errorLength': 4, 'category': 'CASING', 'ruleIssueType': 'typographical', 'sentence': 'helo darknes my old frend'}), Match({'ruleId': 'MORFOLOGIK_RULE_EN_US', 'message': 'Possible spelling mistake found.', 'replacements': ['darkness', 'darkens', 'darkies'], 'offsetInContext': 5, 'context': 'helo darknes my old frend', 'offset': 5, 'errorLength': 7, 'category': 'TYPOS', 'ruleIssueType': 'misspelling', 'sentence': 'helo darknes my old frend'}), Match({'ruleId': 'MORFOLOGIK_RULE_EN_US', 'message': 'Possible spelling mistake found.', 'replacements': ['friend', 'trend', 'Fred', 'freed', 'Freud', 'Friend', 'fend', 'fiend', 'frond', 'rend', 'fr end'], 'offsetInContext': 20, 'context': 'helo darknes my old frend', 'offset': 20, 'errorLength': 5, 'category': 'TYPOS', 'ruleIssueType': 'misspelling', 'sentence': 'helo darknes my old frend'})]
>>>
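
For a feel of what the client sends over the wire, here is a rough sketch that builds (but does not send) a request to the server's /v2/check endpoint using only the standard library. The wrapper does this for you; build_check_request is a hypothetical helper, and using POST here is just one option for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_check_request(base_url, text, language):
    """Build (but do not send) a POST request for the /v2/check endpoint."""
    url = base_url.rstrip("/") + "/v2/check"
    body = urlencode({"language": language, "text": text}).encode("utf-8")
    return Request(url, data=body, method="POST")


req = build_check_request(
    "http://0.0.0.0:8081", "helo darknes my old frend", "en-US"
)
```

In a real two-host setup, the client would use the server's hostname or IP address rather than 0.0.0.0.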

Configuration

LanguageTool offers lots of built-in configuration options.

Example: Enabling caching

Here's an example of using the configuration options to enable caching. Some users have reported that this helps performance a lot.

import language_tool_python
tool = language_tool_python.LanguageTool('en-US', config={ 'cacheSize': 1000, 'pipelineCaching': True })

Example: Setting maximum text length

Here's an example showing how to configure LanguageTool to set a maximum length on grammar-checked text. If the text is too long, LanguageTool raises an error, which propagates to Python as a language_tool_python.LanguageToolError.

import language_tool_python
tool = language_tool_python.LanguageTool('en-US', config={ 'maxTextLength': 100 })
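
If your input may exceed the limit, one option is to split it before checking. The following is a minimal sketch under the assumption that splitting on whitespace is acceptable for your text; split_for_limit is a hypothetical helper, not part of the library.

```python
def split_for_limit(text, max_len):
    """Split text into whitespace-delimited chunks of at most max_len characters."""
    chunks, current = [], ""
    for word in text.split():
        if len(word) > max_len:
            raise ValueError(f"word longer than limit: {word!r}")
        candidate = word if not current else current + " " + word
        if len(candidate) <= max_len:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


chunks = split_for_limit("one two three four five", 9)
```

Note that this can cut a sentence in half and collapses runs of whitespace, so grammar results and match offsets for the chunks may differ from those for the full text. Splitting on sentence boundaries would be gentler.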

Full list of configuration options

Here's a full list of configuration options. See the LanguageTool HTTPServerConfig documentation for details.

'maxTextLength' - maximum text length, longer texts will cause an error (optional)
'maxTextHardLength' - maximum text length, applies even to users with a special secret 'token' parameter (optional)
'secretTokenKey' - secret JWT token key, if set by user and valid, maxTextLength can be increased by the user (optional)
'maxCheckTimeMillis' - maximum time in milliseconds allowed per check (optional)
'maxErrorsPerWordRate' - checking will stop with error if there are more rules matches per word (optional)
'maxSpellingSuggestions' - only this many spelling errors will have suggestions for performance reasons (optional,
                          affects Hunspell-based languages only)
'maxCheckThreads' - maximum number of threads working in parallel (optional)
'cacheSize' - size of internal cache in number of sentences (optional, default: 0)
'cacheTTLSeconds' - how many seconds sentences are kept in cache (optional, default: 300 if 'cacheSize' is set)
'requestLimit' - maximum number of requests per requestLimitPeriodInSeconds (optional)
'requestLimitInBytes' - maximum aggregated size of requests per requestLimitPeriodInSeconds (optional)
'timeoutRequestLimit' - maximum number of timeout requests (optional)
'requestLimitPeriodInSeconds' - time period to which requestLimit and timeoutRequestLimit applies (optional)
'languageModel' - a directory with '1grams', '2grams', '3grams' sub directories which contain a Lucene index
                  each with ngram occurrence counts; activates the confusion rule if supported (optional)
'word2vecModel' - a directory with word2vec data (optional), see
                  https://github.com/languagetool-org/languagetool/blob/master/languagetool-standalone/CHANGES.md#word2vec
'fasttextModel' - a model file for better language detection (optional), see
                  https://fasttext.cc/docs/en/language-identification.html
'fasttextBinary' - compiled fasttext executable for language detection (optional), see
                  https://fasttext.cc/docs/en/support.html
'maxWorkQueueSize' - reject request if request queue gets larger than this (optional)
'rulesFile' - a file containing rules configuration, such as .languagetool.cfg (optional)
'warmUp' - set to 'true' to warm up server at start, i.e. run a short check with all languages (optional)
'blockedReferrers' - a comma-separated list of HTTP referrers (and 'Origin' headers) that are blocked and will not be served (optional)
'premiumOnly' - activate only the premium rules (optional)
'disabledRuleIds' - a comma-separated list of rule ids that are turned off for this server (optional)
'pipelineCaching' - set to 'true' to enable caching of internal pipelines to improve performance
'maxPipelinePoolSize' - cache size if 'pipelineCaching' is set
'pipelineExpireTimeInSeconds' - time after which pipeline cache items expire
'pipelinePrewarming' - set to 'true' to fill pipeline cache on start (can slow down start a lot)
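
Since option names are case-sensitive strings, a typo is easy to make. Here is a small sketch that compares a config dict against the names listed above before it is passed to LanguageTool. KNOWN_KEYS and unknown_config_keys are hypothetical names, and the set only mirrors this README's list, which may lag behind your LanguageTool version.

```python
# Option names copied from the list above.
KNOWN_KEYS = {
    "maxTextLength", "maxTextHardLength", "secretTokenKey",
    "maxCheckTimeMillis", "maxErrorsPerWordRate", "maxSpellingSuggestions",
    "maxCheckThreads", "cacheSize", "cacheTTLSeconds", "requestLimit",
    "requestLimitInBytes", "timeoutRequestLimit",
    "requestLimitPeriodInSeconds", "languageModel", "word2vecModel",
    "fasttextModel", "fasttextBinary", "maxWorkQueueSize", "rulesFile",
    "warmUp", "blockedReferrers", "premiumOnly", "disabledRuleIds",
    "pipelineCaching", "maxPipelinePoolSize", "pipelineExpireTimeInSeconds",
    "pipelinePrewarming",
}


def unknown_config_keys(config):
    """Return the config keys that do not appear in the list above, sorted."""
    return sorted(set(config) - KNOWN_KEYS)


ok = unknown_config_keys({"cacheSize": 1000, "pipelineCaching": True})
typos = unknown_config_keys({"cachesize": 1000})
```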

Installation

To install via pip:

$ pip install --upgrade language_tool_python

What rules does LanguageTool have?

Searching for a specific rule to enable or disable? Curious about the breadth of rules LanguageTool applies? This page contains a massive list of all 5,000+ grammatical rules that are programmed into LanguageTool: https://community.languagetool.org/rule/list?lang=en&offset=30&max=10

Customizing Download URL or Path

If LanguageTool is already installed on your system, you can define the following environment variable:

$ export LTP_JAR_DIR_PATH=/path/to/the/language/tool/jar/files

Otherwise, language_tool_python can download LanguageTool for you automatically.

To override the host part of the URL that is used to download LanguageTool-{version}.zip:

$ export LTP_DOWNLOAD_HOST=[alternate URL]

This can be used to downgrade to an older version, for example, or to download from a mirror.

And to choose the specific folder to download the server to:

$ export LTP_PATH=/path/to/save/language/tool

The default download path is ~/.cache/language_tool_python/. The LanguageTool server is about 200 MB, so take that into account when choosing your download folder. (Or, if you can't spare the disk space, use a remote URL!)
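
The same variables can also be set from Python for the current process. This sketch assumes the library reads them when it locates or downloads LanguageTool, so it should run before a LanguageTool object is created. configure_language_tool_env is a hypothetical helper, and /tmp/ltp_cache is only an example path.

```python
import os


def configure_language_tool_env(download_dir=None, download_host=None, jar_dir=None):
    """Set the LTP_* variables for this process; None leaves a variable untouched."""
    settings = {
        "LTP_PATH": download_dir,
        "LTP_DOWNLOAD_HOST": download_host,
        "LTP_JAR_DIR_PATH": jar_dir,
    }
    for name, value in settings.items():
        if value is not None:
            os.environ[name] = str(value)


configure_language_tool_env(download_dir="/tmp/ltp_cache")
```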

Prerequisites

The installation process should take care of downloading LanguageTool (it may take a few minutes). Otherwise, you can manually download LanguageTool-stable.zip and unzip it into where the language_tool_python package resides.

LanguageTool Version

As of April 2020, language_tool_python was forked from language-check and no longer supports LanguageTool versions lower than 4.0.

Acknowledgements

This is a fork of https://github.com/myint/language-check/ that produces more easily parsable results from the command-line.

language_tool_python's People

Contributors

aokellermann, bobbens, bzm3r, dcbaker, hiddenspirit, jayvdb, jie-mei, johndle, jxmorris12, matthewmcintire-savantx, misrasaurabh1, myint, paulminogue, pidefrem, rafaelwo, samber, sils, tetris0k, tilka, tuomastik

language_tool_python's Issues

How to apply remote server and other arguments

Hi,
is there a way to add arguments to this interpreter call? Thanks for your help.

import language_tool_python
tool = language_tool_python.LanguageTool('en-US')
text = u'A sentence with a error in the Hitchhiker’s Guide tot he Galaxy'
matches = tool.check(text)
len(matches)

how to run language_tool_python.tool("en-US") from pre-downloaded server

Hey,

I need to run the language_tool_python package on a remote server that does not allow the tool to download the "en-US" server at runtime (URLError). I can run pip install language_tool_python without problems, but I cannot download the server during runtime. Is there a way to manually download the "en-US" server, copy it to the remote server, and run language_tool_python from there without connecting to the internet when I run language_tool_python.tool?

Can not deploy Cloud Function with language_tool_python in Python 3.7 runtime

How do you use language_tool_python in a Google Cloud Function? I'm trying to deploy to a GCP Cloud Function with the Python 3.7 runtime. I can use the library in my local virtual environment, where I have Java installed.

But when I'm trying to deploy it in Cloud Function, I'm getting ModuleNotFoundError: No java install detected. Please install java to use language-tool-python.

I'm using the language_tool_python library in the cloud function

# install and import for grammar accuracy
import language_tool_python
tool = language_tool_python.LanguageTool('en-IN')
matches = tool.check(input_string)

in the requirement.txt we have --

language-tool-python==2.4.5

I'm getting the following error message --

Function failed on loading user code. Error message: Code in file main.py can't be loaded. Did you list all required modules in requirements.txt? Detailed stack trace:
Traceback (most recent call last):
  File "/env/local/lib/python3.7/site-packages/google/cloud/functions/worker_v2.py", line 359, in check_or_load_user_function
    _function_handler.load_user_function()
  File "/env/local/lib/python3.7/site-packages/google/cloud/functions/worker_v2.py", line 236, in load_user_function
    spec.loader.exec_module(main_module)
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/user_code/main.py", line 10, in <module>
    from libraries.unknown_word import word_meaning
  File "/user_code/libraries/unknown_word.py", line 18, in <module>
    tool = language_tool_python.LanguageTool('en-IN')
  File "/env/local/lib/python3.7/site-packages/language_tool_python/server.py", line 46, in __init__
    self._start_server_on_free_port()
  File "/env/local/lib/python3.7/site-packages/language_tool_python/server.py", line 183, in _start_server_on_free_port
    self._start_local_server()
  File "/env/local/lib/python3.7/site-packages/language_tool_python/server.py", line 193, in _start_local_server
    download_lt()
  File "/env/local/lib/python3.7/site-packages/language_tool_python/download_lt.py", line 144, in download_lt
    confirm_java_compatibility()
  File "/env/local/lib/python3.7/site-packages/language_tool_python/download_lt.py", line 75, in confirm_java_compatibility
    raise ModuleNotFoundError('No java install detected. Please install java to use language-tool-python.')
ModuleNotFoundError: No java install detected. Please install java to use language-tool-python.

Can anyone please suggest how to use LanguageTool from serverless functions? How can we have a Java environment in a Cloud Function along with Python 3.7?

Use locally downloaded language tool instead of downloading it from url?

When using the local server, there is a download from a url for the language tool during initialization. Is it possible to add a local configuration that lets you use an already downloaded tool under LTP_PATH and bypass the download code? Wondering if there's a way to bypass the download from the url with 100% certainty.

urllib Forbidden 403 when retrieving English language

I believe this is a similar issue to this: myint/language-check#25

I've been getting the following error when running the line tool = language_tool_python.LanguageTool('en-US') during my Docker build:

[screenshot of the error; image not available]

I've looked into this specific urllib error. Most suggest this is a web-scrape prevention mechanism of websites, but the LanguageTool .jar server wouldn't care about that.

The strangest thing is that this Docker build works when I build in my company's main VPN, but fails when I build in my company's specialized VPN. I'd assume that the LanguageTool .jar server isn't being installed correctly on the specialized network, but the stack trace implies that LanguageTool was installed just fine and is running. It's just the urllib 403 forbidden error that happens afterwards when querying the server for http://127.0.0.0:8081/v2/languages. I presume all other attempts to query the language server would result in the same issue.

Some details:

  1. OS is Debian GNU/Linux 11.0
  2. Python version 3.8
  3. Java 11.0 JVM
  4. language_tool_python version 2.5.4, installing LanguageTool server version 5.3

I can provide any other info as needed. I think I've exhausted all attempts to fix this issue though...

Disabling spell check = HTTP Error 400?

I'm trying to disable spell checking, but when I do, I get language_tool_python.utils.LanguageToolError: http://127.0.0.1:8081/v2/: HTTP Error 400: Bad Request.

Code is:
tool = language_tool_python.LanguageTool('en-GB')
tool.disable_spellchecking()
tool.correct(somestringhere)

en-US

@jayvdb @hiddenspirit @Tilka @myint @misrasaurabh1

import language_tool_python
tool = language_tool_python.LanguageTool('en-UK')
# tool = language_tool_python.LanguageTool('en-US')
text = 'A sentence with a error in the Hitchhiker’s Guide tot he Galaxy'
matches = tool.check(text)
len(matches)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-10-21873d5f8bb0> in <module>()
      1 import language_tool_python
----> 2 tool = language_tool_python.LanguageTool('en-UK')
      3 # tool = language_tool_python.LanguageTool('en-US')
      4 text = 'A sentence with a error in the Hitchhiker’s Guide tot he Galaxy'
      5 matches = tool.check(text)

2 frames
/usr/local/lib/python3.7/dist-packages/language_tool_python/server.py in _query_server(self, url, params, num_tries)
    216         for n in range(num_tries):
    217             try:
--> 218                 with requests.get(url, params=params, timeout=self._TIMEOUT) as response:
    219                     try:
    220                         return response.json()

AttributeError: __enter__

match.errorlength missing

error when correcting matches

File "/Users/aidantan/grammer_check_transformer/src/grammar_fix.py", line 11, in grammar_correct
revised_sentence = tool.correct(sentence)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/language_tool_python/init.py", line 288, in correct
return correct(text, self.check(text, srctext))
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/language_tool_python/init.py", line 517, in correct
errors = [ltext[match.offset:match.offset + match.errorlength]
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/language_tool_python/init.py", line 517, in
errors = [ltext[match.offset:match.offset + match.errorlength]
TypeError: unsupported operand type(s) for +: 'int' and 'NoneType'

check with wordlist to ignore certain suggestions

Is there a way to provide the language tool with a list of words that should NOT be marked as mistakes?
I have a lot of technical terms in my data that are wrongly corrected when automatically applying the suggestions of the language tool.
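
One way to approach this on the Python side is to drop every match whose flagged span is on an allow-list before applying corrections. This is a sketch, not a built-in feature: drop_allowed is a hypothetical helper that relies on the offset and errorLength attributes visible in the README's Match output, and the SimpleNamespace objects stand in for real matches.

```python
from types import SimpleNamespace


def drop_allowed(text, matches, allowed_words):
    """Keep only the matches whose flagged text is not on the allow-list."""
    allowed = {w.lower() for w in allowed_words}
    kept = []
    for m in matches:
        flagged = text[m.offset:m.offset + m.errorLength]
        if flagged.lower() not in allowed:
            kept.append(m)
    return kept


text = "helo darknes my old frend"
# Stand-ins for Match objects (only the attributes used above).
fake_matches = [
    SimpleNamespace(offset=5, errorLength=7),   # "darknes"
    SimpleNamespace(offset=20, errorLength=5),  # "frend"
]
kept = drop_allowed(text, fake_matches, ["darknes"])
```

The filtered list can then be passed to language_tool_python.utils.correct, as in the README example on applying a custom list of matches.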

ConnectionResetError encountered while multi threading

Help!
When I use Python's concurrent.futures.ThreadPoolExecutor to run a large number of threads in
parallel, the script throws the following error.

However, when I decrease the number of concurrent threads to a low enough number, such errors can be avoided.

So, my question is: how can I avoid this error when I want the script to run as many threads as my machine's hardware allows?

Any help or input would be much appreciated.

Traceback (most recent call last):
  File "C:\Users\devJ\AppData\Local\Programs\Python\Python310\lib\site-packages\urllib3\connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "C:\Users\devJ\AppData\Local\Programs\Python\Python310\lib\site-packages\urllib3\connectionpool.py", line 449, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "C:\Users\devJ\AppData\Local\Programs\Python\Python310\lib\site-packages\urllib3\connectionpool.py", line 444, in _make_request
    httplib_response = conn.getresponse()
  File "C:\Users\devJ\AppData\Local\Programs\Python\Python310\lib\http\client.py", line 1374, in getresponse
    response.begin()
  File "C:\Users\devJ\AppData\Local\Programs\Python\Python310\lib\http\client.py", line 318, in begin
    version, status, reason = self._read_status()
  File "C:\Users\devJ\AppData\Local\Programs\Python\Python310\lib\http\client.py", line 279, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "C:\Users\devJ\AppData\Local\Programs\Python\Python310\lib\socket.py", line 705, in readinto
    return self._sock.recv_into(b)
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\devJ\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\adapters.py", line 489, in send
    resp = conn.urlopen(
  File "C:\Users\devJ\AppData\Local\Programs\Python\Python310\lib\site-packages\urllib3\connectionpool.py", line 787, in urlopen
    retries = retries.increment(
  File "C:\Users\devJ\AppData\Local\Programs\Python\Python310\lib\site-packages\urllib3\util\retry.py", line 550, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "C:\Users\devJ\AppData\Local\Programs\Python\Python310\lib\site-packages\urllib3\packages\six.py", line 769, in reraise
    raise value.with_traceback(tb)
  File "C:\Users\devJ\AppData\Local\Programs\Python\Python310\lib\site-packages\urllib3\connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "C:\Users\devJ\AppData\Local\Programs\Python\Python310\lib\site-packages\urllib3\connectionpool.py", line 449, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "C:\Users\devJ\AppData\Local\Programs\Python\Python310\lib\site-packages\urllib3\connectionpool.py", line 444, in _make_request
    httplib_response = conn.getresponse()
  File "C:\Users\devJ\AppData\Local\Programs\Python\Python310\lib\http\client.py", line 1374, in getresponse
    response.begin()
  File "C:\Users\devJ\AppData\Local\Programs\Python\Python310\lib\http\client.py", line 318, in begin
    version, status, reason = self._read_status()
  File "C:\Users\devJ\AppData\Local\Programs\Python\Python310\lib\http\client.py", line 279, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "C:\Users\devJ\AppData\Local\Programs\Python\Python310\lib\socket.py", line 705, in readinto
    return self._sock.recv_into(b)
urllib3.exceptions.ProtocolError: ('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\devJ\AppData\Local\Programs\Python\Python310\lib\site-packages\language_tool_python\server.py", line 218, in _query_server
    with requests.get(url, params=params, timeout=self._TIMEOUT) as response:
  File "C:\Users\devJ\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
  File "C:\Users\devJ\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "C:\Users\devJ\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\sessions.py", line 587, in request
    resp = self.send(prep, **send_kwargs)
  File "C:\Users\devJ\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\sessions.py", line 701, in send
    r = adapter.send(request, **kwargs)
  File "C:\Users\devJ\AppData\Local\Programs\Python\Python310\lib\site-packages\requests\adapters.py", line 547, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\dev\jkyndir\Tech\Python\LingoCheck\LTCheck_on_xls.py", line 219, in <module>
    main()
  File "D:\dev\jkyndir\Tech\Python\LingoCheck\LTCheck_on_xls.py", line 213, in main
    ltCheck_col_in_xls(args.dir, args.col, args.lng)
  File "D:\dev\jkyndir\Tech\Python\LingoCheck\LTCheck_on_xls.py", line 193, in ltCheck_col_in_xls
    future.result()
  File "C:\Users\devJ\AppData\Local\Programs\Python\Python310\lib\concurrent\futures\_base.py", line 439, in result
    return self.__get_result()
  File "C:\Users\devJ\AppData\Local\Programs\Python\Python310\lib\concurrent\futures\_base.py", line 391, in __get_result
    raise self._exception
  File "C:\Users\devJ\AppData\Local\Programs\Python\Python310\lib\concurrent\futures\thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "D:\dev\jkyndir\Tech\Python\LingoCheck\LTCheck_on_xls.py", line 182, in ltCheck_col_in_xl
    o_df = ltCheck_col_in_dfDict(lngTool, i_df_dicts, iStr_colNm)
  File "D:\dev\jkyndir\Tech\Python\LingoCheck\LTCheck_on_xls.py", line 169, in ltCheck_col_in_dfDict
    ltCheckResult_df = ltCheck_col_in_df(lngTool, i_df, iStr_colNm)
  File "D:\dev\jkyndir\Tech\Python\LingoCheck\LTCheck_on_xls.py", line 152, in ltCheck_col_in_df
    if f.result() is not None:
  File "C:\Users\devJ\AppData\Local\Programs\Python\Python310\lib\concurrent\futures\_base.py", line 439, in result
    return self.__get_result()
  File "C:\Users\devJ\AppData\Local\Programs\Python\Python310\lib\concurrent\futures\_base.py", line 391, in __get_result
    raise self._exception
  File "C:\Users\devJ\AppData\Local\Programs\Python\Python310\lib\concurrent\futures\thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "D:\dev\jkyndir\Tech\Python\LingoCheck\LTCheck_on_xls.py", line 124, in ltCheck_col_in_row
    row_ltCheckResult_df = ltCheck(lngTool, iStr)
  File "D:\dev\jkyndir\Tech\Python\LingoCheck\LTCheck_on_xls.py", line 81, in ltCheck
    matches = lngTool.check(iStr)
  File "C:\Users\devJ\AppData\Local\Programs\Python\Python310\lib\site-packages\language_tool_python\server.py", line 129, in check
    response = self._query_server(url, self._create_params(text))
  File "C:\Users\devJ\AppData\Local\Programs\Python\Python310\lib\site-packages\language_tool_python\server.py", line 229, in _query_server
    self._terminate_server()
  File "C:\Users\devJ\AppData\Local\Programs\Python\Python310\lib\site-packages\language_tool_python\server.py", line 309, in _terminate_server
    self._server.terminate()
AttributeError: 'NoneType' object has no attribute 'terminate'
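
Since the report says a lower thread count avoids the error, one possible workaround is to cap the pool size and retry a check that fails with a connection problem. This is a sketch only: check_many is a hypothetical helper, and retrying on OSError is an assumption about which exception type reaches your code (adjust retry_on to what you actually see).

```python
import time
from concurrent.futures import ThreadPoolExecutor


def check_many(check, texts, max_workers=4, retries=3, delay=0.0,
               retry_on=(OSError,)):
    """Run check(text) for every text with a bounded pool and simple retries."""

    def attempt(text):
        for n in range(retries):
            try:
                return check(text)
            except retry_on:
                if n == retries - 1:
                    raise  # out of retries, let the caller see the error
                time.sleep(delay)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(attempt, texts))


# Fake check function that fails once per text, to exercise the retry path.
calls = {}


def flaky(text):
    calls[text] = calls.get(text, 0) + 1
    if calls[text] == 1:
        raise OSError("simulated connection reset")
    return text.upper()


results = check_many(flaky, ["a", "b", "c"], max_workers=2)
```

With the real library, check would be tool.check; the right max_workers value depends on how many parallel requests the LanguageTool server can handle, which is something to measure rather than assume.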

Clarify the License

Hi there. Thanks for taking over maintenance of this, it looks quite useful.

The previous fork is/was licensed under the LGPL.

I see here the code seems to now say the repository is licensed under the MIT license with no mention of LGPL.

I do see some code (probably quite a bit) that remains from the original repository and which therefore in those sections seems to be LGPL licensed.

Can you clarify whether you have permission from the original authors to relicense the code under the MIT license or whether in fact the code remains under the LGPL?

Thanks!

Update to latest version

Hi,

I was playing with the package (which is really nice btw! thanks) and I found that some mistakes don't have replacements, although the online tester (https://languagetool.org/fr) does offer replacements for them.

One is about plurals: for wordX wordY, it proposes wordX_plural wordY_plural or wordX_sing wordY_sing.

However, when using the Python package I don't get these replacements (the list is just empty).

Could it come from the online tester running a more recent version (5.4) than the package (5.1)?
If yes, could you please update it?

Otherwise, could you point me in the right direction to get this type of correction in the Python package?

Thanks a lot,

Have a great day.

close java after task completed

I have many defunct java processes when running a task in a loop, which cause CPU usage and memory issues. Is there any method to close java after the task is completed?

58802 ? Z 0:22 [java]
58892 ? Z 0:22 [java]
58999 ? Z 0:24 [java]
59127 ? Z 0:37 [java]
59234 ? Z 0:32 [java]
59303 ? Z 0:29 [java]
59441 ? Z 0:36 [java]

error executing :: language_tool_python.correct(text, res)


AttributeError Traceback (most recent call last)
in ()
----> 1 language_tool_python.correct(text, res)

AttributeError: module 'language_tool_python' has no attribute 'correct'

matches are found but for this line I am getting error

language_tool_python.utils.ServerError: Server running; don't start a server here

When I try to use this tool to correct a sentence, the following error occurs when I execute the command tool = language_tool_python.LanguageTool('en-US'):

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/zxk/.conda/envs/clean_label/lib/python3.7/site-packages/language_tool_python/server.py", line 55, in __init__
    self._start_server_on_free_port()
  File "/home/zxk/.conda/envs/clean_label/lib/python3.7/site-packages/language_tool_python/server.py", line 225, in _start_server_on_free_port
    self._start_local_server()
  File "/home/zxk/.conda/envs/clean_label/lib/python3.7/site-packages/language_tool_python/server.py", line 283, in _start_local_server
    raise ServerError('Server running; don\'t start a server here.')
language_tool_python.utils.ServerError: Server running; don't start a server here.

How could this happen?

List of all supported languages?

Hi, sorry for my question, but I would like a full list of all languages supported by this tool, including the "id"(s) for each language.
For example, I imagine that for English there are "en-US", "en-GB", etc. I want to know, for instance, the id for Italian: ITALIAN -> "it".
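
A LanguageTool server exposes its language list at the /v2/languages endpoint as JSON, so the ids can be read from there. Below is a rough sketch of turning such a response into a name-to-ids mapping. codes_by_name is a hypothetical helper, and the sample payload is an illustrative excerpt written by hand, not real server output.

```python
import json


def codes_by_name(payload):
    """Map each language name to the list of its long codes."""
    result = {}
    for entry in json.loads(payload):
        result.setdefault(entry["name"], []).append(entry["longCode"])
    return result


# Illustrative sample only; a real response lists many more languages.
sample = json.dumps([
    {"name": "Italian", "code": "it", "longCode": "it"},
    {"name": "English (US)", "code": "en", "longCode": "en-US"},
    {"name": "English (GB)", "code": "en", "longCode": "en-GB"},
])
languages = codes_by_name(sample)
```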

Wrong detection of Java version on Mac

Hi
I have the following Java version:

java version "1.8.0_112"
Java(TM) SE Runtime Environment (build 1.8.0_112-b16)
Java HotSpot(TM) 64-Bit Server VM (build 25.112-b16, mixed mode)

When I tried to use language_tool_python, I got the following error:

Traceback (most recent call last):
  --- removed my function calls ---
  File "/Users/yfyodorov/.pyenv/versions/anaconda3-4.3.1/lib/python3.6/site-packages/language_tool_python/server.py", line 37, in __init__
    self._start_server_on_free_port()
  File "/Users/yfyodorov/.pyenv/versions/anaconda3-4.3.1/lib/python3.6/site-packages/language_tool_python/server.py", line 183, in _start_server_on_free_port
    cls._start_local_server()
  File "/Users/yfyodorov/.pyenv/versions/anaconda3-4.3.1/lib/python3.6/site-packages/language_tool_python/server.py", line 194, in _start_local_server
    download_lt()
  File "/Users/yfyodorov/.pyenv/versions/anaconda3-4.3.1/lib/python3.6/site-packages/language_tool_python/download_lt.py", line 133, in download_lt
    confirm_java_compatibility()
  File "/Users/yfyodorov/.pyenv/versions/anaconda3-4.3.1/lib/python3.6/site-packages/language_tool_python/download_lt.py", line 74, in confirm_java_compatibility
    raise SystemError('Detected java {}.{}. LanguageTool requires Java >= 8.'.format(major_version, minor_version))
SystemError: Detected java 1.8. LanguageTool requires Java >= 8.

From what I've seen in the code, parse_java_version determined the major version as 1 and the minor version as 8, instead of 8 and 0_112 (I suppose), which caused confirm_java_compatibility to throw an error.
I've tweaked the check to look at the minor version for now, and it seems to be working.

It seems the version parser needs to be fixed to parse java version "1.8.0_112" correctly.
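
A minimal sketch of how such a parser could treat the legacy numbering, where "1.8.0_112" means Java 8, alongside the newer scheme, where "11.0.2" means Java 11. The function name parse_java_major is hypothetical and this is not the library's own implementation.

```python
import re


def parse_java_major(version_output):
    """Return the Java major version from `java -version` output.

    Handles both the legacy scheme ("1.8.0_112" means Java 8) and the
    newer scheme ("11.0.2" means Java 11).
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError('no version string found in: {!r}'.format(version_output))
    first = int(match.group(1))
    second = int(match.group(2)) if match.group(2) else 0
    # Up to Java 8 the version string starts with "1.", and the real
    # major version is the second component.
    return second if first == 1 else first
```

With this approach the compatibility check only has to compare a single integer against 8.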

NameError: name 'ElementTree' is not defined

I tried importing ElementTree and it's still not found. I can't find the dependency import in your code either. The language-check package works just fine, so I'm not sure what the problem is here if this is a fork.

tool = language_tool_python.LanguageTool('en-US')

Traceback (most recent call last):

  File "<ipython-input-121-a8b75bd4f267>", line 1, in <module>
    tool = language_tool_python.LanguageTool('en-US')

  File "/Users/xxx/anaconda3/lib/python3.7/site-packages/language_tool_python/server.py", line 44, in __init__
    self._start_server_on_free_port()

  File "/Users/xxx/anaconda3/lib/python3.7/site-packages/language_tool_python/server.py", line 188, in _start_server_on_free_port
    self._start_local_server()

  File "/Users/xxx/anaconda3/lib/python3.7/site-packages/language_tool_python/server.py", line 252, in _start_local_server
    tree = ElementTree.parse(f)

NameError: name 'ElementTree' is not defined

Extracting language from source code

Citing myint/language-check#29, since you asked me to maybe port that issue to your fork:

Cfr. vim-syntastic/syntastic/issues/1918

When using syntastic with language-check on source files, program code is obviously parsed as language, and obviously has a lot of grammatical mistakes.

I know this is a big one, but would it be interesting to add some experimental support to language-check to parse source files? This would be especially useful for LaTeX, as that contains a lot of text. I also think it is useful to parse eg. C, C++ and Java code, and check comments for grammatical and spelling mistakes.

What do you think?

Now, your README doesn't state anything about syntastic anymore, so question one would be:

  • Does your fork support syntastic/would it work with something like syntastic? Do you have any interest in Vim-related support at all?

If so, a second:

  • A lot has happened since November 2016, including the world's transition to LSP language servers instead of individual checkers such as the ones syntastic uses. I'm not sure how a language and spell checker would integrate with such a world, but it would be awesome to check e.g. comments in source code, or text in LaTeX documents. So, how would this integrate?

@myint had a good comment in the original issue about this, I'd love to read how they manage these things in the more modern 2020's. It's been four years, after all!

preloading 'En-US' model in a docker build?

Hi,

First I want to say thanks for creating this great tool. It's awesome!

I have an issue using the tool in a docker container on google cloud. I need to minimise runtime as much as possible, so I'd like to preload the model I need in the build phase rather than downloading it at the point of calling tool = language_tool_python.LanguageTool('en-US'). That way the program won't be waiting for the model to download.

I tried locating the download link in the GCP logs, and manually doing that step in the dockerfile like this:

ENV VERSION 5.0
RUN apt-get install -y wget && \
  apt-get update && apt-get install -y unzip && \
  wget https://www.languagetool.org/download/LanguageTool-$VERSION.zip && \
  unzip LanguageTool-$VERSION.zip && \
  rm LanguageTool-$VERSION.zip

The file downloaded, but when I called the tool command it started downloading again.

I'm no docker expert, but if there is a way to pre-download the model along with the tool I'd love to know how. Thanks!

AttributeError: __enter__

Hi,

I recently encountered an AttributeError when trying to load the model using :

tool = language_tool_python.LanguageTool('fr')
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_9928/2112106970.py in <module>
      1 import language_tool_python
      2 
----> 3 tool = language_tool_python.LanguageTool('fr')

c:\users\abel\appdata\local\programs\python\python38\lib\site-packages\language_tool_python\server.py in __init__(self, language, motherTongue, remote_server, newSpellings, new_spellings_persist)
     60             self._new_spellings = newSpellings
     61             self._register_spellings(self._new_spellings)
---> 62         self._language = LanguageTag(language, self._get_languages())
     63         self.motherTongue = motherTongue
     64         self.disabled_rules = set()

c:\users\abel\appdata\local\programs\python\python38\lib\site-packages\language_tool_python\server.py in _get_languages(self)
    187         url = urllib.parse.urljoin(self._url, 'languages')
    188         languages = set()
--> 189         for e in self._query_server(url, num_tries=1):
    190             languages.add(e.get('code'))
    191             languages.add(e.get('longCode'))

c:\users\abel\appdata\local\programs\python\python38\lib\site-packages\language_tool_python\server.py in _query_server(self, url, params, num_tries)
    207         for n in range(num_tries):
    208             try:
--> 209                 with requests.get(url, params=params, timeout=self._TIMEOUT) as response:
    210                     try:
    211                         return response.json()

AttributeError: __enter__

A few days ago everything was working perfectly.

Do you have any idea where the problem could come from?

Thanks in advance.

Invalid syntax at server.py

When I run server.py, I get an error:

Traceback (most recent call last):
  File "language.py", line 1, in <module>
    import language_tool_python
  File "/home/usr/.local/lib/python3.5/site-packages/language_tool_python/__init__.py", line 8, in <module>
    from .server import LanguageTool, LanguageToolPublicAPI
  File "/home/usr/.local/lib/python3.5/site-packages/language_tool_python/server.py", line 135
    raise FileNotFoundError(f"Failed to find the spellings file at {spelling_file_path}\n"
    ^
SyntaxError: invalid syntax

The error goes away when I remove the "f" from both lines, FileNotFoundError(f"Failed to find the spellings file at {spelling_file_path}\n" and print(f"Updated the spellings at {spelling_file_path}"). The current syntax is FileNotFoundError(f"...") and print(f"..."), which leads to the syntax error.

Python version: 3.5
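
For context, f-strings were introduced in Python 3.6, so Python 3.5 rejects them at parse time. Removing only the "f" prefix avoids the SyntaxError but leaves the braces as literal text in the message. A small sketch of an equivalent that also works on 3.5, using str.format; the helper name is hypothetical and the message is shortened from the one in the traceback.

```python
def spellings_not_found_message(spelling_file_path):
    # str.format works on Python 3.5 and earlier, unlike an f-string.
    return 'Failed to find the spellings file at {}'.format(spelling_file_path)


# Dropping only the "f" prefix keeps the placeholder as literal text:
unformatted = 'Failed to find the spellings file at {spelling_file_path}'
```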

'Match' object has no attribute 'length'

Unable to iterate through the matches list and print the objects (or print an individual match) because the length attribute is missing.

Input
m = tool.check('All of the deaths have occurred in Washington state')
print (m[0])

Output:

AttributeError Traceback (most recent call last)
in
      1 m = tool.check('All of the deaths have occurred in Washington state')
----> 2 print (m[0])

/env/lib/python3.8/site-packages/language_tool_python/match.py in __str__(self)
     76         ruleId = self.ruleId
     77         s = 'Offset {}, length {}, Rule ID: {}'.format(
---> 78             self.offset, self.length, ruleId)
     79         if self.message:
     80             s += '\nMessage: {}'.format(self.message)

/env/lib/python3.8/site-packages/language_tool_python/match.py in __getattr__(self, name)
    104     def __getattr__(self, name):
    105         if name not in get_match_ordered_dict():
--> 106             raise AttributeError('{!r} object has no attribute {!r}'
    107                                  .format(self.__class__.__name__, name))

AttributeError: 'Match' object has no attribute 'length'

Separate client from server

For some distributed computations, the client must be instantiated separately from the server. It must be created within a separate environment since the server cannot be serialized.

Language Tool Python lumps the client and server into a single object so that when the LanguageTool object is created, the server starts. The same object is used to perform queries. This design precludes the distributed computations mentioned above.

Consider the design that Stanford NLP uses, which is to run a server in a separate process and create a client within any Python process.
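
A sketch of what such a separate client could look like, based on the curl call shown in another report on this page (form fields text and language posted to /v2/check). The class name CheckClient is hypothetical, and the code assumes the server's JSON response carries its results under a "matches" key. The object stores only two strings, so nothing tied to the server process has to be serialized.

```python
import json
import urllib.parse
import urllib.request


class CheckClient:
    """Hypothetical thin client that stores only a URL and a language code."""

    def __init__(self, base_url, language):
        self.base_url = base_url.rstrip('/')
        self.language = language

    def build_request(self, text):
        # Form-encode the same fields the curl example sends.
        data = urllib.parse.urlencode({'text': text, 'language': self.language})
        return urllib.request.Request(self.base_url + '/v2/check', data=data.encode('utf-8'))

    def check(self, text, timeout=30):
        # Requires a LanguageTool server already listening at base_url.
        with urllib.request.urlopen(self.build_request(text), timeout=timeout) as response:
            return json.loads(response.read().decode('utf-8')).get('matches', [])
```

The README's remote_server argument covers a similar case from inside the library: a LanguageTool object pointed at an already running server does not need to start one itself.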

Adding full stops

Hey, I was just wondering if there is a way to automatically add full stops? (.)

Execution time is too high

The execution time of matches = tool.check(text) is too high.
I am checking a string that contains 100 characters, but it takes 25-30 seconds to execute.
Is this normal behavior, or can we improve this?

spell checker displays output then gets stuck in IDLE

This is my code. I am adding bullets wherever there is a new line and then trying to spell-check the text stored in res:

import language_tool_python
#tool = language_tool_python.LanguageTool('en-US', config={ 'cacheSize': 1000, 'pipelineCaching': True })

text = 'A sentence with a error in the Hitchhiker’s Guide tot he Galaxy\n. Bet there is something here.\n Bests goats are in town.'
lines = text.split('\n\n' or '\n')
lines = [ "• " + para for para in lines]
res = "\n\n".join(lines)
with language_tool_python.LanguageTool('en-US', config={ 'cacheSize': 1000, 'pipelineCaching': True }) as tool:
    # do stuff with `tool`
    result = tool.correct(res)
    print(result)

I get the corrected text as output, but then it just freezes, with no error and no other output. I tried calling close explicitly and using a with statement as well, to no avail.
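
A side note on the snippet, separate from the freeze: the expression '\n\n' or '\n' evaluates to '\n\n', so text.split(...) there only splits on blank lines and never on single newlines. A minimal sketch of splitting on one or more newlines instead, using re.split; the helper name bullet_lines is made up for illustration.

```python
import re


def bullet_lines(text):
    """Prefix each non-empty line of text with a bullet."""
    # r'\n+' matches a run of one or more newlines, which covers both
    # single line breaks and blank-line paragraph breaks.
    parts = [part.strip() for part in re.split(r'\n+', text)]
    return ['• ' + part for part in parts if part]
```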

Rate limiting details

Hello!
I found your library very cool and easy to use.
But I have one question: the documentation says that the Public HTTP Proofreading API is rate limited.
However, when I tried to check this by running the following code, I didn't hit any limits or errors.

import language_tool_python
tool = language_tool_python.LanguageToolPublicAPI('es')
for _ in range(10000):
    matches = tool.check(text)
    print(matches)

Can you help me with this? I just want to understand why this happened.

Please add tqdm as dependency

I was trying to get language_tool_python working under ArchLinux but I had to install python-tqdm first, otherwise importing failed.

By the way, I have also created a package in the AUR for language_tool_python, because I have now integrated it as an optional spellchecker in Manuskript and needed a convenient way to install it. Keep up the good work. ^^

Detection of your/you're is extremely poor

With the following code:

import language_tool_python
language = language_tool_python.LanguageTool("en-UK")

resultsOne = language.check("If you think this sentence is fine then, your wrong.")
resultsTwo = language.check("You're mum is called Emily, is that right?")

print(f"resultsOne:{resultsOne}\nresultsTwo:{resultsTwo}")

The output is:

resultsOne: []
resultsTwo: []

These are very obvious grammar mistakes, and yet the bot isn't picking them up for some reason.

Error

import language_tool_python
tool = language_tool_python.LanguageTool('en-US')

Error:

Traceback (most recent call last):
  File "/Users/petar.ulev/Documents/competence-score/main.py", line 2, in <module>
    tool = language_tool_python.LanguageTool('en-US')
  File "/Users/petar.ulev/Documents/competence-score/venv/lib/python3.8/site-packages/language_tool_python/server.py", line 62, in __init__
    self._start_server_on_free_port()
  File "/Users/petar.ulev/Documents/competence-score/venv/lib/python3.8/site-packages/language_tool_python/server.py", line 238, in _start_server_on_free_port
    self._start_local_server()
  File "/Users/petar.ulev/Documents/competence-score/venv/lib/python3.8/site-packages/language_tool_python/server.py", line 248, in _start_local_server
    download_lt()
  File "/Users/petar.ulev/Documents/competence-score/venv/lib/python3.8/site-packages/language_tool_python/download_lt.py", line 144, in download_lt
    confirm_java_compatibility()
  File "/Users/petar.ulev/Documents/competence-score/venv/lib/python3.8/site-packages/language_tool_python/download_lt.py", line 77, in confirm_java_compatibility
    output = subprocess.check_output([java_path, '-version'],
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/subprocess.py", line 415, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/usr/bin/java', '-version']' returned non-zero exit status 1.
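
The last frame shows that running java -version itself failed, before the library got as far as reading a version. A small diagnostic sketch that runs the same command and captures what java printed, which can help tell a missing runtime apart from a broken installation; the function name java_status is hypothetical.

```python
import subprocess


def java_status(java_path='java'):
    """Run `java -version` and return (ok, output_or_error_text)."""
    try:
        completed = subprocess.run(
            [java_path, '-version'],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # java prints its version to stderr
            universal_newlines=True,
        )
    except OSError as error:
        # The executable is missing or cannot be started at all.
        return False, str(error)
    return completed.returncode == 0, completed.stdout
```

Printing the second element of java_status('/usr/bin/java') should show the message behind the non-zero exit status.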

java.lang.NoClassDefFoundError

Hi,

I just installed the package using pip. I can use the public server via language_tool_python.LanguageToolPublicAPI('fr').

But when I try to use check with a local server, it fails:

matches = tool.check(text)
Traceback (most recent call last):

  Input In [3] in <module>
    matches = tool.check(text)

  File ~/.local/lib/python3.8/site-packages/language_tool_python/server.py:129 in check
    response = self._query_server(url, self._create_params(text))

  File ~/.local/lib/python3.8/site-packages/language_tool_python/server.py:220 in _query_server
    return response.json()

  File ~/.local/lib/python3.8/site-packages/requests/models.py:899 in json
    return complexjson.loads(

  File ~/.local/lib/python3.8/site-packages/simplejson/__init__.py:525 in loads
    return _default_decoder.decode(s)

  File ~/.local/lib/python3.8/site-packages/simplejson/decoder.py:370 in decode
    obj, end = self.raw_decode(s)

  File ~/.local/lib/python3.8/site-packages/simplejson/decoder.py:400 in raw_decode
    return self.scan_once(s, idx=_w(s, idx).end())

JSONDecodeError: Expecting value

And if I try from bash, I get:

$ curl -d "text=This is an test." -d "language=en-US" http://127.0.0.1:8082/v2/check
Error: Internal Error: java.lang.NoClassDefFoundError: Could not initialize class org.languagetool.gui.Configuration, detected: en-U

Code I'm using to start the server and check the text:

import language_tool_python

tool = language_tool_python.LanguageTool('en-US')
text = 'A sentence with a error in the Hitchhiker’s Guide tot he Galaxy'
matches = tool.check(text)

Initial download is extremely slow and times out in IDLE

When run in IDLE for the first time (at least on Windows), the tool floods the IDLE shell with thousands of download progress messages, and the download gradually grinds to a halt until it slows to several bytes per second and eventually fails.

It works flawlessly in cmd/terminal, but it would be nice if it worked correctly in IDLE as well. I realize that IDLE has a lot of issues and may or may not be worth fully supporting, but currently there's not even any indication to the user about what the problem is.
