Comments (4)
Hi.
I am also seeing this bug when using gensim, which relies on smart_open to open gzip-compressed files. Here is a minimal reproduction of the issue:
➜ /tmp virtualenv venv
New python executable in venv/bin/python
Installing setuptools, pip...done.
➜ /tmp source venv/bin/activate
(venv)➜ /tmp pip install -U pip smart_open
You are using pip version 6.0.8, however version 9.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
Collecting pip from https://pypi.python.org/packages/b6/ac/7015eb97dc749283ffdec1c3a88ddb8ae03b8fad0f0e611408f196358da3/pip-9.0.1-py2.py3-none-any.whl#md5=297dbd16ef53bcef0447d245815f5144
Using cached pip-9.0.1-py2.py3-none-any.whl
Collecting smart-open
Using cached smart_open-1.5.0.tar.gz
Collecting boto>=2.32 (from smart-open)
Using cached boto-2.46.1-py2.py3-none-any.whl
Collecting bz2file (from smart-open)
Using cached bz2file-0.98.tar.gz
Collecting requests (from smart-open)
Using cached requests-2.13.0-py2.py3-none-any.whl
Installing collected packages: requests, bz2file, boto, smart-open, pip
Running setup.py install for bz2file
Running setup.py install for smart-open
Found existing installation: pip 6.0.8
Uninstalling pip-6.0.8:
Successfully uninstalled pip-6.0.8
Successfully installed boto-2.46.1 bz2file-0.98 pip-9.0.1 requests-2.13.0 smart-open-1.5.0
(venv)➜ /tmp
(venv)➜ /tmp echo 'test text' | gzip > test_text.gz
(venv)➜ /tmp zcat test_text.gz
test text
(venv)➜ /tmp python
Python 2.7.6 (default, Jun 22 2015, 17:58:13)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import smart_open
>>> fname = './test_text.gz'
>>> fd = smart_open.smart_open(fname)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/venv/local/lib/python2.7/site-packages/smart_open/smart_open_lib.py", line 138, in smart_open
    return file_smart_open(parsed_uri.uri_path, mode)
  File "/tmp/venv/local/lib/python2.7/site-packages/smart_open/smart_open_lib.py", line 642, in file_smart_open
    return compression_wrapper(open(fname, mode), fname, mode)
  File "/tmp/venv/local/lib/python2.7/site-packages/smart_open/smart_open_lib.py", line 630, in compression_wrapper
    return make_closing(GzipFile)(file_obj, mode)
  File "/usr/lib/python2.7/gzip.py", line 94, in __init__
    fileobj = self.myfileobj = __builtin__.open(filename, mode or 'rb')
TypeError: coercing to Unicode: need string or buffer, file found
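Reading the traceback, the likely cause is that the already-open file object is passed to GzipFile positionally, so it lands in the `filename` parameter and gzip then tries to `open()` it again. The sketch below reproduces that with the standard library only and shows the keyword form (`fileobj=`) that wraps an existing file object. It is an illustration of the failure mode, not the actual smart_open code or the eventual fix; the temporary path and variable names are made up for the example.

```python
import gzip
import os
import tempfile

# Create a small gzip file to read back (stand-in for test_text.gz above).
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "test_text.gz")
with gzip.open(path, "wb") as fout:
    fout.write(b"test text\n")

# Failing pattern: the first positional argument of GzipFile is `filename`,
# so passing a file object there makes gzip try to open() the object itself.
raw = open(path, "rb")
try:
    gzip.GzipFile(raw, "rb")
    positional_call_failed = False
except TypeError:
    positional_call_failed = True
finally:
    raw.close()

# Working pattern: hand the open file object over via the `fileobj` keyword.
with open(path, "rb") as raw:
    with gzip.GzipFile(fileobj=raw, mode="rb") as gz:
        content = gz.read()
```

Note that GzipFile does not close a file object supplied through `fileobj`, which is why the outer `with open(...)` is kept in the working pattern.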
from smart_open.
Reproduced. Working on a fix and test to read/write compressed files. In particular, it broke gensim Travis tests.
For completeness, could you paste a code snippet that breaks for you in this version?
CC @robottwo
model = gensim.models.KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin.gz', binary=True)
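For anyone stuck on the affected version, one possible workaround (an assumption, not verified against gensim or smart_open) is to decompress the `.gz` file with the standard library first and pass the uncompressed path to the loader, so the gzip code path is never taken. A minimal sketch, where `decompress_gzip` is a hypothetical helper and the demo file is a small stand-in rather than the real GoogleNews vectors:

```python
import gzip
import os
import shutil
import tempfile


def decompress_gzip(src_path, dst_path):
    """Hypothetical helper: stream-decompress src_path into dst_path."""
    with gzip.open(src_path, "rb") as fin, open(dst_path, "wb") as fout:
        # copyfileobj streams in chunks, so large files are not held in memory.
        shutil.copyfileobj(fin, fout)
    return dst_path


# Demo on a small stand-in file.
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, "vectors.bin.gz")
payload = b"stand-in for binary word2vec data"
with gzip.open(src, "wb") as f:
    f.write(payload)

# Strip the ".gz" suffix to get the output path.
dst = decompress_gzip(src, src[: -len(".gz")])
with open(dst, "rb") as f:
    restored = f.read()
```

The resulting `dst` path would then be passed to the loader in place of the `.gz` file name.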
Thanks for reporting. Fixed in #110 and released in 1.5.1 on PyPI.