Comments (4)
This breaks on Python 2.7 too. pickle.load() does not accept an encoding argument on 2.x.
$ /usr/local/anaconda2/bin/python cliner predict --txt data/examples/ex_doc.txt --out data/predictions --model models/silver.crf --format i2b2
Traceback (most recent call last):
  File "cliner", line 60, in <module>
    main()
  File "cliner", line 52, in main
    predict.main()
  File "/data/CliNER-master/code/predict.py", line 79, in main
    predict(files, args.model, args.output, format=format)
  File "/data/CliNER-master/code/predict.py", line 96, in predict
    model = pickle.load(f,encoding = 'latin1')
TypeError: load() got an unexpected keyword argument 'encoding'
Removing the encoding argument results in a different error (same command):
  File "/data/CliNER-master/code/machine_learning/crf.py", line 181, in predict
    clf_byte = bytearray(clf, 'latin1')
UnicodeDecodeError: 'ascii' codec can't decode byte 0xf4 in position 6: ordinal not in range(128)
I'm not sure if this is a fault with the code or the encoding in the silver.crf pickle.
EDIT:
Temporary workaround. Remove the encoding as above, then on line 181 of code/machine_learning/crf.py remove the encoding from the bytearray() arguments:
before:
clf_byte = bytearray(clf, 'latin1')
after:
clf_byte = bytearray(clf)
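For anyone who needs the same checkout to run on both 2.7 and 3.x, a version check around the load call is one option. This is only a minimal sketch, not the project's fix: load_pickle_compat is a hypothetical helper name, and 'latin1' is assumed because it is the encoding the traceback above shows predict.py passing.

```python
import pickle
import sys


def load_pickle_compat(fileobj):
    """Load a pickle from a binary file object on Python 2 or Python 3.

    Hypothetical helper, not part of CliNER.
    """
    if sys.version_info[0] >= 3:
        # Python 3 only: 'latin1' maps every byte 0-255 to a code point,
        # so 8-bit strings pickled under Python 2 decode without errors.
        return pickle.load(fileobj, encoding='latin1')
    # Python 2: pickle.load() has no 'encoding' keyword at all.
    return pickle.load(fileobj)
```

Usage would be along the lines of opening the model file in binary mode ('rb') and passing the file object to the helper in place of the direct pickle.load() call.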
from cliner.
Same here, the issue is still there.
from cliner.
Changes to model.py, predict.py, and crf.py have been made to address the encoding issues.
from cliner.
@simthyrearch - on Python 3.7 this issue is still there.
(cliner) C:\Users\rgupta98\github\CliNER\code>python
Python 3.7.1 (default, Dec 10 2018, 22:54:23) [MSC v.1915 64 bit (AMD64)] :: Anaconda, Inc. on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>
(cliner) C:\Users\rgupta98\github\CliNER\code>python ..\cliner predict --txt ../data/examples/ex_doc.txt --out ../data/predictions --model ../models/silver.crf --format i2b2
Traceback (most recent call last):
  File "..\cliner", line 60, in <module>
    main()
  File "..\cliner", line 52, in main
    predict.main()
  File "C:\Users\rgupta98\github\CliNER\code\predict.py", line 79, in main
    predict(files, args.model, args.output, format=format)
  File "C:\Users\rgupta98\github\CliNER\code\predict.py", line 96, in predict
    model = pickle.load(f, encoding='utf8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf4 in position 6: invalid continuation byte
(cliner) C:\Users\rgupta98\github\CliNER\code>python ..\cliner predict --txt ../data/examples/ex_doc.txt --out ../data/predictions --model ../models/silver.crf --format i2b2
C:\Users\rgupta98\AppData\Local\Continuum\anaconda3\envs\cliner\lib\site-packages\sklearn\base.py:251: UserWarning: Trying to unpickle estimator DictVectorizer from version 0.19.1 when using version 0.20.1. This might lead to breaking code or invalid results. Use at your own risk.
  UserWarning)
1 of 1
../data/examples/ex_doc.txt
vectorizing words all
predicting labels all
Traceback (most recent call last):
  File "..\cliner", line 60, in <module>
    main()
  File "..\cliner", line 52, in main
    predict.main()
  File "C:\Users\rgupta98\github\CliNER\code\predict.py", line 79, in main
    predict(files, args.model, args.output, format=format)
  File "C:\Users\rgupta98\github\CliNER\code\predict.py", line 179, in predict
    labels = model.predict_classes_from_document(note)
  File "C:\Users\rgupta98\github\CliNER\code\model.py", line 282, in predict_classes_from_document
    return self.predict_classes(tokenized_sents)
  File "C:\Users\rgupta98\github\CliNER\code\model.py", line 313, in predict_classes
    hyperparams = hyperparams)
  File "C:\Users\rgupta98\github\CliNER\code\model.py", line 707, in generic_predict
    predictions = crf.predict(clf, X)
  File "C:\Users\rgupta98\github\CliNER\code\machine_learning\crf.py", line 181, in predict
    clf_byte = bytearray(clf)
TypeError: string argument without an encoding
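The failure on line 181 looks like a str-versus-bytes mismatch: on Python 3, bytearray() refuses a text string unless an encoding is given, while bytes need none. One way to handle both cases is to branch on the type of clf. This is a rough sketch under assumptions: to_bytearray is a hypothetical helper name, and 'latin1' is assumed as the encoding because it is the one used elsewhere in this thread.

```python
def to_bytearray(clf, encoding='latin1'):
    """Return clf as a bytearray whether it is bytes or a text string.

    Hypothetical helper, not part of CliNER.
    """
    if isinstance(clf, (bytes, bytearray)):
        # Already raw bytes (this also covers a Python 2 str):
        # no encoding argument is needed or allowed.
        return bytearray(clf)
    # Text string: encode it explicitly. 'latin1' maps code points
    # 0-255 one-to-one onto byte values.
    return bytearray(clf, encoding)


as_text = to_bytearray(u'\xf4abc')    # text input
as_bytes = to_bytearray(b'\xf4abc')   # bytes input
```

Both calls give the same four bytes, so the line in crf.py could pass clf through such a helper instead of calling bytearray() directly.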
from cliner.