Comments (4)
Thanks for the issue: I just saw that a few chars are missing from my new implementation of the Unicode table.
I will include them as soon as possible.
I also need to update the README and the documentation after the big refactoring that ended with the 0.6.1 release (yesterday). For example, you should no longer write `t.type != "non-bo"`, but `t.chunk_type != "NON_WORD"`. All the available variables are now conveniently hard-coded in this file.
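As a rough illustration of the renamed attribute, here is a minimal sketch of the new-style filter. The `Token` class below is a hypothetical stand-in that only models `text` and `chunk_type`; it is not the library's own class. The `keep_words` helper and the `"SOME_OTHER_TYPE"` label are also made up for the example. Only the `chunk_type != "NON_WORD"` comparison comes from this thread.

```python
from dataclasses import dataclass


# Hypothetical stand-in for a token object: only the two attributes
# needed for the example are modelled here.
@dataclass
class Token:
    text: str
    chunk_type: str


def keep_words(tokens):
    """Keep tokens whose chunk_type is not "NON_WORD".

    Old syntax (before 0.6.1): t.type != "non-bo"
    New syntax:                t.chunk_type != "NON_WORD"
    """
    return [t for t in tokens if t.chunk_type != "NON_WORD"]


# "SOME_OTHER_TYPE" is a placeholder label, not a real library value.
tokens = [
    Token("བཀྲ་ཤིས་", "SOME_OTHER_TYPE"),
    Token("?", "NON_WORD"),
    Token("བདེ་ལེགས", "SOME_OTHER_TYPE"),
]

words = keep_words(tokens)
print([t.text for t in words])
```

With real tokens, the same list comprehension should work as long as each token exposes `chunk_type` as described above.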
If you're OK with it, you could send me your pybo-related code so I can update it to the new syntax before our documentation is fully updated. If you'd rather not post it here, send it to me at [email protected]
from botok.
Please check that the missing char now parses as expected, with the 0.6.3 release.
Given the changes in my previous msg, you shouldn't have any problem anymore.
solved:
drupchen@drupchen-Inspiron-5558:~$ pybo string "࿖ བཀྲ་ཤིས་བདེ་ལེགས།།"
Loading Trie... (2s.)
࿖_ བཀྲ་ཤིས་ བདེ་ལེགས །།
Thank you!