aosingh / lexpy
Python package for lexicon; Trie and DAWG implementation.
License: GNU General Public License v3.0
As an update, is there any way to attach some information to a word, for example its word count in a text or its POS tag, and return both the matched word and its value?
Example input:
arc:1200
art:1450
bar:2300
Example output for dawg.search_with_prefix("a"):
[arc:1200, art:1450]
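One way to picture the request is a trie whose end-of-word nodes carry a value. The following is only a minimal sketch under that assumption; `ValueNode`, `ValueTrie`, and their methods are hypothetical names written for illustration and are not part of lexpy's API.

```python
class ValueNode:
    """Hypothetical trie node that can carry a value at the end of a word."""

    def __init__(self):
        self.children = {}
        self.is_word = False
        self.value = None


class ValueTrie:
    """Hypothetical trie mapping words to values (not lexpy's implementation)."""

    def __init__(self):
        self.root = ValueNode()

    def add(self, word, value):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, ValueNode())
        node.is_word = True
        node.value = value

    def search_with_prefix(self, prefix):
        # Walk down to the node that represents the prefix.
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        # Collect every (word, value) pair stored below that node.
        results = []
        stack = [(node, prefix)]
        while stack:
            current, word = stack.pop()
            if current.is_word:
                results.append((word, current.value))
            for ch, child in current.children.items():
                stack.append((child, word + ch))
        return sorted(results)


value_trie = ValueTrie()
for word, count in [("arc", 1200), ("art", 1450), ("bar", 2300)]:
    value_trie.add(word, count)

print(value_trie.search_with_prefix("a"))  # [('arc', 1200), ('art', 1450)]
```

Storing values in a DAWG is harder than in a trie, because shared suffix nodes no longer correspond to a single word; this sketch sidesteps that by using a plain trie.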
The following code prints True.
from lexpy.trie import Trie
trie = Trie()
trie.add_all(['ash', 'ashley'])
print('mash lolley' in trie)
It just needs a small change to return False when the letter is not in node.children.
Line 39 in 8535dcb
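The suggested change can be illustrated with a small stand-alone trie. This is a sketch of the idea only: `SimpleNode`, `add_word`, and `contains` are hypothetical helpers, and the actual lexpy code at the referenced line may be structured differently.

```python
class SimpleNode:
    """Hypothetical trie node, used only for this illustration."""

    def __init__(self):
        self.children = {}
        self.eow = False  # True when a word ends at this node


def add_word(root, word):
    node = root
    for letter in word:
        node = node.children.setdefault(letter, SimpleNode())
    node.eow = True


def contains(root, word):
    node = root
    for letter in word:
        # Bail out as soon as a letter has no matching child,
        # instead of skipping it and continuing the walk.
        if letter not in node.children:
            return False
        node = node.children[letter]
    return node.eow


simple_root = SimpleNode()
for w in ["ash", "ashley"]:
    add_word(simple_root, w)

print(contains(simple_root, "mash lolley"))  # False
print(contains(simple_root, "ashley"))       # True
```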
In Lexpy, "?" means "zero or one character" and "*" means "zero or more characters". Based on this, why is the pattern "?*" considered illegal while "*?" is allowed? Don't they both have the same semantics here:
*? : zero or more || zero or one -> zero || zero, zero || one, more || zero, more || one -> zero, one, more -> zero or more
?* : zero or one || zero or more -> zero || zero, zero || more, one || zero, one || more -> zero, more, one -> zero or more
The code at _utils.py#L15 already translates "*?" to "*"; why isn't this also done for "?*"?
Line 51 in b69e029
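Under the semantics described above, any run of wildcards that contains at least one "*" matches "zero or more characters", so both orders could be collapsed the same way. The following is a sketch of such a normalization; `normalize_wildcards` is a hypothetical helper, not the function in lexpy's _utils.py.

```python
import re


def normalize_wildcards(pattern):
    """Collapse every wildcard run containing '*' into a single '*'.

    Assumes '?' means "zero or one character" and '*' means
    "zero or more characters", as stated in the question.
    Runs made only of '?' are left untouched, because '??' (zero to two
    characters) is not equivalent to '*'.
    """
    return re.sub(r"[?*]*\*[?*]*", "*", pattern)


print(normalize_wildcards("a*?b"))   # a*b
print(normalize_wildcards("a?*b"))   # a*b
print(normalize_wildcards("a??b"))   # a??b
```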
Hi,
I wonder if there is a small issue in the file automata.py, function __words_with_wildcard, between lines 128 and 147, when the case letter=='*' is processed.
If the dictionary is made of, for example, "CHIAC" and "CHIC", and the query is "CHI*C", the result will be returned in incorrect alphabetical order: "CHIC", then "CHIAC".
This is because the case words_at_current_level is processed before checking the children.
So, for "CHI*C",
Any idea? Or maybe I misunderstood the code?
Best,
Lionel
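For comparison, here is a sketch of a wildcard search that does return matches in alphabetical order. It is not the code in automata.py: `OrderedNode`, `build_trie`, and `search_sorted` are hypothetical names, and the approach (tracking the set of pattern positions that are still possible while visiting children in sorted order) is swapped in for illustration. It assumes "?" means zero or one character and "*" means zero or more characters.

```python
class OrderedNode:
    """Hypothetical trie node, used only for this illustration."""

    def __init__(self):
        self.children = {}
        self.eow = False


def build_trie(words):
    root = OrderedNode()
    for word in words:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, OrderedNode())
        node.eow = True
    return root


def search_sorted(root, pattern):
    end = len(pattern)

    def closure(positions):
        # Both wildcards may match nothing, so each can also be skipped.
        seen = set(positions)
        pending = list(positions)
        while pending:
            i = pending.pop()
            if i < end and pattern[i] in "?*" and i + 1 not in seen:
                seen.add(i + 1)
                pending.append(i + 1)
        return seen

    results = []

    def walk(node, prefix, positions):
        positions = closure(positions)
        # A word is emitted before any longer word that extends it.
        if node.eow and end in positions:
            results.append(prefix)
        for ch in sorted(node.children):
            nxt = set()
            for i in positions:
                if i == end:
                    continue
                if pattern[i] == "*":
                    nxt.add(i)  # '*' consumes ch and stays active
                elif pattern[i] == "?" or pattern[i] == ch:
                    nxt.add(i + 1)
            if nxt:
                walk(node.children[ch], prefix + ch, nxt)

    walk(root, "", {0})
    return results


ordered_root = build_trie(["CHIC", "CHIAC"])
print(search_sorted(ordered_root, "CHI*C"))  # ['CHIAC', 'CHIC']
```

Because the "zero characters" and "one or more characters" cases are handled together while descending into each child in sorted order, no branch is emitted ahead of an alphabetically smaller one.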
Why is the restriction on special chars in a wildcard pattern needed?
Line 14 in b69e029
Do you think there's a way to implement a search_with_suffix function that looks for words in the DAWG that end with a given suffix? Also, is there a way to search the DAWG for words that contain a substring? For instance, if I wanted words that contain the substring "ST", the function would return "first", "star", and "sophisticated". Thanks!
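Outside of the DAWG itself, one possible approach to suffix search is to index the reversed words, so that a suffix query becomes a prefix query. The sketch below uses a sorted list and the standard `bisect` module rather than a DAWG; `SuffixIndex` and `search_with_substring` are hypothetical names, and the substring search is a plain linear scan.

```python
from bisect import bisect_left


class SuffixIndex:
    """Hypothetical suffix index built from reversed words (not a DAWG)."""

    def __init__(self, words):
        # Sorting the reversed words groups together words that share a suffix.
        self._reversed = sorted(word[::-1] for word in set(words))

    def search_with_suffix(self, suffix):
        key = suffix[::-1]
        i = bisect_left(self._reversed, key)
        matches = []
        # All reversed words starting with the key sit in one contiguous run.
        while i < len(self._reversed) and self._reversed[i].startswith(key):
            matches.append(self._reversed[i][::-1])
            i += 1
        return sorted(matches)


def search_with_substring(words, sub):
    # Linear scan; simple, but O(total characters) per query.
    return sorted(word for word in words if sub in word)


sample_words = ["first", "star", "sophisticated", "bar", "toast"]
suffix_index = SuffixIndex(sample_words)

print(suffix_index.search_with_suffix("st"))      # ['first', 'toast']
print(search_with_substring(sample_words, "st"))  # ['first', 'sophisticated', 'star', 'toast']
```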
Hi,
There is an issue with your DAWG data structure. The maximum node ID that it can reach is 2. I tried printing the node ID and value while inserting, and here is the output I got:
id and val at current is 1
id and val at current is 2 a
id and val at current is 2 n
id and val at current is 2 p
id and val at current is 2 e
id and val at current is 2 p
id and val at current is 2 l
id and val at current is 2 e
id and val at current is 2 b
id and val at current is 2 a
id and val at current is 2 n
id and val at current is 2 a
id and val at current is 2 n
id and val at current is 2 a
id and val at current is 2 t
id and val at current is 2 c
id and val at current is 2 a
id and val at current is 2 n
id and val at current is 2 a
id and val at current is 2 n
id and val at current is 2 a
It is not creating a new child ID after the first child of the root.