Comments (5)
That sounds pretty strange. Maybe a place to start would be some more details:
- What do you mean by "fail"?
- Are you getting an exception or does the result differ from expectations?
- Does this happen consistently for a specific word? Is it possible to create a minimum reproduction by using a specific word with a specific TimeLimit value?
- Is the difference with Check or Suggest?
- For the machines where the results don't meet expectations, does it always not meet expectations or is the result intermittent even on those machines?
My guess is you are running into the time limits but you mentioned you tinkered with the time limits already. The design of Hunspell has, in my opinion, an awkward timing mechanism to prevent overuse of CPU resources. You might have tried this already, but increasing the MinTimer to a larger value and increasing the TimeLimit values might help ensure you get more consistent results during testing. See:
WeCantSpell.Hunspell/WeCantSpell.Hunspell/QueryOptions.cs
Lines 62 to 85 in 17e7024
from wecantspell.hunspell.
> That sounds pretty strange. Maybe a place to start would be some more details:
> - What do you mean by "fail"?
I mean that it fails to use the right algorithm to give an answer.
> - Are you getting an exception or does the result differ from expectations?
Different result; it seems to be using the second algorithm.
> - Does this happen consistently for a specific word? Is it possible to create a minimum reproduction by using a specific word with a specific TimeLimit value?
Yes, our tests use single and multiple words (a phrase), but always the same ones, so that we know the expected result. We have played around with the TimeLimit value, all the way from 1 ms (to force the second algorithm) to 3000 ms (12 times the base time limit, to be sure the first one is solving the query). That's how we realized that this issue happens on some machines, and only on those.
I'm not sure if you're asking for minimal reproduction code; if that's the case, please tell me and I'll gladly write it.
> - Is the difference with Check or Suggest?
To be precise, for multi-word input I run a Check on each word first and Suggest only on those not present in the dictionary. That part works fine, so I haven't checked the inner workings of Check. Does it use the same timed algorithm-switching mechanism?
> - For the machines where the results don't meet expectations, does it always not meet expectations or is the result intermittent even on those machines?
Seems to be consistent on those machines. I will do a deeper check, though.
> My guess is you are running into the time limits but you mentioned you tinkered with the time limits already. The design of Hunspell has, in my opinion, an awkward timing mechanism to prevent overuse of CPU resources. You might have tried this already, but increasing the MinTimer to a larger value and increasing the TimeLimit values might help ensure you get more consistent results during testing. See:
> WeCantSpell.Hunspell/WeCantSpell.Hunspell/QueryOptions.cs
> Lines 62 to 85 in 17e7024
As I said, I've played quite a bit with the different time configurations available, and nothing changed.
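For reference, the check-then-suggest flow described in this comment could be sketched roughly as below. This is only an illustration: `PhraseChecker`, `SuggestForMisspelled`, and the `check` / `suggest` delegates are hypothetical names, with the delegates standing in for the spell checker's own Check and Suggest calls so that the sketch does not assume any particular library signature.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: not part of WeCantSpell.Hunspell.
public static class PhraseChecker
{
    // Splits a phrase on spaces, runs `check` on every word, and calls
    // `suggest` only for the words that `check` rejects.
    public static Dictionary<string, List<string>> SuggestForMisspelled(
        string phrase,
        Func<string, bool> check,
        Func<string, IEnumerable<string>> suggest)
    {
        var result = new Dictionary<string, List<string>>();
        var words = phrase.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        foreach (var word in words)
        {
            // Skip correctly spelled words and words already handled.
            if (check(word) || result.ContainsKey(word))
            {
                continue;
            }

            result[word] = suggest(word).ToList();
        }

        return result;
    }
}
```

With a real dictionary, the two delegates would simply forward to the loaded word list, so only the Suggest path is exposed to the time limits being discussed.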
It sounds like you are saying there is a timing issue and something in the first part of some algorithm is going too slow on some machines, which prevents the following part of Suggest from returning results. If I got that right, this is starting to make some sense to me. To debug this, you could create a test case for your specific words and dictionaries to see if you can find specifically which code in the codebase is returning results and which code is not being executed. Setting breakpoints on or around opLimiter usages might reveal which specific code is going slow for your specific words and dictionary.
Let me explain a bit better.
In #40 (comment) you said that at some point in the code, the suggestion algorithm switches from the one it begins with (MapRelated) to NGram, which I call the first and second algorithms, respectively.
Now, the issue is that, on some machines, even with a long TimeLimit, I'm getting the same results as when I use a TimeLimit of 1 (to force NGram use internally). The only thing those machines have in common is the processor family, as stated.
I will try debugging opLimiter and let you know the results.
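The symptom described above (a long limit producing the same output as a forced 1 ms limit) could be checked mechanically with something like the following sketch. `FallbackProbe` and both delegates are hypothetical: each delegate is assumed to wrap a Suggest call configured with the short or the long time limit, and nothing here depends on the library's actual signatures.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical test helper: not part of WeCantSpell.Hunspell.
public static class FallbackProbe
{
    // Returns the words whose suggestions under the long limit are identical
    // (same items, same order) to those under the forced-short limit.
    public static List<string> WordsThatLookLikeFallback(
        IEnumerable<string> words,
        Func<string, IEnumerable<string>> suggestShortLimit,
        Func<string, IEnumerable<string>> suggestLongLimit)
    {
        var suspicious = new List<string>();

        foreach (var word in words)
        {
            var shortResult = suggestShortLimit(word).ToList();
            var longResult = suggestLongLimit(word).ToList();

            if (shortResult.SequenceEqual(longResult))
            {
                suspicious.Add(word);
            }
        }

        return suspicious;
    }
}
```

This is only meaningful for test words whose results are known to differ between the two limits on a machine that behaves as expected; for other words, identical output proves nothing.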
@ADD-eNavarro, I made a new release that might fix your issue. Give it a try and let me know. I was previously using Environment.TickCount, which wasn't really a great choice. The new release no longer relies on it, so it may behave differently.
https://github.com/aarondandy/WeCantSpell.Hunspell/releases/tag/5.0.0
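As background on why Environment.TickCount can be a poor fit for short time limits: it is a millisecond counter whose resolution is bounded by the system timer, which the .NET documentation describes as typically 10 to 16 milliseconds, and it is a 32-bit value that eventually wraps around. The sketch below shows one common alternative, a time budget built on Stopwatch. It is an assumption for illustration only: `TimeBudget` is a hypothetical class, not the library's internal limiter, and this is not necessarily what release 5.0.0 does.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: not the library's internal limiter.
public sealed class TimeBudget
{
    // Stopwatch.GetTimestamp() reads the high-resolution counter when one is available.
    private readonly long _startTimestamp = Stopwatch.GetTimestamp();
    private readonly long _budgetTicks;

    public TimeBudget(TimeSpan limit)
    {
        // Stopwatch ticks are not TimeSpan ticks; convert using Stopwatch.Frequency
        // (ticks per second), which depends on the machine.
        _budgetTicks = (long)(limit.TotalSeconds * Stopwatch.Frequency);
    }

    // True once at least `limit` has elapsed since construction.
    public bool IsExpired => Stopwatch.GetTimestamp() - _startTimestamp >= _budgetTicks;
}
```

A caller would create one `TimeBudget` per query and poll `IsExpired` inside the expensive loops.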