fuzzy-match's People

Contributors

clementchouteau, guillaumekln, maxwell1447, panosk

fuzzy-match's Issues

Production experience?

Hi, this looks very promising for a project where we would run this on iOS/macOS wrapped in Swift, and also compiled to wasm for web.

Are there any production experiences with this to share, or demos / products which integrate it that we could try out? How does it compare with the popular "fzf" algorithm for instance, or other popular fuzzy match tools? Thank you.

Improve reliability with fuzz testing

This could help to get rid of most types of crashes.

Generate a large index with random entries of varied size:

  • many small entries
  • uppercase/lowercase
  • punctuation

Then fuzz the pattern, using different options.
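
The plan above could be sketched roughly as follows. This is only a minimal illustration, not the project's test harness: `random_entry`, `mutate`, `fuzz_cases`, `run_fuzz` and the `match_fn` callback are hypothetical names introduced here, and how the entries are loaded into an index, or which matcher options are exercised, is left to whatever wrapper is passed in.

```python
import random
import string

# Character pool mixing lowercase, uppercase, digits and punctuation.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def random_entry(rng, max_words=12, max_word_len=8):
    """Build one random entry; small entries are deliberately frequent."""
    # About half of the entries get 1-2 words, so the index holds many small entries.
    n_words = rng.randint(1, 2) if rng.random() < 0.5 else rng.randint(3, max_words)
    words = []
    for _ in range(n_words):
        length = rng.randint(1, max_word_len)
        words.append("".join(rng.choice(ALPHABET) for _ in range(length)))
    return " ".join(words)


def mutate(rng, pattern):
    """Apply one random character edit: insert, delete, replace or case swap."""
    chars = list(pattern)
    # An empty pattern can only grow.
    ops = ["insert", "delete", "replace", "swapcase"] if chars else ["insert"]
    op = rng.choice(ops)
    if op == "insert":
        chars.insert(rng.randint(0, len(chars)), rng.choice(ALPHABET))
    else:
        pos = rng.randrange(len(chars))
        if op == "delete":
            del chars[pos]
        elif op == "replace":
            chars[pos] = rng.choice(ALPHABET)
        else:
            chars[pos] = chars[pos].swapcase()
    return "".join(chars)


def fuzz_cases(seed, n_entries=1000, n_patterns=100, max_edits=3):
    """Return (entries, patterns); the same seed always gives the same cases."""
    rng = random.Random(seed)
    entries = [random_entry(rng) for _ in range(n_entries)]
    patterns = []
    for _ in range(n_patterns):
        # Start from an indexed entry and drift away from it by a few edits.
        pattern = rng.choice(entries)
        for _ in range(rng.randint(0, max_edits)):
            pattern = mutate(rng, pattern)
        patterns.append(pattern)
    return entries, patterns


def run_fuzz(match_fn, seed=0):
    """Feed every pattern to match_fn (a hypothetical wrapper around the matcher)."""
    entries, patterns = fuzz_cases(seed)
    failures = []
    for pattern in patterns:
        try:
            match_fn(entries, pattern)
        except Exception as exc:
            # Keep the offending pattern so the failure can be replayed.
            failures.append((pattern, repr(exc)))
    return failures
```

Seeding the generator makes any failure reproducible from the seed alone. Note that a crash inside native code would not surface as a Python exception, so a wrapper that drives a compiled binary would need to run it in a separate process and check the exit status instead.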

Setup CI

A CI setup would be useful to check that the project compiles and that the tests pass.

Let's try to use GitHub Actions.
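
A workflow along these lines might be a starting point. It is only a sketch under several assumptions: the Ubuntu package names are guesses at what provides Boost, ICU and GTest, the file would live at a path such as .github/workflows/ci.yml, and the test binary location is inferred from a CMake build with a test/ subdirectory producing FuzzyMatch-test. Whether that binary needs extra arguments is not known here.

```yaml
# Hypothetical workflow: build with CMake and run the test binary on every push and pull request.
name: CI
on: [push, pull_request]

jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Assumed package names for the Boost, ICU and GTest dependencies.
      - name: Install dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y cmake g++ libboost-all-dev libicu-dev libgtest-dev

      - name: Configure
        run: cmake -S . -B build

      - name: Build
        run: cmake --build build

      # Assumed location of the test executable inside the build tree.
      - name: Test
        run: ./build/test/FuzzyMatch-test
```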

Incremental index

We use FuzzyMatch-cli for larger data sets, so indexing time matters.
Do you have a plan for when the incremental add feature mentioned in TODO.md will be implemented?

Compiled FuzzyMatch-cli binary segfaults

Hi,
We tried to build and use the utility, but in the end this is all that happened:

FuzzyMatch-cli  -c mycorpus
STEP	Importing TM: mycorpus	ELAPSE	0.005	TOTAL	0.005
STEP	Sorting Index	ELAPSE	0.117	TOTAL	0.123
STEP	Dump: mycorpus.fmi	ELAPSE	0.029	TOTAL	0.152
Segmentation fault

We did not run into any compilation problems, so we have no real idea what could have gone wrong. Here is the short report from our cmake (v. 3.18.4) run:

-- The C compiler identification is GNU 4.8.5
-- The CXX compiler identification is GNU 4.8.5
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Boost: /usr/include (found version "1.57.0") found components: serialization iostreams system regex 
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Check if compiler accepts -pthread
-- Check if compiler accepts -pthread - yes
-- Found Threads: TRUE  
-- Found ICU: /usr/lib64/libicuuc.so;/usr/lib64/libicui18n.so;/usr/lib64/libicudata.so (found version "50.2.0") 
-- Found Boost: /usr/include (found version "1.57.0") found components: program_options 
-- Found GTest: /share/local/src/googletest/build/lib/libgtest.a  
-- Found Boost: /usr/include (found version "1.57.0") found components: filesystem system 
-- Configuring done
-- Generating done
-- Build files have been written to: /share/local/src/fuzzy-match/build

And everything compiles fine.

Scanning dependencies of target FuzzyMatch
[  7%] Building CXX object src/CMakeFiles/FuzzyMatch.dir/fuzzy_match.cc.o
[ 15%] Building CXX object src/CMakeFiles/FuzzyMatch.dir/ngram_matches.cc.o
[ 23%] Building CXX object src/CMakeFiles/FuzzyMatch.dir/suffix_array_index.cc.o
[ 30%] Building CXX object src/CMakeFiles/FuzzyMatch.dir/vocab_indexer.cc.o
[ 38%] Building CXX object src/CMakeFiles/FuzzyMatch.dir/suffix_array.cc.o
[ 46%] Building CXX object src/CMakeFiles/FuzzyMatch.dir/sentence.cc.o
[ 53%] Building CXX object src/CMakeFiles/FuzzyMatch.dir/fuzzy_matcher_binarization.cc.o
[ 61%] Building CXX object src/CMakeFiles/FuzzyMatch.dir/edit_distance.cc.o
[ 69%] Linking CXX shared library libFuzzyMatch.so
[ 69%] Built target FuzzyMatch
Scanning dependencies of target FuzzyMatch-cli
Scanning dependencies of target FuzzyMatch-test
[ 76%] Building CXX object cli/src/CMakeFiles/FuzzyMatch-cli.dir/FuzzyMatch-cli.cc.o
[ 84%] Building CXX object test/CMakeFiles/FuzzyMatch-test.dir/test.cc.o
[ 92%] Linking CXX executable FuzzyMatch-test
[ 92%] Built target FuzzyMatch-test
[100%] Linking CXX executable FuzzyMatch-cli
[100%] Built target FuzzyMatch-cli

Any help is greatly appreciated.
