lua-ahocorasick

This is a Lua binding to libahocorasick, part of MultiFast.

The Aho-Corasick algorithm allows for fast and efficient string matching against large sets of strings.

Usage

new_automata = require "lua-ahocorasick".new

First, build an automaton by adding each word you want to search for.

my_automata = new_automata()

my_automata:add("some")
my_automata:add("strings")
my_automata:add("to")
my_automata:add("search")
my_automata:add("for")
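
When the words already live in a Lua table, the repeated add calls can be folded into a loop. The following is a minimal sketch: add_all is a hypothetical helper name, not part of the binding, and it assumes nothing beyond the add method shown above.

```lua
-- Hypothetical helper (not part of lua-ahocorasick): add every word in a
-- list to an automaton. Relies only on the :add(word) method shown above.
local function add_all(automaton, words)
	for _, word in ipairs(words) do
		automaton:add(word)
	end
	return automaton
end

-- Usage with the binding (requires the compiled module to be installed):
-- local new_automata = require "lua-ahocorasick".new
-- local my_automata = add_all(new_automata(), {"some", "strings", "to", "search", "for"})
```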

The automaton must be finalised before it can be used:

my_automata:finalize()

You can inspect the automaton (prints to stdout):

my_automata:display()

Finding a needle in a haystack

Without a callback, search returns a Boolean indicating whether a match was found.

found = my_automata:search("a string with a word to find")

Getting the matches

If search is called with a callback, the callback is invoked once for each match. Return true from your callback to stop searching.

local str = "A long string with some words in it to find."
did_break = my_automata:search(str, function(start_pos, end_pos, n_matches)
	print("match found at positions "..start_pos.." through "..end_pos..": "..str:sub(start_pos, end_pos))
	return nil -- Continue searching
end, false)

Output:

match found at positions 20 through 23: some
match found at positions 37 through 38: to
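
The callback form can also gather matches into a table, or stop at the first one. The following is a rough sketch: collect_matches and first_match are hypothetical helper names, not part of the binding, and they assume only the search(str, callback) call and the start_pos and end_pos callback arguments shown above.

```lua
-- Hypothetical helper (not part of lua-ahocorasick): gather every match
-- into a table instead of printing it.
local function collect_matches(automaton, str)
	local matches = {}
	automaton:search(str, function(start_pos, end_pos)
		matches[#matches + 1] = {
			start_pos = start_pos,
			end_pos = end_pos,
			text = str:sub(start_pos, end_pos),
		}
		return nil -- Continue searching
	end)
	return matches
end

-- Hypothetical helper: return the text of the first match only.
local function first_match(automaton, str)
	local first
	automaton:search(str, function(start_pos, end_pos)
		first = str:sub(start_pos, end_pos)
		return true -- Stop searching
	end)
	return first
end

-- Usage with a finalized automaton from the binding:
-- for _, m in ipairs(collect_matches(my_automata, str)) do
-- 	print(m.start_pos, m.end_pos, m.text)
-- end
```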

Streaming input

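If you pass true as the third argument, the search continues from where the previous call left off, so a match may span two chunks. Match positions are counted from the start of the stream, not from the start of each chunk.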

print("A")
my_automata:search("a series of stri", print)
print("B")
my_automata:search("ngs broken over multi", print, true)
print("C")
my_automata:search("ple packets in which t", print, true)
print("D")
my_automata:search("o search.", print, true)
print("E")

Output:

A
B
13	19	1
C
D
59	60	1
62	67	1
E
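
The chunk-by-chunk calls above can be wrapped in a loop when the chunks arrive as a list. The following is a minimal sketch: search_chunks is a hypothetical helper name, not part of the binding. It tracks the callback's return value itself, so it makes no assumption about what search returns when given a callback.

```lua
-- Hypothetical helper (not part of lua-ahocorasick): search a sequence of
-- chunks as one continuous stream.
local function search_chunks(automaton, chunks, callback)
	local stopped = false
	-- Wrap the callback so we can tell when it asked to stop.
	local function wrapped(...)
		stopped = callback(...) and true or false
		return stopped
	end
	for i, chunk in ipairs(chunks) do
		-- The first chunk starts a fresh search (false); later chunks
		-- pass true to continue from where the previous call left off.
		automaton:search(chunk, wrapped, i > 1)
		if stopped then
			break
		end
	end
	return stopped
end

-- Usage with a finalized automaton from the binding:
-- search_chunks(my_automata, {"a series of stri", "ngs broken over multi"}, print)
```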

Contributors

james-callahan, mwild1
