gokmp

String-matching in Golang using the Knuth–Morris–Pratt algorithm (KMP).

Disclaimer

This library was written as part of my Master's thesis and is better used as an implementation reference for people interested in the Knuth–Morris–Pratt algorithm than as a high-performance string-searching library.

I believe the compiler has since caught up to most of the gains that this library bought me back in the day.

See Documentation on GoDoc.

Example:

package main

import (
	"fmt"
	"github.com/paddie/gokmp"
)

const str = "aabaabaaaabbaabaabaaabbaabaabb"
//          "        _          _      _   "
//                   8          19     26
const pattern = "aabb"

func main() {
	kmp, err := gokmp.NewKMP(pattern)
	if err != nil {
		fmt.Println(err)
		return
	}
	ints := kmp.FindAllStringIndex(str)

	fmt.Println(ints)
}

Output:

[8 19 26]
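
For readers using this page as a reference for the algorithm itself, here is a minimal, standalone sketch of KMP that reproduces the output above. It is not the library's source: the names `failureTable` and `findAll` are hypothetical, and the sketch reports overlapping matches, which may or may not match what `FindAllStringIndex` does.

```go
package main

import "fmt"

// failureTable returns, for each prefix pattern[:i+1], the length of the
// longest proper prefix of that prefix that is also its suffix.
func failureTable(pattern string) []int {
	table := make([]int, len(pattern))
	k := 0 // length of the current matched border
	for i := 1; i < len(pattern); i++ {
		for k > 0 && pattern[i] != pattern[k] {
			k = table[k-1] // fall back to the next shorter border
		}
		if pattern[i] == pattern[k] {
			k++
		}
		table[i] = k
	}
	return table
}

// findAll returns the byte offsets of every occurrence of pattern in text,
// including overlapping ones. An empty pattern yields no matches.
func findAll(text, pattern string) []int {
	if len(pattern) == 0 {
		return nil
	}
	table := failureTable(pattern)
	var matches []int
	k := 0 // number of pattern bytes currently matched
	for i := 0; i < len(text); i++ {
		for k > 0 && text[i] != pattern[k] {
			k = table[k-1]
		}
		if text[i] == pattern[k] {
			k++
		}
		if k == len(pattern) {
			matches = append(matches, i-k+1)
			k = table[k-1] // keep going so later matches are found too
		}
	}
	return matches
}

func main() {
	fmt.Println(findAll("aabaabaaaabbaabaabaaabbaabaabb", "aabb")) // [8 19 26]
}
```

The text index `i` never moves backwards; on a mismatch only the pattern position `k` falls back via the table, which is what gives KMP its linear running time.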

Tests and Benchmarks:

go test -v -bench .

Output:

=== RUN TestFindAllStringIndex
--- PASS: TestFindAllStringIndex (0.00 seconds)
=== RUN TestFindStringIndex
--- PASS: TestFindStringIndex (0.00 seconds)
=== RUN TestContainedIn
--- PASS: TestContainedIn (0.00 seconds)
=== RUN TestOccurrences
--- PASS: TestOccurrences (0.00 seconds)
=== RUN TestOccurrencesFail
--- PASS: TestOccurrencesFail (0.00 seconds)
PASS
BenchmarkKMPIndexComparison	10000000	       178 ns/op
BenchmarkStringsIndexComparison	10000000	       359 ns/op
ok  	github.com/paddie/gokmp	5.854s

Comparison:

gokmp.FindStringIndex / strings.Index = 178/359 = 0.4958

In this benchmark run, that is almost a 2x improvement over the naive built-in method.

gokmp's People

Contributors

paddie

gokmp's Issues

Why is ContainedIn TEN times slower than strings.Contains?

contains( 100001 times) * str len( 1193 ) run time:
482.834µs
kmp-containedIn( 100001 times) * str len( 1193 ) run time:
376.550657ms

strings.contains:

	str := "735224100,1152888111,1112227112,1113337982,111777642..."
	var temp bool // prevent compiler optimization

	// strings.Contains
	containsCount := 1
	containsRunTime := time.Now()
CONTAINSLOOP:
	if strings.Contains("111111111", str) {
		temp = true
	}

	containsCount++
	if containsCount <= MaxLoop {
		goto CONTAINSLOOP
	}

	fmt.Println("contains(", MaxLoop, "times) * str len(", len(slice), ") run time:\n", time.Since(containsRunTime))

	// kmp.ContainedIn

	kmpCount := 1
	kmpRunTime := time.Now()
	kmp, _ := NewKMP("1111111111")
KMPLOOP:
	if kmp.ContainedIn(str) {
		temp = true
	}

	kmpCount++
	if kmpCount <= MaxLoop {
		goto KMPLOOP
	}
	fmt.Println("kmp-containedIn(", MaxLoop, "times) * str len(", len(slice), ") run time:\n", time.Since(kmpRunTime))
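
One thing worth checking in the snippet above, offered as an observation rather than a confirmed diagnosis: `strings.Contains(s, substr)` takes the string to search first and the substring second. The call `strings.Contains("111111111", str)` therefore asks whether the 9-character literal contains the long string, which can return false without scanning anything. The two loops also use different patterns (9 versus 10 ones). The sketch below uses only the standard library; `containsReversed`, `containsIntended`, and the sample input are hypothetical stand-ins, since the original string is elided.

```go
package main

import (
	"fmt"
	"strings"
)

// containsReversed mirrors the call shape in the issue: the short literal is
// passed as the string to search and the long string as the substring.
func containsReversed(long, short string) bool {
	return strings.Contains(short, long)
}

// containsIntended searches for the short literal inside the long string.
func containsIntended(long, short string) bool {
	return strings.Contains(long, short)
}

func main() {
	// Hypothetical stand-in for the elided input from the issue.
	long := strings.Repeat("735224100,", 100) + "1111111111"
	short := "111111111"

	fmt.Println(containsReversed(long, short)) // false: substring is longer than the searched string
	fmt.Println(containsIntended(long, short)) // true
}
```

With the arguments in the intended order and the same pattern in both loops, the two timings would measure comparable work.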

Support searching in streams

Awesome library and thank you for making it public!

I'd like to be able to search inside streams (basically any Reader, or maybe ReaderAt), which should probably not be a hard change! I'll try to implement it myself over the next week, but filing this issue to keep track...
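
A rough sketch of what such a stream search could look like, written against the standard library only and not against gokmp's actual API. The names `indexReader`, `indexInString`, and `failureTable` are hypothetical. KMP suits this use because it never re-reads input bytes, so the reader can be consumed strictly forwards.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// failureTable returns, for each prefix of pattern, the length of the longest
// proper prefix that is also a suffix of that prefix.
func failureTable(pattern []byte) []int {
	table := make([]int, len(pattern))
	k := 0
	for i := 1; i < len(pattern); i++ {
		for k > 0 && pattern[i] != pattern[k] {
			k = table[k-1]
		}
		if pattern[i] == pattern[k] {
			k++
		}
		table[i] = k
	}
	return table
}

// indexReader returns the byte offset of the first occurrence of pattern in
// the stream, or -1 if the stream ends without a match.
func indexReader(r io.Reader, pattern []byte) (int64, error) {
	if len(pattern) == 0 {
		return 0, nil
	}
	table := failureTable(pattern)
	br := bufio.NewReader(r) // buffered so byte-at-a-time reads stay cheap
	k := 0                   // number of pattern bytes currently matched
	for pos := int64(0); ; pos++ {
		c, err := br.ReadByte()
		if err == io.EOF {
			return -1, nil
		}
		if err != nil {
			return -1, err
		}
		for k > 0 && c != pattern[k] {
			k = table[k-1]
		}
		if c == pattern[k] {
			k++
		}
		if k == len(pattern) {
			return pos - int64(k) + 1, nil
		}
	}
}

// indexInString is a convenience wrapper for trying the sketch on a string.
func indexInString(s, pattern string) (int64, error) {
	return indexReader(strings.NewReader(s), []byte(pattern))
}

func main() {
	off, err := indexInString("aabaabaaaabbaabaabaaabbaabaabb", "aabb")
	if err != nil {
		fmt.Println("read error:", err)
		return
	}
	fmt.Println(off) // 8
}
```

An `io.ReaderAt` variant would not need the forward-only property, so a plain `io.Reader` is the more natural fit for this algorithm.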
