like-aho-corasick-but-different-py's Introduction

lacbd

lacbd is a Python library, written in Rust, that implements the Aho-Corasick algorithm for fast subsentence matching of many keywords against one string.

You can find the underlying Rust library at nitros12/like-aho-corasick-but-different.

Features

  • Supports arbitrary values associated with each keyword
  • Operates on Unicode word bounds, rather than naïve substring matching
  • Case insensitive
  • 10× faster than an equivalent regex

None of the existing Python libraries fit my needs.
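
To make the feature list concrete, here is a minimal pure-Python sketch of the matching semantics described above: each keyword carries an arbitrary value, matching ignores case, and a keyword only matches at word boundaries. This is not lacbd's API and it does not use Aho-Corasick; it relies on the standard library's `re` module, whose `\b` only approximates Unicode word-boundary rules. The names `build_matcher` and `find_values` are hypothetical and exist only for this illustration.

```python
import re


def build_matcher(keywords):
    """Compile one case-insensitive, word-bounded pattern for all keywords.

    `keywords` maps each keyword string to an arbitrary associated value.
    (Hypothetical helper for illustration; not part of lacbd.)
    """
    # Try longer keywords first so that "hot dog" is preferred over "hot".
    items = sorted(keywords.items(), key=lambda kv: len(kv[0]), reverse=True)
    # One capturing group per keyword, so the group index identifies the keyword.
    alternation = "|".join("(" + re.escape(k) + ")" for k, _ in items)
    pattern = re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)
    values = [v for _, v in items]
    return pattern, values


def find_values(pattern, values, text):
    """Return (start, end, value) for every keyword found in `text`."""
    return [
        (m.start(), m.end(), values[m.lastindex - 1])
        for m in pattern.finditer(text)
    ]


pattern, values = build_matcher({"cat": 1, "hot dog": 2})
# "scattered" contains "cat" but not at a word boundary, so it is skipped.
print(find_values(pattern, values, "Cat scattered a HOT DOG."))
# [(0, 3, 1), (16, 23, 2)]
```

A regex alternation like this is the kind of baseline the speed comparison above refers to; the sketch shows the behaviour only, not the performance.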

License

This library is AGPLv3+ licensed. That may seem like an odd choice for a library. However, doing so ensures that users of this code must make their application open source, even if run as a service (such as in a Discord bot). If you want to use this to make proprietary software, look somewhere else.

Copyright © 2019 Ben Simms and Ben Mintz

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License along with this program. If not, see https://www.gnu.org/licenses/.

like-aho-corasick-but-different-py's People

Contributors

ioistired, simmsb, bmintz, altendky

like-aho-corasick-but-different-py's Issues

Romping for wheels

I cheated and put my single romp command line into a .py to build it, but here's the final result:

https://dev.azure.com/altendky/romp-on/_build/results?buildId=5251
/home/altendky/like-aho-corasick-but-different-py/venv/bin/romp --command 'git clone https://github.com/nitros12/like-aho-corasick-but-different-py repo; git -C repo submodule update --init; python -m pip install cibuildwheel; export CIBW_BUILD='"'"'cp36-* cp37-*'"'"'; export CIBW_ENVIRONMENT='"'"'PATH=$PATH:$HOME/.cargo/bin'"'"'; export CIBW_BEFORE_BUILD='"'"'curl --proto "=https" --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y; source $HOME/.cargo/env; python -m pip freeze; echo $PATH; echo $HOME; cat $HOME/.cargo/env'"'"'; python -m cibuildwheel --output-dir wheelhouse repo' --version 3.7 --artifact-paths wheelhouse

Though this is pretty silly, since you can ask romp to just upload a .py (or whatever file) for you and run it remotely rather than packing this all into a local command line. Occasionally it's 'fun' though. Or something.

https://dev.azure.com/altendky/romp-on/_build/results?buildId=5252

import os
import pathlib
import subprocess
import sys


command = '; '.join((
    'git clone https://github.com/nitros12/like-aho-corasick-but-different-py repo',
    'git -C repo submodule update --init',
    'python -m pip install cibuildwheel',
    "export CIBW_BUILD='cp36-* cp37-*'",
    "export CIBW_ENVIRONMENT='PATH=$PATH:$HOME/.cargo/bin'",
    """export CIBW_BEFORE_BUILD='curl --proto "=https" --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y; source $HOME/.cargo/env; python -m pip freeze; echo $PATH; echo $HOME; cat $HOME/.cargo/env'""",
    'python -m cibuildwheel --output-dir wheelhouse repo',
))


def main():
    subprocess.run(
        [
            os.fspath(pathlib.Path(sys.executable).with_name('romp')),
            '--command', command,
            '--version', '3.7',
            '--artifact-paths', 'wheelhouse',
        ],
        check=True,
    )

if __name__ == '__main__':
    main()
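
The `'"'"'` sequences in the one-line command are the usual POSIX shell idiom for putting a single quote inside a single-quoted string: close the quote, add a double-quoted `'`, then reopen the quote. As a small illustration (not part of romp), Python's `shlex.quote` produces the same form from one of the readable strings in the script above, and `shlex.split` recovers the original:

```python
import shlex

# One of the readable command fragments from the script above.
inner = "export CIBW_BUILD='cp36-* cp37-*'"

# shlex.quote wraps the string in single quotes and rewrites each
# embedded single quote as '"'"' so a POSIX shell sees one argument.
quoted = shlex.quote(inner)
print(quoted)

# Splitting the quoted form the way a POSIX shell would gives back
# the original string as a single token.
assert shlex.split(quoted) == [inner]
```

This is why the script version is easier to read: Python holds the command as a single list element, so no second layer of shell quoting is needed.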
