Flashlight Text: Fast, Lightweight Utilities for Text

Quickstart | Installation | Python Documentation | Citing

Flashlight Text is a fast, minimal library for text-based operations.

Quickstart

The Flashlight Text Python package, which contains the beam search decoder and Dictionary components, is available on PyPI:

pip install flashlight-text

To enable optional KenLM support for the decoder in Python, install KenLM via pip:

pip install git+https://github.com/kpu/kenlm.git

See the full Python binding documentation for examples and more.
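After running the two pip commands above, it can help to confirm that both packages are visible to the interpreter. The following is a minimal sketch using only the standard library. The module names are assumptions: `flashlight.lib.text` mirrors the C++ include path used in this README, and `kenlm` is the module name the KenLM package is expected to install. The helper `is_importable` is a hypothetical name, not part of Flashlight Text.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if `module_name` can be located by the import system.

    For dotted names, find_spec imports the parent packages but not the
    module itself.
    """
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # ModuleNotFoundError (an ImportError) is raised when a parent
        # package is missing; ValueError when a loaded module has no spec.
        return False


# Assumed module names (see the note above); adjust if your install differs.
status = {name: is_importable(name) for name in ("flashlight.lib.text", "kenlm")}

for name, found in status.items():
    print(f"{name}: {'found' if found else 'missing'}")
```

If `kenlm` is reported as missing, the decoder's optional KenLM support will most likely be unavailable until KenLM is installed.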

Building and Installing

From Source (C++) | From Source (Python) | Adding to Your Own Project (C++)

Requirements

At minimum, C++ compilation requires:

  • A C++ compiler with good C++17 support (e.g. gcc/g++ >= 7)
  • CMake (version 3.16 or later) and make
  • A Linux-based operating system.

KenLM Support: Building with KenLM support requires KenLM. To toggle KenLM support, use the FL_TEXT_USE_KENLM CMake option or, when building the Python bindings, the USE_KENLM environment variable.

Tests: If building tests, Google Test >= 1.10 is required. The FL_TEXT_BUILD_TESTS CMake option toggles building tests.

Instructions for building/installing the Python bindings from source can be found here.

Building from Source

Building the C++ project from source is simple:

git clone https://github.com/flashlight/text && cd text
mkdir build && cd build
cmake ..
make -j$(nproc)
make test    # run tests
make install # install at the CMAKE_INSTALL_PREFIX

To disable KenLM while building, pass -DFL_TEXT_USE_KENLM=OFF to CMake. To disable building tests, pass -DFL_TEXT_BUILD_TESTS=OFF.

KenLM can be downloaded and installed automatically if it is not found on the local system. The FL_TEXT_BUILD_STANDALONE option controls this behavior: if it is disabled, missing dependencies are not downloaded and built during the build.
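For scripted builds, the three CMake options mentioned above can be combined into a single configure command. The sketch below is a hypothetical helper (the function name and its Python defaults are assumptions, not part of Flashlight Text) that only assembles the command line. It always passes every option explicitly, so it does not rely on the project's own default values.

```python
import shlex
from typing import List


def cmake_configure_command(
    source_dir: str = "..",
    use_kenlm: bool = True,
    build_tests: bool = True,
    build_standalone: bool = True,
) -> List[str]:
    """Assemble a `cmake` configure command from the documented options."""

    def on_off(flag: bool) -> str:
        # CMake boolean options accept ON/OFF.
        return "ON" if flag else "OFF"

    return [
        "cmake",
        source_dir,
        f"-DFL_TEXT_USE_KENLM={on_off(use_kenlm)}",
        f"-DFL_TEXT_BUILD_TESTS={on_off(build_tests)}",
        f"-DFL_TEXT_BUILD_STANDALONE={on_off(build_standalone)}",
    ]


# Example: skip KenLM and tests, and do not download missing dependencies.
command = cmake_configure_command(
    use_kenlm=False, build_tests=False, build_standalone=False
)
print(shlex.join(command))
```

The printed line can be pasted into the build directory in place of the plain `cmake ..` step, or the list can be passed to `subprocess.run(command, check=True)` from that directory.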

Adding Flashlight Text to a C++ Project

Given a simple project.cpp file that includes and links to Flashlight Text:

#include <iostream>

#include <flashlight/lib/text/dictionary/Dictionary.h>

int main() {
  fl::lib::text::Dictionary myDict("someFile.dict");
  std::cout << "Dictionary has " << myDict.entrySize()
            << " entries."  << std::endl;
  return 0;
}

The following CMake configuration links Flashlight Text and sets include directories:

cmake_minimum_required(VERSION 3.16)
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

add_executable(myProject project.cpp)

find_package(flashlight-text CONFIG REQUIRED)
target_link_libraries(myProject PRIVATE flashlight::flashlight-text)

Contributing and Contact

Contact: [email protected]

Flashlight Text is actively developed. See CONTRIBUTING for more on how to help out.

Citing

You can cite Flashlight using:

@misc{kahn2022flashlight,
      title={Flashlight: Enabling Innovation in Tools for Machine Learning},
      author={Jacob Kahn and Vineel Pratap and Tatiana Likhomanenko and Qiantong Xu and Awni Hannun and Jeff Cai and Paden Tomasello and Ann Lee and Edouard Grave and Gilad Avidov and Benoit Steiner and Vitaliy Liptchinsky and Gabriel Synnaeve and Ronan Collobert},
      year={2022},
      eprint={2201.12465},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

License

Flashlight Text is under an MIT license. See LICENSE for more information.

