kpu / kenlm
KenLM: Faster and Smaller Language Model Queries
Home Page: http://kheafield.com/code/kenlm/
License: Other
Not sure if this is an issue, but right now I am not able to install from the git clone on either Windows (under Cygwin) or Linux machines. Both give me an error when running "./bjam install":
warning: mismatched versions of Boost.Build engine and core
warning: Boost.Build engine (bjam) is 2014.03.00
warning: Boost.Build core (at /usr/share/boost-build) is 2013.05-svn
error: Unable to find file or target named
error: 'prefix-include'
error: referred to from project at
error: '.'
setup.py needs to be adjusted manually to add flags like HAVE_ZLIB under the extra_compile_args section in order to be able to read compressed LM files.
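For reference, a minimal sketch of what such a manual edit could look like. This is hypothetical, not the actual setup.py from the repository: the define names match kenlm's build flags, but the source list and layout in any given checkout may differ.

```python
# Hypothetical sketch of the relevant part of setup.py.  The feature
# defines (HAVE_ZLIB etc.) must be added by hand so the extension can
# read compressed LM files; the source list here is illustrative only.
from setuptools import Extension

ARGS = [
    '-O3', '-DNDEBUG', '-DKENLM_MAX_ORDER=6',
    '-DHAVE_ZLIB',   # add -DHAVE_BZLIB / -DHAVE_XZLIB if those libs exist
]

ext_modules = [Extension(
    name='kenlm',
    sources=['python/kenlm.cpp'],  # plus the lm/ and util/ sources
    language='c++',
    extra_compile_args=ARGS,
    libraries=['z'],               # link zlib to match -DHAVE_ZLIB
)]
```

Each -DHAVE_* flag should only be added when the corresponding library is actually installed, and the matching entry added to libraries so the link step succeeds.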
Compiling kenlm on OS X El Capitan with ./bjam yields the following output – any suggestions?
I have also installed Boost via Homebrew.
rm -rf bootstrap
mkdir bootstrap
cc -o bootstrap/jam0 command.c compile.c constants.c debug.c execcmd.c frames.c function.c glob.c hash.c hdrmacro.c headers.c jam.c jambase.c jamgram.c lists.c make.c make1.c object.c option.c output.c parse.c pathsys.c regexp.c rules.c scan.c search.c subst.c timestamp.c variable.c modules.c strings.c filesys.c builtins.c class.c cwd.c native.c md5.c w32_getreg.c modules/set.c modules/path.c modules/regex.c modules/property-set.c modules/sequence.c modules/order.c execunix.c fileunix.c pathunix.c
make.c:296:37: warning: adding 'int' to a string does not append to the string [-Wstring-plus-int]
printf( "make\t--\t%s%s\n", spaces( depth ), object_str( t->name ) );
^~~~~~~~~~~~~~~
make.c:85:44: note: expanded from macro 'spaces'
~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~
make.c:296:37: note: use array indexing to silence this warning
make.c:85:44: note: expanded from macro 'spaces'
^
make.c:303:37: warning: adding 'int' to a string does not append to the string [-Wstring-plus-int]
printf( "make\t--\t%s%s\n", spaces( depth ), object_str( t->name ) );
^~~~~~~~~~~~~~~
make.c:85:44: note: expanded from macro 'spaces'
~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~
make.c:303:37: note: use array indexing to silence this warning
make.c:85:44: note: expanded from macro 'spaces'
^
make.c:376:45: warning: adding 'int' to a string does not append to the string [-Wstring-plus-int]
printf( "bind\t--\t%s%s: %s\n", spaces( depth ),
^~~~~~~~~~~~~~~
make.c:85:44: note: expanded from macro 'spaces'
~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~
make.c:376:45: note: use array indexing to silence this warning
make.c:85:44: note: expanded from macro 'spaces'
^
make.c:384:45: warning: adding 'int' to a string does not append to the string [-Wstring-plus-int]
printf( "time\t--\t%s%s: %s\n", spaces( depth ),
^~~~~~~~~~~~~~~
make.c:85:44: note: expanded from macro 'spaces'
~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~
make.c:384:45: note: use array indexing to silence this warning
make.c:85:44: note: expanded from macro 'spaces'
^
make.c:389:45: warning: adding 'int' to a string does not append to the string [-Wstring-plus-int]
printf( "time\t--\t%s%s: %s\n", spaces( depth ),
^~~~~~~~~~~~~~~
make.c:85:44: note: expanded from macro 'spaces'
~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~
make.c:389:45: note: use array indexing to silence this warning
make.c:85:44: note: expanded from macro 'spaces'
^
make.c:731:13: warning: adding 'int' to a string does not append to the string [-Wstring-plus-int]
spaces( depth ), object_str( t->name ) );
^~~~~~~~~~~~~~~
make.c:85:44: note: expanded from macro 'spaces'
~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~
make.c:731:13: note: use array indexing to silence this warning
make.c:85:44: note: expanded from macro 'spaces'
^
6 warnings generated.
modules/path.c:16:12: warning: implicit declaration of function 'file_query' is invalid in C99
[-Wimplicit-function-declaration]
return file_query( list_front( lol_get( frame->args, 0 ) ) ) ?
^
1 warning generated.
./bootstrap/jam0 -f build.jam --toolset=darwin --toolset-root= clean
...found 1 target...
...updating 1 target...
...updated 1 target...
./bootstrap/jam0 -f build.jam --toolset=darwin --toolset-root=
...found 139 targets...
...updating 3 targets...
[MKDIR] bin.macosxx86_64
[COMPILE] bin.macosxx86_64/b2
clang: warning: optimization flag '-finline-functions' is not supported
[the warning above repeats once per compilation unit, 49 times in total]
clang: warning: argument unused during compilation: '-finline-functions'
make.c:296:37: warning: adding 'int' to a string does not append to the string [-Wstring-plus-int]
            printf( "make\t--\t%s%s\n", spaces( depth ), object_str( t->name ) );
make.c:85:44: note: expanded from macro 'spaces'
make.c:296:37: note: use array indexing to silence this warning
[the same -Wstring-plus-int warning, expanded from the 'spaces' macro at make.c:85:44, is also reported at make.c lines 303, 376, 384, 389, 731, 768, 772, 778, 784, 787, 790, 793, 797, 800, 803, 806, 809, 812, 815, 821 and 833]
22 warnings generated.
modules/path.c:16:12: warning: implicit declaration of function 'file_query' is invalid in C99 [-Wimplicit-function-declaration]
    return file_query( list_front( lol_get( frame->args, 0 ) ) ) ?
1 warning generated.
[COPY] bin.macosxx86_64/bjam
...updated 3 targets...
~/Downloads/kenlm
Failed to run bash -c "g++ -dM -x c++ -E /dev/null -include boost/version.hpp 2>/dev/null |grep '#define BOOST_'"
Boost does not seem to be installed or g++ is confused.
Installing using pip no longer works since the changes made in commit 500406a.
Pip install fails with the following errors:
python/kenlm.cpp:1430:59: error: ‘class lm::base::Model’ has no member named ‘Score’
python/kenlm.cpp:1450:57: error: ‘class lm::base::Model’ has no member named ‘Score’
python/kenlm.cpp:1637:74: error: ‘class lm::base::Model’ has no member named ‘FullScore’
python/kenlm.cpp:1693:72: error: ‘class lm::base::Model’ has no member named ‘FullScore’
Hi, I was wondering if there was a way to access the probability of the next word in a sentence.
Is reverse lookup supported?
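One common workaround for next-word probability, sketched here under assumptions rather than as an official API: because a sentence score is a sum of per-word conditional log10 scores, the conditional probability of the next word is the difference between scoring the extended prefix and the prefix alone. The helper below is hypothetical and only assumes an object exposing kenlm's score(sentence, bos=..., eos=...) method.

```python
def next_word_logprob(model, context, word):
    """Log10 P(word | context), computed as a difference of prefix scores.

    `model` is assumed to expose kenlm's score(sentence, bos=True, eos=True)
    method; eos is disabled because we are scoring open-ended prefixes,
    not complete sentences.
    """
    with_word = (context + ' ' + word).strip()
    return (model.score(with_word, bos=True, eos=False)
            - model.score(context, bos=True, eos=False))
```

Scoring many candidate words this way re-walks the prefix for every candidate; for large vocabularies the state-based BaseScore interface avoids that repeated work.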
I've just tried using KenLM, and hit an error.
>>> model = kenlm.Model('LM/en.europarl-nc.lm')
Loading the LM will be faster if you build a binary file.
Reading /Users/bittlingmayer/Desktop/sgnln2/private-SignalN-Research/tsiran/lm/LM/en.europarl-nc.lm
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
*The ARPA file is missing <unk>. Substituting log10 probability -100.
***************************************************************************************************
>>> model.score('This is a test')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'kenlm.Model' object has no attribute 'score'
>>> model
<Model from en.europarl-nc.lm>
>>> dir(model)
['BaseFullScore', 'BaseScore', 'BeginSentenceWrite', 'NullContextWrite', '__class__', '__contains__', '__delattr__', '__doc__', '__format__', '__getattribute__', '__hash__', '__init__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'order', 'path']
Any idea what I may be doing wrong?
If it makes a difference, I installed via pip and I'm using python 2.7 (anaconda).
Is there a nice way to emulate SRILM's continuous-ngram-count? My goal is to have markers for punctuation (such as commas, periods, exclamation marks, etc.) and to be able to keep context across sentences.
Currently I put the whole text on one line, but it's not great memory-wise.
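kenlm has no continuous-ngram-count mode, but the Python wrapper's state-passing interface can approximate cross-sentence scoring without putting the whole text on one line. A sketch under assumptions: it presumes kenlm's documented State/BaseScore API (BeginSentenceWrite, NullContextWrite), and is written as a generic function so the model is passed in.

```python
def score_stream(model, tokens, make_state, begin_sentence=True):
    """Sum log10 probabilities over a token stream, carrying state across
    whatever boundaries the caller chooses, so context survives between
    "sentences" delimited only by punctuation tokens.

    Assumed API: make_state() -> State, model.BeginSentenceWrite(state),
    model.NullContextWrite(state), and
    model.BaseScore(in_state, word, out_state) -> log10 probability.
    """
    state, out_state = make_state(), make_state()
    if begin_sentence:
        model.BeginSentenceWrite(state)   # start from the <s> context
    else:
        model.NullContextWrite(state)     # start from an empty context
    total = 0.0
    for word in tokens:
        total += model.BaseScore(state, word, out_state)
        state, out_state = out_state, state  # reuse the two state buffers
    return total
```

With a real model this would be called as something like score_stream(model, text.split(), kenlm.State); the context window is still capped at the model's order, but it is never reset at line breaks.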
I can install the kenlm Python package outside of virtualenv, but I am having trouble inside a virtualenv.
Using Mac OS 10.11.4
nlp $ uname -a
Darwin Motokis-Macintosh.local 15.4.0 Darwin Kernel Version 15.4.0: Fri Feb 26 21:17:08 PST 2016; root:xnu-3248.40.184~2/RELEASE_X86_64 x86_64
nlp $ which clang
/usr/bin/clang
nlp $ clang --version
Apple LLVM version 7.3.0 (clang-703.0.31)
Target: x86_64-apple-darwin15.4.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
Error message:
(nlp) nlp $ STATIC_DEPS=true pip install https://github.com/kpu/kenlm/archive/master.zip
Collecting https://github.com/kpu/kenlm/archive/master.zip
Downloading https://github.com/kpu/kenlm/archive/master.zip (513kB)
100% |████████████████████████████████| 522kB 636kB/s
Installing collected packages: kenlm
Running setup.py install for kenlm ... error
Complete output from command /Users/apewu/smartannotations/nlp/bin/python2.7 -u -c "import setuptools, tokenize;__file__='/var/folders/d1/2291vfk93bq5l675mc1dy21m0000gn/T/pip-obzcbl-build/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /var/folders/d1/2291vfk93bq5l675mc1dy21m0000gn/T/pip-rSkucL-record/install-record.txt --single-version-externally-managed --compile --install-headers /Users/apewu/smartannotations/nlp/bin/../include/site/python2.7/kenlm:
running install
running build
running build_ext
building 'kenlm' extension
creating build
creating build/temp.macosx-10.11-x86_64-2.7
creating build/temp.macosx-10.11-x86_64-2.7/util
creating build/temp.macosx-10.11-x86_64-2.7/lm
creating build/temp.macosx-10.11-x86_64-2.7/util/double-conversion
creating build/temp.macosx-10.11-x86_64-2.7/python
clang -fno-strict-aliasing -fno-common -dynamic -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk -I/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk/System/Library/Frameworks/Tk.framework/Versions/8.5/Headers -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I. -I/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c util/bit_packing.cc -o build/temp.macosx-10.11-x86_64-2.7/util/bit_packing.o -O3 -DNDEBUG -DKENLM_MAX_ORDER=6 -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB
clang -fno-strict-aliasing -fno-common -dynamic -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk -I/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk/System/Library/Frameworks/Tk.framework/Versions/8.5/Headers -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I. -I/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c util/ersatz_progress.cc -o build/temp.macosx-10.11-x86_64-2.7/util/ersatz_progress.o -O3 -DNDEBUG -DKENLM_MAX_ORDER=6 -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB
clang -fno-strict-aliasing -fno-common -dynamic -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk -I/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk/System/Library/Frameworks/Tk.framework/Versions/8.5/Headers -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I. -I/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c util/exception.cc -o build/temp.macosx-10.11-x86_64-2.7/util/exception.o -O3 -DNDEBUG -DKENLM_MAX_ORDER=6 -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB
clang -fno-strict-aliasing -fno-common -dynamic -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk -I/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk/System/Library/Frameworks/Tk.framework/Versions/8.5/Headers -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I. -I/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c util/file.cc -o build/temp.macosx-10.11-x86_64-2.7/util/file.o -O3 -DNDEBUG -DKENLM_MAX_ORDER=6 -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB
clang -fno-strict-aliasing -fno-common -dynamic -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk -I/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk/System/Library/Frameworks/Tk.framework/Versions/8.5/Headers -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I. -I/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c util/file_piece.cc -o build/temp.macosx-10.11-x86_64-2.7/util/file_piece.o -O3 -DNDEBUG -DKENLM_MAX_ORDER=6 -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB
util/file_piece.cc:37:1: warning: control reaches end of non-void function [-Wreturn-type]
}
^
In file included from util/file_piece.cc:3:
In file included from ./util/double-conversion/double-conversion.h:31:
./util/double-conversion/utils.h:302:16: warning: unused typedef 'VerifySizesAreEqual' [-Wunused-local-typedef]
typedef char VerifySizesAreEqual[sizeof(Dest) == sizeof(Source) ? 1 : -1]
^
2 warnings generated.
clang -fno-strict-aliasing -fno-common -dynamic -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk -I/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk/System/Library/Frameworks/Tk.framework/Versions/8.5/Headers -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I. -I/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c util/float_to_string.cc -o build/temp.macosx-10.11-x86_64-2.7/util/float_to_string.o -O3 -DNDEBUG -DKENLM_MAX_ORDER=6 -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB
In file included from util/float_to_string.cc:3:
In file included from ./util/double-conversion/double-conversion.h:31:
./util/double-conversion/utils.h:302:16: warning: unused typedef 'VerifySizesAreEqual' [-Wunused-local-typedef]
typedef char VerifySizesAreEqual[sizeof(Dest) == sizeof(Source) ? 1 : -1]
^
1 warning generated.
clang -fno-strict-aliasing -fno-common -dynamic -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk -I/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk/System/Library/Frameworks/Tk.framework/Versions/8.5/Headers -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I. -I/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c util/integer_to_string.cc -o build/temp.macosx-10.11-x86_64-2.7/util/integer_to_string.o -O3 -DNDEBUG -DKENLM_MAX_ORDER=6 -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB
clang -fno-strict-aliasing -fno-common -dynamic -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk -I/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk/System/Library/Frameworks/Tk.framework/Versions/8.5/Headers -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I. -I/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c util/mmap.cc -o build/temp.macosx-10.11-x86_64-2.7/util/mmap.o -O3 -DNDEBUG -DKENLM_MAX_ORDER=6 -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB
util/mmap.cc:246:15: warning: unused variable 'from_size' [-Wunused-variable]
std::size_t from_size = mem.size();
^
1 warning generated.
clang -fno-strict-aliasing -fno-common -dynamic -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk -I/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk/System/Library/Frameworks/Tk.framework/Versions/8.5/Headers -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I. -I/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c util/murmur_hash.cc -o build/temp.macosx-10.11-x86_64-2.7/util/murmur_hash.o -O3 -DNDEBUG -DKENLM_MAX_ORDER=6 -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB
clang -fno-strict-aliasing -fno-common -dynamic -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk -I/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk/System/Library/Frameworks/Tk.framework/Versions/8.5/Headers -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I. -I/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c util/parallel_read.cc -o build/temp.macosx-10.11-x86_64-2.7/util/parallel_read.o -O3 -DNDEBUG -DKENLM_MAX_ORDER=6 -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB
clang -fno-strict-aliasing -fno-common -dynamic -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk -I/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk/System/Library/Frameworks/Tk.framework/Versions/8.5/Headers -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I. -I/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c util/pool.cc -o build/temp.macosx-10.11-x86_64-2.7/util/pool.o -O3 -DNDEBUG -DKENLM_MAX_ORDER=6 -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB
clang -fno-strict-aliasing -fno-common -dynamic -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk -I/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.11.sdk/System/Library/Frameworks/Tk.framework/Versions/8.5/Headers -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I. -I/usr/local/Cellar/python/2.7.11/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c util/read_compressed.cc -o build/temp.macosx-10.11-x86_64-2.7/util/read_compressed.o -O3 -DNDEBUG -DKENLM_MAX_ORDER=6 -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB
util/read_compressed.cc:24:10: fatal error: 'lzma.h' file not found
#include <lzma.h>
^
1 error generated.
error: command 'clang' failed with exit status 1
----------------------------------------
Command "/Users/apewu/smartannotations/nlp/bin/python2.7 -u -c "import setuptools, tokenize;__file__='/var/folders/d1/2291vfk93bq5l675mc1dy21m0000gn/T/pip-obzcbl-build/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /var/folders/d1/2291vfk93bq5l675mc1dy21m0000gn/T/pip-rSkucL-record/install-record.txt --single-version-externally-managed --compile --install-headers /Users/apewu/smartannotations/nlp/bin/../include/site/python2.7/kenlm" failed with error code 1 in /var/folders/d1/2291vfk93bq5l675mc1dy21m0000gn/T/pip-obzcbl-build/
Hi, thank you for this nice tool, and also thanks for providing a Windows version.
I have to work on a server with Windows Server 8 R2, and I had successfully built KenLM itself with the project files in the windows folder. However, it always errors out when I try to install the Python library on this Windows server.
P.S. I am mainly working in Python, so the KenLM training tool is not that urgent for me since I could train the data on another machine; I just want to know how to install the Python part.
Any help would be appreciated.
Hi Ken,
Is there a way to replicate with KenLM the workflow where we build an LM as with continuous-ngram-count and then query/process a text with hidden-ngram (given a hidden-vocab file)?
Cheers,
Vince
It would be nice to have access to kenlm.LanguageModel.vocab, or even (maybe a more pythonic way) to support the iterable protocol on kenlm.LanguageModel.
The current RewindableStream implementation can cycle through blocks with operator++ and then fail to detect an overrun, because the same memory block has been recycled.
Can you comment on build systems?
bjam is the default and preferred option. I see you have provided compile_query_only.sh, presumably as a convenience for folks who don't want to bother with Boost.
What about cmake? I see that it has been added to the tree, and in fact I just built Joshua using it, but it's not clear to me that this was the right thing to do. In particular, cmake does not seem to respond to environment settings of, e.g., KENLM_MAX_ORDER. Why is cmake present, and what is its intended use? It also litters files all over the place.
It seems I should revert to using bjam in my own build process.
(My goal is to make it easier to depend on KenLM. Ideally I'd like to package it as a submodule. I've already separated KenLM from Joshua's wrappers and it works well, apart from the build system complication).
(Caveat: I do not understand modern build systems.)
Hi Kenneth!
I am now using kenlm to experiment with different language models. From time to time I need to compute the conditional probabilities of all n-grams. ARPA files do not contain them all, and there is a rule for computing the probabilities that are not explicitly listed. I wrote a simple 20-line Python script that uses the arpa package to do that. Basically, what that package does is accept an n-gram string and return the probability of the last word conditioned on the prefix. Maybe I did something wrong, but it takes "forever" to compute, for instance, all 5-gram probabilities, even with hundreds of threads.
I am wondering what would be the best way to compute the probabilities of all possible n-grams with kenlm. I looked through the code and your examples, and I think something like this may work:
Does that sound reasonable, or is there a better way to do it?
Thanks,
Sergey.
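For reference, the backoff rule in question can be sketched in a few lines of pure Python. The tables below are invented toy values, not from a real model; kenlm implements this same recursion natively, and its C++ or Python bindings will be far faster than the arpa package.

```python
# Toy ARPA-style tables (log10 values); all numbers invented for illustration.
LOGPROB = {
    ('the',): -1.0,
    ('cat',): -2.0,
    ('the', 'cat'): -0.5,
}
BACKOFF = {('the',): -0.3}   # log10 backoff weight of the history "the"
UNK_LOGPROB = -3.0           # stand-in for the <unk> probability

def log10_prob(ngram):
    """ARPA backoff rule: use the listed probability if present; otherwise
    add the history's backoff weight and recurse on the shortened n-gram."""
    ngram = tuple(ngram)
    if ngram in LOGPROB:
        return LOGPROB[ngram]
    if len(ngram) == 1:
        return UNK_LOGPROB
    # Unlisted backoff weights default to 0.0, i.e. a multiplier of 1.
    return BACKOFF.get(ngram[:-1], 0.0) + log10_prob(ngram[1:])

listed = log10_prob(('the', 'cat'))      # listed directly: -0.5
backed_off = log10_prob(('cat', 'cat'))  # backs off to P(cat): -2.0
```

This is exactly what the arpa package does per query; the speed problem is the repeated dictionary walk in Python, not the rule itself.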
platform : 64bit, Red Hat Enterprise Linux Server release 5.8 (Tikanga)
g++ : g++ (GCC) 4.1.2 20080704 (Red Hat 4.1.2-52)
You must use ./bjam if you want language model estimation, filtering, or support for compressed files (.gz, .bz2, .xz)
Compiling with g++ -I. -O3 -DNDEBUG -DKENLM_MAX_ORDER=6
./util/scoped.hh: In static member function 'static void util::scoped_c_forward<T, clean>::Close(T*) [with T = void, void (* clean)(T*) = free]':
./util/scoped.hh:28: instantiated from 'util::scoped_base<T, Closer>::~scoped_base() [with T = void, Closer = util::scoped_c_forward<void, free>]'
./util/scoped.hh:55: instantiated from here
./util/scoped.hh:70: internal compiler error: in build_call, at cp/call.c:321
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://bugzilla.redhat.com/bugzilla> for instructions.
Preprocessed source stored into /tmp/ccFSXFye.out file, please attach this to your bugreport.
Hi, I got a segmentation fault when running "lmplz -o 3 < text > arpa" on a corpus; the stack trace is pasted below. lmplz runs fine on several other corpora. The only thing special about this corpus is that it contains a lot of duplicated sentences; I don't know whether that could cause the segmentation fault.
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffca054700 (LWP 9216)]
0x00000000004856ca in lm::builder::NGram::IsMarked (this=0x7fffca053c20) at ./lm/builder/ngram.hh:77
77 return Value().count >> (sizeof(Value().count) * 8 - 1);
(gdb) bt
#0 0x00000000004856ca in lm::builder::NGram::IsMarked (this=0x7fffca053c20)
at ./lm/builder/ngram.hh:77
#1 0x000000000048e12a in lm::builder::NGram::CutoffCount (this=0x7fffca053c20)
at ./lm/builder/ngram.hh:93
#2 0x000000000048afa6 in lm::builder::(anonymous namespace)::PruneNGramStream::operator++ (
this=0x7fffca053c20) at /home/cfan/tools/kenlm/lm/builder/initial_probabilities.cc:74
#3 0x000000000048bb40 in lm::builder::(anonymous namespace)::MergeRight::Run (this=0x95bc78,
primary=...) at /home/cfan/tools/kenlm/lm/builder/initial_probabilities.cc:238
#4 0x000000000048df48 in util::stream::Thread::operator()<util::stream::ChainPosition, lm::builder::{anonymous}::MergeRight>(const util::stream::ChainPosition &, lm::builder::(anonymous namespace)::MergeRight &) (this=0x928170, position=..., worker=...) at ./util/stream/chain.hh:77
#5  0x000000000048ddf1 in boost::_bi::list2<boost::_bi::value<util::stream::ChainPosition>, boost::_bi::value<lm::builder::{anonymous}::MergeRight> >::operator()<boost::reference_wrapper<util::stream::Thread>, boost::_bi::list0>(boost::_bi::type<void>, boost::reference_wrapper<util::stream::Thread>&, boost::_bi::list0&, int) (this=0x95bc40, f=..., a=...) at /usr/include/boost/bind/bind.hpp:313
#6  0x000000000048dccf in boost::_bi::bind_t<void, boost::reference_wrapper<util::stream::Thread>, boost::_bi::list2<boost::_bi::value<util::stream::ChainPosition>, boost::_bi::value<lm::builder::{anonymous}::MergeRight> > >::operator()() (this=0x95bc38)
at /usr/include/boost/bind/bind_template.hpp:20
#7  0x000000000048dc34 in boost::detail::thread_data<boost::_bi::bind_t<void, boost::reference_wrapper<util::stream::Thread>, boost::_bi::list2<boost::_bi::value<util::stream::ChainPosition>, boost::_bi::value<lm::builder::{anonymous}::MergeRight> > > >::run() (this=0x95bab0)
at /usr/include/boost/thread/detail/thread.hpp:61
Even the most basic quantity, P(you | where are), cannot be computed:
full_scores("where are you") automatically appends <s> at the beginning of the phrase, which is stupid.
If I really want to compute "<s> where are you", I will append <s> myself.
Hello, in the kenlm documentation I found only one function to use: en_model.score(sentence).
Can you please provide a detailed description of the available functions, if there are more?
I'm trying to read unigram, bigram, and trigram probabilities from the LM as they appear there.
For example, the LM contains the following lines. I need a function that works like this: en_model.bigram_prob("too recognize") would return -4.923469.
-4.923469 too recognised
-4.923469 too recognises
-4.923469 too recognize
-4.923469 too recommend
The same for unigrams and trigrams.
Does kenlm support such functionality ?
Thank you,
Zaven.
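For what it's worth, the ARPA text format is easy to parse directly. Below is a minimal pure-Python sketch of such a bigram_prob helper; the helper name and the inline file fragment are hypothetical, and a real parser would also need to handle the optional third (backoff) column on each line.

```python
# Hypothetical fragment of an ARPA file; real files also contain a \data\
# header, other n-gram sections, and backoff weights in a third column.
ARPA_FRAGMENT = """\
\\2-grams:
-4.923469\ttoo recognised
-4.923469\ttoo recognize
"""

def load_bigrams(lines):
    """Collect log10 probabilities from the \\2-grams: section,
    keyed by the n-gram string exactly as it appears."""
    probs, in_section = {}, False
    for line in lines:
        line = line.strip()
        if line == '\\2-grams:':
            in_section = True
            continue
        if line.startswith('\\'):  # the next section header ends ours
            in_section = False
            continue
        if in_section and line:
            fields = line.split('\t')
            probs[fields[1]] = float(fields[0])
    return probs

bigrams = load_bigrams(ARPA_FRAGMENT.splitlines())

def bigram_prob(ngram):
    """Hypothetical helper mirroring the requested en_model.bigram_prob."""
    return bigrams[ngram]
```

The same loop, pointed at the \1-grams: or \3-grams: section, covers unigrams and trigrams.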
Hi @kpu, I have a question for you.
I trained a 4-gram LM and a 5-gram LM on the same corpus with the same configuration.
When I test the language models on a sentence, I find an unreasonable result.
For example, I have a sentence here:
m4 = kenlm.Model('4gram-lm')
m5 = kenlm.Model('5gram-lm')
sent_3 = 'bolivia holds presidential'
s4 = m4.score(sent_3, bos = False, eos = False)
s5 = m5.score(sent_3, bos = False, eos = False)
I test the language-model score on a sentence of length 3 and get exactly the same s4 and s5, which is reasonable:
s4: -13.948734283447266
s5: -13.948734283447266
But when I test on a sentence of length 4, something strange happens:
sent_4 = 'bolivia holds presidential and'
s4 = m4.score(sent_4, bos = False, eos = False)
s5 = m5.score(sent_4, bos = False, eos = False)
s4: -8.61363410949707
s5: -8.647890090942383
I think s4 and s5 should be the same; however, I get slightly different values there.
For a string of length 4, not considering bos and eos:
p4(w1 w2 w3 w4) = p(w1) * p(w2 | w1) * p(w3 | w1 w2) * p(w4 | w1 w2 w3)
p5(w1 w2 w3 w4) = p(w1) * p(w2 | w1) * p(w3 | w1 w2) * p(w4 | w1 w2 w3)
So p4 and p5 should be the same, right? Can you give me some explanation for this?
Of course, for a sentence of length 5 it will be different, because the last terms in the following formulas differ:
p4(w1 w2 w3 w4 w5) = p(w1) * p(w2 | w1) * p(w3 | w1 w2) * p(w4 | w1 w2 w3) * p(w5 | w2 w3 w4)
p5(w1 w2 w3 w4 w5) = p(w1) * p(w2 | w1) * p(w3 | w1 w2) * p(w4 | w1 w2 w3) * p(w5 | w1 w2 w3 w4)
Hi,
I've tried to use kenlm as a library in my decoder. However, libkenlm.so gives unexpected results.
You can reproduce my situation as follows, assuming kenlm is compiled:
cd </path/to/kenlm/lm>
g++ -DKENLM_MAX_ORDER=2 -I../ -c -o query_main.o query_main.cc
g++ -L../lib -o query_main query_main.o -lkenlm
export LD_LIBRARY_PATH=../lib
./query_main test.arpa
This raises a core dump.
My environment is Ubuntu 12.04.2, with g++ (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3.
Add a flag to lmplz such that, within each order of the created ARPA file, the n-grams are sorted lexicographically.
Running on OSX, boost 1.55
I'm essentially following the instructions in this document:
http://victor.chahuneau.fr/notes/2012/07/03/kenlm.html
Interestingly, everything worked once, but then stopped working. When I run ./bjam, I get a couple of errors, one involving a broken pipe, but the one I'm most concerned about is:
-bash: ./kenlm/bin/lmplz: No such file or directory
The output begins with: warning: No toolsets are configured.
warning: Configuring default toolset "darwin".
warning: If the default is wrong, your build may not work correctly.
warning: Use the "toolset=xxxxx" option to override our guess.
warning: For more configuration options, please consult
warning: http://boost.org/boost-build2/doc/html/bbv2/advanced/configuration.html
...patience...
...found 628 targets...
...updating 38 targets...
and at the end, the output is...
...failed darwin.link lm/bin/left_test.test/darwin-5.1.0/release/threading-multi/left_test...
...skipped <plm/bin/left_test.test/darwin-5.1.0/release/threading-multi>left_test.run for lack of <plm/bin/left_test.test/darwin-5.1.0/release/threading-multi>left_test...
...failed updating 23 targets...
...skipped 15 targets...
If I could attach the log I would but it's very long!
I've run into issues trying to compile kenlm on Mavericks 10.9. Using the default clang provided by Xcode:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 5.0 (clang-500.2.79) (based on LLVM 3.3svn)
Target: x86_64-apple-darwin13.0.0
Thread model: posix
everything seems to compile okay (a few tests fail), but when I go to train a model, I get:
jbg-hackintosh:simtrans jbg$ lmplz -o 3 -S 2G -T /tmp < scratch/lm/train-de > scratch/lm/train-de.arpa
=== 1/5 Counting and sorting n-grams ===
Reading stdin
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
Function not implemented
I thought maybe clang was the issue, so I also tried gcc 4.8 (via Homebrew), which produces a linking error (which I won't copy here, as it may be a Boost issue; I haven't debugged it fully). My student reproduced the same issue on his Mavericks laptop.
Is there a recommended path for building kenlm in 10.9?
Hi,
is there a tool to convert existing ARPA-format files to the files generated by lmplz's --intermediate flag?
Thanks, Mittul
Looks like the required() setting from Boost.Program_options is used, which was only added in 1.41. I guess the version requirement in CMakeLists.txt should be bumped.
[ 85%] Building CXX object lm/CMakeFiles/partial_test.dir/partial_test.cc.o
/home/cortex-m40/kenlm/lm/kenlm_benchmark_main.cc: In function ‘int main(int, char**)’:
/home/cortex-m40/kenlm/lm/kenlm_benchmark_main.cc:200:51: error: ‘class boost::program_options::typed_value<std::basic_string<char>, char>’ has no member named ‘required’
("model,m", po::value<std::string>(&model)->required(), "Model to query or convert vocab ids")
^
make[2]: *** [lm/CMakeFiles/kenlm_benchmark.dir/kenlm_benchmark_main.cc.o] Error 1
make[1]: *** [lm/CMakeFiles/kenlm_benchmark.dir/all] Error 2
Hi ~ I have some issues when trying to compile mosesdecoder on RedHat 5.8 with gcc 4.1.2.
Any tips on how to fix it?
Thank you very much!
./util/scoped.hh:70: internal compiler error: in build_call, at cp/call.c:321
Please submit a full bug report,
with preprocessed source if appropriate.
See <URL:http://bugzilla.redhat.com/bugzilla> for instructions.
Preprocessed source stored into /tmp/ccnex5v6.out file, please attach this to your bugreport.
...failed gcc.compile.c++ lm/bin/gcc-4.1.2/release/debug-symbols-on/link-static/threading-multi/quantize.o...
.../libs/kenlm/lm/builder/adjust_counts.cc:61 in void lm::builder::{anonymous}::StatCollector::CalculateDiscounts(const lm::builder::DiscountConfig&) threw BadDiscountException because `discounts_[i].amount[j] < 0.0 || discounts_[i].amount[j] > j'.
ERROR: 1-gram discount out of range for adjusted count 2: -1.6000001
Aborted (core dumped)
What could have happened to cause this error? We preprocessed the files to limit them to a 10k vocabulary (replacing out-of-vocabulary words with <unk>). The files are sufficiently big (with line breaks, thanks to the help in the other thread); some output info:
Unigram tokens 77187240 types 10002
=== 2/5 Calculating and sorting adjusted counts ===
Chain sizes: 1:120024 2:37614993408 3:70528114688
ERROR: 1-gram discount out of range for adjusted count 2: -1.75
Aborted (core dumped)
Did fixing the vocabulary externally cause this problem?
KenLM is sweet.
In order to compile it on OSX (10.8.3) I had to modify the 'limits' include in:
https://github.com/kpu/kenlm/blob/master/util/file.cc
I added one more include to the very top of this file:
#include <limits.h>
and everything suddenly compiled like magic. The '.h' was the secret sauce.
Brew is pretty nice for the boost stuff too. I was dreading this aspect, but:
$ brew install boost
just worked.
Hello,
I know this isn't an issue, but I couldn't find anywhere else to ask.
It seems there is no way to interpolate several ARPA models into one, as SRILM and IRSTLM do. Is this a planned feature, or does kenlm not do it on purpose?
Thank you anyway for the awesome job with kenlm !
Hi,
I am using the tool to build an LM over entity grids. Obviously, I am therefore not interested in including probabilities of n-grams that contain sentence boundaries. Is it possible to achieve this somehow? I still want to compute n-grams only within a sentence, so making one big sentence would not solve the problem.
thanks! (especially for the great tool!)
Sorry about this question :(
I ran into some confusion: I always thought perplexity for a document is evaluated per sentence, and then you average over all the sentences' perplexities in the document. Is this how KenLM implements bin/query?
Or does KenLM evaluate the perplexity over the whole document and then normalize by the length of the document?
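The two definitions do differ, so the distinction matters. A toy sketch with invented log10 word scores, assuming the corpus-level definition pools every scored word before normalizing:

```python
# Hypothetical per-word log10 probabilities for a two-sentence "document".
sentences = [
    [-1.0, -1.0, -1.0],  # sentence 1 (3 scored words, e.g. incl. </s>)
    [-3.0],              # sentence 2 (1 scored word)
]

def perplexity(log10_probs):
    """Perplexity = 10^(-average log10 probability per scored word)."""
    return 10.0 ** (-sum(log10_probs) / len(log10_probs))

# Corpus-level: pool every scored word of the document, then normalize.
all_words = [lp for sent in sentences for lp in sent]
doc_ppl = perplexity(all_words)

# Averaging per-sentence perplexities is a different quantity.
avg_sentence_ppl = sum(perplexity(s) for s in sentences) / len(sentences)
```

Here doc_ppl is about 31.6 while avg_sentence_ppl is 505, since averaging perplexities (rather than log probabilities) overweights short, poorly scored sentences.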
I would like to point out that identifiers like "LM_NGRAM_QUERY__" and "LM_VOCAB__" do not conform to the naming rules of the C++ language standard: identifiers containing a double underscore are reserved for the implementation.
Would you like to adjust your choice of unique names?
Would it be possible to upload the official repo to PyPI?
Hello. Firstly, thanks for this great tool. The Python support has made it very easy to use alongside nltk for some recent research.
I'm having difficulty finding documentation for the probabilities returned by model.full_scores(). They appear to be log probabilities, but I'm unsure of the base.
Scanning through the repository, I found this line that seems to indicate that it is base 10:
Line 67 in a8a1b55
But I can't find any other reason to confirm that this is the case. Thanks.
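Assuming base 10 (which matches the line cited above, and the ARPA format, whose stored probabilities are log10), converting to another base is just a constant factor:

```python
import math

log10_p = -13.948734283447266    # a kenlm score, taken as log10
p = 10.0 ** log10_p              # back to a raw probability
ln_p = log10_p * math.log(10.0)  # the same score in natural log
```

So to compare against a model that reports natural-log probabilities, multiply kenlm's scores by ln(10) ≈ 2.3026.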
I'm using Ubuntu 12.04.
I first tried to compile with Boost 1.46, but it failed because -lboost_exception did not exist in /usr/lib.
Then I tried Boost 1.55 (in /usr/local), but lmplz_main always fails to compile while everything else compiles successfully. Both the source from GitHub and from http://kheafield.com/code/kenlm.tar.gz produce the same error.
gcc.compile.c++ /home/***/LM/kenlm/lm/builder/bin/gcc-4.6/release/link-static/threading-multi/lmplz_main.o
/home/***/LM/kenlm/lm/builder/lmplz_main.cc: In function ‘int main(int, char**)’:
/home/***/LM/kenlm/lm/builder/lmplz_main.cc:55:72: error: no matching function for call to ‘value(uint64_t*)’
/home/***/LM/kenlm/lm/builder/lmplz_main.cc:55:72: note: candidates are:
/usr/local/include/boost/program_options/detail/value_semantic.hpp:175:5: note: template<class T> boost::program_options::typed_value<T>* boost::program_options::value()
/usr/local/include/boost/program_options/detail/value_semantic.hpp:183:5: note: template<class T> boost::program_options::typed_value<T>* boost::program_options::value(T*)
"g++" -ftemplate-depth-128 -O3 -finline-functions -Wno-inline -Wall -pthread -DKENLM_MAX_ORDER=6 -DNDEBUG -I"." -I"util/double-conversion" -c -o "/home/***/LM/kenlm/lm/builder/bin/gcc-4.6/release/link-static/threading-multi/lmplz_main.o" "/home/***/LM/kenlm/lm/builder/lmplz_main.cc"
...failed gcc.compile.c++ /home/***/LM/kenlm/lm/builder/bin/gcc-4.6/release/link-static/threading-multi/lmplz_main.o...
So, what's wrong with my compilation? I'm new to this.
I have two files. One file works fine with kenlm, the other gives the following error:
jbg-hackintosh:qblearn jbg$ lmplz -o 2 -S 2G -T -kndiscount /tmp < bl > scratch/Literature/10393.comb.arpa
=== 1/5 Counting and sorting n-grams ===
Reading stdin
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
Unigram tokens 2366 types 1168
=== 2/5 Calculating and sorting adjusted counts ===
Chain sizes: 1:14016 2:2147469568
/Users/jbg/repositories/kenlm/lm/builder/adjust_counts.cc:50 in void lm::builder::::StatCollector::CalculateDiscounts() threw BadDiscountException because `discounts_[i].amount[j] < 0.0 || discounts_[i].amount[j] > j'.
ERROR: 2-gram discount out of range for adjusted count 3: -0.402645
Abort trap: 6
The only difference between the two files is that one ends with the sentence:
Lord Melbourne offered him a lordship, which he declined
I've also sent the full files to Kenneth via e-mail.
Part of the output is copied below. According to the home page, "Estimation and filtering require Boost at least 1.36.0 and zlib." I have Boost 1.46.1 and get the following linking error.
So basically my question is: which versions of Boost work?
...failed gcc.link util/bin/gcc-4.6/release/link-static/threading-multi/bit_packing_test...
...skipped <putil/bin/gcc-4.6/release/link-static/threading-multi>bit_packing_test.passed for lack of <putil/bin/gcc-4.6/release/link-static/threading-multi>bit_packing_test...
gcc.link util/bin/gcc-4.6/release/link-static/threading-multi/joint_sort_test
util/bin/gcc-4.6/release/link-static/threading-multi/joint_sort_test.o: In function `main':
joint_sort_test.cc:(.text.startup+0xb): undefined reference to `boost::unit_test::unit_test_main(bool (*)(), int, char**)'
collect2: ld returned 1 exit status
"g++" -o "util/bin/gcc-4.6/release/link-static/threading-multi/joint_sort_test" -Wl,--start-group "util/bin/gcc-4.6/release/link-static/threading-multi/joint_sort_test.o" "util/bin/gcc-4.6/release/link-static/threading-multi/parallel_read.o" "util/bin/gcc-4.6/release/link-static/threading-multi/read_compressed.o" "util/double-conversion/bin/gcc-4.6/release/link-static/threading-multi/diy-fp.o" "util/double-conversion/bin/gcc-4.6/release/link-static/threading-multi/fixed-dtoa.o" "util/double-conversion/bin/gcc-4.6/release/link-static/threading-multi/bignum.o" "util/double-conversion/bin/gcc-4.6/release/link-static/threading-multi/strtod.o" "util/double-conversion/bin/gcc-4.6/release/link-static/threading-multi/double-conversion.o" "util/double-conversion/bin/gcc-4.6/release/link-static/threading-multi/bignum-dtoa.o" "util/double-conversion/bin/gcc-4.6/release/link-static/threading-multi/fast-dtoa.o" "util/double-conversion/bin/gcc-4.6/release/link-static/threading-multi/cached-powers.o" "util/bin/gcc-4.6/release/link-static/threading-multi/bit_packing.o" "util/bin/gcc-4.6/release/link-static/threading-multi/ersatz_progress.o" "util/bin/gcc-4.6/release/link-static/threading-multi/exception.o" "util/bin/gcc-4.6/release/link-static/threading-multi/file.o" "util/bin/gcc-4.6/release/link-static/threading-multi/file_piece.o" "util/bin/gcc-4.6/release/link-static/threading-multi/mmap.o" "util/bin/gcc-4.6/release/link-static/threading-multi/murmur_hash.o" "util/bin/gcc-4.6/release/link-static/threading-multi/pool.o" "util/bin/gcc-4.6/release/link-static/threading-multi/scoped.o" "util/bin/gcc-4.6/release/link-static/threading-multi/string_piece.o" "util/bin/gcc-4.6/release/link-static/threading-multi/usage.o" -Wl,-Bstatic -lboost_system-mt -lboost_system-mt -lboost_unit_test_framework-mt -lboost_thread-mt -lz -Wl,-Bdynamic -lSegFault -lrt -Wl,--end-group -pthread
...failed gcc.link util/bin/gcc-4.6/release/link-static/threading-multi/joint_sort_test...
...skipped <putil/bin/gcc-4.6/release/link-static/threading-multi>joint_sort_test.passed for lack of <putil/bin/gcc-4.6/release/link-static/threading-multi>joint_sort_test...
...failed updating 12 targets...
...skipped 16 targets...
I rebuilt my kenlm with the max order set to 10 by passing
cmake .. -DKENLM_MAX_ORDER=10
during the build, and by updating the setup.py file:
ARGS = ['-O3', '-DNDEBUG', '-DKENLM_MAX_ORDER=10']
Now I'm able to use lmplz without error to build a 7-gram language model.
However, when trying to use the python interface, I still get the following error:
IOError: Cannot read model '../models/LM_7gram.klm' (lm/model.cc:49 in void lm::ngram::detail::(anonymous namespace)::CheckCounts(const std::vector<uint64_t> &) threw FormatLoadException because
counts.size() > 6'. This model has order 7 but KenLM was compiled to support up to 6. If your build system supports changing KENLM_MAX_ORDER, change it there and recompile. In the KenLM tarball or Moses, use e.g.
bjam --max-kenlm-order=6 -a'. Otherwise, edit lm/max_order.hh.)
File "lm/builder/output.cc" uses std::cerr, which results in a compilation error on Ubuntu 12.04. Simply adding "#include <iostream>" to the file solves the problem.
Is this known? Should I make a pull request?
This is the full log:
➜ bin/lmplz -o 5 -S 50% -T /tmp <~/data/enwiki-latest-pages-articles >text.arpa
=== 1/5 Counting and sorting n-grams ===
Reading /home/deeppixel/data/enwiki-latest-pages-articles
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
****************************************************************************************************
Unigram tokens 4027024634 types 8571832
=== 2/5 Calculating and sorting adjusted counts ===
Chain sizes: 1:102861984 2:1237697920 3:2320683776 4:3713093888 5:5414928896
Statistics:
1 8571831 D1=0.682343 D2=1.02373 D3+=1.37025
2 208792530 D1=0.747714 D2=1.07416 D3+=1.35152
3 871078563 D1=0.826502 D2=1.1214 D3+=1.3274
4 1692737525 D1=0.88864 D2=1.18124 D3+=1.33282
5 2308548475 D1=0.874941 D2=1.29421 D3+=1.3912
Memory estimate for binary LM:
type GB
probing 100 assuming -p 1.5
probing 116 assuming -r models -p 1.5
trie 53 without quantization
trie 31 assuming -q 8 -b 8 quantization
trie 46 assuming -a 22 array pointer compression
trie 24 assuming -a 22 -q 8 -b 8 array pointer compression and quantization
=== 3/5 Calculating and sorting initial probabilities ===
Chain sizes: 1:102861972 2:1329358848 3:2492548096 4:3988076544 5:5815945216
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
####################################################################################################
=== 4/5 Calculating and writing order-interpolated probabilities ===
Chain sizes: 1:102861972 2:910337664 3:1706883200 4:2731013120 5:3982727424
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
####################################################################################################
=== 5/5 Writing ARPA model ===
----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
**********************************************************************---Last input should have been poison.
[1] 6802 abort bin/lmplz -o 5 -S 50% -T /tmp < ~/data/enwiki-latest-pages-articles >
clang -fno-strict-aliasing -fno-common -dynamic -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I. -I/usr/local/include -I/usr/local/opt/openssl/include -I/usr/local/opt/sqlite/include -I/usr/local/Cellar/python/2.7.10_2/Frameworks/Python.framework/Versions/2.7/include/python2.7 -c util/file.cc -o build/temp.macosx-10.11-x86_64-2.7/util/file.o -O3 -DNDEBUG -DKENLM_MAX_ORDER=6 -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB
util/file.cc:32:10: fatal error: 'features.h' file not found
#include <features.h>
^
1 error generated.
error: command 'clang' failed with exit status 1
Hi 👋 ,
it would be nice to be able to train language models on existing count files that contain n-gram counts, similar to the -read parameter of ngram-count from SRILM. The ability to load only counts enables the use of essentially unlimited n-gram statistics, such as skip-ngrams.
I have a question about KenLM. Assume I have a trained 3-gram language model, and I want to get the probabilities of all words in the vocabulary given a two-word sequence.
Say I have the two-word sequence "A B". I want to get:
P(A|A B) P(B|A B) P(C|A B) P(D|A B) P(E|A B)
and so on.
Does the C++ or Python interface provide this? Thanks a lot.
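As far as I know there is no single call for this, but with the Python bindings you can loop over a word list yourself, scoring each candidate with Model.BaseScore from a kenlm.State that encodes the context (the bindings do not expose vocabulary iteration, so the word list must come from elsewhere). The lookup being repeated is just the backoff rule; here is a pure-Python sketch with invented toy values, using a bigram context for brevity:

```python
# Toy bigram tables with unigram backoff; all log10 values are invented.
UNIGRAM = {'A': -0.6, 'B': -0.6, 'C': -0.6}  # log10 P(w)
BIGRAM = {('B', 'A'): -0.3}                  # log10 P(A | B)
BACKOFF = {'B': -0.05}                       # log10 backoff weight b(B)

def next_word_log10(prev, word):
    """log10 P(word | prev): use the explicit bigram if listed,
    otherwise back off to the unigram with the history's penalty."""
    if (prev, word) in BIGRAM:
        return BIGRAM[(prev, word)]
    return BACKOFF.get(prev, 0.0) + UNIGRAM[word]

# The distribution over the whole vocabulary given the context "B":
dist = {w: next_word_log10('B', w) for w in UNIGRAM}
```

For a longer context like "A B", the same loop applies; only the lookup recurses through more history lengths.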
It would be nice if the various executables supported --help and -help flags.
Hi,
I can't get KenLM working on my corpus.
I've followed the usual steps:
./bin/lmplz -T /tmp/ --text corpus.txt --arpa myarpa.arpa
./bin/build_binary myarpa.arpa my_probing_model.mmap
Then I tried the snippet from here:
https://kheafield.com/code/kenlm/developers/
With a TrieModel, it always ends with a segfault, regardless of MAX_ORDER. The error occurs here:
lm::ngram::trie::TrieSearch<lm::ngram::DontQuantize, lm::ngram::trie::DontBhiksha>::SetupMemory(unsigned char*, std::vector<unsigned long, std::allocator<unsigned long> > const&, lm::ngram::Config const&) ()
With a ProbingModel, I get a segfault only for MAX_ORDER < 5:
lm::ngram::detail::GenericModel<lm::ngram::detail::HashedSearch<lm::ngram::BackoffValue>, lm::ngram::ProbingVocabulary>::ResumeScore(unsigned int const*, unsigned int const*, unsigned char, unsigned long&, float*, unsigned char&, lm::FullScoreReturn&)
For MAX_ORDER = 5, the C++ program runs, but with a couple of Valgrind errors:
==3445== Invalid write of size 8
==3445== at 0x411B1A: lm::ngram::detail::GenericModel<lm::ngram::detail::HashedSearch<lm::ngram::BackoffValue>, lm::ngram::ProbingVocabulary>::GenericModel(char const*, lm::ngram::Config const&) (in /home/romain/dev/keukeyVoice/corpora/kenlm/ktest)
==3445== by 0x409920: lm::ngram::ProbingModel::ProbingModel(char const*, lm::ngram::Config const&) (model.hh:136)
Invalid write of size 8
==3445== at 0x43A06B: lm::ngram::detail::HashedSearch<lm::ngram::BackoffValue>::SetupMemory(unsigned char*, std::vector<unsigned long, std::allocator<unsigned long> > const&, lm::ngram::Config const&) (in /home/romain/dev/keukeyVoice/corpora/kenlm/ktest)
==3445== by 0x411515: lm::ngram::detail::GenericModel<lm::ngram::detail::HashedSearch<lm::ngram::BackoffValue>, lm::ngram::ProbingVocabulary>::SetupMemory(void*, std::vector<unsigned long, std::allocator<unsigned long> > const&, lm::ngram::Config const&) (in /home/romain/dev/keukeyVoice/corpora/kenlm/ktest)
==3445== by 0x411FC0: lm::ngram::detail::GenericModel<lm::ngram::detail::HashedSearch<lm::ngram::BackoffValue>, lm::ngram::ProbingVocabulary>::GenericModel(char const*, lm::ngram::Config const&) (in /home/romain/dev/keukeyVoice/corpora/kenlm/ktest)
But a JNA wrapper around the same snippet raises a "malloc(): memory corruption" error when loading the model.
I tried with and without pruning, with order 2 and 3, both with the KenLM from the download section and the one from GitHub. The corpus is about 1 GB.
One peculiarity of the vocabulary is that it contains a lot of words that are substrings of other words in the vocabulary.
I'm aware that this is probably not enough information for proper debugging, but I would be interested to know whether the Valgrind errors are OK, and whether you can suggest anything that would help me find the problem.
My system is Mint 17. The compilation succeeded with no warnings.
I'm getting this error while trying to compile kenlm with make -j 4; please help:
undefined reference to `boost::unit_test::ut_detail::normalize_test_case_name
Hi,
I'm running KenLM on the LM1B data (the 1 Billion Word Language Modeling benchmark), and for some weird reason perplexity goes down for an extremely small model:
unigram tokens | unigram types | with OOV | exclude OOV
38023755 | 337972 | 156.5 | 148.27
3807417 | 107563 | 247.5 | 215.76
380918 | 32879 | 398.2 | 283.43
37438 | 8406 | 522.2 | 253.29
3728 | 1640 | 392.2 | 118.1
As you can see, when the unigram token count drops very low (the smallest model), the perplexity magically drops to 392.2.
How does KenLM calculate perplexity including OOV and excluding OOV?
Hello! I'm new to open source and would like to help. :)
I checked the kenlm project with Cppcheck, a static analysis tool for C/C++ code.
All errors are in "jam-files/engine". Can I fix these errors via a pull request?
Or is that code not used?
[jam-files/engine/compile.c:69]: (error) Buffer is accessed out of bounds.
[jam-files/engine/hcache.c:146]: (error) Common realloc mistake: 'buf' nulled but not freed upon failure
[jam-files/engine/lists.c:104]: (error) Pointer to local array variable returned.
[jam-files/engine/lists.c:135]: (error) Pointer to local array variable returned.
[jam-files/engine/lists.c:35]: (error) Allocation with malloc, return doesnt release it.
[jam-files/engine/make1.c:121]: (error) Allocation with malloc, return doesnt release it.
[jam-files/engine/mkjambase.c:73]: (error) Resource leak: fout
[jam-files/engine/modules/order.c:85]: (error) Memory leak: colors
[jam-files/engine/object.c:262]: (error) Memory leak: m
[jam-files/engine/regexp.c:255]: (error) Memory leak: r
[jam-files/engine/regexp.c:520]: (error) Uninitialized variable: classend
[jam-files/engine/regexp.c:521]: (error) Uninitialized variable: classr
[jam-files/engine/rules.c:552]: (error) Buffer is accessed out of bounds.
[jam-files/engine/yyacc.c:166]: (error) Memory leak: key.string
[jam-files/engine/yyacc.c:195]: (error) Resource leak: grammar_source_f
full list: http://pastebin.com/0AjCPcD