julius-speech / julius Goto Github PK

Open-Source Large Vocabulary Continuous Speech Recognition Engine

License: BSD 3-Clause "New" or "Revised" License

C 85.12% Perl 0.06% Shell 3.22% Makefile 1.49% C++ 5.67% Lex 0.01% Yacc 0.08% CMake 0.16% Python 0.30% M4 0.09% Batchfile 0.01% HTML 0.30% SAS 0.02% CLIPS 0.05% Pascal 0.62% Ada 0.80% Assembly 1.26% C# 0.50% DIGITAL Command Language 0.24% Module Management System 0.01%

speech recognition audio-processing speech-recognition

julius's Introduction

Julius: Open-Source Large Vocabulary Continuous Speech Recognition Engine

Copyright (c) 1991-2020 Kawahara Lab., Kyoto University
Copyright (c) 2005-2020 Julius project team, Lee Lab., Nagoya Institute of Technology
Copyright (c) 1997-2000 Information-technology Promotion Agency, Japan
Copyright (c) 2000-2005 Shikano Lab., Nara Institute of Science and Technology

About Julius

"Julius" is a high-performance, small-footprint large vocabulary continuous speech recognition (LVCSR) decoder software for speech-related researchers and developers. Based on word N-gram and context-dependent HMM, it can perform real-time decoding on various computers and devices from micro-computer to cloud server. The algorithm is based on 2-pass tree-trellis search, which fully incorporates major decoding techniques such as tree-organized lexicon, 1-best / word-pair context approximation, rank/score pruning, N-gram factoring, cross-word context dependency handling, enveloped beam search, Gaussian pruning, Gaussian selection, etc. Besides search efficiency, it is also modularized to be independent from model structures, and wide variety of HMM structures are supported such as shared-state triphones and tied-mixture models, with any number of mixtures, states, or phone sets. It also can run multi-instance recognition, running dictation, grammar-based recognition or isolated word recognition simultaneously in a single thread. Standard formats are adopted for the models to cope with other speech / language modeling toolkit such as HTK, SRILM, etc. Recent version also supports Deep Neural Network (DNN) based real-time decoding.

The main platform is Linux and other Unix-based system, as well as Windows, Mac, Androids and other platforms.

Julius has been developed as a research software for Japanese LVCSR since 1997, and the work was continued under IPA Japanese dictation toolkit project (1997-2000), Continuous Speech Recognition Consortium, Japan (CSRC) (2000-2003) and Interactive Speech Technology Consortium (ISTC).

The main developer / maintainer is Akinobu Lee ([email protected]).

Features

An open-source LVCSR software (BSD 3-clause license).
Real-time, hi-speed, accurate recognition based on 2-pass strategy.
Low memory requirement: less than 32MBytes required for work area (<64MBytes for 20k-word dictation with on-memory 3-gram LM).
Supports LM of N-gram with arbitrary N. Also supports rule-based grammar, and word list for isolated word recognition.
Language and unit-dependent: Any LM in ARPA standard format and AM in HTK ascii hmm definition format can be used.
Highly configurable: can set various search parameters. Also alternate decoding algorithm (1-best/word-pair approx., word trellis/word graph intermediates, etc.) can be chosen.
List of major supported features:
- On-the-fly recognition for microphone and network input
- GMM-based input rejection
- Successive decoding, delimiting input by short pauses
- N-best output
- Word graph output
- Forced alignment on word, phoneme, and state level
- Confidence scoring
- Server mode and control API
- Many search parameters for tuning its performance
- Character code conversion for result output.
- (Rev. 4) Engine becomes Library and offers simple API
- (Rev. 4) Long N-gram support
- (Rev. 4) Run with forward / backward N-gram only
- (Rev. 4) Confusion network output
- (Rev. 4) Arbitrary multi-model decoding in a single thread.
- (Rev. 4) Rapid isolated word recognition
- (Rev. 4) User-defined LM function embedding
DNN-based decoding, using front-end module for frame-wise state probability calculation for flexibility.

Quick Run

How to test English dictation with Julius and English DNN model. The procedure is for Linux but almost the same for other OS.

(For Japanese dictation, Use dictation kit)

1. Build latest Julius

% sudo apt-get install build-essential zlib1g-dev libsdl2-dev libasound2-dev
% git clone https://github.com/julius-speech/julius.git
% cd julius
% ./configure --enable-words-int
% make -j4
% ls -l julius/julius
-rwxr-xr-x 1 ri lab 746056 May 26 13:01 julius/julius

2. Get English DNN model

Go to JuliusModel page and download the English model(LM+DNN-HMM) named "ENVR-v5.4.Dnn.Bin.zip". Unzip it and cd to there.

% cd ..
% unzip /some/where/ENVR-v5.4.Dnn.Bin.zip
% cd ENVR-v5.4.Dnn.Bin

3. Modify config file

Edit the dnn.jconf file in the unzipped folder to fit the latest version of Julius:

(edit dnn.jconf)
@@ -1,5 +1,5 @@
 feature_type MFCC_E_D_A_Z
-feature_options -htkconf wav_config -cvn -cmnload ENVR-v5.3.norm -cmnstatic
+feature_options -htkconf wav_config -cvn -cmnload ENVR-v5.3.norm -cvnstatic
 num_threads 1
 feature_len 48
 context_len 11
@@ -21,3 +21,4 @@
 output_B ENVR-v5.3.layerout_bias.npy
 state_prior_factor 1.0
 state_prior ENVR-v5.3.prior
+state_prior_log10nize false

4. Recognize audio file

Recognize "mozilla.wav" included in the zip file.

% ../julius/julius/julius -C julius.jconf -dnnconf dnn.jconf

You'll get tons of messages, but the final result of the first speech part will be output like this:

sentence1: <s> without the data said the article was useless </s>
wseq1: <s> without the data said the article was useless </s>
phseq1: sil | w ih dh aw t | dh ax | d ae t ah | s eh d | dh iy | aa r t ah k ah l | w ax z | y uw s l ah s | sil
cmscore1: 0.785 0.892 0.318 0.284 0.669 0.701 0.818 0.103 0.528 1.000
score1: 261.947144

"test.dbl" contains list of audio files to be recognized. Edit the file and run again to test with another files.

5. Run with live microphone input

To run Julius on live microphone input, save the following text as "mic.jconf".

-input mic
-htkconf wav_config
-h ENVR-v5.3.am
-hlist ENVR-v5.3.phn
-d ENVR-v5.3.lm
-v ENVR-v5.3.dct
-b 4000
-lmp 12 -6
-lmp2 12 -6
-fallback1pass
-multipath
-iwsp
-iwcd1 max
-spmodel sp
-no_ccd
-sepnum 150
-b2 360
-n 40
-s 2000
-m 8000
-lookuprange 5
-sb 80
-forcedict

and run Julius with the mic.jconf instead of julius.jconf

% ../julius/julius/julius -C mic.jconf -dnnconf dnn.jconf

Download

The latest release version is 4.6, released on September 2, 2020. You can get the released package from the Release page. See the "Release.txt" file for full list of updates. Run with "-help" to see full list of options.

Install / Build Julius

Follow the instructions in INSTALL.txt.

Tools and Assets

There are also toolkit and assets to run Julius. They are maintained by the Julius development team. You can get them from the following Github pages:

Japanese Dictation Kit

A set of Julius executables and Japanese LM/AM. You can test 60k-word Japanese dictation with this kit. For AM, triphone HMMs of both GMM and DNN are included. For DNN, a front-end DNN module, separated from Julius, computes the state probabilities of HMM for each input frame and send them to Julius via socket to perform real-time DNN decoding. For LM, 60k-word 3-gram trained by BCCWJ corpus is included. You can get it from its GitHub page.

Recognition Grammar Toolkit

Documents, sample files and conversion tools to use and build a recognition grammar for Julius. You can get it from the GitHub page.

Speech Segmentation Toolkit

This is a handy toolkit to do phoneme segmentation (aka phoneme alignments) for speech audio file using Julius. Given pairs of speech audio file and its transcription, this toolkit perform Viterbi alignment to get the beginning and ending time of each phoneme. This toolkit is available at its GitHub page.

Prompter

Prompter is a perl/Tkx based tiny program that displays recognition results of Julius in a scrolling caption style.

About Models

Since Julius itself is a language-independent decoding program, you can make a recognizer of a language if given an appropriate language model and acoustic model for the target language. The recognition accuracy largely depends on the models. Julius adopts acoustic models in HTK ascii format, pronunciation dictionary in almost HTK format, and word 3-gram language models in ARPA standard format (forward 2-gram and reverse N-gram trained from same corpus).

We had already examined English dictations with Julius, and another researcher has reported that Julius has also worked well in English, Slovenian (see pp.681--684 of Proc. ICSLP2002), French, Thai language, and many other Languages.

Here you can get Japanese and English language/acoustic models.

Japanese

Japanese language model (60k-word trained by balanced corpus) and acoustic models (triphone GMM/DNN) are included in the Japanese dictation kit. More various types of Japanese N-gram LM and acoustic models are available at CSRC. For more detail, please contact [email protected].

English

There are some user-contributed English models for Julius available on the Web.

JuliusModels hosts English and Polish models for Julius. All of the models are based on HTK modelling software and data sets available freely on the Internet. They can be downloaded from a project website which I created for this purpose. Please note that DNN version of these models require minor changes which the author included in a modified version of Julius on Github at https://github.com/palles77/julius .

The VoxForge-project is working on the creation of an open-source acoustic model for the English language. If you have any language or acoustic model that can be distributed as a freeware, would you please contact us? We want to run dictation kit on various languages other than Japanese, and share them freely to provide a free speech recognition system available for various languages.

Documents

References

Official web site (Japanese)
Old development site, having old releases
Publications:
- A. Lee and T. Kawahara. "Recent Development of Open-Source Speech Recognition Engine Julius" Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2009.
- A. Lee, T. Kawahara and K. Shikano. "Julius --- an open source real-time large vocabulary recognition engine." In Proc. European Conference on Speech Communication and Technology (EUROSPEECH), pp. 1691--1694, 2001.
- T. Kawahara, A. Lee, T. Kobayashi, K. Takeda, N. Minematsu, S. Sagayama, K. Itou, A. Ito, M. Yamamoto, A. Yamada, T. Utsuro and K. Shikano. "Free software toolkit for Japanese large vocabulary continuous speech recognition." In Proc. Int'l Conf. on Spoken Language Processing (ICSLP) , Vol. 4, pp. 476--479, 2000.

Moved to UTF-8

We are going to move to UTF-8.

The master branch after the release of 4.5 (2019/1/2) has codes converted to UTF-8. All files were converted to UTF-8, and future update will be commited also in UTF-8.

For backward compatibility and log visibility, we are keeping the old encoding codes at branch "master-4.5-legacy". The branch keeps legacy encoding version of version 4.5. If you want to inspect the code progress before the release of 4.5 (2019/1/2), please checkout the branch.

License and Citation

This code is made available under the modified BSD License (BSD-3-Clause License).

Over and above the legal restrictions imposed by this license, when you publish or present results by using this software, we would highly appreciate if you mention the use of "Large Vocabulary Continuous Speech Recognition Engine Julius" and provide proper reference or citation so that readers can easily access the information of the software. This would help boost the visibility of Julius and then further enhance Julius and the related software.

Citation to this software can be a paper that describes it,

A. Lee, T. Kawahara and K. Shikano. "Julius --- An Open Source Real-Time Large Vocabulary Recognition Engine". In Proc. EUROSPEECH, pp.1691--1694, 2001.

A. Lee and T. Kawahara. "Recent Development of Open-Source Speech Recognition Engine Julius" Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2009.

or a direct citation to this software,

A. Lee and T. Kawahara: Julius v4.5 (2019) https://doi.org/10.5281/zenodo.2530395

or both.

julius's People

Contributors

Stargazers

Watchers

Forkers

imace kuronekodaisuki noymer franklixuefei twistedmove takeyamajp digideskio jongyoonb yama07 domyoriginal abubakr88 lijian8 halspeech felipebetancur jameslinus aki33524 dreadlord1984 brainstormers seem-sky georgesstavracas smartnetguru palles77 jelkawasaki atyenoria mayanksuman chiepomme anujsrc hadeshacker sekine-lab ming-hai iclwy524 chagge ilibx progyuan wellido vck g10dras frankiegu dc500 gyjsuccess mimoccc wow999 lemonhall developer119korea peterzhousz sandy-slin davidulloag weizy1981 ericyao2013 vubahuan icewwn shu-i reloadbrain xiaozhuo12138 hmilysls rarecode trainkin gttmy dfordivam praveenmunagapati kyokuheki lqlstudio1 cosmo0920 praaline brackwang charlieyqin xiaowenth kitter gyley donadonny nadiamostafa sallygam unnonouno hj3938 denjiry superstylin04 jsbrique radamar denysegh alongwithyou machangfengjia asthanam dylancao antoniopessotti sanyaade-speechtools saldi tdautc19841202 dejbug cambricorp lazyuncle google-front-end-weekend-projects aymannedaa mgsong neozephyrus hcxiong mygotone ndf14685 lancezhangsf carlosgili tsingping

julius's Issues

Memory allocation problem

I consitently seem to be getting problems with memory allocation on Windows. I tried to change value of MYBMALLOC_BLOCK_SIZE from 5000 to smaller or higher, to see how memory fragmentation is affecting the allocation, but for some reason I get the same memory block size problem with a message:
Error: mymalloc: failed to allocate 3576400 bytes
It does not matter how long the input WAVE file is.
Below is my sample JCONF file:

-input file
-filelist Models\test.dbl
-htkconf Models\wav_config
-h Models\PLPL-v4.2.am
-hlist Models\PLPL-v4.2.phn
-d Models\PLPL-v4.2.lm
-v Models\PLPL-v4.2.dct
-b 2000
-lmp 12 -4
-lmp2 12 -4
-walign
-fallback1pass
-multipath
-iwsp
-norealtime
-iwcd1 max
-spmodel sp
-spsegment
-gprune none
-no_ccd
-sepnum 150
-b2 360
-n 40
-s 2000
-m 8000
-lookuprange 5
-sb 80
-forcedict

Compliling Julius with pulseaudio?

@nitslp-ri Suggestion: each time I compile Julius I forget how to ask Julius to promote my pulseaudio server to the top so that Julius uses it by default. My memory falsely tells me it should be ./configure --mictype=pulseaudio when it should be --with-mictype=pulseaudio (which works well and promotes pulseaudio as required). I don't know how common an issue this is, but in the past there have been a number of questions asked on forums related to Julius about getting the right sound backend/device in place, some of which do not seem to end in a satisfactory resolution. I wonder if it might be helpful to have a note added to the output of ./configure --help addressing the possible options such as --with-mictype?

Build breakage on AVX2

The latest libsent from the master branch is broken because it uses AVX2 target without actually adding that target.

Julius 4.4 compilation error on Cyginw

Julius 4.4 cannot be compiled on Cygwin. I had to modify calc_dnn.c in lines:
14: #undef AVX
381: float x = 0.0f;

to get it compiling on Cygwin with the following config:
bash configure --enable-sp-segment --enable-words-int --with-mictype=portaudio

failed to begin input stream on raspberry pi2

I install julius on raspberry just with ./configure option and wen I try to run an example tested on Ubuntu I get this error on raspberry: padsp julius -C julian.jconf | ./command.py

failed to begin input stream

so I try to install julius --with-mictype=alsa on raspberry this step works well, but make step stop here:

collect2: error: ld returned 1 exit status
Makefile:57: recipe for target 'julius' failed
make[1]: *** [julius] Error 1
make[1]: Leaving directory '/home/pi/julius/julius'
Makefile:56: recipe for target 'julius' failed
make: *** [julius] Error 2

Somebody can help please.

[Documentation Request] Tutorial on dictation mode

I'm curious to try Julius running on dictation mode. It'd be great to have:

Step-by-step tutorial on setting up Julius to run on dictation mode
Examples using the available Japanese data
Tips on how to improve that

adinrec : a couple of very strange problems

Until now I have been using sox with the silence parameter to record prompts for HTK, but I think the VAD in adinrec is better for this purpose once a couple of problems are eliminated.

on my openSUSE Linux with Gnome I have to be very careful how to specify the command. For example
padsp adinrec test.wav
will generate a lot of
Warning: adin_oss: no data fragment after 300 msec?
messages and fragmented audio is recorded if the Gnome pulseaudio settings window is not open. Open this window, make no setting changes, and the problem goes away, recordings ok.
attempting to use the "-input pulseaudio" option (legal according to help output) results in
STAT: ###### initialize input device STAT: AD-in thread created ......................................Assertion 'pa_atomic_load(&(s)->_ref) >= 1' failed at pulse/stream.c:335, function pa_stream_get_state(). Aborting.
the tailmargin option value is a bit off. For example
-tailmargin 8000 results in silence after the audio of about 1000 ms.

Working around these issues means that adinrec is a good and valuable tool.

Request: diagnostic option to check and exit

When using Julius and making frequent changes to audio and language models, from time to time the startup of Julius will exit abnormally due to missing diphones or triphones, for example. I'm wondering if it would be helpful to have a diagnostic option which would allow Julius to start, run through all its checks loading the models and parameters, but before beginning any recognition would simply exit with an error code, 0 for all checks passed, 1 for error found, and ideally a special return value for possible errors so that you know what to look for in the output.

Right now I launch Julius expecting it to run, but sometimes it does not due to a mistake in my setup. It would be cool to add a diagnostic to my model build routine to ensure no surprises. It's possible this diagnostic already exists and that I am using the various channels like stdout and stderr ineffectively - would be interested to know.

Missing flags for nextword and mkbingram

I am compiling Julius under Cygwin. I noticed that 2 flags were missing from mkbingram/Makefile.in and gramtools/nextword/Makefile.in which prevented proper compilation
The first one was missing a library '-libiconv'
The second one was missing a library '-lncurses'
Could you please add these to repository?

Thx
Leslaw

Dictionary format

The dictionary contains word definition per line. A word is defined by its LM referer, output label and HMM name sequence. Each field is separated by space or tab. The simplest word entry with N-gram can be

Tokyo  t o: ky o:

where the first field is LM entry string, and the rests are HMM name sequence. At recognition, Julius will consult the given N-gram with string "Tokyo" and use the prob as LM score of this word.

You can also specify output string dependent from the N-gram entry.

Tokyo_city [Tokyo]   t o: ky o:

This means that this word will be output as "Tokyo" when recognized, and its LM prob will be assigned by the prob of Tokyo_city in the N-gram.

At grammar, the usage of the first field changes. When used with grammar, the first field is a category number (integer, input of grammar DFA) where this word belongs to.

Moreover, you can insert TWO optional fields between the first field and the second field. They are (1) additional per-word LM log10 score preceded by '@', and (2) LM output string to be displayed when recognized as LM entry. This optional feature is typically used for applying class N-gram at Julius, where N-gram is built upon a class tag and in-class word probs are specified in dictionary. For example, a word entry

<place> @-2.0 Tokyo_city [Tokyo]   t o: ky o:

means that the output string is "Tokyo" (and it's LM output string will be "Tokyo_city") and be assigned the prob of N-gram word <place> in N-gram, and then add an additional score of "-2.0" to the LM score.

The optional per-word additional score "@..." can be used in grammar, as word-dependent insertion scores.

How to get recognition results back to adintool/adinnet

This is a question about how to get the recognition result back when we use adintool ( –out adinnet) to send a file to Julius (-input adinnet)

If I send a .wav file to Julius with adintool:

echo "file.wav" | adintool -in file -out adinnet -server localhost

(Julius started on localhost with -input adinnet)

Then I can see the recognition results on Julius' standard output, but I don't get any result on the client side (adintool).

How can I get the recognition results back to the client/adintool?

Thanks.

julius-simple doesn't recognize -help flag

I have grabbed the precompiled Linux binaries for Julius (julius-4.3.1-linuxbin.tar.gz), built and tried to run julius-simple to see Julius at work, but I wasn't able to do anything with it. It does react to -setting, but -help doesn't seem to work. My system is Debian Jessie x86-64.

$ ./julius-simple 
Julius rev.4.3.1 - based on 
JuliusLib rev.4.3.1 (fast)  built for x86_64-unknown-linux-gnu

Copyright (c) 1991-2013 Kawahara Lab., Kyoto University
Copyright (c) 1997-2000 Information-technology Promotion Agency, Japan
Copyright (c) 2000-2005 Shikano Lab., Nara Institute of Science and Technology
Copyright (c) 2005-2013 Julius project team, Nagoya Institute of Technology

Try '-setting' for built-in engine configuration.
Try '-help' for run time options.

$ ./julius-simple -help
ERROR: m_options: wrong argument: "-help"
Try `-help' for more information.

$ ./julius-simple --help
ERROR: m_options: wrong argument: "--help"
Try `-help' for more information.

$ ./julius-simple help
ERROR: m_options: wrong argument: "help"
Try `-help' for more information.

$ ./julius-simple -setting
JuliusLib rev.4.3.1 (fast)

Engine specification:
 -  Base setup   : fast
 -  Supported LM : DFA, N-gram, Word
 -  Extension    :
 -  Compiled by  : gcc -O6 -fomit-frame-pointer

Library configuration: version 4.3.1
 - Audio input
    primary A/D-in driver   : alsa (Advanced Linux Sound Architecture)
    available drivers       : alsa oss
    wavefile formats        : RAW and WAV only
    max. length of an input : 320000 samples, 150 words
 - Language Model
    class N-gram support    : yes
    MBR weight support      : yes
    word id unit            : short (2 bytes)
 - Acoustic Model
    multi-path treatment    : autodetect
 - External library
    file decompression by   : zlib library
 - Process hangling
    fork on adinnet input   : no

Try `-help' for more information.

I found a bug.

Here is commit message.
By unknown reason, the file with .grammer as its extention is
"can't open".
[caz@ventus tophat]$ mkdfa.pl kaden
cannot open "kaden.grammar" at mkdfa.pl line 69.

diff --git a/gramtools/mkdfa/mkdfa.pl.in b/gramtools/mkdfa/mkdfa.pl.in
index 495c329..ea347d6 100755
--- a/gramtools/mkdfa/mkdfa.pl.in
+++ b/gramtools/mkdfa/mkdfa.pl.in
@@ -56,7 +56,7 @@ foreach $arg (@argv) {
if ($gramprefix eq "") {
usage();
}
-$gramfile = "$ARGV[$#ARGV].grammar";
+$gramfile = "$ARGV[$#ARGV].grm";
$vocafile = "$ARGV[$#ARGV].voca";
$dfafile = "$ARGV[$#ARGV].dfa";
$dictfile = "$ARGV[$#ARGV].dict";

How can I submit this kind of bug and improvement?
-caz, caz at caztech dot com
P.S. I am integrating julius into glide computer called Tophat which is used by glider pilot.

Julius stability

Julius has quite a few problems when it comes to decoding longer files ( > 5mins). I created my own fork of Julius which has some fixes around this area. Maybe it would be good to pull these fixes across to main Julius code branch?

julius -input adinnet protocol

Hi,

I am trying to connect to julius server socket from a java program.

In the documentation it's written that "The protocol is a specific one sending just a sequence of audio sample streams per a small packet. There are no detailed document for the procotol, but it’s a basic and very simple one, since it has no encription or encode/decode features. adintool implements the protocol."

So I am reading adintool.c (not used to c...) and I have found adin_send_end_of_segment() that seems to send an empty char at the end of a segment, so I tried to do that. I can send a few packets (sometimes only one), but after that, I always get "Connection reset by peer: socket write error" on the client.

While I am looking into adintool.c, ....does someone out there know the detailed protocol?

Thanks.

How does the -inactive option work?

If I try to run Julius (JuliusLib rev.4.4.2 (fast)) with the -inactive option such as
julius -inactive -C some.jconf then the result is an infinite loop of


WARNING: pause requested but no pause function specified
WARNING: engine will resume now immediately

No easy way to check whether JuliusLib is installed

JuliusLib, the static library distributed by Julius, doesn't provide pkg-config files so that build projects can transparently check whether the target system has Julius installed or not.

Training recipe for DNN-HMMs compatible with Julius

Hello

It is possible to publish a training recipe for generation of DNN-HMMs compatible with Julius? I am specifically interested in steps leading from triphone GMM-HMMs to DNN-HMMs which are a part of dictation toolkit. I am using HTK for training DNN-HMMs but it contains a very general recipe which I am not sure how compatible is with Julius acoustic model requirements.

Kind Regards
Leslaw

-lasound missing when doing make all

make[1]: Entering directory '/home/john/.local/share/Trash/files/julius-4.4.3.1/julius'
gcc -m32 -I. -I../libjulius/include -I../libsent/include  `../libsent/libsent-config --cflags` `../libjulius/libjulius-config --cflags` -o julius main.o recogloop.o module.o output_module.o output_stdout.o output_file.o record.o charconv.o charconv_iconv.o -L../libjulius `../libjulius/libjulius-config --libs` -L../libsent `../libsent/libsent-config --libs`  
/usr/bin/ld: cannot find -lasound
collect2: error: ld returned 1 exit status
Makefile:58: recipe for target 'julius' failed
make[1]: *** [julius] Error 1

I am using Ubuntu Xenail 64 bit and i am stuck at missing lasound..

i have gone thrue all thies steps and there all fixed:

.Bash_profile does not exist (VoxForge for Julius STT)
Finding .bash_profile and adding path's
yum install alsa-lib-devel.i686 (yum and alsa-lib-devel.i686 does not exist as package)
installation or configuration problem: C compiler cannot create executables.
Jcode.pm missing

OBS: I am doing a installing problem thread too so if you want i can message you the rest upcoming problems

What "no data fragment" means

Sometimes I get a series of messages saying
Warning: adin_alsa: no data fragment after 300 msec?
Is this just telling me it hears nothing above the noise threshold, or is something wrong? Julius does not recognize anything I say while this is going on.

I am running this on a RaspberryPi3 using ALSA for input through a USB microphone. I found that Pulseaudio has too much overhead, probably due to process switching, to use on a RaspberryPi, and it introduces gaps. So I am staying with ALSA.

If I stop Julius and restart it, sometimes the message do not recur.

In order to be decoupled from Julius, I run Julius in "module" mode and communicate over a TCP link. It would be nice if there was a way to detect that this problem is happening besides just not receiving recognition output.

using mkdfa the grammar compiler

Hi i am trying to use an application that I've used many times to control the music player, but when trying to compile grammar:

hiddenotebook@rules:~/julius/test-python$ mkdfa mediaplayer
The " mkdfa " program is not installed. You can install it by typing:
sudo apt install julius

This is not an error but I have Julius on /home/user/julius
How I can redirect mkdfa to that directory?
I'm missing something?

I hope you can forgive me for the lack of experience. Thank you very much.

Error: adin_alsa: cannot set PCM hardware parameters (Invalid argument)

I have downloaded and compiled the newest version of Julius and english dictation kit on raspberry pi but I am having a problem using the cirrus audio logic board. I tried to start Julius like this:

ALSADEV="plughw:1,0" ~/julius-4.4.2/julius/julius -C ~/julius-kits/dictation-kit-v4.3.1-linux/main.jconf -C ~/julius-kits/dictation-kit-v4.3.1-linux/am-gmm.jconf -nostrip

On the end I get the following:

### read waveform input
Stat: adin_alsa: device name from ALSADEV: "plughw:1,0"
Stat: capture audio at 16000Hz
Stat: adin_alsa: latency set to 32 msec (chunk = 512 bytes)
Error: adin_alsa: cannot set PCM hardware parameters (Invalid argument)
failed to begin input stream

I think I selected the hardware correctly, since the output of: cat /proc/asound/cards is

 0 [ALSA           ]: bcm2835 - bcm2835 ALSA
                      bcm2835 ALSA
 1 [sndrpiwsp      ]: snd_rpi_wsp - snd_rpi_wsp
                      snd_rpi_wsp

What is the problem? Why is it trying to pass an invalid argument to the function that sets PCM hardware parameters?

Error compiling julius

Hi there!
Im trying to install julius using pulseaudio in Ubuntu.16 using this comand and I get this error message:

This is the output when I run this command:

~/julius-4.2.2$ ./configure --prefix=/usr --with-mictype=pulseaudio

loading cache ./config.cache
checking host system type... x86_64-unknown-linux
checking host specific optimization flag... no
checking for gcc... (cached) gcc
checking whether the C compiler (gcc ) works... yes
checking whether the C compiler (gcc ) is a cross-compiler... no
checking whether we are using GNU C... (cached) yes
checking whether gcc accepts -g... (cached) yes
checking how to run the C preprocessor... (cached) gcc -E
checking for a BSD compatible install... (cached) /usr/bin/install -c
checking for rm... (cached) /bin/rm
checking for Cygwin environment... (cached) no
checking for mingw32 environment... (cached) no
checking for executable suffix... (cached) no
creating ./config.status
creating Makefile
creating mkbingram/Makefile
creating mkbinhmm/Makefile
creating adinrec/Makefile
creating adintool/Makefile
creating mkss/Makefile
creating generate-ngram/Makefile
creating jclient-perl/Makefile
creating man/Makefile
configuring in mkgshmm
running /bin/sh ./configure --prefix=/usr --with-mictype=pulseaudio --cache-file=.././config.cache --srcdir=.
loading cache .././config.cache
checking for a BSD compatible install... (cached) /usr/bin/install -c
checking for rm... (cached) /bin/rm
checking for perl... (cached) /usr/bin/perl
checking for Cygwin environment... (cached) no
checking for mingw32 environment... (cached) no
checking for executable suffix... (cached) no
creating ./config.status
creating Makefile
creating mkgshmm
configuring in gramtools
running /bin/sh ./configure --prefix=/usr --with-mictype=pulseaudio --cache-file=.././config.cache --srcdir=.
loading cache .././config.cache
checking host system type... x86_64-unknown-linux
checking host-specific optimization flag... no
checking for gcc... (cached) gcc
checking whether the C compiler (gcc ) works... yes
checking whether the C compiler (gcc ) is a cross-compiler... no
checking whether we are using GNU C... (cached) yes
checking whether gcc accepts -g... (cached) yes
checking how to run the C preprocessor... (cached) gcc -E
checking for a BSD compatible install... (cached) /usr/bin/install -c
checking for Cygwin environment... (cached) no
checking for mingw32 environment... (cached) no
checking for executable suffix... (cached) no
checking host specific optimization flag... skipped
checking for rm... (cached) /bin/rm
checking for perl... (cached) /usr/bin/perl
checking for iconv... (cached) /usr/bin/iconv
checking for Jcode module in perl... Can't locate Jcode.pm in @inc (you may need to install the Jcode module) (@inc contains: /etc/perl /usr/local/lib/x86_64-linux-gnu/perl/5.22.2 /usr/local/share/perl/5.22.2 /usr/lib/x86_64-linux-gnu/perl5/5.22 /usr/share/perl5 /usr/lib/x86_64-linux-gnu/perl/5.22 /usr/share/perl/5.22 /usr/local/lib/site_perl /usr/lib/x86_64-linux-gnu/perl-base .).
BEGIN failed--compilation aborted.
configure: warning: no Jcode module in perl, gram2sapixml.pl may not work
checking for readline/readline.h... (cached) no
checking for malloc.h... (cached) yes
creating ./config.status
creating Makefile
creating mkdfa/Makefile
creating mkdfa/mkdfa.pl
creating mkdfa/mkfa-1.44-flex/Makefile
creating dfa_minimize/Makefile
creating generate/Makefile
creating accept_check/Makefile
creating nextword/Makefile
creating yomi2voca/Makefile
creating yomi2voca/yomi2voca.pl
creating gram2sapixml/Makefile
creating gram2sapixml/gram2sapixml.pl
creating dfa_determinize/Makefile
configuring in jcontrol
running /bin/sh ./configure --prefix=/usr --with-mictype=pulseaudio --cache-file=.././config.cache --srcdir=.
loading cache .././config.cache
checking host system type... x86_64-unknown-linux
checking for gcc... (cached) gcc
checking whether the C compiler (gcc ) works... yes
checking whether the C compiler (gcc ) is a cross-compiler... no
checking whether we are using GNU C... (cached) yes
checking whether gcc accepts -g... (cached) yes
checking how to run the C preprocessor... (cached) gcc -E
checking for a BSD compatible install... (cached) /usr/bin/install -c
checking for rm... (cached) /bin/rm
checking for Cygwin environment... (cached) no
checking for mingw32 environment... (cached) no
checking for executable suffix... (cached) no
checking for gethostbyname... (cached) yes
checking for connect... (cached) yes
creating ./config.status
creating Makefile
configuring in julius
running /bin/sh ./configure --prefix=/usr --with-mictype=pulseaudio --cache-file=.././config.cache --srcdir=.
loading cache .././config.cache
checking host system type... x86_64-unknown-linux
checking host-specific optimization flag... no
checking for gcc... (cached) gcc
checking whether the C compiler (gcc ) works... yes
checking whether the C compiler (gcc ) is a cross-compiler... no
checking whether we are using GNU C... (cached) yes
checking whether gcc accepts -g... (cached) yes
checking how to run the C preprocessor... (cached) gcc -E
checking for a BSD compatible install... (cached) /usr/bin/install -c
checking for rm... (cached) /bin/rm
checking for Cygwin environment... (cached) no
checking for mingw32 environment... (cached) no
checking for executable suffix... (cached) no
checking for ANSI C header files... (cached) yes
checking for working const... (cached) yes
checking for winnls.h... (cached) no
checking for iconv... (cached) yes
checking for iconv declaration... (cached)
extern size_t iconv (iconv_t cd, char * *inbuf, size_t *inbytesleft, char * *outbuf, size_t *outbytesleft);
checking for charset conversion... iconv
creating ./config.status
creating Makefile
creating config.h
config.h is unchanged
configuring in libjulius
running /bin/sh ./configure --prefix=/usr --with-mictype=pulseaudio --cache-file=.././config.cache --srcdir=.
loading cache .././config.cache
checking host system type... x86_64-unknown-linux
checking host-specific optimization flag... no
checking for gcc... (cached) gcc
checking whether the C compiler (gcc ) works... yes
checking whether the C compiler (gcc ) is a cross-compiler... no
checking whether we are using GNU C... (cached) yes
checking whether gcc accepts -g... (cached) yes
checking how to run the C preprocessor... (cached) gcc -E
checking for a BSD compatible install... (cached) /usr/bin/install -c
checking for rm... (cached) /bin/rm
checking for ar... (cached) /usr/bin/ar
checking for ranlib... (cached) ranlib
checking for Cygwin environment... (cached) no
checking for mingw32 environment... (cached) no
checking for executable suffix... (cached) no
checking for ANSI C header files... (cached) yes
checking for working const... (cached) yes
checking return type of signal handlers... (cached) void
checking for dlopen... (cached) no
checking for dlopen in -ldl... (cached) yes
checking for POSIX thread library in -lpthread... yes
creating ./config.status
creating Makefile
creating libjulius-config
creating libjulius-config-dist
creating src/version.c
creating doxygen.conf.ver
creating include/julius/config.h
include/julius/config.h is unchanged
configuring in libsent
running /bin/sh ./configure --prefix=/usr --with-mictype=pulseaudio --cache-file=.././config.cache --srcdir=.
loading cache .././config.cache
checking host system type... x86_64-unknown-linux
checking host specific optimization flag... no
checking for gcc... (cached) gcc
checking whether the C compiler (gcc ) works... yes
checking whether the C compiler (gcc ) is a cross-compiler... no
checking whether we are using GNU C... (cached) yes
checking whether gcc accepts -g... (cached) yes
checking how to run the C preprocessor... (cached) gcc -E
checking for a BSD compatible install... (cached) /usr/bin/install -c
checking for rm... (cached) /bin/rm
checking for ar... (cached) /usr/bin/ar
checking for ranlib... (cached) ranlib
checking for Cygwin environment... (cached) no
checking for mingw32 environment... (cached) no
checking for executable suffix... (cached) no
checking for ANSI C header files... (cached) yes
checking for unistd.h... (cached) yes
checking whether byte ordering is bigendian... (cached) no
checking for working const... (cached) yes
checking for socklen_t... yes
checking for gethostbyname... (cached) yes
checking for connect... (cached) yes
checking for strcasecmp... (cached) yes
checking for sleep... (cached) yes
checking for alsa/asoundlib.h... (cached) no
checking for sys/asoundlib.h... (cached) no
checking for sys/soundcard.h... (cached) yes
checking for esd.h... (cached) no
checking for pa_simple_new in -lpulse-simple... (cached) no
configure: error: no PulseAudio header!
configure: error: ./configure failed for libsent

I don't know why my system cannot see the simple.h - it is in the right place
But, the simple.h file is in /usr/include/pulse/simple.h

If someone knows the reason of this error or makes any suggestion please answer me.

[Documentation request] Tutorial on importing DNN models

We would benefit from tutorial how to convert DNN models trained either by HTK 3.5.2-BETA or by Kaldi. I am specifically interested how you converted DNN models trained by Kaldi later used in Japanese speech dictation published here.

test

endt is not initialized in second pass

In file search_bestfirst_v1.c in line 596 endt is not initialized, which causes runtime exception in Windows in VS2013 in debug mode. However it works on Linux/Cygwin. This variable should be initialized to something. On Windows in release mode not initializing it leads to unpredictable decoding results.

[Announce] new branch adintool-gui

Hi, just announcing an experimental feature.

A new branch "adintool_gui" has been created which includes an experimental GUI version of adintool.
You can see how audio is captured and voice is detected on live screen. You can also adjust the trigger level threshold manually using keyboard, and tell server to force audio segmentation, on the fly.

You can try another very experimental feature, automatic trigger level adjuster. To test it, define AUTO_ADJUST_THRESHOLD in adintool.h. The algorithm is based on simple amplitude mean/variance. It is just a tiny hack and not stable, so do not use this on a critical system.

Checkout the "adintool-gui" branch, configure & make, then execute adintool/adintool. You need SDL2 library to compile. See "adintool/GUI_SDL2.txt" for details. I've tested it on Windows and Linux.

Exercising specific words from the grammar

Sometimes during testing of an audio model a particular word will emerge as problematic. Once the phonemes and base audio have been thoroughly checked, it becomes a matter of adding audio samples to give the model more practice with the qualifying sentences. The "generate" utility is handy for this, except for the fact that you can't ask for samples related to a specific word. You can generate a batch and then run a filter to extract those generated samples that include the word, but this can be expensive with large models, with the result that it is less effort to build examples by hand.

Just wondering what techniques or tools are available to automate this process of generating qualifying prompts which contain a specific string? What do other researchers use?

About new Spanish language/acoustic model

Hello all folks:
I and @dataf3l wanted to base a new product on your LVCSR engine, but needed Spanish support, could you tell us me how much would be needed in hours-man or money to do this?
I cannot see what I need to do/know to do this task, could you point me some guide?

trace_backptr: sentence length exceeded ( > 150)

Good afternoon,

When I try to process a sound file that is longer than about 20 seconds, I get this error:

trace_backptr: sentence length exceeded ( > 150)

Is there a way to increase this value of 150 so that I can process bigger files?
Do I need to recompile Julius with a specific option to change this value?

(I have tried to change -sepnum - in the jconf file - which default value is precisely 150 but it doesn't change anything...)

Thanks

License Conditions

I think Julius's license has many problems.

[ja]
3. 本ソフトウェアを利用して得られた知見に関して発表を行なう際には、
「大語彙連続音声認識エンジン Julius」を利用したことを明記して下さい。
[en]
3. When you publish or present any results by using the Software, you
must explicitly mention your use of "Large Vocabulary Continuous
Speech Recognition Engine Julius".

This clause causes a big problem.
According to GPL-FAQ,
Todays copyright law does not allow you to place such a requirement on the output of software,
regardless of whether it is licensed under the terms of the GPL or some other license.
http://www.gnu.org/licenses/gpl-faq.ja.html#RequireCitation
In addition, this condition is not adaption with Open-Source Deifinition.(i.e. Debian Free Software Guideline),
since this license contaminate other works.

9. License Must Not Restrict Other Software
The license must not place restrictions on other software that is distributed along with the licensed software. For example, the license must not insist that all other programs distributed on the same medium must be open-source software.

And license has another problem.
For instance, this clause is not compatible with GPL.

[ja]
5. 本ソフトウェアの利用に関しては、日本国の法律を準拠法とし、京都地方
裁判所を第一審の専属管轄裁判所とします。
[en]
5. This license of use of the Software shall be governed by the laws
of Japan, and the Kyoto District Court shall have exclusive primary
jurisdiction with respect to all disputes arising with respect
thereto.

Would you change license to other famous FLOSS licese?

Regards,

Polish speech models

Hello

This really is not much of an issue, more of an annoucement which may help others.
In https://sourceforge.net/projects/skrybotdomowy/files/Models/ you can find professional quality speech models for Polish language done by me a couple of year ago. They are ARPA and HTK models for Julius. They provide WER in the region of around 13%. Please feel free to use them - they are based on MIT licence.

Kind Regards

Leslaw Pawlaczyk

Julius 4.4 compilation error on raspberry pi2 with Raspbian Jessie

sudo ./configure --with-mictype=alsa

Everything was fine until the shell drop the following error:

checking how to run the C preprocessor... gcc -E
checking for SIMD AVX instruction... no
checking for SIMD AVX instruction with -mavx... configure: error: no support for SIMD AVX instruction
configure: error: ./configure failed for libsent

Anyone has any solution for this?

thank you

-LM language model: some comments

I recently used IRSTLM to successfully build a working LM for Julius, but it was a bit of a struggle. Here are some observations:

what is a dictionary when dealing with a grammar and an LM? Coming from the grammar side I naively assumed that the references to the .dict were the same format for both, but that is not so. I think the docs are a bit thin on what format the dict should be in for a LM; the grammar is well covered already with mkdfa.
In particular it is important to note that the text corpus used to build the model must contain the silences ~~and~~ . IRSTLM will go ahead and happily build iARPA model and convert it to ARPA without them, but when mkbingram tries to make a binary from the result, the process fails with

Stat: init_ngram: reading in ARPA forward n-gram from jdata2.ilm
Stat: ngram_read_arpa: this is 3-gram file
Stat: ngram_read_arpa: reading 1-gram part...
Stat: ngram_read_arpa: read 10 1-gram entries
Stat: ngram_read_arpa: reading 2-gram part...
Stat: ngram_read_arpa: 2-gram read 0 (0%)
Error: ngram_read_arpa: 2-gram #1: "-1.07918 ONE -0.30103": "" not exist in 2-gram
Stat: ngram_read_arpa: 2-gram read 0 (0%)
Error: init_ngram: failed to read "jdata2.ilm"

I think at one point it was possible to omit the silence tags but not now, it seems.
If the proper standard is for the silences to be in the data, then this becomes an issue for IRSTLM and I will post an issue there.

ERROR: m_chkparam:

I've installed julius 4.3 from source file.
When i try to run :
julius -C julius.jconf -input mic
I get :

STAT: include config: julius.jconf

And then nothing happen. Now if i check julius.log file i get :

ERROR: m_chkparam: cannot access sample.dict
ERROR: m_chkparam: cannot access sample.dfa
ERROR: m_chkparam: cannot access hmm15/hmmdefs
ERROR: m_chkparam: cannot access tiedlist
ERROR: m_chkparam: could not pass parameter check

What could cause that problem?
Thanks.

julius was installed with name "julius.dSYM" on Mac OS X

Julius rev.4.3.1
Mac OS X 10.6.8 (Snow Leopard)
Mac OS X 10.11.3 (EL Capitan)

*** Build ***
./configure
make

sudo make install

Build with no error
/usr/local/bin/julius.dSYM is work.

Unable to process wav file

The file and rawfile input methods don't seem to work, and there seems to be almost zero documentation on how to use them.

Running:

julius -input rawfile -filelist sample.wav -C Sample.jconf

gives me mangled output like:

Error: adin_file: failed to read speech data: "�d�t"
�rror: adin_file: failed to open �
 3
�rror: adin_file: failed to read speech data: "�
 3"
Error: adin_file: failed to open b���q���=�]�
                                             �o��������
Error: adin_file: failed to read speech data: "b���q���=�]�
                                                           �o��������"
Error: adin_file: failed to open ��Z
Error: adin_file: failed to read speech data: "��Z"
Error: adin_file: failed to open �����F�����\�����W�p�8���:

However, if I omit the -filelist, julius prompts me for a filename, and if I enter sample.wav, then it produces the correct output:

<s> DIAL TWO ONE TWO THREE FOUR EIGHT ONE TWO ONE TWO THREE FOUR
pass1_best_wordseq: 0 3 5 5 5 5 5 5 5 5 5 5 5 5
pass1_best_phonemeseq: sil | d ay ah l | t uw | w ah n | t uw | th r iy | f ao r | ey t | w ah n | t uw | w ah n | t uw | th r iy | f ao r
pass1_best_score: -28273.152344
### Recognition: 2nd pass (RL heuristic best-first)
STAT: 00 _default: 11810 generated, 5094 pushed, 1694 nodes popped in 940
sentence1: <s> DIAL TWO ONE TWO OH OH TWO OH OH FOUR TWO ONE TWO THREE FOUR </s>
wseq1: 0 3 5 5 5 5 5 5 5 5 5 5 5 5 5 5 1
phseq1: sil | d ay ah l | t uw | w ah n | t uw | ow | ow | t uw | ow | ow | f ao r | t uw | w ah n | t uw | th r iy | f ao r | sil
cmscore1: 1.000 0.823 0.998 0.016 0.001 0.000 0.000 0.000 0.020 0.034 0.025 0.965 0.777 0.991 0.997 0.999 1.000
score1: -28880.394531

Step by step tutorial to build a language model for Julius?

I have a corpus made of WAV files and transcriptions TXT files.
Is there somewhere a(n easy to follow) step by step tutorial to build the language model / acoustic model files needed by Julius in the JCONF configuration file, starting with those WAV + TXT files?
(BTW I am trying to read the HTK book too...)
Thanks

adintool: force split after n seconds

Hi.

Thanks for the useful adintool!
Is there an easy way to force a split after a given number of seconds if no pause has been detected before?

Greetings
Sven

Julius server behaviour on disconnect

My Julius socket server seems to behave correctly while it is running, performing decodes and responding to commands pause and so on. However when a client disconnects, the server goes into an unknown state, and on attempt to reconnect to the same socket id the server process shuts down. I can get it to shut down properly if I send a PAUSE and then disconnect with the client. Ideally the server would recognize the disconnect and then reset for a new connection to come later.

Is the intention that the Julius server waits for only one connection, or am I disconnecting improperly?

Julius High Word Error Rate (WER)

Hello,

I installed Julius 4.3.1 on windows/Cygwin and used my own Arabic acoustic model HTK 3.4, corpus 8.5 hours, ngram LM, tied.list. The decoder is working well. However, the results is far away from HTK HDecode. I have HDecode HTK WER = 30% while in Julius it is 90% which I think there is something wrong in Julius decoder configuration.

The configurations are:
-input mfcfile
-filelist filelist/test.scp
-nlr test.lm
-v test.dic
-h hmmsdef.mmf # acoustic HMM (ascii or Julius binary)
-hlist tied.list # HMMList to map logical phone to physical
-b 10000
-s 20000 # hypotheses stack size on 2nd pass (#hypo)
-m 10000 # hypotheses overflow threshold (#hypo)
-n 100 # num of sentences to find

I also applied the LM scale factor (10) that I'm using in HTK as follows:
-imp 10.0 0.0
the log file as attached.
log.txt

If you kindly help tweaking the decoder as I think it should give closed WER to HTK.

Appreciate your response.

Regards

Abdo

Duplicate line in startup report output

Just a tiny thing, but in the startup report there is a duplicate line relating to "long-term DC removal". I think the line involved is in https://github.com/julius-speech/julius/blob/master/libjulius/src/m_info.c where there is a test for a flag; the test can produce DC removal off, then after the test there is a second printing of DC removal off. I think one of the lines is superfluous.

I tried to submit this as a pull request, but on completion of the process that would authorize me to make a pull request from my own fork of Julius I was informed that "Sorry, something went wrong." So I am able to edit my own fork, but not able to make the comparison between my fork and the Julius master to initiate a PR. Sorry, I tried.

Some WER results

Hello All

I spent many years working with Julius and recently performed an interesting experiment, where I manipulated various config options to get best recognition rate. I found that the best settings are available for changes related to beam "-b", insertion and deletion flags "-lmp", "-lmp2". Julius is especially sensitive to those 3 settings. You can find more details about WER and decoding times from the CSV in attachment. I hope this will help some of you.

Some of the results:

best WER for beam = 500 is 20.50%, INS = 8, DEL = 4, Speed roughly 50 % of RT
best WER for beam = 1000 is 13.24%, INS = 10, DEL = 0, Speed roughly 100 % of RT
best WER for beam = 2000 is 11.91%, INS = 10, DEL = 0, Speed roughly 130 % of RT
best WER for beam = 4000 is 11.22%, INS = 12, DEL = -4, Speed roughly 140 % of RT
best WER for beam = 6000 is 11.06%, INS = 12, DEL = -4, Speed roughly 300 % of RT

p.s. In the attachment are detailed results plus configs for Julius.

Kind Regards
Leslaw

TestingResults_20151229_2238.txt
julius_config.txt
wav_config.txt

Port visualizer to Gtk3

Julius has an old builtin word trellis visualizer written for Gtk 1.2. Since Gtk 1.2 is not shipped by any major Unix-like system, it's sad that this module isn't ported to newer versions of Gtk+ 3.

The patches below are a tentative effort to modernize Julius word trellis visualizer.

Tools to create acoustic models

I need to create an acoustic model for Julius in English. The README says that Julius will accept AMs either in HTK format or in ARPA format.

The HTK toolkit looks very old; last web site update was in 2009. I can not get it to build properly on my x86-64but Linux system.

What tools are there for creating models in ARPA format?

While VoxForge claims to have English models for Julius, that website can not actually deliver the files - speed slows to zero.

New type of Acoustic Model

Hello

I know that Julius started supporting different type of acoustic models over the last year or so. I could not find any examples anywhere at all how to create such types of acoustic models. It would be good if at least a sample model was provided or some technical info about how such models should be structured. I can create usual HMM models using HTK toolkit but have no idea how to approach the new types of acoustic models. I presume that the new models are more efficient than HMMs?

Thx
Leslaw

how to compile for ARM Android

It is documented in http://tech-sketch.jp/2012/11/juliusandroid-1.html. But it is for 4.2.2. The same script does not work for 4.3.1. Always build for x86. Does anyone ever build 4.3.1 for Android?
-caz, caz at caztech dot com

New Version of Julius Book

Hello Dr.Lee,

Thanks for developing this great project. I wonder if you have time to update the Julius book to the latest version?

Thanks!

Memory leak in dictionary memory release

There is a memory leak, as property 'wseq' is not released in function void word_info_free(WORD_INFO *winfo). It should contain an entry after line 72 in file voca_malloc.c which says:
if (winfo->wseq!= NULL) free(winfo->wseq);

Convert the library to NDK build for Android use

Hi,
Anyone is integrate the "julius" library for android , please list the step how to integrate or How to build the ndk library.

julius-speech / julius Goto Github PK

julius's Introduction

Julius: Open-Source Large Vocabulary Continuous Speech Recognition Engine

About Julius

Features

Quick Run

1. Build latest Julius

2. Get English DNN model

3. Modify config file

4. Recognize audio file

5. Run with live microphone input

Download

Install / Build Julius

Tools and Assets

About Models

Japanese

English

Documents

References

Moved to UTF-8

License and Citation

julius's People

Contributors

Stargazers

Watchers

Forkers

julius's Issues

sudo make install

Recommend Projects

Recommend Topics

Recommend Org