[Julius-cvs 285] CVS update: julius4/libjulius

sumom****@users***** sumom****@users*****
Thu Oct 2 17:23:47 JST 2008


Index: julius4/libjulius/jconf.man
diff -u julius4/libjulius/jconf.man:1.1 julius4/libjulius/jconf.man:removed
--- julius4/libjulius/jconf.man:1.1	Tue Dec 18 23:09:23 2007
+++ julius4/libjulius/jconf.man	Thu Oct  2 17:23:47 2008
@@ -1,1053 +0,0 @@
-.TH "jconf " "5 "   
-.SH NAME
-jconf
-\- Jconf configuration file specification 
-.SH DESCRIPTION
-The variables that can be written in Jconf file are organized as follows.
-.TP 0.2i
-\(bu
-Global options
-.TP 0.2i
-\(bu
-Instance declaration
-.TP 0.2i
-\(bu
-Language model instance
-.TP 0.2i
-\(bu
-Acoustic model and speech analysis instance
-.TP 0.2i
-\(bu
-Recognizer and search instance
-.PP
-The details are described below.
-.SH EXAMPLE
-These are examples of jconf files.
-.PP
-The first example is a simple one with no instance declaration. When
-no instance declaration is found, Julius assumes there is only
-one AM, LM and recognition process instance. In this case, the
-default instance will be named "\fB_default\fR", and
-option order does not matter. This is equivalent to older versions
-of Julius, except for GMM handling (see below).
-.PP
-\fBExample of Jconf file: no instance declaration\fR
-.PP
-.nf
-
-      \-C jconffile
-      (\fIOther global options\fR...)
-      (\fIAM and analysis options\fR...)
-      (\fILM options\fR...)
-      (\fISearch options\fR...)
-    
-.fi
-.PP
-This is an example using two acoustic models and three language
-models of different types. Three recognition process instances are
-defined, one for each combination of AM and LM. The LM type (ngram /
-grammar / word) is determined by the arguments. The global
-options are placed at the top in this example, but they can
-be placed anywhere in the file.
-.PP
-\fBExample of Jconf file: multi model decoding\fR
-.PP
-.nf
-
-      \-C jconffile
-      (\fIOther global options\fR...)
-      \-AM am1
-      (\fIAM and analysis options for am1\fR...)
-      \-AM am2
-      (\fIAM and analysis options for am2\fR...)
-      \-LM lm_ngram
-      \-d ngram \-v dictfile
-      (\fILM options for lm_ngram\fR...)
-      \-LM lm_grammar
-      \-gram grammarprefix
-      (\fILM options for lm_grammar\fR...)
-      \-LM lm_word
-      \-w dictfile
-      (\fILM options for lm_word\fR...)
-      \-SR recog_ngram am1 lm_ngram
-      (\fISearch options for recog_ngram\fR...)
-      \-SR recog_grammar am1 lm_grammar
-      (\fISearch options for recog_grammar\fR...)
-      \-SR recog_word am2 lm_word
-      (\fISearch options for recog_word\fR...)
-    
-.fi
-.PP
-This is another example using GMM for frontend processing. Note
-that from Rev.4.0 Julius has an independent MFCC calculation scheme
-for GMM. This means that you should explicitly specify the
-acoustic analysis conditions for the GMM, not only for the AM.
-.PP
-Option \fB\-AM_GMM\fR switches the current AM configuration
-to the one prepared internally for GMM. You can place AM configuration
-options after it to specify the MFCC computation parameters for GMM.
-If you define exactly the same conditions as the AM used for recognition,
-the same MFCC calculation instance will be shared by the AM and GMM.
-Otherwise, each MFCC will be computed independently.
-.PP
-\fBExample with GMM\fR
-.PP
-.nf
-
-      \-C jconffile
-      (\fIOther global options\fR...)
-      \-gmm gmmdefs \-gmmreject noise
-      \-AM_GMM
-      (\fIanalysis options for GMM\fR...)
-      \-AM am1
-      (\fIAM and analysis options for am1\fR...)
-      \-LM lm_ngram
-      \-d ngram \-v dictfile
-      (\fILM options for lm_ngram\fR...)
-      \-SR recog_ngram am1 lm_ngram
-    
-.fi
-.SH "JCONF VARIABLES"
-The full list of options and variables that can be specified in jconf
-file is listed below.
-.SS "GLOBAL OPTIONS "
-.RS 
-.SS "Misc. options"
-.RE
-.TP 
-\fB\-C \fR\fIjconffile\fR 
-Load a jconf file. The options written in the file are
-expanded at that point. This option can also be used within
-another jconf file.
-.TP 
-\fB\-version \fR
-Print version information to standard error, and exit.
-.TP 
-\fB\-setting \fR
-Print engine setting information to standard error, and exit.
-.TP 
-\fB\-quiet \fR
-Output less log. For result, only the best word sequence will be 
-printed.
-.TP 
-\fB\-debug \fR
-(For debug) output an enormous amount of internal messages and debug
-information to the log.
-.TP 
-\fB\-check \fR\fB{wchmm|trellis|triphone}\fR 
-For debug, enter interactive check mode.
-.RS 
-.SS "Audio input"
-.RE
-.TP 
-\fB\-input \fR\fB{mic|rawfile|mfcfile|adinnet|stdin|netaudio} \fR
-Choose the speech input source. 'file' or 'rawfile' for waveform
-file, 'htkparam' or 'mfcfile' for HTK parameter file. Users will
-be prompted to enter the file name from stdin, or you can use the
-"\-filelist" option to specify a list of files to process.
-
-\&'mic' is to get audio input from live microphone device, and
-\&'adinnet' means receiving waveform data via TCP/IP network from
-an adinnet client. 'netaudio' is from DatLink/NetAudio input, 
-and 'stdin' means data input from standard input.
-
-For waveform file input, only WAV (no
-compression) and RAW (no header, 16bit,
-big endian) are supported by default. Other formats can be read
-when compiled with \fBlibsndfile\fR library. To see
-what formats are actually supported, see the help message using
-option "\-help". For stdin input, only WAV and RAW are
-supported. (default: mfcfile)
-.TP 
-\fB\-filelist \fR\fIfilename\fR 
-(With \-input rawfile|mfcfile) perform recognition on all files
-listed in the file. The file should contain one input file
-per line. The engine ends when all of the files are processed.
-.TP 
-\fB\-notypecheck \fR
-By default, Julius checks whether the input parameter type
-matches the AM. This option disables the check and
-uses the input vector as is.
-.TP 
-\fB\-48 \fR
-Record input with 48kHz sampling, and down\-sample it to 16kHz
-on\-the\-fly. This option is valid for 16kHz model only. The
-down\-sampling routine was ported from SPTK.
-(Rev. 4.0)
-.TP 
-\fB\-NA \fR\fIdevicename\fR 
-Host name for DatLink server input (\fB\-input netaudio\fR).
-.TP 
-\fB\-adport \fR\fIport_number\fR 
-With \fB\-input adinnet\fR, specify the adinnet port
-number to listen on. (default: 5530)
-.TP 
-\fB\-nostrip \fR
-Julius by default removes successive zero samples in input
-speech data. This option inhibits this removal.
-.TP 
-\fB\-zmean \fR, \fB\-nozmean \fR
-This option enables/disables DC offset removal of the input
-waveform. The offset will be estimated from the whole input. For
-microphone / network input, the mean of the first 48000
-samples (3 seconds in 16kHz sampling) will be used for the
-estimation. (default: disabled)
-
-This option uses a static offset for the channel. See also
-\fB\-zmeanframe\fR for frame\-wise offset removal.
-.RS 
-.SS "Speech segment detection by level and zero\-cross"
-.RE
-.TP 
-\fB\-cutsilence \fR, \fB\-nocutsilence \fR
-Turn on / off the speech detection by level and zero\-cross.
-Default is on for mic / adinnet input, off for files.
-.TP 
-\fB\-lv \fR\fIthres\fR 
-Level threshold for speech input detection. Values should be
-from 0 to 32767.
-.TP 
-\fB\-zc \fR\fIthres\fR 
-Zero crossing threshold per second. Only waves over the level
-threshold (\fB\-lv\fR) will be counted. (default: 60)
-.TP 
-\fB\-headmargin \fR\fImsec\fR 
-Silence margin at the start of speech segment in
-milliseconds. (default: 300)
-.TP 
-\fB\-tailmargin \fR\fImsec\fR 
-Silence margin at the end of speech segment in milliseconds.
-(default: 400)
-.TP 
-\fB\-rejectshort \fR\fImsec\fR 
-Reject input shorter than the specified number of milliseconds. Search will
-be terminated and no result will be output.
-.RS 
-.SS "Input rejection by average power"
-.RE
-.PP
-This feature is enabled by
-\fB\-\-enable\-power\-reject\fR at compilation time. It should be
-used with Decoder VAD or GMM VAD. Valid for real\-time input only.
-.TP 
-\fB\-powerthres \fR\fIthres\fR 
-Reject the input segment by its average energy. If the
-average energy of the last recognized input is below the
-threshold, Julius will reject the input. (Rev.4.0)
-
-This option is valid when
-\fB\-\-enable\-power\-reject\fR is specified
-at compilation time.
-.RS 
-.SS "Gaussian mixture model"
-.RE
-.PP
-GMM will be used for input rejection by accumulated score, or for
-GMM\-based frontend VAD when \fB\-\-enable\-gmm\-vad\fR is specified.
-.PP
-NOTE: You should also set the proper MFCC parameters required for the
-GMM, by specifying the acoustic parameters described in the AM section
-after \fB\-AM_GMM\fR.
-.TP 
-\fB\-gmm \fR\fIhmmdefs_file\fR 
-GMM definition file in HTK format. If specified, GMM\-based
-input verification will be performed concurrently with the 1st
-pass, and you can reject the input according to the result as
-specified by \fB\-gmmreject\fR. The GMM should be
-defined as one\-state HMMs.
-.TP 
-\fB\-gmmnum \fR\fInumber\fR 
-Number of Gaussian components to be computed per frame on GMM
-calculation. Only the N\-best Gaussians will be computed for
-rapid calculation. The default is 10. Specifying a smaller
-value will speed up GMM calculation, but too small a value (1 or
-2) may degrade identification performance.
-.TP 
-\fB\-gmmreject \fR\fIstring\fR 
-Comma\-separated list of GMM names to be rejected as invalid
-input. During recognition, the log likelihoods of GMMs
-accumulated for the entire input will be computed concurrently
-with the 1st pass. If the GMM name of the maximum score is
-within this string, the 2nd pass will not be executed and the
-input will be rejected.
-.TP 
-\fB\-gmmmargin \fR\fIframes\fR 
-Head margin for GMM\-based VAD in frames. (Rev.4.0)
-
-This option will be valid only if compiled with 
-\fB\-\-enable\-gmm\-vad\fR.
-.RS 
-.SS "Decoding option"
-.RE
-.PP
-Real\-time processing means concurrent processing of MFCC computation
-and 1st pass decoding. By default, real\-time processing on the 1st pass is on
-for microphone / adinnet / netaudio input, and off for others.
-.TP 
-\fB\-realtime \fR, \fB\-norealtime \fR
-Explicitly switch on / off real\-time (pipe\-line) processing on
-the first pass. The default is off for file input, and on for
-microphone, adinnet and NetAudio input. This option relates
-to the way CMN and energy normalization are performed: if off,
-they will be done using average features of the whole input. If
-on, MAP\-CMN and energy normalization suited to real\-time processing are used.
-.SS "INSTANCE DECLARATION FOR MULTI DECODING "
-The following arguments will create a new configuration set with
-default parameters, and switch current set to it. Jconf parameters
-specified after the option will be set into the current set.
-.PP
-To do multi\-model decoding, these arguments should be specified at
-the beginning of each model / search instance, with different names.
-Any options before the first instance definition will be IGNORED.
-.PP
-When no instance definition is found (as in older versions of Julius),
-all the options are assigned to a default instance named "_default".
-.PP
-Please note that decoding with a single LM and multiple AMs is not
-fully supported. For example, you may want to construct the
-jconf file as follows.
-
-.nf
-
- \-AM am_1 \-AM am_2
- \-LM lm (LM spec..)
- \-SR search1 am_1 lm
- \-SR search2 am_2 lm
-.fi
-
-This type of model sharing is not supported yet, since some parts
-of LM processing depend on the assigned AM. Instead, you can
-get the same result by defining the same LM for each AM, like this:
-
-.nf
-
- \-AM am_1 \-AM am_2
- \-LM lm_1 (LM spec..)
- \-LM lm_2 (same LM spec..)
- \-SR search1 am_1 lm_1
- \-SR search2 am_2 lm_2
-.fi
-
-.TP 
-\fB\-AM \fR\fIname\fR 
-Create a new AM configuration set, and switch current to the
-new one. You should give a unique name. (Rev.4.0)
-.TP 
-\fB\-LM \fR\fIname\fR 
-Create a new LM configuration set, and switch current to the
-new one. You should give a unique name. (Rev.4.0)
-.TP 
-\fB\-SR \fR\fIname\fR \fIam_name\fR \fIlm_name\fR 
-Create a new search configuration set, and switch current to
-the new one. The specified AM and LM will be assigned to it.
-The \fIam_name\fR and
-\fIlm_name\fR can be either name or ID
-number. You should give a unique name. (Rev.4.0)
-.TP 
-\fB\-AM_GMM \fR
-A special command to switch AM configuration set for
-specifying speech analysis parameters of GMM. The current AM
-will be switched to the GMM specific one already reserved, so
-be careful not to confuse it with normal AM configurations.
-(Rev.4.0)
-.SS "LANGUAGE MODEL (\-LM) "
-Only one type of LM can be specified for an LM configuration.
-If you want to use multiple models, you should define them one by one,
-each as a new LM.
-.RS 
-.SS N\-gram
-.RE
-.TP 
-\fB\-d \fR\fIbingram_file\fR 
-Use binary format N\-gram. An ARPA N\-gram file can be
-converted to Julius binary format by
-mkbingram.
-.TP 
-\fB\-nlr \fR\fIarpa_ngram_file\fR 
-A forward, left\-to\-right N\-gram language model in standard
-ARPA format. When both a forward N\-gram and backward N\-gram
-are specified, Julius uses this forward 2\-gram for the 1st
-pass, and the backward N\-gram for the 2nd pass.
-
-Since an ARPA file is often huge and takes a long time to
-load, it may be better to convert the ARPA file to Julius
-binary format by mkbingram. Note that if
-both forward and backward N\-grams are used for recognition, they
-should be converted together into a single binary.
-
-When only a forward N\-gram is specified by this option and no
-backward N\-gram is specified by \fB\-nrl\fR, Julius
-performs recognition with only the forward N\-gram. The 1st
-pass will use the 2\-gram entry in the given N\-gram, and
-the 2nd pass will use the given N\-gram, converting
-forward probabilities to backward probabilities by Bayes rule.
-(Rev.4.0)
-.TP 
-\fB\-nrl \fR\fIarpa_ngram_file\fR 
-A backward, right\-to\-left N\-gram language model in standard
-ARPA format. When both a forward N\-gram and backward N\-gram
-are specified, Julius uses the forward 2\-gram for the 1st
-pass, and this backward N\-gram for the 2nd pass.
-
-Since an ARPA file is often huge and takes a long time to
-load, it may be better to convert the ARPA file to Julius
-binary format by mkbingram. Note that if
-both forward and backward N\-grams are used for recognition, they
-should be converted together into a single binary.
-
-When only a backward N\-gram is specified by this option and no
-forward N\-gram is specified by \fB\-nlr\fR, Julius
-performs recognition with only the backward N\-gram. The 1st
-pass will use the forward 2\-gram probability computed from the
-backward 2\-gram using Bayes rule. The 2nd pass fully uses the
-given backward N\-gram. (Rev.4.0)
-.TP 
-\fB\-v \fR\fIdict_file\fR 
-Word dictionary file.
-.TP 
-\fB\-silhead \fR\fIword_string\fR \fB\-siltail \fR\fIword_string\fR 
-Silence words defined in the dictionary, for silences at
-the beginning and end of a sentence. (default:
-"<s>", "</s>")
-.TP 
-\fB\-iwspword \fR
-Add a word entry to the dictionary that should correspond to
-inter\-word pauses. This may improve recognition accuracy with
-language models that have no explicit inter\-word pause
-modeling. The word entry to be added can be changed by
-\fB\-iwspentry\fR.
-.TP 
-\fB\-iwspentry \fR\fIword_entry_string\fR 
-Specify the word entry that will be added by
-\fB\-iwspword\fR. (default: "<UNK> [sp] sp
-sp")
-.TP 
-\fB\-sepnum \fR\fInumber\fR 
-Number of high frequency words to be isolated from the lexicon
-tree, to reduce approximation error that may be caused by the
-one\-best approximation on the 1st pass. (default: 150)
-.RS 
-.SS Grammar
-.RE
-.PP
-Multiple grammars can be specified by using \fB\-gram\fR and
-\fB\-gramlist\fR. When you specify grammars using these
-options multiple times, all of them will be read at startup. Note
-that this differs from other options (for normal Julius
-options, the last one overrides previous ones). You can use
-\fB\-nogram\fR to reset the already specified grammars at
-that point.
-.TP 
-\fB\-gram \fR\fBgramprefix1[,gramprefix2[,gramprefix3,...]] \fR
-Comma\-separated list of grammars to be used. The argument
-should be the prefix of a grammar, i.e. if you have
-\fBfoo.dfa\fR and
-\fBfoo.dict\fR, you can specify them by the single
-argument \fBfoo\fR. Multiple grammars can be
-specified at a time as a comma\-separated list.
-.TP 
-\fB\-gramlist \fR\fIlist_file\fR 
-Specify a grammar list file that contains a list of grammars to
-be used. The list file should contain the prefixes of
-grammars, one per line. A relative path in the list file
-will be treated as relative to the list file, not the current
-path or configuration file.
-.TP 
-\fB\-dfa \fR\fIdfa_file\fR \fB\-v \fR\fIdict_file\fR 
-An old way of specifying grammar files separately.
-.TP 
-\fB\-nogram \fR
-Remove the current list of grammars already specified by
-\fB\-gram\fR, \fB\-gramlist\fR,
-\fB\-dfa\fR and \fB\-v\fR.
-.RS 
-.SS "Isolated word"
-.RE
-.PP
-Multiple dictionaries can be specified by using \fB\-w\fR and
-\fB\-wlist\fR. When you specify them multiple times, all of them
-will be read at startup. You can use \fB\-nogram\fR to
-reset the already specified dictionaries at that point.
-.TP 
-\fB\-w \fR\fIdict_file\fR 
-Word dictionary for isolated word recognition. The file format
-is the same as for other LMs. (Rev.4.0)
-.TP 
-\fB\-wlist \fR\fIlist_file\fR 
-Specify a dictionary list file that contains a list of
-dictionaries to be used. The list file should contain the
-file names of dictionaries, one per line. A relative path in
-the list file will be treated as relative to the list file,
-not the current path or configuration file. (Rev.4.0)
-.TP 
-\fB\-nogram \fR
-Remove the current list of dictionaries already specified by
-\fB\-w\fR and \fB\-wlist\fR.
-.TP 
-\fB\-wsil \fR\fIhead_sil_model_name\fR \fItail_sil_model_name\fR \fIsil_context_name\fR 
-In isolated word recognition, silence models will be appended
-to the head and tail of each word at recognition. This option
-specifies the silence models to be appended.
-\fIsil_context_name\fR is the name used for the
-head and tail silence models as the context of the word's head
-phone and tail phone. For example, if you specify
-\fB\-wsil silB silE sp\fR, a word with phone
-sequence \fBb eh t\fR will be translated as
-\fBsilB sp\-b+eh b\-eh+t eh\-t+sp silE\fR.
-(Rev.4.0)
-.RS 
-.SS "User\-defined LM"
-.RE
-.TP 
-\fB\-userlm \fR
-Declare that a user LM defined in the program is used. This option should be
-specified if you use a user\-defined LM function. (Rev.4.0)
-.RS 
-.SS "Misc LM options"
-.RE
-.TP 
-\fB\-forcedict \fR
-Ignore dictionary errors and force running. Words with errors
-will be skipped at startup.
-.SS "ACOUSTIC MODEL AND SPEECH ANALYSIS (\-AM) (\-AM_GMM) "
-Acoustic analysis parameters are included in this section, since the
-AM defines the required parameters. You can use a different MFCC type
-for each AM. For GMM, the same parameters should be specified after
-\fB\-AM_GMM\fR.
-.PP
-When using multiple AMs, the values of \fB\-smpPeriod\fR,
-\fB\-smpFreq\fR, \fB\-fsize\fR and
-\fB\-fshift\fR should be the same among all AMs.
-.RS 
-.SS "acoustic HMM and parameters"
-.RE
-.TP 
-\fB\-h \fR\fIhmmdef_file\fR 
-Acoustic HMM definition file. The file should be in HTK ASCII
-format, or Julius binary format. You can convert HTK ASCII hmmdefs
-to Julius binary format by mkbinhmm.
-.TP 
-\fB\-hlist \fR\fIhmmlist_file\fR 
-HMMList file for phone mapping. This option is required when
-using a triphone model. This file provides a mapping between
-logical triphone names generated from the dictionary and defined
-HMM names in hmmdefs.
-.TP 
-\fB\-tmix \fR\fInumber\fR 
-Specify the number of top Gaussians to be calculated in a
-mixture codebook. A small number will speed up the acoustic
-computation, particularly in a tied\-mixture model, but AM accuracy may
-get worse if the value is too small. (default: 2)
-.TP 
-\fB\-spmodel \fR\fIname\fR 
-Specify an HMM name that corresponds to short\-pause model in
-HMM. This option will affect various aspects in recognition:
-short\-pause skipping process on grammar recognition, word\-end
-short\-pause model insertion with \fB\-iwsp\fR on
-N\-gram recognition, or short\-pause segmentation
-(\fB\-spsegment\fR). (default: "sp")
-.TP 
-\fB\-multipath \fR
-Enable multi\-path mode. Multi\-path mode expands state
-transition availability to allow model\-skipping, or multiple
-output/input transitions in HMMs. However, since it defines
-additional word begin / end nodes and performs extra transition
-checks on decoding, a larger beam width may be required
-and recognition becomes a bit slower.
-
-By default (without this option), Julius automatically checks
-the transition type of the specified HMMs, and enables the
-multi\-path mode if required. You can force Julius to enable multi\-path
-mode with this option. (rev.4.0)
-.TP 
-\fB\-gprune \fR\fB{safe|heuristic|beam|none|default} \fR
-Set the Gaussian pruning algorithm to use. The default setting
-will be set according to the model type and engine setting.
-"default" will force accepting the default setting. Set this
-to "none" to disable pruning and perform full
-computation. "safe" guarantees the top N Gaussians to be
-computed. "heuristic" and "beam" do more aggressive
-computational cost reduction, but may result in a small loss of
-accuracy. (default: 'safe' (standard), 'beam' (fast) for
-tied mixture model, 'none' for non tied\-mixture model).
-.TP 
-\fB\-iwcd1 \fR\fB{max|avg|best number} \fR
-Select the method to approximate inter\-word triphones on the head
-and tail of a word in the first pass.
-
-"max" will apply the maximum likelihood of the same context
-triphones. "avg" will apply the average likelihood of the
-same context triphones. "best number" will apply the average
-of top N\-best likelihoods of the same context
-triphone.
-
-Default is "best 3" for use with N\-gram, and "avg" for grammar
-and word. When this AM is shared by LMs of both types, the
-latter one will be chosen.
-.TP 
-\fB\-iwsppenalty \fR\fIfloat\fR 
-Short pause insertion penalty for appended short pauses by
-\fB\-iwsp\fR.
-.TP 
-\fB\-gshmm \fR\fIhmmdef_file\fR 
-If this option is specified, Julius performs Gaussian Mixture
-Selection for efficient decoding. The hmmdefs should be a
-monophone model generated from an ordinary monophone HMM
-model, using mkgshmm.
-.TP 
-\fB\-gsnum \fR\fInumber\fR 
-On GMS, specify the number of top\-scoring monophone states for which
-the corresponding detailed triphones are computed. (default: 24)
-.RS 
-.SS "Speech analysis parameters"
-.RE
-.TP 
-\fB\-smpPeriod \fR\fIperiod\fR 
-Set sampling frequency of input speech by its sampling period,
-in units of 100 nanoseconds. Sampling rate can also be
-specified by \fB\-smpFreq\fR. Please note that the
-input frequency should match the training conditions of the
-acoustic model you use. (default: 625 = 16000Hz)
-
-This option corresponds to the HTK Option "SOURCERATE".
-The same value can be given to this option.
-
-When using multiple AMs, this value should be the same among all
-AMs.
-.TP 
-\fB\-smpFreq \fR\fIHz\fR 
-Set sampling frequency of input speech in Hz. Sampling rate
-can also be specified using "\-smpPeriod". Please note that
-this frequency should match the training conditions of the
-acoustic model you use. (default: 16000)
-
-When using multiple AMs, this value should be the same among all
-AMs.
-.TP 
-\fB\-fsize \fR\fIsample_num\fR 
-Window size in number of samples. (default: 400)
-
-This option corresponds to the HTK Option "WINDOWSIZE",
-but the value should be in samples (HTK value / smpPeriod).
-
-When using multiple AMs, this value should be the same among all
-AMs.
-.TP 
-\fB\-fshift \fR\fIsample_num\fR 
-Frame shift in number of samples. (default: 160)
-
-This option corresponds to the HTK Option "TARGETRATE",
-but the value should be in samples (HTK value / smpPeriod).
-
-When using multiple AMs, this value should be the same among all
-AMs.
-.TP 
-\fB\-preemph \fR\fIfloat\fR 
-Pre\-emphasis coefficient. (default: 0.97)
-
-This option corresponds to the HTK Option "PREEMCOEF".
-The same value can be given to this option.
-.TP 
-\fB\-fbank \fR\fInum\fR 
-Number of filterbank channels. (default: 24)
-
-This option corresponds to the HTK Option "NUMCHANS".
-The same value can be given to this option.
-Be aware that the default value differs from HTK (22).
-.TP 
-\fB\-ceplif \fR\fInum\fR 
-Cepstral liftering coefficient. (default: 22)
-
-This option corresponds to the HTK Option "CEPLIFTER".
-The same value can be given to this option.
-.TP 
-\fB\-rawe \fR, \fB\-norawe \fR
-Enable/disable using raw energy before pre\-emphasis (default: disabled)
-
-This option corresponds to the HTK Option "RAWENERGY".
-Be aware that the default value differs from HTK (enabled in HTK,
-disabled in Julius).
-.TP 
-\fB\-enormal \fR, \fB\-noenormal \fR
-Enable/disable normalizing log energy. On live input, this
-normalization will be approximated from the average of the last
-input. (default: disabled)
-
-This option corresponds to the HTK Option "ENORMALISE".
-Be aware that the default value differs from HTK (enabled in HTK,
-disabled in Julius).
-.TP 
-\fB\-escale \fR\fIfloat_scale\fR 
-Scaling factor of log energy when normalizing log
-energy. (default: 1.0)
-
-This option corresponds to the HTK Option "ESCALE".
-Be aware that the default value differs from HTK (0.1).
-.TP 
-\fB\-silfloor \fR\fIfloat\fR 
-Energy silence floor in dB when normalizing log energy.
-(default: 50.0)
-
-This option corresponds to the HTK Option "SILFLOOR".
-.TP 
-\fB\-delwin \fR\fIframe\fR 
-Delta window size in number of frames. (default: 2)
-
-This option corresponds to the HTK Option "DELTAWINDOW".
-The same value can be given to this option.
-.TP 
-\fB\-accwin \fR\fIframe\fR 
-Acceleration window size in number of frames. (default: 2)
-
-This option corresponds to the HTK Option "ACCWINDOW".
-The same value can be given to this option.
-.TP 
-\fB\-hifreq \fR\fIHz\fR 
-Enable band\-limiting for MFCC filterbank computation: set
-upper frequency cut\-off. Value of \-1 will disable it.
-(default: \-1)
-
-This option corresponds to the HTK Option "HIFREQ".
-The same value can be given to this option.
-.TP 
-\fB\-lofreq \fR\fIHz\fR 
-Enable band\-limiting for MFCC filterbank computation: set
-lower frequency cut\-off. Value of \-1 will disable it.
-(default: \-1)
-
-This option corresponds to the HTK Option "LOFREQ".
-The same value can be given to this option.
-.TP 
-\fB\-zmeanframe \fR, \fB\-nozmeanframe \fR
-With speech input, this option enables/disables frame\-wise DC
-offset removal. This corresponds to HTK configuration
-ZMEANSOURCE. This cannot be used with "\-zmean".
-(default: disabled)
-.RS 
-.SS "Real\-time cepstral mean normalization"
-.RE
-.TP 
-\fB\-cmnload \fR\fIfile\fR 
-Load initial cepstral mean vector from file on startup. The
-file should be one saved by \fB\-cmnsave\fR.
-Loading an initial cepstral mean enables Julius to better
-recognize the first utterance on a microphone / network input.
-.TP 
-\fB\-cmnsave \fR\fIfile\fR 
-Save cepstral mean vector at each input. The parameters will
-be saved to the file at the end of each input, so the output file
-always keeps the last cepstral mean. If the output file already
-exists, it will be overwritten.
-.TP 
-\fB\-cmnupdate \fR\fB\-cmnnoupdate \fR
-Control whether to update the cepstral mean at each input on
-microphone / network input. Disabling this and specifying
-\fB\-cmnload\fR will make the engine use the initial
-cepstral mean permanently.
-.TP 
-\fB\-cmnmapweight \fR\fIfloat\fR 
-Specify weight of initial cepstral mean for MAP\-CMN. Specify
-larger value to retain the initial cepstral mean for a longer
-period, and smaller value to rely more on the current input.
-(default: 100.0)
-.RS 
-.SS "Spectral subtraction"
-.RE
-.TP 
-\fB\-sscalc \fR
-Perform spectral subtraction using the head part of each file.
-Valid only for raw speech file input. Conflicts with
-\fB\-ssload\fR.
-.TP 
-\fB\-sscalclen \fR\fImsec\fR 
-With \fB\-sscalc\fR, specify the length of head part
-silence in milliseconds. (default: 300)
-.TP 
-\fB\-ssload \fR\fIfile\fR 
-Perform spectral subtraction for speech input using
-pre\-estimated noise spectrum from file. The noise spectrum
-should be computed beforehand by mkss.
-Valid for all speech input. Conflicts with
-\fB\-sscalc\fR.
-.TP 
-\fB\-ssalpha \fR\fIfloat\fR 
-Alpha coefficient of spectral subtraction for
-\-sscalc and \-ssload.
-Noise will be subtracted more strongly as this value gets larger,
-but distortion of the resulting signal also becomes more
-noticeable. (default: 2.0)
-.TP 
-\fB\-ssfloor \fR\fIfloat\fR 
-Flooring coefficient of spectral subtraction. Spectral
-power that goes below zero after subtraction will be
-replaced by the source signal multiplied by this
-coefficient. (default: 0.5)
-.RS 
-.SS "Misc AM options"
-.RE
-.TP 
-\fB\-htkconf \fR\fIfile\fR 
-Parse the given HTK Config file, and set corresponding
-parameters to Julius. When using this option, the default
-parameter values are switched from Julius defaults to HTK
-defaults.
-.SS "RECOGNIZER AND SEARCH (\-SR) "
-Default values for beam width and LM weights will change according to
-compile\-time setup of JuliusLib and model specification. Please see
-the startup log for the actual values.
-.RS 
-.SS "General parameters"
-.RE
-.TP 
-\fB\-inactive \fR
-Start this recognition process instance in inactive state. (Rev.4.0)
-.TP 
-\fB\-1pass \fR
-Perform only the first pass. This mode is automatically set
-for isolated word recognition.
-.TP 
-\fB\-no_ccd \fR, \fB\-force_ccd \fR
-Normally Julius determines whether the specified acoustic
-model is a context\-dependent model from the model names, i.e.,
-whether the model names contain the characters \fB+\fR
-and \fB\-\fR. You can explicitly specify it with these
-options to avoid mis\-detection. These options override
-automatic detection.
-.TP 
-\fB\-cmalpha \fR\fIfloat\fR 
-Smoothing parameter for confidence scoring. (default: 0.05)
-.TP 
-\fB\-iwsp \fR
-(Multi\-path mode only) Enable inter\-word context\-free short
-pause handling. This option appends a skippable short pause
-model for every word end. The added model will be skipped on
-inter\-word context handling. The HMM model to be appended can
-be specified by \fB\-spmodel\fR.
-.TP 
-\fB\-transp \fR\fIfloat\fR 
-Additional insertion penalty for transparent words. (default:
-0.0)
-.TP 
-\fB\-demo \fR
-Equivalent to \fB\-progout \-quiet\fR.
-.RS 
-.SS "1st pass parameters"
-.RE
-.TP 
-\fB\-lmp \fR\fIweight\fR \fIpenalty\fR 
-(N\-gram) Language model weights and word insertion penalties
-for the first pass.
-.TP 
-\fB\-penalty1 \fR\fIpenalty\fR 
-(Grammar) word insertion penalty for the first pass. (default: 0.0)
-.TP 
-\fB\-b \fR\fIwidth\fR 
-Beam width for rank beam in number of HMM nodes on the first
-pass. This value defines the search width on the 1st pass, and
-has a great effect on the total processing time. A smaller width
-will speed up the decoding, but too small a value will result in
-a substantial increase of recognition errors due to search
-failure. A larger value will make the search stable and will
-lead to failure\-free search, but processing time and memory
-usage will grow in proportion to the width.
-
-The default value is dependent on acoustic model type: 400
-(monophone), 800 (triphone), or 1000 (triphone, setup=v2.1)
-.TP 
-\fB\-nlimit \fR\fInum\fR 
-Upper limit of tokens per node. This option is valid when
-\fB\-\-enable\-wpair\fR and
-\fB\-\-enable\-wpair\-nlimit\fR are enabled at
-compilation time.
-.TP 
-\fB\-progout \fR
-Enable progressive output of the partial results on the first pass.
-.TP 
-\fB\-proginterval \fR\fImsec\fR 
-Set the output time interval of \fB\-progout\fR in
-milliseconds.
-.RS 
-.SS "2nd pass parameters"
-.RE
-.TP 
-\fB\-lmp2 \fR\fIweight\fR \fIpenalty\fR 
-(N\-gram) Language model weights and word insertion penalties
-for the second pass.
-.TP 
-\fB\-penalty2 \fR\fIpenalty\fR 
-(Grammar) word insertion penalty for the second pass. (default: 0.0)
-.TP 
-\fB\-b2 \fR\fIwidth\fR 
-Envelope beam width (number of hypotheses) in second pass. If
-the count of word expansions at a certain hypothesis length
-reaches this limit during search, shorter hypotheses are not
-expanded further. This prevents the search from falling into a
-breadth\-first\-like state stalled at the same position, and
-reduces search failures. (default: 30)
-.TP 
-\fB\-sb \fR\fIfloat\fR 
-Score envelope width for enveloped scoring. When calculating
-the score of each generated hypothesis, its trellis
-expansion and Viterbi operation are pruned in the middle
-of the speech if the score on a frame falls outside this width.
-A small value makes the second pass faster, but
-computation errors may occur. (default: 80.0)
-.TP 
-\fB\-s \fR\fInum\fR 
-Stack size, i.e. the maximum number of hypotheses that can be
-stored on the stack during the search. A larger value may
-give more stable results, but increases the amount of memory
-required. (default: 500)
-.TP 
-\fB\-m \fR\fIcount\fR 
-Number of expanded hypotheses required to discontinue the
-search. If the number of expanded hypotheses exceeds this
-threshold, the search is discontinued at that point.
-The larger this value is, the longer Julius searches before
-giving up. (default: 2000)
-.TP 
-\fB\-n \fR\fInum\fR 
-The number of candidates Julius tries to find. The search
-continues until this number of sentence hypotheses has been
-found. The obtained sentence hypotheses are sorted by score,
-and the final result is displayed in that order (see also
-\fB\-output\fR). A larger value increases the chance that the
-optimum hypothesis is found, but also lengthens the
-processing time. The default value depends on the engine
-setup at compilation time: 10 (standard) or 1 (fast or v2.1).
-.TP 
-\fB\-output \fR\fInum\fR 
-The number of top sentence hypotheses to output at the end of
-the search. Use with \fB\-n\fR. (default: 1)
-.TP 
-\fB\-lookuprange \fR\fIframe\fR 
-When performing word expansion on the second pass, this option
-sets the number of frames before and after to look up next
-word hypotheses in the word trellis. This prevents the
-omission of short words, but with a large value the number of
-expanded hypotheses increases and the system becomes
-slow. (default: 5)
-.TP 
-\fB\-looktrellis \fR
-(Grammar) Expand only the words that survived the first pass
-instead of expanding all the words predicted by the grammar. This
-option makes second pass decoding slightly faster, especially
-under large vocabulary conditions, but may increase deletion
-errors of short words. (default: disabled)
-.RS 
-.SS "Short\-pause segmentation"
-.RE
-.PP
-When compiled with \fB\-\-enable\-decoder\-vad\fR, the
-short\-pause segmentation will be extended to support decoder\-based
-VAD.
-.TP 
-\fB\-spsegment \fR
-Enable short\-pause segmentation mode. Input will be segmented
-when a short pause word (a word with only a silence model in its
-pronunciation) gets the highest likelihood for a certain number
-of successive frames on the first pass. When a segment end is
-detected, Julius stops the 1st pass at that point, performs the
-2nd pass, and continues with the next segment. The word context
-is carried over between segments. (Rev.4.0)
-
-When compiled with \fB\-\-enable\-decoder\-vad\fR,
-this option enables decoder\-based VAD, to skip long silence.
-.TP 
-\fB\-spdur \fR\fIframe\fR 
-Short pause duration required to detect the end of an input
-segment, in number of frames. (default: 10)
-.TP 
-\fB\-pausemodels \fR\fIstring\fR 
-A comma\-separated list of pause model names to be used for short\-pause
-segmentation. A word consisting only of these pause models will be treated
-as a "pause word" for pause detection. If not specified, the names
-given by \fB\-spmodel\fR, \fB\-silhead\fR and
-\fB\-siltail\fR will be used. (Rev.4.0)
-.TP 
-\fB\-spmargin \fR\fIframe\fR 
-Backstep margin at trigger up for decoder\-based VAD. (Rev.4.0)
-
-This option will be valid only if compiled with 
-\fB\-\-enable\-decoder\-vad\fR.
-.TP 
-\fB\-spdelay \fR\fIframe\fR 
-Trigger decision delay frame at trigger up for decoder\-based
-VAD. (Rev.4.0)
-
-This option will be valid only if compiled with 
-\fB\-\-enable\-decoder\-vad\fR.
-.RS 
-.SS "Lattice / confusion network output"
-.RE
-.TP 
-\fB\-lattice \fR, \fB\-nolattice \fR
-Enable / disable generation of a word graph. The search
-algorithm is also changed to optimize for better word graph
-generation, so the sentence result may not be the same as
-that of normal N\-best recognition. (Rev.4.0)
-.TP 
-\fB\-confnet \fR, \fB\-noconfnet \fR
-Enable / disable generation of a confusion network. Enabling
-this will also activate \fB\-lattice\fR internally.
-(Rev.4.0)
-.TP 
-\fB\-graphrange \fR\fIframe\fR 
-Merge identical words at neighboring positions during graph
-generation. If the positions of identical words differ by no more
-than this number of frames, they will be merged. The default is 0
-(merge only at exactly the same location), and a larger value
-results in smaller graph output. Setting it to \-1 disables
-merging; in that case identical words at the same location with
-different scores are left as they are. (default: 0)
-.TP 
-\fB\-graphcut \fR\fIdepth\fR 
-Cut the resulting graph by its word depth at the post\-processing
-stage. The depth value is the number of words allowed
-at a frame. Setting it to \-1 disables this feature. (default:
-80)
-.TP 
-\fB\-graphboundloop \fR\fIcount\fR 
-Limit the number of boundary adjustment loops at the
-post\-processing stage. This parameter prevents Julius from
-blocking in an infinite adjustment loop caused by short word
-oscillation. (default: 20)
-.TP 
-\fB\-graphsearchdelay \fR, \fB\-nographsearchdelay \fR
-When the "\-graphsearchdelay" option is set, Julius modifies its
-graph generation algorithm on the 2nd pass so that the search is
-not terminated by graph merging until the first sentence
-candidate is found. This option may improve graph accuracy,
-especially when generating a huge word graph with a broad
-search, that is, when you set wide beams on both the 1st pass
-(\fB\-b\fR) and the 2nd pass (\fB\-b2\fR), and a large number for
-\fB\-n\fR. (default: disabled)
-.RS 
-.SS "Multi\-gram / multi\-dic output"
-.RE
-.TP 
-\fB\-multigramout \fR, \fB\-nomultigramout \fR
-In grammar recognition using multiple grammars, Julius
-outputs only the best result among all grammars. Enabling this
-option makes Julius output a result for each grammar.
-(default: disabled)
-.RS 
-.SS "Forced alignment"
-.RE
-.TP 
-\fB\-walign \fR
-Do Viterbi alignment per word for the recognition
-result. The word boundary frames and the average acoustic
-scores per frame will be calculated.
-.TP 
-\fB\-palign \fR
-Do Viterbi alignment per phone for the recognition
-result. The phone boundary frames and the average acoustic
-scores per frame will be calculated.
-.TP 
-\fB\-salign \fR
-Do Viterbi alignment per state for the recognition result.
-The state boundary frames and the average acoustic scores per
-frame will be calculated.
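
As a rough illustration of how the search options described in the removed man page might be combined, the fragment below sketches a Jconf file that restates the documented second-pass defaults and enables short-pause segmentation. It is only a sketch under stated assumptions: the value for -b is the documented triphone default, the value for -output is an arbitrary illustrative choice, and the one-option-per-line layout with "#" comment lines is assumed rather than taken from the text above.

```
# 1st pass: rank beam width (800 is the documented triphone default)
-b 800

# 2nd pass: documented defaults, written out explicitly
-b2 30
-sb 80.0
-s 500
-m 2000
-lookuprange 5

# find up to 10 sentence hypotheses and output the best 5
# (5 is an illustrative value, not a documented default)
-n 10
-output 5

# short-pause segmentation with the documented default duration
-spsegment
-spdur 10
```

Because -output is described as being used together with -n, the sketch keeps the -output value no larger than the -n value.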

