### InitializeLanguageModel
##### Required
inputTextPath
outputLmPath
##### Additional Options
minCharCount
insertLongS
charNgramLength
alternateSpellingReplacementPaths
##### Rarely Used Options
removeDiacritics
pKeepSameLanguage
languagePriors
lmPower
explicitCharacterSet
lmCharCount
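As a rough illustration of how the required and optional lists above might be used, the following sketch checks that both required options are present and assembles an argument list. The `build_args` helper, the `-name value` flag syntax, and the example paths are assumptions made for illustration; they are not taken from this file, so check the Ocular documentation for the actual invocation.

```python
# Option names copied from the InitializeLanguageModel lists above.
REQUIRED = ("inputTextPath", "outputLmPath")
OPTIONAL = (
    "minCharCount", "insertLongS", "charNgramLength",
    "alternateSpellingReplacementPaths",
    "removeDiacritics", "pKeepSameLanguage", "languagePriors",
    "lmPower", "explicitCharacterSet", "lmCharCount",
)


def build_args(options):
    """Return a flat argument list, required options first.

    Assumption: each option is passed as `-name value`. This syntax is
    hypothetical here and should be verified before use.
    """
    missing = [name for name in REQUIRED if name not in options]
    if missing:
        raise ValueError("missing required options: " + ", ".join(missing))
    unknown = [name for name in options if name not in REQUIRED + OPTIONAL]
    if unknown:
        raise ValueError("unknown options: " + ", ".join(unknown))
    args = []
    for name in REQUIRED + OPTIONAL:
        if name in options:
            args += ["-" + name, str(options[name])]
    return args


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(build_args({
        "inputTextPath": "texts/sample.txt",
        "outputLmPath": "lm/sample.lmser",
        "charNgramLength": 6,
    }))
```

Rejecting unknown names catches typos in option names early, before a long-running job is started.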
### InitializeFont
##### Required
inputLmPath
outputFontPath
##### Additional Options
allowedFontsPath
##### Rarely Used Options
numFontInitThreads
spaceMaxWidthFraction
spaceMinWidthFraction
templateMaxWidthFraction
templateMinWidthFraction
### TrainFont
##### Main Options
inputDocPath
inputDocListPath
inputFontPath
inputLmPath
numDocs
numDocsToSkip
numEMIters
continueFromLastCompleteIteration
outputPath
outputFormats
outputFontPath
##### Additional Options
extractedLinesPath
updateDocBatchSize
These options affect the speed of font training:
emissionEngine
beamSize
markovVerticalOffset
##### Glyph Substitution Model Options
Glyph substitution lets Ocular use a probabilistic mapping from modern orthography (as used in the language model training text) to the orthography seen in the documents. When glyph substitution is enabled, Ocular jointly produces two transcriptions: one that transcribes the document exactly as printed, and one that is a normalized version of the text.
allowGlyphSubstitution
inputGsmPath
updateGsm
outputGsmPath
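To make the idea of a probabilistic orthography mapping concrete, here is a toy sketch. It is not Ocular's implementation: the `TOY_GSM` table, its probabilities, and both helper functions are invented for illustration. It also runs the mapping forward (from normalized text to printed glyphs), whereas Ocular infers both transcriptions from the page image.

```python
import random

# Toy substitution table: for each modern (normalized) character, the
# probability of each glyph that might be printed in its place.
# The entries are invented for illustration, not taken from Ocular.
TOY_GSM = {
    "s": {"s": 0.6, "ſ": 0.4},  # long s
    "v": {"v": 0.7, "u": 0.3},
    "u": {"u": 0.8, "v": 0.2},
}


def sample_glyph(char, rng):
    """Sample a printed glyph for one normalized character."""
    # Characters without an entry always map to themselves.
    dist = TOY_GSM.get(char, {char: 1.0})
    glyphs = list(dist)
    weights = [dist[g] for g in glyphs]
    return rng.choices(glyphs, weights=weights, k=1)[0]


def dual_transcription(normalized, rng):
    """Return (as_printed, normalized) for a normalized string."""
    printed = "".join(sample_glyph(c, rng) for c in normalized)
    return printed, normalized


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    print(dual_transcription("salvo", rng))
```

The pair returned by `dual_transcription` mirrors the two outputs described above: a string as it might appear on the page and its normalized counterpart.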
##### Language Model Training Options
updateLM
outputLmPath
##### Line Extraction Options
binarizeThreshold
crop
##### Evaluate During Training
evalInputDocPath
evalNumDocs
evalExtractedLinesPath
evalFreq
evalBatches
##### Rarely Used Options
allowLanguageSwitchOnPunct
cudaDeviceID
decodeBatchSize
gsmElideAnything
gsmElisionSmoothingCountMultiplier
gsmNoCharSubPrior
gsmPower
gsmSmoothingCount
paddingMaxWidth
paddingMinWidth
uniformLineHeight
numDecodeThreads
numEmissionCacheThreads
numMstepThreads
### Transcribe
##### Main Options
inputDocPath
inputDocListPath
inputFontPath
inputLmPath
numDocs
numDocsToSkip
skipAlreadyTranscribedDocs
outputPath
outputFormats
##### Additional Options
extractedLinesPath
failIfAllDocsAlreadyTranscribed
These options affect the speed of transcription:
emissionEngine
beamSize
markovVerticalOffset
##### Glyph Substitution Model Options
Glyph substitution lets Ocular use a probabilistic mapping from modern orthography (as used in the language model training text) to the orthography seen in the documents. When glyph substitution is enabled, Ocular jointly produces two transcriptions: one that transcribes the document exactly as printed, and one that is a normalized version of the text.
allowGlyphSubstitution
inputGsmPath
##### Model Updating Options
updateDocBatchSize
For updating the font model:
updateFont
outputFontPath
For updating the glyph substitution model:
updateGsm
outputGsmPath
For updating the language model:
updateLM
outputLmPath
##### Line Extraction Options
binarizeThreshold
crop
##### Evaluate During Training
evalInputDocPath
evalNumDocs
evalBatches
evalExtractedLinesPath
##### Rarely Used Options
allowLanguageSwitchOnPunct
cudaDeviceID
decodeBatchSize
gsmElideAnything
gsmElisionSmoothingCountMultiplier
gsmNoCharSubPrior
gsmPower
gsmSmoothingCount
paddingMaxWidth
paddingMinWidth
uniformLineHeight
numDecodeThreads
numEmissionCacheThreads
numMstepThreads