비교사 분할 및 병합으로 얻은 의사형태소 단위에 기반한 대어휘 연속 음성인식 [韩语论文]

资料分类免费韩语论文 责任编辑:金一助教更新时间:2017-04-27
提示:本资料为网络收集免费论文,存在不完整性。建议下载本站其它完整的收费论文。使用可通过查重系统的论文,才是您毕业的保障。

We propose a new method to determine the recognition units for large vocabulary continuous speech recognition (LVCSR) in Korean by applying unsupervised segmentation and merging. In the proposed method, a text sentence is segmented into morphemes and ...

We propose a new method to determine the recognition units for large vocabulary continuous speech recognition (LVCSR) in Korean by applying unsupervised segmentation and merging. In the proposed method, a text sentence is segmented into morphemes and position information is added to morphemes. Then submorpheme units are obtained by splitting the morpheme units through the maximization of posterior probability terms. The posterior probability terms are computed from the morpheme frequency distribution, the morpheme length distribution, and the morpheme frequency-of-frequency distribution. Finally, the recognition units are obtained by sequentially merging the submorpheme pair with the highest frequency. Computer experiments are conducted using a Korean LVCSR with a 100k word vocabulary and a trigram language model obtained by a 300 million eojeol (word phrase) corpus. The proposed method is shown to reduce the out-of-vocabulary rate to 1.8% and to reduce the syllable error rate relatively by 14.0%.

免费韩语论文韩语论文范文
免费论文题目: