Large Vocabulary Continuous Speech Recognition Based on Pseudo-Morpheme Units Obtained by Unsupervised Segmentation and Merging

We propose a new method for determining the recognition units for Korean large vocabulary continuous speech recognition (LVCSR) by applying unsupervised segmentation and merging. In the proposed method, each text sentence is first segmented into morphemes, and position information is attached to each morpheme. Submorpheme units are then obtained by splitting the morpheme units so as to maximize a posterior probability, whose terms are computed from the morpheme frequency distribution, the morpheme length distribution, and the morpheme frequency-of-frequency distribution. Finally, the recognition units are obtained by sequentially merging the submorpheme pair with the highest frequency. Experiments are conducted on a Korean LVCSR system with a 100k-word vocabulary and a trigram language model trained on a corpus of 300 million eojeols (word phrases). The proposed method reduces the out-of-vocabulary rate to 1.8% and yields a 14.0% relative reduction in the syllable error rate.
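The final merging step can be illustrated with a minimal sketch, assuming the corpus is already available as lists of submorpheme strings. The function names, the `num_merges` stopping rule, and the toy data are hypothetical; the abstract does not state how the merging is terminated, and position information is not modeled here. This is a generic greedy pair-merging loop, not the authors' implementation.

```python
from collections import Counter


def most_frequent_pair(corpus):
    """Return (pair, count) for the most frequent adjacent unit pair, or None."""
    counts = Counter()
    for seq in corpus:
        # Count every adjacent pair; overlapping repeats are counted as-is.
        for left, right in zip(seq, seq[1:]):
            counts[(left, right)] += 1
    if not counts:
        return None
    # Break frequency ties by the pair itself so the result is deterministic.
    pair, count = max(counts.items(), key=lambda kv: (kv[1], kv[0]))
    return pair, count


def merge_pair(seq, pair):
    """Replace each left-to-right occurrence of `pair` in `seq` with one unit."""
    merged = []
    i = 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            merged.append(seq[i] + seq[i + 1])
            i += 2
        else:
            merged.append(seq[i])
            i += 1
    return merged


def merge_units(corpus, num_merges):
    """Greedily merge the most frequent adjacent pair, up to num_merges times.

    `num_merges` is an assumed stopping rule for illustration only.
    """
    corpus = [list(seq) for seq in corpus]
    merges = []
    for _ in range(num_merges):
        found = most_frequent_pair(corpus)
        if found is None:
            break
        pair, _count = found
        merges.append(pair)
        corpus = [merge_pair(seq, pair) for seq in corpus]
    return corpus, merges


if __name__ == "__main__":
    # Toy placeholder units, not real Korean submorphemes.
    toy = [["a", "b", "c"], ["a", "b", "d"], ["a", "b", "c"]]
    units, merges = merge_units(toy, 2)
    print(merges)  # [('a', 'b'), ('ab', 'c')]
    print(units)   # [['abc'], ['ab', 'd'], ['abc']]
```

Each iteration recounts pairs over the whole corpus, which keeps the sketch short; a practical system over a 300-million-eojeol corpus would presumably need incremental count updates.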
