Korean Morphological Analysis Using a Statistical Machine Translation Tool and a Conditional Random Fields Tool

Category: free Korean-language papers. Editor: 金一 (teaching assistant). Updated: 2017-04-27

Morphological analysis is a basic step in natural language processing, and it can produce multiple candidate analyses for a single word phrase. Korean is not only an agglutinative language but also has characteristics of an inflectional language, so its morphological analysis requires complicated processing.
In general, Korean morphological analysis needs three basic operations: morpheme restoration (R), morpheme segmentation (S), and morpheme tagging (T). The analysis can be carried out over various processing units, such as the Eojeol (word phrase), the syllable, or the Jaso (alphabet letter). Various methods have been developed by combining the three basic operations with these processing units.
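As a rough illustration of the three operations at the syllable level, the sketch below walks one Eojeol through restoration, segmentation, and tagging. The example word (갔다), the lookup table, the B/I boundary labels, and the Sejong-style tags are assumptions chosen for illustration; they are not taken from the work itself, where these steps are carried out by trained statistical models.

```python
from typing import List, Tuple


def restore(eojeol: str) -> str:
    """R: map a surface form to its lexical form.

    A hypothetical lookup table stands in for a trained restoration model.
    """
    table = {"갔다": "가았다"}  # illustrative entry only
    return table.get(eojeol, eojeol)


def segment(syllables: str, labels: List[str]) -> List[str]:
    """S: group syllables into morphemes using B (begin) / I (inside) labels."""
    morphemes: List[str] = []
    for syllable, label in zip(syllables, labels):
        if label == "B" or not morphemes:
            morphemes.append(syllable)   # start a new morpheme
        else:
            morphemes[-1] += syllable    # extend the current morpheme
    return morphemes


def tag(morphemes: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """T: attach one POS tag to each morpheme."""
    return list(zip(morphemes, tags))


restored = restore("갔다")                      # lexical form of the Eojeol
morphs = segment(restored, ["B", "B", "B"])     # one morpheme per syllable here
analysis = tag(morphs, ["VV", "EP", "EF"])      # verb stem, pre-final ending, final ending
```

The code treats each precomposed Hangul character as one syllable, so it assumes the input is in NFC form.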
In this work, we define syllable-based probabilistic models for Korean morphological analysis and implement them by cascading a statistical machine translation (SMT) model and conditional random fields (CRFs) models. The SMT model restores the lexical forms of morphemes; CRFs models then segment the morpheme sequence into units and attach a POS tag to each unit. The models are implemented with currently available machine learning tools such as Moses, SRILM, and CRF++. Because these well-known tools have already been verified by many researchers in various areas, we consider them more objective and reliable than locally developed programs. To integrate the steps, we used beam search, limiting the number of outputs passed on from each step. For a more proper integration, we rescaled the output ranks and probabilities, because the SMT and CRFs tools produce different scales and ranks. The rescaling improved 10-best recall by about 4.79% (R-ST), 6.042% (R-S-Ts), and 7.165% (R-S-Tm), respectively.
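The integration step can be sketched as a generic cascade with a beam. The abstract does not give the rescaling formula, so the version below is only one plausible choice: each stage's n-best scores are normalized into log-probabilities (a softmax over the list) before being added to the score of the path so far. The function names, the stage interface, and the toy stages are hypothetical and do not call Moses, SRILM, or CRF++.

```python
import math
from typing import Callable, List, Tuple

Hyp = Tuple[str, float]              # (output string, score)
Stage = Callable[[str], List[Hyp]]   # returns an n-best list with raw tool scores


def rescale(nbest: List[Hyp]) -> List[Hyp]:
    """Normalize raw scores of one n-best list into log-probabilities.

    Assumed rescaling (softmax over the list); the original method may differ.
    """
    if not nbest:
        return []
    top = max(score for _, score in nbest)
    log_z = top + math.log(sum(math.exp(score - top) for _, score in nbest))
    return [(out, score - log_z) for out, score in nbest]


def cascade(source: str, stages: List[Stage], beam: int = 10) -> List[Hyp]:
    """Run the stages in order, keeping only the `beam` best paths after each."""
    hyps: List[Hyp] = [(source, 0.0)]
    for stage in stages:
        expanded: List[Hyp] = []
        for text, logp in hyps:
            for out, stage_logp in rescale(stage(text)):
                expanded.append((out, logp + stage_logp))
        expanded.sort(key=lambda h: h[1], reverse=True)
        hyps = expanded[:beam]       # prune to the beam width
    return hyps


# Toy stages with deliberately different score scales (hypothetical stubs).
def toy_restore(text: str) -> List[Hyp]:
    return [(text + "+r1", -1.0), (text + "+r2", -3.0)]


def toy_tag(text: str) -> List[Hyp]:
    return [(text + "/A", 120.0), (text + "/B", 118.0)]


results = cascade("x", [toy_restore, toy_tag], beam=4)
```

Without the normalization in `rescale`, the second toy stage's scores (around 120) would swamp the first stage's (around -1), which is the kind of scale mismatch the rescaling is meant to remove. Identical outputs reached through different paths are not merged in this sketch.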
