Recognition and Translation of Russian Base Noun Phrases [Russian-language paper]

Category: free Russian-language papers. Editor: 阿米 (Ami). Updated: 2017-05-18
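Abstract (translated from the Chinese):

A base noun phrase (BaseNP) is a syntactic unit with a relatively simple structure that nevertheless carries relatively complete semantic information. It is used throughout sentence construction and occupies an important position there. Automatically recognizing and translating base noun phrases in different languages can therefore give substantial help and reference for understanding those languages. For Russian, BaseNP recognition and translation have direct guiding significance and practical value for applications such as cross-language retrieval and Russian-Chinese machine translation.

This thesis takes Russian as its object of study. After studying and summarizing the linguistic characteristics and grammatical features of Russian, it implements Russian BaseNP recognition by combining rule-based and statistical methods, and, to address the cost of annotating Russian corpora, proposes a method for automatically constructing CRF training data. In addition, within the traditional statistical machine translation pipeline, it improves the translation quality of Russian base noun phrases by explicitly representing the linguistic features that are implicit in Russian inflection. A complete system for recognizing and translating Russian base noun phrases has been implemented.

The main contributions are as follows. First, Russian BaseNP recognition is implemented by combining rules with statistics. Because Russian corpora are scarce and expensive to annotate, a method for automatically constructing training data is proposed. The method draws on Russian-Chinese dictionary resources collected from the web and, guided by a statistically derived library of part-of-speech collocation patterns for Russian base noun phrases, automatically builds the training corpus that the CRF requires. The pattern library is then used to label BaseNP candidates under the forward maximum matching principle, and the trained model completes the recognition of Russian BaseNPs on top of that candidate labeling.

Second, a translation method for Russian base noun phrases based on implicit knowledge is proposed. "Implicit knowledge" here means the linguistic features that inflection hides inside a Russian word, such as part of speech, case, gender, and number. These features are represented explicitly in the corpus, and translation is then carried out on the feature-augmented corpus. This largely alleviates the data sparsity of morphologically rich Russian corpora, improves word alignment to some extent, and ultimately raises the quality with which the translation system renders Russian base noun phrases.

While saving annotation cost, the recognition method achieves an F-measure of 84.14% on Russian base noun phrases. The translation method that uses linguistic features explicitly achieves a BLEU score of 0.4257 on Russian base noun phrases, about 10 percentage points higher than the traditional phrase-based machine translation method.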
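To make the reported F-measure concrete, here is a minimal sketch of how an F-measure over BaseNP spans could be computed. It assumes exact-match scoring of `(start, end)` spans, which the abstract does not confirm; the function name `span_f_measure` and the sample spans are hypothetical.

```python
from typing import Iterable, Tuple

Span = Tuple[int, int]  # [start, end) token offsets of one BaseNP


def span_f_measure(gold: Iterable[Span], predicted: Iterable[Span]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under exact span matching (an assumption)."""
    gold_set, pred_set = set(gold), set(predicted)
    correct = len(gold_set & pred_set)  # spans with identical boundaries
    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical example: two of three predicted spans match the gold spans.
    gold = [(0, 2), (4, 5), (7, 9)]
    pred = [(0, 2), (4, 5), (6, 9)]
    print(span_f_measure(gold, pred))
```

Under this scoring, a span whose boundary is off by one token counts as both a false positive and a false negative, which is why boundary errors lower precision and recall together.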

Abstract:

A base noun phrase (BaseNP) is a syntactic unit with a relatively simple structure that carries relatively complete semantic information; it is widely used in sentence construction and occupies an important position. Automatic recognition and translation of base noun phrases in different languages can greatly help in understanding those languages. Recognizing and translating Russian base noun phrases is of direct value to applications such as cross-language retrieval and Russian-Chinese machine translation. This thesis takes Russian as its object of study. It summarizes the linguistic and grammatical features of Russian, implements Russian BaseNP recognition by combining rule-based and statistical methods, and, focusing on the cost of annotating Russian corpora, proposes a method for automatically building CRF training data. In addition, within the traditional statistical machine translation pipeline, it improves the translation quality of Russian base noun phrases by explicitly representing the linguistic features implied by Russian inflection. A complete system for recognizing and translating Russian base noun phrases has been implemented. The main contributions are as follows. First, Russian BaseNP recognition is implemented by combining rules with statistics. Because Russian corpora are scarce and costly to annotate, a method for automatically constructing the training corpus is proposed. Using Russian-Chinese dictionary resources collected from the web, and guided by a statistically derived library of part-of-speech collocation patterns for Russian base noun phrases, the method automatically builds the training corpus required by the CRF. The pattern library labels BaseNP candidates under the forward maximum matching principle, and the trained model then completes Russian BaseNP recognition on the basis of those candidate labels.
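The candidate-labeling step can be illustrated with a minimal sketch of forward maximum matching over part-of-speech tags, followed by conversion to BIO labels of the kind a CRF toolkit typically consumes. The pattern library, the tag names, and the tab-separated row format are all assumptions made for illustration; the abstract does not give the actual patterns or file format.

```python
from typing import List, Sequence, Set, Tuple

# Hypothetical POS-pattern library: tag sequences allowed to form a BaseNP.
PATTERNS: Set[Tuple[str, ...]] = {
    ("NOUN",),
    ("ADJ", "NOUN"),
    ("ADJ", "ADJ", "NOUN"),
    ("NOUN", "NOUN"),
}


def forward_max_match(tags: Sequence[str],
                      patterns: Set[Tuple[str, ...]] = PATTERNS) -> List[Tuple[int, int]]:
    """Scan left to right, always taking the longest pattern that matches."""
    max_len = max(len(p) for p in patterns)
    spans: List[Tuple[int, int]] = []
    i = 0
    while i < len(tags):
        # Try the longest window first, then shrink it.
        for n in range(min(max_len, len(tags) - i), 0, -1):
            if tuple(tags[i:i + n]) in patterns:
                spans.append((i, i + n))  # [start, end) span
                i += n
                break
        else:
            i += 1  # no pattern starts here; move on
    return spans


def to_bio(length: int, spans: Sequence[Tuple[int, int]]) -> List[str]:
    """Turn [start, end) spans into B-NP / I-NP / O labels."""
    labels = ["O"] * length
    for start, end in spans:
        labels[start] = "B-NP"
        for k in range(start + 1, end):
            labels[k] = "I-NP"
    return labels


def crf_rows(words: Sequence[str], tags: Sequence[str]) -> List[str]:
    """Build 'word<TAB>pos<TAB>label' rows (an assumed training format)."""
    labels = to_bio(len(words), forward_max_match(tags))
    return [f"{w}\t{t}\t{l}" for w, t, l in zip(words, tags, labels)]


if __name__ == "__main__":
    words = ["новая", "книга", "лежит", "на", "столе"]
    tags = ["ADJ", "NOUN", "VERB", "ADP", "NOUN"]
    print("\n".join(crf_rows(words, tags)))
```

In this example the matcher prefers `ADJ NOUN` over the single-word `NOUN` pattern, so "новая книга" is labeled as one candidate phrase. Rows produced this way could serve as automatically generated training data, with the trained model later correcting cases the greedy matcher gets wrong.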
Second, a translation method for Russian base noun phrases based on implicit knowledge is proposed. "Implicit knowledge" refers to the linguistic features that inflection hides inside a Russian word, such as part of speech, case, gender, and number. These features are represented explicitly in the corpus, and translation is then performed on the feature-augmented corpus. This largely alleviates the data sparsity of morphologically rich Russian corpora, improves word alignment to some extent, and ultimately raises the quality with which the translation system renders Russian base noun phrases. While saving annotation cost, the recognition method achieves an F-measure of 84.14% on Russian base noun phrases. The translation method that uses linguistic features explicitly achieves a BLEU score of 0.4257 on Russian base noun phrases, about 10 percentage points higher than the traditional phrase-based machine translation method.
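As an illustration of representing implicit morphological knowledge explicitly, the sketch below rewrites each Russian token as a lemma followed by its features. The `|`-separated layout is loosely modeled on the notation used by factored translation models and is an assumption, as are the tiny hand-written lexicon and the names `Morph`, `LEXICON`, and `annotate`; a real system would obtain the analyses from a morphological analyzer and would have to resolve ambiguous forms.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class Morph:
    lemma: str
    pos: str
    case: str
    gender: str
    number: str


# Hypothetical analyses; each form is given a single reading for simplicity,
# although forms such as "книги" are ambiguous in real Russian text.
LEXICON: Dict[str, Morph] = {
    "новой": Morph("новый", "ADJ", "gen", "fem", "sg"),
    "книги": Morph("книга", "NOUN", "gen", "fem", "sg"),
}


def annotate(tokens: Sequence[str],
             lexicon: Dict[str, Morph] = LEXICON,
             sep: str = "|") -> str:
    """Replace each known token with 'lemma|pos|case|gender|number'."""
    out = []
    for tok in tokens:
        m = lexicon.get(tok.lower())
        if m is None:
            out.append(tok)  # unknown word: keep the surface form unchanged
        else:
            out.append(sep.join((m.lemma, m.pos, m.case, m.gender, m.number)))
    return " ".join(out)


if __name__ == "__main__":
    print(annotate(["новой", "книги"]))
```

Because many inflected forms collapse onto one lemma plus a small set of feature values, a corpus annotated this way contains fewer distinct rare tokens, which is one plausible reading of how the explicit features reduce data sparsity and help word alignment.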

Contents:
