Multilingual Sentence Alignment Using a Translation Model (번역 모델을 이용한 다국어 문장 정렬)

Recently, to overcome the limitations of rule-based machine translation, many researchers have studied statistical machine translation. Statistical machine translation treats translation as decoding an input document, written in a source language, using probabilities. In training, conditional probabilities between the words of the two languages are estimated from parallel corpora, that is, sets of sentence pairs written in different languages but carrying the same meaning, and context probabilities are estimated from target-language text. Good translation quality requires a large amount of parallel data, but collecting parallel corpora manually takes a great deal of time. Bilingual corpora, by contrast, are very easy to collect, so sentence alignment is needed to convert bilingual corpora into parallel corpora automatically.
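The two probability sources described above are commonly combined in the noisy-channel formulation. As a sketch, with f denoting the source sentence and e a candidate target sentence (symbols introduced here for illustration, not taken from the original text):

```latex
\hat{e} = \arg\max_{e} P(e \mid f) = \arg\max_{e} P(f \mid e)\, P(e)
```

Here P(f | e) plays the role of the translation model estimated from parallel corpora, and P(e) the language model estimated from target-language text.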
Sentence alignment is the task of finding corresponding sentences between two documents written in different languages. The traditional approach is the length-based method, which depends only on the fact that the lengths of aligned sentences in the source and target languages are highly correlated. It therefore cannot guarantee that the aligned sentences have the same meaning. To solve this problem, lexical-based methods, which use lexical information from within the input documents, were proposed. However, they are much slower than the length-based method, and they cannot guarantee good results for languages with different structures, such as Korean and English. To address this, other approaches use a bilingual dictionary instead of lexical information from the input documents. Such a method cannot guarantee good results when a document contains cases where multiple words of the source language correspond to one word of the target language, or vice versa.
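To make the length-based idea concrete, the following is a minimal sketch, not the method evaluated here. It aligns sentences by dynamic programming over "beads" (groups of source and target sentences aligned to each other), scoring each bead by relative character-length mismatch plus a fixed per-bead-type penalty. The penalty values, the cost function, and the name `align_by_length` are assumptions chosen for illustration; a real length-based aligner would typically use a probabilistic length model instead.

```python
from typing import List, Tuple

# Bead types as (source sentences consumed, target sentences consumed),
# each with an illustrative fixed penalty. These values are assumptions.
BEADS = {(1, 1): 0.0, (1, 0): 4.0, (0, 1): 4.0, (2, 1): 2.0, (1, 2): 2.0}


def length_cost(src_len: int, tgt_len: int) -> float:
    """Relative length mismatch in [0, 1]; 0 means identical lengths."""
    longest = max(src_len, tgt_len)
    return abs(src_len - tgt_len) / longest if longest else 0.0


def align_by_length(src: List[str], tgt: List[str]) -> List[Tuple[List[int], List[int]]]:
    """Return beads as (source indices, target indices) with minimal total cost."""
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for (di, dj), penalty in BEADS.items():
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                s_len = sum(len(s) for s in src[i:ni])
                t_len = sum(len(t) for t in tgt[j:nj])
                c = cost[i][j] + penalty + length_cost(s_len, t_len)
                if c < cost[ni][nj]:
                    cost[ni][nj] = c
                    back[ni][nj] = (di, dj)
    # Walk the back-pointers from the end to recover the chosen beads.
    beads = []
    i, j = n, m
    while (i, j) != (0, 0):
        di, dj = back[i][j]
        beads.append((list(range(i - di, i)), list(range(j - dj, j))))
        i, j = i - di, j - dj
    beads.reverse()
    return beads
```

Because the score looks only at lengths, two unrelated sentences of similar length are indistinguishable from a true translation pair, which is the weakness noted above.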
In this work, to solve the problems of previous sentence alignment methods, we propose a new method that combines the length-based method with lexical information. The proposed method is as follows: (1) We translate the source document and the target document into English using an existing machine translation system. (2) We apply a monolingual sentence alignment method, in which lexical information is used in place of the case penalty of beads. (3) We map the result of (2) back to the original source language and target language.
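The three steps above can be sketched as follows. This is an illustrative outline under stated assumptions, not the implementation evaluated here: the translators are passed in as hypothetical callables (`translate_src`, `translate_tgt`), and the lexical score is assumed to be one minus the Jaccard overlap of word sets, since the exact scoring function is not given in the text. Mapping back in step (3) needs no second translation, because sentence indices are preserved through step (1).

```python
from typing import Callable, List, Tuple

# Allowed beads: (English source sentences consumed, English target sentences consumed).
MOVES = [(1, 1), (1, 0), (0, 1), (2, 1), (1, 2)]


def lexical_distance(a: str, b: str) -> float:
    """1 - Jaccard overlap of lowercased word sets (an assumed stand-in score)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa and not wb:
        return 0.0
    return 1.0 - len(wa & wb) / len(wa | wb)


def length_distance(a: str, b: str) -> float:
    """Relative character-length mismatch in [0, 1]."""
    longest = max(len(a), len(b))
    return abs(len(a) - len(b)) / longest if longest else 0.0


def align_monolingual(src_en: List[str], tgt_en: List[str]) -> List[Tuple[List[int], List[int]]]:
    """Step (2): DP alignment where lexical distance replaces a fixed bead penalty."""
    n, m = len(src_en), len(tgt_en)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for di, dj in MOVES:
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                s = " ".join(src_en[i:ni])
                t = " ".join(tgt_en[j:nj])
                c = cost[i][j] + length_distance(s, t) + lexical_distance(s, t)
                if c < cost[ni][nj]:
                    cost[ni][nj] = c
                    back[ni][nj] = (di, dj)
    beads = []
    i, j = n, m
    while (i, j) != (0, 0):
        di, dj = back[i][j]
        beads.append((list(range(i - di, i)), list(range(j - dj, j))))
        i, j = i - di, j - dj
    beads.reverse()
    return beads


def align_via_pivot(
    src: List[str],
    tgt: List[str],
    translate_src: Callable[[str], str],
    translate_tgt: Callable[[str], str],
) -> List[Tuple[List[str], List[str]]]:
    # Step (1): translate both documents into English, sentence by sentence.
    src_en = [translate_src(s) for s in src]
    tgt_en = [translate_tgt(t) for t in tgt]
    # Step (2): align the two English documents.
    beads = align_monolingual(src_en, tgt_en)
    # Step (3): map bead indices back to the original-language sentences.
    return [([src[i] for i in si], [tgt[j] for j in tj]) for si, tj in beads]
```

Any translation system can be plugged in through the two callables; for a quick check, a dictionary lookup such as `my_dict.get` is enough to exercise the pipeline.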
As a result, in sentence alignment between Korean and English, the proposed method achieves an F-1 measure of 96.20%, which is higher than all of the previous methods. In addition, to demonstrate the generality of the method, we experimented on multilingual language pairs, 34 pairs in total. In this experiment, our method scores about 2.27% higher on average than previous length-based methods.
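For reference, the F-1 measure is the harmonic mean of precision and recall. A minimal sketch of how it could be computed over aligned beads is shown below; the bead representation (tuples of source and target index tuples) and the exact-match criterion are assumptions, since the evaluation protocol is not described in the text.

```python
from typing import Iterable, Tuple

Bead = Tuple[Tuple[int, ...], Tuple[int, ...]]


def f1_measure(predicted: Iterable[Bead], gold: Iterable[Bead]) -> float:
    """F-1 over beads, counting a predicted bead as correct only on exact match."""
    pred, ref = set(predicted), set(gold)
    correct = len(pred & ref)
    if correct == 0:
        return 0.0
    precision = correct / len(pred)
    recall = correct / len(ref)
    return 2 * precision * recall / (precision + recall)
```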