Estimating Unknown Words and Resolving Analysis Ambiguity in the Morphological Analysis of Historical Texts (역사 자료 형태분석에서 미등록어 추정과 분석 중의성 해소)

Research on historical text corpora has a long history. Researchers of Korean language history have accumulated a large amount of digitized historical material, but they have not been able to exploit it fully through computational processing. Many historical corpora now exist, including the Sejong historical corpus, built by institutions and individuals. Compared with corpus construction itself and with research in other fields, however, the development of tools for using these corpora efficiently lags noticeably behind. The aim of this study is to develop an analysis tool for raw historical corpora. A morphological analyzer for historical texts makes it possible to obtain the lexical data needed for research on historical vocabulary quickly and at lower cost, and it also supports the compilation and development of a historical dictionary of Korean. To this end, the study takes a corpus of printed classical novels of almost 1.6 million word units, together with its morphological analysis results, as the primary data for compiling the dictionaries on which the analyzer is built. Chapter 1 surveys the scale of the historical corpora constructed so far and how they are used. Chapter 2 reviews the basic existing approaches to tagging and morphological analysis and the state of research on historical texts, and introduces the historical materials used in this study. Chapter 3 describes the structure of the dictionaries used for analysis and how they are built. They consist mainly of a language dictionary, a grammar dictionary, a stem dictionary, and an appellation dictionary, and all of them can be updated. These dictionaries are used to handle unknown words; cases that cannot yet be handled can be analyzed more accurately as the dictionaries are refined.
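The dictionary-based handling of unknown words described above can be pictured with a minimal Python sketch. It is not the thesis's implementation: the two toy dictionaries, the tag names (`NOUN`, `SUBJ`, `OBJ`, `LOC`, `UNKNOWN`), and the function `analyze` are all hypothetical, and the fallback rule (a known ending attached to an unregistered stem) is only one plausible way to estimate an unknown word.

```python
from typing import Dict, List, Tuple

Analysis = List[Tuple[str, str]]  # sequence of (morpheme, tag) pairs

# Hypothetical toy dictionaries; entries and tag names are illustrative only.
STEM_DICT: Dict[str, List[str]] = {"사람": ["NOUN"], "나라": ["NOUN"]}
ENDING_DICT: Dict[str, List[str]] = {"이": ["SUBJ"], "을": ["OBJ"], "에": ["LOC"]}


def analyze(word: str,
            stems: Dict[str, List[str]] = STEM_DICT,
            endings: Dict[str, List[str]] = ENDING_DICT) -> List[Analysis]:
    """Return every candidate analysis of `word` as stem (+ ending)."""
    results: List[Analysis] = []

    # Case 1: the whole word is a registered stem.
    for tag in stems.get(word, []):
        results.append([(word, tag)])

    # Case 2: registered stem followed by a registered ending.
    for i in range(1, len(word)):
        head, tail = word[:i], word[i:]
        for etag in endings.get(tail, []):
            for stag in stems.get(head, []):
                results.append([(head, stag), (tail, etag)])

    if results:
        return results

    # Fallback (assumed rule): a known ending after an unregistered stem
    # yields an UNKNOWN stem candidate that could later be added to the
    # stem dictionary.
    for i in range(1, len(word)):
        head, tail = word[:i], word[i:]
        for etag in endings.get(tail, []):
            results.append([(head, "UNKNOWN"), (tail, etag)])

    # Nothing matched at all: treat the whole word as unknown.
    return results or [[(word, "UNKNOWN")]]
```

Calling `analyze("구름을")` with these toy dictionaries falls through to the fallback and proposes an unknown stem followed by the known ending; adding the stem to `STEM_DICT` would turn the same word into a regular dictionary hit, which mirrors the idea that coverage improves as the dictionaries are refined.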
Chapter 4 explains how ambiguity among candidate analyses is resolved with the Viterbi algorithm over a hidden Markov model. A probabilistic model is built for this disambiguation, and smoothing is applied where a frequency is zero so that unseen events do not distort the result. Chapter 5 discusses the construction and use of the morphological analyzer system for historical texts. Chapter 6 presents the analysis results produced by the analyzer.
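As an illustration of the disambiguation step, the following Python sketch trains a bigram hidden Markov model on a tiny tagged sample and decodes with the Viterbi algorithm. Several things here are assumptions rather than details from the thesis: add-one (Laplace) smoothing stands in for whatever smoothing the author used, the model tags whole tokens instead of choosing among full morphological analyses, and the romanized training data, the tag names, and the functions `train` and `viterbi` are hypothetical.

```python
import math
from collections import Counter


def train(tagged_sentences):
    """Collect transition, emission, context, and tag counts."""
    trans, emit, ctx, tag_count = Counter(), Counter(), Counter(), Counter()
    vocab = set()
    for sent in tagged_sentences:
        prev = "<s>"  # sentence-start pseudo-tag
        for word, tag in sent:
            trans[(prev, tag)] += 1
            ctx[prev] += 1
            emit[(tag, word)] += 1
            tag_count[tag] += 1
            vocab.add(word)
            prev = tag
    return trans, emit, ctx, tag_count, vocab


def viterbi(words, model):
    """Return the most probable tag sequence for `words`."""
    if not words:
        return []
    trans, emit, ctx, tag_count, vocab = model
    tags = sorted(tag_count)
    n_tags = len(tags)
    n_words = len(vocab) + 1  # one extra slot for unseen words

    # Add-one smoothing keeps zero counts from producing log(0).
    def lp_trans(prev, tag):
        return math.log((trans[(prev, tag)] + 1) / (ctx[prev] + n_tags))

    def lp_emit(tag, word):
        return math.log((emit[(tag, word)] + 1) / (tag_count[tag] + n_words))

    # best[tag] = (log-probability, path) of the best path ending in tag.
    best = {t: (lp_trans("<s>", t) + lp_emit(t, words[0]), [t]) for t in tags}
    for word in words[1:]:
        step = {}
        for t in tags:
            score, path = max(
                (best[p][0] + lp_trans(p, t), best[p][1]) for p in tags
            )
            step[t] = (score + lp_emit(t, word), path + [t])
        best = step
    return max(best.values())[1]


# Hypothetical toy data: "i" occurs both as a particle and as a noun.
TOY_CORPUS = [
    [("saram", "NOUN"), ("i", "PART"), ("o", "VERB")],
    [("nara", "NOUN"), ("i", "PART"), ("o", "VERB")],
    [("i", "NOUN"), ("o", "VERB")],
]
MODEL = train(TOY_CORPUS)
```

With this toy model, `viterbi(["saram", "i", "o"], MODEL)` tags the ambiguous token `i` as a particle after a noun, while `viterbi(["i", "o"], MODEL)` tags it as a noun at the start of a sentence. A token that never appeared in training still receives a nonzero probability because of the smoothing.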
