This research proposes a new method for automatic mapping between KorLexNoun (KLN) and Sejong semantic classes (SJSC) based on semantic similarity. KLN is part of the Korean lexico-semantic network (KorLex), a Korean WordNet that contains a large number of words linked by semantic relations. However, it does not provide enough information for semantic and syntactic analysis.
The Sejong electronic dictionary (SJD) was built for general-purpose Korean natural language processing. It provides detailed semantic and syntactic information and is organized around SJSC, but it contains far fewer words than KorLex.
To improve semantic and syntactic analysis, earlier research mapped KorLex to SJD manually. Manual mapping, however, is costly and time-consuming and requires highly skilled annotators. It is also hard to maintain, because both language resources, KorLex and SJD, are continually being developed and expanded.
This research presents two methods for automatic mapping between KLN and SJSC that overcome the weaknesses of manual mapping by measuring the semantic similarity between the two resources, one using linguistic information and the other using statistical information. The first method, based on linguistic information, assesses semantic similarity using the monosemy or polysemy of KLN entries, the nouns of SJD and the word senses of KLN, and the semantically related words or synonym sets in SJD and KLN. The second method, based on statistical information, assesses semantic similarity by computing statistical measures, including mutual information and information gain, between the words belonging to an SJSC and the word senses of the synonym sets in KLN.
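As a rough illustration of the statistical measures named above, the following Python sketch computes pointwise mutual information and information gain from a 2x2 count table. The table layout (how often a word appears with or without a given semantic class) and the function names are assumptions made for illustration; the abstract does not specify how the counts are collected or how the scores are combined into a mapping decision.

```python
import math


def pointwise_mi(n11, n10, n01, n00):
    """Pointwise mutual information (in bits) between 'word present' and 'class applies'.

    Hypothetical 2x2 count table (assumes at least one observation in total):
      n11: word present, class applies     n10: word present, class does not apply
      n01: word absent, class applies      n00: word absent, class does not apply
    """
    n = n11 + n10 + n01 + n00
    if n11 == 0:
        return float("-inf")  # the word and the class never co-occur
    return math.log2((n11 * n) / ((n11 + n10) * (n11 + n01)))


def entropy(counts):
    """Shannon entropy (in bits) of a distribution given as raw counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(n11, n10, n01, n00):
    """Reduction in class entropy obtained by knowing whether the word is present."""
    n = n11 + n10 + n01 + n00
    h_class = entropy([n11 + n01, n10 + n00])
    h_given_word = ((n11 + n10) / n) * entropy([n11, n10])
    h_given_no_word = ((n01 + n00) / n) * entropy([n01, n00])
    return h_class - h_given_word - h_given_no_word


# Toy counts only; not data from the experiment.
independent = (1, 1, 1, 1)   # word and class unrelated
associated = (5, 0, 0, 5)    # word appears exactly when the class applies
print(pointwise_mi(*independent), information_gain(*independent))
print(pointwise_mi(*associated), information_gain(*associated))
```

Both scores are zero when the word and the class are statistically independent and grow as the association strengthens, which is why they could serve as similarity evidence between a class's words and a synonym set's senses.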
In an experiment that used the manual mapping as the gold standard, the proposed methods outperformed existing automatic mapping between the two language resources. The method using linguistic information achieved recall 0.838, precision 0.718, and F1-measure 0.774; the method using statistical information and heuristics achieved recall 0.826, precision 0.712, and F1-measure 0.765.
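The evaluation against the manual mapping can be sketched as a set comparison. The snippet below is a minimal example, assuming each mapping is represented as a set of (synset, semantic class) pairs; that representation, the function name, and the toy identifiers are hypothetical and not taken from the original experiment.

```python
def evaluate(predicted, gold):
    """Return (precision, recall, F1) of predicted pairs against gold-standard pairs."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Toy data only: hypothetical synset IDs and class labels.
gold = {("s1", "A"), ("s2", "B"), ("s3", "C")}
predicted = {("s1", "A"), ("s2", "B"), ("s3", "X"), ("s4", "D")}
precision, recall, f1 = evaluate(predicted, gold)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Applying the same harmonic-mean formula to the reported precision and recall gives roughly 0.773 and 0.765, which agrees with the reported F1 values up to rounding.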