Phonetic similarity algorithms can be used to find equivalence classes of transliterated foreign words and thereby improve the effectiveness of information retrieval systems. KODEX and EKODEX are phonetic similarity algorithms for Korean words. KODEX is based on SOUNDEX, a phonetic similarity algorithm for English that converts the consonant sounds of a word into a sequence of codes so that phonetically similar words can be found. KODEX detects many variants of a transliterated word (high recall), but it also incorrectly matches many unrelated transliterated words (low precision). EKODEX is an improved version of KODEX that also uses some vowel information, the null sound 'ㅇ' (ieung), and other features. It improves precision slightly, but not by as much as we expected.
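
To make the consonant-coding idea concrete, the following is a minimal sketch of the standard American SOUNDEX for English, the algorithm KODEX is described as building on. It is not KODEX or EKODEX themselves, whose Korean code tables are not given here; the function name `soundex` and the `length` parameter are illustrative choices.

```python
# Minimal sketch of American SOUNDEX (English only). This is NOT KODEX/EKODEX;
# it only illustrates the "consonants -> code sequence" idea they build on.

# Consonant groups that share a code; vowels, y, h and w carry no code.
_GROUPS = [("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
           ("l", "4"), ("mn", "5"), ("r", "6")]
CODES = {ch: digit for letters, digit in _GROUPS for ch in letters}


def soundex(word, length=4):
    """Return the SOUNDEX key of `word`, e.g. a letter followed by three digits."""
    letters = [c for c in word.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    key = first.upper()
    prev = CODES.get(first, "")  # code of the previous coded position
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate consonants with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            key += code
        prev = code  # a vowel resets prev, so a repeated code is emitted again
    return (key + "0" * length)[:length]  # pad with zeros, then truncate


if __name__ == "__main__":
    for w in ["Robert", "Rupert", "Child", "Colt"]:
        print(w, soundex(w))
```

Because vowels are discarded, spelling variants such as "Robert" and "Rupert" receive the same key, which is the source of the high recall. The same property lets unrelated words such as "Child" and "Colt" collide, which illustrates the precision problem that using vowel information is meant to reduce.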