Research on Text Analysis Methods in a Thai Text-to-Speech System




Thailand lies to the south of China, in the middle of Southeast Asia, and is a friendly neighbor of China. Commercial exchange between the two countries is frequent, and Thailand is one of China's main partners in economic and political affairs. Thai is the official language of Thailand and currently has 60 million speakers. It is an analytic, isolating language whose basic vocabulary consists of monosyllabic words. Thai is also a tonal language, in which tone distinguishes lexical and grammatical meaning. Because the Southeast Asian market holds great potential, the languages of Southeast Asia are becoming a research focus in the field of speech synthesis. The main work of this paper is as follows:

1. Entry packages were downloaded from a professional online Thai dictionary, and entries covering frequently used words, compound words, place names, numerals, and other word types were selected. Standard phonetic transcriptions and grammatical information were added by manually querying online dictionaries, yielding a Thai dictionary.
2. Frequently used sentences were selected from professional Thai books and websites. The corpus was then filtered further by removing sentences of unsuitable length or format, and the remaining sentences serve as the initial corpus for front-end text analysis.
3. On the basis of the constructed Thai dictionary, front-end text analysis for the Thai TTS system was carried out. In view of the characteristics of Thai speech, a dictionary-based backward maximum matching algorithm was designed for word segmentation, and the segmentation results were converted into the corresponding Thai syllable information using the dictionary.
4. For words that the dictionary-based backward maximum matching algorithm cannot match (i.e., unknown words), a handling method based on Thai syllable spelling rules was designed.
5. An improved Thai romanization coding scheme was designed and implemented. Thai text was encoded with this scheme, and the result was compared with standard Thai pronunciation.

In this paper, Thai text is segmented with a dictionary-based backward matching algorithm. Words contained in the dictionary are segmented accurately, and unknown words are split into syllables as far as possible. In the experiments, the initial segmentation accuracy reached 78%. A speech synthesis system needs romanized text from which syllable and tone information can be extracted, so the Thai text must also be romanized. The improved Thai romanization coding scheme expresses Thai syllable and tone information more accurately.
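The dictionary-based backward maximum matching step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function name `backward_max_match`, the five-entry toy lexicon, and the single-character fallback for unknown text are all assumptions made for illustration. In particular, the thesis handles unknown words with Thai syllable spelling rules, which this sketch does not attempt to reproduce.

```python
def backward_max_match(text, lexicon):
    """Segment `text` from right to left, preferring the longest lexicon entry.

    Returns a list of (token, known) pairs in reading order. `known` is False
    for characters that no lexicon entry covers (hypothetical placeholder for
    the syllable-rule handling of unknown words).
    """
    # Longest entry length bounds the window we need to try at each position.
    max_len = max((len(word) for word in lexicon), default=1)
    tokens = []
    end = len(text)
    while end > 0:
        # Try the longest candidate ending at `end` first, then shrink it.
        for size in range(min(max_len, end), 0, -1):
            candidate = text[end - size:end]
            if candidate in lexicon:
                tokens.append((candidate, True))
                end -= size
                break
        else:
            # Nothing matched: emit one character as unknown and move on.
            tokens.append((text[end - 1], False))
            end -= 1
    tokens.reverse()  # tokens were collected from the end of the text
    return tokens


# Toy lexicon (assumed for this example only, not taken from the thesis).
toy_lexicon = {"ฉัน", "รัก", "ภาษา", "ไทย", "ภาษาไทย"}

segments = backward_max_match("ฉันรักภาษาไทย", toy_lexicon)
print([token for token, _ in segments])
# → ['ฉัน', 'รัก', 'ภาษาไทย']
```

Because the longest candidate is tried first, the compound entry "ภาษาไทย" is chosen over the shorter entries "ภาษา" and "ไทย". The matching here works on Unicode code points, so Thai vowel signs and tone marks count as separate characters when entry lengths are compared.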


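Abstract (Chinese)
Abstract (English)
Chapter 1 Introduction
    1.1 Foreword
    1.2 History of speech synthesis technology
    1.3 Importance of text analysis in speech synthesis
    1.4 Research approach and main work of the thesis
    1.5 Organization of the thesis
Chapter 2 Thai Text-to-Speech System
    2.1 Introduction to Thai
    2.2 Current state of research
    2.3 Key technologies of text-to-speech conversion
        2.3.1 Overview of TTS systems
        2.3.2 Speech waveform synthesis methods in text-to-speech systems
    2.4 Process of building a Thai text-to-speech system
    2.5 Chapter summary
Chapter 3 Thai Text Analysis Methods
    3.1 Overview of text analysis
    3.2 Word segmentation
        3.2.1 Overview of word segmentation techniques
    3.3 Experimental scheme
        3.3.1 Building the Thai dictionary
        3.3.2 Automatic Thai word segmentation
        3.3.3 Unknown word handling
    3.4 Experimental results and analysis
        3.4.1 Algorithm complexity analysis
        3.4.2 Analysis of Thai segmentation results
    3.5 Chapter summary
Chapter 4 Romanization of Thai Text
    4.1 Overview and background of romanization
        4.1.1 Transliteration
        4.1.2 Transcription
    4.2 Applications of romanization
    4.3 Thai romanization coding standards
    4.4 Experimental scheme
        4.4.1 Improved Thai romanization coding scheme
        4.4.2 Tone modification scheme
        4.4.3 Special tone handling scheme
        4.4.4 Special symbol handling
    4.5 Experimental results and analysis
    4.6 Chapter summary
Chapter 5 Summary and Outlook
    5.1 Summary
    5.2 Outlook
Appendix
References
Acknowledgements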
