In this paper, we compare the performance of several speaker adaptation methods for an HMM-based Korean speech synthesis system when only a small amount of adaptation data is available. According to objective and subjective evaluations, a hybrid of constrained structural maximum a posteriori linear regression (CSMAPLR) and maximum a posteriori (MAP) adaptation outperforms other methods such as maximum likelihood linear regression (MLLR), constrained MLLR (CMLLR), and CSMAPLR alone when only 5 minutes of adaptation data are available for the target speaker. During the objective evaluation, we find that the duration models are not as well adapted to the target speaker as the spectral envelope and pitch models. To alleviate this problem, we propose a duration rectification method and a duration interpolation method. Both the objective and subjective evaluations show that incorporating the two proposed methods into the conventional speaker adaptation method effectively improves duration model adaptation.