Sentiment analysis is one of the most popular techniques in text mining. It has two main approaches: lexicon-based and machine learning-based. In this work, we use a machine learning-based approach to classify the sentiment of Korean text. Supervised classification achieves higher accuracy than unsupervised learning, but it requires a large amount of labeled data. Such data is hard to obtain: few open-source Korean corpora with sentiment labels are available, and building a labeled corpus takes considerable time and manpower. We classify a Korean corpus with supervised learning, and to do so we first label the unlabeled corpus so that it can be used for supervised training. The labeling uses a semi-supervised approach. Although many studies apply unsupervised learning to unlabeled corpora, a semi-supervised approach performs better than an unsupervised one when a small amount of labeled data is available. In our work, we manually label a small set of documents and apply self-training. Another contribution of this work is the vector representation of documents. We classify the documents with an SVM, which requires a sentiment feature vector for each document as input. We use the document vector generated by a doc2vec model, compute a polarity vector for the document, and concatenate the two to form the document's feature vector.
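The pipeline described above (concatenated document features, an SVM classifier, and self-training on unlabeled documents) can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the implementation used in this work: the doc2vec and polarity vectors are hard-coded toy values rather than outputs of a trained model, the classifier is a small linear SVM trained by subgradient descent on the hinge loss (the abstract does not state the kernel or library), and the names `concat_features`, `train_linear_svm`, `self_train`, and the confidence `threshold` are hypothetical.

```python
import random

def concat_features(doc_vec, polarity_vec):
    """Concatenate a document vector and a polarity vector into one feature vector."""
    return list(doc_vec) + list(polarity_vec)

def decision(model, x):
    """Signed score w.x + b; its magnitude serves as the confidence."""
    w, b = model
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def predict(model, x):
    """Return +1 (positive sentiment) or -1 (negative sentiment)."""
    return 1 if decision(model, x) > 0 else -1

def train_linear_svm(X, y, epochs=200, lam=0.01, lr=0.1, seed=0):
    """Linear SVM via subgradient descent on the L2-regularized hinge loss.

    Labels must be +1 or -1. A stand-in for whatever SVM the work used.
    """
    rng = random.Random(seed)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            w = [wj * (1.0 - lr * lam) for wj in w]   # regularization shrink
            if margin < 1.0:                          # hinge loss is active
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b

def self_train(X_lab, y_lab, X_unlab, threshold=0.5, max_rounds=10):
    """Self-training: repeatedly pseudo-label confident unlabeled documents.

    `threshold` (an assumed value) is the minimum |score| for a pseudo-label.
    """
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unlab)
    model = train_linear_svm(X_lab, y_lab)
    for _ in range(max_rounds):
        scores = [decision(model, x) for x in pool]
        confident = {i for i, s in enumerate(scores) if abs(s) >= threshold}
        if not confident:
            break  # nothing left that the model is sure about
        for i in sorted(confident):
            X_lab.append(pool[i])
            y_lab.append(1 if scores[i] > 0 else -1)
        pool = [x for i, x in enumerate(pool) if i not in confident]
        model = train_linear_svm(X_lab, y_lab)  # retrain on the enlarged set
    return model, X_lab, y_lab, pool

# Toy data: 2-d "doc2vec" vectors plus 2-d (positive, negative) polarity vectors.
labeled = [concat_features([1.0, 1.0], [0.9, 0.1]),
           concat_features([-1.0, -1.0], [0.1, 0.9])]
labels = [1, -1]
unlabeled = [concat_features([1.2, 1.1], [0.9, 0.1]),
             concat_features([-1.1, -1.2], [0.1, 0.9])]

model, X_all, y_all, remaining = self_train(labeled, labels, unlabeled)
print("labeled set size after self-training:", len(X_all))
print("documents still unlabeled:", len(remaining))
```

In practice the toy vectors would be replaced by vectors inferred from a trained doc2vec model and by polarity scores derived from the document, and the confidence threshold would need tuning, since a threshold that is too low lets wrong pseudo-labels accumulate across rounds.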