Abstract: The recognition of temporal expressions and events has attracted wide attention and developed rapidly in recent years. As the foundation of temporal relation research, recognizing temporal expressions, events, and their attributes has become an important task in natural language processing. In 2010, temporal expression recognition and event recognition were included as two separate subtasks in the TempEval evaluation, which covered six languages: Chinese, English, Italian, French, Korean, and Spanish. Most participating systems targeted English and Spanish. This thesis addresses the TempEval-2 tasks and recognizes temporal expressions, events, and their related attributes. The experiments use the Chinese corpus provided by the evaluation, and the annotation follows the TimeML standard. The specific work is as follows:

1 Problem analysis. The thesis defines and analyzes the problems of temporal expression recognition and event recognition in detail, and examines the specific content and the difficulties of each recognition task. This lays the groundwork for the recognition methods proposed later.

2 Recognition of temporal expressions and their types. The thesis gives a detailed account of temporal expression recognition, including the underlying idea and the recognition procedure. The work in this part covers recognizing temporal expressions and recognizing their types. Temporal expressions are recognized with a rule base of time units built from part-of-speech information, and their types are classified with a maximum entropy model. Temporal expression recognition reaches a precision of 85.16%, a recall of 83.16%, and an F-measure of 84.17%; type recognition reaches an accuracy of 93.02%. These results indicate that both the rule-based recognition of temporal expressions and the maximum-entropy-based type recognition are effective. Finally, the experimental results and errors are analyzed and summarized in depth.

3 Recognition of events and their attributes. The work in this part covers recognizing events and recognizing event attributes, where attribute recognition focuses on the tense attribute. Events are recognized with a method based on dependency parsing and rules, and the tense attribute of events is recognized with a rule-based method. Event recognition reaches a precision of 89.2%, a recall of 82.8%, and an F-measure of 85.9%; event tense recognition reaches an accuracy of 76.9%. Besides comparing and analyzing the shortcomings and errors in the experimental results, the thesis also analyzes in depth the problems that arise during recognition.
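The abstract reports precision, recall, and F-measure without saying how the F-measure was computed. The sketch below assumes the usual balanced F1 (the harmonic mean of precision and recall); the function name `f_measure` is illustrative and does not come from the thesis. Under that assumption the event figures (89.2%, 82.8%) reproduce the reported 85.9%, while the temporal expression figures (85.16%, 83.16%) give about 84.15%, slightly below the reported 84.17%.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Inputs and output are percentages. Assumes the thesis used F1;
    the abstract does not state this explicitly.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract.
event_f = f_measure(89.2, 82.8)    # event recognition
time_f = f_measure(85.16, 83.16)   # temporal expression recognition

print(f"event F1: {event_f:.2f}")
print(f"time  F1: {time_f:.2f}")
```

The small gap for temporal expressions may come from rounding or from a different averaging scheme in the original evaluation; the abstract gives no detail that would settle this.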
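Contents:
Abstract (Chinese)
Abstract (English)
Chapter 1 Introduction
  1.1 Research background
  1.2 Research status
    1.2.1 Research status of recognizing temporal expressions and their attributes
    1.2.2 Research status of recognizing events and their attributes
  1.3 Main work of the thesis
  1.4 Structure of the thesis
Chapter 2 Analysis of the temporal expression and event recognition problems
  2.1 Definitions related to temporal expressions and events
  2.2 Content and specification of temporal expression and event recognition
  2.3 Analysis of problems and difficulties
    2.3.1 Temporal expression recognition task
    2.3.2 Event recognition task
  2.4 Chapter summary
Chapter 3 Recognition of temporal expressions and their types
  3.1 Rule-based recognition of temporal expressions
    3.1.1 Building a rule base of time units from part-of-speech information
    3.1.2 Recognition procedure
  3.2 Maximum-entropy-based recognition of temporal expression types
    3.2.1 Feature selection
    3.2.2 Recognition procedure
  3.3 Analysis of experimental results
    3.3.1 Introduction to the experimental corpus
    3.3.2 Description and analysis of experimental results
  3.4 Chapter summary
Chapter 4 Recognition of events and their attributes
  4.1 Event recognition based on dependency syntax and rules
    4.1.1 Dependency parsing tool
    4.1.2 Recognition rules
    4.1.3 Recognition method and procedure
  4.2 Rule-based recognition of the event tense attribute
    4.2.1 Preparation of the experimental corpus
    4.2.2 Acquisition of the rule base
    4.2.3 Recognition method and procedure
  4.3 Analysis of experimental results
    4.3.1 Introduction to the training and test corpora
    4.3.2 Description and analysis of experimental results
  4.4 Chapter summary
Chapter 5 Conclusions and outlook
  5.1 Conclusions
  5.2 Outlook
References
Research achievements during the degree program
Acknowledgements
Personal profile and contact information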