
Research on Knowledge Unit Representation and Extraction for Unstructured Abstracts of Chinese Scientific and Technical Literature: Based on Knowledge Unit Ontology Theory
Original title: 面向中文科技文献非结构化摘要的知识元表示与抽取研究——基于知识元本体理论
Citation: Zheng Mengyue, Qin Chunxiu, Ma Xubu. Research on Knowledge Unit Representation and Extraction for Unstructured Abstracts of Chinese Scientific and Technical Literature: Based on Knowledge Unit Ontology Theory [J]. Information Studies: Theory & Application (情报理论与实践), 2020, 43(2): 157-163.
Authors: Zheng Mengyue (郑梦悦), Qin Chunxiu (秦春秀), Ma Xubu (马续补)
Affiliation (all authors): School of Economics and Management, Xidian University, Xi'an, Shaanxi 710071, China
Funding: Supported by the National Natural Science Foundation of China project "Resource Semantic Space and Its Retrieval in Knowledge Communities" (Grant No. 71573199)
Abstract: [Purpose/Significance] Scientific and technical literature has grown explosively in recent years, yet a large number of unstructured abstracts remain in this mass of literature. Unstructured abstracts are harder for scholars to read and understand, and they hinder both the automated extraction of knowledge from the abstract and the retrieval that depends on it. Studying a knowledge representation model for unstructured abstracts of scientific literature, together with a method for extracting it automatically, therefore matters both for rapid reading by scholars and for automated machine processing. [Method/Process] Building on an analysis of the structure of unstructured abstracts and on knowledge unit ontology theory, the paper constructs a knowledge unit ontology model for unstructured abstracts of scientific literature. By analyzing the writing characteristics of such abstracts, it divides the text at the sentence level into three elements: purpose, method, and result or conclusion. It then tallies the clue words, sentence patterns, and positions of the sentences belonging to each element, builds a corresponding rule base, and constructs an extraction algorithm from the ontology model and the rule base. Finally, experiments are run on a selection of articles downloaded from Computer Technology and Development (《计算机技术与发展》). [Result/Conclusion] Adding sentence-pattern sets and clue-word sets refines the elements of unstructured abstracts and yields a knowledge unit ontology model for them. The experimental results show that the proposed model can effectively extract knowledge units from unstructured abstracts. [Limitations] The sentence patterns and clue words in the abstracts have to be summarized manually.

Keywords: scientific literature; unstructured abstract; knowledge representation; knowledge extraction; knowledge unit; ontology model
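
The abstract describes the extraction step only in outline: sentences are assigned to purpose, method, or result/conclusion using clue words, sentence patterns, and position. The Python sketch below illustrates that general idea under stated assumptions and is not the authors' algorithm. The clue-word lists, the scoring weights, the fallback label, and the function names (`split_sentences`, `classify`, `extract_elements`) are all hypothetical, since the paper's rule base is not reproduced in the abstract, and sentence patterns are omitted for brevity.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical clue words for each element. These are illustrative guesses,
# not the rule base built in the paper.
CLUE_WORDS: Dict[str, List[str]] = {
    "purpose": ["为了", "针对", "旨在", "目的"],
    "method": ["提出", "采用", "构建", "基于", "通过"],
    "result": ["结果表明", "实验表明", "结果显示", "证明"],
}


def split_sentences(abstract: str) -> List[str]:
    """Split a Chinese abstract into sentences at 。！？ and keep the mark."""
    parts = re.findall(r"[^。！？]+[。！？]?", abstract)
    return [p.strip() for p in parts if p.strip()]


def classify(sentence: str, index: int, total: int) -> str:
    """Score one sentence by clue-word hits plus a small position bonus."""
    # Assumed weighting: each clue-word hit counts 2 points.
    scores = {
        label: 2 * sum(word in sentence for word in words)
        for label, words in CLUE_WORDS.items()
    }
    # Assumed position rule: an opening sentence leans toward purpose,
    # a closing sentence toward result/conclusion.
    if index == 0:
        scores["purpose"] += 1
    if total > 1 and index == total - 1:
        scores["result"] += 1
    best = max(scores, key=scores.get)
    # Assumed fallback: sentences with no evidence are treated as method.
    return best if scores[best] > 0 else "method"


def extract_elements(abstract: str) -> List[Tuple[str, str]]:
    """Return (element label, sentence) pairs for every sentence."""
    sentences = split_sentences(abstract)
    return [
        (classify(s, i, len(sentences)), s) for i, s in enumerate(sentences)
    ]


if __name__ == "__main__":
    sample = "针对非结构化摘要难以检索的问题,研究知识元抽取方法。文章构建了知识元本体模型。实验结果表明该模型有效。"
    for label, sentence in extract_elements(sample):
        print(label, sentence)
```

A score-based rule keeps each signal separate, so a sentence-pattern matcher could be added as one more term in `scores` without changing the rest of the sketch.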
This article is indexed in VIP (维普), Wanfang Data (万方数据), and other databases.