
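Time-Lag Calculation for Multi-Source Scientific and Technical Literature in Emerging Topic Detection, and Its Implications: A Case Study of the Agricultural Discipline
Cite this article: Yang Jinqing, Lu Wei, Wu Leyan. Time-Lag Calculation for Multi-Source Scientific and Technical Literature in Emerging Topic Detection, and Its Implications: A Case Study of the Agricultural Discipline[J]. Journal of the China Society for Scientific and Technical Information (情报学报), 2021(1): 21-29.
Authors: Yang Jinqing, Lu Wei, Wu Leyan
Affiliations: School of Information Management, Wuhan University; Institute for Information Retrieval and Knowledge Mining, Wuhan University
Funding: National Natural Science Foundation of China project "Research on Citation Recommendation for Academic Literature Based on Multi-Semantic Information Fusion" (71673211); Major Project of the National Social Science Fund of China "Research on Theories and Methods of Academic Paper Evaluation Based on Cognitive Computing" (17ZDA292).
Abstract: To investigate the time-lag problem that arises when multi-source scientific and technical literature is fused for the detection of emerging topics in a discipline, this paper designs a time-lag calculation scheme for such literature. First, discipline topics are extracted from four types of literature datasets, the similarity between topics is calculated, and a similarity matrix is constructed. Second, the Hungarian optimal matching algorithm is used to find the optimal combination that minimizes similarity loss. Finally, a linear equation model is constructed and fitted to calculate the degree of time lag. The experiment uses 337,790 abstracts from the agricultural discipline published between 2009 and 2016, from which 250 topics are extracted for funded project texts, 260 for patent documents, 260 for journal articles, and 240 for conference papers. The results show that journal articles lag funded project texts and conference papers by one year, and that patent documents lag journal articles by one year. Together with earlier findings on data from other disciplines, these results verify the feasibility and effectiveness of the scheme and offer a new perspective for designing multi-source literature fusion strategies.

Keywords: multi-source fusion; time-lag calculation; emerging topic detection; scientific and technical literature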

Time-lag Calculation and Enlightenment of Multi-source Science and Technology Literature Fusion for the Detection of Emerging Research Topic: A Case Study in the Field of Agriculture
Yang Jinqing, Lu Wei, Wu Leyan. Time-lag Calculation and Enlightenment of Multi-source Science and Technology Literature Fusion for the Detection of Emerging Research Topic: A Case Study in the Field of Agriculture[J]. Journal of the China Society for Scientific and Technical Information, 2021(1): 21-29.
Authors: Yang Jinqing, Lu Wei, Wu Leyan
Institution: School of Information Management, Wuhan University, Wuhan 430072; Institute for Information Retrieval and Knowledge Mining, Wuhan University, Wuhan 430072
Abstract: To explore the time lag that arises when multi-source data are fused for emerging topic detection, this paper designs a scheme for calculating that lag. First, research topics are extracted from four kinds of scientific and technological literature datasets, and a similarity matrix is constructed by calculating the similarity between those topics. Second, the Hungarian optimal matching algorithm is used to find the optimal combination under the condition of minimum similarity loss. Finally, a linear equation model is constructed, and the time lag is calculated by fitting the model. The experimental data comprise 337,790 abstracts in agricultural disciplines from 2009 to 2016; the numbers of research topics extracted from fund projects, patents, journal articles, and conference papers are 250, 260, 260, and 240, respectively. Applying the time-lag calculation method to these data yields the following results: journal articles lag behind fund project texts and conference papers by one year, and patent documents lag behind journal articles by one year. Combined with previous research results in other disciplines, these findings verify the feasibility and effectiveness of the time-lag calculation method for multi-source scientific and technological literature and provide a new idea for formulating multi-source data fusion strategies.
Keywords: multi-source fusion; time-lag computation; emerging topic detection; scientific and technological literature
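
The three steps named in the abstract (similarity matrix, optimal one-to-one topic matching, linear fit for the lag) can be illustrated with a small Python sketch. This is not the authors' implementation, and everything specific in it is an assumption: topics are represented as term-weight vectors compared by cosine similarity, the matching is found by exhaustive search over permutations as a stand-in for the Hungarian algorithm (which reaches the same optimum in polynomial time), the matrix is assumed square, and the lag is taken as the shift k at which an ordinary least-squares line follower[t] = a * leader[t - k] + b fits best. The abstract does not specify the form of the linear model, so the function names and the R-squared criterion are hypothetical.

```python
import math
from itertools import permutations


def cosine(u, v):
    """Cosine similarity between two equal-length term-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_matrix(topics_a, topics_b):
    """Pairwise similarity of every topic in source A with every topic in source B."""
    return [[cosine(u, v) for v in topics_b] for u in topics_a]


def best_matching(sim):
    """One-to-one matching with minimum total similarity loss (1 - similarity).

    Assumption: square matrix, small enough for exhaustive search. The paper
    uses the Hungarian algorithm; brute force is only a stand-in here.
    """
    n = len(sim)
    best = min(
        permutations(range(n)),
        key=lambda p: sum(1.0 - sim[i][p[i]] for i in range(n)),
    )
    return list(enumerate(best))


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b; returns (a, b, r_squared)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return 0.0, my, 0.0
    a = sxy / sxx
    return a, my - a * mx, (sxy * sxy) / (sxx * syy)


def estimate_lag(leader, follower, max_lag=3):
    """Shift k (in years) at which follower[t] is best explained by leader[t - k].

    Assumption: both series are yearly topic intensities of equal length.
    Shifts that leave fewer than three overlapping points are skipped.
    """
    best_k, best_r2 = 0, -1.0
    for k in range(max_lag + 1):
        xs, ys = leader[: len(leader) - k], follower[k:]
        if len(xs) < 3:
            break
        _, _, r2 = fit_line(xs, ys)
        if r2 > best_r2:
            best_k, best_r2 = k, r2
    return best_k


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    source_a = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    source_b = [[0, 1, 0.1], [0.9, 0, 0.1], [0, 0.1, 1]]
    print(best_matching(similarity_matrix(source_a, source_b)))
    # → [(0, 1), (1, 0), (2, 2)]

    leader = [1, 3, 2, 5, 4, 7, 6, 9]          # e.g. 2009-2016 intensities
    follower = [0, 3, 7, 5, 11, 9, 15, 13]     # 2 * leader[t - 1] + 1 for t >= 1
    print(estimate_lag(leader, follower))       # → 1
```

With topic counts as unequal as those reported in the abstract (250, 260, 260, and 240), a real run would need a rectangular assignment solver or padding of the smaller topic set, and exhaustive search would be infeasible.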
This article is indexed in VIP (维普) and other databases.