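Research Advances of Citation Based Topic Models
Cite this article: Zou Lixue, Wang Li, Liu Xiwen. Research Advances of Citation Based Topic Models[J]. Library and Information Service, 2019, 63(23): 131-138.
Authors: Zou Lixue, Wang Li, Liu Xiwen
Affiliations: 1. National Science Library, Chinese Academy of Sciences, Beijing 100190; 2. Department of Library, Information and Archives Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190
Funding: This paper is one of the research outcomes of the National Science Library, Chinese Academy of Sciences youth talent frontier project "Research on Multi-dimensional Topic Evolution Based on Citation Content Association" (project No. G1726).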
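Abstract: [Purpose/Significance] Probabilistic topic model algorithms are continually being improved and extended. This paper studies existing citation based topic models from China and abroad, analyzes and compares the generative processes and algorithms of the different models, and discusses the application of citation based topic models to the analysis of scientific and technical text as well as directions for further research. [Method/Process] Literature from China and abroad on building topic models with citations was retrieved from the Web of Science and CNKI databases. After manual review, representative papers were selected, and the citation based topic models in them were compared and analyzed in terms of modeling idea, generative process, and parameter estimation and inference algorithms. [Result/Conclusion] Existing citation based topic models fall mainly into three groups: models of the distribution of research topics and citations, models of the relationship between cited and citing topics, and citation topic models based on citation content. Introducing citation information into a topic model yields more complete topic content and the important documents under a specific topic, and makes it possible to identify the topical relationships and influence between citing and cited documents. Most existing models are extensions of the Probabilistic Latent Semantic Analysis (PLSA) and Latent Dirichlet Allocation (LDA) topic models. Future research can extend to topic models that incorporate citation content, to performance optimization and evaluation methods for the models, and to application studies of the models.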

Keywords: topic model; citation; topic detection; citation content
Received: 2019-01-28
Revised: 2019-06-25

Research Advances of Citation Based Topic Models
Zou Lixue, Wang Li, Liu Xiwen. Research Advances of Citation Based Topic Models[J]. Library and Information Service, 2019, 63(23): 131-138.
Authors: Zou Lixue, Wang Li, Liu Xiwen
Institution: 1. National Science Library, Chinese Academy of Sciences, Beijing 100190; 2. Department of Library, Information and Archives Management, School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190
Abstract: [Purpose/significance] Probabilistic topic models are continually being improved and extended. This paper reviews research advances in citation based topic models, compares their generative processes and algorithms, and discusses their application to the text of academic articles and directions for future research. [Method/process] We collected articles on citation based topic models from the Web of Science and CNKI databases, selected representative articles after manual screening, and analyzed the generative process, parameter estimation, and inference methods of the models they describe. [Result/conclusion] There are currently three main types of citation based topic models: models that focus on the topic-citation distribution, models that study the relationship between citing and cited documents, and models based on citation context. Introducing citation information into a topic model allows more complete topic content to be detected. Most of the models are variants of LDA and PLSA. Future directions include incorporating citation context information into topic models, improving inference methods, and applying the models.
Keywords: topic model; citation; topic detection; citation context
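
Illustrative sketch (not taken from the article): the abstract refers to models that attach a topic-citation distribution to an LDA-style generative process. The Python fragment below is a minimal sketch of that general idea, assuming each document draws topic proportions once and then samples both its words and its citations from topic-specific distributions. It does not reproduce any particular model surveyed in the paper, and every name in it (`sample_dirichlet`, `generate_document`, `topic_word`, `topic_cite`, the toy sizes and hyperparameters) is hypothetical.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_index(probs, rng):
    """Draw an index from a discrete probability vector."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_document(topic_word, topic_cite, alpha, n_words, n_cites, rng):
    """Generate one toy document: word ids and cited-document ids.

    Assumption: words and citations share the same per-document topic
    proportions theta, but use separate topic-specific distributions.
    """
    theta = sample_dirichlet(alpha, rng)
    words = []
    for _ in range(n_words):
        z = sample_index(theta, rng)            # topic for this word
        words.append(sample_index(topic_word[z], rng))
    cites = []
    for _ in range(n_cites):
        z = sample_index(theta, rng)            # topic for this citation
        cites.append(sample_index(topic_cite[z], rng))
    return theta, words, cites


# Toy sizes, chosen only for illustration.
rng = random.Random(0)
n_topics, vocab_size, n_cited_docs = 3, 8, 5
topic_word = [sample_dirichlet([0.5] * vocab_size, rng) for _ in range(n_topics)]
topic_cite = [sample_dirichlet([0.5] * n_cited_docs, rng) for _ in range(n_topics)]
theta, words, cites = generate_document(
    topic_word, topic_cite, [1.0] * n_topics, n_words=20, n_cites=4, rng=rng
)
```

In this sketch, `topic_cite[k]` plays the role of a topic-citation distribution: cited documents with high probability under topic `k` are candidates for the important documents of that topic. Fitting such a model to real data would additionally require a parameter estimation or inference step, which is omitted here.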