Similar literature
1.
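Literature Knowledge Mining Based on Probabilistic Topic Models   (total citations: 1; self-citations: 0; citations by others: 1)
Knowledge mining over massive collections of scientific and technical literature can uncover a large amount of valuable latent knowledge and effectively improve the usability of literature information. The authors' earlier work verified the feasibility of using the LDA topic model for literature knowledge mining. This paper proposes a new probabilistic topic model, the Topic-Author model, which jointly models the text and author information of documents so that, while analyzing document topics, it also discovers the distribution of researchers across related topic areas. Based on the Topic-Author model, a multi-dimensional literature knowledge mining method is proposed, covering topic mining, expert finding, document annotation, important-document mining, document similarity analysis, research trend analysis, and topic relation mining. Experiments were carried out on a dataset of educational technology literature.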

2.
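Taking web documents in the nanotechnology subject area as the research object, this study uses the AltaVista search engine to retrieve the quantity distribution of nanotechnology-related web documents, and then fits the data using time series analysis from statistics and the SPSS software. It obtains the regression equation of the growth curve of web documents in the nanotechnology subject area and verifies the applicability of Price's exponential law of scientific literature growth to web document growth in this field.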

3.
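Wang Ping. 《图书情报工作》 (Library and Information Service), 2014, 58(22): 70-77
Automatically mining the topics of scientific and technical literature and identifying topic changes plays an important role in helping researchers keep up with the latest developments in their fields. In view of the diversity and strongly dynamic nature of topics in scientific literature, this paper analyzes concrete methods for topic discovery and evolution. It builds on the hierarchical probabilistic topic model hLDA, uses Gibbs sampling to estimate the model parameters, and filters topic words with a mutual information method to extract high-quality topic words. Finally, it uses pre-discretized and post-discretized analysis methods to study how topics evolve over time. Experimental results verify the feasibility and effectiveness of the topic discovery and evolution methods.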

4.
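Identifying research content shared by different disciplines is one approach to interdisciplinary knowledge discovery. Research content with similar semantics across disciplines better reflects the fusion and exchange of knowledge between them. To address the problem of obtaining semantically similar interdisciplinary research topics from scientific literature data, this paper proposes a representation learning method, based on unsupervised contrastive learning, for the semantic similarity relations between scientific documents and keywords, and builds a model for identifying semantically similar interdisciplinary topics. The model uses the Spearman correlation coefficient as the indicator for evaluating interdisciplinary topics, which addresses the lack of interdisciplinary research datasets in existing studies. The results show that the model captures the semantic similarity relations between scientific documents and their keywords well and reflects the state of intersection between two disciplines well.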

5.
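[Purpose/Significance] To grasp the development patterns and evolution trends of topics in science and technology fields comprehensively, objectively, efficiently, and intuitively, this paper proposes a framework for identifying and analyzing domain topic evolution paths based on multi-source data. [Method/Process] Scientific literature data are obtained from different sources; an ordered clustering method for multi-dimensional samples is used to assist time slicing; an improved bag-of-words construction method is used to improve topic identification by the LDA model; and the Louvain community detection algorithm is used to fuse the multi-source data at the topic level, after which the domain topic evolution paths are analyzed. [Result/Conclusion] An empirical study uses data from three sources (funded projects, papers, and patents) in the field of terahertz research in the United States. The results show that the three data sources can be clearly divided into four time windows, that the improved bag-of-words construction method represents the domain information more accurately, and that topic communities help clarify the lines of topic evolution within the complex evolution network of multi-source data.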

6.
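Based on an analysis of how the temporal characteristics of word usage in the literature tend to differ across research stages, this paper proposes a topic-model-based method for identifying stages of research development. It focuses on how the method is constructed, including the steps of temporal feature extraction, development stage delimitation, and analysis of changes in topic popularity (hot and cold topics). To verify the method's effectiveness, the word frequency statistics method and the topic model method are compared in terms of their performance in topic evolution analysis. The results show that the method can identify hot topics and development trends while effectively distinguishing the research development stages reflected by different topics.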

7.
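Analyzing the positions and relationships of nodes in a network is an important part of network analysis and also offers a useful reference for scientific evaluation problems. From the perspective of evaluation, authors and documents positively influence each other. The paper therefore constructs a heterogeneous bipartite network evaluation model composed of authors and documents and, applying the ideas of the PageRank and HITS algorithms, establishes a collaborative evaluation of authors and documents. Collaborative evaluation based on the hybrid network model combines the structural features of the collaboration network and the citation network and can provide more balanced metrics. Using the field of library and information science as a sample, the paper analyzes the model's parameter characteristics and convergence, and demonstrates the algorithm's effectiveness through comparative analysis.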

8.
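Liu Tao. 《采.写.编》, 2022, (3): 114-115
Using CNKI (中国知网) as the literature source, this paper reviews domestic studies that use ethnographic methods and subjects that literature to bibliometric and thematic analysis. A large number of research results applying ethnographic methods in Chinese communication studies appeared between 2010 and 2016; besides ethnographic methods, researchers mostly use a combination of research methods. In addition, "online ethnography" has become a hot topic in recent years. However, the application of ethnographic methods in Chinese communication studies also exhibits fieldwork...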

9.
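[Purpose/Significance] Probabilistic topic model algorithms are continually being improved and extended. This paper studies existing topic models, in China and abroad, that are built using citations, analyzes and compares the generative processes and algorithms of the different models, and discusses the applications of citation-based topic models in scientific text analysis as well as directions for extension. [Method/Process] Relevant literature on building topic models with citations was retrieved from the Web of Science and CNKI databases; representative papers were selected after manual reading, and the citation-based topic models in these papers were compared and analyzed in terms of modeling idea, generative process, and parameter estimation and inference algorithms. [Result/Conclusion] Existing citation-based topic models mainly comprise topic models of research topics and citation distributions, topic models of the relations between cited and citing topics, and citation topic models based on citation content. Once citation information is introduced, a topic model can obtain more complete topic content and the important documents under a given topic, and can identify the relations and influence between the topics of citing and cited documents. Most existing models are extensions of the Probabilistic Latent Semantic Analysis (PLSA) and Latent Dirichlet Allocation (LDA) topic models. Future work could extend to topic models that incorporate citation content, performance optimization and evaluation methods for the models, and applied research on the models.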

10.
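[Purpose/Significance] This paper reviews current research in information science that involves time series analysis and summarizes common problems, providing a reference for the model-oriented and prediction-oriented development of information science research. [Research Design/Methods] From the perspectives of task, process, and problem, it summarizes and analyzes the application task scenarios, research processes, and existing problems of time series analysis in information science research. [Conclusions/Findings] From the task perspective, existing research has been applied well in task scenarios including disciplinary topic evolution, academic influence evaluation, online public opinion analysis, and technology trend analysis, with applications mainly covering historical evolution and future prediction. From the process perspective, existing research mainly proceeds in the order of selecting observation data for the time series, choosing a time slicing scheme, mining shape patterns, and prediction and evaluation. From the problem perspective, future research should pay more attention to applying time series models to short series data and strengthen the evaluation of time series analysis results. [Originality/Value] Through a comprehensive review, the paper systematically summarizes current research on time series analysis in information science, giving researchers in the field a comprehensive overview and reference.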

11.
We propose a method to identify the journals or proceedings that are most highly esteemed by a research group over some time frame. Using open publication databases, we identify the experts in the community, analyse their publication patterns, and then use these as a guideline for evaluating the scientific output of other groups of researchers publishing in the same domain. To illustrate the practicality of our method, we analyse the scientific output of Korean researchers in the security subject domain from 2004 to 2009 and compare this group’s output with that of well-known researchers. Our empirical analysis demonstrates that there is a persistent gap between these two research groups’ publication impact over this period, although the absolute number of journal publications has greatly increased over recent years.

12.
13.
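[Purpose/Significance] Because curricula vitae (CVs) of researchers provide rich and unique information not available from other data sources, information science researchers are increasingly aware of their importance and have used CV data to carry out fruitful research in many areas. Research CVs are gradually coming to be seen as an important data source for intelligence analysis and research. [Method/Process] Using a two-dimensional analysis along macro/micro and cross-sectional/longitudinal axes, the paper surveys and organizes academic literature of the past 20 years that uses research CV data as its main data source. [Result/Conclusion] It proposes that CV data can be applied to information research in at least four areas: researchers' career development; talent mobility and research collaboration; characterization of research groups; and evaluation of research projects, policies, and institutions. The large volume of structured CV data brought by expert databases, together with data chains formed with CV data as the hub, will drive information science research toward greater depth.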

14.
Academic librarians with teaching responsibility have traditionally delivered training in discovering and organising information. However, in recent years, there has been an increased emphasis on supporting researchers through all stages of the research lifecycle. While librarians are ideally placed to provide training in writing for publication and presentation of research, very few in the United Kingdom appear to be doing so. However, there are clear benefits to teaching these subjects. Based on feedback from faculty on user needs, the University of Cambridge Medical Library’s training programme was expanded to include training and support in the publication and presentation of research outputs. This article recounts the process by which the new courses were developed, and the techniques used by the library’s teaching staff to gain understanding of conventions and requirements of forms of written communication with which they were unfamiliar. It also evaluates the impact of the new courses, discusses next steps and provides advice for other librarians wishing to develop similar courses. D.I.

15.
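[Purpose/Significance] The open science environment places higher demands on the transparency, rigor, and openness of scientific research. Against this background, registered reports, an emerging publication type intended to meet these demands effectively, came into being and have further promoted the development and evolution of open science. This paper aims to give a comprehensive introduction to the background and development history of registered reports, and to examine their main functions and core value in advancing open science. [Method/Process] Combining web-based survey and content analysis, it systematically reviews the basic profile of this emerging publication type in terms of its background, current state of development, content structure, publication process, main functions, and core value. [Result/Conclusion] The salient features of registered reports are: (1) in terms of content, a registered report contains both the final research results and a detailed research plan; (2) in terms of publication process, most registered reports are published in two stages. This emerging publication type helps reduce publication bias, and standardizes and reforms the traditional publication process, thereby improving the transparency, reliability, and rigor of research results.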

16.
This paper analyses the publication patterns of researchers in the field of applied sciences at Universities of Technology in South Africa. Aspects investigated include publications in SCOPUS-listed journals; number of citations and countries of publication. Collaborative research patterns at national and international levels were also investigated. A bibliometric analysis approach was followed using SCOPUS as the main source of data and analysing the articles published in selected applied science disciplines. Results show that researchers in the field of applied sciences in universities of technology have increased their number of publications over the past 10 years and are also working in conjunction with other researchers both nationally and internationally. The analysis is an important addition to the field in South Africa which helps in measuring how institutions are positively responding to government incentives in research. The results are also important to information professionals who are increasingly playing an important role in research impact assessments.

17.
In recent times, there has been a proliferation of questionable practices in research publishing, for example, via predatory journals, hijacked journals, plagiarism, tortured phrases and paper mills. This paper intends to analyse whether journals that had been removed from the Directory of Open Access Journals (DOAJ) in 2018 due to suspected misconduct were cited within journals indexed in the Scopus database. Our analysis showed that Scopus contained over 15 thousand references to the removed journals identified. The majority of the publications citing these journals came from the area of Engineering. It is important to note that although we cannot assume that all the journals removed followed unethical practices, it is still essential that researchers are aware of the issues around citing journals that have been suspected of misconduct. We suggest that research libraries play a crucial role in training, advising and providing information to researchers about these ethical issues of publication malpractice and misconduct.

18.
It is well known that a number of research outcomes are not reported (the so‐called ‘file drawer problem’). It is generally assumed that what is not reported are ‘negative results’. Our study approaches the issue from a new angle by exploring what researchers perceive to be ‘unpublishable’. A survey regarding ‘unpublishables’ was sent out to 2,535 faculty members at Indiana University. Forty of these individuals consented to in‐depth interviews, which more fully explored these academics' views on the issue of unpublishable work. Our results indicate that there are several types of research besides negative results that are perceived to be unpublishable yet worthy of publication. Moreover, there is a great diversity within and across disciplines as to what constitutes ‘unpublishable’ research. Respondents indicated that academic discourse would benefit from the formal dissemination of papers that included inconclusive or null results, as well as replication and refutation studies. The results of our study suggest that there is a perceived gap in scholarly communication, which is to the detriment of science. These results can be used by administrators, educators, and publishers in order to refine scholarly communication practices so as to create a more robust, accurate literature and to inform future generations of researchers.

19.
This paper presents an analysis of several dimensions of scientific performance across all research disciplines measured by seven essentially different indicators that quantify productivity (absolute and fractional), collaboration (general, per publication, and international), independence from (co)advisors, and citations. The study population consists of all researchers who have obtained a Ph.D. degree in Slovenia since the country's independence in 1991. We assign researchers to 234 disciplines based on their Ph.D. thesis’ UDC classification; for each researcher, only bibliographic data for the first 10 years of their careers were used in order to avoid inconsistencies due to different career stages. While our findings show that there are notable differences between disciplines for all indicators, we also find that the trends for individual indicators are similar for the vast majority of disciplines; specifically, we observe that the fractional productivity and independence from (co)advisors of researchers are decreasing in all disciplines throughout the observed period, whereas collaboration (general, per publication, and international), and the number of citations are increasing. Moreover, our research results expose two disciplines in terms of UDC classification (mathematics and natural sciences (UDC 5), and applied sciences, medicine, technology (UDC 6)), which stand out in terms of the analyzed indicators.

20.
Journal of Informetrics, 2019, 13(2): 540-554
Collaboration among researchers is becoming increasingly common, which raises a large number of scientometrics questions for which there is not a clear and generally accepted answer. For instance, what value should be given to a two-author or three-author publication with respect to a single-author publication? This paper uses axiomatic analysis and proposes a practical method to compute the expected value of an n-authors publication that takes into consideration the added value induced by collaboration in contexts in which there is no prior or ex-ante information about the publication's potential merits or scientific impact. The only information required is the number of authors. We compared the obtained theoretical values with the empirical values based on a large dataset from the Web of Science database. We found that the theoretical values are very close to the empirical values for some disciplines, but not for all. This observation provides support in favor of the method proposed in this paper. We expect that our findings can help researchers and decision-makers to choose more effective and fair counting methods that take into account the benefits of collaboration.
