1.
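Automatic summarization is one of the main tasks of text mining. Compared with extractive summarization, abstractive summarization is conceptually closer to the way humans write summaries, and is therefore of considerable research interest. In recent years, with the development of deep learning, abstractive summarization based on deep neural network models has made remarkable progress. To give a fuller picture of the ideas behind these methods and the state of research, this paper starts from the task definition of abstractive summarization and reviews five categories of deep learning methods: models based on RNNs (recurrent neural networks), models based on CNNs (convolutional neural networks), models combining RNNs and CNNs, models incorporating attention mechanisms, and models incorporating reinforcement learning. These methods show that summary quality improves markedly when deep neural networks are trained, particularly once attention mechanisms and reinforcement learning are incorporated. For future work on abstractive summarization, beyond the continued application and refinement of deep learning methods themselves, attention should be paid to summarization grounded in discourse-level semantic understanding, summarization tailored to the characteristics of different kinds of text, and automatic evaluation of summaries. In addition, how to combine mature methods from traditional summarization research to further improve summary quality is also a valuable research direction.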
2.
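Taking title recognition for web text resources as the entry point, this work considers not only the features most studies focus on, such as text format information (e.g., font) and position information, but also the relevance between a candidate title and the main body of the web page. Using a large volume of historical data collected by a science and technology monitoring project as the basis for statistical analysis, it builds a rule-based method for fast title recognition in web text resources from the perspective of the possible sources and features of candidate titles, and reports evaluation results for the method's time efficiency and recognition accuracy.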
3.
The effectiveness of a video retrieval system largely depends on the choice of underlying text and image retrieval components. The unique properties of video collections (e.g., multiple sources, noisy features and temporal relations) suggest we examine the performance of these retrieval methods in such a multimodal environment, and identify the relative importance of the underlying retrieval components. In this paper, we review a variety of text/image retrieval approaches as well as their individual components in the context of broadcast news video. Numerous components of text/image retrieval have been discussed in detail, including retrieval models, text sources, temporal expansion methods, query expansion methods, image features, and similarity measures. For each component, we conduct a series of retrieval experiments on TRECVID video collections to identify their advantages and disadvantages. To provide a more complete coverage of video retrieval, we briefly discuss an emerging approach called concept-based video retrieval, and review strategies for combining multiple retrieval outputs.
4.
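Research on a Teaching Resource Sharing Mechanism and a Multifunctional Network Technology Platform for Multi-Campus University Libraries  Cited by: 1 (self-citations: 0, citations by others: 1)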
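This article analyzes the current state of research on the co-construction and sharing of library information resources, proposes a management model and management mechanism for the co-construction and sharing of university information resources, and examines in depth the establishment of a shared network information platform, an information retrieval system, a mobile phone text message (SMS) service system, and an automatic library visitor counting system. 8 references.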
5.
Kezban Dilek Onal Ye Zhang Ismail Sengor Altingovde Md Mustafizur Rahman Pinar Karagoz Alex Braylan Brandon Dang Heng-Lu Chang Henna Kim Quinten McNamara Aaron Angert Edward Banner Vivek Khetan Tyler McDonnell An Thanh Nguyen Dan Xu Byron C. Wallace Maarten de Rijke Matthew Lease 《Information Retrieval》2018,21(2-3):111-182
A recent “third wave” of neural network (NN) approaches now delivers state-of-the-art performance in many machine learning tasks, spanning speech recognition, computer vision, and natural language processing. Because these modern NNs often comprise multiple interconnected layers, work in this area is often referred to as deep learning. Recent years have witnessed an explosive growth of research into NN-based approaches to information retrieval (IR). A significant body of work has now been created. In this paper, we survey the current landscape of Neural IR research, paying special attention to the use of learned distributed representations of textual units. We highlight the successes of neural IR thus far, catalog obstacles to its wider adoption, and suggest potentially promising directions for future research.
6.
7.
8.
《The Reference Librarian》2013,54(27-28):177-183
Classification of periodicals can be done for either of two reasons: to place bound periodical volumes in the stacks close to monographs on the same subject or to organize the volumes in a separate periodicals area. Either reason provides a key to locating periodicals that does not depend on the patron's ability to interpret spine titles, title changes, or changing cataloging codes. If the choice is to integrate bound periodical volumes with the monographs, the same classification system must be used. For a separate bound periodicals area, many libraries have developed schemes to organize their titles on the shelves in alphabetical order by entry. Which choice is best depends on where a library chooses to shelve its periodicals and how the library staff believes patrons approach the task of finding titles.
9.
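[Purpose/Significance] To measure the structure of the factors that drive the evolution of literature database users' mental models. [Method/Process] A questionnaire survey collected 483 questionnaires on literature database users' perceptions of the factors driving the evolution of their mental models, and the data were analyzed using second-order confirmatory factor analysis. [Results/Conclusions] The study finds that the driving factors of literature database users' mental models are: interface guidance and prompts in the literature database, self-exploration, communication with classmates, literature database information service products, learning transfer from search engines, simple information retrieval tasks, complex information retrieval tasks, information retrieval courses, consulting teachers, library information retrieval training, and learning transfer from shopping websites. The importance of these factors to the evolution of users' mental models increases in the order listed. In addition, because the dimensions that make up a user's mental model are composite, each driving factor differs in its influence on the user's understanding of database content, understanding of information retrieval methods, and filtering of retrieval results. The findings can guide interface optimization for literature databases and information literacy training for users.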
10.
11.
12.
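Disruptive technology is a technology cluster with a complex internal structure. Viewed along the spatial dimension, a disruptive technology is a complex cluster comprising dominant, auxiliary, and supporting technologies, spanning multiple disciplines and fields. Against this background, using scientometric methods to evaluate disruptive technologies and to explore how science and technology evolve is challenging, and the challenge is essentially one of data retrieval. This paper explores a new machine-learning-based strategy for constructing patent datasets: it treats the patent retrieval task as a binary classification task in machine learning, similar to the idea of active-learning-based query classification in information retrieval, and proposes an improved text classification method that combines an F-measure feature maximization method with a CNN (convolutional neural network) model. Training experiments were carried out with the artificial intelligence (AI) technology domain as an example; the precision, recall, and F1 score reached 98.01%, 97.04%, and 97.89%, respectively. This shows that the proposed strategy can accurately identify AI patents and improves the precision and recall of patent retrieval, which helps in building a precise, accurate, and complete patent dataset for the AI technology domain.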
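The abstract above frames patent retrieval as binary classification and reports precision, recall, and F1. As a minimal sketch of how those three measures are conventionally computed from binary labels (the function name, the label encoding 1 = relevant patent, and the toy data are illustrative assumptions, not taken from the paper):

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, F1) for binary labels, where 1 marks a relevant patent.

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # relevant and retrieved
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # retrieved but not relevant
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # relevant but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy labels, invented for this example only.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1]
print(precision_recall_f1(y_true, y_pred))  # (0.75, 0.75, 0.75)
```

Precision penalizes irrelevant patents that enter the dataset, while recall penalizes relevant patents that are missed; F1 is their harmonic mean.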
13.
Analysis and Research on Internet Information Retrieval  Cited by: 7 (self-citations: 0, citations by others: 7)
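This paper surveys the main methods currently used for information retrieval on the Internet and the problems they face, and analyzes and compares the underlying retrieval techniques in depth. It introduces the application of new techniques such as machine learning, intelligent agents, and information filtering to information retrieval, and uses the Hopfield neural network model and algorithm for term expansion to improve how users express their search queries, thereby improving Internet information retrieval.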
14.
The ability to find tables and extract information from them is a necessary component of many information retrieval tasks. Documents often contain tables in order to communicate densely packed, multi-dimensional information. Tables do this by employing layout patterns to efficiently indicate fields and records in two-dimensional form. Their rich combination of formatting and content presents difficulties for traditional retrieval techniques. This paper describes techniques for extracting tables from text and retrieving answers from the extracted information. We compare machine learning (especially, Conditional Random Fields) and heuristic methods for table extraction. To retrieve answers, our approach creates a cell document, which contains the cell and its metadata (headers, titles) for each table cell, and the retrieval model ranks the cells of the extracted tables using a language-modeling approach. Performance is tested using government statistical Web sites and news articles, and errors are analyzed in order to improve the system.
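The "cell document" idea in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say how metadata is combined or which language model is used, so the concatenation of title, column header, and row label, the Dirichlet-smoothed query likelihood, the value of `mu`, and all function names are hypothetical choices made here.

```python
import math
from collections import Counter


def build_cell_documents(title, headers, rows):
    """Create one 'cell document' per table cell: the cell text plus its metadata.

    Assumption: metadata = table title + column header + first cell of the row.
    """
    docs = []
    for r, row in enumerate(rows):
        for c, cell in enumerate(row):
            text = f"{title} {headers[c]} {row[0]} {cell}"
            docs.append({"row": r, "col": c, "cell": cell,
                         "tokens": text.lower().split()})
    return docs


def rank_cells(query, docs, mu=10.0):
    """Rank cell documents by Dirichlet-smoothed query log-likelihood (an assumed model)."""
    collection = Counter(tok for d in docs for tok in d["tokens"])
    total = sum(collection.values())
    vocab = len(collection)
    scored = []
    for d in docs:
        tf = Counter(d["tokens"])
        length = len(d["tokens"])
        score = 0.0
        for q in query.lower().split():
            # Add-one smoothing of the collection model so unseen query terms
            # never produce log(0).
            p_coll = (collection[q] + 1) / (total + vocab + 1)
            score += math.log((tf[q] + mu * p_coll) / (length + mu))
        scored.append((score, d))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored]


# Toy table, invented for this example only.
docs = build_cell_documents(
    "Population by state",
    ["State", "2000", "2010"],
    [["Ohio", "11.35", "11.54"], ["Texas", "20.85", "25.15"]],
)
print(rank_cells("Texas 2010", docs)[0]["cell"])  # prints 25.15
```

Attaching headers and titles to every cell lets a keyword query such as "Texas 2010" match the cell that answers it even though the cell value itself contains neither word.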
15.
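Feature representation of academic literature is a key step in academic big-data services such as literature search, classification and organization, and personalized recommendation. Research has shown that graph neural networks can effectively learn feature representations of documents; however, current work concentrates on supervised methods, which place high demands on dataset size and quality and yield representations that are tightly coupled to specific tasks. This paper therefore introduces four unsupervised graph neural network methods into representation learning for academic literature, learns document representation vectors from the citation networks, co-citation networks, and bibliographic coupling networks of the Cora, CiteSeer, and DBLP (database systems and logic programming) datasets, and applies them to two downstream tasks: document classification and paper recommendation. The results show that (1) the deep mutual-information graph neural network is well suited to document classification, while the adversarially regularized variational graph autoencoder performs better on paper recommendation; and (2) on the Cora dataset, the citation network is better suited than the co-citation and bibliographic coupling networks for learning general-purpose document representation vectors.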
16.
Tang Guangqian 《现代图书情报技术》 (New Technology of Library and Information Service) 2003, (6): 50-52
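This paper analyzes the reasons for building a Web full-text retrieval system for a library's self-built databases on Microsoft Search Service, describes the indexing and retrieval mechanisms of Microsoft Search Service, and presents a concrete implementation using ASP.NET.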
17.
《Cataloging & classification quarterly》2013,51(3):37-68
In the Dewey Decimal Classification (DDC) Online Project, subject searching and browsing of DDC Schedules and Relative Index were featured in an experimental online catalog. The effectiveness of this DDC in an online catalog was tested in online retrieval experiments at four participating libraries. These experiments provided data for analyses of subject searchers' use of a library classification in the information retrieval environment of an online catalog. Recommendations were provided for the enhancement of bibliographic records, online catalogs, and online cataloging systems with a library classification. In this paper, subject searchers' use of the subject outline search capability of the experimental online catalog is described. This capability was unique to the experimental online catalog and all other online catalogs, because it referred searchers to online displays of the classification schedules based on their entry of subject terms. Failure analyses of subject outline searches demonstrated its specific strengths and weaknesses. Users' postsearch interview comments highlighted their experiences and their satisfaction with this search. Based on the failure analyses and users' interview comments, recommendations are provided for the improvement of the subject outline search in online catalogs.
18.
Investigating task performance of probabilistic topic models: an empirical study of PLSA and LDA  Cited by: 2 (self-citations: 0, citations by others: 2)
Probabilistic topic models have recently attracted much attention because of their successful applications in many text mining tasks such as retrieval, summarization, categorization, and clustering. Although many existing studies have reported promising performance of these topic models, none of the work has systematically investigated the task performance of topic models; as a result, some critical questions that may affect the performance of all applications of topic models are mostly unanswered, particularly how to choose between competing models, how multiple local maxima affect task performance, and how to set parameters in topic models. In this paper, we address these questions by conducting a systematic investigation of two representative probabilistic topic models, probabilistic latent semantic analysis (PLSA) and Latent Dirichlet Allocation (LDA), using three representative text mining tasks, including document clustering, text categorization, and ad-hoc retrieval. The analysis of our experimental results provides deeper understanding of topic models and many useful insights about how to optimize the performance of topic models for these typical tasks. The task-based evaluation framework is generalizable to other topic models in the family of either PLSA or LDA.
19.
The Health Science Library at University of Tennessee (UT), Memphis has taken advantage of a campuswide network for the purpose of providing enhanced access to library services. With a terminal or microcomputer, members of the UT Memphis community can use an electronic menu system to complete photocopy, interlibrary loan, and computer literature search request forms; leave messages or sign up for library workshops; use electronic mail to receive citations and abstracts from computer literature searches; use an electronic bulletin board to scan the library's new acquisitions lists, library hours, services, and policies; and use bibliographic retrieval software to search the library's locally mounted databases. Remote access to library services and electronic resources, which is available twenty-four hours a day, could potentially save users time and the institution money. Remote access, however, is intended to supplement, not to supplant or discourage, in-house library use.
20.
《Microprocessing and Microprogramming》1994,40(7):465-486
In multicomputer networks, adaptive routing is regarded as a promising way to improve network performance by utilizing available network bandwidth. Previous adaptive routing algorithms in wormhole-routed multicomputer networks restrict the routing of messages to prevent deadlocks, and the routing restriction results in a low degree of adaptiveness and low utilization of communication channels. In this paper, we examine the possibility of performing restriction-free, nonminimal adaptive routing in wormhole-routed networks as an approach to further improving the performance of these networks. A new flow control policy, called message cutting, is proposed, and two adaptive routing strategies are presented. Freedom from communication deadlock is achieved by the proposed flow control policy. The proposed adaptive routing strategies do not restrict routing and maximally utilize the physical and virtual channels. Simulation results show that the restriction-free adaptive routing approach is promising: it achieves the lowest latency and highest throughput, depending on the number of virtual channels per physical channel and the pattern of message traffic.