Similar literature
Found 20 similar documents (search time: 15 ms)
1.
2.
Technology transfer, research and development and engineering projects frequently require in-depth literature reviews. These reviews are carried out using computerized bibliographic databases. The review and/or searching process involves keywords selected from database thesauri. The search strategy is formulated to provide both breadth and depth of coverage and yields both relevant and nonrelevant citations. Experience indicates that about 10–20% of the citations are relevant. As a consequence, significant amounts of time are required to eliminate the nonrelevant citations. This paper describes statistically based lexical association methods that can be employed to determine citation relevance. In particular, the searcher selects relevant terms from citation-derived indexes, and this information, along with lexical statistics, is used to determine citation relevance. Preliminary results are encouraging, with the techniques providing an effective concentration of relevant citations.

3.
4.
A new dictionary-based text categorization approach is proposed to classify chemical web pages efficiently. Using a chemistry dictionary, the approach can extract chemistry-related information from web pages more precisely. After automatically segmenting the documents to find dictionary terms for document expansion, the approach adopts latent semantic indexing (LSI) to produce the final document vectors, and the relevant categories are then assigned to the test document using the k-NN text categorization algorithm. The effects of the characteristics of the chemistry dictionary and the test collection on categorization efficiency are discussed in this paper, and a new voting method based on the collection characteristics is introduced to improve categorization performance further. The experimental results show that the proposed approach outperforms the traditional categorization method and is applicable to the classification of chemical web pages.
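
The final k-NN step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it skips the dictionary-based expansion and the LSI projection, uses raw term counts with cosine similarity, and applies a plain similarity-weighted vote rather than the paper's new voting method. The names `cosine`, `knn_classify`, and `TRAINING` are hypothetical, and the training texts are invented for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors (dict-like)."""
    dot = sum(a[t] * b.get(t, 0) for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_classify(test_doc, training, k=3):
    """Return the label with the largest summed similarity among the k nearest docs."""
    test_vec = Counter(test_doc.lower().split())
    scored = sorted(
        ((cosine(test_vec, Counter(text.lower().split())), label)
         for text, label in training),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter()
    for sim, label in scored[:k]:
        votes[label] += sim  # similarity-weighted vote (an assumption)
    return votes.most_common(1)[0][0]


# Hypothetical toy training set, for illustration only.
TRAINING = [
    ("organic synthesis of benzene derivatives", "organic"),
    ("benzene ring substitution reactions", "organic"),
    ("crystal lattice of metal oxides", "inorganic"),
    ("metal oxides and salt crystal structure", "inorganic"),
]
```

In a fuller pipeline the `Counter` vectors would be replaced by the LSI document vectors the abstract describes; the neighbor search and vote would stay the same.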

5.
Passage retrieval (already operational for lawyers) has advantages in output form over reference retrieval and is economically feasible. Previous experiments in passage retrieval for scientists have demonstrated recall and false retrieval rates as good as or better than those of present reference retrieval services. The present experiment involved a greater variety of forms of retrieval question. In addition, search words were selected independently by two different people for each retrieval question. The search words selected, in combination with the computer procedures used for passage retrieval, produced average recall ratios of 72 and 67%, respectively, for the two selectors. The false retrieval rates were (except for one predictably difficult question) respectively 13 and 10 falsely retrieved sentences per answer-paper retrieved.

6.
This work addresses the information retrieval problem of auto-indexing Arabic documents. Auto-indexing a text document refers to automatically extracting words that are suitable for building an index for the document. In this paper, we propose an auto-indexing method for Arabic text documents. This method is mainly based on morphological analysis and on a technique for assigning weights to words. The morphological analysis uses a number of grammatical rules to extract stem words that become candidate index words. The weight assignment technique computes weights for these words relative to the containing document. The weight is based on how widely the word is spread across the document and not only on its rate of occurrence. The candidate index words are then sorted in descending order by weight so that information retrievers can select the more important index words. We empirically verify the usefulness of our method using several examples. For these examples, we obtained an average recall of 46% and an average precision of 64%.
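
The abstract does not give the weighting formula, so the following is only a sketch of one way a weight could combine rate of occurrence with spread. It assumes the stems have already been extracted, splits the document into a fixed number of equal segments, and multiplies relative frequency by the fraction of segments in which the stem appears. The function names, the segment count, and the formula itself are assumptions, not the authors' method.

```python
from collections import Counter


def spread_weights(words, num_segments=4):
    """Weight = relative frequency * fraction of segments containing the word."""
    if not words:
        return {}
    seg_len = -(-len(words) // num_segments)  # ceiling division
    segments = [words[i:i + seg_len] for i in range(0, len(words), seg_len)]
    freq = Counter(words)
    weights = {}
    for word, count in freq.items():
        # Spread: share of document segments in which the word occurs.
        spread = sum(1 for seg in segments if word in seg) / len(segments)
        weights[word] = (count / len(words)) * spread
    return weights


def ranked_index_words(words, num_segments=4):
    """Candidate index words in descending order of weight (ties broken by word)."""
    weights = spread_weights(words, num_segments)
    return sorted(weights, key=lambda w: (-weights[w], w))
```

With this formula, two stems that occur equally often are ranked differently if one is concentrated in a single segment and the other appears throughout the document, which is the behavior the abstract describes.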

7.
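Research on text-based ontology learning methods   Cited by: 3 (self-citations: 1, citations by others: 3)
Liang Jian, Wang Huilin. 《情报理论与实践》, 2007, 30(1): 112-115, 17
This paper reviews the main current methods of text-based ontology learning and, starting from seed concepts, designs a text-based ontology learning method. It also analyzes the key techniques of text-based ontology learning, including term acquisition, concept classification, and relation acquisition. Experiments show that, with the help of seed concepts, concepts can be extracted from plain text and classified, providing a basis for ontology development.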

8.
9.
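On enterprise technology integration innovation and reverse-engineering technology innovation   Cited by: 3 (self-citations: 0, citations by others: 3)
Zhu Hao, Zhang Rongzhong. 《科学学研究》, 2003, 21(Z1): 275-278
The times call for technology integration innovation that meets the needs of society, the state, and users; this is a perennial theme of enterprise technological innovation. From the perspective of the modern meaning and scope of a product, this paper constructs a product technology integration innovation system composed of design technology, materials technology, manufacturing technology, management technology, and supporting technology. It also discusses the methods, principles, content, and procedures of reverse-engineering technology innovation, offering technical and methodological guidance for enterprise technological innovation.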

10.
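Evaluation and incentives for scientific and technical talent in high-tech enterprises   Cited by: 10 (self-citations: 0, citations by others: 10)
Addressing major problems in the evaluation of scientific and technical talent in China, this paper precisely defines "scientific and technical talent" and "science and technology enterprise", and sets up a fairly complete system of evaluation and assessment indicators covering five aspects: intellectual quality, physical and mental quality, ability, performance, and ideological and moral quality. Fuzzy comprehensive evaluation from fuzzy mathematics is applied to build the evaluation model, which computes a reasonably accurate fuzzy comprehensive evaluation value of a person's overall quality, yielding a fairer evaluation result and laying a foundation for compensation incentives.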
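
As a rough illustration of fuzzy comprehensive evaluation in general, and not of the specific model in this paper, the sketch below composes a factor weight vector with a membership matrix using the weighted-average operator and then collapses the resulting grade vector into a single score. The function names, the choice of operator, and any weights or grade scores passed in are assumptions.

```python
def fuzzy_evaluate(weights, membership):
    """Weighted-average composition B = W . R.

    weights:    one weight per evaluation factor, summing to 1.
    membership: one row per factor; each row gives the degree of membership
                of the person in every grade (e.g. excellent ... poor).
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("factor weights must sum to 1")
    if len(weights) != len(membership):
        raise ValueError("one membership row is needed per factor")
    num_grades = len(membership[0])
    return [
        sum(w * row[j] for w, row in zip(weights, membership))
        for j in range(num_grades)
    ]


def overall_score(grade_vector, grade_scores):
    """Collapse the grade vector into one number using per-grade scores."""
    total = sum(grade_vector)
    if total == 0:
        raise ValueError("grade vector is all zeros")
    return sum(b * s for b, s in zip(grade_vector, grade_scores)) / total
```

For the five aspects named in the abstract, `weights` would have five entries and `membership` five rows; the actual weights and grades used in the paper are not given in this listing.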

11.
Task-based evaluation of text summarization using Relevance Prediction   Cited by: 2 (self-citations: 0, citations by others: 2)
This article introduces a new task-based evaluation measure called Relevance Prediction that is a more intuitive measure of an individual’s performance on a real-world task than interannotator agreement. Relevance Prediction parallels what a user does in the real-world task of browsing a set of documents using standard search tools, i.e., the user judges relevance based on a short summary and then that same user (not an independent user) decides whether to open (and judge) the corresponding document. This measure is shown to be a more reliable measure of task performance than LDC Agreement, a current gold-standard based measure used in the summarization evaluation community. Our goal is to provide a stable framework within which developers of new automatic measures may make stronger statistical statements about the effectiveness of their measures in predicting summary usefulness. We demonstrate, as a proof-of-concept methodology for automatic metric developers, that a current automatic evaluation measure has a better correlation with Relevance Prediction than with LDC Agreement and that the significance level for detected differences is higher for the former than for the latter.

12.
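With the development of the Internet and the spread of cn domain names, personal websites have grown rapidly. A guestbook is one of the important features of a personal website. This paper uses JSP to design a text-based guestbook that meets the needs of a personal website while saving the cost of a database.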

13.
Associative classification methods have recently been applied to various categorization tasks due to their simplicity and high accuracy. To improve the coverage for test documents and to raise classification accuracy, some associative classifiers generate a huge number of association rules during the mining step. We present two algorithms to increase the computational efficiency of associative classification: one to store rules very efficiently, and the other to increase the speed of rule matching, using all of the generated rules. Empirical results using three large-scale text collections demonstrate that the proposed algorithms increase the feasibility of applying associative classification to large-scale problems.
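
The abstract does not describe the two algorithms, so the sketch below only illustrates the general idea with a common stand-in technique: a prefix tree over sorted rule antecedents, so that shared prefixes are stored once and matching a document visits only the branches whose terms it contains. The class `RuleTrie`, the function `classify`, and the confidence-sum vote are hypothetical choices, not the paper's algorithms.

```python
class RuleTrie:
    """Prefix tree over sorted rule antecedents; shared prefixes are stored once."""

    def __init__(self):
        self.children = {}
        self.rules = []  # (label, confidence) pairs whose antecedent ends here

    def insert(self, antecedent, label, confidence):
        """Store the rule 'antecedent -> label' with the given confidence."""
        node = self
        for term in sorted(set(antecedent)):
            node = node.children.setdefault(term, RuleTrie())
        node.rules.append((label, confidence))

    def match(self, doc_terms):
        """Return every (label, confidence) whose antecedent is a subset of doc_terms."""
        found = []
        self._walk(sorted(set(doc_terms)), 0, found)
        return found

    def _walk(self, terms, start, found):
        found.extend(self.rules)
        # Only follow branches for terms that occur in the document.
        for i in range(start, len(terms)):
            child = self.children.get(terms[i])
            if child is not None:
                child._walk(terms, i + 1, found)


def classify(trie, doc_terms, default=None):
    """Pick the label with the largest summed confidence over all matching rules."""
    scores = {}
    for label, conf in trie.match(doc_terms):
        scores[label] = scores.get(label, 0.0) + conf
    if not scores:
        return default
    return max(sorted(scores), key=scores.get)
```

Because antecedents and document terms are both sorted, each matching rule is reached along exactly one path, so no rule is counted twice.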

14.
Query response times within a fraction of a second in Web search engines are feasible due to the use of indexing and caching techniques, which are devised for large text collections partitioned and replicated into a set of distributed-memory processors. This paper proposes an alternative query processing method for this setting, which is based on a combination of self-indexed compressed text and posting lists caching. We show that a text self-index (i.e., an index that compresses the text and is able to extract arbitrary parts of it) can be competitive with an inverted index if we consider the whole query process, which includes index decompression, ranking and snippet extraction time. The advantage is that within the space of the compressed document collection, one can carry out the posting lists generation, document ranking and snippet extraction. This significantly reduces the total number of processors involved in the solution of queries. Alternatively, for the same amount of hardware, the performance of the proposed strategy is better than that of the classical approach based on treating inverted indexes and corresponding documents as two separate entities in terms of processors and memory space.

15.
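Technology alliances are a main route to enterprise technological innovation and breakthroughs, but opportunistic behavior seriously undermines their stable development. From the new theoretical perspective of regulatory focus, this paper studies how opportunism in technology alliances is defined according to social cognition principles and how tolerant alliance members with different focus types are of opportunism. It concludes that promotion-focused members tolerate opportunistic behavior more than prevention-focused members do. Taking industry-university-research technology alliances as an example, the industry side is promotion-focused and the university and research side is prevention-focused; measures such as benefit sharing and third-party involvement can reconcile the two focus types, thereby strengthening alliance stability and promoting efficient operation.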

16.
The rise in the amount of textual resources available on the Internet has created the need for tools of automatic document summarization. The main challenges of query-oriented extractive summarization are (1) to identify the topics of the documents and (2) to recover query-relevant sentences of the documents that together cover these topics. Existing graph- or hypergraph-based summarizers use graph-based ranking algorithms to produce individual scores of relevance for the sentences. Hence, these systems fail to measure the topics jointly covered by the sentences forming the summary, which tends to produce redundant summaries. To address the issue of selecting non-redundant sentences jointly covering the main query-relevant topics of a corpus, we propose a new method using the powerful theory of hypergraph transversals. First, we introduce a new topic model based on the semantic clustering of terms in order to discover the topics present in a corpus. Second, these topics are modeled as the hyperedges of a hypergraph in which the nodes are the sentences. A summary is then produced by generating a transversal of nodes in the hypergraph. Algorithms based on the theory of submodular functions are proposed to generate the transversals and to build the summaries. The proposed summarizer outperforms existing graph- or hypergraph-based summarizers by at least 6% of ROUGE-SU4 F-measure on the DUC 2007 dataset. It is moreover cheaper than existing hypergraph-based summarizers in terms of computational time complexity.
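
To make the idea of a transversal concrete, here is a minimal sketch under simplifying assumptions: topics are given as sets of sentence indices (the hyperedges), and a plain greedy heuristic repeatedly picks the sentence that hits the most topics not yet covered. This is the generic greedy set-cover heuristic, which suits coverage functions because they are monotone and submodular. It is not the authors' algorithm, and it ignores query relevance, topic weights, and summary length in words. The name `greedy_transversal` and the `max_sentences` budget are hypothetical.

```python
def greedy_transversal(topics, max_sentences=None):
    """Greedily pick sentences until every topic hyperedge contains a picked one.

    topics: dict mapping a topic name to the set of sentence indices in it.
    Returns the chosen sentence indices in selection order.
    """
    # Empty hyperedges can never be hit, so they are dropped up front.
    uncovered = {name: set(members) for name, members in topics.items() if members}
    chosen = []
    while uncovered and (max_sentences is None or len(chosen) < max_sentences):
        # Count how many still-uncovered topics each sentence would hit.
        counts = {}
        for members in uncovered.values():
            for sentence in members:
                counts[sentence] = counts.get(sentence, 0) + 1
        # Largest marginal coverage first; ties go to the smaller index.
        best = min(counts, key=lambda s: (-counts[s], s))
        chosen.append(best)
        uncovered = {n: m for n, m in uncovered.items() if best not in m}
    return chosen
```

Because each pick is scored only against topics that are still uncovered, a sentence that repeats already covered topics gains nothing, which is how joint selection avoids the redundancy that per-sentence ranking tends to produce.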

17.
18.
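Du Min, Su Jun. 《科学学研究》, 2007, 25(Z2): 164-167
Applying the theory of the social construction of technology to the analysis of standards policy, this paper integrates the different factors that influence standard setting through an in-depth analysis at the institutional, political, and technical levels. It points out that, besides valuing the guiding role of standardization principles and the safeguarding role of sound standard-setting procedures, attention should be paid to the roles of the different participants in the standard-setting process and to intellectual property issues in standards.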

19.
The use of domain-specific concepts in biomedical text summarization   Cited by: 3 (self-citations: 0, citations by others: 3)
Text summarization is a method for data reduction. The use of text summarization enables users to reduce the amount of text that must be read while still assimilating the core information. The data reduction offered by text summarization is particularly useful in the biomedical domain, where physicians must continuously find clinical trial study information to incorporate into their patient treatment efforts. Such efforts are often hampered by the high volume of publications. This paper presents two independent methods (BioChain and FreqDist) for identifying salient sentences in biomedical texts using concepts derived from domain-specific resources. Our semantic-based method (BioChain) is effective at identifying thematic sentences, while our frequency-distribution method (FreqDist) removes information redundancy. The two methods are then combined to form a hybrid method (ChainFreq). An evaluation of each method is performed using the ROUGE system to compare system-generated summaries against a set of manually generated summaries. The BioChain and FreqDist methods outperform some common summarization systems, while the ChainFreq method improves upon the base approaches. Our work shows that the best performance is achieved when the two methods are combined. The paper also presents a brief physician’s evaluation of three randomly selected papers from an evaluation corpus to show that the author’s abstract does not always reflect the entire contents of the full text.

20.