
1.

W.A. van der Meulen P.J.F.C. Janssen 《Information processing & management》1977,13(1):13-21

A comparative evaluation has been carried out on the Philips “DIRECT” and the British “INSPEC” retrieval systems. DIRECT is based on automatic indexing whereas INSPEC uses manual subject indexing. Two queries were submitted to both systems, using the same data base. The results are expressed in terms of recall and precision. Both recall and precision of INSPEC were found to be higher than those of DIRECT by 20%. It is concluded that this is mainly a result of the query formulation. The effectiveness obtained with automatic indexing of documents is equivalent to that of the manual procedure.

2.

Automatic indexing using the SLC-II system

C.I. Barnes L. Costantini S. Perschke 《Information processing & management》1978,14(2):107-119

3.

Automatic indexing using term discrimination and term precision measurements

G. Salton A. Wong C.T. Yu 《Information processing & management》1976,12(1):43-51

A variety of abstract automatic indexing models have been developed in recent times in an effort to produce indexing methods that are both effective and usable in practice. Among these are the term discrimination model and the term precision system. These two indexing systems are briefly described and experimental evidence is cited showing that a combination of both theories produces better retrieval performance than either one alone. Appropriate conclusions are reached concerning viable automatic indexing procedures usable in practice.

4.

Automatic, semantics-based indexing of natural language texts for information retrieval systems

Stephan Braun Camilla Schwind 《Information processing & management》1976,12(2):147-153

The fundamental idea of the work reported here is to extract index phrases from texts with the help of a single word concept dictionary and a thesaurus containing relations among concepts. The work is based on the fact that, within every phrase, the single words the phrase is composed of are related in a certain well defined manner, the type of relations holding between concepts depending only on the concepts themselves. Therefore relations can be stored in a semantic network. The algorithm described extracts single word concepts from texts and combines them to phrases using the semantic relations between these concepts, which are stored in the network. The results obtained show that phrase extraction from texts by this semantic method is possible and offers many advantages over other (purely syntactic or statistical) methods concerning preciseness and completeness of the meaning representation of the text. But the results show, too, that some syntactic and morphologic “filtering” should be included for effectivity reasons.

5.

Automatic indexing of online health resources for a French quality controlled gateway

Aurélie Névéol Alexandrina Rogozan Stéfan Darmoni 《Information processing & management》2006

The profusion of online resources calls for tools and methods to help Internet users find precisely what they are looking for. Quality controlled gateway CISMeF provides such services for health resources. However, the human cost of maintaining and updating the catalogue is increasingly high. This paper presents the automatic indexing system currently developed in the CISMeF team to be used as such for preliminary indexing, or after human reviewing for the final indexing. The system architecture, using the INTEX platform for MeSH term extraction, is detailed. The results of a first evaluation tend to indicate that the automatic indexing strategy is relevant, as it achieves a precision comparable to that of other existing operational systems. Moreover, the system presented in this paper retrieves keyword/qualifier pairs as opposed to single terms, therefore providing a significantly more precise indexing. Further development and tests will be carried out in order to improve the coverage of the dictionaries, and validate the efficiency of the system in the indexers’ everyday work.

6.

On automatic support to indexing a life sciences data base

N. Vleduts-Stokolov 《Information processing & management》1982,18(6):313-321

The paper describes a technique developed as automatic support to subject heading indexing at BIOSIS. The technique is based on the use of a formalized language for semantic representation of biological texts and subject headings: the language of Concept Primitives. The structure of the language is discussed as well as the structure of the Semantic Vocabulary, in which natural language words from biological texts are described by Concept Primitives. The Semantic Vocabulary is being constructed. Approximately 8,000 entries corresponding to high frequency significant words have been compiled, comprising at least three-quarters of the final number. Results of experiments checking the approach are given, and journal/subject heading and author/subject heading correlation data are analyzed to be used as a supporting technique.

7.

Automatic data exchange system

《Journal of The Franklin Institute》1962,273(2):174-175

8.

Thesaurus-based automatic book indexing

Martin Dillon 《Information processing & management》1982,18(4):167-178

This paper describes a technique for automatic book indexing. The technique requires a dictionary of terms that are to appear in the index, along with all text strings that count as instances of the term. It also requires that the text be in a form suitable for processing by a text formatter. A program searches the text for each occurrence of a term or its associated strings and creates an entry to the index when either is found. The results of the experimental application to a portion of a book text are presented, including measures of precision and recall, with precision giving the ratio of terms correctly assigned in the automatic process to the total assigned, and recall giving the ratio of correct terms automatically assigned to the total number of term assignments according to a human standard. Results indicate that the technique can be applied successfully, especially for texts that employ a technical vocabulary and where there is a premium on indexing exhaustivity.
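
The dictionary lookup and the two evaluation ratios described in this abstract can be illustrated with a minimal Python sketch. It is not the author's program: the function names `build_index` and `precision_recall`, the page texts, the term dictionary, and the human standard are all hypothetical, and whole-word, case-insensitive matching is an assumption made here for simplicity.

```python
import re
from collections import defaultdict

def build_index(pages, term_strings):
    """Map each index term to the sorted pages where it or a variant string occurs."""
    index = defaultdict(set)
    for page_no, text in pages.items():
        lowered = text.lower()
        for term, variants in term_strings.items():
            for s in [term, *variants]:
                # Assumption: whole-word, case-insensitive match counts as an instance.
                if re.search(r"\b" + re.escape(s.lower()) + r"\b", lowered):
                    index[term].add(page_no)
                    break
    return {term: sorted(found) for term, found in index.items()}

def precision_recall(auto, human):
    """Compare automatic (term, page) assignments with a human standard."""
    auto_pairs = {(t, p) for t, ps in auto.items() for p in ps}
    human_pairs = {(t, p) for t, ps in human.items() for p in ps}
    correct = auto_pairs & human_pairs
    precision = len(correct) / len(auto_pairs) if auto_pairs else 0.0
    recall = len(correct) / len(human_pairs) if human_pairs else 0.0
    return precision, recall

# Hypothetical toy data: three "pages", a small term dictionary, a human-made index.
pages = {
    1: "Automatic indexing assigns index terms to documents.",
    2: "A thesaurus links synonyms; machine indexing relies on it.",
    3: "Recall measures completeness of retrieval.",
}
term_strings = {
    "automatic indexing": ["machine indexing"],
    "thesaurus": ["thesauri"],
    "recall": [],
}
human = {"automatic indexing": [1, 2], "thesaurus": [2], "recall": [3], "retrieval": [3]}

index = build_index(pages, term_strings)
precision, recall = precision_recall(index, human)
```

In this toy case every automatic assignment agrees with the human standard, but the human indexer also assigned "retrieval", which is absent from the dictionary, so recall falls below precision. That is the kind of gap the abstract's emphasis on a complete term dictionary suggests.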

9.

Efficient immediate-access dynamic indexing

《Information processing & management》2023,60(3):103248

In a dynamic retrieval system, documents must be ingested as they arrive, and be immediately findable by queries. Our purpose in this paper is to describe an index structure and processing regime that accommodates that requirement for immediate access, seeking to make the ingestion process as streamlined as possible, while at the same time seeking to make the growing index as small as possible, and seeking to make term-based querying via the index as efficient as possible. We describe a new compression operation and a novel approach to extensible lists which together facilitate that triple goal. In particular, the structure we describe provides incremental document-level indexing using as little as two bytes per posting and only a small amount more for word-level indexing; provides fast document insertion; supports immediate and continuous queryability; provides support for fast conjunctive queries and similarity score-based ranked queries; and facilitates fast conversion of the dynamic index to a “normal” static compressed inverted index structure. Measurement of our new mechanism confirms that in-memory dynamic document-level indexes for collections into the gigabyte range can be constructed at a rate of two gigabytes/minute using a typical server architecture, that multi-term conjunctive Boolean queries can be resolved in just a few milliseconds each on average even while new documents are being concurrently ingested, and that the net memory space required for all of the required data structures amounts to an average of as little as two bytes per stored posting, less than half the space required by the best previous mechanism.

10.

Toward a theory of indexing

Harold Borko 《Information processing & management》1977,13(6):355-365

A theory of indexing helps explain the nature of indexing, the structure of the vocabulary, and the quality of the index. Indexing theories formulated by Jonker, Heilprin, Landry and Salton are described. Each formulation has a different focus. Jonker, by means of the Terminological and Connective Continua, provided a basis for understanding the relationships between the size of the vocabulary, the hierarchical organization, and the specificity by which concepts can be described. Heilprin introduced the idea of a search path which leads from query to document. He also added a third dimension to Jonker's model; the three variables are diffuseness, permutivity and hierarchical connectedness. Landry made an ambitious and well conceived attempt to build a comprehensive theory of indexing predicated upon sets of documents, sets of attributes, and sets of relationships between the two. It is expressed in theorems and by formal notation. Salton provided both a notational definition of indexing and procedures for improving the ability of index terms to discriminate between relevant and nonrelevant documents. These separate theories need to be tested experimentally and eventually combined into a unified comprehensive theory of indexing.

11.

Semi-automatic indexing and encoding

《Journal of The Franklin Institute》1960,270(1):3-26

12.

Evaluation of machine-aided indexing

Paul H. Klingbiel Catherine C. Rinker 《Information processing & management》1976,12(6):351-366

The Defense Documentation Center (DDC), a field activity of the Defense Supply Agency, implemented an automated indexing procedure in October 1973. This Machine-Aided Indexing (MAI) System [1] had been under development since 1969. The following is a report of several comparisons designed to measure the retrieval effectiveness of MAI and manual indexing procedures under normal operational conditions. Several definitions are required in order to clarify the MAI process as it pertains to these investigations. The MAI routines scan unedited text in the form of titles and abstracts. The output of these routines is called Candidate Index Terms. These word strings are matched by computer against an internal file of manually screened and cross-referenced terms called a Natural Language Data Base (NLDB). The NLDB differs from a standard thesaurus in that there is no related term category. Word strings which match the NLDB are accepted as valid MAI output. The mismatches are manually screened for suitability. Those accepted are added to the NLDB. If now, the original set of Candidate Index Terms is matched against the updated NLDB, the matched output is unedited MAI. If both the unedited matches and mismatches are further structured in accession order and sent to technical analysts for review, the output of that process is called edited MAI. The tests were designed to (a) compare unedited MAI with manual indexing, holding the indexing language and the retrieval technique constant; (b) compare edited MAI with unedited MAI, holding both the indexing and the retrieval technique constant; and (c) compare two different retrieval techniques, called simple and complex, while holding the indexing constant.
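
The match, screen, update and re-match cycle defined in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not DDC's implementation: the NLDB is modelled as a plain Python set of strings, the function names are hypothetical, and the `approve` callback stands in for the manual screening step.

```python
def match_candidates(candidates, nldb):
    """Split Candidate Index Terms into those found in the NLDB and those not found."""
    matched = [t for t in candidates if t in nldb]
    mismatched = [t for t in candidates if t not in nldb]
    return matched, mismatched

def update_nldb(nldb, mismatched, approve):
    """Add the mismatches that pass screening; `approve` models the human decision."""
    return nldb | {t for t in mismatched if approve(t)}

# Hypothetical data: candidate word strings from a title/abstract and a tiny NLDB.
candidates = ["radar antennas", "signal processing", "the results"]
nldb = {"radar antennas"}

first_pass, mismatches = match_candidates(candidates, nldb)

# Stand-in for manual screening: a reviewer accepts only "signal processing".
nldb = update_nldb(nldb, mismatches, approve=lambda t: t == "signal processing")

# Re-matching the original candidates against the updated NLDB gives unedited MAI.
unedited_mai, _ = match_candidates(candidates, nldb)
```

Edited MAI, in the abstract's terms, would add a further review of both lists by technical analysts. That step is a human judgment and is not modelled here.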

13.

Automatic ranking of information retrieval systems using data fusion

Rabia Nuray Fazli Can 《Information processing & management》2006

Measuring effectiveness of information retrieval (IR) systems is essential for research and development and for monitoring search quality in dynamic environments. In this study, we employ new methods for automatic ranking of retrieval systems. In these methods, we merge the retrieval results of multiple systems using various data fusion algorithms, use the top-ranked documents in the merged result as the “(pseudo) relevant documents,” and employ these documents to evaluate and rank the systems. Experiments using Text REtrieval Conference (TREC) data provide statistically significant strong correlations with human-based assessments of the same systems. We hypothesize that the selection of systems that would return documents different from the majority could eliminate the ordinary systems from data fusion and provide better discrimination among the documents and systems. This could improve the effectiveness of automatic ranking. Based on this intuition, we introduce a new method for the selection of systems to be used for data fusion. For this purpose, we use the bias concept that measures the deviation of a system from the norm or majority and employ the systems with higher bias in the data fusion process. This approach provides even higher correlations with the human-based results. We demonstrate that our approach outperforms the previously proposed automatic ranking methods.
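
The merge-then-evaluate idea in this abstract can be sketched in a few lines of Python. The abstract does not say which fusion algorithms or evaluation measure the authors used, so two assumptions are made here: Borda count serves as one well-known fusion method, and precision at a cutoff serves as the measure. The run data, function names, `pool_size` and `cutoff` are hypothetical, and the bias-based system selection is not modelled.

```python
from collections import defaultdict

def borda_fuse(runs):
    """Merge ranked lists: a document at rank r in a list of length n earns n - r points."""
    scores = defaultdict(float)
    for ranking in runs.values():
        n = len(ranking)
        for r, doc in enumerate(ranking):
            scores[doc] += n - r
    # Descending score; ties broken by document id so the result is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))

def rank_systems(runs, pool_size, cutoff):
    """Score each system by precision at `cutoff` against the pseudo-relevant pool."""
    pseudo_relevant = set(borda_fuse(runs)[:pool_size])
    score = {
        name: sum(doc in pseudo_relevant for doc in ranking[:cutoff]) / cutoff
        for name, ranking in runs.items()
    }
    return sorted(score, key=lambda s: (-score[s], s)), score

# Hypothetical ranked outputs of three systems for one query.
runs = {
    "sysA": ["d1", "d2", "d3", "d4"],
    "sysB": ["d2", "d5", "d1", "d3"],
    "sysC": ["d6", "d7", "d1", "d8"],
}
order, score = rank_systems(runs, pool_size=2, cutoff=2)
```

With these toy runs the fused list starts with d1 and d2, so a system that returns both early scores highest. No human relevance judgments are involved, which is the point of the approach.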

14.

Back-of-book subject indexing with APL: Automated indexing for those without computer background

John C. Pierce 《Information processing & management》1978,14(2):85-91

15.

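An automated penetration testing framework based on a data format support mechanism

Wen Guanxing Zhang Yuanchao Zhang Yuqing 《Journal of the Graduate School of the Chinese Academy of Sciences》2011,28(5)

Backtrack4 is one of the most full-featured testing platforms, but because it lacks a mechanism for exchanging and processing data, it is poorly suited to efficient testing. A corresponding data format support mechanism was designed, and a penetration testing framework (PTF) was developed on that basis. The framework automatically uses the relevant penetration testing tools for information gathering, vulnerability assessment, and report generation. Experiments in a real network environment verify that PTF completes automated penetration testing efficiently, which substantially improves the effectiveness of penetration testing with Backtrack4.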

16.

Between traditional classification and coordinate indexing

Amtabha Ghose Anand S. Dhawle 《Information processing & management》1979,15(1):27-31

An ordering system for a global information network is necessary in order to enable the user to retrieve the particular information he is looking for. Classification has been one of the methods of ordering. The principle of traditional classification has been based on the idea of partitioning the universe of knowledge in mutually exclusive classes, i.e. subjects. A particular topic is defined by narrower classification within a class following the principle of ‘genus-species’ relationship. Ranganathan's system of faceted classification has only replaced the classification of terms into subjects and sub-subjects by classification of terms into five ambiguous categories. Taube's system of coordinate indexing gives full freedom to the user to combine any number of terms of his choice. To be effective for social sciences such a system has to overcome some difficult problems of semantics. The system MANIS described here maintains the traditional classification and yet allows the user to combine terms of his choice, where the choice is restricted to the terms belonging to the system of traditional classification.

17.

On relative indexing in fuzzy retrieval systems

Ronald Rousseau 《Information processing & management》1985,21(5):415-417

18.

Concept integration of document databases using different indexing languages

Xueying Zhang 《Information processing & management》2006

An integrated information retrieval system generally contains multiple databases that are inconsistent in terms of their content and indexing. This paper proposes a rough set-based transfer (RST) model for integration of the concepts of document databases using various indexing languages, so that users can search through the multiple databases using any of the current indexing languages. The RST model aims to effectively create meaningful transfer relations between the terms of two indexing languages, provided a number of documents are indexed with them in parallel. In our experiment, the indexing concepts of two databases respectively using the Thesaurus of Social Science (IZ) and the Schlagwortnormdatei (SWD) are integrated by means of the RST model. Finally, this paper compares the results achieved with a cross-concordance method, a conditional probability based method and the RST model.

19.

Intelligent indexing and retrieval: A man-machine partnership

R.A. Wall 《Information processing & management》1980,16(2):73-90

The designation of overlapping hierarchies in thesauri, first outlined in 1973, is suggested as a key element in progress towards a successful man-machine partnership. An updating, expansion and theoretical background of the 1973 proposal are given. The use of the UDC, both as a matrix and a searching aid, is postulated but is not essential. Means of distinguishing overlapping terms from other “related terms” are suggested, in order to make possible the accurate representation of all hierarchical relationships. At its largest, the result could be a “Universal Reference Vocabulary”, maintained on-line only and used to construct profiles before searching via natural language and/or class numbers. It is suggested that a computer program package for a small model area within Social Sciences should be given priority.

20.

Relative indexing on the basis of users' profiles

Czesław Daniłowicz 《Information processing & management》1983,19(3):159-163

Principles for determining the profile of a system user have been presented. These principles are based on the analysis of co-occurrence of index terms in queries and pertinent documents. Moreover, a procedure for determining index term weights on the basis of user profiles has been introduced. The information value of the index term weights depends on the degree of homogeneity of the system users.