The challenge of commercial document retrieval, Part I: Major issues, and a framework based on search exhaustivity, determinacy of representation and document collection size
Abstract: With the growing focus on what is collectively known as “knowledge management”, a shift continues to take place in commercial information system development: a shift away from the well-understood data retrieval/database model to the more complex and challenging development of commercial document/information retrieval models. While document retrieval has a long and rich legacy of research, its impact on commercial applications has been modest. At the enterprise level, most large organizations have little understanding of, or commitment to, high-quality document access and management. Part of the reason is that we still lack a good framework for understanding the major factors that affect the performance of large-scale corporate document retrieval systems. The thesis of this discussion is that document retrieval, specifically access to intellectual content, is a complex process that is most strongly influenced by three factors: the size of the document collection; the type of search (exhaustive, existence or sample); and the determinacy of document representation. Collectively, these factors provide a useful framework for, or taxonomy of, document retrieval, and highlight some of the fundamental issues facing the design and development of commercial document retrieval systems. This is the first of a series of three articles. Part II (D.C. Blair, The challenge of commercial document retrieval. Part II. A strategy for document searching based on identifiable document partitions, Information Processing and Management, 2001b, this issue) will discuss the implications of this framework for search strategy, and Part III (D.C. Blair, Some thoughts on the reported results of Text REtrieval Conference (TREC), Information Processing and Management, 2002, forthcoming) will consider the importance of the TREC results for our understanding of operating information retrieval systems.
This article is indexed in ScienceDirect and other databases.