Similar literature
 20 similar documents found (search time: 15 ms)
1.
Evaluating the effectiveness of content-oriented XML retrieval methods   Total citations: 1, self-citations: 0, citations by others: 1
Content-oriented XML retrieval approaches aim at a more focused retrieval strategy: instead of retrieving whole documents, they should retrieve document components that are exhaustive to the information need while being as specific as possible. In this article, we show that the evaluation methods developed for standard retrieval must be modified in order to deal with the structure of XML documents. More precisely, the size and overlap of document components must be taken into account. For this purpose, we propose a new effectiveness metric based on a concept space defined upon the notions of exhaustiveness and specificity of a search result. We compare the results of this new metric with the results obtained with the official metric used in INEX, the evaluation initiative for content-oriented XML retrieval.
Gabriella Kazai

2.
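Aboutness is a definition of relevance, formulated from a logical perspective, that covers system factors and accommodates situational factors. The usual approach is to formalize a set of aboutness properties and use that set to represent aboutness. Choosing the properties amounts to defining the language used by an IR logic system, and information retrieval is in essence a language problem; the set of aboutness properties is therefore the most general abstract description of the elements of an information retrieval system, that is, of the essence of such a system: the aboutness relation. Aboutness theory can be used for the qualitative analysis and comparison of information retrieval systems.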

3.
This paper addresses the difficulty of retrieving precise information from large full-text repositories of magazine articles and proposes an Extensible Markup Language (XML) vocabulary for improving retrieval rates. The hypothesis tested was that magazine articles marked up with an XML vocabulary and indexed only by selected parts give more precise search results than the same search against a full-text index. The study was exploratory: 29 magazine articles were tested, and 8 scholars were interviewed to define 23 search strategies and evaluate the results. The data showed that precision improved from 40.72% with full-text search to 62.84% using XML markup and searching only in specific labels. The vocabulary needs revision and further testing by the library and information science community in order to arrive at a valid vocabulary and provide more research results. The cultural characteristics and politics of the librarian and information manager community are as important as technical issues if any technical proposal is to be implemented successfully and achieve interoperability.
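
For reference, precision is the fraction of retrieved items that are relevant. The following is a minimal Python sketch, not taken from the paper: the `precision` helper is a hypothetical illustration, and only the two percentages come from the abstract. It shows how the measure is computed and how far apart the two reported figures are.

```python
def precision(relevant_retrieved: int, retrieved: int) -> float:
    """Fraction of retrieved items that are relevant (0.0 when nothing is retrieved)."""
    if retrieved == 0:
        return 0.0
    return relevant_retrieved / retrieved


# Aggregate figures reported in the abstract, in percent.
FULL_TEXT_PRECISION = 40.72
XML_MARKUP_PRECISION = 62.84

# Absolute gain in percentage points, and gain relative to the full-text baseline.
absolute_gain = XML_MARKUP_PRECISION - FULL_TEXT_PRECISION
relative_gain = absolute_gain / FULL_TEXT_PRECISION
```

On the reported figures, the gain is a little over 22 percentage points, or roughly half again as much as the full-text baseline.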

4.
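XML and XML-based broadcast retrieval   Total citations: 3, self-citations: 0, citations by others: 3
Guo Shaoyou (郭少友), 《情报学报》 (Journal of the China Society for Scientific and Technical Information), 2002, 21(5): 568-572
This paper describes the main features of XML in some detail, briefly introduces DTD and DOM, and then, taking retrieval across the holdings of several libraries as an example, offers a preliminary discussion of the basic approach to broadcast retrieval using XML.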

5.
XML, the Extensible Markup Language, is key to the current revolution in publishing technology. Liberating content from proprietary systems and presentational coding, XML enables content to be published efficiently in a multitude of forms, both print and electronic. This article discusses XML itself, a metalanguage by which publishers can describe the particular features of their publications apart from how those features are to be rendered in specific presentations, and also surveys a number of related technologies in the XML family for styling, transforming, and linking. The result of an unprecedented degree of collaboration among competing interests, XML is an enabling technology that greatly enriches our publishing environment.

6.
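Application of XML in library acquisitions   Total citations: 3, self-citations: 0, citations by others: 3
This paper discusses the possibility of using the Internet and XML metadata for electronic data interchange (EDI) between library acquisition automation systems and book trade systems, and designs and implements an application example of a book acquisition center and a library automation system based on XML documents.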

7.
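XML database technology   Total citations: 4, self-citations: 0, citations by others: 4
XML databases emerged to meet the needs of Internet development. An XML database is a new type of database technology. Compared with traditional databases, it is suited to storing and managing semi-structured data, it can represent and port data, and it can integrate heterogeneous database systems. These particular strengths will have a major impact on the management of network information resources.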

8.
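XML and its application in libraries   Total citations: 11, self-citations: 1, citations by others: 10
After briefly introducing the definition, characteristics, functions, and uses of XML, the article explores the prospects for applying XML technology in libraries.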

9.
This paper investigates the impact of three approaches to XML retrieval: using Zettair, a full-text information retrieval system; using eXist, a native XML database; and using a hybrid system that takes full article answers from Zettair and uses eXist to extract elements from those articles. For the content-only topics, we undertake a preliminary analysis of the INEX 2003 relevance assessments in order to identify the types of highly relevant document components. Further analysis identifies two complementary sub-cases of relevance assessments (General and Specific) and two categories of topics (Broad and Narrow). We develop a novel retrieval module that, for a content-only topic, utilises the information from the resulting answer list of a native XML database and dynamically determines the preferable units of retrieval, which we call Coherent Retrieval Elements. The results of our experiments show that, when each of the three systems is evaluated against different retrieval scenarios (such as different cases of relevance assessments, different topic categories and different choices of evaluation metrics), the XML retrieval systems exhibit varying behaviour and the best performance can be reached for different values of the retrieval parameters. In the case of INEX 2003 relevance assessments for the content-only topics, our newly developed hybrid XML retrieval system is substantially more effective than either Zettair or eXist, and yields robust and very effective XML retrieval.

10.
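XML and digital libraries   Total citations: 19, self-citations: 1, citations by others: 18
Introduces XML, a new language for writing web pages, along with its advantages and broad prospects. In particular, the application of XML in digital libraries heralds a new era for digital libraries.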

11.
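Implementation techniques for XML in library systems   Total citations: 2, self-citations: 3, citations by others: 2
Examines the mechanism for implementing XML in library systems, proposes an XML-based library system framework, and introduces the related fundamentals and key technologies.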

12.
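In the early days of e-commerce, the Web was merely an electronic medium; put simply, the Web existed to make reading easier than with ordinary documents, which was also the original purpose of its design. As the number of users grew, and because dedicated network platforms (such as EDI networks) were expensive and hard to set up, the Web became the choice of many companies and individuals for conducting business. As Web technology matures further, e-commerce on the Web (that is, Web commerce) is bound to become the mainstream of future business activity. Viewed at a high level, Web commerce is essentially the same as traditional commerce, except that some activities change because they take electronic form. Either way, every commercial activity passes through the stages of sales, payment, fulfillment, logistics, and after-sales service. Traditional transactions can be carried out through person-to-person contact, telephone, …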

13.
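The paper first explains how a weighted XML data model is used to obtain a standard XML reference instance and an XML data instance, and describes how DTD constraint modifiers are expressed. It then details the implementation of the similarity algorithm, focusing on the algorithm that finds, in the XML data instance, the nodes matching the standard XML reference instance, and on the algorithm that computes the similarity between the standard XML reference instance and the XML data instance. Finally, it summarizes the related experiments and their conclusions.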
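
The abstract does not give the algorithm itself, so the following Python sketch is only one plausible reading of "match nodes, then compute a weighted similarity". It assumes that nodes are matched by their root-to-node tag path and that similarity is the weighted share of reference nodes found in the data instance. The function names, the sample documents, and the weights are all hypothetical.

```python
import xml.etree.ElementTree as ET


def paths(element, prefix=""):
    """Yield the root-to-node tag path of every element in the tree."""
    path = f"{prefix}/{element.tag}"
    yield path
    for child in element:
        yield from paths(child, path)


def similarity(reference_xml, instance_xml, weights, default_weight=1.0):
    """Weighted share of reference paths that also occur in the data instance."""
    reference_paths = set(paths(ET.fromstring(reference_xml)))
    instance_paths = set(paths(ET.fromstring(instance_xml)))
    total = sum(weights.get(p, default_weight) for p in reference_paths)
    matched = sum(
        weights.get(p, default_weight) for p in reference_paths & instance_paths
    )
    return matched / total if total else 0.0


# Hypothetical reference instance, data instance, and node weights.
REFERENCE = "<book><title/><author/><isbn/></book>"
INSTANCE = "<book><title/><author/></book>"
WEIGHTS = {"/book/title": 3.0, "/book/isbn": 2.0}

score = similarity(REFERENCE, INSTANCE, WEIGHTS)
```

Here the data instance lacks the `isbn` node, so the matched weight is 5 out of a total of 7. A real implementation would also need to handle repeated elements and the DTD modifiers the abstract mentions, which this sketch ignores.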

14.
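Discusses the various concepts of the specificity of retrieval identifiers, the concept of retrieval efficiency, the relationship between the two, measures for increasing specificity at the three stages of the literature retrieval process, the question of keeping specificity at an appropriate level, and specificity in natural-language retrieval.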

15.
16.
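Compares and analyzes the coverage, search results, and attention levels of three databases, PubMed, BIOSIS Previews, and EMBASE.com, to provide a basis for choosing English-language medical retrieval tools appropriately when searching for topic selection or project approval in medical research, and to improve the recall of foreign-language literature.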

17.
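Application of XML in digital libraries   Total citations: 3, self-citations: 0, citations by others: 3
This paper introduces the origin and development of XML and the background and concept of digital libraries, and discusses the application of XML in digital libraries.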

18.
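Research on XML-based MARC   Total citations: 4, self-citations: 1, citations by others: 3
This paper analyzes the limitations of the machine-readable catalog format MARC for future digital library applications and proposes an improvement. Taking Harbin Institute of Technology as an example, it attempts an XML conversion of the Chinese machine-readable catalog format, CNMARC, used there, which makes it possible to integrate MARC bibliographic databases with non-bibliographic databases on the Internet. The research is significant for the use of existing MARC data in future digital libraries.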
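
The abstract does not show the XML format the authors chose. As a rough illustration of what a MARC-to-XML conversion involves, the Python sketch below serializes a hypothetical record into `datafield` and `subfield` elements. That element layout is borrowed from MARCXML as an assumption, and the record contents and the `marc_to_xml` helper are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: (tag, two indicator characters, [(subfield code, value), ...]).
# Tags follow UNIMARC-style usage: 200 = title and responsibility, 210 = publication.
record = [
    ("200", "1 ", [("a", "Example title"), ("f", "Example author")]),
    ("210", "  ", [("a", "Harbin"), ("d", "2002")]),
]


def marc_to_xml(fields):
    """Build a <record> element with one <datafield> per MARC field."""
    root = ET.Element("record")
    for tag, indicators, subfields in fields:
        datafield = ET.SubElement(
            root, "datafield", tag=tag, ind1=indicators[0], ind2=indicators[1]
        )
        for code, value in subfields:
            ET.SubElement(datafield, "subfield", code=code).text = value
    return root


record_xml = marc_to_xml(record)
xml_text = ET.tostring(record_xml, encoding="unicode")
```

Once the record is in this form, ordinary XML tooling can query it, for example `record_xml.find("datafield[@tag='200']/subfield[@code='a']")` to reach the title.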

19.
Documents formatted in eXtensible Markup Language (XML) are available in collections of various document types. In this paper, we present an approach for the summarisation of XML documents. The novelty of this approach lies in that it is based on features not only from the content of documents, but also from their logical structure. We follow a machine learning, sentence extraction-based summarisation technique. To find which features are more effective for producing summaries, this approach views sentence extraction as an ordering task. We evaluated our summarisation model using the INEX and SUMMAC datasets. The results demonstrate that the inclusion of features from the logical structure of documents increases the effectiveness of the summariser, and that the learnable system is also effective and well-suited to the task of summarisation in the context of XML documents. Our approach is generic, and is therefore applicable, apart from entire documents, to elements of varying granularity within the XML tree. We view these results as a step towards the intelligent summarisation of XML documents.
Mounia Lalmas

20.
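An exploration of XML information retrieval   Total citations: 4, self-citations: 0, citations by others: 4
Liao Shumei (廖述梅), Wan Changxuan (万常选), Xu Shenghua (徐升华), 《情报学报》 (Journal of the China Society for Scientific and Technical Information), 2007, 381(2): 229-234
XML documents are semi-structured data with hierarchical structure and textual content. Existing Web information retrieval is keyword-based full-text retrieval over HTML documents and cannot handle retrieval at the granularity of XML elements; XML database retrieval, meanwhile, performs exact matching and does not support ranking of results. Combining information retrieval and database techniques to study XML retrieval has therefore become inevitable. Starting from the problem domain of XML retrieval, this paper describes the state and characteristics of research on XML information retrieval (XML IR) in China and abroad, and analyzes the current hot topics and difficulties in XML IR.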
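
To make the contrast concrete, the toy Python sketch below ranks individual XML elements, rather than whole documents, by how densely they contain the query terms. It is an illustration under simple assumptions (term density as the score, a made-up sample document, hypothetical helper names), not a method described in the paper.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
DOC = """<article>
  <title>XML retrieval</title>
  <sec><p>Keyword search over HTML pages returns whole documents.</p></sec>
  <sec><p>XML retrieval ranks elements, so XML retrieval can return a paragraph.</p></sec>
</article>"""


def tokens(element):
    """Lower-cased words from the element's own text and all of its descendants."""
    text = " ".join(element.itertext()).lower()
    words = (word.strip(".,") for word in text.split())
    return [word for word in words if word]


def rank_elements(root, query):
    """Score every element by query-term density and sort best first."""
    terms = query.lower().split()
    scored = []
    for element in root.iter():
        words = tokens(element)
        hits = sum(words.count(term) for term in terms)
        if hits:
            scored.append((hits / len(words), element.tag))
    return sorted(scored, key=lambda item: item[0], reverse=True)


ranking = rank_elements(ET.fromstring(DOC), "xml retrieval")
```

Small, focused elements such as the title score higher than the enclosing article, which is the element-granularity, ranked behaviour that neither plain full-text search nor exact-match database queries provide on their own.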
