Found 20 similar documents; search took 15 ms
1.
Content-oriented XML retrieval approaches aim at a more focused retrieval strategy: Instead of retrieving whole documents, document components that are exhaustive to the information need while at the same time being as specific as possible should be retrieved. In this article, we show that the evaluation methods developed for standard retrieval must be modified in order to deal with the structure of XML documents. More precisely, the size and overlap of document components must be taken into account. For this purpose, we propose a new effectiveness metric based on the definition of a concept space defined upon the notions of exhaustiveness and specificity of a search result. We compare the results of this new metric with the results obtained with the official metric used in INEX, the evaluation initiative for content-oriented XML retrieval.
Gabriella Kazai
2.
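Aboutness is a definition of relevance, taken from a logical perspective, that accommodates both system factors and situational factors. The usual approach is to formalize a set of aboutness properties and use that set to characterize aboutness. Choosing these properties amounts to defining the language used by an IR logical framework, and information retrieval is in essence a language problem. The set of aboutness properties therefore represents the most general abstract description of the elements of an information retrieval system, that is, its essence: the aboutness relation. Aboutness theory can be used for the qualitative analysis and comparison of information retrieval systems.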
3.
This paper addresses the difficulty of retrieving precise information from large full-text repositories of magazine articles, and proposes an Extensible Markup Language (XML) vocabulary for improving retrieval rates. The hypothesis tested was as follows: magazine articles marked up with an XML vocabulary, indexed only by selected parts, give more precise search results than the same search using a full-text index. The study was exploratory, with the following characteristics: 29 magazine articles were tested for results, and 8 scholars were interviewed to define 23 search strategies and evaluate the results. The data showed that precision improved from 40.72% with full-text search to 62.84% using XML markup and searching only in specific labels. The vocabulary needs revision and further testing by the library and information science community before it can be considered valid and supported by more research results. The cultural characteristics and politics of the librarian and information manager community matter as much as technical issues if any technical proposal is to be implemented successfully and achieve interoperability.
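As a minimal sketch of the measure reported above, precision is the share of retrieved items that are relevant. The function names and the sample counts below are illustrative assumptions; only the two percentages come from the abstract.

```python
def precision(relevant_retrieved: int, retrieved: int) -> float:
    """Fraction of retrieved items that are relevant (0.0 if nothing was retrieved)."""
    if retrieved == 0:
        return 0.0
    return relevant_retrieved / retrieved


def relative_gain(before_pct: float, after_pct: float) -> float:
    """Relative improvement of one percentage over another."""
    return (after_pct - before_pct) / before_pct


# Hypothetical counts, NOT the study's raw data: 2 relevant hits out of 5 retrieved.
sample_precision = precision(2, 5)

# The two precision figures reported in the abstract.
absolute_gain_points = 62.84 - 40.72
gain = relative_gain(40.72, 62.84)
```

Reporting both the absolute gain in percentage points and the relative gain avoids ambiguity when comparing the two indexing strategies.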
4.
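XML and XML-based broadcast retrieval  Cited: 3 (self-citations: 0, by others: 3)
This paper describes the main features of XML in some detail and briefly introduces DTD and DOM. Then, taking a search across the holdings of several libraries as an example, it offers a preliminary exploration of the basic approach to broadcast retrieval using XML.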
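One way to picture broadcast retrieval is to send the same query to every library catalogue and merge the XML answers. The sketch below is an assumption for illustration only, not the paper's design: the catalogue contents, element names (`catalog`, `book`, `title`) and function names are hypothetical, and the catalogues are inline strings where a real system would fetch XML over the network.

```python
import xml.etree.ElementTree as ET

# Hypothetical holdings of two libraries, each exposed as an XML document.
CATALOGS = {
    "library_a": (
        "<catalog>"
        "<book><title>XML Basics</title></book>"
        "<book><title>Cataloguing Rules</title></book>"
        "</catalog>"
    ),
    "library_b": (
        "<catalog>"
        "<book><title>Advanced XML Retrieval</title></book>"
        "</catalog>"
    ),
}


def search_catalog(xml_text: str, keyword: str) -> list[str]:
    """Return the titles in one catalogue that contain the keyword (case-insensitive)."""
    root = ET.fromstring(xml_text)
    needle = keyword.lower()
    return [t.text for t in root.iter("title") if t.text and needle in t.text.lower()]


def broadcast_search(catalogs: dict[str, str], keyword: str) -> list[tuple[str, str]]:
    """Send the same query to every catalogue and merge the hits into one list."""
    hits = []
    for library, xml_text in catalogs.items():
        for title in search_catalog(xml_text, keyword):
            hits.append((library, title))
    return hits


results = broadcast_search(CATALOGS, "xml")
```

Because every source answers in XML, the merging step only needs one parser regardless of how each library stores its records internally.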
5.
Bill Kasdorf 《Learned Publishing》2001,14(3):223-231
XML, the Extensible Markup Language, is key to the current revolution in publishing technology. Liberating content from proprietary systems and presentational coding, XML enables content to be published efficiently in a multitude of forms – print and electronic. This article discusses XML itself – a metalanguage by which publishers can describe the particular features of their publications apart from how those features are to be rendered in specific presentations – and also surveys a number of other related technologies in the XML family for styling, transforming, and linking. The result of an unprecedented degree of collaboration among competing interests, XML is an enabling technology that greatly enriches our publishing environment.
6.
7.
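XML database technology  Cited: 4 (self-citations: 0, by others: 4)
Guo Ruihua 《现代图书情报技术》2004,20(9):61-65
XML databases emerged to meet the needs of Internet development. An XML database is a new type of database technology. Compared with traditional databases, it is suited to storing and managing semi-structured data, it can represent and port data, and it is able to integrate heterogeneous database systems. These particular strengths of XML database technology will have a major impact on the management of networked information resources.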
8.
9.
This paper investigates the impact of three approaches to XML retrieval: using Zettair, a full-text information retrieval system; using eXist, a native XML database; and using a hybrid system that takes full article answers from Zettair and uses eXist to extract elements from those articles. For the content-only topics, we undertake a preliminary analysis of the INEX 2003 relevance assessments in order to identify the types of highly relevant document components. Further analysis identifies two complementary sub-cases of relevance assessments (General and Specific) and two categories of topics (Broad and Narrow). We develop a novel retrieval module that for a content-only topic utilises the information from the resulting answer list of a native XML database and dynamically determines the preferable units of retrieval, which we call Coherent Retrieval Elements. The results of our experiments show that, when each of the three systems is evaluated against different retrieval scenarios (such as different cases of relevance assessments, different topic categories and different choices of evaluation metrics), the XML retrieval systems exhibit varying behaviour and the best performance can be reached for different values of the retrieval parameters. In the case of INEX 2003 relevance assessments for the content-only topics, our newly developed hybrid XML retrieval system is substantially more effective than either Zettair or eXist, and yields robust and very effective XML retrieval.
10.
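XML and digital libraries  Cited: 19 (self-citations: 1, by others: 18)
Sun Xiaofei 《现代图书情报技术》2000,16(4):14-15
Introduces XML, a new language for authoring web pages, together with its advantages and broad prospects. In particular, the application of XML in digital libraries heralds a new era for digital libraries.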
11.
12.
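In the early days of e-commerce, the Web was merely an electronic medium. Put simply, the Web existed to make reading easier than with ordinary documents, which was also its original design purpose. As the number of users grew, and because dedicated network platforms (such as EDI networks) were expensive and hard to build, the Web became the choice of many companies and individuals for conducting business. As Web technology matures further, e-commerce on the Web (that is, Web commerce) is bound to become the mainstream of future business activity. Viewed at a high level, Web commerce is essentially the same as traditional commerce, except that some activities change because they take electronic form. Either way, the whole business process passes through the stages of sales, payment, fulfilment, logistics, and after-sales service. Traditional transactions can take place through person-to-person contact, telephone,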
13.
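The paper first explains how a weighted XML data model is used to obtain a standard XML reference instance and an XML data instance, and introduces how DTD constraint modifiers are expressed. It then describes in detail how the similarity algorithm is implemented, focusing on two algorithms: one that finds the nodes in the XML data instance that match the standard XML reference instance, and one that computes the similarity between the reference instance and the data instance. Finally, it summarizes the related experiments and their conclusions.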
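The general idea of matching a data instance against a weighted reference instance can be sketched as follows. This is not the paper's algorithm, which the abstract does not specify. It assumes that weights are stored in a hypothetical `weight` attribute, that a node "matches" when the same element path exists in the data instance, and that similarity is the matched share of the total reference weight.

```python
import xml.etree.ElementTree as ET


def weighted_paths(elem: ET.Element, prefix: str = "") -> dict[str, float]:
    """Map each element path to its weight (hypothetical 'weight' attribute, default 1)."""
    path = f"{prefix}/{elem.tag}"
    paths = {path: float(elem.get("weight", "1"))}
    for child in elem:
        paths.update(weighted_paths(child, path))
    return paths


def similarity(reference_xml: str, data_xml: str) -> float:
    """Share of the reference instance's total weight whose paths also occur in the data."""
    reference = weighted_paths(ET.fromstring(reference_xml))
    data = weighted_paths(ET.fromstring(data_xml))
    total = sum(reference.values())
    matched = sum(w for path, w in reference.items() if path in data)
    return matched / total if total else 0.0


# Hypothetical reference instance with weights, and a data instance lacking <isbn>.
REFERENCE = (
    '<book weight="1"><title weight="3"/><author weight="2"/><isbn weight="4"/></book>'
)
DATA = "<book><title>XML</title><author>Someone</author></book>"

score = similarity(REFERENCE, DATA)
```

Here the missing `isbn` element carries 4 of the 10 weight units, so a heavily weighted omission lowers the score more than a lightly weighted one.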
14.
15.
16.
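Wang Hongxia 《中华医学图书馆杂志》2011,(9):56-57
Compares the coverage, search results, and attention received of three databases, PubMed, BIOSIS Previews, and EMBASE.com, to provide a basis for choosing English-language medical search tools sensibly when searching to define or approve a medical research topic, and to improve recall for foreign-language literature.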
17.
18.
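A study of XML-based MARC  Cited: 4 (self-citations: 1, by others: 3)
This paper analyzes the limitations of the machine-readable catalogue format MARC for use in future digital libraries and proposes an improvement. Taking Harbin Institute of Technology as an example, it attempts an XML conversion of the Chinese machine-readable catalogue format CNMARC used there, making it possible to integrate MARC bibliographic databases with non-bibliographic databases on the Internet. The study is significant for the use of existing MARC data in future digital libraries.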
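A conversion of this kind can be sketched by mapping each MARC field and subfield to an XML element. The element names (`record`, `datafield`, `subfield`) are borrowed loosely from common MARC-in-XML practice and are an assumption, not the schema used in the paper; the sample record and the function name `marc_to_xml` are hypothetical.

```python
import xml.etree.ElementTree as ET

# A record as a list of (field tag, [(subfield code, value), ...]) pairs.
Record = list[tuple[str, list[tuple[str, str]]]]


def marc_to_xml(fields: Record) -> str:
    """Serialize MARC-style fields and subfields as a small XML document."""
    record = ET.Element("record")
    for tag, subfields in fields:
        datafield = ET.SubElement(record, "datafield", tag=tag)
        for code, value in subfields:
            subfield = ET.SubElement(datafield, "subfield", code=code)
            subfield.text = value
    return ET.tostring(record, encoding="unicode")


# Hypothetical record: field 200 holding a title ($a) and a responsibility statement ($f).
SAMPLE: Record = [("200", [("a", "XML and Digital Libraries"), ("f", "A. Author")])]

xml_text = marc_to_xml(SAMPLE)
```

Once the record is XML, generic XML tools can query it alongside non-bibliographic data, which is the integration benefit the abstract describes.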
19.
Massih R. Amini Anastasios Tombros Nicolas Usunier Mounia Lalmas 《Information Retrieval》2007,10(3):233-255
Documents formatted in eXtensible Markup Language (XML) are available in collections of various document types. In this paper,
we present an approach for the summarisation of XML documents. The novelty of this approach lies in that it is based on features
not only from the content of documents, but also from their logical structure. We follow a machine learning, sentence extraction-based
summarisation technique. To find which features are more effective for producing summaries, this approach views sentence extraction
as an ordering task. We evaluated our summarisation model using the INEX and SUMMAC datasets. The results demonstrate that
the inclusion of features from the logical structure of documents increases the effectiveness of the summariser, and that
the learnable system is also effective and well-suited to the task of summarisation in the context of XML documents. Our approach
is generic, and is therefore applicable, apart from entire documents, to elements of varying granularity within the XML tree.
We view these results as a step towards the intelligent summarisation of XML documents.