1.
Sentence-level novelty detection aims at spotting sentences with novel information in an ordered sentence list. In this task, sentences that appear later in the list and carry no new meaning are eliminated. The contributions of this paper to novelty detection are three-fold. First, conceptually, the paper reveals the computational nature of the task, currently overlooked by the novelty community: novelty as a combination of partial overlap (PO) and complete overlap (CO) relations between sentences. We define partial overlap between two sentences as a sharing of common facts, while complete overlap occurs when one sentence covers all of the meanings of the other. Second, technically, we provide a novel approach, the selected pool method, which follows naturally from the PO-CO computational structure. We provide a formal error analysis for selected pool and for methods based on this PO-CO framework, and we address the question of how accurate the PO judgments must be to outperform the baseline pool method. Third, experimentally, results are presented for all three novelty datasets currently available. The results show that selected pool is significantly better than, or no worse than, the current methods, an indication that the term overlap criterion for the PO judgments could be adequately accurate.
Shaoping Ma
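The PO and CO relations described in the abstract above can be illustrated with a minimal sketch. It assumes that term overlap stands in for shared facts, which the abstract mentions only as the criterion for PO judgments; the function names and the `po_threshold` value are hypothetical, and this is not the paper's selected pool method.

```python
def terms(sentence):
    """Lowercased whitespace tokens of a sentence, as a set (assumed tokenization)."""
    return set(sentence.lower().split())


def overlap_relation(earlier, later, po_threshold=0.3):
    """Classify how much of `later` is covered by `earlier`.

    Returns "CO" when every term of `later` occurs in `earlier`,
    "PO" when the covered fraction reaches the (hypothetical) threshold,
    and "none" otherwise.
    """
    a, b = terms(earlier), terms(later)
    if not b:
        return "CO"  # an empty sentence adds nothing new
    covered = len(a & b) / len(b)
    if covered == 1.0:
        return "CO"
    if covered >= po_threshold:
        return "PO"
    return "none"


def novel_sentences(sentences, po_threshold=0.3):
    """Keep a sentence unless an earlier kept sentence completely overlaps it."""
    kept = []
    for s in sentences:
        if any(overlap_relation(k, s, po_threshold) == "CO" for k in kept):
            continue
        kept.append(s)
    return kept
```

In this sketch only CO relations eliminate a sentence; PO is reported but not acted on, since the abstract does not say how the two judgments are combined.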
2.
Massih R. Amini Anastasios Tombros Nicolas Usunier Mounia Lalmas 《Information Retrieval》2007,10(3):233-255
Documents formatted in eXtensible Markup Language (XML) are available in collections of various document types. In this paper,
we present an approach for the summarisation of XML documents. The novelty of this approach lies in that it is based on features
not only from the content of documents, but also from their logical structure. We follow a machine learning, sentence extraction-based
summarisation technique. To find which features are more effective for producing summaries, this approach views sentence extraction
as an ordering task. We evaluated our summarisation model using the INEX and SUMMAC datasets. The results demonstrate that
the inclusion of features from the logical structure of documents increases the effectiveness of the summariser, and that
the learnable system is also effective and well-suited to the task of summarisation in the context of XML documents. Our approach
is generic, and is therefore applicable, apart from entire documents, to elements of varying granularity within the XML tree.
We view these results as a step towards the intelligent summarisation of XML documents.
Mounia Lalmas
3.
Jacob Soll 《Archival Science》2007,7(4):331-342
This article examines the archival methods developed by Colbert to train his son in state administration. Based on Colbert’s
correspondence with his son, it reveals the practices Colbert thought necessary to collect and manage information in his state
encyclopedic archive during the last half of the 17th century.
Jacob Soll
4.
Evaluation is a major driving force in advancing the state of the art in language technologies. In particular, automatic assessment of the quality of machine output is the preferred method for measuring progress, provided that the metrics have been validated against human judgments. Following recent developments in the automatic evaluation of machine translation and document summarization, we present a similar approach, implemented in a measure called POURPRE, an automatic technique for evaluating answers to complex questions based on n-gram co-occurrences between machine output and a human-generated answer key. Until now, the only way to assess the correctness of answers to such questions has involved manually determining whether an information “nugget” appears in a system's response. The lack of automatic methods for scoring system output is an impediment to progress in the field, which we address with this work. Experiments with the TREC 2003, TREC 2004, and TREC 2005 QA tracks indicate that rankings produced by our metric correlate highly with official rankings, and that POURPRE outperforms direct application of existing metrics.
Dina Demner-Fushman
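The idea of scoring a response by n-gram co-occurrence with an answer key, as in the abstract above, can be sketched as follows. This is an illustrative assumption and not the POURPRE metric itself: the function names, the whitespace tokenization, and the plain averaging over nuggets are all hypothetical choices.

```python
def ngrams(text, n=1):
    """Set of lowercased word n-grams in `text` (assumed whitespace tokenization)."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def nugget_match(nugget, response, n=1):
    """Fraction of the nugget's n-grams that also occur in the response."""
    key = ngrams(nugget, n)
    if not key:
        return 0.0
    return len(key & ngrams(response, n)) / len(key)


def answer_score(nuggets, response, n=1):
    """Average nugget match over all nuggets in the answer key."""
    if not nuggets:
        return 0.0
    return sum(nugget_match(g, response, n) for g in nuggets) / len(nuggets)
```

A score of this kind replaces the manual yes/no decision about whether a nugget appears in a response with a graded match value.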
5.
6.
A summary overview of the children’s and young adult publishing industry in China with a focus on the size of the market,
ten major publishing houses, copyright and trends. Special emphasis has been placed on specific transactions for the sale of
translation rights from German language publishers to China and minimal activities of German rights sold to Chinese publishers.
Jing Bartz
7.
Smoothing of document language models is critical in language modeling approaches to information retrieval. In this paper,
we present a novel way of smoothing document language models based on propagating term counts probabilistically in a graph
of documents. A key difference between our approach and previous approaches is that our smoothing algorithm can iteratively
propagate counts and achieve smoothing with remotely related documents. Evaluation results on several TREC data sets show that the proposed method significantly outperforms the
simple collection-based smoothing method. Compared with those other smoothing methods that also exploit local corpus structures,
our method is especially effective in improving precision in top-ranked documents through “filling in” missing query terms
in relevant documents, which is attractive since most users only pay attention to the top-ranked documents in search engine
applications.
ChengXiang Zhai
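For context on the baseline named in the abstract above, one common form of collection-based smoothing is linear interpolation between the document model and the collection model (Jelinek-Mercer smoothing). The sketch below assumes that form; the abstract does not say which collection-based method was used, and the function name and the `lam` parameter are hypothetical. It does not implement the paper's graph-based propagation.

```python
from collections import Counter


def smoothed_prob(term, doc_tokens, collection_counts, collection_len, lam=0.5):
    """Interpolate the document estimate with the collection estimate.

    p(term | doc) = (1 - lam) * count(term, doc) / |doc|
                    + lam * count(term, collection) / |collection|

    `collection_counts` is expected to be a Counter so unseen terms count as 0.
    """
    doc_counts = Counter(doc_tokens)
    p_doc = doc_counts[term] / len(doc_tokens) if doc_tokens else 0.0
    p_coll = collection_counts[term] / collection_len
    return (1 - lam) * p_doc + lam * p_coll


# Usage: build collection statistics from a toy corpus of two documents.
docs = [["a", "b"], ["b", "c"]]
counts = Counter(t for d in docs for t in d)
total = sum(counts.values())
p_missing = smoothed_prob("c", docs[0], counts, total)  # "c" is absent from docs[0]
```

The term `c` does not occur in the first document, yet it receives a non-zero probability from the collection model, which is the "filling in" effect that smoothing is meant to provide.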
8.
This paper, based on PhD research, reflects upon the market for electronic books in the general trade sectors of UK and US
publishers during the early years of the 21st century. The paper reports on interviews carried out with publishers between
2003 and 2005, and reflects upon four areas which presented and still present challenges to the uptake of e-books—negative
perceptions from consumers; formats; pricing and issues regarding digital rights. The paper concludes that the development
and uptake of electronic books has some way to go in the general trade/mass-market sectors.
Cliff McKnight
9.
Nathan Hollier 《Publishing Research Quarterly》2008,24(3):165-174
This article provides a summary of and commentary on ‘A Lovely Kind of Madness: Small and Independent Publishing in Australia’,
an unpublished report by Kate Freeth, commissioned by the Small Press Underground Networking Community (SPUNC), the representative
body for small and independent publishers in Australia, and released in November 2007. Freeth’s 14,000 word report constitutes
the most detailed and comprehensive study of Australian small and independent publishing since the second volume of Michael
Denholm’s Small Press Publishing in Australia (1991) and provides much primary material for policy makers, scholars, and people working in and around the publishing industry.
Nathan Hollier
10.
Jennifer S. Milligan 《Archival Science》2007,7(4):359-367
Curious Archives examines the creation of the museum of archives, the Musée de l’Histoire de France, at the Imperial Archives
of France under the direction of Leon de Laborde, 1858–1867. This museum was intended as a crucial tool for publicizing the
Archives and educating the public, but also represented a break from the Archives’ role as administrative storehouse both
in practice and in the popular imagination. The museum’s conception and reception reveal conflicts around the Archives’ mission
and contents, particularly regarding public interest, the potential dangers of public curiosity, and the nature of documentary
and historical knowledge in nineteenth-century France.
Jennifer S. Milligan
11.
On rank-based effectiveness measures and optimization (total citations: 1; self-citations: 0; citations by others: 1)
Many current retrieval models and scoring functions contain free parameters which need to be set—ideally, optimized. The process
of optimization normally involves some training corpus of the usual document-query-relevance judgement type, and some choice
of measure that is to be optimized. The paper proposes a way to think about the process of exploring the space of parameter
values, and how moving around in this space might be expected to affect different measures. One result, concerning local optima,
is demonstrated for a range of rank-based evaluation measures.
Hugo Zaragoza
12.
To put an end to the large copyright trade deficit, both Chinese government agencies and publishing houses have been striving
to enter the international publication market. The article analyzes the background of the going-global strategy, and sums
up the performance of both Chinese administrations and publishers.
Qing Fang (Corresponding author)
13.
Andy Weissberg 《Publishing Research Quarterly》2008,24(4):255-260
This article analyzes current industry practices toward the identification of digital book content. It highlights key technology
trends, workflow considerations and supply chain behaviors, and examines the implications of these trends and behaviors on
the production, discoverability, purchasing and consumption of digital book products.
Andy Weissberg
14.
World Book and Copyright Day was established by a resolution of the 28th General Council of UNESCO in 1995. Its avowed aim
was ‘to pay a world-wide tribute to books and authors on this date, encouraging everyone, and in particular young people,
to discover the pleasure of reading and gain a renewed respect for the irreplaceable contributions of those who have furthered
the social and cultural progress of humanity.’ This article examines the context for World Book and Copyright Day, the extent
to which cultural and commercial interests have converged in the activities of the day, and argues that an analysis of these
activities reveals a specifically European attitude to book culture.
Alexis Weedon
15.
Sandeep Chaufla 《Publishing Research Quarterly》2008,24(3):187-201
A review and analysis of the rules and regulations including the tax aspects of making an investment in India is presented.
The full range from Foreign Direct Investment to different forms of doing business with specific examples from the publishing
industry is explored to help understand current policies and regulations.
Sandeep Chaufla
16.
Filip Boudrez 《Archival Science》2007,7(2):179-193
This paper gives an overview of the archival issues that relate to digitally signed documents. First, by way of introduction,
the advanced digital signature is presented briefly. In the second part, a number of problems are discussed that present themselves
when a digital signature is used as a proof of authenticity and integrity for digital documents in general. In particular,
it investigates whether it makes sense for the archivist to digitally sign all electronic records under
his or her management. Problems relating to the (medium) long-term archiving of digitally signed documents are dealt with
in the third part. After an overview of the sticking points for long-term validation (“Archival issues”) a number of possible
solutions are discussed (“Solutions for long-term archiving”).
Filip Boudrez
17.
Beatrice S. Bartlett 《Archival Science》2007,7(4):369-390
This article describes the first half century of the Communist government’s supervision and management of the central-government
archives of the last two dynasties. Immediately with the Communist ascent to power in 1949, the new government took great
interest in assembling and protecting the country’s archival documents, readying the Ming-Qing archives for access to scholars,
and preparing for publication of selected materials. By the 1980s Beijing’s Number One Historical Archives, in charge of the
largest holding of Ming-Qing documents, had become the first Chinese authority to complete a full sorting and preliminary
catalogues for such a collection. Moreover, to facilitate searches, an attempt has recently begun to create a subject-heading
system for these and other holdings in the country. In the first half century’s final decades, foreign researchers were admitted
for the first time and tours and international exchanges began to take place.
Beatrice S. Bartlett
18.
Panagiotis Symeonidis Alexandros Nanopoulos Apostolos N. Papadopoulos Yannis Manolopoulos 《Information Retrieval》2008,11(1):51-75
Collaborative Filtering (CF) Systems have been studied extensively for more than a decade to confront the “information overload”
problem. Nearest-neighbor CF is based either on similarities between users or between items, to form a neighborhood of users
or items, respectively. Recent research has tried to combine the two aforementioned approaches to improve effectiveness. Traditional
clustering approaches (k-means or hierarchical clustering) have also been used to speed up the recommendation process. In this paper, we use biclustering
to disclose this duality between users and items, by grouping them in both dimensions simultaneously. We propose a novel nearest-biclusters
algorithm, which uses a new similarity measure that achieves partial matching of users’ preferences. We apply nearest-biclusters
in combination with two different types of biclustering algorithms—Bimax and xMotif—for constant and coherent biclustering,
respectively. Extensive performance evaluation results on three real-life data sets are provided, which show that the proposed
method substantially improves the performance of the CF process.
Yannis Manolopoulos
19.
This paper reviews the archival process at the Inter-university Consortium for Political and Social Research (ICPSR), a repository
of digital social science data, and maps ICPSR’s Ingest and Access operations to the Open Archival Information System (OAIS)
Reference Model. The paper also assesses ICPSR’s conformance with the archival responsibilities of “trusted” OAIS repositories,
with the proviso that audit criteria for archival certification are still under development. The ICPSR to OAIS mapping exercise
has benefits for the larger social science archiving community because it provides an interpretation of the reference model
in the quantitative social science environment and points to preservation-related issues that may be salient for other social
science archives. Building on the archives’ long tradition of shared norms and cooperation, we may ultimately be able to design
a federated system of trusted social science repositories that provides access to the global heritage.
Cole Whiteman
20.
We present software that generates phrase-based concordances in real-time based on Internet searching. When a user enters
a string of words for which he wants to find concordances, the system sends this string as a query to a search engine and
obtains search results for the string. The concordances are extracted by performing statistical analysis on search results
and then fed back to the user. Unlike existing tools, this concordance consultation tool is language-independent, so concordances
can be obtained even in a language for which there are no well-established analytical methods. Our evaluation has revealed
that concordances can be obtained more effectively than by only using a search engine directly.
Yuichiro Ishii