4 results found; search took 15 ms
1.
Travis Gagie, Aleksi Hartikainen, Kalle Karhu, Juha Kärkkäinen, Gonzalo Navarro, Simon J. Puglisi, Jouni Sirén 《Information Retrieval》2017,20(3):253-291
Most of the fastest-growing string collections today are repetitive, that is, most of the constituent documents are similar to many others. As these collections keep growing, a key approach to handling them is to exploit their repetitiveness, which can reduce their space usage by orders of magnitude. We study the problem of indexing repetitive string collections in order to perform efficient document retrieval operations on them. Document retrieval problems are routinely solved by search engines on large natural language collections, but the techniques are less developed on generic string collections. The case of repetitive string collections is even less understood, and there are very few existing solutions. We develop two novel ideas, interleaved LCPs and precomputed document lists, that yield highly compressed indexes solving the problem of document listing (find all the documents where a string appears), top-k document retrieval (find the k documents where a string appears most often), and document counting (count the number of documents where a string appears). We also show that a classical data structure supporting the latter query becomes highly compressible on repetitive data. Finally, we show how the tools we developed can be combined to solve ranked conjunctive and disjunctive multi-term queries under the simple tf-idf model of relevance. We thoroughly evaluate the resulting techniques in various real-life repetitiveness scenarios, and recommend the best choices for each case.
2.
Marina L. Puglisi, Charles Hulme, Lorna G. Hamilton, Margaret J. Snowling 《Scientific Studies of Reading》2017,21(6):498-514
The home literacy environment is a well-established predictor of children’s language and literacy development. We investigated whether formal, informal, and indirect measures of the home literacy environment predict children’s reading and language skills once maternal language abilities are taken into account. Data come from a longitudinal study of children at high risk of dyslexia (N = 251) followed from preschool years. Latent factors describing maternal language were significant predictors of storybook exposure but not of direct literacy instruction. Maternal language and phonological skills respectively predicted children’s language and reading/spelling skills. However, after accounting for variations in maternal language, storybook exposure was not a significant predictor of children’s outcomes. In contrast, direct literacy instruction remained a predictor of children’s reading/spelling skills. We argue that the relationship between early informal home literacy activities and children’s language and reading skills is largely accounted for by maternal skills and may reflect genetic influences.
3.
De Almeida Maia Denise, Pohl Steffi, Okuda Paola Matiko Martins, Liu Ting, Puglisi Marina Leite, Ploubidis George, Eid Michael, Cogo-Moreira Hugo 《Educational Assessment, Evaluation and Accountability》2022,34(2):227-239
The Bracken School Readiness Assessment (BSRA) has been used in large studies such as the Millennium Cohort Study (MCS). Important...
4.