The success of information retrieval depends on the ability to measure the effective relationship between a query and its response. If both are posed in natural language, one might expect that understanding the meaning of that language could not be avoided. The aim of this research is to demonstrate that it is perhaps unnecessary to be able to determine the meaning in the absolute sense; it may be sufficient to measure how far there is a conformity in meaning, and then only in the context of the set of documents in which the answer to a query is sought. Handling a particular language using a computer is made possible through replacing certain texts by special sets. A given text has a ‘syntactic trace’, the set of all the overlapping trigrams forming part of the text. When determining the effective relationship between a query and its answer, not only do their syntactic traces play a role, but so do the traces of all other documents in the set. This is known as the ‘information trace method’.
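
The ‘syntactic trace’ described in the abstract above can be illustrated with a minimal sketch. Character-level trigrams, the lower-casing and whitespace normalization, and the Jaccard overlap are all assumptions made here for illustration; the abstract does not state the trigram unit or the comparison measure. The full ‘information trace method’ also takes the traces of all other documents in the set into account, which this sketch leaves out.

```python
def syntactic_trace(text: str) -> set[str]:
    """Return the set of all overlapping character trigrams in `text`.

    Assumption: trigrams are taken over characters after lower-casing
    and collapsing runs of whitespace to a single space.
    """
    normalized = " ".join(text.lower().split())
    return {normalized[i:i + 3] for i in range(len(normalized) - 2)}


def trace_overlap(query: str, document: str) -> float:
    """Jaccard overlap of two syntactic traces.

    Assumption: Jaccard is a stand-in measure, not the one from the paper.
    """
    q, d = syntactic_trace(query), syntactic_trace(document)
    if not q or not d:
        return 0.0
    return len(q & d) / len(q | d)


if __name__ == "__main__":
    # Hypothetical query and document, chosen only to show the call shape.
    print(sorted(syntactic_trace("retrieval")))
    print(trace_overlap("information retrieval", "retrieval of information"))
```

Representing each text as a set rather than a list discards trigram order and frequency, which matches the abstract's wording that texts are replaced by "special sets".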

The fundamental idea of the work reported here is to extract index phrases from texts with the help of a single word concept dictionary and a thesaurus containing relations among concepts. The work is based on the fact that, within every phrase, the single words the phrase is composed of are related in a certain well defined manner, the type of relations holding between concepts depending only on the concepts themselves. Therefore relations can be stored in a semantic network. The algorithm described extracts single word concepts from texts and combines them into phrases using the semantic relations between these concepts, which are stored in the network. The results obtained show that phrase extraction from texts by this semantic method is possible and offers many advantages over other (purely syntactic or statistical) methods concerning preciseness and completeness of the meaning representation of the text. But the results show, too, that some syntactic and morphologic “filtering” should be included for effectivity reasons.

Managing personal information such as to-dos and contacts has become part of our daily routine, consuming more time than needed. Existing PIM tools require extensive involvement of human users. This becomes a problem on mobile devices due to their physical constraints. To address the limitations of traditional PIM tools, we propose a model of a mobile PIM agent (PIMA) that aims to improve PIM on mobile devices through a natural language interface and application integration. We conducted a user study to evaluate PIMA empirically with prototype systems. The results show that mobile PIMA improved perceived usefulness, ease-of-use, and efficiency of PIM on mobile devices, which in turn accounted for positive attitude and intention to use the system. The findings of this study provide suggestions for designing and developing PIM applications on mobile devices.

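Establishing a Synchronous Intervention Mechanism to Promote the Cultivation of University Students' Information Literacy   Total citations: 1, self-citations: 0, citations by others: 1
The article first analyzes the current state of information literacy education for university students in China, argues that the existing training model has certain limitations, and analyzes their causes. It then points out that establishing a sound information literacy training model requires introducing a synchronous intervention mechanism adapted to the stage-specific characteristics of students' development, and discusses a concrete implementation plan for that mechanism.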

The primary aim of this study is to suggest a formalized definition (“explication”) of “relevance relationship” between texts, including the explication of the concept of “degree of relevance”. The concept of information language (IL), its vocabulary and syntax and the notion of the “semantic power” of an information language are defined. The concept of ideally functioning information retrieval systems (IRS) is suggested and different kinds of deviations from such IRS are considered.

In this study, quantitative measures of the information content of textual material have been developed based upon analysis of the linguistic structure of the sentences in the text. It has been possible to measure such properties as: (1) the amount of information contributed by a sentence to the discourse; (2) the complexity of the information within the sentence, including the overall logical structure and the contributions of local modifiers; (3) the density of information based on the ratio of the number of words in a sentence to the number of information-contributing operators. Two contrasting types of texts were used to develop the measures. The measures were then applied to contrasting sentences within one type of text. The textual material was drawn from narrative patient records and from the medical research literature. Sentences from the records were analyzed by computer and those from the literature were analyzed manually, using the same methods of analysis. The results show that quantitative measures of properties of textual information can be developed which accord with intuitively perceived differences in the informational complexity of the material.

Among the problems associated with modern information retrieval systems is the lack of any systematic approach to the design of query language interfaces. In this paper we attempt to show how a relationally organised data base is well suited to bibliographic data management, and how, given such a relational organisation, it is possible to construct an interface which separates the query language from the physical representation of the data base. It is also shown how such a query language organisation may be usefully interfaced to existing retrieval systems. Finally, a query language for retrieval applications is proposed.

A rapid increase in the use of web-based technologies – and corresponding changes in government and local council policies – in recent years means that many vital services are now provided solely online. While this has many potential benefits, it can place additional burdens on certain demographic groups, some of whom may become considerably disadvantaged or even disenfranchised. This is particularly problematic for English-as-a-Second-Language (ESL) speakers, who are often immigrants or refugees and thus have a greater need to access these e-government services, and who may struggle to understand and assess the relevance of complex documents. In this work we investigate the search behaviours and performance of native English speakers and two different groups of ESL speakers when completing e-government tasks, and the effect of document readability/complexity. In contrast with previous work, our results show significant differences between groups of varying language proficiency in terms of objective search performance, time on task, and self-perceived performance and confidence. We also demonstrate that document reading level moderates the effect of language proficiency on objective search performance. The findings contribute to our existing understanding of how English language proficiency affects search for e-government topics, and have important implications for the future development of e-government services to ensure more equitable access and use.

We analyzed natural language document retrieval queries from the Thomas Cooper Library at the University of South Carolina in order to investigate the frequency of various types of ill-formed input, such as spelling errors, co-occurrence violations, conjunctions, ellipsis and missing or incorrect punctuation. The primary reason for analyzing ill-formed inputs was to determine whether there is a significant need to study ill-formed inputs in detail. After analyzing the queries, we found that most of the queries were sentence fragments and that many of them contained some type of ill-formed input. Conjunctions caused the most problems. The next most serious problem was caused by punctuation errors. Spelling errors occurred in a small number of the queries. The remaining types of ill-formed input considered, ellipsis and co-occurrence violations, were not found in the queries.

This study proposes the codification of lexical information in electronic dictionaries, in accordance with a generic and extendable XML scheme model, and its conjunction with linguistic tools for the processing of natural language. Our approach is different from other similar studies in that we propose XML coding of those items from a dictionary of meanings that are less related to the lexical units. Linguistic information, such as morphology, syllables, phonology, etc., will be included by means of specific linguistic tools. The use of XML as a container for the information allows the use of other XML tools for carrying out searches or for enabling presentation of the information in different resources. This model is particularly important as it combines two parallel paradigms—extendable labelling of documents and computational linguistics—and it is also applicable to other languages. We have included a comparison with the labelling proposal of printed dictionaries carried out by the Text Encoding Initiative (TEI). The proposed design has been validated with a dictionary of more than 145 000 accepted meanings.

Referring expression generation is the part of natural language generation that decides how to refer to the entities appearing in an automatically generated text. Lexicalization is the part of this process which involves the choice of appropriate vocabulary or expressions to transform the conceptual content of a referring expression into the corresponding text in natural language. This problem presents an important challenge when we have enough knowledge to allow more than one alternative. In those cases, we need some heuristics to decide which alternatives are more appropriate in a given situation. Whereas most work on natural language generation has focused on a generic way of generating language, in this paper we explore personal preferences as a type of heuristic that has not been properly addressed. We empirically analyze the TUNA corpus, a corpus of referring expression lexicalizations, to investigate the influence of language preferences in how people lexicalize new referring expressions in different situations. We then present two corpus-based approaches to solve the problem of referring expression lexicalization, one that takes preferences into account and one that does not. The results show a decrease of 50% in the similarity error against the reference corpus when personal preferences are used to generate the final referring expression.

This paper addresses the problem of the automatic recognition and classification of temporal expressions and events in human language. Efficacy in these tasks is crucial if the broader task of temporal information processing is to be successfully performed. We analyze whether the application of semantic knowledge to these tasks improves the performance of current approaches. We therefore present and evaluate a data-driven approach as part of a system: TIPSem. Our approach uses lexical semantics and semantic roles as additional information to extend classical approaches which are principally based on morphosyntax. The results obtained for English show that semantic knowledge aids in temporal expression and event recognition, achieving an error reduction of 59% and 21%, while in classification the contribution is limited. From the analysis of the results it may be concluded that the application of semantic knowledge leads to more general models and aids in the recognition of temporal entities that are ambiguous at shallower language analysis levels. We also discovered that lexical semantics and semantic roles have complementary advantages, and that it is useful to combine them. Finally, we carried out the same analysis for Spanish. The results obtained show comparable advantages. This supports the hypothesis that applying the proposed semantic knowledge may be useful for different languages.

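The report to the 20th National Congress of the Communist Party of China called for "actively and prudently advancing carbon peaking and carbon neutrality" and "accelerating the planning and construction of a new-type energy system". As a green, low-carbon secondary energy source, hydrogen has application prospects in promoting the large-scale, efficient use of renewable energy, driving energy substitution in transport, and accelerating deep decarbonization in industry. It is an indispensable component of a new-type energy system and an important green solution for achieving carbon peaking and carbon neutrality. To study China's hydrogen energy policy system comprehensively and systematically, the article surveys 621 hydrogen energy policy documents issued by China's central and local governments. Grounded in policy informatics, it uses natural language processing to mine policy element information and structured data indicators, and combines text analysis, quantitative analysis, and data visualization to study features such as the evolution of hydrogen energy policy, the regional pattern of the industry, and the layout of the industrial chain. This research framework and these analytical methods help make the study of hydrogen energy policy more systematic and timely. On this basis, the article concludes with policy recommendations for accelerating development in the weak links of China's hydrogen energy industry.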
