Similar literature
 Found 20 similar articles (search time: 627 ms)
1.
Users of World Wide Web search engines frequently search for information about people by personal name. Current search engines only return sets of documents containing the queried name, but because several people usually share a personal name, the result sets often contain documents relevant to several people. These result sets need to be disambiguated by person to help users find the person of interest more readily. In name disambiguation, effective measurement of document similarity is a crucial step towards the final disambiguation. We propose a new method that uses web directories as a knowledge base to find common contexts in documents and uses a common-contexts measure to determine document similarities. Experiments conducted on documents mentioning real people on the web, together with several famous web directory structures, suggest that using web directories to disambiguate people has significant advantages over other conventional methods.

2.
Queries submitted to search engines can be classified according to user goals into three distinct categories: navigational, informational, and transactional. Such classification may be useful, for instance, as additional information for advertisement selection algorithms and for search engine ranking functions, among other possible applications. This paper presents a study of the impact of several features, extracted from the document collection and query logs, on the task of automatically identifying the users’ goals behind their queries. We propose new features not previously reported in the literature and study their impact on the quality of query classification. Further, we study the impact of each feature on different web collections, showing that the best set of features may change according to the target collection.

3.
The goal of the study presented in this article is to investigate to what extent the classification of a web page by a single genre matches the users’ perspective. The extent of agreement on a single genre label for a web page can help understand whether there is a need for a different classification scheme that overrides the single-genre labelling. My hypothesis is that a single genre label does not account for the users’ perspective. In order to test this hypothesis, I submitted a restricted number of web pages (25 web pages) to a large number of web users (135 subjects) asking them to assign only a single genre label to each of the web pages. Users could choose from a list of 21 genre labels, or select one of the two ‘escape’ options, i.e. ‘Add a label’ and ‘I don’t know’. The rationale was to observe the level of agreement on a single genre label per web page, and draw some conclusions about the appropriateness of limiting the assignment to only a single label when doing genre classification of web pages. Results show that users largely disagree on the label to be assigned to a web page.

4.
The presentation of search results on the web has been dominated by the textual form of document representation. By contrast, a document’s visual aspects, such as its layout, colour scheme, or presence of images, have been studied only in a limited context with regard to their effectiveness in search result presentation. This article presents a comparative evaluation of textual and visual forms of document representation as additional components of document surrogates. A total of 24 people were recruited for our task-based user study. The experimental results suggest that an increased level of document representation available in the search results can facilitate users’ interaction with a search interface. The results also suggest that the two forms of additional representation are likely to benefit users’ information searching process in different contexts.

5.
Many Web sites have begun allowing users to submit items to a collection and tag them with keywords. The folksonomies built from these tags are an interesting topic that has seen little empirical research. This study compared the search information retrieval (IR) performance of folksonomies from social bookmarking Web sites against search engines and subject directories. Thirty-four participants created 103 queries for various information needs. Results from each IR system were collected and participants judged relevance. Folksonomy search results overlapped with those from the other systems, and documents found by both search engines and folksonomies were significantly more likely to be judged relevant than those returned by any single IR system type. The search engines in the study had the highest precision and recall, but the folksonomies fared surprisingly well. Del.icio.us was statistically indistinguishable from the directories in many cases. Overall the directories were more precise than the folksonomies but they had similar recall scores. Better query handling may enhance folksonomy IR performance further. The folksonomies studied were promising, and may be able to improve Web search performance.
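For readers unfamiliar with the two measures this abstract reports, the following is a minimal sketch of the standard set-based definitions of precision and recall for a single query. The function name and the document identifiers are hypothetical and are not taken from the study; the study's own evaluation procedure may differ in detail.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query using set-based definitions."""
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)  # retrieved documents judged relevant
    # Precision: share of retrieved documents that are relevant.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    # Recall: share of relevant documents that were retrieved.
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical example data: four returned documents, three judged relevant overall.
results = ["d1", "d2", "d3", "d4"]
judged_relevant = ["d2", "d4", "d7"]
p, r = precision_recall(results, judged_relevant)
print(p, r)
```

In this made-up example two of the four returned documents are relevant, and two of the three relevant documents were returned.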

6.
One major approach to finding information on the WWW is to navigate through Web directories and browse them until the goal pages are found. However, such directories are generally constructed manually and may suffer from narrow coverage and inconsistency. Besides, most existing directories provide only monolingual hierarchies that organize Web pages in terms that a user may not be familiar with. In this work, we propose an approach that automatically arranges multilingual Web pages into a multilingual Web directory to break the language barriers in Web navigation. In this approach, a self-organizing map is trained on each set of monolingual Web pages to obtain two feature maps for that language, which reveal the relationships among Web pages and among thematic keywords, respectively. We then apply a hierarchy generation process to these maps to obtain the monolingual hierarchy for these Web pages. A hierarchy alignment method is then applied to the monolingual hierarchies to discover the associations between nodes in different hierarchies. Finally, a multilingual Web directory is constructed according to these associations. We applied the proposed approach to a set of Web pages and obtained interesting results that demonstrate the feasibility of our method for multilingual Web navigation.

7.
Broken hypertext links are a frequent problem on the Web. Sometimes the page that a link points to has disappeared forever, but in many other cases the page has simply been moved to another location in the same web site or to another site. In some cases the page, besides being moved, is updated, so that it differs somewhat from the original but remains rather similar. In all these cases it can be very useful to have a tool that provides pages highly related to the broken link, from which the most appropriate one can be selected. The relationship between the broken link and its candidate replacement pages can be defined as a function of many factors. In this work we have employed several resources, both in the context of the link and in the Web, to look for pages related to a broken link. From the context of a link, we have analyzed several sources of information such as the anchor text, the text surrounding the anchor, the URL and the page containing the link. We have also extracted information about a link from the Web infrastructure, such as search engines, Internet archives and social tagging systems. We have combined all of these resources to design a system that recommends pages that can be used to recover the broken link. A novel methodology is presented to evaluate the system without resorting to user judgments, thus increasing the objectivity of the results and helping to adjust the parameters of the algorithm. We have also compiled a web page collection with true broken links, which has been used for testing of the full system by humans.

8.
A fast and efficient page ranking mechanism for web crawling and retrieval remains a challenging issue. Recently, several link-based ranking algorithms such as PageRank, HITS and OPIC have been proposed. In this paper, we propose a novel recursive method based on reinforcement learning, called “DistanceRank”, which treats the distance between pages as a punishment when computing the ranks of web pages. The distance is defined as the number of “average clicks” between two pages. The objective is to minimize the punishment, or distance, so that a page with a smaller distance has a higher rank. Experimental results indicate that DistanceRank outperforms other ranking algorithms in page ranking and crawl scheduling. Furthermore, the complexity of DistanceRank is low. We used the University of California at Berkeley’s web for our experiments.
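The intuition that a page reachable in fewer clicks should rank higher can be illustrated with a toy sketch. This is not the DistanceRank algorithm: the abstract describes a recursive reinforcement-learning method, whereas the code below simply uses breadth-first search to count the fewest clicks between pages and orders pages by their mean click distance. The link graph, the function names, and the tie-breaking by page name are all assumptions made for illustration.

```python
from collections import deque


def click_distances(graph, start):
    """Breadth-first search: fewest clicks from `start` to each reachable page."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in graph.get(page, []):
            if target not in dist:
                dist[target] = dist[page] + 1
                queue.append(target)
    return dist


def rank_by_average_distance(graph):
    """Order pages by mean click distance from the pages that can reach them."""
    pages = set(graph) | {t for targets in graph.values() for t in targets}
    incoming = {p: [] for p in pages}
    for source in pages:
        for target, d in click_distances(graph, source).items():
            if target != source:
                incoming[target].append(d)
    # Pages that no other page links to (directly or indirectly) get infinite distance.
    avg = {p: (sum(ds) / len(ds) if ds else float("inf"))
           for p, ds in incoming.items()}
    # Smaller average distance ranks first; ties are broken by page name.
    order = sorted(pages, key=lambda p: (avg[p], p))
    return order, avg


# Hypothetical link graph: page -> pages it links to.
links = {"home": ["a", "b"], "a": ["b"], "b": ["home"], "c": ["a"]}
order, avg = rank_by_average_distance(links)
print(order)
```

Running one search per page is only practical for small graphs, which is one reason a recursive formulation like the one the abstract describes is attractive at web scale.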

9.
Students use general web search engines as their primary research source when trying to find answers to school-related questions. Although search engines are highly relevant for the general population, they may return results that are out of educational context. Social community question answering websites, another rising trend, are the second choice for students who try to get answers from peers online. We attempt to discover possible improvements in educational search by leveraging both of these information sources. For this purpose, we first implement a classifier for educational questions. This classifier is built by an ensemble method that employs several regular learning algorithms and retrieval-based approaches that utilize external resources. We also build a query expander to facilitate classification. We further improve the classification using search engine results and obtain 83.5% accuracy. Although our work is entirely based on the Turkish language, the features could easily be mapped to other languages as well. In order to find out whether search engine ranking can be improved in the education domain using the classification model, we collect and label a set of query results retrieved from a general web search engine. We propose five ad-hoc methods to improve search ranking based on the idea that the query-document category relation is an indicator of relevance. We evaluate these methods for overall performance, for varying query length, and on factoid and non-factoid queries. We show that some of the methods significantly improve the rankings in the education domain.

10.
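Wu Song, Li Dapeng. 《现代情报》, 2007, 27(7): 169-170, 174
This article outlines five ways of finding traditional Chinese medicine (TCM) information resources on the Internet: searching specialized TCM websites, searching the home pages of TCM colleges and universities, searching biomedical information networks, using medical search engines and directories, and using general-purpose retrieval tools and comprehensive networks. It also briefly introduces how to use them.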

11.
The study investigated how users’ emotion control and search tasks interact and influence Web search behavior and performance among experienced Web users. Sixty-seven undergraduate students with substantial Web experience participated in the study. Emotion control and tasks were found to have significant effects on search behavior but not on search performance. The interaction effect between emotion control and tasks on search behavior was also significant: the effects of users’ emotion control on search behavior varied depending on the search task. Profile analyses of search behaviors identified and contrasted the most commonly occurring profiles of search activities in different search tasks. Suggestions were made to improve information literacy programs, and implications for future research were discussed.

12.
For many people, the Web, and especially the major Web search engines, are essential tools in the quest to locate online information. This paper reports results from research that examines characteristics of and changes in Web searching, drawing on nine studies of five Web search engines based in the US and Europe. We compare interactions between users and Web search engines from the perspectives of session length, query length, query complexity, and content viewed across the Web search engines. The results of our research show that (1) users are viewing fewer result pages, (2) searchers on US-based Web search engines use more query operators than searchers on European-based search engines, (3) there are statistically significant differences in the use of Boolean operators and in result pages viewed, and (4) one cannot necessarily apply results from studies of one particular Web search engine to another Web search engine. The widespread use of Web search engines, the employment of simple queries, and the decreased viewing of result pages may have resulted from algorithmic enhancements by Web search engine companies. We discuss the implications of the findings for the development of Web search engines and the design of online content.

13.
The exponential growth of information available on the World Wide Web, and retrievable by search engines, has made it necessary to develop efficient and effective methods for organizing relevant content. Document clustering plays an important role here and remains an interesting and challenging problem in the field of web computing. In this paper we present a document clustering method that takes into account both the content information and the hyperlink structure of a web page collection, where a document is viewed as a set of semantic units. We exploit this representation to determine the strength of the relation between two linked pages and to define a relational clustering algorithm based on a probabilistic graph representation. The experimental results show that the proposed approach, called RED-clustering, outperforms two of the best-known clustering algorithms, k-Means and Expectation Maximization.

14.
A critical challenge for Web search engines concerns how they present relevant results to searchers. The traditional approach is to produce a ranked list of results with title and summary (snippet) information, and these snippets are usually chosen based on the current query. Snippets play a vital sensemaking role, helping searchers to efficiently make sense of a collection of search results, as well as determine the likely relevance of individual results. Recently researchers have begun to explore how snippets might also be adapted based on searcher preferences as a way to better highlight relevant results to the searcher. In this paper we focus on the role of snippets in collaborative web search and describe a technique for summarizing search results that harnesses the collaborative search behaviour of communities of like-minded searchers to produce snippets that are more focused on the preferences of the searchers. We go on to show how this so-called social summarization technique can generate summaries that are significantly better adapted to searcher preferences and describe a novel personalized search interface that combines result recommendation with social summarization.

15.
Over time, researchers have acknowledged the importance of understanding users’ strategies in the design of search systems. However, when users are involved in the comparison of search systems, methodological challenges remain, as researchers are still pondering how to handle the variability that human participants bring to the comparisons. This paper presents methods for controlling the complexity of user-centered evaluations of search user interfaces through within-subjects designs, balanced task sets, time limitations, pre-formulated queries, cached result pages, and limits on the users’ access to result documents. Additionally, we present our experiences in using three measures (search speed, qualified search speed, and immediate accuracy) to facilitate the comparison of different search systems across studies.

16.
To improve search engine effectiveness, we have observed an increased interest in gathering additional feedback about users’ information needs that goes beyond the queries they type in. Adaptive search engines use explicit and implicit feedback indicators to model users or search tasks. In order to create appropriate models, it is essential to understand how users interact with search engines, including the determining factors of their actions. Using eye tracking, we extend this understanding by analyzing the sequences and patterns with which users evaluate the query results returned to them when using Google. We find that the query result abstracts are viewed in the order of their ranking in only about one fifth of the cases, and that only about three abstracts per result page, on average, are viewed at all. We also compare search behavior variability with respect to different classes of users and different classes of search tasks to reveal whether user models or task models may be better predictors of behavior. We find that gender and task significantly influence different kinds of the search behaviors discussed here. The results suggest improvements to query-based search interface designs with respect to both their use of space and workflow.

17.
Web sites often provide the first impression of an organization. For many organizations, web sites are crucial to ensure sales or to procure services within. When a person opens a web site, the first impression is probably made in a few seconds, and the user will either stay or move on to the next site on the basis of many factors. One of the factors that may influence users to stay or go is the page aesthetics. Another reason may involve a user’s judgment about the site’s credibility. This study explores the possible link between page aesthetics and a user’s judgment of the site’s credibility. Our findings indicate that when the same content is presented using different levels of aesthetic treatment, the content with a higher aesthetic treatment was judged as having higher credibility. We call this the amelioration effect of visual design and aesthetics on content credibility. Our study suggests that this effect is operational within the first few seconds in which a user views a web page. Given the same content, a higher aesthetic treatment will increase perceived credibility.

18.
This case study analyzes the Internet-based resources that a software engineer uses in his daily work. Methodologically, we studied the web browser history of the participant, classifying all the web pages he had seen over a period of 12 days into web genres. We interviewed him before and after the analysis of the web browser history. In the first interview, he spoke about his general information behavior; in the second, he commented on each web genre, explaining why and how he used them. As a result, three approaches allow us to describe the set of 23 web genres obtained: (a) the purposes they serve for the participant; (b) the role they play in the various work and search phases; (c) and the way they are used in combination with each other. Further observations concern the way the participant assesses quality of web-based resources, and his information behavior as a software engineer.

19.
In this paper, we address the so-called “ranked list problem” of Web searches, which occurs when users submit short requests to search engines. Generally, as a consequence of the ambiguity and polysemy of terms, users engage in long cycles of query reformulation in an attempt to capture relevant information in the top-ranked results.

20.
This article presents conceptual navigation and NavCon, an architecture that implements this navigation in World Wide Web pages. The NavCon architecture uses an ontology as metadata to contextualize the user’s search for information. Based on ontologies, NavCon automatically inserts conceptual links into Web pages. By using these links, the user may navigate a graph representing ontology concepts and their relationships. By browsing this graph, it is possible to reach documents associated with the ontology concept the user is looking for. We call this ontology-supported Web navigation conceptual navigation. Conceptual navigation is a technique for browsing Web sites within a context. The context filters the relevant retrieved information. The context also guides user navigation along paths that meet the user’s needs. A company may implement conceptual navigation to improve users’ search for information in a knowledge management environment. We suggest that using an ontology to guide navigation in an Intranet may help the user gain a better understanding of the knowledge structure of the company.
