Web queries in question format are becoming a common element of users' interaction with Web search engines. Web search services such as Ask Jeeves, a publicly accessible question and answer (Q&A) search engine, ask users to enter queries in question format. This paper reports results from a study of question-format queries submitted to two Web search engines: Ask Jeeves, which explicitly encourages queries in question format, and the Excite search service, which does not. We identify the characteristics of question-format queries, including their nature, length, and structure, in two data sets: 30,000 Ask Jeeves queries and 15,575 Excite queries. Findings include: (1) 50% of Ask Jeeves queries and less than 1% of Excite queries were in question format; (2) most users entered only one query in question format, with little query reformulation; (3) the range of question formats was limited, mainly “where”, “what”, or “how” questions; (4) the most common question format was “Where can I find ...” for general information on a topic; and (5) non-question queries may be in request format. Overall, four types of user Web queries were identified: keyword, Boolean, question, and request. These findings provide an initial mapping of the structure and content of queries in question and request format. Implications for Web search services are discussed.
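
The four query types named in the abstract above (keyword, Boolean, question, request) can be illustrated with a small rule-based classifier. This is only a sketch: the abstract does not give the study's coding rules, so the cue lists and the function name `classify_query` are assumptions chosen for illustration.

```python
import re

# Hypothetical cue lists; the study's actual coding scheme is not given.
QUESTION_WORDS = ("where", "what", "how", "who", "when", "why", "which")
REQUEST_PREFIXES = ("find", "show me", "tell me", "give me", "i need", "i want")
# Upper-case operators or +/- term prefixes are treated as Boolean syntax.
BOOLEAN_PATTERN = re.compile(r"\b(AND|OR|NOT)\b|(^|\s)[+-]\w")


def classify_query(query: str) -> str:
    """Return 'question', 'request', 'boolean', or 'keyword' for a query."""
    text = query.strip()
    lowered = text.lower()
    first_word = lowered.split()[0] if lowered else ""

    # Question format: ends with '?' or opens with a question word.
    if lowered.endswith("?") or first_word in QUESTION_WORDS:
        return "question"
    # Request format: an imperative or statement of need, not a question.
    if any(lowered.startswith(prefix + " ") for prefix in REQUEST_PREFIXES):
        return "request"
    # Boolean format: explicit operators in the original casing.
    if BOOLEAN_PATTERN.search(text):
        return "boolean"
    # Everything else is treated as a plain keyword query.
    return "keyword"


if __name__ == "__main__":
    for q in ["Where can I find cheap flights",
              "find hotels in paris",
              "cats AND dogs",
              "cheap flights paris"]:
        print(f"{q!r}: {classify_query(q)}")
```

Checking the question cues first reflects the study's focus on question format; a real coding scheme would need to handle overlaps (for example, a question that also contains Boolean operators) more carefully.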

The analysis of contextual information in search engine query logs enhances the understanding of Web users’ search patterns. Obtaining contextual information from Web search engine logs is difficult, since users submit few queries and search on multiple topics. Identifying topic changes within a search session is an important branch of search engine user behavior analysis. The purpose of this study is to investigate the properties of a specific topic identification methodology in detail and to test its validity. The topic identification algorithm’s performance becomes doubtful in various cases. These cases are explored, and the reasons underlying the inconsistent performance of automatic topic identification are investigated with statistical analysis and experimental design techniques.

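[Purpose] To analyze the similarities and differences among four Web data sources in evaluating journal influence, and to provide theoretical guidance for comprehensive journal evaluation. [Methods] Taking as the study objects the 50 open-access journals with the highest impact factors that are indexed in both the 2014 JCR and the PubMed database, correlation analysis and factor analysis were used to extract the main influencing factors of each data-source platform in journal evaluation, to serve as representative indicators for journal evaluation. [Results] In JCR, journal influence derives mainly from citation counts and citable items; Google Scholar widens the dissemination range of journal information; Altmetric.com emphasizes Web dissemination, access, and usage indicators, and its main influencing factors come from the Web dissemination indicators. [Conclusions] JCR reflects a journal's influence in academia; Google Scholar moves journal evaluation earlier; the link counts, IP visits, and page views (PV) of journal websites obtained through search engines describe influence at the dissemination stage. Altmetric.com extends journal evaluation to the whole process of dissemination, access, and use.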

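With the arrival of the big data era, big data applications have reached into many aspects of everyday life. Analysis of international roaming can also be built on big data; it offers good visualization of holiday periods, can effectively guide network optimization, and supports data monetization. Using multidimensional analysis of international roaming as a typical big data application, this article introduces each stage of implementing such an application and demystifies big data processing for a general audience.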

This study examines the facets and patterns of multiple Web query reformulations with a focus on reformulation sequences. Based on IR interaction models, it was presumed that query reformulation is the product of the interaction between the user and the IR system. Query reformulation also reflects the interplay between the surface and deeper levels of user interaction. Query logs were collected from a Web search engine through the selection of search sessions in which users submitted six or more unique queries per session. The final data set was composed of 313 search sessions. Three facets of query reformulation (content, format, and resource) as well as nine sub-facets were derived from the data. In addition, analysis of modification sequences identified eight distinct patterns: specified, generalized, parallel, building-block, dynamic, multitasking, recurrent, and format reformulation. Adapting Saracevic’s stratified model, the authors develop a model of Web query reformulation based on the results of the study. The implications for Web search engine design are finally discussed and the functions of an interactive reformulation tool are suggested.

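With the rapid spread of smart devices such as the iPhone and Android phones, mobile Web technology has become a new focus of attention, and traditional information and e-commerce websites are moving to mobile terminals in response to market demand. Developing mobile Web applications with jQuery Mobile and HTML5 offers simple development, short release cycles, and cross-platform, cross-device support. This article introduces and analyzes mobile Web application development with jQuery Mobile and HTML5.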

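Trust is a key factor in the success of e-commerce. This paper studies online trust in B2C (Business-to-Consumer) e-commerce. It analyzes the characteristics and components of consumers' online trust in the B2C environment, proposes ways to build online trust relationships with consumers in that environment, and offers related recommendations for e-commerce website design. A case analysis of real websites illustrates the soundness of these recommendations.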

Tracing the closure of oceans with irregular margins and the formation of an orocline are crucial for understanding plate reconstruction and continental assembly. The eastern Central Asian Orogenic Belt, where the Mongol-Okhotsk orocline is situated, is one of the world’s largest magmatic provinces. Using a large data set of U-Pb zircon ages, we updated the timing of many published igneous rocks, which allowed us to recognize tightly ‘folded’ linear Carboniferous-Jurassic magmatic belts that wrap arou...

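In the vehicle routing problem (VRP), the various uncertainties that arise during distribution have made the stochastic VRP a growing focus of research. The emergence of data warehouse and data mining technologies provides technical support for solving the stochastic VRP. For the VRP with stochastic demand, a corresponding database and data mining model are constructed, and a heuristic algorithm is then used to solve a given data case, with good results.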

This article addresses the question of whether personal surveillance on the world wide web is different in nature and intensity from that in the offline world. The article presents a profile of the ways in which privacy problems were framed and addressed in the 1970s and 1990s. Based on an analysis of privacy news stories from 1999–2000, it then presents a typology of the kinds of surveillance practices that have emerged as a result of Internet communications. Five practices are discussed and illustrated: surveillance by glitch, surveillance by default, surveillance by design, surveillance by possession, and surveillance by subject. The article offers some tentative conclusions about the progressive latency of tracking devices, about the complexity created by multi-sourcing, about the robustness of clickstream data, and about the erosion of the distinction between the monitor and the monitored. These trends emphasize the need to reject analysis that frames our understanding of Internet surveillance in terms of its impact on society. Rather the Internet should be regarded as a form of life whose evolving structure becomes embedded in human consciousness and social practice, and whose architecture embodies an inherent valence that is gradually shifting away from the assumptions of anonymity upon which the Internet was originally designed.

Because data in the Amazon Web Services (AWS) cloud is growing rapidly, traditional analysis methods are no longer adequate, and many data scientists have proposed unconventional methods, such as concurrent and parallel techniques, to meet the performance and scalability requirements of such big data analyses. In this paper we use the Hadoop MapReduce system, which comprises the Hadoop Distributed File System (HDFS) and a Hadoop cluster. We optimize it by combining it with five efficient Data Mining (DM) algorithms, namely Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), Correlative Naïve Bayes classifier (CNB), and Fuzzy CNB (FCNB), for strong analytics of cloud big data. The proposed system is applied to product review data taken from the AWS cloud. Hadoop MapReduce is evaluated with standard benchmarks: Mean Absolute Percentage Error (MAPE), Root Mean Square Error (RMSE), and runtime for word count, sort, and inverted index. The DM models running on Hadoop MapReduce are evaluated using accuracy, sensitivity, specificity, memory, and running time. Experiments show that FCNB is effective in addressing the big data problem.
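
For readers unfamiliar with the two error measures named in the abstract above, the following minimal sketch computes Mean Absolute Percentage Error and Root Mean Square Error using their standard textbook definitions. The function names and the sample values are illustrative assumptions; the paper's own implementation and data are not shown in the abstract.

```python
import math


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (standard definition)."""
    pairs = list(zip(actual, predicted))
    if not pairs:
        raise ValueError("need at least one observation")
    if any(a == 0 for a, _ in pairs):
        # The percentage error is undefined when an actual value is zero.
        raise ValueError("MAPE is undefined for actual values of zero")
    return 100.0 * sum(abs((a - p) / a) for a, p in pairs) / len(pairs)


def rmse(actual, predicted):
    """Root Mean Square Error (standard definition)."""
    pairs = list(zip(actual, predicted))
    if not pairs:
        raise ValueError("need at least one observation")
    return math.sqrt(sum((a - p) ** 2 for a, p in pairs) / len(pairs))


if __name__ == "__main__":
    # Made-up illustrative values, not results from the paper.
    actual = [100.0, 200.0, 400.0]
    predicted = [110.0, 180.0, 400.0]
    print(f"MAPE = {mape(actual, predicted):.2f}%")
    print(f"RMSE = {rmse(actual, predicted):.2f}")
```

MAPE expresses error relative to each actual value, while RMSE stays in the units of the data and penalizes large errors more heavily, which is one reason the two are often reported together.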

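This paper points out that the traditional knowledge service model is still a coarse, static, "point-to-point" service that is based on fixed resources or systems and aimed at users' isolated interests, and that its efficiency is low. Users' research processes, by contrast, are dynamic, show clear demand drift, and form a dynamic, continuous process. The paper therefore proposes the concept of refined knowledge service, which centers on the user's academic research process, dynamically tracks how user needs shift across stages of research, and greatly improves the effectiveness of knowledge service through a "surface-to-surface" knowledge space. Experimental research verifies user demand drift and the formation of the knowledge space, as well as the theoretical and practical significance of the problem the paper raises.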

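Based on the characteristics of Excel worksheet functions and array formulas, a data extraction method is proposed. Using network traffic statistics as an example, reasonably general data extraction formulas are constructed, which can improve the efficiency of network management.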

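Although many management consultants suggest that firms can outsource part of their technology development, little is known about the nature of the interaction between in-house R&D and external technology acquisition, or about the conditions that shape it. This paper constructs a translog innovation output function and uses innovation input-output data on large and medium-sized industrial enterprises in 31 Chinese provinces to test substitutability and complementarity among in-house R&D investment, external R&D investment, and external technology import (direct technology utilization). The results show that complementarity between in-house and external R&D is stronger when a firm has greater technological absorptive capacity, larger economic scale, and prior technology transfer experience, whereas a firm's R&D investment (technology development) and external technology import (direct technology utilization) generally act as substitutes. The substitutability and complementarity of external technology thus differ across stages of technological activity and across the forms that activity takes.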

Ethics and Information Technology - Data analytics and data-driven approaches in Machine Learning are now among the most hailed computing technologies in many industrial domains. One major...

Research on the diffusion of new technologies has centred on the study of the interfirm rate of diffusion, paying much less attention to intrafirm aspects. This paper attempts to fill this gap in the literature by analysing the factors that influence the speed with which a new technology, the ATM, is fully adopted. The data on which the hypotheses are tested come from the Spanish savings banks market. The results show that the rate of intrafirm diffusion is explained by innovation, firm and market characteristics. In testing our hypotheses we make use of both traditional methods and survival analysis techniques.

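Sound construction of journal evaluation standards is important for the healthy growth of academic journals, and the arrival of the big data era will greatly affect the current evaluation of academic journals in China. After briefly describing China's main current academic evaluation systems, their evaluation standards, basic characteristics, and existing problems, the paper analyzes in detail how big data affects academic journal evaluation standards, and on that basis proposes specific indicators, with their calculation formulas, that such standards might include in the big data context.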

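Based on a sample of nanotechnology patents from the World Intellectual Property Organization (WIPO), this study uses non-patent reference (NPR) analysis and a negative binomial regression model to quantify the influence of nanoscience on nanotechnology. The regression results show that, in nanotechnology, a patent's citations to scientific papers are significantly and positively correlated with patent value, whereas its citations to non-scientific publications have no significant relationship with patent value. Even so, a patent's citations to scientific papers do not represent a direct input of science into technology; rather, they represent a dance between science and technology, a multilayered exchange and interaction between basic science and technological innovation.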

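Academician Bai Chunli has had an unusual career. As a youth he dreamed of literature, yet he ended up at the frontier of natural science; he served as a soldier and drove heavy trucks, and years later this passionate soldier became a world-renowned scientist. Academician Profile. After a full day of waiting, the reporter finally managed to draw Academician Bai Chunli out of the meeting hall. When he appeared, he was energetic yet refined, and it was hard to believe that this handsome, amiable middle-aged man with thick eyebrows and large eyes was a 51-year-old academician ...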

A Zipfian model of an automatic bibliographic system is developed using parameters describing the contents of its database and its inverted file. The underlying structure of the Zipf distribution is derived, with particular emphasis on its application to word frequencies, especially with regard to the inverted files of an automatic bibliographic system. Andrew Booth developed a form of Zipf's law which estimates the number of words of a particular frequency for a given author and text. His formulation has been adopted as the basis of a model of term dispersion in an inverted file system. The model is also distinctive in its consideration of the proliferation of spelling errors in free text, and the inclusion of all searchable elements from the system's inverted file. This model is applied to the National Library of Medicine's MEDLINE. The model carries implications for the determination of database storage requirements, search response time, and search exhaustiveness.
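
As a rough illustration of the Booth formulation mentioned in the abstract above, the sketch below uses the commonly cited form of Booth's law, in which the expected number of distinct words occurring exactly n times is D / (n(n + 1)) for a vocabulary of D distinct words. This form is an assumption here: the abstract does not state the exact equation the paper adopts, and the function names are hypothetical.

```python
from collections import Counter


def booth_expected_counts(distinct_words, max_frequency):
    """Expected number of words occurring exactly n times, n = 1..max_frequency.

    Uses the commonly cited form I_n = D / (n * (n + 1)); assumed here,
    not taken from the paper itself.
    """
    return {n: distinct_words / (n * (n + 1))
            for n in range(1, max_frequency + 1)}


def observed_frequency_spectrum(tokens):
    """Count how many distinct words occur exactly n times in a token list."""
    word_counts = Counter(tokens)
    return Counter(word_counts.values())


if __name__ == "__main__":
    # Toy token list, for illustration only.
    tokens = "a b b c c c d".split()
    observed = observed_frequency_spectrum(tokens)
    expected = booth_expected_counts(len(set(tokens)), 3)
    for n in sorted(expected):
        print(n, observed.get(n, 0), round(expected[n], 2))
```

Under this form half of the vocabulary is expected to occur only once, which is why low-frequency terms (including misspellings) weigh heavily on the size of an inverted file.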

