Similar Documents
20 similar documents found (search time: 484 ms)
1.
To effectively protect the lives and property of occupants in high-rise buildings, a remote building-disaster monitoring system based on a wireless sensor network was designed using PHP and MySQL database technology. The system collects environmental and structural parameters of the building in real time; the sensor data are analyzed by a disaster-judgement algorithm and the results are published on the Web, so users can log in to the monitoring system's web pages from mobile or fixed terminals to query building-disaster information and remotely control the monitoring system through the web interface.
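A minimal sketch of the data flow described above, written in Python with SQLite rather than the PHP/MySQL stack the paper uses; the table schema, field names, and hazard thresholds are illustrative assumptions, since the paper's disaster-judgement algorithm is not detailed here.

    import sqlite3, time

    # Illustrative thresholds; the paper's actual hazard-judgement algorithm is not specified here.
    THRESHOLDS = {"temperature_c": 70.0, "smoke_ppm": 300.0, "tilt_deg": 2.0}

    def store_reading(db, node_id, reading):
        """Persist one sensor reading and its hazard flag (assumed schema)."""
        hazard = any(reading.get(k, 0) > v for k, v in THRESHOLDS.items())
        db.execute(
            "INSERT INTO readings (node_id, ts, temperature_c, smoke_ppm, tilt_deg, hazard) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            (node_id, time.time(), reading.get("temperature_c"),
             reading.get("smoke_ppm"), reading.get("tilt_deg"), int(hazard)),
        )
        db.commit()
        return hazard

    def latest_status(db):
        """Return the most recent reading per node, e.g. for a web page to render."""
        cur = db.execute(
            "SELECT node_id, MAX(ts), temperature_c, smoke_ppm, tilt_deg, hazard "
            "FROM readings GROUP BY node_id")
        return cur.fetchall()

    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE readings (node_id TEXT, ts REAL, temperature_c REAL, "
               "smoke_ppm REAL, tilt_deg REAL, hazard INTEGER)")
    print(store_reading(db, "floor3-node7", {"temperature_c": 85.0, "smoke_ppm": 120.0}))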

2.
Identification and Acquisition of Web Medical Information Resources (Cited: 1; self-citations: 0; by others: 1)
何丽娟 《现代情报》2007,27(5):70-71
With Web medical information resources becoming ever more widespread and abundant, collecting them has become especially important. Based on an analysis of the characteristics of Web page documents, this paper discusses methods for identifying and acquiring Web medical information resources.

3.
赵哲  马晓珺 《科技通报》2014,(4):206-208
Using the collection order of Web pages together with the related information and topics of retrieved pages, a topic-based blocking Web crawler algorithm partitions and integrates as much of the Web as possible by topic, and URL positioning information is used to improve retrieval efficiency. Simulation experiments show that the proposed topic-focused crawler can cross the broken regions of URL pages in BBS sites, improving the recall of URL pages and preventing retrieval from terminating because of broken pages. Accuracy analysis shows that misjudged points hover near the bisecting line with small deviations, indicating that the algorithm's accuracy is high.
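The following is a hedged sketch of a topic-focused, best-first crawler frontier of the general kind the abstract describes; the topic vocabulary, the relevance scoring, and the fetch/extract_links callables are assumptions for illustration, not the paper's algorithm.

    import heapq
    import re

    TOPIC_TERMS = {"sensor", "network", "monitoring"}   # illustrative topic vocabulary

    def relevance(url, anchor_text):
        """Crude topic score from URL tokens and anchor text (stand-in for the paper's method)."""
        tokens = set(re.findall(r"[a-z]+", (url + " " + anchor_text).lower()))
        return len(tokens & TOPIC_TERMS) / (len(TOPIC_TERMS) or 1)

    def crawl(seeds, fetch, extract_links, max_pages=100, min_score=0.0):
        """Best-first crawl: always expand the most topic-relevant URL next.

        fetch(url) returns page text; extract_links(html, base) yields
        (absolute_url, anchor_text) pairs; both are assumed callables.
        """
        frontier = [(-1.0, url) for url in seeds]        # max-heap via negated scores
        heapq.heapify(frontier)
        seen, visited = set(seeds), []
        while frontier and len(visited) < max_pages:
            neg_score, url = heapq.heappop(frontier)
            if -neg_score < min_score:
                break                                    # rest of the frontier is off-topic
            html = fetch(url)
            visited.append(url)
            for link, anchor in extract_links(html, url):
                if link not in seen:
                    seen.add(link)
                    heapq.heappush(frontier, (-relevance(link, anchor), link))
        return visited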

4.
庞景安 《情报科学》2007,25(8):1171-1175
Building on a study of the small-world characteristics of the Web and the mechanisms by which small-world networks form, this paper analyzes the techniques and methods of applying webometric theory to the in-depth study of small-world characteristics among Web sites and Web pages, and compares the research results. It also discusses concrete applications of Web small-world characteristics in the networked information environment.
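As a small illustration of the standard small-world test used in such studies, the sketch below computes the clustering coefficient and average path length of a page-link graph and compares them with a random graph of the same size, using networkx; the toy edge list is an assumption.

    import networkx as nx

    def small_world_stats(edges):
        """Clustering coefficient and average path length of a page-link graph,
        compared with a random graph of the same size, the usual small-world test."""
        g = nx.Graph(edges)                      # treat hyperlinks as undirected for the test
        g = g.subgraph(max(nx.connected_components(g), key=len))
        n, m = g.number_of_nodes(), g.number_of_edges()
        rand = nx.gnm_random_graph(n, m, seed=0)
        rand = rand.subgraph(max(nx.connected_components(rand), key=len))
        return {
            "clustering": nx.average_clustering(g),
            "path_length": nx.average_shortest_path_length(g),
            "random_clustering": nx.average_clustering(rand),
            "random_path_length": nx.average_shortest_path_length(rand),
        }

    # Toy link data; a real study would use crawled site or page link graphs.
    edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e"), ("e", "a")]
    print(small_world_stats(edges))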

5.
Research on Web Information Knowledge Integration Using Domain Ontologies (Cited: 2; self-citations: 0; by others: 2)
李超  王兰成 《情报科学》2007,25(3):430-434
Without domain knowledge it is difficult to further improve the quality of Web information retrieval, and knowledge integration can play an important role here. This paper first analyzes the current state of Web users' information utilization and studies methods of domain ontology construction and knowledge integration; then, combining the characteristics of Web page documents with ontology knowledge, it presents a domain-ontology-based method for personalized integration of Web information that can improve the efficiency of Web information retrieval and of users' information utilization.

6.
李志义 《现代情报》2011,31(10):31-35
A Web crawler's page-fetching and optimization strategies directly affect the breadth and depth of page collection, the volume of page preprocessing, and the quality of the search engine. Search engine design should give full consideration to page-traversal strategies while also strengthening research on crawler optimization strategies. This paper proposes five optimization strategies for Web crawlers, covering topic focus, priority collection, duplicate-free collection, page revisiting, and distributed crawling, which offer guidance and inspiration for crawler design.
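A toy sketch of two of the five strategies, duplicate-free collection and adaptive page revisiting; the fingerprinting scheme and the halve/double revisit rule are illustrative assumptions, not the paper's exact policies.

    import hashlib
    import time

    class Frontier:
        """Toy crawl frontier illustrating duplicate-free collection (content
        fingerprints) and adaptive revisiting."""

        def __init__(self, base_interval=3600.0):
            self.fingerprints = set()      # hashes of page bodies already stored
            self.next_visit = {}           # url -> earliest next fetch time
            self.interval = {}             # url -> current revisit interval (seconds)
            self.base_interval = base_interval

        def is_duplicate(self, body):
            digest = hashlib.sha1(body.encode("utf-8")).hexdigest()
            if digest in self.fingerprints:
                return True
            self.fingerprints.add(digest)
            return False

        def record_visit(self, url, changed):
            """Shorten the revisit interval for pages that change, lengthen it otherwise."""
            cur = self.interval.get(url, self.base_interval)
            cur = cur / 2 if changed else min(cur * 2, 30 * 24 * 3600)
            self.interval[url] = cur
            self.next_visit[url] = time.time() + cur

        def due(self):
            now = time.time()
            return [u for u, t in self.next_visit.items() if t <= now]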

7.
An important property of the Internet is that the information on Web pages is constantly being updated. With Web information growing rapidly, predicting and detecting page updates has become a topic of wide concern. This paper introduces the Poisson model used to predict Web page updates, reviews the current state of page-update prediction algorithms in light of the model's various shortcomings, and looks ahead to future research directions.
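For reference, the basic Poisson update model assumes page changes arrive at rate λ, so the probability of at least one change within time t is 1 - e^(-λt). The sketch below shows that formula together with a naive rate estimator whose bias is one of the shortcomings such surveys discuss; the visit history is toy data.

    import math

    def change_probability(lam, t):
        """Poisson update model: probability that a page has changed at least once
        within t time units, given change rate lam (changes per unit time)."""
        return 1.0 - math.exp(-lam * t)

    def estimate_rate(observations):
        """Naive rate estimate: observed changes divided by total elapsed time.
        observations is a list of (interval_length, changed) pairs from repeated visits;
        this simple estimator is known to be biased when changes can be missed,
        which is one of the model's shortcomings noted above."""
        changes = sum(1 for _, changed in observations if changed)
        total = sum(dt for dt, _ in observations)
        return changes / total if total else 0.0

    obs = [(1.0, False), (1.0, True), (2.0, True), (1.0, False)]   # toy visit history (days)
    lam = estimate_rate(obs)
    print(lam, change_probability(lam, 3.0))   # chance of a change within the next 3 days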

8.
With the innovation and widespread adoption of information technology and computer networking, Web page types have diversified, the network environment has grown increasingly complex, and the importance of website security has become ever more prominent. Taking Web page security as its research object, this paper starts from common Web page security problems, proposes Web page security strategies, and briefly analyzes how to implement them, with the aim of reducing the risks of page operation and strengthening website security.

9.
In view of the actual situation of network security in China, this paper presents a system model for acquiring and exploiting key information on the Web. Through research on Web search engines, machine translation, and the automatic construction of corpus databases, it discusses building a machine-translation-based, cross-language active defense model for network information security. With this model, key information in other countries' networks can be acquired and exploited, so as to contend for the rights of information acquisition, control, and use.

10.
Through in-depth analysis and identification of page structure and content features, this paper studies and experiments with methods for filtering noise Web pages. Pages with obvious noise characteristics are first filtered out with thresholds; page feature vectors are then built and an SVM is used to classify the remaining pages. Experiments are carried out on page data collected from the Web, research conclusions are drawn, and future work is outlined.
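A minimal sketch of the two-stage filtering idea, a threshold pre-filter followed by an SVM classifier (scikit-learn); the threshold features, their cutoffs, and the TF-IDF representation are assumptions for illustration rather than the paper's exact feature set.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import LinearSVC

    def obvious_noise(page):
        """Threshold pre-filter for pages with obvious noise traits (thresholds are illustrative)."""
        return page["link_count"] > 200 or len(page["text"]) < 30

    def train_filter(pages, labels):
        """Train an SVM on the pages that survive the threshold filter.
        pages are dicts with 'text' and 'link_count'; labels: 1 = noise, 0 = content."""
        kept = [(p, y) for p, y in zip(pages, labels) if not obvious_noise(p)]
        texts = [p["text"] for p, _ in kept]
        ys = [y for _, y in kept]
        model = make_pipeline(TfidfVectorizer(max_features=5000), LinearSVC())
        model.fit(texts, ys)
        return model

    def is_noise(model, page):
        return obvious_noise(page) or model.predict([page["text"]])[0] == 1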

11.
Broken hypertext links are a frequent problem on the Web. Sometimes the page a link points to has disappeared forever, but in many other cases the page has simply been moved to another location on the same web site or to another one. In some cases the page, besides being moved, is updated, becoming somewhat different from the original but still rather similar. In all these cases it can be very useful to have a tool that provides pages highly related to the broken link, so that the most appropriate one can be selected. The relationship between the broken link and its possible replacement pages can be defined as a function of many factors. In this work we have employed several resources, both in the context of the link and in the Web at large, to look for pages related to a broken link. From the context of the link we have analyzed several sources of information, such as the anchor text, the text surrounding the anchor, the URL, and the page containing the link. We have also extracted information about a link from the Web infrastructure, such as search engines, Internet archives, and social tagging systems. We have combined all of these resources to design a system that recommends pages that can be used to recover the broken link. A novel methodology is presented to evaluate the system without resorting to user judgments, thus increasing the objectivity of the results and helping to adjust the parameters of the algorithm. We have also compiled a web page collection with true broken links, which has been used to test the full system by human assessors.
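The sketch below illustrates one simple way to combine context resources of the kind listed above, scoring candidate pages by term overlap with the anchor text, surrounding text, and URL tokens; the profile construction and scoring are assumptions for illustration, not the authors' system.

    import re
    from collections import Counter

    def terms(text):
        return Counter(re.findall(r"[a-z0-9]+", text.lower()))

    def context_profile(anchor_text, surrounding_text, url):
        """Build a term profile of the broken link from its local context,
        roughly following the kinds of sources discussed above."""
        url_tokens = " ".join(re.findall(r"[a-z0-9]+", url.lower()))
        profile = terms(anchor_text)
        profile += terms(surrounding_text)
        profile += terms(url_tokens)
        return profile

    def score(profile, candidate_text):
        """Simple weighted term-overlap score between the link context and a candidate page."""
        cand = terms(candidate_text)
        overlap = sum(min(profile[t], cand[t]) for t in profile)
        return overlap / (sum(profile.values()) or 1)

    def recommend(profile, candidates, top_k=5):
        """Rank candidate pages (e.g. fetched from a search engine or a web archive)
        by similarity to the broken link's context."""
        ranked = sorted(candidates.items(), key=lambda kv: score(profile, kv[1]), reverse=True)
        return [url for url, _ in ranked[:top_k]]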

12.
A Vision-Based Block Segmentation Method for Web Information Extraction (Cited: 1; self-citations: 0; by others: 1)
With the wide adoption of the browser/server architecture and dynamic Web page technology, research on fast and accurate information extraction from Web pages has become a hot topic. Taking into account how dynamic pages are generated and the shortcomings of existing extraction methods, a vision-based block segmentation method for Web information extraction is proposed.

13.
One major approach to information finding on the WWW is to navigate through Web directories and browse them until the goal pages are found. However, such directories are generally constructed manually and may suffer from narrow coverage and inconsistency. Besides, most existing directories provide only monolingual hierarchies that organize Web pages in terms a user may not be familiar with. In this work, we propose an approach that automatically arranges multilingual Web pages into a multilingual Web directory to break the language barriers in Web navigation. In this approach, a self-organizing map is trained on each set of monolingual Web pages to obtain two feature maps, which reveal the relationships among Web pages and thematic keywords, respectively, for that language. We then apply a hierarchy generation process on these maps to obtain the monolingual hierarchy for the Web pages. A hierarchy alignment method is then applied to these monolingual hierarchies to discover the associations between nodes in different hierarchies. Finally, a multilingual Web directory is constructed according to these associations. We applied the proposed approach to a set of Web pages and obtained interesting results that demonstrate the feasibility of our method for multilingual Web navigation.
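A compact, self-contained SOM trainer is sketched below to illustrate the per-language feature maps the abstract mentions; the grid size, learning-rate schedule, and term-frequency input representation are assumptions, and the hierarchy generation and alignment steps are not shown.

    import numpy as np

    def train_som(data, grid=(6, 6), iters=2000, lr0=0.5, sigma0=2.0, seed=0):
        """Train a small self-organizing map and return the codebook plus each
        document's winning grid cell, the basis of the per-language feature maps
        described above. Documents are assumed to be term-frequency vectors."""
        rng = np.random.default_rng(seed)
        h, w = grid
        codebook = rng.random((h, w, data.shape[1]))
        ys, xs = np.mgrid[0:h, 0:w]
        for t in range(iters):
            x = data[rng.integers(len(data))]
            dist = ((codebook - x) ** 2).sum(-1)
            bi, bj = np.unravel_index(dist.argmin(), dist.shape)
            lr = lr0 * (1 - t / iters)
            sigma = sigma0 * (1 - t / iters) + 1e-3
            neigh = np.exp(-((ys - bi) ** 2 + (xs - bj) ** 2) / (2 * sigma ** 2))
            codebook += lr * neigh[..., None] * (x - codebook)
        winners = [np.unravel_index(((codebook - d) ** 2).sum(-1).argmin(), grid) for d in data]
        return codebook, winners

    docs = np.random.default_rng(1).random((30, 50))   # toy term-frequency vectors
    codebook, winners = train_som(docs)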

14.
Topic distillation is one of the main information needs when users search the Web. Previous approaches to topic distillation treat a single page as the basic searching unit, which does not fully utilize the structural information of the Web. In this paper, we propose a novel concept for topic distillation, named sub-site retrieval, in which the basic searching unit is the sub-site instead of the single page. A sub-site is a subset of a website, consisting of a structural collection of pages. The key issues in sub-site retrieval are (1) extracting effective features for the representation of a sub-site using both content and structure information, and (2) delivering the sub-site-based retrieval results with a friendly and informative user interface. For the first point, we propose the Punished Integration algorithm, which is based on modeling the growth of websites. For the second, we design a user interface to better illustrate the search results of sub-site retrieval. Tested on the topic distillation tasks of TREC 2003 and 2004, sub-site retrieval leads to significant improvement in retrieval performance over previous methods based on single pages. Furthermore, time complexity analysis shows that sub-site retrieval can be integrated into the index component of search engines.
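Since the Punished Integration algorithm is not specified in the abstract, the sketch below substitutes a simple stand-in: pages are grouped into candidate sub-sites by URL prefix and per-page scores are combined with a damped sum so that many weak pages cannot dominate; all of this is an assumption for illustration.

    from collections import defaultdict
    from urllib.parse import urlparse

    def subsite_key(url, depth=1):
        """Group pages into candidate sub-sites by host plus the first path segment(s).
        This is a crude stand-in; the paper derives sub-sites from site structure."""
        p = urlparse(url)
        segments = [s for s in p.path.split("/") if s][:depth]
        return p.netloc + "/" + "/".join(segments)

    def rank_subsites(page_scores, alpha=0.75):
        """Aggregate per-page relevance scores into sub-site scores.
        The damped sum (0 < alpha < 1) penalizes piling up many weak pages,
        an assumed substitute for the Punished Integration algorithm."""
        groups = defaultdict(list)
        for url, s in page_scores.items():
            groups[subsite_key(url)].append(s)
        ranked = []
        for key, scores in groups.items():
            scores.sort(reverse=True)
            total = sum(s * (alpha ** i) for i, s in enumerate(scores))
            ranked.append((key, total))
        return sorted(ranked, key=lambda kv: kv[1], reverse=True)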

15.
This article presents conceptual navigation and NavCon, an architecture that implements this navigation in World Wide Web pages. The NavCon architecture uses ontology as metadata to contextualize the user's search for information. Based on ontologies, NavCon automatically inserts conceptual links into Web pages. By using these links, the user may navigate a graph representing ontology concepts and their relationships. By browsing this graph, it is possible to reach documents associated with the ontology concept the user wants. We call this Web navigation supported by ontology concepts conceptual navigation. Conceptual navigation is a technique for browsing Web sites within a context. The context filters the relevant retrieved information and drives user navigation through paths that meet the user's needs. A company may implement conceptual navigation to improve users' search for information in a knowledge management environment. We suggest that using an ontology to guide navigation in an intranet may help users gain a better understanding of the knowledge structure of the company.
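A toy sketch of the conceptual-link insertion step: occurrences of ontology concept labels in a page are wrapped with links into a concept graph. The tiny ontology and the /concepts URL pattern are hypothetical, not NavCon's actual implementation.

    import re

    # Toy ontology: concept label -> related concepts (the real system uses richer relations).
    ONTOLOGY = {
        "ontology": ["metadata", "knowledge management"],
        "metadata": ["ontology"],
        "knowledge management": ["ontology"],
    }

    def insert_conceptual_links(html, concept_url="/concepts?c={}"):
        """Wrap occurrences of known concept labels with links into the concept graph.
        The URL pattern is an assumption for illustration."""
        for label in sorted(ONTOLOGY, key=len, reverse=True):   # longest labels first
            pattern = re.compile(r"\b(%s)\b" % re.escape(label), re.IGNORECASE)
            target = concept_url.format(label.replace(" ", "+"))
            html = pattern.sub(lambda m: '<a href="%s">%s</a>' % (target, m.group(1)), html)
        return html

    def related(concept):
        """Neighbors of a concept in the ontology graph, i.e. the next navigation step."""
        return ONTOLOGY.get(concept, [])

    print(insert_conceptual_links("Metadata supports knowledge management in intranets."))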

16.
Research on Personalized Services Based on Web Mining (Cited: 8; self-citations: 0; by others: 8)
This paper discusses personalized services based on Web mining and proposes algorithms for user clustering, Web page clustering, discovery of users' frequent access paths, and access path optimization within a Web-mining-based personalization framework. The personalized information obtained with these algorithms can accurately capture users' interest patterns and effectively update the organization of Web information resources, thereby improving the efficiency of network information services and providing users with adaptive, intelligent, "one-to-one" personalized services.
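A minimal sketch of one of the listed components, frequent access-path discovery, counting consecutive page sequences across sessions against a support threshold; the session data and parameters are illustrative, and the clustering and path-optimization algorithms are not shown.

    from collections import Counter

    def frequent_paths(sessions, length=3, min_support=2):
        """Count consecutive page sequences of a given length across user sessions
        and keep those meeting a minimum support: a simple frequent-path miner."""
        counts = Counter()
        for pages in sessions:
            for i in range(len(pages) - length + 1):
                counts[tuple(pages[i:i + length])] += 1
        return {path: c for path, c in counts.items() if c >= min_support}

    sessions = [
        ["/", "/news", "/news/42", "/contact"],
        ["/", "/news", "/news/42", "/news/43"],
        ["/about", "/", "/news", "/news/42"],
    ]
    print(frequent_paths(sessions))   # ('/', '/news', '/news/42') appears in all three sessions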

17.
The Web, and especially major Web search engines, are essential tools in the quest to locate online information for many people. This paper reports results from research that examines characteristics of and changes in Web searching across nine studies of five Web search engines based in the US and Europe. We compare interactions occurring between users and Web search engines from the perspectives of session length, query length, query complexity, and content viewed. The results of our research show that (1) users are viewing fewer result pages, (2) searchers on US-based Web search engines use more query operators than searchers on European-based search engines, (3) there are statistically significant differences in the use of Boolean operators and result pages viewed, and (4) one cannot necessarily apply results from studies of one particular Web search engine to another. The widespread use of Web search engines, employment of simple queries, and decreased viewing of result pages may have resulted from algorithmic enhancements by Web search engine companies. We discuss the implications of the findings for the development of Web search engines and the design of online content.
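As an illustration of the kind of log metrics compared in such studies, the sketch below computes mean query length, the share of queries using Boolean operators, and mean result pages viewed; the log record format and the operator set are assumptions.

    from statistics import mean

    BOOLEAN_OPS = {"AND", "OR", "NOT", "+", "-"}    # operator set is an assumption

    def log_statistics(records):
        """Per-log metrics of the kind compared above: mean terms per query,
        share of queries using Boolean operators, and mean result pages viewed.
        records are dicts with 'query' and 'pages_viewed'."""
        term_counts, op_flags, pages = [], [], []
        for r in records:
            tokens = r["query"].split()
            term_counts.append(len([t for t in tokens if t.upper() not in BOOLEAN_OPS]))
            op_flags.append(any(t.upper() in BOOLEAN_OPS for t in tokens))
            pages.append(r["pages_viewed"])
        return {
            "mean_terms_per_query": mean(term_counts),
            "pct_queries_with_operators": 100.0 * sum(op_flags) / len(op_flags),
            "mean_result_pages_viewed": mean(pages),
        }

    log = [
        {"query": "web search engines", "pages_viewed": 1},
        {"query": "jaguar AND cars", "pages_viewed": 2},
    ]
    print(log_statistics(log))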

18.
The exponential growth of information available on the World Wide Web, and retrievable by search engines, has made it necessary to develop efficient and effective methods for organizing relevant content. In this field, document clustering plays an important role and remains an interesting and challenging problem in web computing. In this paper we present a document clustering method that takes into account both content information and the hyperlink structure of a web page collection, where a document is viewed as a set of semantic units. We exploit this representation to determine the strength of the relation between two linked pages and to define a relational clustering algorithm based on a probabilistic graph representation. The experimental results show that the proposed approach, called RED-clustering, outperforms two of the best-known clustering algorithms, k-Means and Expectation Maximization.
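The sketch below shows the general idea of combining content and link evidence, mixing TF-IDF cosine similarity with a hyperlink bonus and clustering the resulting affinity matrix; it is not the RED-clustering algorithm itself, and the mixing weight and toy data are assumptions.

    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity
    from sklearn.cluster import SpectralClustering

    def cluster_pages(texts, links, n_clusters=2, link_weight=0.5):
        """Cluster pages using both content and hyperlink structure: cosine similarity
        of TF-IDF vectors, boosted where a hyperlink connects two pages.
        (A simple affinity-combination sketch, not RED-clustering itself.)"""
        tfidf = TfidfVectorizer().fit_transform(texts)
        sim = cosine_similarity(tfidf)
        link_matrix = np.zeros_like(sim)
        for i, j in links:                      # links: list of (page_index, page_index) pairs
            link_matrix[i, j] = link_matrix[j, i] = 1.0
        affinity = (1 - link_weight) * sim + link_weight * link_matrix
        model = SpectralClustering(n_clusters=n_clusters, affinity="precomputed", random_state=0)
        return model.fit_predict(affinity)

    texts = ["web search engines and queries", "search engine query logs",
             "football match results", "football league table"]
    links = [(0, 1), (2, 3)]
    print(cluster_pages(texts, links))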

19.
Stochastic simulation has been very effective in many domains but has never been applied to the WWW. This study is a first in using neural networks in stochastic simulation of the number of rejected Web pages per search query. The evaluation of the quality of search engines should involve not only the resulting set of Web pages but also an estimate of the rejected set of Web pages. The iterative radial basis function (RBF) neural network developed by Meghabghab and Nasr [Iterative RBF neural networks as meta-models for stochastic simulations, in: Second International Conference on Intelligent Processing and Manufacturing of Materials, IPMM’99, Honolulu, Hawaii, 1999, pp. 729–734] was adapted to the evaluation of the number of rejected Web pages on four search engines, i.e., Yahoo, Alta Vista, Google, and Northern Light. Nine input variables were selected for the simulation: (1) precision, (2) overlap, (3) response time, (4) coverage, (5) update frequency, (6) Boolean logic, (7) truncation, (8) word and multi-word searching, and (9) portion of the Web pages indexed. Typical stochastic simulation meta-modeling uses regression models in response surface methods. RBF networks become a natural target for such an attempt because they use a family of surfaces, each of which naturally divides an input space into two regions X+ and X−, and the n patterns for testing will be assigned either class X+ or X−. This technique divides the resulting set of responses to a query into accepted and rejected Web pages. To test the hypothesis that the evaluation of any search engine query should involve an estimate of the number of rejected Web pages as part of the evaluation, the RBF meta-model was trained on 937 examples from a set of 9000 different simulation runs on the nine input variables. Results show that two of the variables, response time and portion of the Web indexed, can be eliminated without affecting evaluation results. Results also show that the number of rejected Web pages for a specific set of search queries on these four engines is very high. Moreover, a goodness measure of a search engine for a given set of queries can be designed as a function of the coverage of the search engine and the normalized age of a new document in the result set for the query. This study concludes that unless search engine designers address the issue of rejected Web pages, indexing, and crawling, the use of the Web as a research tool for academic and educational purposes will remain hindered.
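A generic Gaussian RBF network fit by least squares is sketched below as a stand-in for the meta-model idea; it is not the iterative RBF network of Meghabghab and Nasr, and the synthetic inputs and target are purely illustrative.

    import numpy as np

    class RBFNet:
        """Minimal Gaussian RBF network fit by least squares (a generic sketch,
        not the iterative RBF meta-model of Meghabghab and Nasr)."""

        def __init__(self, n_centers=10, width=1.0):
            self.n_centers, self.width = n_centers, width

        def _phi(self, X):
            d2 = ((X[:, None, :] - self.centers[None, :, :]) ** 2).sum(-1)
            return np.exp(-d2 / (2 * self.width ** 2))

        def fit(self, X, y):
            rng = np.random.default_rng(0)
            idx = rng.choice(len(X), size=min(self.n_centers, len(X)), replace=False)
            self.centers = X[idx]
            Phi = np.hstack([self._phi(X), np.ones((len(X), 1))])   # add bias column
            self.w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
            return self

        def predict(self, X):
            Phi = np.hstack([self._phi(X), np.ones((len(X), 1))])
            return Phi @ self.w

    # Toy usage: inputs would be the simulation variables (precision, overlap, coverage, ...),
    # the target the number of rejected pages observed in the simulation runs.
    X = np.random.default_rng(1).random((200, 7))
    y = 1000 * (1 - X[:, 3]) + 50 * X[:, 0]          # synthetic target for illustration
    model = RBFNet(n_centers=20, width=0.5).fit(X, y)
    print(model.predict(X[:3]))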

20.