Similar literature
 20 similar documents found; search took 15 ms
1.
In information retrieval (IR), improving effectiveness often sacrifices the stability of an IR system. To evaluate stability, many risk-sensitive metrics have been proposed. Owing to theoretical limitations, current work studies effectiveness and stability separately and has not explored the effectiveness–stability tradeoff. In this paper, we propose a Bias–Variance Tradeoff Evaluation (BV-Test) framework, based on the bias–variance decomposition of the mean squared error, to measure both the overall performance of a system (considering effectiveness and stability together) and its tradeoff between effectiveness and stability. In this framework, we define generalized bias–variance metrics for two set-ups: the Cranfield-style experiment in which the document collection is fixed (across topics), and the set-up in which the document collection is a sample (per-topic). Compared with risk-sensitive evaluation methods, our work not only measures the effectiveness–stability tradeoff of a system but also tracks the source of system instability. Experiments on the TREC Ad-hoc track (1993–1999) and Web track (2010–2014) show a clear effectiveness–stability tradeoff both across topics and per-topic, and show that topic grouping and max–min normalization can effectively reduce the bias–variance tradeoff. Experimental results on the TREC Session track (2010–2012) also show that query reformulation and an increase in user data benefit effectiveness and stability simultaneously.
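
The decomposition that the abstract above builds on can be illustrated with a short sketch. This is not the BV-Test framework itself: it only shows the textbook identity MSE = bias² + variance applied to a list of per-topic scores. The function name `bias_variance`, the example scores, and the choice of 1.0 as the target (ideal) score are assumptions made for illustration.

```python
from statistics import fmean


def bias_variance(scores, target=1.0):
    """Split the mean squared error of per-topic scores into bias^2 + variance.

    `scores` are per-topic effectiveness values (e.g. average precision);
    `target` is the assumed ideal score. Both are illustrative assumptions.
    """
    mean = fmean(scores)
    bias_sq = (mean - target) ** 2                       # distance of the mean from the target
    variance = fmean((s - mean) ** 2 for s in scores)    # spread across topics (instability)
    mse = fmean((s - target) ** 2 for s in scores)       # overall error against the target
    return bias_sq, variance, mse


# Hypothetical per-topic scores for one system.
bias_sq, variance, mse = bias_variance([0.2, 0.4, 0.6, 0.8])
print(bias_sq, variance, mse)
```

Under these assumptions, a system with a high mean score but widely varying per-topic scores shows a small bias term and a large variance term, which is the effectiveness–stability tradeoff the abstract refers to.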

2.
Query model estimation is an important task in language modeling (LM) approaches to information retrieval (IR). The ideal estimate should be not only effective, in terms of high mean retrieval performance over all queries, but also stable, in terms of low variance of retrieval performance across different queries. In practice, however, improving effectiveness can sacrifice stability, and vice versa. In this paper, we propose to study this tradeoff from a new perspective: the bias–variance tradeoff, a fundamental concept in statistics. We formulate the notion of bias–variance with respect to the retrieval performance and estimation quality of query models. We then investigate several estimated query models, analyzing when and why the bias–variance tradeoff occurs and how bias and variance can be reduced simultaneously. A series of experiments on four TREC collections has been conducted to systematically evaluate our bias–variance analysis. Our approach and results could form an analysis framework and a novel evaluation strategy for query language modeling.

3.
We investigate how, and to what extent, the morphological complexity of a language influences text classification using support vector machines (SVM). The Croatian–English parallel corpus provides the basis for a direct comparison of two languages of radically different morphological complexity. We quantified, compared, and statistically tested the effects of morphological normalisation on SVM classifier performance in a series of parallel experiments on both languages, carried out over a wide range of feature subset sizes obtained by different feature selection methods, and applying different levels of morphological normalisation. We also quantified the trade-off between feature space size and performance for different levels of morphological normalisation, and compared the results for both languages. Our experiments show that the improvements in SVM classifier performance are statistically significant; they are greater for small and medium numbers of features, especially for Croatian, whereas for large numbers of features the improvements are rather small and may be negligible in practice for both languages.

4.
The paper introduces a new method for the visualization of information retrieval. Angle attributes of a document are used to construct the angle–angle-based visual space. The retrieved documents are perceived, several traditional information retrieval evaluation models are visualized and interpreted, and new non-traditional retrieval control means based on the model are explored in the two-dimensional angle display space. The impacts of different metrics on the visualization of information retrieval are discussed. Ambiguity, future research directions and other relevant issues are also addressed.

5.
6.
Research on collaborative information retrieval (CIR) has shown positive impacts of collaboration on retrieval effectiveness in the case of complex and/or exploratory tasks. The synergic effect of accomplishing something greater than the sum of its individual components is reached by bringing together collaborators’ complementary skills. However, these approaches often fail to consider that collaborators might refine their skills and actions throughout the search session, and that a flexible system mediation guided by collaborators’ behaviors should dynamically adapt to this situation in order to optimize search effectiveness. In this article, we propose a new unsupervised collaborative ranking algorithm which leverages collaborators’ actions for (1) mining their latent roles in order to extract their complementary search behaviors; and (2) ranking documents with respect to the latent roles of collaborators. Experiments using two user studies, with 25 and 10 pairs of collaborators respectively, demonstrate the benefit of such an unsupervised method driven by collaborators’ behaviors throughout the search session. A qualitative analysis of the identified latent roles is also proposed to explain an over-learning effect noticed in one of the datasets.

7.
Emotions are an integral component of all human activities, including human–computer interactions. This article reviews literature on the theories of emotions, methods for studying emotions, and their role in human information behaviour. It also examines current research on emotions in library and information science, information retrieval and human–computer interaction, and outlines some of the challenges and directions for future work.

8.
In this paper we present an experimental study which investigates whether the need for information specialists serving knowledge workers in knowledge industries can be projected on the basis of an average university library serving their counterparts at a university. Information management functions, i.e. functions and processes related to information evaluation, acquisition, metadata creation, etc., performed in an average university library are the starting point of this investigation. The fundamental assumption is that these functions occur not only in libraries but also in other contexts, for instance in knowledge industries. As a consequence, we try to estimate the need for information professionals in knowledge industries by means of quantitative methods from library and information science (Library Planning Model) and economics (input-output analysis, occupational analysis). Our study confirms the validity of our assumption: the number of information specialists projected on the basis of university libraries is consistent with their actual number reported in national statistics. However, in order to attain a close fit, we had to revise the original research model by dropping the division of information specialists into reader services and technical services staff.

9.
Organisations increasingly realise that they must transform into true digital enterprises to create competitive advantage and ensure corporate survival. However, many organisations do not realise that successful digital transformation (DT) requires much more than technology; it can only succeed if they manage their data, information and knowledge as true business assets. This paper describes collaborative research conducted by academic and industry partners, a mutually beneficial journey spanning the past ten years. The aim was to develop a Holistic Information Asset Management (HIAM) model indicating the important areas of information asset management (IAM) that support the DT journey. Interviews were conducted with C-level executives in organisations from all industries on three continents to investigate their IAM practices, the barriers to good IAM and the benefits of managing information assets (IAs) well. This paper proposes that organisations should focus on ten domains in their quest for effective IAM: i) business benefits, ii) business environment, iii) executive awareness, iv) leadership and management, v) information environment, vi) information systems, vii) information behaviour, viii) information attributes/quality, ix) information performance and x) justification.

10.
11.
Cross-genre author profiling aims to build generalized models for predicting profile traits of authors that can be helpful across different text genres for computer forensics, marketing, and other applications. The cross-genre author profiling task becomes challenging when dealing with low-resourced languages, owing to the lack of standard corpora and methods. The task becomes even more challenging when the data is code-switched, which makes it informal and unstructured. In previous studies, the problem of cross-genre author profiling has been mainly explored for mono-lingual texts in highly resourced languages (English, Spanish, etc.). However, it has not been thoroughly explored for code-switched text, which is widely used for communication over social media. To fill this gap, we propose a transfer learning-based solution for the cross-genre author profiling task on code-switched (English–RomanUrdu) text using three widely known genres: Facebook comments/posts, Tweets, and SMS messages. In this article, we first experiment with traditional machine learning, deep learning and pre-trained transfer learning models (MBERT, XLMRoBERTa, ULMFiT, and XLNET) for the same-genre and cross-genre gender identification task. We then propose a novel Trans-Switch approach that focuses on the code-switching nature of the text and trains on specialized language models. In addition, we developed three RomanUrdu-to-English translated corpora to study the impact of translation on author profiling tasks. The results show that the proposed Trans-Switch model outperforms the baseline deep learning and pre-trained transfer learning models for the cross-genre author profiling task on code-switched text. The experiments also show that translating RomanUrdu text does not improve results.

12.
Taxonomy is widely used in many website and directory navigation schemes for content/knowledge retrieval. However, information or content navigation support through taxonomy is often constrained by its inability to take into account the full nomenclature and cultural nuances of knowledge seekers. The emergence and increasing adoption of collaborative tagging (social bookmarking) tools have provided lightweight and informal conceptual structures, called folksonomies, for knowledge retrieval. Folksonomies reflect the vocabulary of the users. Hence, integrating folksonomies into a taxonomy combines the best of the two schemes, as the resultant structure enhances taxonomy navigation with personalisation for knowledge search and retrieval. This paper presents TaxoFolk, an algorithm for deriving a hybrid taxonomy-folksonomy classification for enhanced knowledge navigation. The algorithm integrates a folksonomy with a taxonomy through several unsupervised data mining techniques with augmented heuristics.

13.
Within 18 months an online system has been installed, work routines established and a database of over 8 000 records created. This alone represents an enormous improvement over the 1 000 or so card catalogue entries that could have been created in the same period. The points of access and the range of physical formats which can be retrieved also represent substantially increased effectiveness and efficiency in information recall. It must be noted, however, that the short-term funding for the project, and hence the need to become self-funding, has resulted in the concentration of staff time on current record input, which in turn supports those services available to external subscribers. Further staff time will therefore have to be identified if retrospective conversion is to be achieved within a reasonable timescale. During the same 18-month period, information services have been enhanced in-house and extended to a growing external user group. The resultant demand has necessitated the appointment of an additional professional librarian, funded primarily from generated income. With the introduction of direct online access to the NHS and the potential mounting of HELMIS as a commercial database, we would expect to see increasing demand during 1986.

14.
Statistics show that the number of identity theft victims in the US increased by 12% in 2009, to 11.1 million adults, while the total annual fraud amount increased by 12.5%, to $54 billion. As the e-commerce volume is increasing and various online services are becoming more popular, the number of sites to which an average Internet user subscribes is increasing rapidly. Given the limited memory capacity of human beings, an Internet user's login credentials (in the form of a combination of a user ID and a password) are usually reused over multiple accounts, which can cause significant security problems. In this study, we address the vulnerability of login credentials. First, based on a unique Internet user data set, we analyze the behavioral characteristics of login credentials usage. We find that the same login credentials are used for many more accounts and reused much more often than previously expected. Furthermore, usage patterns are found to be quite skewed. Second, building on a network perspective of login credentials usage, we suggest a vulnerability measure of an individual's login credentials and analyze the vulnerability of current Internet users. The resulting information is valuable not only to the research community but also to managers and policy makers striving to reduce security vulnerability.

15.
16.
17.
Finite-time convergence based on robust synergetic control (SC) theory and terminal attractor techniques is investigated. To this end, a fast terminal synergetic control (FTSC) law is applied to drive a DC–DC buck converter, both in simulation and on a dSpace-based experimental setup, to validate the approach. As robust as sliding mode control, the synergetic approach used is chattering-free and provides rapid convergence. The efficacy of the proposed fast terminal synergetic controller is tested for step load changes and output voltage variations, and the results are compared with those of classical synergetic and PI control. Experimental validation using a dSpace DS1104 confirms the results obtained in simulation, showing the soundness of this approach compared to synergetic and PI controllers.

18.
To perform repetitive tasks, this paper proposes an adaptive boundary iterative learning control (ILC) scheme for a two-link rigid–flexible manipulator with parametric uncertainties. Using Hamilton's principle, the coupled ordinary differential equation and partial differential equation (ODE–PDE) dynamic model of the system is established. In order to drive the joints to follow the desired trajectory and simultaneously eliminate deformation of the flexible beam, a boundary control strategy is added to the conventional joint torque control. The adaptive iterative learning algorithm for the boundary control scheme includes a proportional-derivative (PD) feedback structure and an iterative term. This novel controller is designed to deal with unmodeled dynamics and other unknown external disturbances. Numerical simulations in MATLAB are provided to verify the performance of the proposed controller.
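
The idea of an iterative term that improves the control input from one repetition of a task to the next can be sketched on a toy problem. The code below is a generic P-type ILC update on a scalar discrete-time plant, not the adaptive boundary ILC scheme or the ODE–PDE manipulator model of the abstract above. The plant `x(t+1) = a*x(t) + b*u(t)`, its parameters, the learning gain, and the reference trajectory are all assumptions chosen for illustration.

```python
import math


def simulate(u, a=0.5, b=1.0):
    """Run the assumed plant x(t+1) = a*x(t) + b*u(t) from x(0) = 0.

    Returns the outputs [x(1), ..., x(N)], so output i is the first one
    that input u[i] can influence.
    """
    x, ys = 0.0, []
    for ut in u:
        x = a * x + b * ut
        ys.append(x)
    return ys


def ilc(y_ref, trials, gain=1.0):
    """P-type ILC: after each trial, add gain * tracking error to the input."""
    u = [0.0] * len(y_ref)
    history = []  # largest absolute tracking error of each trial
    for _ in range(trials):
        y = simulate(u)
        err = [r - yt for r, yt in zip(y_ref, y)]
        history.append(max(abs(e) for e in err))
        # Learning step: the input for the next trial reuses this trial's error.
        u = [ut + gain * e for ut, e in zip(u, err)]
    return u, history


# Hypothetical reference trajectory over 10 time steps.
y_ref = [math.sin(0.3 * (t + 1)) for t in range(10)]
u_final, history = ilc(y_ref, trials=15)
print(history[0], history[-1])
```

With the assumed gain equal to 1/b, each trial makes one more time step track exactly, so the error vanishes after as many trials as there are time steps. Smaller gains trade that speed for tolerance to an imperfectly known `b`.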

19.
We consider the problem of controlling a model of one-dimensional fluid flow through a soil-packed tube in which a contaminant is initially distributed. A fluid is pumped through the tube to remove the contaminant. The control problem is to determine the optimal convective velocity of the pumped fluid by minimizing a given performance criterion. The performance criterion is chosen to be a combination of the total contaminant at the final time and the cost of the control. A set of orthogonal Fourier trigonometric series is used as the basis functions of the Galerkin procedure to lump the distributed parameter system. A Legendre wavelet operational matrix of derivative is used to approximate the control and modal state variables. The main characteristic of this technique is that it reduces such problems to solving a system of algebraic equations, thus greatly simplifying them. The effectiveness of the proposed approach is illustrated numerically and the results are quite satisfactory.

20.
This paper deals with an application of Sliding Mode Control (SMC) in the presence of lumped temperature disturbances in Peltier Cells (PCs). A controller is proposed that reduces temperature disturbances. A constructive theorem based on a particular sliding mode surface is proven using a Lyapunov approach. Specifically, the theorem yields a structural control law which consists of an interacting input law between the two available inputs: current and the forced heat convection transfer function (ventilation input). The sliding mode surface is defined in terms of cold and hot state variables, emphasizing a new two-sided control approach for PCs. In terms of applications, localization problems are very important for minimizing errors in all cases in which PCs are used. The application targeted is a novel workpiece clamping device, which uses PCs to freeze water on a metallic plate at subzero temperatures. The ice structure is capable of developing enough bonding strength to clamp workpieces form- and force-fitted during machining operations without deforming the piece mechanically. This capability is especially crucial for micro parts as well as for hard-to-clamp pieces made of brittle or soft materials and irregularly shaped complex geometries. The proposed SMC approach shows robustness against the parametric uncertainties arising from the nonlinear model of PCs. Computer simulation results as well as measurements are shown.
