Similar literature
1.
In image retrieval, most systems lack user-centred evaluation because they are assessed against a chosen ground truth dataset. Precision and recall measured against that ground truth are treated as an acceptable surrogate for the judgment of real users. Much current research focuses on automatically assigning keywords to images to enhance retrieval effectiveness. However, evaluation methods are usually based on system-level assessment, e.g. classification accuracy on a chosen ground truth dataset. In this paper, we present a qualitative evaluation methodology for automatic image indexing systems. The automatic indexing task is formulated as one of image annotation, or automatic metadata generation for images. The evaluation is composed of two individual methods. First, the automatic annotation results are assessed by human subjects. Second, the subjects are asked to annotate a chosen set of test images, and their annotations are used as ground truth; the system is then run on this test set and its annotations are judged against that ground truth. For most systems on which user-centred evaluation is conducted, only one of these methods is reported. We believe that both methods need to be considered for a full evaluation. We also provide an example evaluation of our system based on this methodology. According to this study, our proposed evaluation methodology is able to provide deeper understanding of the system’s performance.
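As an illustration of the precision and recall assessment mentioned in this abstract, the following is a minimal sketch of scoring one image's automatically assigned keywords against human annotations used as ground truth. The function name and the example keywords are hypothetical and are not taken from the paper.

```python
def precision_recall(predicted, ground_truth):
    """Return (precision, recall) for one image's keyword sets."""
    predicted, ground_truth = set(predicted), set(ground_truth)
    hits = len(predicted & ground_truth)  # keywords the system got right
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Hypothetical data: system output vs. keywords supplied by human subjects.
system_keywords = ["beach", "sky", "dog", "tree"]
human_keywords = ["beach", "sky", "sea"]

p, r = precision_recall(system_keywords, human_keywords)
print(f"precision={p:.2f} recall={r:.2f}")
```

Here two of the four predicted keywords are correct and two of the three human keywords are found, so precision is 0.50 and recall is about 0.67.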

2.
High-quality evaluation of generated summaries is needed if we are to improve automatic summarization systems. Although human evaluation provides better results than automatic evaluation methods, its cost is huge and its results are difficult to reproduce. Therefore, we need an automatic method that simulates human evaluation if we are to improve our summarization system efficiently. Although automatic evaluation methods have been proposed, they are unreliable when used for individual summaries. To solve this problem, we propose a supervised automatic evaluation method based on a new regression model called the voted regression model (VRM). VRM has two characteristics: (1) model selection based on ‘corrected AIC’ to avoid multicollinearity, and (2) voting by the selected models to alleviate the problem of overfitting. Evaluation results obtained for TSC3 and DUC2004 show that our method achieved error reductions of about 17–51% compared with conventional automatic evaluation methods. Moreover, our method obtained the highest correlation coefficients in several different experiments.
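The two ideas named in this abstract, selection by corrected AIC and voting among the selected models, can be sketched as follows. This is only an illustration under stated assumptions: it uses the standard least-squares form of AICc with constant terms dropped, and it treats "voting" as a plain average of the selected models' predictions. The abstract does not describe how VRM actually votes, and the candidate models and numbers below are hypothetical.

```python
import math


def aicc(rss, n, k):
    """Corrected AIC for a least-squares fit with k parameters on n samples."""
    if n - k - 1 <= 0:
        raise ValueError("AICc needs n > k + 1")
    aic = n * math.log(rss / n) + 2 * k  # constant terms omitted
    return aic + (2 * k * (k + 1)) / (n - k - 1)  # small-sample correction


def vote(candidates, n, top_m):
    """Keep the top_m candidates with lowest AICc and average their predictions."""
    ranked = sorted(candidates, key=lambda c: aicc(c["rss"], n, c["k"]))
    selected = ranked[:top_m]
    prediction = sum(c["prediction"] for c in selected) / len(selected)
    return [c["name"] for c in selected], prediction


# Hypothetical candidates: parameter count, residual sum of squares on the
# training data, and predicted quality score for one summary.
candidates = [
    {"name": "A", "k": 2, "rss": 12.0, "prediction": 0.62},
    {"name": "B", "k": 5, "rss": 10.5, "prediction": 0.70},
    {"name": "C", "k": 9, "rss": 10.0, "prediction": 0.90},
]
names, score = vote(candidates, n=30, top_m=2)
print(names, round(score, 2))
```

In this example model C has the lowest residual error but is penalised for its nine parameters, so only A and B are selected and their predictions are averaged.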

3.
Performance-based university research funding systems   Cited by: 1 (self-citations: 0; citations by others: 1)
The university research environment has been undergoing profound change in recent decades, and performance-based research funding systems (PRFSs) are one of the many novelties introduced. This paper seeks to find general lessons in the accumulated experience with PRFSs that can serve to enrich our understanding of how research policy and innovation systems are evolving. The paper also links the PRFS experience with the public management literature, particularly new public management, and with the understanding of public sector performance evaluation systems. PRFSs were found to be complex, dynamic systems, balancing peer review and metrics, accommodating differences between fields, and involving lengthy consultation with the academic community and transparency in data and results. Although the importance of PRFSs seems based on their distribution of universities’ research funding, this is something of an illusion, and the literature agrees that it is the competition for prestige created by a PRFS that creates powerful incentives within university systems. The literature suggests that under the right circumstances a PRFS will enhance control by professional elites. Because they aim for excellence, PRFSs may compromise other important values such as equity or diversity. They will not serve the goal of enhancing the economic relevance of research.

4.
Electronic word of mouth (eWOM) is prominent and abundant in consumer domains. Both consumers and product/service providers need help in understanding and navigating the resulting information spaces, which are vast and dynamic. The general tone or polarity of reviews, blogs or tweets provides such help. In this paper, we explore the viability of automatic sentiment analysis (SA) for assessing the polarity of a product or service review. To do so, we examine the potential of the major approaches to sentiment analysis, along with star ratings, in capturing the true sentiment of a review. We further model contextual factors (specifically, product type and review length) as two moderators affecting SA accuracy. The results of our analysis of 900 reviews suggest that different tools representing the main approaches to SA display differing levels of accuracy, yet overall, SA is very effective in detecting the underlying tone of the analyzed content and can be used as a complement or an alternative to star ratings. The results further reveal that contextual factors such as product type and review length play a role in affecting the ability of a technique to reflect the true sentiment of a review.

5.
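[Objective] To study empirically the factors that affect open peer review practice and to identify its key influencing factors. [Methods] Open peer review journals in the Directory of Open Access Journals (DOAJ) were taken as the research object, and the relevant data were collected by web crawling. Qualitative factors affecting open peer review were quantified by categorising the variables and assigning values to them. Multiple correspondence analysis plots were used to show the intrinsic associations among the influencing factors and their categories; an optimal scaling regression model was used to reveal how strongly each factor affects the type of open review. [Results] The type of open review is very closely associated with the disclosure category of reviewer identity; reviewer identity has a significant positive effect on the type of open review, with a very high importance value. [Conclusions] Whether reviewer identity is disclosed is the key factor shaping open peer review practice, and transparent peer review is currently an effective practice model for open review.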

6.
The profusion of online resources calls for tools and methods to help Internet users find precisely what they are looking for. The quality-controlled gateway CISMeF provides such services for health resources. However, the human cost of maintaining and updating the catalogue is increasingly high. This paper presents the automatic indexing system currently being developed by the CISMeF team, to be used as such for preliminary indexing, or after human review for the final indexing. The system architecture, which uses the INTEX platform for MeSH term extraction, is detailed. The results of a first evaluation tend to indicate that the automatic indexing strategy is relevant, as it achieves a precision comparable to that of other existing operational systems. Moreover, the system presented in this paper retrieves keyword/qualifier pairs as opposed to single terms, therefore providing significantly more precise indexing. Further development and tests will be carried out in order to improve the coverage of the dictionaries and to validate the efficiency of the system in the indexers’ everyday work.

7.
In recent years, there has been increased interest in topic-focused multi-document summarization. In this task, automatic summaries are produced in response to a specific information request, or topic, stated by the user. The system we have designed to accomplish this task comprises four main components: a generic extractive summarization system, a topic-focusing component, sentence simplification, and lexical expansion of topic words. This paper details each of these components, together with experiments designed to quantify their individual contributions. We include an analysis of our results on two large datasets commonly used to evaluate task-focused summarization, the DUC2005 and DUC2006 datasets, using automatic metrics. Additionally, we include an analysis of our results on the DUC2006 task according to human evaluation metrics. In the human evaluation of system summaries compared to human summaries, i.e., the Pyramid method, our system ranked first out of 22 systems in terms of overall mean Pyramid score; and in the human evaluation of summary responsiveness to the topic, our system ranked third out of 35 systems.

8.
秦岩  代君  廖莹驰 《情报科学》2021,39(1):104-110
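[Purpose/Significance] To study methods for measuring the novelty of academic conference papers and to offer a new perspective on evaluating their novelty. [Method/Process] This paper designs measures for an absorbed-novelty indicator and an output-novelty indicator, and applies them in an empirical study of conference papers in the field of artificial intelligence. [Results/Conclusions] Papers with both a high level of absorbed novelty and a high level of output novelty have the highest probability of being A-class conference papers. The results indicate that the novelty measures are effective and offer some reference value for the automatic review of conference papers. [Innovation/Limitations] The paper designs a novelty measurement method for conference papers, advancing conference paper evaluation; the empirical study covers only the artificial intelligence area of computer science, so it is limited to that domain.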

9.
Timeline generation systems are a class of algorithms that produce a sequence of time-ordered sentences or text snippets extracted in real time from high-volume streams of digital documents (e.g. news articles), focusing on retaining relevant and informative content for a particular information need (e.g. topic or event). These systems have a range of uses, such as producing concise overviews of events for end-users (human or artificial agents). To advance the field of automatic timeline generation, robust and reproducible evaluation methodologies are needed. To this end, several evaluation metrics and labeling methodologies have recently been developed, focusing on information nugget or cluster-based ground truth representations, respectively. These methodologies rely on human assessors manually mapping timeline items (e.g. sentences) to an explicit representation of what information a ‘good’ summary should contain. However, while these evaluation methodologies produce reusable ground truth labels, prior works have reported cases where such evaluations fail to accurately estimate the performance of new timeline generation systems due to label incompleteness. In this paper, we first quantify the extent to which the timeline summarization test collections fail to generalize to new summarization systems, then we propose, evaluate and analyze new automatic solutions to this issue. In particular, using a depooling methodology over 19 systems and across three high-volume datasets, we quantify the degree of system ranking error caused by excluding those systems when labeling. We show that when considering lower-effectiveness systems, the test collections are robust (the likelihood of systems being mis-ranked is low). However, we show that the risk of systems being mis-ranked increases as the effectiveness of systems held out from the pool increases.
To reduce the risk of mis-ranking systems, we also propose a range of different automatic ground truth label expansion techniques. Our results show that the proposed expansion techniques can be effective at increasing the robustness of the TREC-TS test collections, as they are able to generate large numbers of missing matches with high accuracy, reducing the number of mis-rankings by up to 50%.
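One simple way to picture the mis-ranking that this abstract measures is to count the pairs of systems whose relative order changes when scores are recomputed without a system's own contributions to the label pool. The sketch below is an assumption about how such a count could be made; the abstract does not state which ranking-error measure the authors use, and the system names and scores are hypothetical.

```python
from itertools import combinations


def misranked_pairs(full_scores, depooled_scores):
    """Count system pairs whose relative order differs between two score tables."""
    systems = sorted(full_scores)
    swaps = 0
    for a, b in combinations(systems, 2):
        full_diff = full_scores[a] - full_scores[b]
        depooled_diff = depooled_scores[a] - depooled_scores[b]
        # A negative product means the pair is ordered oppositely; ties are ignored.
        if full_diff * depooled_diff < 0:
            swaps += 1
    return swaps


# Hypothetical effectiveness scores with full labels and after depooling.
full = {"sysA": 0.42, "sysB": 0.38, "sysC": 0.30}
depooled = {"sysA": 0.35, "sysB": 0.37, "sysC": 0.29}

print(misranked_pairs(full, depooled))
```

In this example only the pair (sysA, sysB) changes order, so the count is 1.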

10.
Documents circulating in paper form are increasingly being replaced by their electronic equivalents in the modern office, so that any stored document can be retrieved whenever it is needed later. The office worker is already burdened with information overload, so effective and efficient retrieval facilities become an important factor affecting worker productivity. This paper first reviews the features of current document management systems, which offer varying facilities to manage, store and retrieve either references to documents or whole documents. Information retrieval databases, groupware products and workflow management systems are presented as developments to handle different needs, together with the underlying concepts of knowledge management. The two problems of worker finiteness and worker ignorance remain outstanding, as they are only partially addressed by the above-mentioned systems. The solution lies in a shift away from pull technology, where the user has to actively initiate the request for information, towards push technology, where available information is automatically delivered without user intervention. Intelligent information retrieval agents are presented as a solution, together with a marketing scenario of how they can be introduced.

11.
Several factors affecting the evaluation of library service quality   Cited by: 2 (self-citations: 0; citations by others: 2)
冯琼 《现代情报》2010,30(2):123-125
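The evaluation of library service quality arises from readers' perceptions as they enter the library and use its collection services, staff services and environmental services. In this process, every setting and procedure in the library service system, and every person in it, is a factor affecting user satisfaction. From the perspectives of both the library and the reader, this paper analyses in detail the factors that influence readers' evaluation of library service quality.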

12.
The benefits and priorities of public funding of R&D programmes are the subject of considerable research and debate, and a number of methodologies have been suggested which might allow us to arbitrate on the issues involved. This paper looks at one method that is used in practice to evaluate and rank publicly funded R&D programmes in the UK. We describe the improvements that have been made to the mapping measurement impact (MMI) model, which is used by the UK Department of Trade and Industry to assess the economic benefit to industry of different research projects funded as part of the United Kingdom National Measurement System. The model has been in use for more than 5 years as a means to compare publicly funded R&D programmes. It allows evaluation of their benefit and prioritisation of future funding schemes, and has potential for wider application in other areas of public R&D investment both inside and outside the UK.

13.
惠淑敏 《科研管理》2015,36(10):146-152
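To meet challenges such as the growing number of academic evaluation activities, excessive review costs and heavy evaluation workloads, this paper builds a model of a paper recommendation-dissemination platform that integrates efficient access to academic literature, active promotion of research results and automatic evaluation of research output. After verifying the model's performance by simulation, it proposes algorithms that use platform users' recommendation and dissemination behaviour to compute automatically an influence index, a quality index and a value index measuring the contribution of academic papers. Conducting academic evaluation on such a platform can share researchers' labour, raise the efficiency of the research community, save evaluation costs and make evaluation results more objective.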

14.
迪莉娅 《现代情报》2019,39(12):131-137
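[Purpose/Significance] In the big data environment, apps have become important tools for work, daily life, entertainment and even earning money; at the same time, they have become a major source of user privacy leaks. Users worry about privacy leaks on the one hand, yet on the other, because of the benefits apps bring, they willingly provide private data for businesses to use. This is the so-called "privacy paradox". [Method/Process] Privacy calculus is one of the important approaches to studying the privacy paradox. Through a survey of the factors influencing app users' privacy calculus, the paper analyses the core factors that lead users to provide private data voluntarily and the reasons the paradox exists. [Results/Conclusions] Protecting app users' privacy requires continually strengthening laws and institutions and the supervision of developers and operators; continually raising users' awareness of privacy protection is also important and should not be neglected.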

15.
This paper provides a systematic review of the literature on knowledge management (KM) in small and medium enterprises (SMEs) and SME networks. The main objective is to highlight the state of the art of KM from the management point of view in order to identify relevant research gaps. The review shows that in recent years the number of papers on the topic has been growing and that they involve a variety of approaches, methodologies and models from different research areas. The vast majority of the papers analysed focus on KM in the individual SME, while only a few analyse KM in networks populated by SMEs. The content analysis of the papers highlights six areas of investigation, from which ten research questions were derived concerning three perspectives: the factors affecting KM; the impact of KM on firm performance; and knowledge management systems.

16.
Wilhelm Gschel 《Endeavour》1981,5(4):158-166
The use of satellites for linking major television systems is now well established, but a further important development is now imminent. This is to transmit television programmes via satellites stationary relative to the Earth, from which they are beamed back to individual receiving sets on the ground. The advantages are better reception and the elimination of the need for a multiplicity of booster stations on the ground. This system, which has already achieved experimental success and is expected to become operative in Europe in the mid-1980s, is here reviewed with special reference to the Franco-German project formally approved last year.

17.
Information management is a neglected function of urban and regional planning in Asian metropolitan regions. Although some kinds of information systems exist, these are mostly used for general administration and not for urban and regional planning. The critical factors which appear to impede the implementation of information systems are the inability to define information needs, lack of systematic and disaggregated data, reliance on secondary sources and administrative records, lack of effective data processing devices, shortage of skilled personnel for data analysis, inadequate and defective methods and procedures for monitoring and evaluation, and limited resources for the adoption of information systems and technology. Some international organizations are increasingly committed to assisting developing countries in practical programmes and pilot demonstration projects geared to the development of information systems for planning.

18.
Enterprises in both the public and private sector undertake knowledge management (KM) initiatives through which they hope to engender a new, more adaptive and flexible culture of learning and innovation in their organisations. Creative activities involving social learning and innovation are, however, more common in less formal entities such as communities of practice at work and community service organisations in civil society. This paper presents the results and implications of collaborative research into the understanding, development and evaluation of socio-technical systems (STS) designed to mobilise collective knowledge in diverse community settings. The research concerns information and communication technologies (ICT)-mediated activities of communities in the broader civil society and also those in formal organisations. The paper describes and critically evaluates a set of three STS that have the potential to support the collective knowledge of innovative groups, teams and networks, which can all be considered forms of community. The findings could be of strategic value to business, government and community service organisations initiating KM programmes aimed at using collective learning to support innovation.

19.
张育玮  邢雯  徐茜  关记兴 《科研管理》2019,40(11):175-184
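With the continued development of cloud computing, cloud enterprise information systems (EIS) have shown their advantages, and more and more enterprises are willing to consider switching from traditional EIS to cloud EIS. However, few studies have examined how enterprises switch EIS. This study integrates the information systems success model and the technology acceptance model to develop a multi-theory framework for exploring enterprises' intention to switch to cloud ERP. We collected 199 questionnaires from enterprise managers and tested the research model and hypotheses using structural equation modelling. The results show that usefulness and ease of use are important factors influencing switching intention, while ease of use, information quality and system quality positively affect perceived usefulness. These findings provide a reference for future research on switching intention. We suggest that cloud service providers attend to the ease of use and usefulness of the system itself, improve system quality and pay attention to information quality, so that they can offer products that meet enterprise needs and attract more enterprises to switch from traditional EIS to cloud EIS.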
