This paper presents a semantically rich document representation model for automatically classifying financial documents into predefined categories using deep learning. The model architecture consists of two main modules: document representation and document classification. In the first module, a document is enriched with semantics using background knowledge provided by an ontology and through the acquisition of its relevant terminology. Integrating the acquired terminology into the ontology extends the capabilities of semantically rich document representations with in-depth coverage of concepts, thereby capturing the whole conceptualization involved in the documents. The semantically rich representations obtained from the first module serve as input to the document classification module, which aims to find the most appropriate category for each document through deep learning. Three different deep learning networks, each belonging to a different category of machine learning techniques, are used for ontological document classification with a real-life ontology. Multiple simulations are carried out with various deep neural network configurations, and our findings reveal that a feedforward network with three hidden layers and 1024 neurons obtains the highest document classification performance on the INFUSE dataset. The F1 score increases further by almost five percentage points, to 78.10%, for the same network configuration when the relevant terminology integrated into the ontology is applied to enrich the document representation. Furthermore, we conducted a comparative performance evaluation using various state-of-the-art document representation approaches and classification techniques, including shallow and conventional machine learning classifiers.

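Drawing on ideas from machine learning and using data-mining techniques, a new direct inversion method for canal roughness is proposed and applied to flow simulation in the Shitouhe Reservoir irrigation district. The results show that the computed water levels agree closely with the measured water-level series.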

The impact of crisis events can be devastating in a multitude of ways, many of which are unpredictable because of the suddenness with which they occur. The evolution of social media (for example, Twitter) has given directly affected individuals, and others with valuable information, a platform for sharing their stories with a wide audience. As a result, these platforms have become vast repositories of helpful information for emergency organizations. However, different crisis events often contain event-specific keywords, which makes it difficult to extract useful information with a single model. In this paper, we put forward TASR, which stands for Topic-Agnostic Stylometric Representations, a novel deep learning architecture that uses stylometric and adversarial learning to remove topical bias and so better handle the uncertainty surrounding unseen events. As an alternative to domain-adaptive approaches that require data from the unseen event, it reduces the work for those responding to the onset of a crisis. Overall, we conduct a comprehensive study of the situational properties of TASR, the benefits of its architecture, including its topic-agnostic and explainable properties, and how it improves upon comparable models in past research. Across two experiments, TASR outperforms state-of-the-art methods such as transfer learning and domain adaptation by 11% in AUC on average. The ablation study illustrates how different architecture choices of TASR affect the results and shows that TASR has been optimized for this task. Finally, we conduct a case study to show that explainable results from our model can be used to help guide human analysts through crisis information extraction.

Deep learning has reached human-level performance in several medical tasks, including the classification of histopathological images. Continuous effort has gone into finding effective strategies for interpreting these models. Among them, saliency maps, which depict the weight of each pixel in the classification as a heatmap of intensity values, have been by far the most widely used for image classification. However, there is a lack of tools for the systematic evaluation of saliency maps, and existing works introduce non-natural noise such as random or uniform values. To address this issue, we propose an approach that evaluates the faithfulness of saliency maps by introducing natural perturbations into the image, based on opposite-class substitution, and studying their impact on evaluation metrics adapted from saliency models. We validate the proposed approach on PatchCamelyon, a breast cancer metastasis detection dataset with 327,680 patches of histopathological images of sentinel lymph node sections. Results show that GradCAM, Guided-GradCAM, and gradient-based saliency map methods are sensitive to natural perturbations and correlate with the presence of tumor evidence in the image. Overall, this approach proves to be a solution for validating saliency map methods without introducing confounding variables, and it shows potential for application to other medical imaging tasks.

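As a research direction for graduate students, machine learning combines basic research with applied basic research, and therefore requires students to have strong innovative ability both in machine learning algorithms and in the related application domains. Taking the innovation requirements for graduate students in machine learning as its starting point, this paper discusses how to cultivate that ability from several angles, including the study of theoretical foundations, literature review, thesis topic selection, and exchange and collaboration, as an attempt to raise the level of innovation and the quality of training of graduate students in this field.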

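Machine learning is a frontier branch of artificial intelligence and also an important route to realizing it. Taking two well-known journals in the field, Machine Learning and the Journal of Machine Learning Research, as the sample, knowledge maps drawn with CiteSpace II, a multi-perspective citation network analysis tool, show that the research fronts of international machine learning research consist of nine knowledge clusters, typified by "data mining". An in-depth reading of the knowledge maps further reveals the evolutionary path of the research fronts in machine learning.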

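Weak autonomous learning ability is characteristic of many students, and especially of students at independent colleges. In response, this paper trains a class of non-English majors from the 2012 intake at Gengdan Institute in English learning strategies, in order to explore the effect of learning-strategy instruction on autonomous learning in college English. The aim is to strengthen students' awareness of and capacity for autonomous learning, thereby improving learning outcomes, achieving the best learning goals, and helping them become truly autonomous language learners.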

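The application of machine learning techniques to natural language processing is a hot research topic. This paper briefly introduces, analyzes, and evaluates one machine learning method, instance-based learning. It reports experiments applying the method to shallow parsing, one of the key steps in natural language processing, and analyzes the results. Finally, it discusses the application of instance-based learning in natural language processing.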

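Before a newly developed system is deployed, malicious code detection is a very important step, and also a major challenge. In this paper, machine learning is used to discover the implementation structure of a system, including both the normal functions in the design and any hidden malicious behavior. Published machine learning approaches usually assume that the system is fully determined. In practice, however, systems are often not fully determined, and the popular algorithms do not apply. This paper designs a general and efficient machine learning algorithm for incompletely determined systems to check for implanted malicious code. It further extends the learning procedure to start from an approximate model, so that the implementation structure is learned more efficiently than with known algorithms. Experiments show that the proposed algorithm detects malicious implants more effectively.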

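Changing how students learn and improving their ability to learn is the key to the new round of curriculum reform. Analyzing and discussing learning theories, and the meaning and types of learning styles, from the perspective of the new curriculum helps us grasp the characteristics of student learning under the reform, so as to comprehensively improve students' learning ability.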

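A review of research on technology learning curves   Total citations: 1, self-citations: 1, citations by others: 0
The technology learning curve originates from the study of the relationship between unit cost and cumulative output. It is widely used to estimate technology learning rates, and it facilitates forecasting the evolution of technology costs and estimating the rate of technology diffusion. Research on technology learning curves concentrates on three aspects: learning mechanisms, learning curve models, and the measurement of learning rates. Starting from the state of research in these three aspects, this paper reviews the mechanisms that affect learning curves, summarizes technology learning curve models, and surveys the practical application of technology learning rates in production activities. It then analyzes the problems in current research on the basis of the existing literature and proposes priorities and directions for future work.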
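
The relationship between unit cost and cumulative output described above can be illustrated with a small sketch. It assumes the classic single-factor power-law form, unit cost = c1 * x^(-b), with the learning rate defined as the fractional cost reduction per doubling of cumulative output, 1 - 2^(-b). The abstract does not say which model the review favors, so this form, the function name `fit_learning_curve`, and the synthetic data are all illustrative assumptions.

```python
import math


def fit_learning_curve(cumulative_output, unit_cost):
    """Fit unit_cost = c1 * cumulative_output ** (-b) by least squares in log-log space.

    Returns (c1, b, learning_rate), where learning_rate = 1 - 2 ** (-b) is the
    fractional cost reduction for each doubling of cumulative output.
    The single-factor power law is an assumed model, not one taken from the review.
    """
    xs = [math.log(x) for x in cumulative_output]
    ys = [math.log(c) for c in unit_cost]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                       # slope of log(cost) against log(output)
    b = -slope                              # learning exponent
    c1 = math.exp(mean_y - slope * mean_x)  # cost of the first unit
    learning_rate = 1.0 - 2.0 ** (-b)
    return c1, b, learning_rate


# Synthetic data (hypothetical): cost falls by 20% with every doubling of output.
true_b = -math.log2(0.8)
output = [1, 2, 4, 8, 16]
cost = [100.0 * x ** (-true_b) for x in output]

c1, b, learning_rate = fit_learning_curve(output, cost)
print(f"c1={c1:.2f}, b={b:.4f}, learning rate={learning_rate:.2%}")
```

Fitting in log-log space keeps the example to ordinary linear regression. Real cost series are noisy, and multi-factor models may fit them better than this single-factor form.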

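Guided learning with learning plans turns the teaching content into individual knowledge points. This keeps junior middle school students from studying aimlessly, gives full expression to their role as the main agents of learning, genuinely involves them in the learning process, and lets them enjoy learning. On this basis, this paper discusses the effective implementation of the learning-plan guided method in junior middle school physics.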

This paper investigates how to automatically formulate effective queries using full or partial relevance information (i.e., the terms that are in relevant documents) in the context of relevance feedback (RF). The effects of adding relevance information in the RF environment are studied via controlled experiments. The conditions of these controlled experiments are formalized into a set of assumptions that form the framework of our study, which we call the idealized relevance feedback (IRF) framework. In our IRF settings, we confirm the previous findings of relevance feedback studies. In addition, our experiments show that better retrieval effectiveness can be obtained when (i) we normalize the term weights by their ranks, (ii) we select weighted terms in the top K retrieved documents, (iii) we include terms in the initial title queries, and (iv) we use the best query size for each topic instead of the average best query size. These choices produce an improvement of at most five percentage points in the mean average precision (MAP) value. We have also achieved a new level of retrieval effectiveness, about 55–60% MAP instead of the 40+% in previous findings. This new level of retrieval effectiveness was found to be similar to the level obtained with a TREC ad hoc test collection containing about twice as many documents as the TREC-3 test collection used in previous work.
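
For readers unfamiliar with the metric reported above, the following sketch computes mean average precision (MAP) using the standard non-interpolated definition. The function names, document identifiers, and relevance judgments are hypothetical and are not taken from the paper or from TREC data.

```python
def average_precision(ranked_doc_ids, relevant_ids):
    """Non-interpolated average precision for a single topic.

    Precision is recorded at each rank where a relevant document appears,
    and the sum is divided by the total number of relevant documents.
    """
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """Mean of per-topic average precision over (ranking, relevant set) pairs."""
    return sum(average_precision(ranking, rel) for ranking, rel in runs) / len(runs)


# Two hypothetical topics with made-up rankings and relevance judgments.
runs = [
    (["d1", "d2", "d3", "d4"], {"d1", "d3"}),  # AP = (1/1 + 2/3) / 2
    (["d5", "d6"], {"d6"}),                    # AP = (1/2) / 1
]
map_value = mean_average_precision(runs)
print(f"MAP = {map_value:.4f}")
```

Because each topic contributes equally to the mean, a change in per-topic query size, as in item (iv), can shift MAP even when the overall number of relevant documents retrieved changes little.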

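[Purpose/Significance] As a key core technology of artificial intelligence, machine learning has received unprecedented attention and developed rapidly. An in-depth study of its current state of development and competitive landscape helps provide a scientific basis for decisions on corporate strategy and related industrial policy. [Method/Process] Based on the DII and WOS databases, the current state of development and the competitive landscape of this technology field are analyzed from three aspects: development stage, identification of hot and core areas, and comparison of competing countries. [Result/Conclusion] Machine learning technology is in a period of rapid growth, and China is likewise in a period of rapid development. China has shortcomings in the layout of its technology structure. The United States has the strongest patent activity, and China is also among the active players. The United States has the highest patent quality, and China lags far behind it. Internet companies are an important driving force. Hot areas include intelligent diagnosis, autopilots, educational assistance, speech recognition, and computer vision. Core areas include sorting, learning, knowledge processing, search, fuzzy logic systems, and expert systems.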

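Students at technical schools have a weak academic foundation and lack interest in general academic courses. In view of this, this paper offers suggestions for Chinese language teaching at our school, setting out personal views on innovating teaching concepts, cooperative inquiry, guided reading, and establishing a new evaluation system.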

Digital twins, along with the internet of things (IoT), data mining, and machine learning technologies, offer great potential for transforming today's manufacturing paradigm toward intelligent manufacturing. Production control in the petrochemical industry involves complex circumstances and a high demand for timeliness; therefore, agile and smart controls are important components of intelligent manufacturing in this industry. This paper proposes a framework and approaches for constructing a digital twin based on the petrochemical industrial IoT, machine learning, and a practice loop for information exchange between the physical factory and a virtual digital twin model, in order to realize production control optimization. Unlike traditional production control approaches, this novel approach integrates machine learning and real-time industrial big data to train and optimize digital twin models. It can help petrochemical and other process manufacturing industries adapt dynamically to a changing environment, respond in a timely manner to changes in the market through production optimization, and improve economic benefits. Accounting for environmental characteristics, this paper provides concrete solutions for machine learning difficulties in the petrochemical industry, e.g., high data dimensions, time lags and alignment between time series data, and a high demand for immediacy. The approaches were evaluated by applying them in the production unit of a petrochemical factory: a model was trained on industrial IoT data and used to realize intelligent production control based on real-time data. A case study shows the effectiveness of this approach in the petrochemical industry.

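The functions and sources of technological learning   Total citations: 24, self-citations: 2, citations by others: 22
This paper first introduces the concept of technological learning, then points out its importance for upgrading China's industrial structure, and in the third part synthesizes existing research to identify the main sources of technological learning.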

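As universities have become more aware of the need for learning-consultation services, research on how to resolve university students' learning difficulties efficiently and scientifically has grown, and its perspectives have become increasingly diverse. Building on a summary of practical work in universities, this paper uses resilience theory to explore the path and working model for building a learning-consultation service system in universities, so as to provide a theoretical reference and practical guidance for efforts to improve the academic atmosphere.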

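Among staff in university administrative offices, problems such as low enthusiasm for learning, insufficient depth of learning, and inappropriate learning methods are widespread. Creating learning-oriented administrative offices in universities requires exploring which path to take. This means using the central study group of the office Party committee, departmental Party branches, mass organizations, and online channels to promote lectures, mutual learning, self-study, and assessment of learning, and drawing on universities' strengths of dense talent, concentrated disciplines, a strong atmosphere, and ample resources. At the same time, the relationships between study and work, the whole and the individual, leading cadres and ordinary cadres, and theoretical study and professional improvement must be handled well, so as to form a climate in which everyone loves learning, knows how to learn, and learns well.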

