Similar literature
Found 20 similar articles (search time: 0 ms)
1.
    
Federated Learning (FL) has been foundational in improving the performance of a wide range of applications since it was first introduced by Google. Some of the most prominent and commonly used FL-powered applications are Android’s Gboard for predictive text and Google Assistant. FL can be defined as a setting that makes on-device, collaborative Machine Learning possible. A wide range of literature has studied FL technical considerations, frameworks, and limitations, with several works presenting surveys of the prominent literature on FL. However, prior surveys have focused on the technical considerations and challenges of FL, and few recent works present a comprehensive overview of the status and future trends of FL in applications and markets. In this survey, we introduce the fundamentals of FL, describing its underlying technologies, architectures, system challenges, and privacy-preserving methods. More importantly, the contribution of this work is in scoping a wide variety of current FL applications and future trends in technology and markets. We present a classification and clustering of literature progress in FL in application to technologies including Artificial Intelligence, Internet of Things, blockchain, Natural Language Processing, autonomous vehicles, and resource allocation, as well as in application to market use cases in the domains of Data Science, healthcare, education, and industry. We discuss future open directions and challenges in FL within recommendation engines, autonomous vehicles, IoT, battery management, privacy, fairness, personalization, and the role of FL for governments and public sectors. By presenting a comprehensive review of the status and prospects of FL, this work serves as a reference point for researchers and practitioners to explore FL applications under a wide range of domains.

2.
Recently, models based on the Transformer (Vaswani et al., 2017) have yielded superior results in many sequence modeling tasks. The ability of the Transformer to capture long-range dependencies and interactions makes it possible to apply it in the field of portfolio management (PM). However, the built-in quadratic complexity of the Transformer prevents its direct application to the PM task. To solve this problem, we propose a deep reinforcement learning-based PM framework called LSRE-CAAN, with two important components: a long sequence representations extractor and a cross-asset attention network. Direct Policy Gradient is used to solve the sequential decision problem in the PM process. We conduct numerical experiments in three aspects using four different cryptocurrency datasets, and the empirical results show that our framework is more effective than both traditional and state-of-the-art (SOTA) online portfolio strategies, achieving a 6x return on the best dataset. In terms of risk metrics, our framework has an average volatility risk of 0.46 and an average maximum drawdown risk of 0.27 across the four datasets, both of which are lower than those of the vast majority of SOTA strategies. In addition, while the vast majority of SOTA strategies maintain a poor turnover rate of more than roughly 50% on average, our framework enjoys a relatively low turnover rate on all datasets. An efficiency analysis illustrates that our framework no longer has the quadratic dependency limitation.
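The risk and trading metrics named in this abstract (volatility, maximum drawdown, turnover) can be illustrated with a minimal sketch. The definitions below are common textbook conventions and are an assumption here; the abstract does not state which exact formulas the authors use, and the function names are hypothetical.

```python
import math


def max_drawdown(values):
    """Largest peak-to-trough decline of a wealth curve, as a fraction of the peak."""
    peak = values[0]
    worst = 0.0
    for v in values:
        peak = max(peak, v)                    # highest value seen so far
        worst = max(worst, (peak - v) / peak)  # relative drop from that peak
    return worst


def volatility(values):
    """Sample standard deviation of period-over-period simple returns (assumed definition)."""
    returns = [b / a - 1.0 for a, b in zip(values, values[1:])]
    mean = sum(returns) / len(returns)
    var = sum((r - mean) ** 2 for r in returns) / (len(returns) - 1)
    return math.sqrt(var)


def turnover(weights_before, weights_after):
    """Half the L1 distance between two portfolio weight vectors (one common convention)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(weights_before, weights_after))
```

Under these assumed definitions, a wealth curve that peaks at 120 and later falls to 80 has a maximum drawdown of one third, and moving from an equal split across two assets to holding only the first gives a turnover of 0.5.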

3.
    
In recent years, reasoning over knowledge graphs (KGs) has been widely adopted to empower retrieval systems, recommender systems, and question answering systems, generating a surge in research interest. Recently developed reasoning methods usually suffer from poor performance when applied to incomplete or sparse KGs, due to the lack of evidential paths that can reach target entities. To solve this problem, we propose a hybrid multi-hop reasoning model with reinforcement learning (RL) called SparKGR, which implements dynamic path completion and iterative rule guidance strategies to increase reasoning performance over sparse KGs. Firstly, the model dynamically completes the missing paths using rule guidance to augment the action space for the RL agent; this strategy effectively reduces the sparsity of KGs, thus increasing path search efficiency. Secondly, an iterative optimization of rule induction and fact inference is designed to incorporate global information from KGs to guide the RL agent exploration; this optimization iteratively improves overall training performance. We further evaluated the SparKGR model through different tasks on five real-world datasets extracted from Freebase, Wikidata, and NELL. The experimental results indicate that SparKGR outperforms state-of-the-art baseline models without losing interpretability.

4.
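Although there has been considerable research on Chinese text classification in recent years, studies that use deep learning to automatically classify long Chinese texts such as policy documents remain rare. To this end, drawing on and extending traditional data augmentation methods, this paper proposes NEWT, a new computational framework that integrates the New Era People's Daily segmented corpus (NEPD), the Easy Data Augmentation (EDA) algorithm, word2vec, and a text convolutional neural network (TextCNN). In the empirical part, the algorithm is validated on science and technology policy texts issued by Chinese local governments. The experimental results show that, with truncation lengths of 500, 750, and 1,000 words, NEWT classifies Chinese science and technology policy texts better than traditional deep learning models such as RCNN, Bi-LSTM, and CapsNet, with an average improvement in F1 of more than 13%. In addition, NEWT with shorter truncation lengths achieves results close to those of full-text input, which can partly improve the computational efficiency of traditional deep learning models on the automatic classification of long Chinese texts.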
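EDA is usually described as four token-level operations: synonym replacement, random insertion, random swap, and random deletion. The sketch below covers only the two operations that need no synonym dictionary, applied to an already segmented token list. It is an illustration under assumptions, not the NEWT pipeline: the function names, the seeded `random.Random` argument, and the sample tokens are all hypothetical.

```python
import random


def random_swap(tokens, n_swaps, rng):
    """Swap the positions of two randomly chosen tokens, n_swaps times."""
    tokens = list(tokens)  # work on a copy so the input is left untouched
    if len(tokens) < 2:
        return tokens
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return tokens


def random_deletion(tokens, p, rng):
    """Drop each token independently with probability p, keeping at least one token."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


# Hypothetical segmented sentence; real input would come from a word segmenter.
sample = ["local", "government", "issues", "science", "policy"]
rng = random.Random(0)
augmented = [random_swap(sample, 1, rng), random_deletion(sample, 0.2, rng)]
```

Each augmented token list would then be added to the training set with the label of the sentence it was derived from.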

5.
    

6.
    
Aesthetic assessment evaluates the quality of a given image using subjective annotations, commonly user ratings, as a knowledge base. Rating complexity is usually relaxed in state-of-the-art works by employing a binary high/low quality label computed from the mean value of rating votes. Nevertheless, this approach introduces uncertainty for average-quality images, which may affect the performance of machine learning models trained from annotated data. In this work, we present a novel approach to aesthetic assessment based on redefining the rating-based ground truths present in most datasets. Our intent is twofold: to reduce the rating uncertainty and to automatically group them into clusters reflecting high and low quality patterns, thus avoiding an arbitrary threshold like 5 in 1–10 ratings. The experimentation uses the well-known AVA dataset, which consists of more than 255,000 images, and we train several CNN models to test our new ground truths against the baseline ones. The results show that our approach achieves significant performance gains, between 3% and 9% more balanced accuracy than the baseline ground truths.
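The baseline labelling that this abstract criticises, and the balanced accuracy metric it reports, can be sketched as follows. This is a minimal illustration under assumptions: the vote-histogram layout, the strict `>` comparison against the threshold, and all function names are hypothetical and not taken from the paper.

```python
def mean_rating(vote_counts):
    """vote_counts[k] is the number of votes for score k + 1 (scores 1..10)."""
    total = sum(vote_counts)
    return sum((k + 1) * c for k, c in enumerate(vote_counts)) / total


def baseline_label(vote_counts, threshold=5.0):
    """Binary high (1) / low (0) quality label from the mean vote (assumed strict '>')."""
    return 1 if mean_rating(vote_counts) > threshold else 0


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls for a binary problem with both classes present."""
    recalls = []
    for cls in (0, 1):
        idx = [i for i, y in enumerate(y_true) if y == cls]
        recalls.append(sum(1 for i in idx if y_pred[i] == cls) / len(idx))
    return sum(recalls) / len(recalls)


# Two nearly identical rating histograms end up with opposite labels.
slightly_below = [0, 0, 0, 1, 9, 0, 0, 0, 0, 0]  # mean 4.9
slightly_above = [0, 0, 0, 0, 9, 1, 0, 0, 0, 0]  # mean 5.1
labels = (baseline_label(slightly_below), baseline_label(slightly_above))
```

The two histograms differ by a single vote yet receive opposite labels, which is the kind of uncertainty around average-quality images that the abstract describes.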

7.
Most existing state-of-the-art neural network models for math word problems use the Goal-driven Tree-Structured decoder (GTS) to generate expression trees. However, we found that GTS does not provide good predictions for longer expressions, mainly because it does not capture the relationships among the goal vectors of each node in the expression tree and ignores the position order of the nodes before and after the operator. In this paper, we propose a novel Recursive tree-structured neural network with Goal Forgetting and information aggregation (RGFNet) to address these limitations. The goal forgetting and information aggregation module is based on ordinary differential equations (ODEs), and we use it to build a sub-goal information feedback neural network (SGIFNet). Unlike GTS, which uses two-layer gated-feedforward networks to generate goal vectors, we introduce a novel sub-goal generation module. The sub-goal generation module can capture the relationships among related nodes (e.g., parent nodes and sibling nodes) using an attention mechanism. Experimental results on two large public datasets, i.e., Math23K and Ape-clean, show that our tree-structured model outperforms the state-of-the-art models and obtains answer accuracy over 86%. Furthermore, the performance on long-expression problems is promising.

8.
    
Because patients are not quarantined while awaiting the result of the Polymerase Chain Reaction (PCR) test used to diagnose COVID-19, the disease continues to spread. This study aims to reduce the duration and extent of transmission by shortening the diagnosis time of COVID-19 patients through the use of Computed Tomography (CT). It also aims to provide radiologists with a decision support system for the diagnosis of COVID-19. Deep features were extracted with deep learning models such as ResNet-50, ResNet-101, AlexNet, Vgg-16, Vgg-19, GoogLeNet, SqueezeNet, and Xception from 1345 CT images obtained from the radiography database of Siirt Education and Research Hospital. These deep features were given to classification methods such as Support Vector Machine (SVM), k Nearest Neighbor (kNN), Random Forest (RF), Decision Trees (DT), and Naive Bayes (NB), and their performance was evaluated with test images. Accuracy, F1-score, and the ROC curve were used as success criteria. The best performance was obtained with ResNet-50 and the SVM method: the accuracy was 96.296%, the F1-score was 95.868%, and the AUC value was 0.9821. The deep learning model and classification method found to perform well in this study can be used as an auxiliary decision support system, preventing unnecessary tests for COVID-19.

9.
    
Deep multi-view clustering (MVC) aims to mine and employ the complex relationships among views to learn compact data clusters with deep neural networks in an unsupervised manner. More recent deep contrastive learning (CL) methods have shown promising performance in MVC by learning cluster-oriented deep feature representations, which is realized by contrasting positive and negative sample pairs. However, most existing deep contrastive MVC methods focus only on one-sided contrastive learning, such as feature-level or cluster-level contrast, failing to integrate the two sides or to bring in other important aspects of contrast. Additionally, most of them work in a separate two-stage manner, i.e., first feature learning and then data clustering, so the two stages fail to benefit each other. To address these challenges, in this paper we propose a novel joint contrastive triple-learning framework to learn multi-view discriminative feature representations for deep clustering, which is threefold, i.e., feature-level alignment-oriented and commonality-oriented CL, and cluster-level consistency-oriented CL. The former two submodules aim to contrast the encoded feature representations of data samples at different feature levels, while the last contrasts the data samples in the cluster-level representations. Benefiting from the triple contrast, more discriminative representations of views can be obtained. Meanwhile, a view weight learning module is designed to learn and exploit the quantitative complementary information across the learned discriminative features of each view. Thus, the contrastive triple-learning module, the view weight learning module, and the data clustering module with these fused features are jointly performed, so that these modules are mutually beneficial.
Extensive experiments on several challenging multi-view datasets show the superiority of the proposed method over many state-of-the-art methods, especially the large improvements of 15.5% and 8.1% on Caltech-4V and CCV in terms of accuracy. Due to its promising performance on visual datasets, the proposed method can be applied to many practical visual applications such as visual recognition and analysis. The source code of the proposed method is provided at https://github.com/ShizheHu/Joint-Contrastive-Triple-learning.

10.
For many companies, the remaining barriers to adopting cloud computing services are related to security. One of these significant security issues is the lack of auditability for various aspects of security in the cloud computing environment. In this paper, we look at the issue of cloud computing security auditing from three perspectives: user auditing requirements, technical approaches for (data) security auditing, and current cloud service provider capabilities for meeting audit requirements. We also divide specific auditing issues into two categories: infrastructure security auditing and data security auditing. We find ultimately that, despite a number of techniques available to address user auditing concerns in the data auditing area, cloud providers have thus far focused only on infrastructure security auditing concerns.

11.
    
Semi-supervised anomaly detection methods leverage a few anomaly examples to yield drastically improved performance compared to unsupervised models. However, they still suffer from two limitations: 1) unlabeled anomalies (i.e., anomaly contamination) may mislead the learning process when all the unlabeled data are employed as inliers for model training; 2) only discrete supervision information (such as binary or ordinal data labels) is exploited, which leads to suboptimal learning of anomaly scores that essentially take on a continuous distribution. Therefore, this paper proposes a novel semi-supervised anomaly detection method, which devises contamination-resilient continuous supervisory signals. Specifically, we propose a mass interpolation method to diffuse the abnormality of labeled anomalies, thereby creating new data samples labeled with continuous abnormal degrees. Meanwhile, the contaminated area can be covered by new data samples generated via combinations of data with correct labels. A feature learning-based objective is added to serve as an optimization constraint to regularize the network and further enhance the robustness with respect to anomaly contamination. Extensive experiments on 11 real-world datasets show that our approach significantly outperforms state-of-the-art competitors by 20%–30% in AUC-PR and obtains more robust and superior performance in settings with different anomaly contamination levels and varying numbers of labeled anomalies.

12.
Researchers are currently devoting considerable time and effort to developing and motivating the 6G vision and resources that are not available in 5G. Edge computing and autonomous vehicular driving applications are further enhanced under the 6G services provided to operate tasks successfully. The huge volume of data resulting from such applications can be an asset in the AI and Machine Learning (ML) world. Traditional ML models are trained on centralized data sets. Recently, data privacy has become a real concern to take into consideration when collecting data. Federated Learning (FL) therefore plays an important role in addressing privacy and technology together by maintaining the ability to learn over decentralized data sets. Training is confined to the user devices, which share only the locally computed parameters with a server that aggregates those updated weights to optimize a global model. This process is repeated for multiple rounds for better results and convergence. Most of the literature has proposed client selection methods to converge faster and increase accuracy. However, none of them has targeted the ability to deploy and select clients in real time wherever and whenever needed. In fact, some mobile and vehicular devices are not available to serve as clients in FL due to highly dynamic environments and/or do not have the capabilities to accomplish this task. In this paper, we address the aforementioned limitations by introducing on-demand client deployment in FL, offering more volume and heterogeneity of data in the learning process. We make use of containerization technology such as Docker to build efficient environments using any type of client device serving as a volunteering device, and the Kubernetes utility Kubeadm to monitor the devices. The experiments illustrate the relevance of the proposed approach and the efficiency of deploying clients whenever and wherever needed.
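The training loop described in this abstract (local training on each device, sharing only parameters, server-side aggregation, repeated over rounds) can be sketched in a few lines. The sketch assumes a sample-size-weighted average in the style of FedAvg, which the abstract does not name, and uses a hypothetical toy client that fits a one-dimensional linear model; none of the function names come from the paper.

```python
def local_update(weights, data, lr=0.1):
    """One pass of SGD on y = w * x + b over a client's private data (toy client)."""
    w, b = weights
    for x, y in data:
        err = (w * x + b) - y
        w -= lr * err * x
        b -= lr * err
    return [w, b]


def aggregate(client_weights, client_sizes):
    """Server step: average client parameters, weighted by local sample counts (assumed)."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[d] * n for w, n in zip(client_weights, client_sizes)) / total
        for d in range(dim)
    ]


def run_rounds(global_weights, client_datasets, rounds):
    """Repeat: broadcast the global model, train locally, aggregate on the server."""
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in client_datasets]
        sizes = [len(data) for data in client_datasets]
        global_weights = aggregate(updates, sizes)
    return global_weights


# Two hypothetical clients whose private data both follow y = 2x + 1.
clients = [[(0.0, 1.0), (1.0, 3.0)], [(2.0, 5.0), (-1.0, -1.0)]]
final_w, final_b = run_rounds([0.0, 0.0], clients, rounds=200)
```

Only the two parameters leave each client; the raw `(x, y)` pairs stay local, which is the privacy property the abstract refers to.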

13.
    
Textual data have been a major form for conveying internet users’ content. How to effectively and efficiently discover latent topics among them has essential theoretical and practical value. Recently, neural topic models (NTMs), especially Variational Auto-encoder-based NTMs, have proved to be a successful approach for mining meaningful and interpretable topics. However, they usually suffer from two major issues: (1) Posterior collapse: the KL divergence rapidly reaches zero, resulting in low-quality representations in the latent distribution; (2) Unconstrained topic generative models: topic generative models are always unconstrained, which potentially leads to discovering redundant topics. To address these issues, we propose the Autoencoding Sinkhorn Topic Model based on the Sinkhorn Auto-encoder (SAE) and Sinkhorn divergence. SAE utilizes Sinkhorn divergence rather than the problematic KL divergence to optimize the difference between posterior and prior, which is free of posterior collapse. Then, to reduce topic redundancy, Sinkhorn Topic Diversity Regularization (STDR) is presented. STDR leverages the proposed Salient Topic Layer and Sinkhorn divergence to measure the distance between salient topic features and serves as a penalty term in the loss function, facilitating the discovery of diversified topics in training. Several experiments have been conducted on 2 popular datasets to verify our contribution. Experimental results demonstrate the effectiveness of the proposed model.

14.
    
Stock exchange forecasting is an important aspect of business investment plans. Customers prefer to invest in stocks rather than traditional investments due to their high profitability. The high profit is often linked with high risk due to the nonlinear nature of the data and complex economic rules. Stock markets are often volatile and change abruptly due to economic conditions, the political situation, and major events in the country. Therefore, investigating the effect of major events, more specifically global and local events, on the top stock companies of different countries remains an open research area. In this study, we consider four countries (the US, Hong Kong, Turkey, and Pakistan) from the lists of developed, emerging, and underdeveloped economies. We have explored the effect of different major events that occurred during 2012–2016 on stock markets. We use a Twitter dataset to compute the sentiment for each of these events. The dataset consists of 11.42 million tweets that were used to determine the event sentiment. We have used linear regression, support vector regression, and deep learning for stock exchange forecasting. The performance of the system is evaluated using the Root Mean Square Error (RMSE) and Mean Absolute Error (MAE). The results show that performance improves when the sentiment for these events is used.
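The two evaluation metrics named in this abstract have standard definitions, shown here as a minimal sketch. The function names and the sample numbers are illustrative only and are not results from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Square Error: square root of the mean squared residual."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean Absolute Error: mean of the absolute residuals."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n


# Illustrative closing prices and forecasts (made-up numbers).
actual = [10.0, 12.0, 11.0, 13.0]
forecast = [10.0, 11.0, 11.0, 16.0]
errors = (rmse(actual, forecast), mae(actual, forecast))
```

RMSE penalises the single large miss (3 units) more heavily than MAE does, so reporting both gives a sense of whether forecast errors are uniform or dominated by a few outliers.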

15.
Research on automated social media rumour verification, the task of identifying the veracity of questionable information circulating on social media, has yielded neural models achieving high performance, with accuracy scores that often exceed 90%. However, none of these studies focus on the real-world generalisability of the proposed approaches, that is, whether the models perform well on datasets other than those on which they were initially trained and tested. In this work we aim to fill this gap by assessing the generalisability of top-performing neural rumour verification models covering a range of different architectures from the perspectives of both topic and temporal robustness. For a more complete evaluation of generalisability, we collect and release COVID-RV, a novel dataset of Twitter conversations revolving around COVID-19 rumours. Unlike other existing COVID-19 datasets, our COVID-RV contains conversations around rumours that follow the format of prominent rumour verification benchmarks, while being different from them in terms of topic and time scale, thus allowing better assessment of the temporal robustness of the models. We evaluate model performance on COVID-RV and three popular rumour verification datasets to understand the limitations and advantages of different model architectures, training datasets, and evaluation scenarios. We find a dramatic drop in performance when testing models on a different dataset from that used for training. Further, we evaluate the ability of models to generalise in a few-shot learning setup, as well as when word embeddings are updated with the vocabulary of a new, unseen rumour. Drawing upon our experiments, we discuss challenges and make recommendations for future research directions in addressing this important problem.

16.
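This paper studies the Kirsch operator, a classical operator for digital image edge detection proposed by earlier researchers, and analyzes in detail the method's motivation, theoretical basis, mathematical principles, implementation, and strengths and weaknesses. Based on experimental results obtained with an implementation in Visual C++, an improvement to the Kirsch algorithm is proposed, edge extraction with the improved fast Kirsch operator is implemented, and the processing results on general images are analyzed.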
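For readers unfamiliar with the classical Kirsch operator, the sketch below shows its textbook form: eight 3x3 compass masks obtained by rotating the weights (5, 5, 5, -3, -3, -3, -3, -3) around the centre pixel, with the edge strength taken as the maximum of the eight responses. It is written in Python for brevity rather than the Visual C++ used in the paper, and it does not reproduce the improved fast variant, which the abstract does not describe. The names and the test image are hypothetical.

```python
# Offsets of the 8 neighbours, clockwise starting from the top-left corner.
RING = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
# Weights of the "north" mask laid out along RING; rotating them gives the other 7 masks.
BASE = [5, 5, 5, -3, -3, -3, -3, -3]


def kirsch_response(image, r, c):
    """Maximum response of the 8 compass masks at interior pixel (r, c)."""
    neigh = [image[r + dr][c + dc] for dr, dc in RING]
    return max(
        sum(BASE[(k - shift) % 8] * neigh[k] for k in range(8))
        for shift in range(8)
    )


def kirsch_edges(image):
    """Edge-strength map; border pixels are left at 0 because they lack a full 3x3 window."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            out[r][c] = kirsch_response(image, r, c)
    return out


# A 4x4 grayscale image with a vertical step edge between columns 1 and 2.
step = [[0, 0, 10, 10] for _ in range(4)]
edges = kirsch_edges(step)
```

Because the mask weights sum to zero, flat regions give a response of 0, while pixels next to the step give a large positive value. A threshold on the map then yields a binary edge image.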

17.
Stock movement forecasting is usually formalized as a sequence prediction task based on time series data. Recently, more and more deep learning models have been used to fit dynamic stock time series thanks to their good nonlinear mapping ability, but few of them attempt to unveil a market system’s internal dynamics. For instance, the driving force (state) behind a stock's rise may be the company’s good profitability or concept marketing, and knowing it helps in judging the future trend of the stock. To address this issue, we regard the explored pattern as an organic component of the hidden mechanism. Considering the effective hidden state discovery ability of the Hidden Markov Model (HMM), we aim to integrate it into the training process of the deep learning model. Specifically, we propose a deep learning framework called Hidden Markov Model-Attentive LSTM (HMM-ALSTM) to model stock time series data, which guides the hidden state learning of deep learning methods via the market’s pattern (learned by the HMM) that generates the time series data. Furthermore, a large number of experiments on 6 real-world data sets and 13 stock prediction baselines for predicting stock movement and return rate are implemented. Our proposed HMM-ALSTM achieves an average 10% improvement on all data sets compared to the best baseline.

18.
    
This paper presents a semantically rich document representation model for automatically classifying financial documents into predefined categories utilizing deep learning. The model architecture consists of two main modules: document representation and document classification. In the first module, a document is enriched with semantics using background knowledge provided by an ontology and through the acquisition of its relevant terminology. Acquisition of terminology integrated into the ontology extends the capabilities of semantically rich document representations with in-depth coverage of concepts, thereby capturing the whole conceptualization involved in documents. Semantically rich representations obtained from the first module serve as input to the document classification module, which aims at finding the most appropriate category for that document through deep learning. Three different deep learning networks, each belonging to a different category of machine learning techniques, are used for ontological document classification with a real-life ontology. Multiple simulations are carried out with various deep neural network configurations, and our findings reveal that a three-hidden-layer feedforward network with 1024 neurons obtains the highest document classification performance on the INFUSE dataset. The performance in terms of F1 score is further increased by almost five percentage points to 78.10% for the same network configuration when the relevant terminology integrated into the ontology is applied to enrich document representation. Furthermore, we conducted a comparative performance evaluation using various state-of-the-art document representation approaches and classification techniques, including shallow and conventional machine learning classifiers.

19.
    
Graph neural networks have been frequently applied in recommender systems due to their powerful representation abilities for irregular data. However, these methods still suffer from difficulties such as inflexible graph structures, sparse and highly imbalanced data, and relatively shallow networks, limiting rate prediction ability for recommendations. This paper presents a novel deep dynamic graph attention framework based on influence and preference relationship reconstruction (DGA-IPR) for recommender systems to learn optimal latent representations of users and items. The entire framework involves a user branch and an item branch. An influence-based dynamic graph attention (IDGA) module, a preference-based dynamic graph attention (PDGA) module, and an adaptive fine feature extraction (AFFE) module are constructed for each branch. Concretely, the first two attention modules concentrate on reconstructing influence and preference relationship graphs, breaking the imbalanced and fixed constraints of graph structures. Then a deep feature aggregation block and an adaptive feature fusion operation are built, improving the network depth and capturing potential high-order information expressions. In addition, AFFE is designed to acquire finer latent features for users and items. The DGA-IPR architecture is formed by integrating IDGA, PDGA, and AFFE for users and items, respectively. Experiments reveal the superiority of DGA-IPR over existing recommendation models.

20.
    
Deep Learning has reached human-level performance in several medical tasks, including classification of histopathological images. Continuous effort has been made to find effective strategies to interpret these types of models. Among them, saliency maps, which depict the weights of the pixels on the classification as a heatmap of intensity values, have been by far the most used for image classification. However, there is a lack of tools for the systematic evaluation of saliency maps, and existing works introduce non-natural noise such as random or uniform values. To address this issue, we propose an approach to evaluate the faithfulness of saliency maps by introducing natural perturbations in the image, based on opposite-class substitution, and studying their impact on evaluation metrics adapted from saliency models. We validate the proposed approach on a breast cancer metastases detection dataset, PatchCamelyon, with 327,680 patches of histopathological images of sentinel lymph node sections. Results show that GradCAM, Guided-GradCAM, and gradient-based saliency map methods are sensitive to natural perturbations and correlate with the presence of tumor evidence in the image. Overall, this approach proves to be a solution for the validation of saliency map methods without introducing confounding variables and shows potential for application to other medical imaging tasks.
