
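A Log Analysis Method Based on Decision Trees
Cite this article: FAN Jian-chang, YU Su. A Log Analysis Method Based on Decision Trees[J]. 教育技术导刊 (Introduction of Educational Technology), 2020, 19(1): 99-102.
Authors: FAN Jian-chang, YU Su
Affiliations: 1. School of Mechanical and Automotive Engineering, Shanghai University of Engineering and Technology; 2. Engineering Training Center, Shanghai University of Engineering and Technology, Shanghai 201620
Funding: Shanghai Science and Technology Commission Innovation Action Plan project (17511110204)
Abstract: To address the decline in service quality caused by performance faults during server operation, a log analysis method based on decision trees is proposed. Taking the key performance indicator data recorded in server log files as the object of study, the method uses three common decision tree algorithms (ID3, C4.5, and CART) to predict the future trend of server performance indicators. Experimental results show that, in actual operation, C4.5 achieves the best accuracy and recall on server performance indicator data, 92.23% and 95.37% respectively, the highest among the three algorithms. Compared with the traditional approach, in which developers search log files for faults, accuracy improves by about 20%, so the method predicts the trend of server performance indicators better. The method makes it possible to sense the system's operating state in advance and adjust in time, which effectively reduces the probability of server failures in production and improves service quality.

Keywords: decision tree algorithm; log analysis; Spark; big data
Received: 2019-03-19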

Decision Tree Based Log Analysis Method
FAN Jian-chang, YU Su. Decision Tree Based Log Analysis Method[J]. Introduction of Educational Technology, 2020, 19(1): 99-102.
Authors: FAN Jian-chang, YU Su
Institution: 1. School of Mechanical and Automotive Engineering, Shanghai University of Engineering and Technology; 2. Engineering Training Center, Shanghai University of Engineering and Technology, Shanghai 201620, China
Abstract: To address the degradation of service quality caused by performance failures during server operation, this paper proposes a log analysis method based on decision trees. The method takes the key performance indicator data recorded in server log files as its input and applies three commonly used decision tree algorithms, ID3, C4.5 and CART, to predict the future trend of the server's performance indicators. The results show that, in actual operation, C4.5 gives the best accuracy and recall for predicting server performance indicator data, 92.23% and 95.37% respectively, the highest among the three algorithms. Its accuracy is also about 20% higher than that of the traditional approach, in which developers search log files for faults, so it predicts the trend of server performance indicators better. The method makes it possible to anticipate the system's operating state, adjust in time, and thereby reduce the probability of server failure in production and improve quality of service.
Keywords: decision tree algorithm; log analysis; Spark; big data
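
The three algorithms compared in the abstract differ mainly in how they score a candidate split: ID3 uses information gain, C4.5 uses gain ratio, and CART (for classification) uses the decrease in Gini impurity. The following is a minimal sketch of those three criteria, not the authors' implementation. The feature names (`cpu`, `mem`), the labels, and the four toy rows are hypothetical and do not come from the paper's data. Features are kept binary so that the multiway split used here coincides with the binary splits CART makes.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    return 1.0 - sum((c / total) ** 2 for c in Counter(labels).values())


def partition(rows, labels, feature):
    """Group the labels by the value each row takes for `feature`."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    return groups


def information_gain(rows, labels, feature):
    """ID3 criterion: parent entropy minus weighted child entropy."""
    total = len(labels)
    groups = partition(rows, labels, feature)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(rows, labels, feature):
    """C4.5 criterion: information gain divided by split information."""
    total = len(labels)
    groups = partition(rows, labels, feature)
    split_info = -sum(len(g) / total * math.log2(len(g) / total)
                      for g in groups.values())
    if split_info == 0:  # feature takes a single value, so no real split
        return 0.0
    return information_gain(rows, labels, feature) / split_info


def gini_decrease(rows, labels, feature):
    """CART criterion: parent Gini minus weighted child Gini."""
    total = len(labels)
    groups = partition(rows, labels, feature)
    weighted = sum(len(g) / total * gini(g) for g in groups.values())
    return gini(labels) - weighted


# Hypothetical discretized performance indicators parsed from log lines.
rows = [
    {"cpu": "high", "mem": "high"},
    {"cpu": "high", "mem": "low"},
    {"cpu": "low", "mem": "high"},
    {"cpu": "low", "mem": "low"},
]
labels = ["degraded", "degraded", "normal", "normal"]

for feature in ("cpu", "mem"):
    print(feature,
          information_gain(rows, labels, feature),
          gain_ratio(rows, labels, feature),
          gini_decrease(rows, labels, feature))
```

In this toy data `cpu` separates the two classes perfectly and `mem` carries no information, so all three criteria rank `cpu` first. On real log data the criteria can disagree, most visibly when a feature has many distinct values: information gain tends to favor such features, and gain ratio was introduced in C4.5 to penalize that.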