
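Research on a Probabilistic Latent Semantic Analysis Algorithm Based on Parallel Computing
Cite this article: Zhao Wei. Research on a Probabilistic Latent Semantic Analysis Algorithm Based on Parallel Computing [J]. Journal of Anhui Vocational and Technical College (安徽职业技术学院学报), 2014(3): 1-3.
Author: Zhao Wei (赵伟)
Affiliation: Practical Training Center, Anhui Zhong-Ao Institute of Technology (安徽中澳科技职业学院), Hefei, Anhui 230031
Abstract: Probabilistic Latent Semantic Analysis (PLSA) converts the document-word relationship into a document-topic-word relationship in order to rank, filter, and classify documents, which is computationally very expensive. This paper designs an efficient parallel scheme for PLSA based on MPI (Message Passing Interface), optimizes the model system, the handling of training data, and the parallel algorithm, and proposes a parallel PLSA algorithm for big-data conditions. The algorithm addresses the earlier problem that data sets were too large to compute, trains considerably faster than the unoptimized version, and is scalable and feasible.

Keywords: PLSA  MPI  relationship  parallel computing  big data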

An Optimized Algorithm for Probabilistic Latent Semantic Analysis Based on Parallel Computing
Abstract: Probabilistic Latent Semantic Analysis (PLSA) converts the document-word relationship into a document-topic-word relationship so that documents can be sorted, filtered, and classified, which requires a large amount of computation. This paper proposes an optimized parallel algorithm for PLSA in a big-data environment. A PLSA model is designed on top of MPI (Message Passing Interface), addressing the difficulty of computing on large data sets. The optimized algorithm trains faster and is scalable and feasible.
Keywords: PLSA  MPI  relationship  parallel computing  big data
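
The abstract does not spell out the algorithm, so the following is only a minimal sketch, assuming the standard data-parallel EM formulation of PLSA: documents are split into shards, each worker runs the E-step on its own shard, and the per-shard word-topic counts are summed before being normalized into P(w|z). The names `local_em_step`, `reduce_sum`, and `train_plsa` are hypothetical, and the reduction is simulated in-process with plain Python; in an MPI program it would correspond to a sum all-reduce across ranks. This is not the paper's implementation.

```python
import random


def init_dist(n, rng):
    """Return a random, strictly positive probability vector of length n."""
    raw = [rng.random() + 0.1 for _ in range(n)]
    total = sum(raw)
    return [x / total for x in raw]


def local_em_step(shard, p_z_d, p_w_z, n_topics, n_words):
    """Run one EM pass over a shard of (doc_id, {word_id: count}) pairs.

    Updates P(z|d) in place for the shard's documents and returns the
    shard's unnormalized contribution to P(w|z).
    """
    word_topic = [[0.0] * n_words for _ in range(n_topics)]
    for doc_id, counts in shard:
        doc_topic = [0.0] * n_topics
        for w, n_dw in counts.items():
            # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
            post = [p_z_d[doc_id][z] * p_w_z[z][w] for z in range(n_topics)]
            norm = sum(post)
            if norm == 0.0:
                continue
            for z in range(n_topics):
                weighted = n_dw * post[z] / norm
                word_topic[z][w] += weighted
                doc_topic[z] += weighted
        total = sum(doc_topic)
        if total > 0.0:
            # M-step for P(z|d): needs only this document's own counts.
            p_z_d[doc_id] = [x / total for x in doc_topic]
    return word_topic


def reduce_sum(partials):
    """Elementwise sum of per-shard matrices (stand-in for an all-reduce)."""
    n_topics, n_words = len(partials[0]), len(partials[0][0])
    return [[sum(p[z][w] for p in partials) for w in range(n_words)]
            for z in range(n_topics)]


def train_plsa(docs, n_topics, n_words, n_shards=2, n_iters=30, seed=0):
    """Fit PLSA by EM, with documents partitioned into n_shards shards."""
    rng = random.Random(seed)
    p_z_d = [init_dist(n_topics, rng) for _ in docs]
    p_w_z = [init_dist(n_words, rng) for _ in range(n_topics)]
    indexed = list(enumerate(docs))
    shards = [indexed[i::n_shards] for i in range(n_shards)]
    for _ in range(n_iters):
        # Each shard would be handled by a separate process in a real run.
        partials = [local_em_step(s, p_z_d, p_w_z, n_topics, n_words)
                    for s in shards]
        summed = reduce_sum(partials)
        # M-step for P(w|z): normalize the globally summed counts.
        updated = []
        for old_row, row in zip(p_w_z, summed):
            total = sum(row)
            updated.append([x / total for x in row] if total > 0.0 else old_row)
        p_w_z = updated
    return p_z_d, p_w_z


if __name__ == "__main__":
    # Toy corpus: each document maps word ids to counts.
    docs = [{0: 3, 1: 2}, {0: 2, 1: 3}, {2: 4, 3: 1}, {2: 1, 3: 4}]
    p_z_d, p_w_z = train_plsa(docs, n_topics=2, n_words=4)
    for doc_id, dist in enumerate(p_z_d):
        print(doc_id, [round(x, 3) for x in dist])
```

In this formulation only the topic-word counts need to be exchanged between workers, since P(z|d) depends solely on a document's own words; that is what makes partitioning by document a natural fit for message passing.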
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.