
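An online action detection algorithm based on two-stream LSTM and self-supervised learning
Authors:ZHU Jiatong  QING Laiyun  HUANG Qingming
Affiliation:School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049
Funding:Supported by the National Natural Science Foundation of China (61872333)
Abstract:Online action detection is very important for applications such as security and human-computer interaction. The problem requires the model to detect an action as soon as it begins, rather than waiting until the whole event has finished. Because online action detection can rely only on the observed part of the video, the model needs to mine more information to support its decisions than in tasks such as action recognition and action detection. Building on the long short-term memory (LSTM) model commonly used for online action detection, a two-stream LSTM model (2S-LSTM) is constructed, and the idea of self-supervised learning, widely used in the image domain, is introduced into the online action detection problem. First, the two-stream 2S-LSTM model uses LSTMs to model the temporal information of the RGB stream and the optical flow stream separately. In addition, following the idea of self-supervised learning, two new loss functions, the temporal similarity loss and the optical flow stability loss, are constructed for training the model. Experiments show that, compared with the earlier online action detection methods RED, TRN, and IDN, the proposed model achieves better results on both the TVSeries and THUMOS’14 datasets.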

Keywords:self-supervised learning  two-stream LSTM (2S-LSTM)  online action detection  temporal similarity loss  optical flow stability loss
Received:2021-03-22
Revised:2021-05-31

Two-stream LSTM based on self-supervised learning for online action detection
Authors:ZHU Jiatong  QING Laiyun  HUANG Qingming
Institution:School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049, China
Abstract:Online action detection plays an important role in applications such as security and human-computer interaction. The task requires the system to detect an action as soon as it starts, rather than waiting for the entire action to finish. Because a model for online action detection can base its judgments only on the observed part of the video, it must, compared with tasks such as action recognition and action detection, extract more from the historical information to support its decision for the current frame. Building on the long short-term memory (LSTM) model commonly used in online action detection, this paper constructs a two-stream LSTM model, called 2S-LSTM, and introduces self-supervised learning, which is widely used in the image domain, into the online action detection problem. First, the two-stream network 2S-LSTM uses LSTMs to model the temporal information of the RGB stream and the optical flow stream separately. In addition, following the idea of self-supervised learning, we construct two new loss functions for training: the temporal similarity loss and the optical flow stability loss. Experiments show that, compared with previous online action detection methods such as RED, TRN, and IDN, our model achieves better results on both the TVSeries and THUMOS’14 datasets.
Keywords:self-supervised learning  two-stream LSTM networks (2S-LSTM)  online action detection  temporal similarity loss  optical flow stability loss
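
The abstract describes one LSTM per input stream (RGB and optical flow) and a decision made for each frame from past observations only. The pure-Python sketch below illustrates that online, frame-by-frame setting; it is not the authors' implementation. The class names, the randomly initialized weights, the feature dimensions, and the fusion by concatenating the two hidden states are all assumptions made for illustration, and the temporal similarity and optical flow stability losses are omitted because the abstract does not define them.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class LSTMCell:
    """Minimal LSTM cell on plain Python lists (illustration only)."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        n = input_size + hidden_size
        # One weight matrix and bias per gate: input, forget, candidate, output.
        self.W = {g: [[rng.uniform(-0.5, 0.5) for _ in range(n)]
                      for _ in range(hidden_size)] for g in "ifgo"}
        self.b = {g: [0.0] * hidden_size for g in "ifgo"}

    def _gate(self, g, z):
        return [sum(w * v for w, v in zip(row, z)) + b
                for row, b in zip(self.W[g], self.b[g])]

    def step(self, x, h, c):
        z = list(x) + list(h)
        i = [sigmoid(v) for v in self._gate("i", z)]
        f = [sigmoid(v) for v in self._gate("f", z)]
        g = [math.tanh(v) for v in self._gate("g", z)]
        o = [sigmoid(v) for v in self._gate("o", z)]
        c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
        h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
        return h_new, c_new


class TwoStreamOnlineDetector:
    """Hypothetical two-stream detector: one LSTM per stream, fused per frame."""

    def __init__(self, rgb_dim, flow_dim, hidden_size, num_classes, seed=0):
        rng = random.Random(seed)
        self.rgb_cell = LSTMCell(rgb_dim, hidden_size, rng)
        self.flow_cell = LSTMCell(flow_dim, hidden_size, rng)
        # Assumed fusion: a linear classifier over the concatenated hidden states.
        self.classifier = [[rng.uniform(-0.5, 0.5) for _ in range(2 * hidden_size)]
                           for _ in range(num_classes)]
        self.reset()

    def reset(self):
        k = self.rgb_cell.hidden_size
        self.rgb_state = ([0.0] * k, [0.0] * k)
        self.flow_state = ([0.0] * k, [0.0] * k)

    def observe(self, rgb_feat, flow_feat):
        """Consume one frame and return class probabilities for that frame."""
        self.rgb_state = self.rgb_cell.step(rgb_feat, *self.rgb_state)
        self.flow_state = self.flow_cell.step(flow_feat, *self.flow_state)
        fused = self.rgb_state[0] + self.flow_state[0]  # list concatenation
        logits = [sum(w * v for w, v in zip(row, fused)) for row in self.classifier]
        return softmax(logits)


# Toy per-frame features standing in for real RGB and optical flow descriptors.
frame_rng = random.Random(1)
rgb_frames = [[frame_rng.random() for _ in range(4)] for _ in range(5)]
flow_frames = [[frame_rng.random() for _ in range(3)] for _ in range(5)]

detector = TwoStreamOnlineDetector(rgb_dim=4, flow_dim=3, hidden_size=6, num_classes=3)
# Each prediction is emitted as soon as its frame arrives; no future frame is used.
online_probs = [detector.observe(r, f) for r, f in zip(rgb_frames, flow_frames)]
```

Because each call to `observe` updates only the recurrent states, the prediction for a frame depends on that frame and the frames before it, which is the constraint that separates online detection from offline action detection.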