DEEP POSITIONAL ATTENTION-BASED BIDIRECTIONAL RNN WITH 3D CONVOLUTIONAL VIDEO DESCRIPTORS FOR HUMAN ACTION RECOGNITION

Srilakshmi, N; Radha, N

dc.contributor.author	Srilakshmi, N
dc.contributor.author	Radha, N
dc.date.accessioned	2023-11-10T08:20:01Z
dc.date.available	2023-11-10T08:20:01Z
dc.date.issued	2021
dc.description.abstract	This article presents the Joints and Trajectory-pooled 3D-Deep Positional Attention-based Bidirectional Recurrent convolutional Descriptors (JTDPABRD) for recognizing human activities in video sequences. First, the video is partitioned into clips, which are fed to a two-stream Convolutional 3D (C3D) network: the attention stream extracts body joint locations, and the feature stream extracts trajectory points together with spatiotemporal features. The pooled feature vectors of all clips in a video sequence must then be aggregated into a single video descriptor. This aggregation is performed by the Positional Attention-based Bidirectional Recurrent Neural Network (PABRNN), which concatenates all the pooled feature vectors related to the body joints and trajectory points in a single frame. In this way, the convolutional feature representations of all clips belonging to one video sequence are combined into a descriptor of the video by Recurrent Neural Network (RNN)-based pooling. The two streams are combined through a bilinear product and are end-to-end trainable from class labels. In addition, the activations of the fully connected layers and their spatiotemporal variances are aggregated to create the final video descriptor. These video descriptors are then passed to a Support Vector Machine (SVM) to recognize human behaviors in videos. Experimental results show that JTDPABRD achieves a Recognition Accuracy (RA) of approximately 99.4% on the Penn Action dataset, a considerable improvement over existing methods.	en_US
dc.identifier.uri	https://iopscience.iop.org/article/10.1088/1757-899X/1022/1/012017/pdf
dc.language.iso	en_US	en_US
dc.publisher	IOP Publishing Ltd	en_US
dc.title	DEEP POSITIONAL ATTENTION-BASED BIDIRECTIONAL RNN WITH 3D CONVOLUTIONAL VIDEO DESCRIPTORS FOR HUMAN ACTION RECOGNITION	en_US
dc.type	Other	en_US
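
The fusion and aggregation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`bilinear_product`, `aggregate_clips`, `video_descriptor`) are hypothetical, the per-clip vectors are toy numbers standing in for the outputs of the two C3D streams, and plain mean and variance pooling across clips is swapped in for the PABRNN aggregation. The SVM classification stage is omitted.

```python
from statistics import fmean, pvariance


def bilinear_product(attention_vec, feature_vec):
    """Fuse two per-clip vectors with a flattened outer (bilinear) product."""
    return [a * f for a in attention_vec for f in feature_vec]


def aggregate_clips(clip_vectors):
    """Pool clip-level vectors into one descriptor.

    Assumption: per-dimension mean and population variance across clips
    stand in for the recurrent (PABRNN) pooling named in the abstract.
    """
    dims = list(zip(*clip_vectors))  # one tuple per feature dimension
    means = [fmean(d) for d in dims]
    variances = [pvariance(d) for d in dims]
    return means + variances


def video_descriptor(attention_clips, feature_clips):
    """Build a video-level descriptor from two streams of per-clip vectors."""
    fused = [
        bilinear_product(a, f)
        for a, f in zip(attention_clips, feature_clips)
    ]
    return aggregate_clips(fused)


# Toy input: two clips, each stream producing a 2-dimensional vector.
attention_clips = [[1.0, 0.0], [0.5, 0.5]]
feature_clips = [[2.0, 4.0], [2.0, 4.0]]
descriptor = video_descriptor(attention_clips, feature_clips)
print(len(descriptor))  # 8: four fused dimensions, each with a mean and a variance
```

In a full pipeline, a descriptor of this kind would be the input to the classifier; the abstract names an SVM for that stage.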

Files

Original bundle

Name: DEEP POSITIONAL ATTENTION-BASED BIDIRECTIONAL RNN WITH 3D CONVOLUTIONAL VIDEO DESCRIPTORS FOR HUMAN ACTION RECOGNITION.pdf
Size: 948.9 KB
Format: Adobe Portable Document Format

Collections

3.Conference Paper (13)