Browsing by Author "Srilakshmi, N"
Item: BODY JOINTS AND TRAJECTORY GUIDED 3D DEEP CONVOLUTIONAL DESCRIPTORS FOR HUMAN ACTIVITY IDENTIFICATION (Blue Eyes Intelligence Engineering & Sciences Publication, 2019-10). Srilakshmi, N; Radha, N.

Human Activity Identification (HAI) in videos is one of the most active research fields in computer vision. Among the various HAI techniques, Joints-pooled 3D-Deep convolutional Descriptors (JDD) have achieved effective performance by learning the body joints and capturing spatiotemporal characteristics concurrently. However, the time needed to estimate the locations of body joints on large-scale datasets and the computational cost of the skeleton estimation algorithm were high, and the recognition accuracy of traditional approaches needs to be improved by considering both body joints and trajectory points together. Therefore, the key goal of this work is to improve recognition accuracy using optical flow integrated with a two-stream bilinear model, namely Joints and Trajectory-pooled 3D-Deep convolutional Descriptors (JTDD). In this model, optical-flow/trajectory points between video frames are also extracted at the body joint positions as input to the proposed JTDD. To this end, two streams of a Convolutional 3D network (C3D), combined via a bilinear product, are used to extract features, generate joint descriptors for video sequences, and capture spatiotemporal features. The whole network is then trained end-to-end on the two-stream bilinear C3D model to obtain the video descriptors, which are classified by a linear Support Vector Machine (SVM) to recognize human activities. By using both body joints and trajectory points, action recognition is achieved efficiently. Finally, the recognition accuracies of the JTDD and JDD models are compared.

Item: DEEP POSITIONAL ATTENTION-BASED BIDIRECTIONAL RNN WITH 3D CONVOLUTIONAL VIDEO DESCRIPTORS FOR HUMAN ACTION RECOGNITION (IOP Publishing Ltd, 2021). Srilakshmi, N; Radha, N.

This article presents Joints and Trajectory-pooled 3D-Deep Positional Attention-based Bidirectional Recurrent convolutional Descriptors (JTPADBRD) for recognizing human activities from video sequences. First, the video is partitioned into clips, which are given as input to a two-stream Convolutional 3D (C3D) network in which the attention stream extracts the body joint locations and the feature stream extracts the trajectory points along with spatiotemporal features. The extracted features of each clip must then be aggregated to create the video descriptor: the pooled feature vectors of all clips within the video sequence are aggregated by the PABRNN, which concatenates all the pooled feature vectors related to the body joints and trajectory points in a single frame. Thus, the convolutional feature vector representations of all clips belonging to one video sequence are aggregated into a descriptor of the video using Recurrent Neural Network (RNN)-based pooling. In addition, the two streams are multiplied via a bilinear product and are end-to-end trainable from class labels. Further, the activations of the fully connected layers and their spatiotemporal variances are aggregated to create the final video descriptor. These video descriptors are then given to a Support Vector Machine (SVM) for recognizing human behaviors in videos. Finally, the experimental outcomes show a considerable improvement in Recognition Accuracy (RA): the JTPADBRD achieves approximately 99.4% on the Penn Action dataset, compared to existing methods.
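Both abstracts above hinge on fusing a joint stream and a trajectory stream of a C3D network through a bilinear product. The sketch below shows that fusion step in PyTorch; the tiny `C3DStream` module, the channel counts, and the feature dimension are illustrative assumptions standing in for a full C3D backbone, not the authors' exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class C3DStream(nn.Module):
    """A tiny 3D-convolutional stand-in for one full C3D stream."""
    def __init__(self, in_channels, feat_dim=64):
        super().__init__()
        self.conv = nn.Conv3d(in_channels, feat_dim, kernel_size=3, padding=1)
        self.pool = nn.AdaptiveAvgPool3d(1)   # collapse the T x H x W volume

    def forward(self, clip):                   # clip: (B, C, T, H, W)
        return self.pool(F.relu(self.conv(clip))).flatten(1)  # (B, feat_dim)

class BilinearTwoStream(nn.Module):
    """Joint stream and trajectory stream fused by a bilinear (outer) product."""
    def __init__(self, feat_dim=64):
        super().__init__()
        self.joint_stream = C3DStream(in_channels=3, feat_dim=feat_dim)  # RGB at joints
        self.traj_stream = C3DStream(in_channels=2, feat_dim=feat_dim)   # flow (dx, dy)

    def forward(self, rgb_clip, flow_clip):
        a = self.joint_stream(rgb_clip)        # (B, D)
        b = self.traj_stream(flow_clip)        # (B, D)
        descriptor = torch.einsum('bi,bj->bij', a, b).flatten(1)  # (B, D*D)
        return F.normalize(descriptor, dim=1)  # L2-normalized video descriptor

model = BilinearTwoStream()
rgb = torch.randn(2, 3, 16, 112, 112)          # batch of 16-frame RGB clips
flow = torch.randn(2, 2, 16, 112, 112)         # matching optical-flow clips
print(model(rgb, flow).shape)                  # torch.Size([2, 4096])
```

The resulting descriptors would then be fed to a linear SVM (for example, scikit-learn's `LinearSVC`), as both abstracts describe.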
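The second abstract additionally pools per-clip feature vectors into one video descriptor with a bidirectional RNN. A minimal sketch of that aggregation step follows; the GRU cell, hidden size, and mean-pooling over clips are assumptions standing in for the positional attention-based bidirectional RNN (PABRNN).

```python
import torch
import torch.nn as nn

class ClipAggregator(nn.Module):
    """Aggregate per-clip feature vectors into one video descriptor."""
    def __init__(self, clip_dim, hidden=128):
        super().__init__()
        self.rnn = nn.GRU(clip_dim, hidden, batch_first=True, bidirectional=True)

    def forward(self, clip_feats):             # (B, num_clips, clip_dim)
        out, _ = self.rnn(clip_feats)          # (B, num_clips, 2 * hidden)
        return out.mean(dim=1)                 # video descriptor: (B, 2 * hidden)

agg = ClipAggregator(clip_dim=4096)
video_desc = agg(torch.randn(2, 8, 4096))      # 8 clips per video -> (2, 256)
```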
Item: MULTI-VIEW HUMAN ACTION RECOGNITION USING ADAPTIVE OPTIMIZATION ALGORITHM WITH SKELETON BASED GRAPH NEURAL NETWORKS (Conference Paper) (Institute of Electrical and Electronics Engineers Inc., 2024-01-25). Srilakshmi, N; Radha, N.

Recognizing human actions from multi-viewpoint video data is a complex task with applications in various fields, including surveillance, robotics, and human-computer interaction. This study presents an innovative approach to Multi-View Human Action Recognition (MV-HAR) using an Adaptive Optimization Algorithm (AOA) combined with a Skeleton-based Graph Neural Network (SGNN). The proposed architecture aims to leverage the complementary information from multiple viewpoints while effectively capturing the temporal and spatial dependencies inherent in human actions. The pre-processing pipeline aligns and fuses skeleton data extracted from different viewpoints, producing a coherent representation of the action across cameras. Each skeleton sequence is transformed into a graph structure in which joints are nodes and edges encapsulate the relationships between joints over time. This sequence of graphs is then processed by the SGNN, which learns to capture the evolving dynamics of the action through multiple graph convolutional layers. On benchmark datasets, the proposed approach proves effective, achieving an accuracy of 96.8%.
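To illustrate the skeleton-graph idea in the third abstract, the sketch below builds a graph from a hypothetical five-joint skeleton and applies one graph-convolutional layer with a normalized adjacency. The joint count, topology, and propagation rule are assumptions for illustration; the abstract's AOA component and multi-view fusion are not modeled here.

```python
import torch
import torch.nn as nn

class GraphConv(nn.Module):
    """One graph-convolutional layer: aggregate neighbor joints, then project."""
    def __init__(self, in_dim, out_dim, adjacency):
        super().__init__()
        # Row-normalized adjacency with self-loops, fixed for the skeleton topology.
        a = adjacency + torch.eye(adjacency.size(0))
        self.register_buffer('a_norm', a / a.sum(dim=1, keepdim=True))
        self.proj = nn.Linear(in_dim, out_dim)

    def forward(self, x):                      # x: (B, num_joints, in_dim)
        return torch.relu(self.proj(self.a_norm @ x))

# Hypothetical 5-joint skeleton: head-neck-hip chain plus two arms off the neck.
edges = [(0, 1), (1, 2), (1, 3), (1, 4)]
adj = torch.zeros(5, 5)
for i, j in edges:
    adj[i, j] = adj[j, i] = 1.0

layer = GraphConv(in_dim=3, out_dim=16, adjacency=adj)  # 3D joint coordinates in
features = layer(torch.randn(2, 5, 3))                  # per-joint features: (2, 5, 16)
```

A full SGNN would stack several such layers over the temporal sequence of skeleton graphs to capture the evolving dynamics the abstract describes.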