Questions tagged [action-recognition]
For questions about action recognition. Use this tag when asking about techniques or design choices that could help or hurt action recognition performance.
16 questions
0 votes
0 answers
35 views
How can I detect suspicious customer actions using computer vision?
I'm designing a computer vision system to detect suspicious customer behavior in a store, for example: unusual body movements near a cashier or shelf, sudden hiding motions, loitering for too long in ...
2 votes
1 answer
62 views
How to handle random cropping in video frames where width is significantly larger than height to avoid cropping out objects of interest?
I am working on a video processing pipeline where the frames have a width that is much larger than the height (wide aspect ratio). My main goal is to apply action recognition on human-object ...
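One possible way to keep an object of interest inside a random crop is to restrict the crop offset to positions that still cover the object's bounding box. The sketch below is only an illustration under stated assumptions: the function name `sample_crop_x`, the availability of a per-clip horizontal bounding box (`box_x0`, `box_x1`), and the pixel values in the usage lines are all hypothetical and not taken from the question.

```python
import random


def sample_crop_x(frame_w, crop_w, box_x0, box_x1, rng=random):
    """Pick the left edge of a crop of width crop_w that fully contains
    the horizontal span [box_x0, box_x1] of an object of interest.

    All arguments are in pixels. The bounding box is assumed to be known
    (for example from a detector); that is an assumption of this sketch.
    """
    if not 0 <= box_x0 <= box_x1 <= frame_w:
        raise ValueError("bounding box must lie inside the frame")
    if crop_w > frame_w:
        raise ValueError("crop is wider than the frame")
    if box_x1 - box_x0 > crop_w:
        raise ValueError("bounding box is wider than the crop")

    # The crop [x, x + crop_w] covers the box when x <= box_x0 and
    # x + crop_w >= box_x1, and it stays in the frame when
    # 0 <= x <= frame_w - crop_w.
    lowest = max(0, box_x1 - crop_w)
    highest = min(frame_w - crop_w, box_x0)
    return rng.randint(lowest, highest)


# Hypothetical wide frame: 1920 px wide, 224 px crop, object at x in [900, 1000].
x = sample_crop_x(1920, 224, 900, 1000)
print(x, x + 224)
```

For video, the same offset would normally be reused for every frame of a clip so that the crop does not jitter over time.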
0 votes
1 answer
56 views
Is background segmentation effective for improving action recognition model on real-time human-object interaction videos?
I am working on an action recognition task involving human-object interactions using an I3D (3D CNN-based) model. The model was trained on pre-recorded videos, and it performed well during evaluation. ...
0 votes
1 answer
179 views
How to improve the performance when no shuffling of dataloader is needed?
I'm currently doing some research on video recognition. What I'm trying to do is similar to this paper. The idea is that: for processing a specific input video clip (shape: [T, C, H, W]), it needs ...
0 votes
1 answer
155 views
Can I flip a video to generate more data for action recognition?
There are 8 distinct action classes and around 50+ videos per class. I was wondering if flipping videos from the training set can be a good option to generate additional data. Is it?
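As a rough illustration of what flipping a video means in practice, the sketch below mirrors every frame of a clip left to right and applies the same decision to the whole clip. It assumes a clip stored as nested lists indexed `[T][C][H][W]`; the helper names `hflip_clip` and `maybe_hflip` are hypothetical. Flipping only preserves the label when the classes do not depend on left versus right (for instance, "swipe left" and "swipe right" would be swapped by a flip).

```python
import random


def hflip_clip(clip):
    """Mirror every frame left to right.

    clip is a nested list indexed [T][C][H][W]; reversing each innermost
    row reverses the width axis.
    """
    return [[[row[::-1] for row in channel] for channel in frame]
            for frame in clip]


def maybe_hflip(clip, p=0.5, rng=random):
    """Flip the whole clip with probability p.

    The decision is made once per clip so that all frames stay
    consistent with each other.
    """
    return hflip_clip(clip) if rng.random() < p else clip


# Tiny clip: 1 frame, 1 channel, 2 rows, 3 columns.
clip = [[[[1, 2, 3], [4, 5, 6]]]]
print(hflip_clip(clip))  # [[[[3, 2, 1], [6, 5, 4]]]]
```

With array libraries the same operation is usually a single reversal along the width axis.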
2 votes
1 answer
63 views
What type of neural network do you need if you want to detect an action or dynamic pattern instead of a static pattern?
Let's say that you want to detect if a man is running, walking, or dancing instead of just detecting a man still. What type of neural networks will you use for this purpose?
1 vote
0 answers
192 views
Is there a way, while training (with contrastive learning) the embedding network, to find the test accuracy?
I aim to do action recognition in videos on a private dataset. To compare with the existing state-of-the-art implementations, other researchers have published their code on GitHub, like the one here (for the ...
1 vote
0 answers
312 views
What are the pros and cons of 3D CNN and 2D CNN combined with optical flow for action recognition?
For action recognition or similar tasks, one can either use 3D CNN or combine 2D CNN with optical flow. See this paper for details. Can someone tell the pros/cons of each, in terms of accuracy, cost ...
3 votes
1 answer
227 views
How can I do video classification while taking into account the temporal dependencies of the frames?
I need to solve a video classification problem. While looking for solutions, I only found solutions that transform this problem into a series of simpler image classification tasks. However, this ...
1 vote
1 answer
457 views
What is "temporal depth"?
I need some explanation about the following paragraph (page 3) from the paper A Novel Approach for Robust Multi Human Action Detection and Recognition based on 3-Dimentional Convolutional Neural ...
2 votes
2 answers
187 views
Can PDDL be utilized for action recognition?
The Planning Domain Definition Language (PDDL) is known for its capabilities of symbolic planning in the state space. A solver will find a sequence of steps to bring the system from a start state to ...
3 votes
1 answer
156 views
How should continuous action/gesture recognition be performed differently from isolated action recognition?
I am going to train a deep learning model to classify hand gestures in video. Since the person will be taking up nearly the entire width/height of the video and I will be classifying what hand gesture ...
4 votes
1 answer
154 views
What topologies support recognition of action sequences?
The ability to recognize an object with particular identifying features from single or multiple camera shots with the temporal dimension digitized as frames has been shown. The proof is that the ...
3 votes
2 answers
802 views
Why do action recognition algorithms perform better on the UCF101 dataset than on the HMDB51 dataset?
If we look at state-of-the-art accuracy on the UCF101 dataset, it is around 93% whereas for the HMDB51 dataset it is around 66%. I looked at both datasets and both contain videos of similar ...
4 votes
1 answer
1k views
Applications of CNN for detecting crime from video surveillance cameras
Inspired by this discussion about recognizing human actions, I have found the Fall-Detection project which detects humans falling on the ground from a CCTV camera feed, and which can consider alerting ...