I have a bunch of videos, each with some statistics about what is happening inside it. One such piece of information is given as a time in the video, in seconds, to one decimal place.
To get the FPS of a video, I am using ffmpeg -i.
But when I manually compute a particular frame's time using the reported FPS, it does not match.
For example, ffmpeg outputs FPS = 30.
According to the video statistics, the frame at 156.8 s (2:36.8) should be the 4704th frame (156.8 * 30 = 4704). I open the video using 'skvideo', read all the frames, and view the 4704th frame. It turns out to be a frame from around 2:12 or so. I checked multiple such instances in multiple videos, and this behavior is consistent.
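For concreteness, this is the computation I am doing, as a minimal sketch. It assumes a constant frame rate equal to what ffmpeg reports; the function names are mine and do not come from any library.

```python
def time_to_frame(t_seconds, fps):
    """Map a timestamp to a frame index, assuming a constant frame rate."""
    return int(round(t_seconds * fps))


def format_mmss(t_seconds):
    """Format seconds as m:ss.s, working in tenths of a second to avoid float noise."""
    tenths = int(round(t_seconds * 10))
    minutes, rem_tenths = divmod(tenths, 600)
    return "%d:%04.1f" % (minutes, rem_tenths / 10.0)


# Values from the example above: 156.8 s at the reported 30 FPS.
frame_index = time_to_frame(156.8, 30)
print(frame_index, format_mmss(156.8))
```

This gives frame 4704 for 2:36.8, which is the frame I then look up with skvideo.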
I do not understand why this happens. How can I get around the problem?
I am not bound to ffmpeg. skvideo is being used to read the videos. I tried OpenCV, but as of now VideoCapture does not work for me, and reinstalling it would cost me too much time. But I guess the choice between OpenCV and skvideo should not matter, since one can count the frames manually as well.
So, the solution I am looking for is this:
Given timestamps inside a video, how can I find the frame at each of those time locations?
In case someone has already worked on this, it is related to the THUMOS dataset. I am on Ubuntu 16.04.
EDIT_1
Actually, I can be more specific, as this is publicly available data. The time bounds mark an important activity. For example, when does a basketball dunk occur in a video? The bounds are given in pairs: [start end]. Some videos have multiple activities, some have only one.
Here is a sample video, and the following are its activity times.
[[ 16.5, 20.8], [ 26.6, 32.2], [ 34.8, 42.1], [ 47.8, 50.0], [ 58.1, 62.9], [ 65.6, 67.2], [ 68.5, 74.0], [ 76.4, 78.3], [ 78.7, 79.8], [ 80.8, 82.1], [ 85.0, 87.3], [ 90.1, 91.4], [ 98.5, 100.3]]
I also tried checking manually: 32.87 FPS "almost" works for a few videos but not for all, and "almost" means it is off by ~10 frames. This is a huge difference for my task, and I need the exact frame.
Also, there has to be some way to do this, because it can be visually confirmed with multiple video players that the times in the dataset are correct.
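To show how sensitive the mapping is, here is a rough sketch that converts the activity times listed above into frame ranges under two constant frame rates: the 30 FPS that ffmpeg reports and the 32.87 FPS I found by trial. The helper name is mine, and a constant frame rate is an assumption that may well be the thing that is wrong.

```python
ACTIVITY_TIMES = [
    [16.5, 20.8], [26.6, 32.2], [34.8, 42.1], [47.8, 50.0], [58.1, 62.9],
    [65.6, 67.2], [68.5, 74.0], [76.4, 78.3], [78.7, 79.8], [80.8, 82.1],
    [85.0, 87.3], [90.1, 91.4], [98.5, 100.3],
]


def intervals_to_frames(intervals, fps):
    """Convert [start, end] times in seconds to (start, end) frame indices,
    assuming a constant frame rate."""
    return [(int(round(start * fps)), int(round(end * fps)))
            for start, end in intervals]


frames_30 = intervals_to_frames(ACTIVITY_TIMES, 30.0)
frames_alt = intervals_to_frames(ACTIVITY_TIMES, 32.87)

# Print both mappings side by side to see how far they drift apart.
for (times, at_30, at_alt) in zip(ACTIVITY_TIMES, frames_30, frames_alt):
    print(times, at_30, at_alt)
```

Even for the first activity, the two rates already disagree by dozens of frames, so guessing a single FPS value by hand does not look like a reliable fix.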