#Video classification with non-trivial quantity of frames.


boreal canopy
#

Any tips on how to do video classification on lengthy videos?
I am eyeballing this workflow:
https://www.mathworks.com/help/deeplearning/ug/classify-videos-using-deep-learning.html

But when I try to preallocate an array the size of my video (70 000 frames, 600 MB), the array is too large.
I have 256 GB of RAM at my disposal, so I can manage it by resizing the video to 224x224 first, but I'm wondering if there's a less clumsy workflow for large videos?

vid = zeros(768,1024,3,78938)
This goes unresponsive.
#

Another question:

while hasFrame(vr)
    i = i + 1;
    video(:,:,:,i) = readFrame(vr);
end

is so unbelievably slow.

cerulean warren
boreal canopy
#

I've found out MATLAB initialises arrays as double by default; that's why my memory gets sucked into oblivion.

#

wip with uint8 right now :3
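
For a rough sense of scale, here is a back-of-the-envelope check of the array size from the `zeros` call above. It is only a sketch, written in Python with NumPy rather than MATLAB; `array_bytes` is a hypothetical helper name, and the numbers cover the raw array only, not any temporary copies.

```python
import numpy as np


def array_bytes(shape, dtype):
    """Bytes needed for a dense array of the given shape and element type."""
    n_elements = int(np.prod(shape, dtype=np.int64))
    return n_elements * np.dtype(dtype).itemsize


shape = (768, 1024, 3, 78938)  # height, width, channels, frames

double_gb = array_bytes(shape, np.float64) / 1e9  # 8 bytes per element
uint8_gb = array_bytes(shape, np.uint8) / 1e9     # 1 byte per element

print(f"double: {double_gb:.0f} GB, uint8: {uint8_gb:.0f} GB")
```

With these dimensions the double array needs roughly 1.5 TB, while uint8 needs roughly 186 GB, which is why only the uint8 version has a chance of fitting in 256 GB of RAM.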

cerulean warren
worn crane
# cerulean warren do you work with large videos as well? Wondering your take...

Hello! Actually yeah, I've worked with large videos. Video classification like this usually combines a CNN with an RNN. To work with large videos effectively, I split the training process into two parts: feature extraction, which is done at the CNN stage, and classification, which is done at the RNN stage. Splitting the workflow this way lets me save the features to disk before passing them to the RNN. So let's say the "history" window of the RNN is 50 frames: I pass the video frames to the CNN in batches of 50 and save the resulting features to disk. Once all the frames have gone through, I simply load the features from disk and pass them to the RNN. The features can be saved in NumPy's "npz" format.
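
A minimal sketch of the feature-extraction half, assuming Python with NumPy. Everything named here is hypothetical: `extract_features` is a placeholder where a pretrained CNN would go (it only computes per-channel statistics), and `fake_video` stands in for a real frame reader. The point is the shape of the loop: only 50 frames are held in memory at a time, and each chunk of features is written to its own `.npz` file.

```python
import os
import numpy as np

SEQ_LEN = 50  # frames per chunk, matching the 50-frame "history" in the message


def extract_features(frames):
    """Placeholder for a pretrained CNN: (N, H, W, 3) uint8 -> (N, 6) float32.

    Returns per-channel mean and std; a real CNN would return its pooled
    activations instead.
    """
    x = frames.astype(np.float32) / 255.0
    return np.concatenate([x.mean(axis=(1, 2)), x.std(axis=(1, 2))], axis=1)


def _flush(buffer, out_dir, index):
    """Extract features for the buffered frames and save them as one .npz file."""
    feats = extract_features(np.stack(buffer))
    path = os.path.join(out_dir, f"chunk_{index:05d}.npz")
    np.savez_compressed(path, features=feats)
    return path


def save_features_in_chunks(frame_iter, out_dir, seq_len=SEQ_LEN):
    """Read frames one by one, writing features to disk every seq_len frames."""
    paths, buffer = [], []
    for frame in frame_iter:
        buffer.append(frame)
        if len(buffer) == seq_len:
            paths.append(_flush(buffer, out_dir, len(paths)))
            buffer = []
    if buffer:  # trailing partial chunk, shorter than seq_len
        paths.append(_flush(buffer, out_dir, len(paths)))
    return paths


def fake_video(n_frames, height=224, width=224, seed=0):
    """Hypothetical stand-in for a video reader: yields uint8 RGB frames."""
    rng = np.random.default_rng(seed)
    for _ in range(n_frames):
        yield rng.integers(0, 256, size=(height, width, 3), dtype=np.uint8)
```

Calling `save_features_in_chunks(fake_video(120), some_dir)` would write three files (50, 50 and 20 frames), and `np.load(path)["features"]` reads each chunk back in order for the RNN stage.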

cerulean warren
worn crane
cerulean warren
#

I see!

boreal canopy
worn crane
# boreal canopy Oh nice. I was actually thinking about doing something like this. So you use a C...

Not really, I just produce "embeddings", if you will, for all the frames using a pretrained CNN, then feed those embeddings to an RNN in batches. Of course, each batch has to preserve the order of the frames; you can't just feed a random batch to the RNN. After training, I split the video into batches of the same size used in training, then classify the sub-videos one at a time. Finally, I just pick whichever label was predicted the most.
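
The inference step described here (split in order, classify each piece, take the most frequent label) could look roughly like the sketch below. It assumes Python with NumPy; `toy_classifier` is a hypothetical stand-in for the trained RNN, and dropping the leftover frames at the end is just one possible choice (padding them is another).

```python
from collections import Counter

import numpy as np


def split_into_sequences(features, seq_len):
    """Cut (T, D) features into consecutive (seq_len, D) pieces, keeping order.

    Trailing frames that do not fill a whole sequence are dropped.
    """
    n_full = len(features) // seq_len
    return [features[i * seq_len:(i + 1) * seq_len] for i in range(n_full)]


def classify_video(features, seq_len, classify_sequence):
    """Label every sequence, then return the majority label and all votes."""
    votes = [classify_sequence(seq)
             for seq in split_into_sequences(features, seq_len)]
    if not votes:
        raise ValueError("video is shorter than one sequence")
    # On a tie, Counter.most_common keeps the label that was seen first.
    return Counter(votes).most_common(1)[0][0], votes


def toy_classifier(seq):
    """Hypothetical stand-in for the trained RNN: thresholds the mean feature."""
    return "bright" if float(seq.mean()) > 0.5 else "dark"
```

For example, with 180 frames of features and `seq_len=50`, three sub-videos get a vote and the last 30 frames are ignored.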

boreal canopy
#

Thanks for sharing your experience ❤️