Video classification with non-trivial quantity of frames. | Learn AI Together | Page 1

boreal canopy Nov 26, 2022, 2:43 PM

#

Any tips on how to do video classification on lengthy videos?
I am eyeballing this workflow:
https://www.mathworks.com/help/deeplearning/ug/classify-videos-using-deep-learning.html

But when I try to initialise an array the size of my video (70 000 frames, 600 MB), the array is too large.
I have 256 GB of RAM at my disposal, so I can get by if I resize the video to 224x224 first, but I'm wondering if there's a less clumsy workflow for large videos?

vid = zeros(768,1024,3,78938)

It just hangs, unresponsive.

Classify Videos Using Deep Learning - MATLAB & Simulink

This example shows how to create a network for video classification by combining a pretrained image classification model and an LSTM network.

#

Another question:

while hasFrame(vr)
    i = i + 1;
    video(:,:,:,i) = readFrame(vr);
end

is so unbelievably slow.

cerulean warren Nov 26, 2022, 8:10 PM

#

boreal canopy Any tips on how to do video classification on lengthy videos? I am eyeballing th...

@worn crane do you work with large videos as well? Wondering your take on this

boreal canopy Nov 26, 2022, 8:11 PM

#

I've found out MATLAB initialises arrays as double by default, that's why my memory gets sucked into oblivion

#

wip with uint8 right now :3
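For reference, a rough size estimate shows why the element type matters so much here. This is only a sketch in Python: the dimensions are taken from the `zeros(768,1024,3,78938)` call above, the helper name `array_bytes` is made up for illustration, and "GB" means decimal gigabytes. In MATLAB itself, the class can be requested at allocation time with `zeros(768,1024,3,78938,'uint8')`.

```python
# Back-of-the-envelope memory estimate for a dense 4-D frame array.
# Dimensions come from the zeros(768,1024,3,78938) call in the thread;
# array_bytes is a hypothetical helper, not part of any library.

def array_bytes(height, width, channels, frames, bytes_per_element):
    """Size in bytes of a dense height x width x channels x frames array."""
    return height * width * channels * frames * bytes_per_element

GB = 10**9  # decimal gigabytes

full_double = array_bytes(768, 1024, 3, 78938, 8)  # double: 8 bytes per element
full_uint8 = array_bytes(768, 1024, 3, 78938, 1)   # uint8: 1 byte per element
small_uint8 = array_bytes(224, 224, 3, 78938, 1)   # after resizing to 224x224

for name, size in [
    ("768x1024 double", full_double),
    ("768x1024 uint8", full_uint8),
    ("224x224 uint8", small_uint8),
]:
    print(f"{name}: {size / GB:.1f} GB")
```

Under these assumptions the double array needs about 1490 GB, far beyond 256 GB of RAM, while the uint8 version needs about 186 GB and the resized uint8 version about 12 GB.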

cerulean warren Nov 26, 2022, 8:16 PM

#

boreal canopy wip with uint8 right now :3

Ahhh that makes sense. Sorry never worked with MATLAB, so I didn’t catch that 😅

worn crane Nov 27, 2022, 9:37 AM

#

cerulean warren @worn crane do you work with large videos as well? Wondering your take...

Hello! Actually yeah, I've worked with large videos. Video classification requires a mix of a CNN and an RNN. To work effectively with large vids, I split the training process into 2 parts: the feature extraction part, which is done at the CNN stage, and the classification part, which is done at the RNN stage. Splitting the workflow into 2 parts allows me to save the features before passing them to the RNN. So let's say the "history" length of the RNN is 50 frames: I would pass the video frames to the CNN in batches of 50 and save the resulting features on disk. When all the frames have passed, I would simply load the features from disk and pass them to the RNN. Saving the features can be done using NumPy's "npz" format.
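A minimal sketch of that two-stage idea, assuming NumPy is available. The `extract_features` function is a hypothetical stand-in for a pretrained CNN (it only averages each colour channel), and the file naming, the tiny synthetic video, and the helper names are all illustrative choices, not the poster's actual code.

```python
import tempfile
from pathlib import Path

import numpy as np

WINDOW = 50  # frames per chunk, matching the 50-frame example above


def extract_features(frames):
    """Hypothetical stand-in for a pretrained CNN.

    Maps (n, H, W, 3) uint8 frames to (n, 3) float32 features by
    averaging each colour channel. A real CNN would return far richer
    embeddings.
    """
    flat = frames.reshape(len(frames), -1, frames.shape[-1])
    return flat.mean(axis=1).astype(np.float32)


def save_features(frame_chunks, out_dir):
    """Extract features chunk by chunk and save one .npz file per chunk."""
    out_dir = Path(out_dir)
    paths = []
    for idx, chunk in enumerate(frame_chunks):
        path = out_dir / f"chunk_{idx:05d}.npz"
        np.savez_compressed(path, features=extract_features(chunk))
        paths.append(path)
    return paths


def load_features(paths):
    """Load the saved chunks back in their original order."""
    loaded = []
    for path in sorted(paths):
        with np.load(path) as data:
            loaded.append(data["features"])
    return loaded


# Tiny synthetic "video": 120 random 8x8 RGB frames, processed 50 at a time
# so the full set of features never has to be computed in one pass.
rng = np.random.default_rng(0)
video = rng.integers(0, 256, size=(120, 8, 8, 3), dtype=np.uint8)
chunks = (video[i:i + WINDOW] for i in range(0, len(video), WINDOW))

with tempfile.TemporaryDirectory() as tmp:
    saved = save_features(chunks, tmp)
    features = load_features(saved)

print([f.shape for f in features])
```

The zero-padded file names keep the chunks in frame order when sorted, which matters because the RNN stage consumes them as a sequence.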

cerulean warren Nov 27, 2022, 3:18 PM

#

worn crane Hello! Actually yeah i've worked with large videos. Video classification require...

Say you're classifying videos of variable length, but you need a single label for the entire video, how would you approach it? Would you take a majority vote over said batches?

worn crane Nov 27, 2022, 3:21 PM

#

cerulean warren Say if you're classifying videos of variable length, but you're classifying the ...

What I mentioned applies only to the training phase. When running inference, yeah, I would do exactly that: split the video into several batches and take the label with the most votes.
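The voting step could look roughly like this. It is a sketch using only the Python standard library; the function name `majority_vote` and the example labels are made up, and breaking ties in favour of the label seen first is an assumption the thread does not address.

```python
from collections import Counter


def majority_vote(window_labels):
    """Return the most frequent label among per-window predictions.

    Ties go to the label that appeared first (an assumption; the thread
    does not say how ties should be handled).
    """
    if not window_labels:
        raise ValueError("need at least one window prediction")
    return Counter(window_labels).most_common(1)[0][0]


# Hypothetical per-window predictions for one video.
predictions = ["walking", "running", "walking", "walking", "jumping"]
print(majority_vote(predictions))  # walking
```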

cerulean warren Nov 27, 2022, 3:22 PM

#

I see!

boreal canopy Nov 29, 2022, 6:28 PM

#

worn crane Hello! Actually yeah i've worked with large videos. Video classification require...

Oh nice. I was actually thinking about doing something like this.
So you use a CNN as a classification network to determine which frames are useless, and then you use an RNN to classify the ones which are not dropped?

worn crane Nov 29, 2022, 6:38 PM

#

boreal canopy Oh nice. I was actually thinking about doing something like this. So you use a C...

Not really, I just produce "embeddings", if you will, for all the frames using a pretrained CNN, then feed those embeddings to an RNN in batches. Of course each batch has to preserve the frames' sequence. You can't just feed a random batch to the RNN. After training, I'll split the video into batches of the same size used in training, then classify the sub-videos one at a time. Finally, just pick whichever label was predicted the most.
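One way to cut a video into order-preserving, fixed-size sub-videos is sketched below. The name `ordered_windows` is hypothetical, and dropping a trailing partial window is an assumption made here for simplicity. The message above does not say what happens to leftover frames, and padding them would be another option.

```python
def ordered_windows(num_frames, window):
    """Yield (start, stop) index pairs for consecutive, non-overlapping windows.

    Frames stay in their original order. A trailing partial window is
    dropped (an assumption, not something stated in the thread).
    """
    if window <= 0:
        raise ValueError("window must be positive")
    for start in range(0, num_frames - window + 1, window):
        yield start, start + window


# 170 frames with the same window size used in training (50 here).
print(list(ordered_windows(170, 50)))  # [(0, 50), (50, 100), (100, 150)]
```

Each `(start, stop)` pair selects one sub-video's embeddings, which can then be classified on its own before the per-window labels are combined.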

boreal canopy Nov 29, 2022, 6:55 PM

#

worn crane Not really, I just produce "embeddings", if you will, for all the frames using a...

Right, in my case frame ordering is not that important. I am looking to use weak supervision to find, and perhaps quantify, objects in the frames.

#

Thanks for sharing your experience ❤️
