#birdclef-2024


fallow kite
#

Hello everyone, I wonder if someone wants to team up and work together on birdclef-2024? Please let me know if you're curious about working on this task together.

dense roost
#

How do we submit? I'm not sure whether we should take all files in the test directory?

whole surge
arctic steeple
red sun
hollow quarry
#

Hello guys, I am venkatkumar! Does anyone want to team up and work together on birdclef 2024? I am keen to participate in this competition.

Kaggle link: https://www.kaggle.com/venkatkumar001

plucky portal
#

right now, i have code that reads all files in test_soundscapes, but os.listdir is only returning readme.txt
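
A minimal sketch of one workaround, assuming the test folder is only filled with real audio when the notebook is rerun for scoring (not confirmed in this thread). The function name `list_soundscapes` and the idea of falling back to another folder such as unlabeled_soundscapes are illustrative assumptions:

```python
import os


def list_soundscapes(test_dir, fallback_dir, ext=".ogg"):
    """Return sorted audio paths from test_dir, or from fallback_dir if none exist.

    Assumption: while editing, test_dir may hold only a placeholder such as
    readme.txt, so a stand-in folder is used for dry runs.
    """
    def audio_files(directory):
        if not os.path.isdir(directory):
            return []
        return sorted(
            os.path.join(directory, name)
            for name in os.listdir(directory)
            if name.lower().endswith(ext)
        )

    files = audio_files(test_dir)
    if not files:
        # Nothing to score yet: fall back to the stand-in audio.
        files = audio_files(fallback_dir)
    return files
```

With this, the same inference code runs in both situations; only the list of paths changes.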

dense roost
plucky portal
dense roost
plucky portal
coarse drift
#

hello birdclef-2024

#

working on submissions with untrained off-the-shelf base models to find out whether the 120-minute run-time limit can be met

#

the unlabeled_soundscapes ogg files seem to be 4 min long, so one could use 1100 of them to estimate the submission time
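
A rough sketch of that estimate: time inference on a handful of files, then scale linearly to the 1100 mentioned above. The helper name `extrapolate_minutes` is hypothetical, and linear scaling is an assumption that ignores fixed costs such as model loading:

```python
def extrapolate_minutes(elapsed_seconds, n_timed_files, n_total_files=1100):
    """Scale a measured run time on a few files up to the full file count.

    Assumes every file costs about the same, which only holds if the files
    have similar lengths.
    """
    seconds_per_file = elapsed_seconds / n_timed_files
    return seconds_per_file * n_total_files / 60.0


# Example: 10 files took 60 s, so 1100 files would take about 110 minutes,
# which is uncomfortably close to a 120-minute limit.
estimated = extrapolate_minutes(60.0, 10)
```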

tawny charm
#

The time limits are quite annoying in these competitions

plucky portal
#

yeah, any reason it's only 2 hours?

#

lots of other ones are like 9 hrs

vague veldt
#

We want to deploy the winning solutions in the field where compute resources are limited, hence the computational constraints of the competition

minor tinsel
#

I guess this restriction will push people toward already-trained models

coarse drift
#

host confirmed that train labels were done by the community and test labels by highly skilled bird experts - no wonder label smoothing is used

coarse drift
#

some of the unlabeled_soundscapes ogg files are not 4 min long, such as birdclef-2024/unlabeled_soundscapes/1005741050.ogg

#

estimated run-time of EfficientNetV2B0 using unlabeled_soundscapes is about 30min

coarse drift
#

EfficientNetV2S is about 1 hr

#

at last, ready for real test submit

coarse drift
#

Submissions scored 0.49 and took the same time as the estimated run-times

#

now is the time to train the model and hopefully score better than 0.49, will see what happens

coarse drift
#

basic training in progress - no augmentation, just random 10-sec specs. the gpu is almost not utilized; the bottleneck is decoding ogg and preparing the spec, so the model has to wait for the image. estimated 1 hr per epoch, 8 epochs total, 8 hours, 🍝 and ☕

coarse drift
#

8 epochs were not enough to make any difference on the leaderboard score. more epochs scored 0.50

coarse drift
#

there was a mistake in the submission code, when fixed, scored 0.61

tight trench
#

pytorch and other libraries offer audio->spectrogram transforms that work on the GPU. If it were possible to do ogg->waveform decoding on the GPU too then that would be really cool.

prisma anvil
tight trench
prisma anvil
tight trench
#

You can still compute the spectrograms ahead of time, just without splitting them up

prisma anvil
#

You could certainly do that, but since the model is evaluated in 5 second chunks, it probably doesn't matter much.

tight trench
#

During training you want the model to see as much variety of data as possible, but precropping would limit that
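
A small sketch of the alternative being argued for: keep one full-length precomputed spectrogram per recording and cut a fresh random window from it every time it is drawn. It assumes time is the first axis and uses a hypothetical helper name, `random_time_crop`:

```python
import random


def random_time_crop(spectrogram, n_frames, rng=random):
    """Return a random window of n_frames along the first (time) axis.

    Works on anything that supports len() and slicing, e.g. a list of frames
    or an array with time on axis 0. Shorter inputs are returned unchanged;
    padding them is left to the caller.
    """
    total = len(spectrogram)
    if total <= n_frames:
        return spectrogram
    start = rng.randrange(0, total - n_frames + 1)
    return spectrogram[start:start + n_frames]


# Each epoch can see a different window of the same recording.
full_spec = [[t] * 4 for t in range(100)]  # 100 frames, 4 dummy bins each
window = random_time_crop(full_spec, 10)
```

Pre-cropping would fix `start` once, so the model would see the same windows every epoch.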

prisma anvil
#

You can feel free to generate your own spectrograms, there is a plethora of notebooks teaching you how to do so.

tight trench
#

Yes I have. Just curious as to why you’re taking this approach

minor tinsel
#

Is it possible to upload a model trained in Colab when submitting? Or should I always use Kaggle to train my model?

coarse drift
#

model can be trained anywhere. kaggle is needed to submit.

coarse drift
#

🤔 no joy yet - random leaderboard scores of 0.55, 0.56, 0.57, 0.58, 0.59, 0.60, 0.61 by different model sizes and input sizes using random 10-sec specs without any augmentations - there seems to be no noticeable correlation between local validation and leaderboard performance - i wonder if i can improve the score by utilizing various augmentations and techniques...

silent reef
#

Hi; Just had a beginner level question: as the dataset size is nearly 24 GB, can everything be processed on distributed Kaggle Notebooks (including mel spectrograms, model training, etc.) in individual notebooks, or do we need an external training environment and hardware resources?

coarse drift
rancid flint
# silent reef Hi; Just had a **beginner level question** as dataset size is nearly 24 GB can e...

Agree with @coarse drift's comment. All should be good on kaggle itself. Only limitation is the 30 hrs of GPU, 8 hrs of notebook runtime, and storage space for saving a bunch of preprocessed data (usually not an issue but if you're saving like 100+ GBs of data, I think that's an issue).

If you haven't already discovered it, using del and gc.collect() will be your friend for not exploding your RAM if you're doing data preprocessing on a bunch of files.
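
A rough sketch of that pattern, assuming preprocessing is done in chunks. The `load`, `transform` and `save` callables are placeholders for whatever decoding, spectrogram and storage code is actually used:

```python
import gc


def preprocess_in_chunks(paths, load, transform, save, chunk_size=100):
    """Process files chunk by chunk, releasing memory between chunks.

    Returns the number of chunks written.
    """
    n_chunks = 0
    for start in range(0, len(paths), chunk_size):
        raw = [load(p) for p in paths[start:start + chunk_size]]
        features = [transform(x) for x in raw]
        save(n_chunks, features)
        n_chunks += 1
        # Drop the local references to the large intermediates, then ask the
        # collector to reclaim anything left in reference cycles.
        del raw, features
        gc.collect()
    return n_chunks
```

The `del` only helps if nothing else still holds a reference to the data, so `save` should write to disk rather than keep everything in a list.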

Also, downloading data to a cloud like AWS, Azure, or what I have been hot on recently, RunPod, is pretty straightforward and only takes like 2+ hours using a smallish VM. Then you can go ham on GPU, granted, you have money for it. Happy kaggling on birds.

zinc saddle
#

Does anyone happen to know why, every time I save a version of my notebook, this error occurs: nbclient.exceptions.DeadKernelError: Kernel died? It's the third time it has happened! This notebook is used for training my model (efficientvit_b1.r288_in1k); it doesn't have any errors and doesn't exceed the 9-hour limit (it hardly exceeds 1.5 hours!). Any help will be much appreciated

zinc saddle
civic falcon
#

Hi all,
I have trained an EfficientNetV2L (also tried B0) convolutional neural network with an extra 128-unit Dense layer tacked on the end. Every submission attempt has timed out on CPU time, and I'm very confused about how I could fix this.
I load the audio with librosa, convert to dB, convert to a grayscale image with tensorflow and use that to predict my outputs.

Based on example solutions people have done, I'm scratching my head over why my submission is timing out. I've run my predictions on the unlabeled soundscapes and it has a pretty decent prediction rate of around one 5s clip per 2-3s

warm loom
civic falcon
#

I'm getting about 380-400ms per 5s segment

Sanity checking: I submit by splitting each and every audio file into each and every 5s segment, which I then predict individually.

Edit: I'm getting 90ms per 5s segment and it's still timing out.

coarse drift
warm loom
#

are you getting a batch size of 48 on cpu?

coarse drift
#

Yes. training with gpu batch size of 64 and inferencing with cpu batch size of 48. what's your batch size?

civic falcon
#

Can confirm that batch size 48 solved my timeout issue

warm loom
#

it's 128 in training, but I was only doing 1 in inference. I might be able to submit a bigger model by doing 48
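
A framework-agnostic sketch of the change discussed here: group the 5s segments and call the model once per batch instead of once per segment. `predict_batch` stands in for the real model call (hypothetical), and 48 is just the value from this thread; a 4-min file happens to split into 240 / 5 = 48 segments, so one batch would cover one such file:

```python
def batched(items, batch_size=48):
    """Yield consecutive slices of items with at most batch_size elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def predict_in_batches(segments, predict_batch, batch_size=48):
    """Run predict_batch over segments in groups and keep the original order.

    predict_batch must accept a list of segments and return one prediction
    per segment.
    """
    predictions = []
    for batch in batched(segments, batch_size):
        predictions.extend(predict_batch(batch))
    return predictions
```

For 100 segments this makes 3 model calls (48, 48 and 4 segments) instead of 100, which cuts per-call overhead.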

silent reef
#

Hi again, a beginner level question! I am still confused about random cropping during training. For example, I am currently cropping 15 seconds of audio into 5-second chunks from the beginning and end of each recording. Isn't there a chance that a 5-second mel spectrogram will contain no bird call at all, only noise? I am not even able to cross 50% accuracy with EfficientNet models (I know scoring is macro-averaged ROC AUC). THANK YOU SO MUCH FOR HELPING OUT

civic falcon
#

Yes, that is correct. There is a chance you won't get the bird noise. The keras starter notebook takes the approach of just using 10s windows for training and 5s for inference. Supposedly it transfers alright
(I am also beginner level)