#birdclef-2024 | Kaggle | Page 1

fallow kite Apr 6, 2024, 6:29 AM

#

Hello everyone, I wonder if someone wants to team up and work together on birdclef-2024? Please let me know if you're curious about working on this task together.

dense roost Apr 6, 2024, 8:00 AM

#

How do we submit? I'm not sure whether we should take all files in the test directory.

whole surge Apr 6, 2024, 10:05 AM

#

fallow kite Hello everyone, I wonder if someone wants to team up and work together on birdcl...

I would be interested in forming a team

arctic steeple Apr 7, 2024, 7:44 AM

#

fallow kite Hello everyone, I wonder if someone wants to team up and work together on birdcl...

I am interested in forming a team

red sun Apr 7, 2024, 3:26 PM

#

fallow kite Hello everyone, I wonder if someone wants to team up and work together on birdcl...

Hello, nice to meet you. I'm also interested in joining a team. 🙂

hollow quarry Apr 8, 2024, 5:16 AM

#

Hello guys, I am venkatkumar! If someone wants to team up and work together on birdclef 2024, let me know! I am keen to participate in this competition.

Kaggle link: https://www.kaggle.com/venkatkumar001

VK | Master

Real time problem solver (AI (CV)) | Passionate about 2D,3D Computer vision, Time series, Reinforcement learning | Generative AI Developer

plucky portal Apr 9, 2024, 4:02 AM

#

dense roost How do we submit? I'm not sure whether we should take all files in the test dir...

hey, did you ever find out how to do this?

#

right now, i have code that reads all files in test_soundscapes, but os.listdir is only returning readme.txt

dense roost Apr 9, 2024, 5:46 AM

#

plucky portal hey, did you ever find out how to do this?

Hey. You should filter on .ogg files only. What you can do is that if the folder is empty, you take files from unlabeled_soundscapes instead. The test folder will be populated only when you submit.
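
A minimal sketch of that fallback, assuming the usual Kaggle layout under `/kaggle/input/birdclef-2024/` (the exact paths and the helper name `list_soundscapes` are assumptions, not something confirmed in this thread):

```python
from pathlib import Path


def list_soundscapes(test_dir, fallback_dir, limit=None):
    """Return sorted .ogg paths from test_dir, or from fallback_dir if it has none."""
    # Filtering on *.ogg skips readme.txt, which is all the test folder
    # contains outside of a real submission run.
    test_files = sorted(Path(test_dir).glob("*.ogg"))
    if test_files:
        return test_files
    # No test audio yet: use unlabeled soundscapes so the notebook still runs.
    files = sorted(Path(fallback_dir).glob("*.ogg"))
    return files if limit is None else files[:limit]


# Assumed paths; adjust to wherever the competition data is mounted.
files = list_soundscapes(
    "/kaggle/input/birdclef-2024/test_soundscapes",
    "/kaggle/input/birdclef-2024/unlabeled_soundscapes",
    limit=10,
)
print(len(files), "files to score")
```

The `limit` argument only applies to the fallback, so a real submission still scores every test file.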

plucky portal Apr 9, 2024, 5:50 AM

#

dense roost Hey. You should filter on .ogg files only. What you can do is that if the folder...

ah i see, but i mean when i submit it still happens. however, i think ive figured it out: is the notebook run twice? once to commit and submit, and another hidden run to populate the test folder?

dense roost Apr 9, 2024, 5:59 AM

#

plucky portal ah i see, but i mean when i submit it still happens. however, i think ive figure...

Yes, it runs twice usually. You first need to commit then you can submit.

plucky portal Apr 9, 2024, 6:06 AM

#

dense roost Yes, it runs twice usually. You first need to commit then you can submit.

thanks for the answer. it's my first time participating in a notebook-only competition

coarse drift Apr 12, 2024, 6:51 PM

#

hello birdclef-2024

#

working to make submissions with untrained off-the-shelf base models in order to find out if the 120-minute run-time limit can be met

#

unlabeled_soundscapes ogg files seem to be 4 min long, so one could use 1100 of them to estimate the submission time
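
a rough sketch of that estimate, using only the numbers mentioned here (120-minute limit, 1100 files, 4-min files, 5-second segments). the function names are made up for illustration, and the real budget also has to cover model loading and writing the submission:

```python
def per_file_budget_seconds(limit_minutes=120, n_files=1100):
    """Average wall-clock time available per soundscape."""
    return limit_minutes * 60 / n_files


def segments_per_file(file_seconds=240, segment_seconds=5):
    """Number of 5-second segments in one 4-minute file."""
    return file_seconds // segment_seconds


def estimate_total_minutes(elapsed_seconds, files_timed, n_files=1100):
    """Extrapolate a timing measured on a few files to the full run."""
    return elapsed_seconds / files_timed * n_files / 60


budget = per_file_budget_seconds()
print(round(budget, 2))                               # 6.55 seconds per file
print(segments_per_file())                            # 48 segments per file
print(round(1000 * budget / segments_per_file()))     # 136 ms per segment
```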

tawny charm Apr 12, 2024, 8:57 PM

#

The time limits are quite annoying in these competitions

plucky portal Apr 13, 2024, 3:33 AM

#

yeah, any reason it's only 2 hours?

#

lots of other ones are like 9 hrs

vague veldt Apr 13, 2024, 7:17 AM

#

We want to deploy the winning solutions in the field where compute resources are limited, hence the computational constraints of the competition

minor tinsel Apr 13, 2024, 4:15 PM

#

I guess this restriction will push people toward already-trained models

coarse drift Apr 13, 2024, 6:02 PM

#

host confirmed that train labels were done by the community and test labels by highly skilled bird experts - no wonder label smoothing is used

coarse drift Apr 13, 2024, 11:19 PM

#

some of the unlabeled_soundscapes ogg files are not 4 min long, such as birdclef-2024/unlabeled_soundscapes/1005741050.ogg

#

estimated run-time of EfficientNetV2B0 using unlabeled_soundscapes is about 30min

coarse drift Apr 14, 2024, 12:34 AM

#

EfficientNetV2S is about 1 hr

#

at last, ready for real test submit

coarse drift Apr 14, 2024, 4:25 AM

#

Submissions scored 0.49 and took the same time as the estimated run-times

#

now is the time to train the model and hopefully score better than 0.49, will see what happens

coarse drift Apr 15, 2024, 11:41 PM

#

basic training in progress - no augmentation, just random 10-sec spec, gpu is almost not utilized, the bottleneck is decoding ogg and preparing spec, model has to wait for the image, estimated 1 hr per epoch, total 8 epochs, 8 hours, 🍝 and ☕

coarse drift Apr 17, 2024, 12:48 AM

#

8 epochs were not enough to make any difference on the leaderboard score. more epochs scored 0.50

coarse drift Apr 17, 2024, 3:55 AM

#

there was a mistake in the submission code, when fixed, scored 0.61

tight trench Apr 18, 2024, 4:05 PM

#

coarse drift basic training in progress - no augmentation, just random 10-sec spec, gpu is al...

I always try to put the whole dataset into GPU memory to maximize utilization during experimentation (although the tradeoff is that you might have to reduce the resolution and/or precision of the data to get it small enough to fit)
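
As a rough illustration of the precision tradeoff, here is a sketch that stores dB-scaled spectrograms as uint8 instead of float32, which cuts memory by 4x. The [-80, 0] dB range and the function names are assumptions for the example, not something tied to a particular pipeline:

```python
import numpy as np


def quantize_db(spec_db, lo=-80.0, hi=0.0):
    """Map a dB spectrogram from [lo, hi] onto uint8 values 0..255."""
    clipped = np.clip(spec_db, lo, hi)
    return np.round((clipped - lo) / (hi - lo) * 255).astype(np.uint8)


def dequantize_db(q, lo=-80.0, hi=0.0):
    """Approximate inverse of quantize_db, returning float32 dB values."""
    return q.astype(np.float32) / 255 * (hi - lo) + lo


# Fake spectrogram with 128 mel bins and 313 frames, just to show the sizes.
rng = np.random.default_rng(0)
spec = rng.uniform(-80.0, 0.0, size=(128, 313)).astype(np.float32)
q = quantize_db(spec)
print(spec.nbytes, q.nbytes)  # the uint8 copy is a quarter of the float32 size
```

The rounding error is at most half a quantization step (about 0.16 dB with this range), so whether that is acceptable depends on the model.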

#

pytorch and other libraries offer audio->spectrogram transforms that work on the GPU. If it were possible to do ogg->waveform decoding on the GPU too then that would be really cool.

prisma anvil Apr 18, 2024, 9:46 PM

#

For anyone who needs spectrogram data to train an image classification model, I made a dataset which contains most of the audio data separated into 5 second spectrograms. You can find the dataset here:
https://www.kaggle.com/datasets/nathaniellybrand/birdclef-2024-mel-spectrograms

BirdCLEF 2024 Mel Spectrograms

193,111 Single Channel Mel Spectrograms for BirdCLEF 2024

tight trench Apr 18, 2024, 10:49 PM

#

prisma anvil For anyone who needs spectrogram data to train an image classification model, I ...

Wouldn’t it be better to train on random crops instead of separating them ahead of time?

prisma anvil Apr 19, 2024, 3:13 AM

#

tight trench Wouldn’t it be better to train on random crops instead of separating them ahead ...

I am not sure what you mean by that. If you wanted you could shuffle the data so that they do not appear in order. I made the dataset so that people wouldn't have to convert the audio to spectrograms at runtime, since doing that makes training take longer.

tight trench Apr 19, 2024, 3:17 AM

#

prisma anvil I am not sure what you mean by that. If you wanted you could shuffle the data so...

I mean that random cropping gives training more diverse data. By precropping the model only sees 0-5, 5-10, 10-15, etc. Perhaps it learns to overfit on those particular crops.

But when random cropping, the model is given more arbitrary parts of a recording, like 2.34-7.34, 8-13 etc.

#

You can still compute the spectrograms ahead of time, just without splitting them up
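
Something like this sketch, assuming each recording is stored as one full-length array of shape `(n_mels, n_frames)`. The function name and the frame counts are hypothetical, and how many frames make up 5 seconds depends on your hop length:

```python
import numpy as np


def random_crop(spec, crop_frames, rng):
    """Return a random window of crop_frames columns from a full spectrogram."""
    n_frames = spec.shape[1]
    if n_frames <= crop_frames:
        # Recording shorter than one crop: pad on the right with its minimum
        # value, which stands in for silence here.
        pad = crop_frames - n_frames
        return np.pad(spec, ((0, 0), (0, pad)), constant_values=spec.min())
    # Any start offset is allowed, not just multiples of the crop length.
    start = int(rng.integers(0, n_frames - crop_frames + 1))
    return spec[:, start:start + crop_frames]


rng = np.random.default_rng(0)
full = np.zeros((128, 3000), dtype=np.float32)  # placeholder full-length spectrogram
crop = random_crop(full, 313, rng)
print(crop.shape)  # (128, 313)
```

Calling this once per sample per epoch gives a different window each time, while the expensive audio-to-spectrogram step still happens only once.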

prisma anvil Apr 19, 2024, 3:19 AM

#

You could certainly do that, but since the model is evaluated in 5 second chunks, it probably doesn't matter much.

tight trench Apr 19, 2024, 3:23 AM

#

During training you want the model to see as much variety of data as possible, but precropping would limit that

prisma anvil Apr 19, 2024, 3:24 AM

#

You can feel free to generate your own spectrograms, there is a plethora of notebooks teaching you how to do so.

tight trench Apr 19, 2024, 3:24 AM

#

Yes I have. Just curious as to why you’re taking this approach

minor tinsel Apr 24, 2024, 11:27 AM

#

Is it possible to upload a model trained in Colab when submitting? Or should I always use Kaggle to train my model?

coarse drift Apr 24, 2024, 6:34 PM

#

model can be trained anywhere. kaggle is needed to submit.

coarse drift Apr 25, 2024, 5:27 PM

#

🤔 no joy yet - random leaderboard scores of 0.55, 0.56, 0.57, 0.58, 0.59, 0.60, 0.61 by different model sizes and input sizes using random 10-sec specs without any augmentations - there seems to be no noticeable correlation between local validation and leaderboard performance - i wonder if i can improve the score by utilizing various augmentations and techniques...

silent reef Apr 27, 2024, 4:36 PM

#

Hi; Just had a beginner level question: as the dataset size is nearly 24 GB, can everything (mel spectrograms, model training, etc.) be processed on Kaggle, distributed across individual notebooks, or do we need an external training environment and hardware resources?

coarse drift Apr 27, 2024, 5:21 PM

#

silent reef Hi; Just had a **beginner level question** as dataset size is nearly 24 GB can e...

in my opinion, all can be done happily using kaggle resources, however, any additional external resources are always helpful.

rancid flint Apr 28, 2024, 12:50 AM

#

silent reef Hi; Just had a **beginner level question** as dataset size is nearly 24 GB can e...

Agree with @coarse drift's comment. All should be good on kaggle itself. Only limitation is the 30 hrs of GPU, 8 hrs of notebook runtime, and storage space for saving a bunch of preprocessed data (usually not an issue but if you're saving like 100+ GBs of data, I think that's an issue).

If you haven't already discovered it, using del and gc.collect() will be your friend for not exploding your RAM if you're doing data preprocessing on a bunch of files.
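
A minimal sketch of that pattern. The `load`, `transform`, and `save` callables are placeholders for whatever your own pipeline uses, and `collect_every` is an arbitrary choice:

```python
import gc


def preprocess_all(paths, load, transform, save, collect_every=50):
    """Process files one at a time, releasing each file's data before the next."""
    count = 0
    for count, path in enumerate(paths, start=1):
        audio = load(path)
        spec = transform(audio)
        save(path, spec)
        # Drop the references now so the old arrays are not still alive
        # while the next file is being loaded.
        del audio, spec
        if count % collect_every == 0:
            # Also clean up anything stuck in reference cycles.
            gc.collect()
    gc.collect()
    return count
```

The main thing is to write each result to disk instead of appending it to a list, otherwise `del` has nothing to free.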

Also, downloading data to a cloud like AWS, Azure, or what I have been hot on recently, RunPod, is pretty straightforward and only takes like 2+ hours using a smallish VM. Then you can go ham on GPU, granted you have money for it. Happy kaggling on birds.

zinc saddle Apr 28, 2024, 5:48 PM

#

Does anyone happen to know why every time I save a version of my notebook this error occurs: nbclient.exceptions.DeadKernelError: Kernel died? It's the third time it has happened! This notebook is used for training my model (efficientvit_b1.r288_in1k). It doesn't have any errors and doesn't exceed the 9-hour limit (it hardly exceeds 1.5 hours!). Any help would be much appreciated

#

Here is my notebook - https://www.kaggle.com/demko1/train-birdclef-2024-pytorch-melspectrogram

Train BirdCLEF 2024 PyTorch MelSpectrogram

zinc saddle Apr 30, 2024, 7:44 AM

#

zinc saddle Does anyone happen to know why every time I save version of my notebook the erro...

As it turned out, the error arose from running out of RAM or GPU memory. Already solved it!

civic falcon May 21, 2024, 6:19 AM

#

Hi all,
I have trained an EfficientNetV2L (also tried B0) convolutional neural network with an extra 128-unit Dense layer tacked on the end. Every submission I have attempted has timed out on CPU time, and I'm very confused about how to fix this.
I load the audio with librosa, convert to dB, convert to a grayscale image with tensorflow and use that to predict my outputs.

Based on the example solutions other people have posted, I'm scratching my head over why my submission is timing out. I've run my predictions on the unlabeled soundscapes and get a pretty decent prediction rate of around one 5s clip per 2-3s

warm loom May 22, 2024, 3:13 PM

#

civic falcon Hi all, I have trained a EfficientNetV2L (also tried B0) convolutional neural n...

I haven't been able to get submissions to not time out unless they were around 8 seconds per file or under

civic falcon May 24, 2024, 5:29 AM

#

I'm getting about 380-400ms per 5s segment

Sanity checking: I submit by splitting each and every audio file into each and every 5s segment, which I then predict individually.

Edit: I'm getting 90ms per 5s segment and it's still timing out.

coarse drift May 25, 2024, 8:16 PM

#

civic falcon I'm getting about 380-400ms per 5s segment Sanity checking: I submit by splitti...

predicting individually is likely to incur unnecessary overhead. split each 4-min audio file into 48 chunks of 5s and predict them as one batch of 48.
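
a rough sketch of the idea, assuming 32 kHz mono audio already loaded as a 1-D numpy array. `predict_batch` is a placeholder for whatever your model exposes, and the zero padding for files that are not exactly 4 min is my own choice:

```python
import numpy as np

SAMPLE_RATE = 32000   # assumed sample rate of the soundscapes
SEGMENT_SECONDS = 5


def split_into_segments(waveform, sample_rate=SAMPLE_RATE,
                        segment_seconds=SEGMENT_SECONDS):
    """Reshape a 1-D waveform into (n_segments, samples_per_segment)."""
    seg_len = sample_rate * segment_seconds
    n_segments = -(-len(waveform) // seg_len)  # ceiling division
    # Zero-pad the tail so the last partial segment is kept.
    padded = np.zeros(n_segments * seg_len, dtype=waveform.dtype)
    padded[:len(waveform)] = waveform
    return padded.reshape(n_segments, seg_len)


def predict_file(waveform, predict_batch):
    """Run the model once per file instead of once per 5-second segment."""
    batch = split_into_segments(waveform)
    return predict_batch(batch)


four_minutes = np.zeros(240 * SAMPLE_RATE, dtype=np.float32)
print(split_into_segments(four_minutes).shape)  # (48, 160000)
```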

warm loom May 27, 2024, 3:31 PM

#

are you getting a batch size of 48 on cpu?

coarse drift May 27, 2024, 6:32 PM

#

Yes. training with gpu batch size of 64 and inferencing with cpu batch size of 48. what's your batch size?

civic falcon May 28, 2024, 3:42 AM

#

Can confirm the batch size 48 solved my timing out issue

warm loom May 28, 2024, 1:50 PM

#

it's 128 in training, but I was only doing 1 in inference. I might be able to submit a bigger model by doing 48

silent reef Jun 2, 2024, 12:41 PM

#

Hi, again a beginner level question! I am still confused about random cropping during training. For example, I am currently cropping 15 seconds of audio in 5-second chunks from the beginning and end of each recording. Isn't there a chance that a 5-second mel spectrogram will contain no bird call, only noise? I am not even able to cross 50% accuracy with EfficientNet models (I know scoring is macro-averaged ROC AUC). THANK YOU SO MUCH FOR HELPING OUT

civic falcon Jun 3, 2024, 2:43 AM

#

Yes, that is correct. There is a chance you won't get the bird noise. The keras starter notebook takes the approach of just using 10s windows for training and 5s for inference. Supposedly it transfers alright
(I am also beginner level)