#birdclef-2025 | Kaggle | Page 1

chilly bramble Mar 11, 2025, 1:55 AM

#

It is back!

robust kettle Mar 14, 2025, 9:29 PM

#

Hello

crimson trail Mar 18, 2025, 12:51 PM

#

Hello

wide monolith Mar 18, 2025, 4:34 PM

#

Ola

#

Dear competition creators, why are you counting failed submissions toward the number of submissions? This doesn't make any sense.

#

I have made 4 failed submissions, all because I couldn't get any clarity on how to submit. Now I am left with only 1 submission.

#

Now the 5th and last submission is going on; if this also fails, I am done. It is really demotivating.

pale tundra Mar 18, 2025, 6:05 PM

#

wide monolith Now the 5th and last submission is going on if this also fails, I am done. It is...

This is usually done to deter probing. In competitions where probing is not an issue (like simulation competitions), failed subs are often not counted.

robust kettle Mar 18, 2025, 8:22 PM

#

I am an amateur, coming back to Kaggle after 3+ years. Looking for a teammate for this competition.

junior remnant Mar 19, 2025, 1:59 AM

#

robust kettle I am an amateur. Coming back into Kaggle after 3+ years. Looking for teammate fo...

Can you dm me?

wide monolith Mar 19, 2025, 7:43 AM

#

pale tundra this is usually done to deter probing, in competitions where probing is not an i...

Aah, got it. I guess once you understand how to submit, it won't be an issue.

junior remnant Mar 19, 2025, 11:29 AM

#

Hi, I'm looking for a team to join. I am a machine/deep learning practitioner. I have not worked with audio data in the past, but I can do any task with text and tabular data. If you have a vacant spot on your team, count me in.

wide monolith Mar 20, 2025, 5:54 AM

#

Hey guys, does horizontally flipping the spectrograms make sense? I know vertical flipping doesn't make any sense because our frequency bands will be altered, but is flipping the time axis (horizontal flipping) a good augmentation?

wide monolith Mar 20, 2025, 11:01 AM

#

Has anyone tried training models on TPU? I am using TensorFlow, but when I am setting up the strategy...

high flicker Mar 20, 2025, 2:37 PM

#

wide monolith hey guys, does horizontal flipping the spectrograms make sense? I know, vertical...

This will give you an undesirable distortion of the spectrogram. You're better off looking into pitch shifting and time stretching for augmentation.

wide monolith Mar 20, 2025, 4:50 PM

#

Hi guys, I am running an inference notebook. When I ran it on GPU, it was submitted successfully, but when I run it on CPU, my submissions fail.

#

what can be the reason?

chilly bramble Mar 20, 2025, 6:25 PM

#

wide monolith Mar 20, 2025, 6:52 PM

#

I don't know. After 11 minutes of running, it just got a submission scoring error. And as I said, the same notebook that works with GPU fails when I run it CPU-only.

To give some context, I am using joblib to process the audio data into mel specs for all audio segments, then running a 3-model ensemble. The notebook was submitted successfully when run with GPU. But just as an experiment I changed my accelerator to None (i.e. CPU only), and it ended in a submission scoring error.

#

😭 WHY

chilly bramble Mar 21, 2025, 1:56 AM

#

I think you need to use openvino or something like that

wide monolith Mar 21, 2025, 6:11 AM

#

Thanks man🎩 . Will explore

wide monolith Mar 21, 2025, 7:11 AM

#

I found my issue. I was using joblib to process all the audio files.

Only after that was done did I pass them to the dataloaders for inference.

This two-step approach was jamming my CPU and RAM. Now I am doing my processing in batches, and it works perfectly on CPU.
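
A minimal sketch of the batched approach described above. The `featurize` and `predict` callables are hypothetical stand-ins for the mel-spectrogram step and the model ensemble; neither comes from the original notebook.

```python
from itertools import islice
from typing import Any, Callable, Dict, Iterable, Iterator, List


def batched(items: Iterable[Any], batch_size: int) -> Iterator[List[Any]]:
    """Yield successive lists of at most batch_size items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def run_inference(
    paths: Iterable[str],
    featurize: Callable[[str], Any],
    predict: Callable[[List[Any]], List[Any]],
    batch_size: int = 8,
) -> Dict[str, Any]:
    """Featurize and predict one batch at a time, so only one batch
    of features is held in memory at any moment."""
    results: Dict[str, Any] = {}
    for batch in batched(paths, batch_size):
        features = [featurize(p) for p in batch]  # hypothetical preprocessing
        scores = predict(features)                # hypothetical model call
        results.update(zip(batch, scores))
    return results


# Toy stand-ins so the sketch runs without audio files or a model.
demo = run_inference(
    [f"file_{i}.ogg" for i in range(5)],
    featurize=len,
    predict=lambda feats: [f * 2 for f in feats],
    batch_size=2,
)
print(demo)
```

Peak memory then scales with `batch_size` rather than with the total number of segments, which is likely what relieved the CPU and RAM pressure here.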

wide monolith Mar 21, 2025, 6:39 PM

#

Hi guys, can someone tell me what train_soundscapes is for? They have no labels, right? So how can I use them?

arctic steppe Mar 21, 2025, 8:17 PM

#

wide monolith hi Guys, can someone tell me what is the use of the train_soundscapes? They have...

I believe they provide those as additional training data. You train your model without them, run the model on them to predict labels, and then use the now-labeled data for additional training. I believe previous winners used them this way in past BirdCLEF competitions where such data was provided as well. The downside is that if your model isn't good, the bad labels may just confuse it or reinforce its inaccuracies.
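
The label-then-retrain idea above (often called pseudo-labeling) can be sketched roughly as follows. The confidence threshold and the segment and species names are made up for illustration; the thread does not say how past winners filtered their predictions.

```python
from typing import Dict, List, Tuple


def select_pseudo_labels(
    predictions: Dict[str, Dict[str, float]],
    threshold: float = 0.9,
) -> List[Tuple[str, str, float]]:
    """Keep only (segment, label, score) triples whose top score
    clears the threshold; everything else stays unlabeled."""
    selected = []
    for segment_id, scores in predictions.items():
        label, score = max(scores.items(), key=lambda kv: kv[1])
        if score >= threshold:
            selected.append((segment_id, label, score))
    return selected


# Made-up model outputs for three unlabeled soundscape segments.
preds = {
    "seg_a": {"species_1": 0.97, "species_2": 0.02},
    "seg_b": {"species_1": 0.55, "species_2": 0.40},
    "seg_c": {"species_1": 0.05, "species_2": 0.93},
}
pseudo = select_pseudo_labels(preds, threshold=0.9)
print(pseudo)
```

Keeping only confident predictions is one way to limit the risk mentioned above, at the cost of discarding segments the model is unsure about.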

wide monolith Mar 21, 2025, 9:15 PM

#

yeah little risky 😬

#

But Thanks for answering🎩

novel ember Mar 22, 2025, 1:04 PM

#

Hi everyone, I am new to this competition. I have just created a baseline but failed to submit the notebook.

#

https://www.kaggle.com/code/hlly34/birdclef2025-inference

BirdCLEF2025 - Inference

#

This is my baseline notebook. I am new to Kaggle, so I want to team up with someone.

wide monolith Mar 22, 2025, 6:32 PM

#

novel ember Hi everyone, I am new to this competition, I have just created a baseline but fa...

Couple of questions:

- Is it a GPU submission or a CPU submission?
- What error did you get on submission: a submission scoring error or a timeout error?

- There is a limit of 90 minutes per submission.
- Also, internet is off, so if you are downloading some model configs, that will be an issue.
- Check your submission file. I have attached an image of my sample submission; make sure the row_ids are properly generated.
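
As a rough illustration of the row_id check, here is a sketch that generates ids for one soundscape. The `<file stem>_<end second>` pattern, the 5-second window, and the 60-second clip length are assumptions, not something stated in this thread; compare against `sample_submission.csv` before relying on them.

```python
from pathlib import Path
from typing import List


def make_row_ids(
    soundscape_path: str,
    clip_seconds: int = 60,    # assumed soundscape length
    window_seconds: int = 5,   # assumed prediction window
) -> List[str]:
    """Build one row_id per window, named by the window's end time."""
    stem = Path(soundscape_path).stem
    ends = range(window_seconds, clip_seconds + 1, window_seconds)
    return [f"{stem}_{end}" for end in ends]


# Hypothetical file name, used only to show the resulting ids.
row_ids = make_row_ids("test_soundscapes/soundscape_0001.ogg")
print(row_ids[:3], len(row_ids))
```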

wide monolith Mar 23, 2025, 12:09 PM

#

Hello guys, I am seeing some audio files that contain human voices. Will the actual test data have them?

dawn abyss Mar 23, 2025, 10:35 PM

#

wide monolith Couple of questions: - is it GPU submission or CPU submission? - What is the err...

Not in this comp, but since most don't allow internet access, you can just upload the wheel files of any libs you need as a Kaggle dataset and install your dependencies from those.
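
A small sketch of the wheel-file approach, building a pip command that installs only from a local folder. The dataset path is hypothetical, and the package name is just an example taken from earlier in this thread.

```python
import sys
from typing import List


def offline_pip_command(wheel_dir: str, packages: List[str]) -> List[str]:
    """Build a pip command that installs only from local wheel files."""
    return [
        sys.executable, "-m", "pip", "install",
        "--no-index",               # never contact PyPI
        "--find-links", wheel_dir,  # look for wheels in this folder instead
        *packages,
    ]


# Hypothetical dataset folder containing the uploaded .whl files.
cmd = offline_pip_command("/kaggle/input/my-wheels", ["openvino"])
print(" ".join(cmd))
# To actually install: subprocess.run(cmd, check=True)
```

The wheels need to match the Python version and platform of the Kaggle image, and any dependencies of the package must be in the same folder.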

wide monolith Mar 24, 2025, 7:45 AM

#

No no, I was replying to @novel ember's question.

brazen mica Mar 26, 2025, 2:20 PM

#

Is anyone interested in teaming up for this competition? I have experience with audio classification

lime meadow Mar 26, 2025, 7:29 PM

#

brazen mica Is anyone interested in teaming up for this competition? I have experience with ...

I am. Anyone want to collaborate?

novel ember Mar 28, 2025, 9:08 AM

#

lime meadow I am. Anyone want to collaborate?

I want to collaborate. Can I DM you?

strange phoenix Mar 28, 2025, 2:44 PM

#

Hello there, I am looking to participate in this competition, and I have questions about the dataset. I see some recordings with human voices as well as bird sounds. The names of the species are not present. I would like to get some insights from the community.

lime meadow Mar 28, 2025, 4:54 PM

#

novel ember I want to collaborate. Can I DM you?

sure

full isle Mar 29, 2025, 12:23 PM

#

I entered knowing I'm gonna do it just for the practice, but 0.88 with 2 months to go is craaazy

wide monolith Mar 29, 2025, 12:37 PM

#

I created a notebook to make a dataset. The notebook shows there are 58k images, but when I create a dataset from the notebook's output, it shows only 500 files. What is happening? Am I doing anything wrong?

flat night Mar 30, 2025, 7:50 PM

#

Hi team! I'm a Kaggle newbie and had some questions on the competition setup:

1. Does training AND inference have to run in under 90 minutes? Or just inference? If it's the former, why do some example notebooks like this just load checkpoints?
2. What motivates folks to share their submissions and approaches on the open forum?
3. What's the policy on external training datasets? I would think they're not allowed, but posts like this have me confused.

uneven sundial Mar 30, 2025, 8:05 PM

#

flat night Hi team! I'm a Kaggle newbie and had some questions on the competition setup: 1....

This is mentioned in the competition overview on Kaggle. So external datasets and pretrained models are fine as long as they're freely and publicly available.

Screenshot_2025-03-30-22-04-32-807_com.duckduckgo.mobile.android-edit.jpg

#

On 2, a lot of people have an innate desire to share their knowledge and understanding.

#

This satisfies a primitive urge and may in some cases improve their social status.

flat night Mar 30, 2025, 8:16 PM

#

Can you use external datasets and pretrained models for official submissions? If so, doesn't that contradict "internet access disabled"?

#

oh I guess running CPU inference on 700 samples in 90 minutes is itself quite limiting
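
To put that limit in numbers, a quick back-of-the-envelope calculation using the figures from the message above (about 700 soundscapes, 90 minutes). It assumes the whole budget is available for inference, which ignores notebook startup and model loading.

```python
def seconds_per_file(limit_minutes: float, n_files: int) -> float:
    """Average wall-clock budget per file under a total time limit."""
    return limit_minutes * 60 / n_files


budget = seconds_per_file(90, 700)
print(f"{budget:.1f} s per soundscape")  # prints "7.7 s per soundscape"
```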

vale brook Mar 30, 2025, 9:03 PM

#

I'm having a problem: when I submit my code, I'm not given any data in test_soundscapes.

#

Does anyone know why this would be happening?

flat night Mar 30, 2025, 9:44 PM

#

Have you tried making an official solution like so?

vale brook Mar 31, 2025, 4:32 PM

#

flat night Have you tried making an official solution like so?

yes I did

vale brook Mar 31, 2025, 5:04 PM

#

At first, I thought I had some blunder in my code. But after more adjustments I found that even when I copied a few lines from "BirdCLEF+ 2025: Simple Submission" that are supposed to read all files in test_soundscapes/ and print the length of that list, it printed "0 files".

flat night Apr 1, 2025, 5:33 AM

#

oh the UI is potentially confusing. As far as I know, if you're able to print any output, you're likely running in a Kaggle notebook, in which case they intentionally leave the test set unpopulated.

just to make sure, if you're on this page, can you try hitting this button?

vale brook Apr 2, 2025, 2:17 PM

#

yes I tried this

#

It starts running and then fails, and in the log I can see that when my code tried to read data from test_soundscapes, it couldn't because there wasn't any.

peak apex Apr 5, 2025, 1:07 PM

#

what's up with #1 having .902 😭

#

it's gotta be overfitting the public LB, right?

chilly bramble Apr 7, 2025, 12:05 PM

#

Check this. https://www.kaggle.com/competitions/birdclef-2025/discussion/570837

BirdCLEF+ 2025

Species identification from audio, focused on birds, amphibians, mammals and insects from the Middle Magdalena Valley of Colombia.

chilly bramble Apr 9, 2025, 9:01 AM

#

I'm looking for motivated teammates.
I'm particularly interested in collaborating with people who are passionate, diligent, and eager to learn together. I'm from South Korea, so teammates comfortable with international collaboration and open communication would be ideal.

sonic tundra Apr 9, 2025, 10:48 PM

#

chilly bramble I'm looking for motivated teammates. I'm particularly interested in collaboratin...

I have been wanting to join a team and would be open to international collaboration.

peak apex Apr 15, 2025, 4:51 AM

#

why do most of these birdclef solutions have CNN models that are trained on 5 folds and then they ensemble all 5 folds

#

why not just train on the entire dataset in 1 fold and use 1 model trained on everything?

arctic steppe Apr 15, 2025, 2:26 PM

#

peak apex why not just train on the entire dataset in 1 fold and use 1 model trained on ev...

The 5 folds split the training data into 5 buckets. Each model uses 1 bucket as the validation set and trains on the other 4, and the 5 resulting models are then ensembled for better generalization. The difficulty with 5-fold in this particular competition is that many labels don't have enough data to split 5 ways without leaving some folds completely void of those labels. Some of the rarer classes only have about 2 oggs in total. One option would be to create multiple melspecs from each ogg so that you have enough for multiple folds, but you risk data leakage there, likely causing overfitting.

Perhaps the best option would be to assign the rare classes round-robin and then keep for validation only the folds that contain one of the rare classes. So 5-fold, but using only 2 of the folds given the rarity of the classes (where each of those 2 folds serves as a validation set).

For the full dataset you could do a train/validation split as long as you round-robin the rare classes.
Then again, plain 5-fold may give the best general answer given how rare the rare classes are, so folks relying on it end up doing well, and the benefit of properly splitting the rare ones is minor since there isn't much there to learn from.
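
A minimal sketch of the round-robin idea, under the assumption that fold assignment is done per label in the order samples appear. The label names and counts are invented for the demo.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence, Set


def assign_folds(labels: Sequence[str], n_folds: int = 5) -> List[int]:
    """Assign each sample a fold index, cycling through the folds
    separately for each label so its samples are spread out."""
    next_fold: Dict[str, int] = defaultdict(int)
    folds = []
    for label in labels:
        folds.append(next_fold[label] % n_folds)
        next_fold[label] += 1
    return folds


def folds_with_rare(
    labels: Sequence[str], folds: Sequence[int], rare_labels: Iterable[str]
) -> List[int]:
    """Return the folds whose samples include at least one rare label."""
    rare = set(rare_labels)
    seen: Dict[int, Set[str]] = defaultdict(set)
    for label, fold in zip(labels, folds):
        seen[fold].add(label)
    return sorted(f for f, present in seen.items() if present & rare)


# Ten samples of a common class and two of a rare one.
labels = ["common"] * 10 + ["rare"] * 2
folds = assign_folds(labels)
print(folds)                                     # [0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 0, 1]
print(folds_with_rare(labels, folds, ["rare"]))  # [0, 1]
```

With only 2 rare samples, just 2 of the 5 folds can validate on the rare class, which matches the "5 fold but using only 2 of the folds" suggestion. Every label starts at fold 0 in this sketch, so a real split might rotate the starting fold to balance fold sizes.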

peak apex Apr 16, 2025, 12:10 AM

#

arctic steppe The 5 folds allow the model to have the training data segmented into 5 buckets u...

I see thank you for such a detailed explanation!

quaint wharf Apr 18, 2025, 10:47 AM

#

Hi, I have one question. A lot of audio files are longer than the typical chunk duration of 5 or 10 seconds. Let's take, for example, the first two training examples that have the primary_label 1139490. The corresponding audio files are CSA36385.ogg and CSA36389.ogg. They are 1:39 and 1:37 minutes long, respectively. Do you just truncate the audio files and only pick the first 5 or 10 seconds? Or would it be a better idea to create more training samples with this primary_label? With a chunk duration of 5 seconds, it is possible to have roughly (99 s + 97 s) / 5 s ≈ 39 training samples instead of 2. Or is this not advisable?

arctic steppe Apr 18, 2025, 12:05 PM

#

quaint wharf Hi, I have one question. A lot of audio files are longer than the typical chunk ...

One difficulty with using multiple from the same source is the frequency of animal calls.

I noticed:

- Usually the subject makes a noise within the first 5/10s, as the person uploading to the service cropped it so they are immediately heard (since these come from those naturalist sites).
- After the first vocalization it can vary, with some animals making noises consistently while others have a break between their vocalizations.
- Some recordings have humans either annotating after the sound or human voices intermixed with the animal sounds (the raccoons, for example, had at least one where people were commenting on hearing raccoons, and some authors' samples consistently have about 5s of the animal and then a minute of annotation).
- There's a chance that by including multiple chunks from the same source file you'll overfit, especially if the chunks end up in various folds / aren't grouped during splitting.
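
The grouping concern in the last point can be sketched like this: cut each file into fixed-length chunks and carry the source file name along as a group key, so a grouped splitter (for example scikit-learn's `GroupKFold`) can keep all chunks of one recording in the same fold. The durations are the two files from the question (1:39 and 1:37); dropping the trailing partial chunk is an assumption of this sketch.

```python
from typing import Dict, List, Tuple


def chunk_windows(duration_s: float, chunk_s: float = 5.0) -> List[Tuple[float, float]]:
    """Return (start, end) times of the full, non-overlapping chunks."""
    n_chunks = int(duration_s // chunk_s)
    return [(i * chunk_s, (i + 1) * chunk_s) for i in range(n_chunks)]


# Durations in seconds of the two files mentioned in the question.
files: Dict[str, float] = {"CSA36385.ogg": 99.0, "CSA36389.ogg": 97.0}

samples = []
for name, duration in files.items():
    for start, end in chunk_windows(duration):
        # The source file name doubles as the group key for a grouped split.
        samples.append({"group": name, "start": start, "end": end})

print(len(samples))  # 19 full chunks per file, 38 in total
```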

I think the first 5/10s are probably the safest, but you could include the others with caution to avoid overfitting. Some concern about uneven vocalizations could be offset by using a model like perch to detect whether the bird was present in that segment: https://www.kaggle.com/models/google/bird-vocalization-classifier It doesn't contain all of the birds for the competition, though, so it only covers some instances, but it may help to identify segments of value for the given set.

There has been some discussion on the forum about identifying the human voices via silero-vad: https://www.kaggle.com/competitions/birdclef-2025/discussion/568886 I did notice that some bird vocalizations are misclassified as human with this approach, though, so some caution is needed.

round lintel Apr 21, 2025, 6:42 PM

#

Hey, a dumb question, but should my model consider audio segments with no bird sounds? I mean is it possible that the test soundscapes contain 5 second segments with no sounds at all? Do you guys do something about it during training?

heady cairn Apr 23, 2025, 6:32 AM

#

Hello Everyone,

As you all will have noticed, the audio files in the training dataset contain audio of the species along with human annotations. I thought of cleaning this up using VAD models (usually used to detect human speech segments for speech diarization).

I developed a Python script to get the timestamps of non-speech segments, in the form of start and end timestamp lists for each audio sample.

My concern is that we don't have any ground truth. Is there any way to evaluate the results? TIA

Here's the link to the notebook : https://www.kaggle.com/code/divyaprakashr/birdclef-2025-non-speech-activity-detection/edit

BirdCLEF 2025 - Non speech activity detection

lapis night May 4, 2025, 11:47 AM

#

Hey there, I just came across this competition, checked the past years' records, and noticed the competitive score range has dramatically shifted upward. Does this suggest this competition became "easier" this year?
I can't seem to figure out the reason, since a wide variety of species has been added this time.
I'd be happy to hear from anyone, thanks!

real hull May 5, 2025, 4:46 PM

#

heady cairn Hello Everyone, As you all would have noticed, the audio files in the training...

It does not show the actual notebook sadly. But if you are using webrtcvad (which I was using), it will filter out many animal sounds as well, yielding a lot of audio without animals

real hull May 5, 2025, 4:46 PM

#

lapis night Hey there, I just came across this competition, checked the past years' records,...

One thing that helps is people can borrow successful techniques from last year, and perhaps detecting one animal from, say, a spectrogram isn't so different from detecting another animal from a spectrogram. I see a lot are building on top of existing models, which have likely also gotten better over the past year. I am no expert though

lapis night May 6, 2025, 5:57 AM

#

real hull One thing that helps is people can borrow succesful techniques from last year, a...

Wonderful! And may I ask what model(s) people/you are making use of this year?

real hull May 6, 2025, 10:16 AM

#

lapis night Wonderful! And may I ask what model(s) people/you are making use of this year?

I've trained this one on basic versions of EfficientNet and RegNet, but also used a more specific one like tf_efficientnetv2_s.in21k_ft_in1k. I believe the latter was used in combination with focal loss and placed 8th last year

#

I'm playing around with early stopping as well but for some reason it makes performance worse more often than not and I'm not sure why 😅

lapis night May 6, 2025, 10:30 AM

#

real hull I've trained this one on basic versions of EfficientNet and regnet, but also use...

Thanks, how is your score looking with these?

real hull May 6, 2025, 4:07 PM

#

lapis night Thanks, how is your score looking with these?

Highest we've gotten is 0.807, but I believe it should be able to reach 0.829 with an untouched dataset with default stopping? Right now I am running my dataset cleanser to properly remove voices and silences this time, so hopefully that will go up

lapis night May 7, 2025, 2:32 AM

#

real hull Highest we've gotten is 0.807, but I believe it should be able to reach 0.829 wi...

Great, I'll look into it too!

simple quest May 8, 2025, 12:06 AM

#

Looking for 1 serious teammate for BirdCLEF 2025. Deadline is June 5. I haven’t started yet — just wrapping up Drawing with LLMs and Image Matching first (both end May 27–31).

My goal is Top 5 minimum. I’ll go all in on BirdCLEF starting June 1, but I want someone who can start groundwork now — loading the data, testing a few baseline models, figuring out label issues, and setting up basic training.

I don’t have compute. You’ll need to train on your end or use Colab. I can handle pipeline logic, ensembling, eval logic, and wrap-up once I’m free.

You should know audio modeling — spectrograms, CNNs, maybe wav2vec2 — and be down to win, not just submit something.

DM me with past comp or audio experience. No tourists.

quaint wharf May 14, 2025, 12:34 PM

#

We are trying to speed up our training, but nothing works. We already split the notebook into separate ones (preprocessing, training, testing and submitting). We then had the idea to precompute the mel spectrograms, but that didn't help either. Does anyone have ideas that would help our training run faster? One epoch with two folds already takes a few hours. Our code is in the attachment.
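
For reference, precomputing usually means computing each spectrogram once and loading it from disk in later epochs. A minimal sketch of such a disk cache, assuming a hypothetical `compute` function in place of the real mel-spectrogram code (the attached notebook's actual functions are not shown in this thread):

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any, Callable, List


def cached_features(audio_path: str, cache_dir: Path, compute: Callable[[str], Any]) -> Any:
    """Load features from disk if present, otherwise compute and store them."""
    cache_file = cache_dir / (Path(audio_path).stem + ".pkl")
    if cache_file.exists():
        with cache_file.open("rb") as f:
            return pickle.load(f)
    features = compute(audio_path)
    with cache_file.open("wb") as f:
        pickle.dump(features, f)
    return features


calls: List[str] = []


def fake_mel(path: str) -> List[List[float]]:
    """Stand-in for the expensive spectrogram computation."""
    calls.append(path)
    return [[0.0, 1.0], [2.0, 3.0]]


with tempfile.TemporaryDirectory() as tmp:
    first = cached_features("a.ogg", Path(tmp), fake_mel)
    second = cached_features("a.ogg", Path(tmp), fake_mel)  # served from disk

print(len(calls))  # the expensive function ran only once
```

The cache key here is just the file stem, so files with the same name in different folders would collide; a real version would need a more specific key. If training is still slow with cached features, the bottleneck is probably elsewhere, such as data loading or the model itself.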

📎 birdclef-2025-training-simplecnn-and-timm-models4.ipynb

nova nebula May 27, 2025, 1:50 AM

#

Hi everybody. This is my first competition using this type of submission, and I am having a very hard time getting it to run. This is the code I am using in the notebook:

import glob
import os

test_soundscape_path = "/kaggle/input/birdclef-2025/test_soundscapes"
# "**/*.ogg" with recursive=True matches .ogg files at any depth under the folder
test_files = sorted(glob.glob(os.path.join(test_soundscape_path, "**/*.ogg"),
                              recursive=True))
print(f"Found {len(test_files)} test audio files")

I have also tried other similar ways to iterate over the files in that folder, but the folder only contains the readme.txt file. I read similar posts, I asked ChatGPT and Qwen, and I cannot understand what I am doing wrong. I am submitting this code to the competition, not just running it.

wind cedar May 27, 2025, 9:06 PM

#

nova nebula Hi everybody. This is my first competition using this type of submit, and I am h...

Check whether test_soundscapes contains any .ogg files; if not, use train_soundscapes. The check will succeed when the submission is scored on Kaggle's side, because test_soundscapes is then "silently" filled with ogg files.
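
A minimal sketch of that fallback check. The demo uses temporary folders in place of the Kaggle input directories; in a notebook you would pass `/kaggle/input/birdclef-2025/test_soundscapes` and, assuming the same layout, the matching `train_soundscapes` folder.

```python
import tempfile
from pathlib import Path
from typing import List


def find_soundscapes(test_dir: str, fallback_dir: str) -> List[Path]:
    """Return .ogg files from test_dir, or from fallback_dir if test_dir has none."""
    test_files = sorted(Path(test_dir).glob("*.ogg"))
    if test_files:
        return test_files
    return sorted(Path(fallback_dir).glob("*.ogg"))


# Temporary folders standing in for the Kaggle input directories.
with tempfile.TemporaryDirectory() as root:
    test_dir = Path(root) / "test_soundscapes"
    train_dir = Path(root) / "train_soundscapes"
    test_dir.mkdir()
    train_dir.mkdir()
    (test_dir / "readme.txt").write_text("placeholder")  # what you see when editing
    (train_dir / "example.ogg").write_bytes(b"")
    found = [p.name for p in find_soundscapes(str(test_dir), str(train_dir))]

print(found)  # ['example.ogg']
```

This way the notebook can be debugged end to end on train_soundscapes, while the same code picks up the real test files during scoring.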

hallow lance Jun 1, 2025, 1:59 AM

#

A very very lame question:
Is the 90-minute CPU-only limit for ALL the processing? Like, can I keep only the fine-tuning part of the code and load a pre-trained model? If I shrink the dataset I build from the input data files, I get no satisfactory results. But if I train in CPU-only mode, it takes too long to fit in the 90 minutes.

#

I mean uploading a model that I pre-trained on my own PC, using the code from the Kaggle notebook I upload for submission. So it has not been published anywhere before, but I am OK with others using it once the competition finishes. My question is about the definition of terms in the competition description, which I am not really familiar with.