#byu-locating-bacterial-flagellar-motors-2025 | Kaggle | Page 1

drifting magnet Mar 6, 2025, 6:44 AM

#

Hi

light junco Mar 6, 2025, 5:25 PM

#

Hey guys! My name's Andrew Darley and I'm one of the hosts for this Kaggle competition. Feel free to reach out if you have any questions! We're so excited to be doing this with you!

lime folio Mar 7, 2025, 2:00 AM

#

Hello all, I'm Muhammad Yousif, a BS IT student and data science and ML practitioner. I'm looking for a team for this competition. If you're interested, DM me. Thanks!

minor verge Mar 8, 2025, 10:13 AM

#

light junco Hey guys! My name's Andrew Darley and I'm one of the hosts for this Kaggle compe...

Add me to your team. I have 3 years of experience in IT and computer science.

dim shoal Mar 14, 2025, 11:51 AM

#

minor verge Add me to your team i have 3yrs experience in IT and computer science

Andrew's the competition host, they're not competing

modern wharf Mar 17, 2025, 2:42 PM

#

Hi all! I'm looking to join a team. I'm a PhD student in Biostatistics with previous/current experience in data science and statistics. I have given the competition a start, and would love to partner up to continue on. Thanks!

lime folio Mar 17, 2025, 6:11 PM

#

modern wharf Hi all! I'm looking to join a team. I'm a PhD student in Biostatistics with prev...

can you dm me?

minor verge Mar 19, 2025, 4:16 AM

#

modern wharf Hi all! I'm looking to join a team. I'm a PhD student in Biostatistics with prev...

Let's team up and build it ourselves

spring panther Mar 23, 2025, 10:28 PM

#

Hello! Quick question, any reason why the series have the slices saved as separate files and not as a single 3D image? Wouldn't it be more beneficial for those looking to make use of 3D conv nets or other similar methods? I'm guessing the images are similar to MRIs with voxel spacing and all

mellow stirrup Mar 24, 2025, 1:53 AM

#

spring panther Hello! Quick question, any reason why the series have the slices saved as separa...

You can just do that with some preprocessing code
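
A minimal sketch of the kind of preprocessing meant here, assuming each tomogram directory holds one JPEG per slice and that the filenames sort in slice order (the `slice_0000.jpg` naming used in the comment is hypothetical). It relies on NumPy and Pillow and is only one way to build the volume:

```python
from pathlib import Path

import numpy as np
from PIL import Image


def _read_gray(path):
    """Read one slice as an 8-bit grayscale 2D array."""
    with Image.open(path) as img:
        return np.asarray(img.convert("L"))


def load_tomogram(slice_dir):
    """Stack the 2D JPEG slices in `slice_dir` into one (z, y, x) array."""
    # Assumption: filenames are zero-padded (e.g. slice_0000.jpg), so a plain
    # lexicographic sort matches the slice order along z.
    paths = sorted(Path(slice_dir).glob("*.jpg"))
    if not paths:
        raise FileNotFoundError(f"no .jpg slices found in {slice_dir}")
    return np.stack([_read_gray(p) for p in paths], axis=0)
```

The result can be fed to a 3D model directly or cropped into sub-volumes first; memory use grows with the full tomogram size, so loading lazily per crop may be preferable for large volumes.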

spring panther Mar 24, 2025, 8:51 AM

#

I mean yea I'm just asking if the images work in the same way as MRIs

balmy ginkgo Mar 24, 2025, 11:33 AM

#

"A tomogram is a 3D volumetric representation of an object. In this competition, each tomogram is provided as a set of 2D image slices (JPEG) stored in a unique directory."

spring panther Mar 24, 2025, 10:33 PM

#

I'm guessing voxel spacing will be in the JPEG metadata then. I'm planning on resampling to a common voxel spacing, so I need to figure out the averages

balmy ginkgo Mar 25, 2025, 12:28 AM

#

It's in train.csv

mellow stirrup Mar 25, 2025, 3:30 AM

#

spring panther I'm guessing voxel spacing will be in the jpeg metadata then, I'm planning on re...

Don’t think it will work at inference time for submission because there is no CSV file for the test data. Would be great if there is some metadata in the image files themselves but not sure.

balmy ginkgo Mar 25, 2025, 8:07 AM

#

How good is the XZ/YZ resolution? I mean, resizing the 2D slices sounds good, but is resizing along z feasible?

spring panther Mar 25, 2025, 9:55 AM

#

I mean it sounds odd to me, when working with MR images resampling to a common voxel spacing is a pretty common practice. Unsure if tomography has more nuance as to how it handles voxels but that seems odd to me

#

I'll check the image metadata in a bit

tiny flax Mar 25, 2025, 6:44 PM

#

I read someone saying they want our models to be able to generalize across tomogram data of different resolutions / spacing.

mellow stirrup Mar 25, 2025, 9:51 PM

#

spring panther I'll check the image metadata in a bit

I don’t see voxel spacing in the JPEG metadata so it has to be assumed unknown at inference
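
For anyone who wants to repeat that check, here is a small sketch using Pillow. It only lists what Pillow exposes for a file (`Image.info` and EXIF tags); the function name `describe_jpeg` is made up for illustration, and an empty result does not prove that no other metadata channel exists:

```python
from PIL import Image


def describe_jpeg(path):
    """Return the basic properties and metadata Pillow can read from a JPEG."""
    with Image.open(path) as img:
        return {
            "size": img.size,                       # (width, height) in pixels
            "mode": img.mode,                       # e.g. "L" for 8-bit grayscale
            "info_keys": sorted(img.info.keys()),   # JFIF fields, comments, etc.
            "exif_tags": dict(img.getexif()),       # empty dict when there is no EXIF
        }
```

Running it on one slice per tomogram is usually enough, since all slices of a tomogram come from the same export.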

spring panther Mar 25, 2025, 10:53 PM

#

It's not an inference thing, it's just that we're unable to resample the tomograms to a common voxel spacing

#

I would have to guess that step is already taken care of by the comp hosts

mellow stirrup Mar 25, 2025, 11:04 PM

#

You can resample according to voxel spacing in the training data since it’s part of the labels file, but we don’t have that for the test set, that’s what I meant by inference
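
A rough sketch of that training-side resampling, assuming one isotropic voxel spacing per tomogram read from the labels file (the exact column name is not given in this thread, so it is left to the caller). It uses `scipy.ndimage.zoom` with linear interpolation; the target spacing is a free choice, and the coordinate rescaling is approximate to within about a voxel:

```python
import numpy as np
from scipy.ndimage import zoom


def resample_to_spacing(volume, spacing, target_spacing, motor_zyx=None):
    """Resample a (z, y, x) volume from `spacing` to `target_spacing`.

    Both spacings are in the same physical unit per voxel. A larger spacing
    means coarser voxels, so the zoom factor is spacing / target_spacing.
    """
    factor = float(spacing) / float(target_spacing)
    resampled = zoom(volume, factor, order=1)  # order=1: linear interpolation
    if motor_zyx is None:
        return resampled, None
    # Use the realised per-axis scale, since zoom rounds the output shape.
    scale = np.array(resampled.shape) / np.array(volume.shape)
    return resampled, np.asarray(motor_zyx, dtype=float) * scale
```

As noted in the message above, this only helps directly on the training data; at inference the spacing would have to be estimated or the model made robust to scale.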

gloomy osprey Mar 25, 2025, 11:51 PM

#

spring panther Hello! Quick question, any reason why the series have the slices saved as separa...

Because when you process it, you are going to process slices

light junco Mar 26, 2025, 4:49 AM

#

balmy ginkgo How good is xz/yz resolution? I mean, resize the 2D slices sounds good, but resi...

These are examples of XZ and YZ slices

balmy ginkgo Mar 26, 2025, 9:02 AM

#

Thank you.

spring panther Mar 26, 2025, 9:53 AM

#

mellow stirrup You can resample according to voxel spacing in the training data since it’s part...

Yeah, I guess that's one way of approaching it

spring panther Mar 26, 2025, 11:50 AM

#

How long does scoring take? It's been about an hour and still at it

spring panther Mar 26, 2025, 12:35 PM

#

This seems odd: my notebook runtime was less than 1 hour (just testing the submission pipeline) and it's still scoring

grave rose Mar 26, 2025, 1:09 PM

#

The test dataset has 900 tomograms, not 3

#

It should run within 12 hours

spring panther Mar 26, 2025, 1:56 PM

#

gotcha, should be more mindful of future submissions then

#

ty

plush olive Mar 26, 2025, 7:31 PM

#

Anyone want to collaborate?

spring panther Mar 27, 2025, 10:50 AM

#

spring panther gotcha, should be more mindful of future submissions then

Note to self and anyone submitting: make sure your submission.csv doesn't have typos in the filename

tiny flax Mar 28, 2025, 4:10 AM

#

Anyone lmk if you want to collaborate. I'm doing a 3D U-Net but might be looking at object-detection-based solutions soon

pine mauve Mar 29, 2025, 12:33 PM

#

I have a question: each image given is 2D. Should I process a single image at a time with YOLO, or does the sequence matter in this problem?

balmy ginkgo Mar 29, 2025, 8:43 PM

#

They are 3D images, so you can treat them as a sequence. But it would be simpler to treat them as a volume.

#

Or just treat them as 2D images and stack the predictions. Not sure which approach is most popular
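
One possible way to "stack predictions", sketched under assumptions: a 2D detector has been run on every slice and produced rows of `(z, y, x, confidence)`. The function below takes the most confident detection and averages it with nearby detections, weighted by confidence. The name `aggregate_detections` and the default `radius` are illustrative choices, not something from this thread:

```python
import numpy as np


def aggregate_detections(detections, radius=10.0):
    """Collapse per-slice detections (z, y, x, confidence) into one 3D point.

    Returns None when there are no detections.
    """
    det = np.asarray(detections, dtype=float)
    if det.size == 0:
        return None
    best = det[np.argmax(det[:, 3])]
    # Keep detections whose 3D distance to the best one is within `radius` voxels.
    dist = np.linalg.norm(det[:, :3] - best[:3], axis=1)
    near = det[dist <= radius]
    # Confidence-weighted mean of the kept positions.
    return np.average(near[:, :3], axis=0, weights=near[:, 3])
```

A motor usually shows up in several adjacent slices, so averaging the cluster tends to give a steadier z estimate than taking the single best slice.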

pine mauve Mar 30, 2025, 6:04 AM

#

spring panther This seems odd my notebook runtime was less than 1 hour (just testing the submis...

I also submitted. How long does scoring take

#

?

pine mauve Mar 30, 2025, 9:59 AM

#

tiny flax Anyone lmk if you want to collaborate. Im doing a 3d UNet but might be looking a...

have you tried 3D unet ?

spring panther Mar 30, 2025, 10:45 AM

#

ok so my notebook runs fine and here's an example submission.csv it generated

the notebook for eval runs fine but then scoring times out, has anyone else had a similar issue?

📎 submission.csv

pine mauve Mar 30, 2025, 10:48 AM

#

spring panther ok so my notebook runs fine and here's an example submissions_csv generated th...

Which model are you using? I used a YOLOv8n model and got a score of 0.268; now training a large model

spring panther Mar 30, 2025, 10:48 AM

#

i'm using a unet

pine mauve Mar 30, 2025, 10:48 AM

#

and gonna try V-Net

pine mauve Mar 30, 2025, 10:48 AM

#

spring panther i'm using a unet

3D?

spring panther Mar 30, 2025, 10:48 AM

#

2d

pine mauve Mar 30, 2025, 10:49 AM

#

oh Thanks for information

spring panther Mar 30, 2025, 10:50 AM

#

i mean on a small subset it does fine i just don't understand why it times out on scoring

#

is my csv set up wrong?

#

ok i think i found the issue, sec

spring panther Mar 30, 2025, 11:38 AM

#

yup it was incorrect CSV formatting, don't forget to pass index=False when saving your CSV
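
To make the `index=False` point concrete, here is a small pandas sketch. The column names and the tomogram IDs are assumptions for illustration; check them against the competition's sample submission before relying on them:

```python
import pandas as pd

# Assumed column layout; verify against the official sample submission.
SUBMISSION_COLUMNS = ["tomo_id", "Motor axis 0", "Motor axis 1", "Motor axis 2"]


def write_submission(rows, path="submission.csv"):
    """Write prediction rows to CSV without pandas' extra index column."""
    df = pd.DataFrame(rows, columns=SUBMISSION_COLUMNS)
    # Without index=False pandas adds an unnamed leading column (0, 1, 2, ...),
    # which changes the header and can break the scorer's parsing.
    df.to_csv(path, index=False)
    return df
```

With the default `index=True`, the header line starts with a bare comma, which is an easy thing to spot by opening the file in a text editor before submitting.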

balmy ginkgo Mar 30, 2025, 12:27 PM

#

hidden test has more samples than local test

tiny flax Mar 30, 2025, 2:16 PM

#

pine mauve have you tried 3D unet ?

I have been trying a 3D U-Net for a while with no luck. I've seen people say it can work, but I can't get the model to learn anything. The data might be too large & sparse for U-Net

tiny flax Mar 30, 2025, 3:08 PM

#

Has anyone got 3d unet to learn?

mellow stirrup Mar 30, 2025, 6:16 PM

#

3D for me works well on training and validation locally but scored poorly on LB, not using UNet though

spring panther Mar 30, 2025, 9:40 PM

#

mellow stirrup 3D for me works well on training and validation locally but scored poorly on LB,...

How long does submission take? I'm 10 hours in with a 2D unet and still nothing

tiny flax Mar 30, 2025, 9:56 PM

#

mellow stirrup 3D for me works well on training and validation locally but scored poorly on LB,...

What kind of 3D model are you using?

mellow stirrup Mar 30, 2025, 10:25 PM

#

tiny flax What kind of 3D model are you using?

CNN

#

I think the “mask” would be too sparse for UNet — did try it but shifted away quickly

#

If you’re going 3D I think the challenge is more in how you’re preprocessing, what your objective/loss is, and any postprocessing

#

Also augmentation is key

tiny flax Mar 30, 2025, 10:38 PM

#

Absolutely. I was doing random crop, stretch & rotation using ~100^3 volumes with a Gaussian sphere target, but I think the data is too sparse. Tried many combinations of losses with no luck.
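
For readers unfamiliar with the term, a "Gaussian sphere target" can be sketched as a heatmap that equals 1 at the motor location and decays with distance. The function name and the `sigma` value are illustrative assumptions, not the poster's actual settings:

```python
import numpy as np


def gaussian_target(shape, center_zyx, sigma=3.0):
    """Return a (z, y, x) float32 heatmap with a Gaussian blob at `center_zyx`."""
    cz, cy, cx = center_zyx
    # Broadcasting three 1D axes avoids building full coordinate grids.
    z = np.arange(shape[0])[:, None, None]
    y = np.arange(shape[1])[None, :, None]
    x = np.arange(shape[2])[None, None, :]
    dist_sq = (z - cz) ** 2 + (y - cy) ** 2 + (x - cx) ** 2
    return np.exp(-dist_sq / (2.0 * sigma ** 2)).astype(np.float32)
```

The sparsity concern is visible here: only voxels within a few `sigma` of the centre carry meaningful signal, so most of a crop is near zero unless crops are sampled around the motor.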

#

Going to look into 2d&3d object detection instead

mellow stirrup Mar 30, 2025, 11:06 PM

#

It’s possible to get some great validation scores on the data provided but I think one issue is that the array sizes vary between training and holdout

spring panther Mar 31, 2025, 11:31 AM

#

mellow stirrup I think the “mask” would be too sparse for UNet — did try it but shifted away qu...

I'm doing Unet on a small subset of the data and it's doing fine, my issue is my notebooks keep timing out and that's why I asked you how long submission takes with a 3d net. I made the unet shallower and trying a submission again

mellow stirrup Mar 31, 2025, 12:46 PM

#

spring panther I'm doing Unet on a small subset of the data and it's doing fine, my issue is my...

Took mine about 3.5 hours to run through the holdout data

#

T4x2

spring panther Mar 31, 2025, 12:46 PM

#

I see, I made my Unet shallower and testing again

mellow stirrup Mar 31, 2025, 12:48 PM

#

Are you not able to run it on Kaggle through some train or test data to see how long per tomogram then have an estimate for the full 900 holdouts?

#

The array sizes vary but I think it was pretty close to what I estimated by doing that
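
A sketch of that estimate, using only the standard library. `process_one` stands for whatever function runs the full per-tomogram pipeline (a hypothetical name), and the default of 900 is the hidden test size mentioned earlier in this channel:

```python
import time


def estimate_total_hours(process_one, sample_items, n_total=900):
    """Time `process_one` on a few items and extrapolate to `n_total` items."""
    if not sample_items:
        raise ValueError("need at least one sample item to time")
    start = time.perf_counter()
    for item in sample_items:
        process_one(item)
    seconds_per_item = (time.perf_counter() - start) / len(sample_items)
    return seconds_per_item * n_total / 3600.0
```

Since array sizes vary between tomograms, timing a handful of differently sized ones gives a more honest number than timing a single small one, and it is worth leaving headroom under the runtime limit.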

spring panther Mar 31, 2025, 6:18 PM

#

it was the model taking too long, finally got a submission to work

#

so proud of this

spring panther Mar 31, 2025, 6:40 PM

#

So heads up: if anyone is trying to use a 2D U-Net, I wouldn't go past depth 3. Maybe with something like mixed precision it could be faster, but since the task is getting coordinates and not segmentation, keep the depth to a minimum

rigid zealot Apr 2, 2025, 5:32 AM

#

Anyone want to collab?
I have no credentials or anything
I've participated in czii-2024 and gone through all the solutions to an extent
Thinking of trying out various object detection models from previous 1st place solution

pine mauve Apr 2, 2025, 7:05 AM

#

rigid zealot Anyone want to colab? I have no credentials or anything I've participated in czi...

I am open to collab

#

What are the ways or models to denoise the 3D volume?

light junco Apr 2, 2025, 4:32 PM

#

pine mauve what are the way or model to denoise the 3D volume

There are methods to denoise each slice individually that you could adapt
https://www.kaggle.com/code/andreipaulavets/byu-denoising-cryo-et-slices/notebook
https://www.kaggle.com/code/andreipaulavets/byu-denoising-cryo-et-with-noise2void


little ginkgo Apr 3, 2025, 8:34 PM

#

I’m curious whether the dataset used in this competition is synthetic (generated) or derived from real experimental data. Thanks in advance!

rain kiln Apr 5, 2025, 6:53 PM

#

Please, has anyone done the capstone project for the just-concluded 5-day training?

balmy ginkgo Apr 7, 2025, 1:37 PM

#

Don't you think CV fails because train might already be augmented, so different folds can get the same tomogram just rotated/translated/flipped/etc.?

#

I mean, there are a lot of images that look rotated...

light junco Apr 8, 2025, 5:11 PM

#

little ginkgo I’m curious whether the dataset used in this competition is synthetic (generated...

All the data is authentic. It is actually currently impossible to generate synthetic data as we can't model the noise

drifting magnet Apr 9, 2025, 9:01 AM

#

I'm looking for motivated teammates.
I'm particularly interested in collaborating with people who are passionate, diligent, and eager to learn together. I'm from South Korea, so teammates comfortable with international collaboration and open communication would be ideal.

wintry monolith Apr 14, 2025, 7:13 AM

#

How do I know whether my YOLO has reached its full potential? Current mAP@50 is 0.978 and public LB is 0.769

#

I was wondering if I should start working on a custom model now or should I work on improving yolo only?

drifting magnet Apr 14, 2025, 8:14 AM

#

Try harder augmentation, or write new training code for 2.5D
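
"2.5D" here usually means feeding a 2D model a few neighbouring slices as channels so it sees some depth context. A minimal NumPy sketch, assuming the tomogram is already a `(z, y, x)` array; the function name and `half_width` default are illustrative:

```python
import numpy as np


def make_25d_input(volume, z, half_width=1):
    """Return slices z-half_width .. z+half_width stacked as channels.

    Output shape is (2 * half_width + 1, y, x). Indices past either end of
    the volume are clamped, so edge slices repeat the boundary slice.
    """
    idx = np.arange(z - half_width, z + half_width + 1)
    idx = np.clip(idx, 0, volume.shape[0] - 1)
    return volume[idx]
```

With `half_width=1` the output has three channels, which happens to match the input shape most pretrained 2D detectors expect.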

fossil charm Apr 14, 2025, 1:24 PM

#

Hi, I was wondering: if we need to turn off the internet connection for the notebook, does that also mean we cannot install any packages from the internet?

balmy ginkgo Apr 14, 2025, 1:45 PM

#

#

Put anything you need in there, save, commit, and submit the committed notebook

fossil charm Apr 15, 2025, 9:33 AM

#

Is there any reason my notebook is still showing as running a few hours after it finished? It's still not showing its score

balmy ginkgo Apr 15, 2025, 9:52 AM

#

900 hidden samples

fossil charm Apr 15, 2025, 7:16 PM

#

Has anyone faced a public score of 0 before?! I'm getting good results on the training and validation part of the data. I think it might be related to something other than the model performance! appreciate any help in advance.

mellow stirrup Apr 15, 2025, 10:16 PM

#

fossil charm Has anyone faced a public score of 0 before?! I'm getting good results on the tr...

I also get great results on the provided data but poor performance on the test set, I think it’s because the image scale is different

fossil charm Apr 18, 2025, 2:42 PM

#

Just noticed that some folders inside the train folder are not present in the train labels CSV file. Is that on purpose, or is something missing in the data?

balmy ginkgo Apr 18, 2025, 4:04 PM

#

Oh, there are 8 more folders than labels, first time I've noticed

balmy ginkgo Apr 19, 2025, 12:47 PM

#

I've just checked and there are 648 of both, not sure where I've seen more folders than ids...

balmy ginkgo Apr 19, 2025, 1:18 PM

#

and just to be clear, all 648 folders are in labels csv
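
A check like this can be scripted with the standard library. The sketch assumes one sub-folder per tomogram under the train directory and an ID column named `tomo_id` in the labels CSV; both the column name and the paths passed in are assumptions to adjust:

```python
import csv
from pathlib import Path


def compare_folders_to_labels(train_dir, labels_csv, id_column="tomo_id"):
    """Return (folders without labels, labelled IDs without folders)."""
    folders = {p.name for p in Path(train_dir).iterdir() if p.is_dir()}
    with open(labels_csv, newline="") as f:
        # A set drops duplicate IDs, e.g. one row per motor in the same tomogram.
        labeled = {row[id_column] for row in csv.DictReader(f)}
    return sorted(folders - labeled), sorted(labeled - folders)
```

Two empty lists mean the folders and the labels file agree, which matches the 648-and-648 result reported above.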