#ubc-ocean | Kaggle | Page 1

languid swan Oct 10, 2023, 7:00 AM

#

Hi, when I submit, I get the error "notebook out of memory", even though my notebook runs successfully. Any solutions? Should I decrease the batch size for my test dataloader?

shell ocean Oct 11, 2023, 12:44 PM

#

@languid swan I have the same problem😅

languid swan Oct 11, 2023, 12:45 PM

#

Ok, thanks for getting the chat started. I tried both gpu and cpu inference and then submitted. I tried different batch sizes. Finally, one submission went through: cpu inference, batch size of one but it took like 5+ hours. Everything else failed. I'm sure there is a faster way to do it.

shell ocean Oct 11, 2023, 1:17 PM

#

languid swan Ok, thanks for getting the chat started. I tried both gpu and cpu inference and ...

Oh, I haven't tried the CPU version. Thank you for sharing!

vocal tapir Oct 12, 2023, 1:34 AM

#

I think in the competition page it mentioned that some test set images were too big and they were looking into it

#

Not sure what the solution is gonna end up being

#

You all are using whole images or taking patches of them or something?

shell ocean Oct 12, 2023, 11:37 AM

#

vocal tapir Not sure what the solution is gonna end up being

Both of them, I tried.
But, threw OOM error.
Including when using CPU for me😅

vocal tapir Oct 12, 2023, 1:47 PM

#

very painful

#

Hmm

#

I saw there's a library that lets you load a patch without putting the whole image into memory. Gonna try that later.

mighty scarab Oct 13, 2023, 12:20 AM

#

vocal tapir I saw there's a library that lets you load a patch without putting the whole ...

what library is that?

vocal tapir Oct 13, 2023, 1:30 AM

#

https://www.kaggle.com/competitions/mayo-clinic-strip-ai/discussion/335976

Mayo Clinic - STRIP AI

Image Classification of Stroke Blood Clot Origin

#

openslide
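
A minimal sketch of the "load a patch without loading the whole image" idea, using OpenSlide's `read_region`. The 256-pixel tile size and the helper names `tile_origins` and `read_tile` are illustrative assumptions, and whether OpenSlide can open this competition's files directly is not confirmed in this thread.

```python
from typing import Iterator, Tuple


def tile_origins(width: int, height: int, tile: int) -> Iterator[Tuple[int, int]]:
    """Yield the top-left (x, y) corner of every tile that fits fully inside the image."""
    for y in range(0, height - tile + 1, tile):
        for x in range(0, width - tile + 1, tile):
            yield x, y


def read_tile(slide_path: str, x: int, y: int, tile: int = 256):
    """Read one tile at full resolution (level 0) without decoding the whole slide."""
    import openslide  # third-party; imported lazily so tile_origins works without it

    slide = openslide.OpenSlide(slide_path)
    try:
        # read_region takes (location, level, size) and returns an RGBA PIL image
        return slide.read_region((x, y), 0, (tile, tile)).convert("RGB")
    finally:
        slide.close()
```

Only the requested region is decoded, so memory use depends on the tile size rather than on the slide size.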

languid swan Oct 13, 2023, 8:25 AM

#

vocal tapir I think in the competition page it mentioned that some test set images were too ...

What worked for me was downsizing the images (using transforms) for training and applying the same transformation to the test dataset. So when processing, the image size is 128x128 or 224x224 with a batch size of 1 or 2. It takes 8+ hours for the notebook to finish scoring. Hopefully things will improve when they release a solution next week for handling large images in the test set. If you know of any other workaround, please let me know.
Also, I ran the notebook on CPU, not GPU; even with the above workaround, the GPU notebook failed every time. I guess it was due to the same issue they mentioned, that the test images are too large to fit in GPU memory.
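
A rough back-of-the-envelope sketch of why downsizing helps: the RAM needed for a decoded image grows with width x height x channels. The 50,000 x 40,000 dimensions below are a hypothetical example, not a measured size from the dataset.

```python
def decoded_size_gb(width: int, height: int, channels: int = 3, bytes_per_value: int = 1) -> float:
    """Approximate RAM needed to hold one fully decoded image, in gigabytes (1e9 bytes)."""
    return width * height * channels * bytes_per_value / 1e9


full = decoded_size_gb(50_000, 40_000)  # hypothetical whole-slide dimensions, uint8 RGB
small = decoded_size_gb(224, 224)       # the downsized input mentioned above
print(f"full-size: {full:.1f} GB, 224x224: {small * 1e3:.2f} MB")
```

Converting the same array to float32 (`bytes_per_value=4`) multiplies the figure by four, which is one way a notebook can run out of memory even at batch size 1.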

vocal tapir Oct 15, 2023, 1:11 AM

#

https://www.kaggle.com/code/pjmathematician/ucbo-256-tiles-loading/notebook dataset of small patches of the images

UCBO_256_Tiles_Loading

Explore and run machine learning code with Kaggle Notebooks | Using data from multiple data sources

mortal lantern Oct 17, 2023, 8:19 AM

#

Hey guys. I just prepared my submission notebook and it ran successfully, but in the submission tab I always get the "Notebook Threw Exception" error. It's impossible to know where (or why) the error occurred. Do some of you maybe know of any abnormalities in the test images?

thin minnow Oct 17, 2023, 1:45 PM

#

Hi everyone. How do you deal with those heavy images, just to get started?

mortal lantern Oct 17, 2023, 2:35 PM

#

thin minnow Hi everyone. How do you deal with those heavy images, just to get started?

I created a simple notebook that returns some patches of the original, huge image and saves them in the output folder. It runs for about 8h and then it is available as download.

lunar lichen Oct 21, 2023, 9:09 AM

#

languid swan what worked for me is to downsizing the image (using transforms) to train and sa...

Currently using 512 by 512, and it takes around 7 hours. I think there's a trade off between data size and model size

lunar lichen Oct 21, 2023, 9:13 AM

#

shell ocean Both of them, I tried. But, threw OOM error. Including when using CPU for me😅

Have you tried setting batch size and number of workers to 1?

shell ocean Oct 21, 2023, 10:59 PM

#

lunar lichen Have you tried setting batch size and number of workers to 1?

I resolved it by using the thumbnail.png pictures for the WSI images in the test dataset!
In my opinion, the batch size should be the same for train and test.
And the number of workers depends on CPU capacity, not GPU.

thin relic Oct 22, 2023, 12:17 PM

#

How do you train a tailed image, is it possible to train it with just a part of it? I think I need to train it with the whole thing, but I don't have a clue how to do that. Is there any code I can refer to?

tired cave Oct 24, 2023, 1:39 PM

#

Hi guys, I am facing that problem when I try to submit my notebook. Any insights from you will be very helpful 🙂

opal sedge Oct 24, 2023, 2:10 PM

#

in your notebook option you need to turn off internet, for submission:

tired cave Oct 24, 2023, 2:11 PM

#

opal sedge in your notebook option you need to turn off internet, for submission:

thank you so much @opal sedge

vocal tapir Oct 26, 2023, 2:05 PM

#

thin relic How do you train a tailed image, is it possible to train it with just a part of ...

What do you mean by tailed image?

wind cairn Oct 26, 2023, 6:46 PM

#

I got validation and test accuracy of 1.00. I trained the model on CPU as my Kaggle GPU and TPU resources were exhausted, and it took 1 hr 😔

shell ocean Oct 27, 2023, 11:48 AM

#

wind cairn I got validation and test accuracy 1.00 , i trained model on CPU as my kaggle GP...

Isn't it overfit?

wind cairn Oct 27, 2023, 3:01 PM

#

shell ocean Isn't it overfit?

Yes I'm getting a test accuracy of 0.85

wind cairn Oct 27, 2023, 3:02 PM

#

shell ocean Isn't it overfit?

But I don't know why I'm also getting a scoring error when submitting

#

rn_image_picker_lib_temp_3f73f9bc-9b6e-422c-8ddc-2925bc9ff1e3.jpg

vocal tapir Oct 28, 2023, 1:45 AM

#

Does it say in logs what error it is

vocal tapir Oct 28, 2023, 1:45 AM

#

wind cairn I got validation and test accuracy 1.00 , i trained model on CPU as my kaggle GP...

1 hr cpu training for that is pretty good

#

You trained on image patches?

#

Maybe there’s a problem with how you convert solution set to patches

wind cairn Oct 28, 2023, 7:09 AM

#

vocal tapir You trained on image patches?

What image patches? I trained on thumbnails. The notebook runs fine, it runs successfully, but the submission shows an error. What could be the problem in the code? Also, the submission file is in the right format.

vocal tapir Oct 28, 2023, 7:25 AM

#

That's weird

#

if you can't find an error message then idk how you'll debug

#

start with a dead simple solution that just guesses the same category for everything and build up from there until you find the part that breaks it
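
A minimal sketch of that "guess the same category for everything" baseline. The column name `label` and the class `HGSC` are assumptions for illustration; check the competition's sample submission for the exact header and valid labels.

```python
import csv
from typing import Iterable, List


def submission_rows(image_ids: Iterable[str], label: str) -> List[List[str]]:
    """Build a header row plus one row per image, all with the same label."""
    rows = [["image_id", "label"]]  # assumed column names
    rows.extend([str(image_id), label] for image_id in image_ids)
    return rows


def write_submission(rows: List[List[str]], out_path: str = "submission.csv") -> None:
    """Write the rows to a CSV file."""
    with open(out_path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
```

If `write_submission(submission_rows(ids, "HGSC"))` scores without an error, the file format is fine and the failure is somewhere in image loading or inference.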

wind cairn Oct 28, 2023, 12:18 PM

#

vocal tapir That's weird

It says something about a hidden dataset. I think the hidden dataset has larger images, and I trained the model on thumbnail images and also normalised on thumbnail images. This might be the issue; I may have to train on the actual data.

vocal tapir Oct 28, 2023, 2:30 PM

#

Yea I guess just double check your code that loads the images

wind cairn Oct 28, 2023, 2:46 PM

#

vocal tapir Yea I guess just double check your code that loads the images

Did you train on thumbnail data or the actual large file data?

vocal tapir Oct 28, 2023, 2:47 PM

#

vocal tapir https://www.kaggle.com/code/pjmathematician/ucbo-256-tiles-loading/notebook data...

I’m using this right now

#

Haven’t actually figured out how to process the test set though lol

vocal tapir Oct 28, 2023, 3:35 PM

#

@wind cairn in your code to load and use the test set, you can try treating the train image folder as if it was the test set and see if it breaks. Maybe your code works fine on the sample test set because it’s only one image but breaks if it has to load multiple

wind cairn Oct 28, 2023, 3:49 PM

#

vocal tapir @wind cairn in your code to load and use the test set, you can try tr...

But what do you use for training, train_thumbnails or the train_image folder? Even with a data loader, when I try to run the train_image folder on TPU it gives me a memory-full error.

vocal tapir Oct 28, 2023, 3:51 PM

#

Oh I meant thumbnails since that’s what you are planning to use as your model input

#

So run your submission code but use train thumbnails instead of test thumbnails just to see if it crashes or not
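
One way to sketch that check: run the same prediction function over many files and collect every failure instead of stopping at the first. `smoke_test` and the `predict` callable are hypothetical names standing in for your own submission code.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable, List, Tuple, Union

PathLike = Union[str, Path]


def smoke_test(
    paths: Iterable[PathLike], predict: Callable[[Path], str]
) -> Tuple[Dict[str, str], List[Tuple[str, str]]]:
    """Run predict on every path; return (predictions by file stem, list of failures)."""
    predictions: Dict[str, str] = {}
    failures: List[Tuple[str, str]] = []
    for raw in paths:
        path = Path(raw)
        try:
            predictions[path.stem] = predict(path)
        except Exception as exc:  # keep going so every bad file gets reported
            failures.append((path.name, repr(exc)))
    return predictions, failures
```

For example, pass it `Path(train_thumbnail_dir).glob("*.png")`, where `train_thumbnail_dir` is wherever the train thumbnails are mounted in your notebook. Note that a hard out-of-memory kill by the system would still end the process; this only catches Python exceptions.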

wind cairn Oct 28, 2023, 3:53 PM

#

vocal tapir So run your submission code but use train thumbnails instead of test thumbnails ...

Ok, so I need to use train_thumbnails for testing purposes, to see whether it crashes or not

#

Test image took 150gb of ram to load

vocal tapir Oct 28, 2023, 3:53 PM

#

Yea use thumbnails

wind cairn Oct 28, 2023, 3:53 PM

#

Do i need to use train_image folder for training?

#

As even on tpu getting memory error with dataloader

wind cairn Oct 28, 2023, 3:55 PM

#

vocal tapir Yea use thumbnails

I will tell u
I used train_thumbnail for training
Then i tested it on test image

vocal tapir Oct 28, 2023, 3:56 PM

#

Yea so just stick with using thumbnails

#

So test on test_thumbnails too

#

If u just use the whole images it’ll probably run out of memory

wind cairn Oct 29, 2023, 11:03 AM

#

vocal tapir Yea so just stick with using thumbnails

I tried testing it on the train_image folder to predict images, but no luck. I tried one image at a time in a loop, no luck. I also tried sliding tile by tile over an image, but no luck 😔

vocal tapir Oct 29, 2023, 12:45 PM

#

Yea it’ll probably run out of memory if u try to run it on the full images

wind cairn Oct 29, 2023, 1:03 PM

#

vocal tapir Yea it’ll probably run out of memory if u try to run it on the full images

So how is it possible to run it on the hidden dataset? Are they using a thumbnail hidden set or the actual high-quality image hidden set?

vocal tapir Oct 29, 2023, 1:04 PM

#

The hidden set has both full images and thumbnails I think?

#

So if you load the test_thumbnails folder and submit, then in the hidden submission it will replace the single image there with whatever the hidden dataset thumbnails are

wind cairn Oct 30, 2023, 3:40 PM

#

vocal tapir The hidden set has both full images and thumbnails I think?

I tried to predict with train_thumbnails and got no memory errors.

But when I submit the notebook, the notebook submission runs, but scoring fails. What could be the problem?

vocal tapir Oct 30, 2023, 4:08 PM

#

Can u post the notebook it’s p hard to tell without error message

wind cairn Oct 30, 2023, 4:23 PM

#

vocal tapir Can u post the notebook it’s p hard to tell without error message

Yes sure

wind cairn Oct 30, 2023, 4:25 PM

#

vocal tapir Can u post the notebook it’s p hard to tell without error message

i dme u

terse totem Oct 31, 2023, 10:46 AM

#

https://www.kaggle.com/code/aritrag/kerascv-train-and-infer-on-thumbnails

Just posting it here in case this helps people get started with KerasCV.

[KerasCV] train and infer on thumbnails


gloomy tundra Nov 4, 2023, 3:28 PM

#

hello!

gloomy tundra Nov 5, 2023, 3:14 AM

#

is there any benchmark available on how long it takes to read all the train/test images?

keen orbit Nov 8, 2023, 9:38 PM

#

Hi everyone, I get the error `Notebook Out of Memory. Your notebook requested more memory (RAM) than is available` when I submit my notebook. After some research, I discovered the library pyvips, which lets you compress images.

I would like to know how to use pyvips to compress the train images so as to reduce memory usage.

obsidian sigil Nov 9, 2023, 7:13 AM

#

Is kaggle’s discussion search broken?

opal sedge Nov 9, 2023, 9:17 AM

#

keen orbit Hii everyone, I get the error of `Notebook Out of Memory. Your notebook requeste...

there is a guide here, take a look,
https://www.kaggle.com/code/aliabbasi/ubc-eda-pyvips-in-offline-mode
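
As a rough sketch of the pyvips idea (not taken from the linked guide): `pyvips.Image.thumbnail` shrinks an image while loading it, and `write_to_file` can then save a smaller JPEG. The 2048-pixel limit, the quality of 80, and the helper names are assumptions for illustration; `fit_within` only shows the size arithmetic and is not part of pyvips.

```python
from typing import Tuple


def fit_within(width: int, height: int, max_side: int) -> Tuple[int, int]:
    """Return (width, height) scaled down so the longer side is at most max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # already small enough; leave unchanged
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def save_downsized_jpeg(src_path: str, dst_path: str, max_side: int = 2048, quality: int = 80) -> None:
    """Shrink an image while loading it and save the result as a JPEG."""
    import pyvips  # third-party; imported lazily so fit_within works without it

    # thumbnail() resizes during loading, so the full-size image should not
    # need to be held in memory as a whole (by default it may also upscale)
    image = pyvips.Image.thumbnail(src_path, max_side)
    image.write_to_file(dst_path, Q=quality)  # Q is the JPEG quality setting
```

Running this once over the training images and saving the results as a Kaggle dataset keeps the heavy decoding out of the training loop.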

obsidian sigil Nov 10, 2023, 1:01 PM

#

Help me.
For some reason, code submission fails.

https://www.kaggle.com/code/yamitomo/i-need-help

I need help


gusty valve Nov 12, 2023, 6:11 PM

#

Hello, please help me.
The prediction submission fails with the error "Submission score error". The submission file is created by the image-slicing code; it fills all the image_id rows with labels and writes the output file "submission.csv". Look at this.
Anyway, I'm sharing my notebook. Thanks for taking a look. https://www.kaggle.com/code/mdquilindo/ubc-submit-large-images

UBC - Submit large images


opal sedge Nov 21, 2023, 7:16 PM

#

Hi everybody,
How are you guys handling the long running time of the submission notebook due to large images? Any recommendations or sample notebooks to look at?

My submission times out after 12 hours of running; apparently it doesn’t run fast enough

opal sedge Nov 22, 2023, 11:50 AM

#

https://tenor.com/view/tumbleweeds-desert-hot-dry-drought-gif-21341711

Tenor

opal sedge Nov 28, 2023, 6:05 PM

#

welcome, what is your leaderboard score?

tiny tiger Nov 29, 2023, 5:05 AM

#

Yeah I'm looking for

visual hazel Dec 12, 2023, 12:36 AM

#

Just joined the group! Sorry if this was asked before, but is there a resized version of the dataset hosted somewhere that is lighter in size? I just cannot download about a TB of raw data unfortunately :/ (Thx!)

limpid mist Dec 13, 2023, 8:44 PM

#

https://www.kaggle.com/datasets/aifahim/ubc-ocean-jpeg-compress-datasetgunes-approach has the full-sized images converted to JPEG for a download size of 17 GB.

UBC-OCEAN - JPEG Compress Dataset(Gunes Approach)

opal sedge Dec 14, 2023, 12:58 PM

#

limpid mist https://www.kaggle.com/datasets/aifahim/ubc-ocean-jpeg-compress-datasetgunes-app...

But doesn't it lose quality? Apparently there are tiny features which indicate which class the image belongs to.

limpid mist Dec 14, 2023, 8:51 PM

#

opal sedge But doesn't it lose quality? Apparently there are tiny features which indicate...

Yes, there is some lost quality due to resizing and saving as a JPEG image at 80%. The reason to use the JPEG images is to be able to download the data and test out various ideas locally without having to have an Internet connection.

vivid wedge Dec 16, 2023, 12:54 PM

#

limpid mist https://www.kaggle.com/datasets/aifahim/ubc-ocean-jpeg-compress-datasetgunes-app...

Gives 404

limpid mist Dec 16, 2023, 7:05 PM

#

vivid wedge Gives 404

I just checked that link and it appears to be OK. The page should have a button to download a zip file which is stored on a Google service. I didn't create the dataset so if you still have problems accessing it, report the problem to Kaggle by creating a new Discussion topic in this competition.

vivid wedge Dec 16, 2023, 7:43 PM

#

limpid mist Yes, there is some lost quality due to resizing and saving as a JPEG image at 80...

Somehow now it worked, thanks.

civic hamlet Dec 17, 2023, 6:08 AM

#

going to try some weird technique I came up with

thin glacier Dec 25, 2023, 5:44 AM

#

Hi @all.

#

Does submitting a notebook in GPU mode consume our GPU quota?

orchid flame Dec 26, 2023, 11:30 AM

#

Hello Everyone, I have had issues with submission for the previous three days. This is my first challenge in kaggle. I really need your help if anyone here can help

heavy cloud Dec 30, 2023, 4:26 PM

#

orchid flame Hello Everyone, I have had issues with submission for the previous three days. ...

hey @orchid flame , what is the issue with the submission?

orchid flame Dec 31, 2023, 2:08 PM

#

heavy cloud hey @orchid flame , what is the issue with the submission?

Even right now, I have just submitted and I get the error: Notebook Threw Exception. Very frustrating

orchid flame Dec 31, 2023, 2:09 PM

#

heavy cloud hey @orchid flame , what is the issue with the submission?

Here is a screenshot

orchid flame Dec 31, 2023, 2:11 PM

#

wind cairn I tried to predict with train_thumbnails and no memory errors, But when i submit...

Hi Mukesh, I happen to be getting the same error. Did you solve yours?

wind cairn Dec 31, 2023, 3:14 PM

#

orchid flame Hi Mukesh, I happen to be getting the same error. Did you solve yours?

Nope

heavy cloud Dec 31, 2023, 7:14 PM

#

Hi @orchid flame Here's the approach that proved successful for me:

1. Save Models, Upload, and Create a Kaggle Dataset:
   - Save your trained models.
   - Upload them to Kaggle and organize them to create a dataset.

2. Create a New Notebook for Inference in the Competition:
   - Develop a new notebook specifically tailored for performing inference during the competition.

3. Add Your Model Dataset as Input:
   - Include your model dataset as an input to the notebook for seamless integration.

4. Write Code to Load Models and Establish Your Inference Pipeline:
   - Develop code to efficiently load your pre-trained models.
   - Establish a robust inference pipeline that aligns with the competition's requirements.

5. Optimize Preprocessing Steps for Computational Efficiency:
   - Ensure that your preprocessing steps are optimized for efficiency.
   - Minimize the computational time required for preprocessing to enhance overall performance.

6. Switch to CPU and Submit:
   - Consider switching to CPU for final submissions to meet any competition constraints.
   - Validate and fine-tune your code on CPU to ensure compatibility before submitting your results.

orchid flame Dec 31, 2023, 7:17 PM

#

heavy cloud Hi @orchid flame Here's the approach that proved successful for me: Sa...

Thanks @heavy cloud. This sounds solid. I'm going to start on it tomorrow and will report back the results.

indigo onyx Dec 31, 2023, 9:00 PM

#

Hello Dears,
I want to submit my UBC Ovarian Cancer Subtype notebook, but there is a problem with the submission: it failed after about 30 seconds and showed the message "Notebook Threw Exception".
That's a really short time, so I think it definitely isn't related to the submission itself.
Please help, just 2 days left.

orchid flame Jan 1, 2024, 10:49 AM

#

Hey @heavy cloud , I have just submitted my notebook after following what you told me. Though my score is not that good, I feel happy as this is my first competition. Thank you very much😆

heavy cloud Jan 1, 2024, 10:57 AM

#

@orchid flame It's always a first try. Mine didn't improve either. Although the metrics for each of my models were good, the final pipeline for the test set wasn't, as reflected on the LB. Apparently it means the models are predicting only one class well while doing poorly on the others. I am waiting to see the methods others used so I can learn from them.

orchid flame Jan 1, 2024, 11:03 AM

#

heavy cloud @orchid flame Its always a first try. Mine didn't improve as well. Altho...

Indeed, this is a great challenge altogether. But thank you for the well-articulated direction you gave me. I found it straightforward, clear and precise.

indigo onyx Jan 1, 2024, 6:53 PM

#

heavy cloud @orchid flame Its always a first try. Mine didn't improve as well. Altho...

Dear @heavy cloud, I have uploaded the models through Data in the input section,
but when I submit in offline mode, it can't load the model.
When I save and run, it fails, but when the internet is on (I mean the notebook's internet) it runs and saves successfully.
This is my first submission, please help.

heavy cloud Jan 1, 2024, 7:00 PM

#

@indigo onyx, you have to create a dataset from the models you have trained and saved. First, download your saved models to your local machine. Then, in your Kaggle notebook, click Add Data and upload them as a dataset. Then add it to the input section as you mentioned. Finally, load the models from your new dataset under the input, and it will work.
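
A small sketch for checking that the uploaded model files are actually visible to the notebook before loading them. Kaggle mounts attached datasets under `/kaggle/input`; the file suffixes below are assumptions, so adjust them to however you saved your models.

```python
from pathlib import Path
from typing import List, Tuple


def find_model_files(
    input_root: str = "/kaggle/input",
    suffixes: Tuple[str, ...] = (".pt", ".pth", ".h5", ".keras"),  # assumed formats
) -> List[Path]:
    """Return every file under input_root whose suffix looks like a saved model."""
    root = Path(input_root)
    if not root.is_dir():
        return []  # e.g. when running outside Kaggle
    return sorted(p for p in root.rglob("*") if p.is_file() and p.suffix in suffixes)
```

Printing `find_model_files()` at the top of the inference notebook shows the exact path to pass to your framework's load function, with no internet access needed.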

indigo onyx Jan 1, 2024, 7:13 PM

#

heavy cloud @indigo onyx . you have to create a dataset from the models you have tr...

thanks i will try

indigo onyx Jan 1, 2024, 7:36 PM

#

heavy cloud @indigo onyx . you have to create a dataset from the models you have tr...

thanks a lot it worked for me

rich crescent Jan 4, 2024, 12:14 AM

#

Congratulations to Team "bootstrap" for winning this competition.🏅

orchid flame Jan 5, 2024, 8:27 AM

#

But what is the difference between public and private leaderboard?

orchid flame Jan 8, 2024, 10:01 AM

#

I actually had to go and find out the difference.

#

😆