#rsna-2023-abdominal-trauma-detection

vital axle
#

Greetings to everyone.
I want to join a team for this competition.
If needed, please DM me.

bitter bone
#

Hello friends, I am looking to team up with somebody for this competition. I am a data analytics graduate and would prefer to join someone from the medical domain.

neon vine
#

Hi everyone, thanks for getting this channel kicked off! I see you've posted in the #👥┊looking-for-a-team channel, @vital axle. That's great, thank you! @bitter bone, you'll notice I've just posted a template in that channel if you'd like to post there as well!

thick escarp
#

Hello all - happy to be here. Am also looking to form a team or be part of a team

short bridge
#

@neon vine good timing, we clicked enter at the same time 😄

thick escarp
neon vine
thick escarp
#

yup - appreciate the efforts

sinful niche
#

I'm getting an error during submission. Can anyone explain the submission process?

pale sonnet
#

Hi all. I’m just getting started on this competition. Thought I’d share my plan for a first model. I’m loading the entire 3D scan series, permuting it to get axial, sagittal, and coronal views, and taking the middle slice of each. Then concatenating them together into a 2D image. I still need to apply the normalization ideas people are talking about in the forum.

#

I’m not expecting this model to perform very well, but it’s a start. Later I want to try using the segmentation data to grab a 3D crop of relevant organs, and then take middle slice of those and concat them.
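
A minimal sketch of the middle-slice idea described above, assuming the series is already loaded as a NumPy array ordered (slices, rows, columns). The function name, the nearest-neighbour resize, and the 128-pixel tile size are illustrative choices, not anything from the thread:

```python
import numpy as np

def middle_slices_montage(volume: np.ndarray, size: int = 128) -> np.ndarray:
    """Take the middle axial, coronal, and sagittal slices and join them side by side."""
    depth, height, width = volume.shape
    axial = volume[depth // 2, :, :]
    coronal = volume[:, height // 2, :]
    sagittal = volume[:, :, width // 2]

    def resize_nearest(img: np.ndarray) -> np.ndarray:
        # Nearest-neighbour resize so all three views share one shape.
        rows = np.arange(size) * img.shape[0] // size
        cols = np.arange(size) * img.shape[1] // size
        return img[np.ix_(rows, cols)]

    views = [resize_nearest(v) for v in (axial, coronal, sagittal)]
    return np.concatenate(views, axis=1)

# Random stand-in for a CT volume (40 slices of 64x64 pixels).
volume = np.random.default_rng(0).integers(-1000, 1000, size=(40, 64, 64)).astype(np.int16)
montage = middle_slices_montage(volume)
print(montage.shape)  # (128, 384)
```

Because slice spacing usually differs from in-plane pixel spacing, the coronal and sagittal tiles from this sketch will look stretched or squashed unless the volume is resampled first.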

sinful niche
slim igloo
#

Has anybody thought about how the 3D arrays (CT scans) could be properly normalized? The range of pixel values in each .dcm file is not the usual 0-255 range, so dividing by 255.0, as you usually do with images, does not seem to be a good normalization. Any ideas?

thick escarp
slim igloo
# sinful niche anyone say about this

Can you tell us what the error message says? It's hard to tell what the issue is without that knowledge. Maybe take a look at a sample submission in the 'Code' section of the competition to see what a working submission looks like.

sinful niche
#

It's showing an error: submission CSV not found

#

Where do we need to upload it, and where do we get it from? Please help with this problem.

#

see error

thick escarp
sinful niche
#

I have read it, but I still don't get it.

slim igloo
rustic loom
#

It says submission.csv isn't found. Something is preventing a file named submission.csv from showing up. Either you have too many other files in your working directory or your code is failing before it writes out your submission file.

On the private leaderboard it will run on different data that you don't have exposure to, so you have to write your code to be defensive against new data. You can't have anything hard-coded in terms of file paths or naming that might throw it off.

#

The general process is load in the data, run your model against this data, write the predictions out to fill in the sample_submission.csv and then write it back out as submission.csv

sinful niche
#

Do we need to load our notebook in the data later?

thick escarp
sinful niche
#

no

#

how to generate

#

bro

slim igloo
#

This would be an example (see the notebook i linked above):
The crucial part is (i) loading the submission file (first line), (ii) filling it with your predictions (second line), and (iii) submitting it by writing submission.to_csv('submission.csv', index=False)

#

Here 'Injuries' is a list of the column names in the submission dataframe
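
The three steps described above can be sketched roughly as follows. This is only an illustration: the tiny sample file, its column names, the temporary directory, and the constant predictions are stand-ins so the snippet runs anywhere. On Kaggle you would read the competition's real sample_submission.csv and write submission.csv to the working directory.

```python
import os
import tempfile

import numpy as np
import pandas as pd

workdir = tempfile.mkdtemp()
sample_path = os.path.join(workdir, "sample_submission.csv")
# Stand-in for the competition's sample_submission.csv (illustrative columns only).
pd.DataFrame(
    {"patient_id": [1, 2, 3], "bowel_healthy": 0.5, "bowel_injury": 0.5}
).to_csv(sample_path, index=False)

# (i) load the sample submission
submission = pd.read_csv(sample_path)
# 'injuries' is the list of prediction columns (everything except the id column)
injuries = [c for c in submission.columns if c != "patient_id"]

# (ii) fill it with predictions; a constant array stands in for real model output
predictions = np.tile([0.9, 0.1], (len(submission), 1))
submission[injuries] = predictions

# (iii) write it back out under the expected file name, without the index column
out_path = os.path.join(workdir, "submission.csv")
submission.to_csv(out_path, index=False)
```

Starting from the sample file rather than building a DataFrame from scratch keeps the row order and column order identical to what the scorer expects.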

sinful niche
#

Please explain how to do it.

#

ok thank you bro

slim igloo
#

Did anybody do anything different? Any other ideas?

vast linden
#

This is where "Windowing" comes in Felix.

#

There's a problem with pydicom: it doesn't handle VOI LUTs properly for this dataset.

#

pydicom does not apply the WindowWidth and WindowCenter values properly inside its apply_voi_lut() and apply_windowing() functions.

#

The "window" we see is almost a "bone" window and is not optimal for soft tissue.

#

This is what a pydicom export looks like ..

#

This is what it should look like ..

#

I exported that image with non-python software and it honors the windowing values.

#

not the same slice, but the concept remains.

slim igloo
quick mortar
#

Can someone let me know if this empty space in a windowed CT scan is expected? I'm using the default windowing values (400 and 50).

dark merlin
slim igloo
# dark merlin Why not sklearn minmax scaling?

Because the minimum/maximum varies a lot between different CT scans and i want a uniform way to normalize the data. But thanks for the recommendation, i was not aware of that function.

dark merlin
slim igloo
vast linden
#

the problem with that approach is that it's lossy. That is, we're going from 12-16 bit numbers down to 8 bit numbers.

#

some data is always lost during normalization. The key is to select which data is most important to the specific anatomy. That's why we "window" CTs.

dark merlin
# slim igloo Would that not require loading in all of the data at once into an array X?

Yes, you need an array. pydicom will read a .dcm file as an array. Sample for a single image:
from sklearn.preprocessing import MinMaxScaler
import pydicom as dicom
scaler = MinMaxScaler()

img = dicom.dcmread('/kaggle/input/rsna-2023-abdominal-trauma-detection/train_images/10300/31085/10.dcm').pixel_array

print('max pixel before scaling =', img.max())
# MinMaxScaler scales each column separately, so flatten to one column to scale the whole image together
img = scaler.fit_transform(img.reshape(-1, 1)).reshape(img.shape)
print('max pixel after scaling =', img.max())

dark merlin
slim igloo
#

Yes, that would work, but the data is >400GB, so there's no way to load all the data into one array.

dark merlin
vast linden
# dark merlin what is 'window' CT, is it alternative of normalization?

"Windowing" or "leveling" CTs is what radiologists do to dial in the proper image contrast. Most CTs have special DICOM tags that specify up to three "windows" to best view the image in. I posted two images above. The top one uses 0-1 normalization, and the bottom one is "windowed" to the default DICOM tag values of 500/50 .. which is a common abdomen window.

#

Think of a CT as a latent image. Because it has pixel values greater than the total amount of shades of gray our consumer-grade monitors can display, we must choose some to view and some to throw away. If we choose properly, we can maximize the contrast between similar tissue. If we just normalize 0-1, we lose valuable information in most cases.
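
To make the windowing idea above concrete, here is a simplified linear window written directly in NumPy. It assumes the pixel data has already been converted to Hounsfield units; the function name is made up for illustration, and the center of 50 and width of 400 are just the values mentioned in this thread. The DICOM standard's exact linear function differs slightly at the edges.

```python
import numpy as np

def apply_window(hu: np.ndarray, center: float, width: float) -> np.ndarray:
    """Clip Hounsfield units to [center - width/2, center + width/2] and scale to [0, 1]."""
    low = center - width / 2
    high = center + width / 2
    clipped = np.clip(hu, low, high)
    return (clipped - low) / (high - low)

# Air, the window's lower edge, soft tissue, the upper edge, dense bone.
hu = np.array([-1000.0, -150.0, 50.0, 250.0, 1500.0])
windowed = apply_window(hu, center=50, width=400)
print(windowed)  # values: 0.0, 0.0, 0.5, 1.0, 1.0
```

Everything outside the window collapses to pure black or pure white, which is the deliberate trade-off: the available gray levels are spent on the tissue range of interest.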

thick escarp
#

Hello all - Have a lot of reading to do - but I wanted to check if I am on the right track. I'll break my comment into multiple sections so that the replies can be tracked against each section. From the data provided, it appears this is a classification task - possibly within 13 multi-classifications? There seem to be 5 higher level classifications - two binary (bowel health / injury, extravasation health / injury), 3 ternary ( kidney healthy / low / high, liver healthy / low / high, spleen healthy / low / high). What is notable (seems to me) is that probabilities do not seem to add to 1 across all classifications. They seem to add to 1 across the 5 higher level classifications.

#

Next, I am wondering what constitutes the training data (features and labels). It seems to me that the entries in train.csv are all labels (except, of course, for the ids); the information contained in the .dcm and .nii files is likely all the training data. I am not entirely sure yet where the metadata, the aortic_hu (Hounsfield units), fits in. As you can see, I have a long way to go 😦

#

Lastly - I am still trying to read through past competitions / notebooks to understand the nature of the .dcm and .nii files, what preprocessing is needed, what models should be used/reused, and is there one model or ultimately multiple models that will somehow work together

thick escarp
slim igloo
slim igloo
lone mountain
#

hello, i'm preprocessing a slice as such:

# https://pydicom.github.io/pydicom/stable/old/working_with_pixel_data.html
import pydicom as dicom
from pydicom.pixel_data_handlers.util import apply_modality_lut, apply_voi_lut

def open_file(p):
    f = dicom.dcmread(p) # dcmread replaces the deprecated read_file
    im = f.pixel_array
    im = apply_modality_lut(im, f) # convert to Hounsfield units
    im = apply_voi_lut(im, f) # windowing
    im = (im - im.min()) / (im.max() - im.min()) # scale to [0,1]
    return im * 255

is this correct? I do get some images that are "inverted", i.e. mostly white. Can someone explain what causes that?

grave cobalt
#

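You're probably missing this:

if f.PhotometricInterpretation == "MONOCHROME1": # f is the dataset returned by the read call
    im = 1 - im # invert after scaling to [0,1] and before multiplying by 255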
lone mountain
grave cobalt
#

Have you tried this?

import numpy as np
import pydicom

def standardize_pixel_array(dcm: pydicom.dataset.FileDataset) -> np.ndarray:
    # Correct DICOM pixel_array if PixelRepresentation == 1.
    pixel_array = dcm.pixel_array
    if dcm.PixelRepresentation == 1:
        bit_shift = dcm.BitsAllocated - dcm.BitsStored
        dtype = pixel_array.dtype
        pixel_array = (pixel_array << bit_shift).astype(dtype) >> bit_shift
    return pixel_array

from this topic https://www.kaggle.com/competitions/rsna-2023-abdominal-trauma-detection/discussion/427217 ?

lone mountain
sinful niche
#

Can anyone explain how to open or preprocess the DICOM files in train_images, and how to separate the data and images in the parquet file?

sinful niche
#

@ anyone

vast linden
pale sonnet
#

I feel like I saw this somewhere but can't remember where. Is there some metadata that tells you how many mm each "slice" represents? When I view them in coronal or sagittal orientation, some images look really squashed and some look really stretched.

quick mortar
pale sonnet
#

Thank you!

vast linden
#

Don't use Slice Thickness to reformat images from plane-to-plane. Use ImagePositionPatient coordinates instead.

#

Slice Thickness does not always represent the distance between slices. In fact, Slice Thickness does not match in most cases.
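
As a rough sketch of the advice above, the spacing between slices can be estimated from the positions themselves rather than from Slice Thickness. This assumes axial slices, so the third component of ImagePositionPatient runs along the slice normal; the function name and the sample coordinates are invented for illustration. For oblique acquisitions you would project the positions onto the normal derived from ImageOrientationPatient instead.

```python
import numpy as np

def slice_spacing(positions) -> float:
    """Estimate slice spacing in mm from a list of ImagePositionPatient (x, y, z) triples."""
    z = np.sort(np.array([p[2] for p in positions], dtype=float))
    # The median gap is robust to an occasional missing or duplicated slice.
    return float(np.median(np.diff(z)))

# Made-up positions for four slices, deliberately out of order.
positions = [(-180.0, -180.0, z) for z in (12.5, 10.0, 15.0, 17.5)]
spacing = slice_spacing(positions)
print(spacing)  # 2.5
```

Dividing this spacing by the in-plane PixelSpacing gives the stretch factor needed to make coronal and sagittal reformats look anatomically proportioned.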

grave cobalt
#

Is it legal in this competition to add some labels by hand to the training data?

late knoll
sinful niche
#

why they have provided segmentation data ?? as it is a classification problem there is no need of segmentations

#

can anyone comment on this

quick mortar
# sinful niche why they have provided segmentation data ?? as it is a classification problem t...

https://www.kaggle.com/competitions/rsna-2023-abdominal-trauma-detection/discussion/428538
"In order to provide more anatomical context to where the injuries will be present, we have provided voxelwise segmentations on a subset of 206 training cases that are enriched for the presence of significant injuries."

pale sonnet
#

Wondering if anyone has any hints on dealing with weak labels? I got a 2.5D RGB representation that I think ought to work okay, but when I tried to train ViT on it, it seemed to just quickly learn to always say "no injury" for all categories. I'm guessing that's because in the training data you have like 90+% "healthy" for each category.

#

What are some strategies for dealing with this? Use image augmentation to synthetically inflate the number of positive examples? Pre-train on some coarser categories like combine "low" and "high" injuries at first to get more positive examples, and then fine tune later with them separated back out?

agile cave
#

@pale sonnet Yes, it's a very common problem when working with an imbalanced dataset. You may try using a different sampling strategy, so that healthy images and those with injuries appear more equally often. In the case of single-target classification, you can use e.g. WeightedRandomSampler from PyTorch - https://pytorch.org/docs/stable/data.html . For multi-target, as in this challenge, I'm not sure how to solve it yet.
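
One possible way to get sampling weights in the multi-target case, offered only as an assumption and not something proposed in the thread, is to collapse the targets into a single "any injury" flag and weight samples inversely to that flag's frequency. The function name and toy labels below are hypothetical:

```python
import numpy as np

def any_injury_weights(labels: np.ndarray) -> np.ndarray:
    """labels: (n_samples, n_targets) binary array where 1 means an injury is present."""
    any_injury = labels.max(axis=1).astype(int)          # 1 if any target is positive
    class_counts = np.bincount(any_injury, minlength=2)  # [n_healthy, n_injured]
    per_class_weight = 1.0 / np.maximum(class_counts, 1)
    return per_class_weight[any_injury]

# Toy labels: 6 fully healthy patients and 2 with some injury.
labels = np.array([[0, 0], [0, 0], [0, 0], [1, 0], [0, 1], [0, 0], [0, 0], [0, 0]])
weights = any_injury_weights(labels)
print(weights)
```

With these weights the healthy group and the injured group each carry the same total weight, so a sampler drawing with replacement (for example PyTorch's WeightedRandomSampler with these weights) would show the two groups about equally often. It does not balance the individual organs against each other.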

smoky star
#

hi, I am new to this competition and I do not know how to work on such large datasets. Should I work on a subset of images or is there some way to process the dicom files into arrays and input the entire dataset in a neural network?

thick escarp
# smoky star hi, I am new to this competition and I do not know how to work on such large dat...
smoky star
thick escarp
smoky star
#

Thanks, I'll check them out.

modest creek
junior totem
#

Is anyone having trouble with the Public API for this competition? In google colab when I run: !kaggle competitions files rsna-2023-abdominal-trauma-detection

#

It seems like a bunch of the train_image patient ids are missing.

junior totem
fringe jolt
#

Certainly, 591. However, this outcome is not surprising.

scenic badge
junior totem
restive crane
#

Hi guys I am new to this challenge and kaggle. I am wondering what is the optimal way to deal with this big dataset. I have few questions:
Is pydicom slow? Is it better to process the whole dataset into other formats? I want to create a 2.5D image for each subject, meaning I have to load the entire volume for coronal and sagittal views.

smoky star
#

I split the data into train and val, then I created a TensorFlow pipeline for inputting the data, but when I run this for training, Kaggle throws an error and tells me to shift to a Google Cloud notebook. This is probably because my batch_size is 64 and Kaggle does not have enough RAM to accommodate that. My question is: are the people who are working intensively on this using Google Cloud?

quick mortar
#

Hi, how do you handle the extremely large dataset? What is the best way to deal with this?

restive crane
#

google cloud's VMs are very confusing. And I tried every location and never got an actual gpu

#

always unavailable

quick mortar
#

Is it possible to write code for this competition without GPU

restive crane
#

I don't think so. That would take days to train one epoch if you're using any NN.

#

I mean you can process the data with a CPU, of course

#

but I don't think you can do the training with a CPU

smoky star
#

Should we use some sort of sampling technique to work on a small amount of data to test the neural network and the results, and based on those results apply the same architecture to the entire dataset?

quick mortar
#

Thanks

fringe jolt
restive crane
#

Hello guys. Have you encountered a bottleneck using online GPUs? I rented an A6000 with the dataset downloaded on persistent storage (I tested it at about 500MB/s). However, my model is significantly slower than running on Kaggle's P100, about 5 times slower.

#

I noticed the model spent most of the time loading data to the GPU, but I don't think there is a bottleneck in disk speed.

#

Never mind, guys. I increased num_workers in the DataLoader and it runs very fast now.

junior totem
grave rain
#

Hi, I am new to Kaggle contests, and I have a question about what we should submit. Do we need to submit the code that processes the data, trains the model, and makes the predictions, and should all of that run in less than 9 hours? Or should just the trained model run in less than 9 hours when making the predictions?

heavy rover
grave rain
kind elk
#

Is anyone making progress? I'm facing a lot of issues with the datasets...

chrome dome
#

Hi, I just started with this competition and looked at the best scored notebook(0.66), can someone please explain what exactly is it doing?

heavy rover
# chrome dome Hi, I just started with this competition and looked at the best scored notebook(...

The notebook creates constant probabilities for each test sample.
The probability for each class is the mean label of that class in the training data.
You can read the notebook, the code is short and simple.
https://www.kaggle.com/code/mirenaborisova/rsna-0-66-lb
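
The constant-probability baseline described above could look roughly like this. It is a sketch of the idea only, not the linked notebook's code; the tiny training table, its column names, and the test ids are invented so the snippet is self-contained.

```python
import pandas as pd

# Invented stand-in for train.csv: one id column plus one-hot label columns.
train = pd.DataFrame(
    {
        "patient_id": [1, 2, 3, 4],
        "bowel_healthy": [1, 1, 1, 0],
        "bowel_injury": [0, 0, 0, 1],
    }
)
label_cols = [c for c in train.columns if c != "patient_id"]

# Mean of each one-hot label column = empirical frequency of that class.
mean_probs = train[label_cols].mean()

# Predict the same probabilities for every test patient.
test_ids = [101, 102]
submission = pd.DataFrame({"patient_id": test_ids})
for col in label_cols:
    submission[col] = mean_probs[col]

print(submission)
```

A baseline like this uses no image data at all, which makes it a handy sanity check: any real model should beat it.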

chrome dome
restive crane
#

Guys, is it true that for detecting extravasation you need to have two phases of scans?

#

If true, does the dataset always provide two scans for potential extravasation subjects?

hard flint
#

Will resizing the data from 512×512 to 256×256 have any impact on model performance?

#

Am thinking of cropping instead. Any suggestions?

sterile heart
heavy rover
heavy rover
chrome dome
urban lynx
#

Hello guys, I have a problem with submitting my predictions. It always fails and gives me the following error: "Submission Scoring Error
Your notebook generated a submission file with incorrect format. Some examples causing this are: wrong number of rows or columns, empty values, an incorrect data type for a value, or invalid submission values from what is expected. See more debugging tips". I inspected the notebook and compared it to the sample_submission.csv and there is no difference in the shape/format.
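
A few of the causes listed in that error message can be checked locally before submitting. The helper below is a hypothetical sketch, not a Kaggle API; it compares a submission DataFrame against the sample submission loaded in the same run, and the toy frames and column names are made up.

```python
import numpy as np
import pandas as pd

def check_submission(submission: pd.DataFrame, sample: pd.DataFrame) -> list:
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    if list(submission.columns) != list(sample.columns):
        problems.append("columns differ from sample_submission (names or order)")
    if len(submission) != len(sample):
        problems.append("row count differs from sample_submission")
    if submission.isna().any().any():
        problems.append("empty (NaN) values present")
    numeric_cols = set(submission.select_dtypes(include="number").columns)
    non_numeric = [c for c in submission.columns if c not in numeric_cols]
    if non_numeric:
        problems.append(f"non-numeric columns: {non_numeric}")
    return problems

sample = pd.DataFrame({"patient_id": [1, 2], "bowel_healthy": [0.5, 0.5], "bowel_injury": [0.5, 0.5]})

good = sample.copy()
bad = sample.copy()
bad.loc[0, "bowel_injury"] = np.nan   # an empty value
bad["index"] = range(len(bad))        # a stray extra column, e.g. a saved index

print(check_submission(good, sample))  # []
print(check_submission(bad, sample))
```

Passing these checks on the public sample does not guarantee success on the hidden test set, where the number of patients differs, so the comparison should always be made against the sample file read at run time.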

junior totem
urban lynx
#

ty very much for your time gonna look into that soon

kind elk
#

All the files are in .nii or .dcm format, which causes a major problem... PyTorch simply can't read them.
How can this be fixed?

slim igloo
tawdry ruin
#

Hey, I need a quick summary of how to submit. I've seen several notebooks, been following discussions, but the available notebooks load the submission file and modify it using the weights of the sample results. When I create my DataFrame based on predictions from the model and save it exactly as the submission looks, I get an error during scoring.

grave rain
#

Hi guys, this is my first Kaggle competition. I have a quick and easy question. Do I need to fill the submission.csv file with the results of my model running in training images? Or should I use another data source to evaluate and send the submission.csv file? Should I use the data that is in testing folder?

hot cedar
forest olive
#

Hello everyone. According to train_series_meta.csv, there are incomplete organs in the training set. Does anybody know whether the test set also has incomplete organs?

lunar radish
heavy rover
lunar radish
#

Thank you

grave rain
#

Hi, I am new in kaggle. I submitted my submission.csv file and the scoring process is taking more than 3 hours. Is it ok? for just 3 test data images? I read the log and it takes some minutes to run, but does the scoring step take hours?

slim igloo
wary sequoia
grave rain
#

Hey, I am blocked when generating the csv file. I use the to_csv function of the pandas DataFrame, and I removed the index before saving the file. I am also using the sample_submission.csv and adding all the rows at the end. But I am still getting the Submission Scoring Error. I checked the file encoding and don't know what else to do. I tried this approach https://www.kaggle.com/competitions/icr-identify-age-related-conditions/discussion/412946 and nothing changed. Have you experienced the same issue? Thanks for your help!

spice spruce
#

Hello, would be happy to hear your thoughts, thank you in advance!
There's a successful participant who recommends using a 3D CNN. Do you think it's possible to train those models using only the machine Kaggle offers for free?
I'm talking about this person and their model:
https://www.kaggle.com/code/hengck23/lb0-55-2-5d-3d-sample-model

sinful niche
#

from kaggle_helper import *
from kaggle_metric import *
while importing it is showing no kaggle_helper module

#

can anyone help

sterile heart
# sinful niche from kaggle_helper import * from kaggle_metric import * while importing it is sh...

If you ran into this while executing the notebook @spice spruce shared, I believe kaggle_helper.py and kaggle_metric.py are separate files created by the author of the notebook.
You can check out this notebook to see how that can be done:
https://www.kaggle.com/code/rtatman/import-functions-from-kaggle-script
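
In general terms, importing from a separate .py file works once the folder containing it is on Python's module search path. The sketch below demonstrates that with a throwaway stand-in file. The temporary directory and the hello() function are invented for illustration; on Kaggle the folder would instead be wherever the author's kaggle_helper.py has been attached to your notebook.

```python
import os
import sys
import tempfile

# Stand-in for the folder that holds the notebook author's helper file.
helper_dir = tempfile.mkdtemp()
with open(os.path.join(helper_dir, "kaggle_helper.py"), "w") as f:
    f.write("def hello():\n    return 'helper loaded'\n")

# Make the folder importable, then import the module by its file name.
sys.path.append(helper_dir)
import kaggle_helper

print(kaggle_helper.hello())  # helper loaded
```

If the file is not attached to the notebook at all, no path change will help; "No module named 'kaggle_helper'" then simply means the file is missing from the session.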

sinful niche
#

How do we do that for the helper and metric files? I'm not getting it.

sterile heart
worldly finch
#

Help me, please. My submission failed from a notebook with a CNN in PyTorch. Do I need to process the DataFrame in some way before submission?

sinful niche
#

from slice_model import Net as SliceNet
Any idea about this model?

#

Where do we find it?

worldly finch
#

How do I fix this: "dicomsdl-0.109.1-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl is not a supported wheel on this platform"?

sinful niche
worldly finch
#

How do I fix this error: "from kaggle_helper import *" gives "No module named 'kaggle_helper'"?

sinful niche
#

Same issue for me. Can anyone help?

tall zephyr
#

@everyone
Now
Webinar, TRAUMA Abdominal Detection Competitions

Please connect:

https://us02web.zoom.us/j/82192733489?pwd=VThkMTZRZkx0aUREdU1nRlZxMElXZz09

versed fable
#

Hi everyone! I'm stuck on that scoring error issue. I made sure that every label is distributed according to its type, i.e. the binary ones sum to 1 and the three-class ones sum to 1. I have no idea what I'm doing wrong. Can you please take a look at my notebook and advise me on what I should do? Thank you. https://www.kaggle.com/code/noelyoda/fork-of-dr-oi-model-w-h5/settings?scriptVersionId=145408457

kindred bobcat
#

Has anyone tried running the Keras Infer Kernel that was pinned in the competition? https://www.kaggle.com/code/aritrag/kerascv-starter-notebook-infer I have not had any success running this and keep getting the attached error message. This is my first competition and I'm really struggling with these technical details 😕 If anyone has thoughts to share, I'd greatly appreciate it!

sterile heart