storm moon Jun 2, 2024, 5:19 PM

#

while using ngrok on kaggle

#

so i need to upload my model file into the model_dir folder in the w-okada files

#

as u see theres 3-2-1-0

sly inlet Jun 2, 2024, 5:24 PM

#

storm moon as u see theres 3-2-1-0

I'm not on my laptop now, but can you try uploading the model on GitHub and then cloning repo in kaggle notebook

#

Or can you try downloading from gdrive, using the `wget` or `curl` command

storm moon Jun 2, 2024, 5:25 PM

#

sly inlet Or can you try downloading from gdrive, using the `wget` or `curl` command

well it would be nice

#

but i dont know how to use

#

wget and curl commands
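
For readers who are also unsure about `wget` and `curl`, the same download can be done from a notebook cell in Python. This is only a minimal sketch, assuming the model is reachable at a direct-download URL. The `download_file` helper, the example URL, and the destination path are hypothetical, and Google Drive share links usually need extra handling that is not shown here.

```python
import shutil
import urllib.request
from pathlib import Path


def download_file(url: str, dest: str) -> Path:
    """Download `url` to `dest`, creating parent folders if needed."""
    dest_path = Path(dest)
    dest_path.parent.mkdir(parents=True, exist_ok=True)
    # Stream the response to disk so a large .pth file is not held in memory.
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest_path


# Hypothetical usage (placeholder URL and path, not taken from the chat):
# download_file("https://example.com/my_model.pth", "/kaggle/working/downloads/my_model.pth")
```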

#

#

yea i cant upload on app

#

i got some error

#

allifeelispain

#

4 hour wasted for nothing

#

trying to fix these things :D

#

nah bro kaggle is idk

#

well i think ill pay collab

sly inlet Jun 2, 2024, 5:31 PM

#

storm moon well i think ill pay collab

Haha , yes that will be a better option

#

More storage as well

storm moon Jun 2, 2024, 5:32 PM

#

sly inlet More storage as well

no need storage tbh

#

i wanna just use voice changer via ngrok

#

i need gpu for that

#

im using paperspace for training models

#

but paperspace not support ngrok

sly inlet Jun 2, 2024, 5:32 PM

#

storm moon i need gpu for that

So you just want to host your model then ?

storm moon Jun 2, 2024, 5:32 PM

#

if so, it would be perfect

storm moon Jun 2, 2024, 5:33 PM

#

sly inlet So you just want to host your model then ?

i have a voice model i trained for w-okada voice changer. i need a gpu notebook that i can use w-okada through ngrok to use it.

sly inlet Jun 2, 2024, 5:34 PM

#

storm moon i have a voice model i trained for w-okada voice changer. i need a gpu notebook ...

You want it run 24x7

storm moon Jun 2, 2024, 5:34 PM

#

in order to use this model, I need to load this model into the w-okada voice changer files that I have installed on the notebook. you can load it directly through w-okada, but when using the voice changer through ngrok, it is difficult to do this, it gives a warning. so I need to load my model directly into the w-okada files in the notebook, so I have to move my model with the pth extension into that file

sly inlet Jun 2, 2024, 5:34 PM

#

You can check huggingface service as well

storm moon Jun 2, 2024, 5:34 PM

#

sly inlet You want it run 24x7

not that much

#

8-9 hours per day maybe

#

max

storm moon Jun 2, 2024, 5:35 PM

#

sly inlet You can check huggingface service as well

didnt try it before

#

when i try, same things happen i guess

#

cuz i dont know how to do on huggingface

#

like in the kaggle

#

whats this even i dont get it

#

on colab i didnt get these kind of errors

#

but collab getting so much money so

sly inlet Jun 2, 2024, 5:36 PM

#

storm moon on collab didnt i get kind of errors

Yes , its more user friendly

sly inlet Jun 2, 2024, 5:37 PM

#

storm moon but collab getting so much money so

Let me check huggingface services if they support what you want

sly inlet Jun 2, 2024, 5:37 PM

#

storm moon but collab getting so much money so

Kaggle? How much do they cost

storm moon Jun 2, 2024, 5:37 PM

#

sly inlet Kaggle? How much do they cost

30 hours per week is free

#

so

#

idk they have paid service

#

30 hours per week is 5000x better than collab

#

cuz collab gaves

#

1hours free per account ::D

sly inlet Jun 2, 2024, 5:39 PM

#

storm moon 1hours free per account ::D

No I believe they give 3 hours daily

#

But it can disconnect anytime,

storm moon Jun 2, 2024, 5:40 PM

#

sly inlet No I believe they give 3 hours daily

oh i forgot then

#

https://media.discordapp.net/attachments/1129507816697241822/1246871455715819620/image.png?ex=665df708&is=665ca588&hm=6166f8c4e92fa39e6b8ea0d71b445f5fcdaed43348cde86fc53e49d77fa28cd2&=&format=webp&quality=lossless

#

i have pth file

storm moon Jun 2, 2024, 5:41 PM

#

storm moon https://media.discordapp.net/attachments/1129507816697241822/1246871455715819620...

i just need to move this in model_dir file here

sly inlet Jun 2, 2024, 5:41 PM

#

storm moon i just need to move this in model_dir file here

I will check, how we can do it,

#

On my way home

sly inlet Jun 2, 2024, 5:42 PM

#

storm moon i just need to move this in model_dir file here

You can upload your model on huggingface

sly inlet Jun 2, 2024, 6:07 PM

#

@storm moon did you try this

Screenshot_2024-06-02_at_11.37.23_PM.png

storm moon Jun 2, 2024, 6:08 PM

#

storm moon

as u see

#

@sly inlet

storm moon Jun 2, 2024, 6:08 PM

#

sly inlet You can upload your model on huggingface

yea but after

sly inlet Jun 2, 2024, 6:09 PM

#

storm moon yea but after

you can download model from there

sly inlet Jun 2, 2024, 6:09 PM

#

storm moon

you have it on kaggle notebook ? or locally ?

sly inlet Jun 2, 2024, 6:10 PM

#

storm moon i just need to move this in model_dir file here

ok , so you need to move your pth file from one folder to another , right ?

storm moon Jun 2, 2024, 6:14 PM

#

sly inlet ok , so you need to move your pth file from one folder to another , right ?

YEA

#

but

storm moon Jun 2, 2024, 6:14 PM

#

storm moon

this one

#

input

storm moon Jun 2, 2024, 6:14 PM

#

storm moon https://media.discordapp.net/attachments/1129507816697241822/1246871455715819620...

this one output

sly inlet Jun 2, 2024, 6:15 PM

#

move from input to output ?

storm moon Jun 2, 2024, 6:15 PM

#

sly inlet move from input to output ?

how

#

i dont know how to -*-

sly inlet Jun 2, 2024, 6:16 PM

#

 
!mv /kaggle/working/source_folder/model.pth /kaggle/working/destination_folder/

storm moon Jun 2, 2024, 6:16 PM

#

lets try

sly inlet Jun 2, 2024, 6:16 PM

#

change the model input path and the output folder path there

#

the first path should be the model's pth file and the second path should be the folder you want to move the pth file to
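
The same move can be written in Python instead of the `!mv` shell line. This is a minimal sketch, not the notebook's actual layout: the `move_model` helper is hypothetical, and the real source file and `model_dir` paths have to be filled in by hand.

```python
import shutil
from pathlib import Path


def move_model(model_file: str, target_dir: str) -> Path:
    """Move a model file into `target_dir` and return its new path."""
    src = Path(model_file)
    if not src.is_file():
        raise FileNotFoundError(f"model file not found: {src}")
    dst_dir = Path(target_dir)
    # Create the destination folder first so the move cannot fail on a missing directory.
    dst_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.move(str(src), str(dst_dir / src.name)))
```

Calling it with the pth file as the first argument and the target folder as the second mirrors the two paths of the `!mv` command.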

storm moon Jun 2, 2024, 6:21 PM

#

sly inlet first path should be of models pth file and second. path should be of folder you...

is it possible to create a file

#

at output

sly inlet Jun 2, 2024, 6:21 PM

#

like what file ?

#

folder

#

or you mean copy model weights from one folder to another

storm moon Jun 2, 2024, 6:37 PM

#

sly inlet folder

ah yea

#

folder

#

is it possible to create folder

#

in model_dir

sly inlet Jun 2, 2024, 6:37 PM

#

yes

storm moon Jun 2, 2024, 6:37 PM

#

folder

#

okey can u give me code line creating folder in model_dir

#

but

#

i closed already

sly inlet Jun 2, 2024, 6:38 PM

#

!mkdir -p destination_folder/new_folder_name

sly inlet Jun 2, 2024, 6:38 PM

#

storm moon i closed already

closed what

storm moon Jun 2, 2024, 6:38 PM

#

kaggle

sly inlet Jun 2, 2024, 6:38 PM

#

then you will lose model , won't you ?

#

or did you download model locally or saved it on kaggle

storm moon Jun 2, 2024, 7:33 PM

#

sly inlet or did you download model locally or saved it on kaggle

model is already exist in my pc

#

i trained 1 month ago

sly inlet Jun 2, 2024, 7:33 PM

#

nice

#

you can upload it on kaggle as well

storm moon Jun 2, 2024, 7:33 PM

#

tbh i dont want cuz i dont want to public these my models

sly inlet Jun 2, 2024, 7:34 PM

#

as private model , so only you can access it

storm moon Jun 2, 2024, 7:34 PM

#

sly inlet as private model , so only you can access it

It doesn't mean much cuz i dont need -*-

sly inlet Jun 2, 2024, 7:35 PM

#

storm moon It doesn't mean much cuz i dont need -*-

it will just ease out your process , but if you don't want then its ok : )

storm moon Jun 2, 2024, 7:35 PM

#

storm moon

i already uploaded private

#

as u see

#

anyway its not working w-okada on kaggle

#

idk why

#

laggy or something

#

i think ill continue on collab

sly inlet Jun 2, 2024, 7:36 PM

#

storm moon anyway its not working w-okada on kaggle

what is the problem ?

sly inlet Jun 2, 2024, 7:36 PM

#

storm moon as u see

ohh, sorry i didn't know

storm moon Jun 2, 2024, 7:36 PM

#

I'm done dealing with this :D but thanks for asking

#

it been 6-7 hours

#

still trying to do on kaggle

#

waste of time

sly inlet Jun 2, 2024, 7:37 PM

#

just tell me the problem so i will try to see in my free time

#

don't need to waste your time

storm moon Jun 2, 2024, 7:37 PM

#

https://www.kaggle.com/code/hinabl/public-w-okada-voice-changer

Public W-okada Voice Changer .

Explore and run machine learning code with Kaggle Notebooks | Using data from No attached data sources

#

just run this

#

and use w-okada program

#

normally

#

if u do this will be good

sly inlet Jun 2, 2024, 7:37 PM

#

ok

storm moon Jun 2, 2024, 7:38 PM

#

someones can do

#

but i cant so

#

ill keep to use colalb

#

collab*

sly inlet Jun 2, 2024, 7:38 PM

#

storm moon but i cant so

if someone can then you should also be able to do it

sly inlet Jun 2, 2024, 7:39 PM

#

storm moon https://www.kaggle.com/code/hinabl/public-w-okada-voice-changer

but i ran this notebook before

#

when you first asked help

#

and it did work for me, as in the screenshot i sent you above

#

storm moon Jun 2, 2024, 7:50 PM

#

its worked for me but i cant upload a mode

#

model and i cant choose mode

#

model

sly inlet Jun 2, 2024, 7:52 PM

#

like how

#

that notebook doesn't have any code related to the model

sly inlet Jun 2, 2024, 7:54 PM

#

storm moon model and i cant choose mode

you mean, you have to specify the model path here? which one of the below is used to specify the model path?

--content_vec_500 pretrain/checkpoint_best_legacy_500.pt \
  --content_vec_500_onnx pretrain/content_vec_500.onnx \
  --content_vec_500_onnx_on true \
  --hubert_base pretrain/hubert_base.pt \
  --hubert_base_jp pretrain/rinna_hubert_base_jp.pt \
  --hubert_soft pretrain/hubert/hubert-soft-0d54a1f4.pt \
  --nsf_hifigan pretrain/nsf_hifigan/model \
  --crepe_onnx_full pretrain/crepe_onnx_full.onnx \
  --crepe_onnx_tiny pretrain/crepe_onnx_tiny.onnx \
  --rmvpe pretrain/rmvpe.pt \

sly inlet Jun 2, 2024, 7:55 PM

#

storm moon i already uploaded private

you said you had model on kaggle here ,

#

then what is the problem ? could you be more clear

storm moon Jun 2, 2024, 7:58 PM

#

sly inlet

click this blue text

#

open w okadar

#

okada

#

up there ull see edit button

#

click there

#

also theres edit button too choose blank one edit button

#

choose ur pth file

#

upload if u can

sly inlet Jun 2, 2024, 8:00 PM

#

ohhk , i never open that link , 😅

#

stupid of me

#

let me re run the notebook

sly inlet Jun 2, 2024, 8:21 PM

#

storm moon whats this even i dont get it

ok , so i'm getting this error as well

#

you won't get this error on google colab then ?

storm moon Jun 2, 2024, 8:31 PM

#

sly inlet you won't get this error on google colab then ?

i can directly upload my model into the model_dir folder

#

so

#

cuz these files are in google drive, i can access them easily

pulsar merlin Jun 3, 2024, 1:01 AM

#

Hi, I am trying to pass speech signals to my model by extracting features using Non negative matrix factorization. but unable to find the correct results. Can anyone guide me?

sly inlet Jun 3, 2024, 12:42 PM

#

storm moon i can direct upload my model to in model_dir folder

show me your colab file structure , i will try to do the same in kaggle.

#

it should be done in both

storm moon Jun 3, 2024, 12:42 PM

#

sly inlet show me your colab file structure , i will try to do the same in kaggle.

https://colab.research.google.com/github/hinabl/voice-changer-colab/blob/master/Hina_Modified_Realtime_Voice_Changer_on_Colab.ipynb#scrollTo=86wTFmqsNMnD

Google Colab

sly inlet Jun 3, 2024, 12:43 PM

#

i don't want notebook

#

i want to know where and how did you locate model in that directory?

#

for the pth model,
`!mv source_path dest_path` should be able to move the model to a specific directory

obsidian bone Jun 4, 2024, 12:29 PM

#

can anyone tell me how to modify LLM layers (hugging face text generation models) and how to add custom heads to them?

misty roost Jun 4, 2024, 1:31 PM

#

Hey guys. I want to rank up on Kaggle. In order to become "Kaggle Expert", do I have to become "expert" in every category (competitions, datasets, notebooks, discussions)? Or just becoming "expert" in one of those categories is enough?

sly inlet Jun 4, 2024, 1:43 PM

#

obsidian bone can anyone tell me how to modify LLM layers (hugging face text generation models...

what do you want to do ?

#

you can create new layer and pass the input from base model to it ,

#

that's how LM head is added to base model , for generation purpose

obsidian bone Jun 4, 2024, 1:46 PM

#

sly inlet you can create new layer and pass the input from base model to it ,

so far this is what I've done

I get the model using AutoModel.from_pretrained
then I passed in tokenized text
I get the outputs of the model using .last_hidden_state
but the last hidden state has a shape of [batch size, Sequence length, Embedding size]
which is not constant... it changes from one sentence to another
I saw some people get the output using out_ids.last_hidden_state[:, 0, :] but that only takes the embedding of the first token...

#

I want to take outputs from the LLM model and feed it into a custom pytorch model

#

but I'm having trouble dealing with the last hidden state... i don't know how to work with it....

#

so thought I might modify it or change it...

sly inlet Jun 4, 2024, 1:54 PM

#

ok so which model are you using

sly inlet Jun 4, 2024, 1:55 PM

#

obsidian bone so far this is what I've done 1) I get the model using AutoModel.from_pretrained...

yes model output will change according to your tokens

obsidian bone Jun 4, 2024, 1:56 PM

#

sly inlet yes model output will change according to your tokens

so if I want to pass it to a fully connected network

#

am I supposed to change input dimension of fully connected network every time i pass a new sentence?

#

that's not logical, it's like instantiating a new model every sentence passed

#

example:

first sentence passed -> last_hidden_state.shape = [1, 59, 3072]
which means first layer of fc_model= torch.nn.Linear(59*3072, 256)

second sentence passed -> last_hidden_state.shape = [1, 92, 3072]
which means first layer of fc_model= torch.nn.Linear(92*3072, 256)

you get what I'm saying?

sly inlet Jun 4, 2024, 2:23 PM

#

obsidian bone am I supposed to change input dimension of fully connected network every time i ...

you can add paddings to get constant length outputs

#

that's what everyone does when training with batches
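
To make the padding idea concrete, here is a minimal pure-Python sketch of what padding a batch to a constant length looks like. The `pad_batch` helper and the toy token ids are made up for illustration. In practice a Hugging Face tokenizer called with `padding="max_length"` and `max_length=...` produces the padded ids and the attention mask for you.

```python
def pad_batch(sequences, pad_id=0, max_length=None):
    """Right-pad token-id lists to one length and build matching attention masks."""
    if max_length is None:
        max_length = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        seq = seq[:max_length]  # truncate anything longer than max_length
        n_pad = max_length - len(seq)
        input_ids.append(seq + [pad_id] * n_pad)
        # 1 marks a real token, 0 marks padding that the model should ignore.
        attention_mask.append([1] * len(seq) + [0] * n_pad)
    return input_ids, attention_mask


ids, mask = pad_batch([[5, 6, 7], [8, 9]], pad_id=0, max_length=4)
print(ids)   # [[5, 6, 7, 0], [8, 9, 0, 0]]
print(mask)  # [[1, 1, 1, 0], [1, 1, 0, 0]]
```

Every sentence now has the same sequence length, so the model output has a fixed shape per batch.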

obsidian bone Jun 4, 2024, 2:25 PM

#

ok for example padding=128 and it will be last_hidden_state.shape = [1, 128, 3072]
input to torch.nn.Linear(128*3072, 256)

#

is that how people create custom heads for llms?

#

I found this,

class BertForSequenceClassification(BertPreTrainedModel):
    def __init__(self, config):
        super().__init__(config)
        self.num_labels = config.num_labels
        self.config = config

        self.bert = BertModel(config)
        classifier_dropout = (
            config.classifier_dropout if config.classifier_dropout is not None else config.hidden_dropout_prob
        )
        self.dropout = nn.Dropout(classifier_dropout)
        self.classifier = nn.Linear(config.hidden_size, config.num_labels)

        self.init_weights()

as far as I know bert hidden size is 768, so in self.classifier it will be nn.Linear(768, num_labels=2 for example)

#

the output will then be [Batch size, sequence, num_labels]

sly inlet Jun 4, 2024, 2:29 PM

#

that's why i'm asking you for model

#

you want to train bert for classifier

obsidian bone Jun 4, 2024, 2:29 PM

#

no....

#

this is just example

#

what I mean is

sly inlet Jun 4, 2024, 2:30 PM

#

tell me what exactly are you trying to do

obsidian bone Jun 4, 2024, 2:30 PM

#

if padding size is constant

sly inlet Jun 4, 2024, 2:31 PM

#

https://github.com/huggingface/transformers/blob/821b772ab915e53870aabba6cb765e883501a1e6/src/transformers/models/llama/modeling_llama.py#L1181

GitHub

transformers/src/transformers/models/llama/modeling_llama.py at 821...

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX. - huggingface/transformers

#

lm head for llama

sly inlet Jun 4, 2024, 2:31 PM

#

obsidian bone if padding size is constant

yes

obsidian bone Jun 4, 2024, 2:32 PM

#

for example say padding_size=128,
then the last_hidden_state.shape = [1, 128, 3072]
this is 3d tensor
torch.nn.Linear accepts [batch size, in_dim]

#

if i do 128*3072, that's a huge number

#

I just want to know how people connect the head efficiently
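
To put numbers on the "huge number" concern, the arithmetic below compares a head that flattens the whole sequence with a head that only sees one vector of size `hidden_size` per sentence. The shapes (128, 3072, 256) are the ones quoted in the chat. Treating the second option as "one pooled vector per sentence" is an assumption made for the comparison.

```python
seq_len, hidden_size, out_dim = 128, 3072, 256

# Flattening the sequence: one Linear over seq_len * hidden_size inputs (weights + bias).
flat_params = (seq_len * hidden_size) * out_dim + out_dim
# One vector per sentence: one Linear over hidden_size inputs (weights + bias).
pooled_params = hidden_size * out_dim + out_dim

print(f"{flat_params:,}")    # 100,663,552
print(f"{pooled_params:,}")  # 786,688
```

The flattened head needs roughly 128 times more parameters and also ties the head to one fixed sequence length.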

obsidian bone Jun 4, 2024, 2:33 PM

#

obsidian bone is that how people create custom heads for llms?

in this case for bert, they just passed hidden_size without multiplying it with sequence length...

#

so where did sequence length go...

sly inlet Jun 4, 2024, 2:34 PM

#

obsidian bone in this case for bert, they just passed hidden_size without multiplying it with ...

for bert in case of classification we use first token [:,0,:]

obsidian bone Jun 4, 2024, 2:35 PM

#

sly inlet for bert in case of classification we use first token `[:,0,:]`

yes ok good

#

now the question is

#

[:, 0, :] only takes the embedding of the first token

#

what about the rest of tokens

sly inlet Jun 4, 2024, 2:35 PM

#

as CLS tokens contains the whole information of sentence

obsidian bone Jun 4, 2024, 2:35 PM

#

their information won't be passed

obsidian bone Jun 4, 2024, 2:35 PM

#

sly inlet as `CLS` tokens contains the whole information of sentence

wait what?

#

wdym contains the whole information of the sentence?

sly inlet Jun 4, 2024, 2:36 PM

#

obsidian bone their information won't be passed

we use whole embedding if we want to do token classification

sly inlet Jun 4, 2024, 2:36 PM

#

obsidian bone wdym contains the whole information of the sentence?

thats what standard method is

#

it is what it is , CLS tokens has all the info we need for classification

obsidian bone Jun 4, 2024, 2:36 PM

#

so CLS token having the whole sentence's information is only applicable for bert? or is it applicable for all LLMs?

sly inlet Jun 4, 2024, 2:36 PM

#

obsidian bone so CLS token having the whole sentence's information is only applicable for bert...

no

obsidian bone Jun 4, 2024, 2:37 PM

#

I am using Gemma

#

does it have CLS token?

sly inlet Jun 4, 2024, 2:37 PM

#

are you running it locally or on collab

obsidian bone Jun 4, 2024, 2:37 PM

#

sly inlet are you running it locally or on collab

kaggle notebook

sly inlet Jun 4, 2024, 2:37 PM

#

obsidian bone does it have CLS token?

its different architecture

sly inlet Jun 4, 2024, 2:37 PM

#

obsidian bone kaggle notebook

wait a little i will show you the flow

obsidian bone Jun 4, 2024, 2:38 PM

#

wait lemme share my code

#


tokenized_problem = tokenizer(sentence, return_tensors='pt', padding="max_length", max_length=128)

out_ids = model(**tokenized_problem)

class Decision_Model(torch.nn.Module):
    def __init__(self, in_dim, out_dim):
        super(Decision_Model, self).__init__()

        self.fc = torch.nn.Sequential(
            torch.nn.Linear(in_dim, out_dim, dtype=torch.bfloat16),
            torch.nn.Softmax(dim=1)
        )

    def forward(self, x):
        return self.fc(x)

# what dimension to put here for in_dim?
basic_des = Decision_Model(in_dim, n_labels)

# Problem is here, how do I pass out_ids.last_hidden_state with shape of [1, 128, 3072] to basic_des??

outputs = basic_des(out_ids.last_hidden_state)  # ???

sly inlet Jun 4, 2024, 2:46 PM

#

obsidian bone ``` tokenized_sentence = tokenizer(sentence, return_tensors='pt', padding="max_...

so you are using base gemma model not GemmaForCausalLM

#

what dimension output you get from gemma model

obsidian bone Jun 4, 2024, 2:47 PM

#

sly inlet so you are using base gemma model not GemmaForCausalLM

Yes base model only

sly inlet Jun 4, 2024, 2:47 PM

#

and what do you want to do

obsidian bone Jun 4, 2024, 2:47 PM

#

sly inlet and what do you want to do

I showed you in the above code

obsidian bone Jun 4, 2024, 2:47 PM

#

obsidian bone ``` tokenized_sentence = tokenizer(sentence, return_tensors='pt', padding="max_...

this

#

just want to pass out_ids.last_hidden_state to torch.nn.Linear

sly inlet Jun 4, 2024, 2:48 PM

#

i mean end goal ,

obsidian bone Jun 4, 2024, 2:48 PM

#

that's all, that's the whole problem

sly inlet Jun 4, 2024, 2:48 PM

#

what will you achieve after doing it

obsidian bone Jun 4, 2024, 2:48 PM

#

classification

#

sentence

sly inlet Jun 4, 2024, 2:48 PM

#

obsidian bone that's all, that's the whole problem

ok i will reach out to you soon , i will head home now

obsidian bone Jun 4, 2024, 2:48 PM

#

ok

#

thanks for your time tho

sly inlet Jun 4, 2024, 2:49 PM

#

but they already have GemmaForSequenceClassification

#

https://github.com/huggingface/transformers/blob/821b772ab915e53870aabba6cb765e883501a1e6/src/transformers/models/gemma/modeling_gemma.py#L1257

GitHub

transformers/src/transformers/models/gemma/modeling_gemma.py at 821...


obsidian bone Jun 4, 2024, 3:00 PM

#

sly inlet https://github.com/huggingface/transformers/blob/821b772ab915e53870aabba6cb765e8...

[inside] def __init__(self, config):
    self.num_labels = config.num_labels
    self.model = GemmaModel(config)
    self.score = nn.Linear(config.hidden_size, self.num_labels, bias=False)

[inside] def forward(self,...):
    transformer_outputs = self.model(
            input_ids,
            attention_mask=attention_mask,
            position_ids=position_ids,
            past_key_values=past_key_values,
            inputs_embeds=inputs_embeds,
            use_cache=use_cache,
            output_attentions=output_attentions,
            output_hidden_states=output_hidden_states,
            return_dict=return_dict,
        )
    hidden_states = transformer_outputs[0]
    logits = self.score(hidden_states)

their passing of hidden_states to self.score is strange...

#

I will look more into this

sly inlet Jun 4, 2024, 3:02 PM

#

obsidian bone ``` [inside] def __init__(self, config): self.num_labels = config.num_labels...

No it's the first element of output so its last states

obsidian bone Jun 4, 2024, 3:02 PM

#

i know that

#

hidden_states shape is still [batch_size, token_length, hidden_size]

#

wait we can pass 3d tensor to torch.nn.Linear?

sly inlet Jun 4, 2024, 3:03 PM

#

Yes

#

The only thing that matters is last dim should match to input dim

#

If you want to know more try adding print statements in transformers lib code
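
The point about the last dimension can be checked directly. Below is a minimal sketch using random data in place of real model outputs. The shapes 1, 128 and 3072 come from the chat, while the 4 labels are an assumed value.

```python
import torch

# Shapes from the chat: batch of 1, 128 tokens, hidden size 3072; 4 labels is an assumed value.
batch_size, seq_len, hidden_size, n_labels = 1, 128, 3072, 4

hidden_states = torch.randn(batch_size, seq_len, hidden_size)
head = torch.nn.Linear(hidden_size, n_labels)

# nn.Linear acts on the last dimension only, so every token gets its own row of logits.
per_token_logits = head(hidden_states)
print(per_token_logits.shape)  # torch.Size([1, 128, 4])
```

So the head is built with `in_features=hidden_size`, and the sequence dimension is reduced separately (for example by picking one token) when one prediction per sentence is wanted.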

tiny anvil Jun 4, 2024, 3:59 PM

#

Hey guys I was looking for some help with this problem. I was hoping to figure out a different solution to the one posted (which calls the function made in the previous problem). The issue I think I am having is the elif in the second for loop at the bottom which checks if the key is not in the match list.

sly inlet Jun 4, 2024, 4:26 PM

#

tiny anvil Hey guys I was looking for some help with this problem. I was hoping to figure ...

you are reassigning values every time

tiny anvil Jun 4, 2024, 4:27 PM

#

Right, I was wondering how I can get the elif statement to only activate if the key isnt in the match list

#

If I remove the elif statement, it works, but the issue is if there is a key thats in keywords that isnt in the match list it just ignores it instead of adding an empty value.

sly inlet Jun 4, 2024, 4:30 PM

#

tiny anvil Right, I was wondering how I can get the elif statement to only activate if the ...

then initialize x with `{key:[]}` and just append the `index`, or `i` in this case
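
The difference between reassigning and appending can be seen on a toy example. The `matches` list and the extra `"car"` keyword below are made-up illustration data, not part of the exercise.

```python
keywords = ["casino", "they", "car"]
matches = [(0, "casino"), (1, "they"), (1, "casino")]  # (document index, keyword found)

# Assigning a fresh list each time keeps only the most recent index.
overwritten = {}
for i, key in matches:
    overwritten[key] = [i]
print(overwritten)  # {'casino': [1], 'they': [1]}

# Pre-filling every keyword with [] and appending keeps all indices,
# and keywords that never match still appear with an empty list.
collected = {key: [] for key in keywords}
for i, key in matches:
    collected[key].append(i)
print(collected)  # {'casino': [0, 1], 'they': [1], 'car': []}
```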

tiny anvil Jun 4, 2024, 4:42 PM

#

sly inlet then initialize x with with `{key:[]}` and just append the `index` or `i` in thi...

Oh gotcha, ill go back and try that in a sec, but the key is a local variable in the second for loop, so I dont think I can initialize it with that right?

sly inlet Jun 4, 2024, 4:42 PM

#

tiny anvil Oh gotcha, ill go back and try that in a sec, but the key is a local variable in...

send me that code i will refactor it

tiny anvil Jun 4, 2024, 4:44 PM

#

sly inlet send me that code i will refactor it

x={}
    for i, doc in enumerate(doc_list):
        infos = doc.split()
        match = [info.rstrip('.,').lower() for info in infos]
            
        for key in keywords:
            if key.lower() in match:
                x[key] = [i]
    
    return x

sly inlet Jun 4, 2024, 4:45 PM

#

could you send the whole function

tiny anvil Jun 4, 2024, 4:45 PM

#

def multi_word_search(doc_list, keywords):
    """
    Takes list of documents (each document is a string) and a list of keywords.  
    Returns a dictionary where each key is a keyword, and the value is a list of indices
    (from doc_list) of the documents containing that keyword

    >>> doc_list = ["The Learn Python Challenge Casino.", "They bought a car and a casino", "Casinoville"]
    >>> keywords = ['casino', 'they']
    >>> multi_word_search(doc_list, keywords)
    {'casino': [0, 1], 'they': [1]}
    """
    
    x={}
    for i, doc in enumerate(doc_list):
        infos = doc.split()
        match = [info.rstrip('.,').lower() for info in infos]
            
        for key in keywords:
            if key.lower() in match:
                x[key] = [i]
    
    return x

# Check your answer
q3.check()

sly inlet Jun 4, 2024, 4:51 PM

#

tiny anvil ```py def multi_word_search(doc_list, keywords): """ Takes list of docum...

you can delete your code blocks now, just to make it more cleaner

#

def multi_word_search(doc_list, keywords):
    """
    Takes list of documents (each document is a string) and a list of keywords.  
    Returns a dictionary where each key is a keyword, and the value is a list of indices
    (from doc_list) of the documents containing that keyword

    >>> doc_list = ["The Learn Python Challenge Casino.", "They bought a car and a casino", "Casinoville"]
    >>> keywords = ['casino', 'they']
    >>> multi_word_search(doc_list, keywords)
    {'casino': [0, 1], 'they': [1]}
    """
    
    x = {k: [] for k in keywords}
    for i, doc in enumerate(doc_list):
        infos = doc.split()
        match = [info.rstrip('.,').lower() for info in infos]

        for key in keywords:
            if key.lower() in match:
                x[key].append(i)
    
    return x

# Check your answer

tiny anvil Jun 4, 2024, 4:53 PM

#

sly inlet ```python def multi_word_search(doc_list, keywords): """ Takes list of d...

Wow that was so simple, thanks

#

I think I need to practice using list comprehension and for loops a bit more

sly inlet Jun 4, 2024, 5:35 PM

#

@obsidian bone did you get it to work , the way you wanted ?

obsidian bone Jun 4, 2024, 5:38 PM

#

sly inlet @obsidian bone did you get it to work , the way you wanted ?

I was doing laundry, just finished

#

Will look into it later and tell you

haughty mulch Jun 4, 2024, 6:27 PM

#

By looking at raw data from a dataset, how can one decide whether it is necessary to add features (say, mean, variance, etc.)?
Is there any standard approach, or should I just do a trial and check?
If there is an approach, how do I know which features I should include?
Thank you

obsidian bone Jun 5, 2024, 7:37 AM

#

@sly inlet

tokenized_problem = tokenizer(sentence, return_tensors='pt')

out_ids = model(tokenized_problem['input_ids'].to(DEVICE))

debug_layer = torch.nn.Linear(hidden_size, n_labels, dtype=torch.bfloat16).to(DEVICE)

# out_ids[0] is the last hidden state: [batch_size, seq_len, hidden_size]
debug_out = debug_layer(out_ids[0])

# average the per-token logits over the sequence dimension
debug_out = debug_out.mean(1)

# debug_out.shape is now [batch_size, n_labels]

#

I checked Gemma's sequence classification, they used last token which is [:, -1, :] to get their logits,

#

but i wanted to accommodate all the tokens, so seems like taking the mean is the only option here
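
The three pooling choices that come up in this thread (first token, last token, mean over tokens) can be compared side by side. This is a minimal sketch on random tensors, not the Transformers implementation. The `pool_hidden_states` helper is hypothetical, and it assumes right padding with an attention mask of 1s for real tokens.

```python
import torch


def pool_hidden_states(hidden, attention_mask):
    """Return first-token, last-real-token and masked-mean sentence vectors."""
    first = hidden[:, 0, :]

    # Index of the last non-padding token per row (assumes right padding).
    last_idx = attention_mask.sum(dim=1) - 1
    last = hidden[torch.arange(hidden.size(0)), last_idx]

    # Mean over real tokens only; padding positions are zeroed out first.
    mask = attention_mask.unsqueeze(-1).to(hidden.dtype)
    mean = (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1)
    return first, last, mean


hidden = torch.randn(2, 5, 8)                      # [batch, seq_len, hidden_size]
attention_mask = torch.tensor([[1, 1, 1, 0, 0],
                               [1, 1, 1, 1, 1]])
first, last, mean = pool_hidden_states(hidden, attention_mask)
print(first.shape, last.shape, mean.shape)  # each is torch.Size([2, 8])
```

Each pooled tensor has shape `[batch_size, hidden_size]`, so any of them can be fed to a `torch.nn.Linear(hidden_size, n_labels)` head. Pooling the hidden states and then applying the head is one option; applying the head per token and averaging the logits, as in the snippet above, is another.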

#

thanks for your help

sly inlet Jun 5, 2024, 7:45 AM

#

obsidian bone I checked Gemma's sequence classification, they used last token which is [:, -1,...

i believe that is standard way for it then

sly inlet Jun 5, 2024, 7:45 AM

#

obsidian bone but i wanted to accomodate all the tokens, so seems like taking the mean is the ...

i don't think it will work better with this approach , but can try

obsidian bone Jun 5, 2024, 7:46 AM

#

sly inlet i don't think it will work better with this approach , but can try

I'll try both ways, with mean and with taking the last token and see what happens

sly inlet Jun 5, 2024, 7:46 AM

#

last token method will pretty much work as Transformers have implemented it.

#

but i'm curious to see how mean of embeddings will behave

obsidian bone Jun 5, 2024, 7:49 AM

#

sly inlet last token method will pretty much work as Transformers have implemented it.

oohhh...

#

so transformers only depend on last token to predict next word?

sly inlet Jun 5, 2024, 7:51 AM

#

obsidian bone so transformers only depend on last token to predict next word?

like last hidden states

obsidian bone Jun 5, 2024, 7:53 AM

#

sly inlet like last hidden states

I am really having trouble understanding transformers at all.... in theory they are one thing, and in practice they are a totally different thing

obsidian bone Jun 5, 2024, 8:15 AM

#

sly inlet but i'm curious to see how mean of embeddings will behave

it behaves terrible...💀

#

it keeps outputting the same label on every sentence

#

will try last token now

#

it's the same with last token

sly inlet Jun 5, 2024, 8:21 AM

#

obsidian bone it's the same with last token

What ? Try it for the first token then

obsidian bone Jun 5, 2024, 8:21 AM

#

sly inlet What ? Try it for the first token then

k wait

obsidian bone Jun 5, 2024, 8:26 AM

#

sly inlet What ? Try it for the first token then

sly inlet Jun 5, 2024, 8:41 AM

#

obsidian bone

last try with their default implementation of GemmaForSequenceClassification

#

i would love to test this at myside as well

#

but i'm busy with my office work now , i will check out this variations later

obsidian bone Jun 5, 2024, 8:42 AM

#

sly inlet but i'm busy with my office work now , i will check out this variations later

good luck with your office work

sly inlet Jun 5, 2024, 8:43 AM

#

obsidian bone I am really having trouble understanding transformers at all.... in theory they ...

its hard to understand the code without playing with it , i will later show you how much i understood after trying things with it for a week

#

but input_ids , position_ids and attention_mask are important things to consider

#

you can try generating an answer with a padded prompt , without attention_mask it fails but with it it works fine.
its all about rotary embeddings

#

see you later

abstract ridge Jun 5, 2024, 3:12 PM

#

Greetings folks, I am looking for a dataset that has artworks with their descriptions (I only need the name of the artwork with the semantic description, like what it is and what it means)
hope someone knows a dataset that suits this case

ruby oar Jun 5, 2024, 3:13 PM

#

Can i get a job here after becoming a newbie ML and DL engineer?

plucky vector Jun 5, 2024, 3:47 PM

#

define "here" 😅

weak compass Jun 6, 2024, 9:55 AM

#

Hello. When we are using two GPUs in a kaggle notebook, how do we include both of them when running an LLM with a huggingface pipeline?
GPUs available: [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:1', device_type='GPU')]
But currently, i am doing something like:
pipe = pipeline("text2text-generation", model="grantslewis/spelling-correction-english-base-finetuned-places", device = 0)
Which i think only makes use of one of the GPUs

vapid mantle Jun 6, 2024, 12:16 PM

#

hello guys, i found a dataset that someone has worked on that i would like to use aswell and visualize just like he did on tableau, but despite trying to follow the instructions im so lost on how to download his jupyternotebook. can someone help me? thank you very much

undone nexus Jun 6, 2024, 6:50 PM

#

Hello, I want to learn more about Machine Learning and want to know whether anybody has any suggestions as to which resources(research papers, online courses, books/textbooks,etc.) one could use to cover a lot of topics in a reasonable depth such that you could ascertain which fields of AI/ML you like or don't like. Coming from a complete beginner standpoint at ML here, but have the basic prerequisites of statistics, multivariable calculus, and linear algebra covered.

plucky vector Jun 7, 2024, 4:02 PM

#

undone nexus Hello, I want to learn more about Machine Learning and want to know whether anyb...

I did the machine learning basic and advanced courses of Kaggle so far. They cover some basic concepts like train/test etc. which I'm sure you will need no matter which direction you head later. Each of them is estimated 4 hours long (text + training).

midnight oriole Jun 9, 2024, 4:01 AM

#

hello, i am new to ML and was wondering how much an Apple M1 with 16 GB will be able to handle for training? I found a dataset with 5 million records that i would like to work on, but i'm sure that this is way too much for a laptop. what is a general range a laptop could handle? should i slice it down to 100-500k records? Also, should i just use Google Colab instead of Jupyter?
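
Whether 5 million records is too much depends more on the number of columns and their types than on the row count alone. A back-of-the-envelope sketch, assuming a dense numeric table; the figure of 20 columns is made up for illustration, and `estimate_memory_gb` is a hypothetical helper, not a library function.

```python
def estimate_memory_gb(n_rows, n_columns, bytes_per_value=8):
    """Rough in-memory size of a dense numeric table, in gigabytes (10**9 bytes)."""
    return n_rows * n_columns * bytes_per_value / 1e9


# 5 million rows with a hypothetical 20 numeric columns stored as float64
print(estimate_memory_gb(5_000_000, 20))     # 0.8
# the same table stored as float32
print(estimate_memory_gb(5_000_000, 20, 4))  # 0.4
```

Training usually needs a few copies of the data plus the model's own working memory, so the raw size is a lower bound rather than the full requirement.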

sly inlet Jun 11, 2024, 10:42 AM

#

obsidian bone I am really having trouble understanding transformers at all.... in theory they ...

have you seen this
https://youtu.be/l8pRSuU81PU?si=xOnR0-fXZDXhEyIz

YouTube

Andrej Karpathy

Let's reproduce GPT-2 (124M)

We reproduce the GPT-2 (124M) from scratch. This video covers the whole process: First we build the GPT-2 network, then we optimize its training to be really fast, then we set up the training run following the GPT-2 and GPT-3 paper and their hyperparameters, then we hit run, and come back the next morning to see our results, and enjoy some amusi...

obsidian bone Jun 11, 2024, 10:43 AM

#

sly inlet have you seen this https://youtu.be/l8pRSuU81PU?si=xOnR0-fXZDXhEyIz

yeah I did, i added to watch later list lol

sly inlet Jun 11, 2024, 10:43 AM

#

yeah , kind of webseries on GPT-2

obsidian bone Jun 11, 2024, 10:44 AM

#

i am excited to watch

sly inlet Jun 11, 2024, 10:44 AM

#

but i like whatever this man posts

obsidian bone Jun 11, 2024, 10:44 AM

#

but got some essays to finish this week... writing is so boring

sly inlet Jun 11, 2024, 10:44 AM

#

i'm glad i'm finally over that crap

obsidian bone Jun 11, 2024, 10:44 AM

#

sly inlet but i like whatever this man posts

yeah, he explains it well too, I watched his transformers from scratch video

#

he has a discord server as well

sly inlet Jun 11, 2024, 10:45 AM

#

yes

#

i was there but didn't feel that good to me , so i left : )

obsidian bone Jun 11, 2024, 10:46 AM

#

man I already miss doing AI... it's been 3 days I didn't touch kaggle, feel like something important missing from my life

#

I hate university

obsidian bone Jun 11, 2024, 10:46 AM

#

sly inlet i was there but didn't feel that good to me , so i left : )

aah i see

sly inlet Jun 11, 2024, 10:46 AM

#

obsidian bone I hate university

CS major ?

obsidian bone Jun 11, 2024, 10:46 AM

#

sly inlet CS major ?

Electrical and electronics engineering 🥲

sly inlet Jun 11, 2024, 10:47 AM

#

obsidian bone ELectrical and electronics engineering 🥲

hey, i'm also from electronics 😁

obsidian bone Jun 11, 2024, 10:47 AM

#

sly inlet hey, i'm also from electronics 😁

for realll?

#

u finished or still studying?

sly inlet Jun 11, 2024, 10:47 AM

#

graduated last year

obsidian bone Jun 11, 2024, 10:47 AM

#

oooh congrats

sly inlet Jun 11, 2024, 10:48 AM

#

which year you are in ?

obsidian bone Jun 11, 2024, 10:48 AM

#

atleast you are saved from it

obsidian bone Jun 11, 2024, 10:48 AM

#

sly inlet which year you are in ?

my last year, lmfao. Finals are on 24th june, and if I pass this semester I'll graduate

sly inlet Jun 11, 2024, 10:48 AM

#

ngl, CS things are so simple, we got pure madness of maths and theorems

obsidian bone Jun 11, 2024, 10:49 AM

#

sly inlet ngl, CS things are so simple, we got pure madness of maths and theorems

i know right? unlike Electronics

#

u got motor and voltage source, good luck connecting driver to them

#

and make sure you don't end up exploding ICs

#

I hate this major...

#

wish I went software engineering instead

sly inlet Jun 11, 2024, 10:50 AM

#

it's better in electronics, at least you will get hurt by max 24 V, not like electrical, playing with live AC

obsidian bone Jun 11, 2024, 10:50 AM

#

sly inlet its better in elctronica at least you will get hurt by max 24v not like electica...

actually we played with 220v AC

#

ended up blowing a light bulb

sly inlet Jun 11, 2024, 10:50 AM

#

obsidian bone wish I went software engineering instead

i had fun though , connecting and building stuff, even though i didn't understand a thing

obsidian bone Jun 11, 2024, 10:50 AM

#

glad we didn't get injured

obsidian bone Jun 11, 2024, 10:51 AM

#

sly inlet i had fun though , connecting and building stuff, even though i didn't understan...

eehh idk, for me I didn't have that much fun

#

arduino has most of its code ready, u just connect components, copy-paste the code and run the circuit. Tadaaa!

sly inlet Jun 11, 2024, 10:51 AM

#

obsidian bone actually we played with 220v AC

i didn't like that my teacher always forced me to actively participate in electrical labs, like, i like my life as much as everyone else does

#

autotransformers, delta, star

obsidian bone Jun 11, 2024, 10:52 AM

#

yeah transformers lmfao

#

we know 2 transformers now

#

one in electrical and one in AI

#

that are completely unrelated

sly inlet Jun 11, 2024, 10:52 AM

#

obsidian bone one in electrical and one in AI

bet , AI has long way to go

#

search now and you will get electrical one

obsidian bone Jun 11, 2024, 10:53 AM

#

sly inlet search now and you will get electrical one

or the movie one

sly inlet Jun 11, 2024, 10:53 AM

#

too much competition

obsidian bone Jun 11, 2024, 10:53 AM

#

sly inlet Jun 11, 2024, 10:53 AM

#

remove s

obsidian bone Jun 11, 2024, 10:54 AM

#

sly inlet Jun 11, 2024, 10:54 AM

#

but HF needs to do more publicity

obsidian bone Jun 11, 2024, 10:54 AM

#

wow. you are master at prompting

#

just by removing S we got different result

sly inlet Jun 11, 2024, 10:54 AM

#

obsidian bone wow. you are master at prompting

tbh, i hate this part ,

#

i wish it won't come to me at work place

obsidian bone Jun 11, 2024, 10:54 AM

#

honestly I hate it cause it doesn't feel like real engineering

#

but it does bring results

sly inlet Jun 11, 2024, 10:55 AM

#

change a few letters and you get something new; how am i supposed to get every answer with a single prompt, i don't know

obsidian bone Jun 11, 2024, 10:55 AM

#

chain-of-thought and self-consistency, for example, they do improve LLMs' reasoning

obsidian bone Jun 11, 2024, 10:55 AM

#

sly inlet change few letters and you get something new, how am i supposed to get all answe...

yeah... LLMs are weird

sly inlet Jun 11, 2024, 10:55 AM

#

obsidian bone chain of thoughts and self consistency for example, they do improve LLMs reasoni...

up to a certain point

obsidian bone Jun 11, 2024, 10:55 AM

#

sly inlet up to a certain point

still better than zero shot prompting tho

sly inlet Jun 11, 2024, 10:55 AM

#

as your prompt gets bigger it starts hallucinating

sly inlet Jun 11, 2024, 10:56 AM

#

obsidian bone still better than zero shot prompting tho

yes

sly inlet Jun 11, 2024, 11:10 AM

#

obsidian bone honestly I hate it cause it doesn't feel like real engineering

check HF servers general chat

#

how a space can turn things over

obsidian bone Jun 11, 2024, 11:12 AM

#

sly inlet how a space can turn things over

Was it Mamba architecture the one that didn't have issues with tokenization?

#

cause looking at transformer-based architectures, they are really unstable.

sly inlet Jun 11, 2024, 11:21 AM

#

yes they were training model on byte level

sly inlet Jun 11, 2024, 11:22 AM

#

obsidian bone Was it Mamba architecture the one that didn't have issues with tokenization?

here's recent try on it
https://arxiv.org/pdf/2402.19155

#

long way to go for byte-level models to compete with transformers; it's hard to find patterns in bytes, mostly on text data

obsidian bone Jun 11, 2024, 11:30 AM

#

sly inlet long way to go for byte level models to compete with transformers , its hard to ...

interesting will look into it, thanks

azure fractal Jun 11, 2024, 12:09 PM

#

https://www.kaggle.com/code/abhishek0032/marathon-analysis-indian-atheletes Hello everyone, I have created a project. Check it out, and if you have any suggestions let me know. If you like it, please upvote.

Marathon_Analysis_Indian_Atheletes

Explore and run machine learning code with Kaggle Notebooks | Using data from The big dataset of ultra-marathon running

tropic yarrow Jun 11, 2024, 12:12 PM

#

Can someone explain why my code is running out of RAM? (please ignore the commented parts, I'm doing a test run to see if the submission is working correctly without training the models first)

📎 message.txt

obsidian bone Jun 11, 2024, 12:30 PM

#

tropic yarrow Can someone explain why my code is running out of RAM? (please ignore the commen...

in the code there is datasetsTrain variable. Can I know what's the len(datasetsTrain)?

#

#

I set the length to 1000, and it took 2 GB of RAM when appending the model to models

tropic yarrow Jun 11, 2024, 12:31 PM

#

obsidian bone in the code there is `datasetsTrain` variable. Can I know what's the `len(datase...

Umm, okay so here's the original problem https://www.kaggle.com/competitions/facial-keypoints-detection

Facial Keypoints Detection

Detect the location of keypoints on face images

obsidian bone Jun 11, 2024, 12:31 PM

#

your neural network model has 288685 parameters to train, which is also something to consider.

tropic yarrow Jun 11, 2024, 12:32 PM

#

So I'm initialising a different dataloader and model for each feature to be predicted

obsidian bone Jun 11, 2024, 12:32 PM

#

tropic yarrow So I'm initialising a different dataloader and model for each feature to be pred...

why not initialize only 1 model

tropic yarrow Jun 11, 2024, 12:32 PM

#

obsidian bone I added the length of 1000, and it took 2GB of rams when appending the model to ...

but what im confused about is that it crashes only during prediction and not training

tropic yarrow Jun 11, 2024, 12:33 PM

#

obsidian bone why not initialize only 1 model

I could do that and store weights for each feature in a different file

obsidian bone Jun 11, 2024, 12:33 PM

#

tropic yarrow but what im confused about is that it crashes only during prediction and not tra...

most probably len(datasetsTrain) is very large

#

and you are initializing a model that many times

tropic yarrow Jun 11, 2024, 12:33 PM

#

I was just experimenting with whatever I could think of at the time, and it was working for training and validation so i didn't optimise it

tropic yarrow Jun 11, 2024, 12:33 PM

#

obsidian bone and you initiating a model for that times of length

but there is no error during training or validation, only during submission
and that's what im confused about

obsidian bone Jun 11, 2024, 12:34 PM

#

tropic yarrow but there is no error during training or validation, only during submission and ...

hmmm lemme check

tropic yarrow Jun 11, 2024, 12:34 PM

#

I also copied portions of the code and didn't remove the comments so ignore the double comments, im sorry

#

looks pretty ugly

#

i had commented out the training loop for checking if the submission works correctly

#

while training

obsidian bone Jun 11, 2024, 12:37 PM

#

tropic yarrow i had commented out the training loop for checking if the submission works corre...

alright I'm trying the code on dataset, I'll return to you in couple of minutes

tropic yarrow Jun 11, 2024, 12:37 PM

#

oh nvm its going up

#

i see, thank you

obsidian bone Jun 11, 2024, 12:37 PM

#

wait you using this in your local machine or kaggle notebook?

tropic yarrow Jun 11, 2024, 12:37 PM

#

colab

obsidian bone Jun 11, 2024, 12:38 PM

#

oh

obsidian bone Jun 11, 2024, 12:48 PM

#

tropic yarrow colab

yeah seems like your submission function is doing something weird

#

17.4 GB right

#

now

#

bruh what did u do lol

#

wait lemme check the submission function

#

ok notebook crashed

tropic yarrow Jun 11, 2024, 12:51 PM

#

For each row, I was using the image id to access the image from the test dataset and using the encoded feature name to access the model to be called (which could be optimised by saving weights)

obsidian bone Jun 11, 2024, 12:58 PM

#

tropic yarrow For each row, I was using the image id to access the image from the test dataset...

ok fixed the problem

#

#

import torch
from tqdm import tqdm

def submission(models, lookupDataset, imageDataset, organEncodings):
    # Put every per-feature model into inference mode.
    for model in models:
        model.eval()
    # no_grad stops the forward pass from recording a computation graph,
    # so the stored outputs no longer keep graph memory alive.
    with torch.no_grad():
        for i in tqdm(range(len(lookupDataset))):
            row = lookupDataset[i][0]
            # row[1] is the image id (the -1 suggests ids start at 1)
            image = imageDataset[int(row[1]) - 1].unsqueeze(0)
            # row[2] is the encoded feature name, used to pick the model
            row[3] = models[int(row[2])](image)
    return lookupDataset

#

you should add with torch.no_grad() before you feed the data into models

#

because without it, it calculates the backprops of the models and accumulates it

#

which takes space in ram

tropic yarrow Jun 11, 2024, 1:02 PM

#

obsidian bone because without it, it calculates the backprops of the models and accumulates it

Even when I don't call loss.backward()?

obsidian bone Jun 11, 2024, 1:02 PM

#

tropic yarrow Even when I don't call loss.backward()?

yeah

tropic yarrow Jun 11, 2024, 1:03 PM

#

What's loss.backward() for then? Doesn't it store the gradients?

obsidian bone Jun 11, 2024, 1:04 PM

#

tropic yarrow What's loss.backward() for then? Doesn't it store the gradients?

no, it computes gradients of parameters in neural network layers

tropic yarrow Jun 11, 2024, 1:04 PM

#

Um
somehow I used a dataloader and now it works fine

obsidian bone Jun 11, 2024, 1:04 PM

#

but when you do forward pass

#

the model creates a computation graph for differentiation

#

if it's gradient enabled

tropic yarrow Jun 11, 2024, 1:04 PM

#

obsidian bone because without it, it calculates the backprops of the models and accumulates it

so what's it calculating here?

obsidian bone Jun 11, 2024, 1:05 PM

#

tropic yarrow so what's it calculating here?

oh sorry i meant it's accumulating the gradients for parameters

#

no backprops

tropic yarrow Jun 11, 2024, 1:06 PM

#

obsidian bone oh sorry i meant it's accumulating the gradients for parameters

I'm still confused

#

doesn't zero-grad simply set them to 0?

#

and how are they accumulating it if loss.backward() is never called?

obsidian bone Jun 11, 2024, 1:07 PM

#

tropic yarrow doesn't zero-grad simply set them to 0?

zero grad resets the accumulated gradient

obsidian bone Jun 11, 2024, 1:07 PM

#

tropic yarrow and how are they accumulating it if loss.backward() is never called?

by forward pass

#

wait lemme show u sth

tropic yarrow Jun 11, 2024, 1:08 PM

#

wdym by accumulates gradients

obsidian bone Jun 11, 2024, 1:08 PM

#

#

on the second one with torch.no_grad(), it returns the output without recording the graph

tropic yarrow Jun 11, 2024, 1:09 PM

#

how large are the gradients?

obsidian bone Jun 11, 2024, 1:10 PM

#

tropic yarrow how large are the gradients?

what gradients?

tropic yarrow Jun 11, 2024, 1:10 PM

#

i mean, those grad fn objects

#

I have 12k entries, so is each object like 1 GB?

#

nvm 30k entries actually

obsidian bone Jun 11, 2024, 1:11 PM

#

tropic yarrow Jun 11, 2024, 1:12 PM

#

Thanks, this makes sense

obsidian bone Jun 11, 2024, 1:12 PM

#

basically when you do forward pass, the tensors record history of the computation graph

#

which takes space in memory

#

but with torch.no_grad

#

u tell the model not to track the computation graph

tropic yarrow Jun 11, 2024, 1:12 PM

#

I gtg now, can we discuss more of this later? (Ill read it)

obsidian bone Jun 11, 2024, 1:12 PM

#

but just do calculation and output number
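
A small sketch of that difference, assuming PyTorch is installed. The `Linear(4, 1)` layer is only a toy stand-in for one of the keypoint models, not the code from message.txt.

```python
import torch

model = torch.nn.Linear(4, 1)  # toy stand-in for one of the keypoint models
x = torch.randn(1, 4)

# Default mode: the output remembers the graph that produced it.
out_tracked = model(x)
print(out_tracked.requires_grad)        # True
print(out_tracked.grad_fn is not None)  # True

# Inside no_grad: just the numbers, no graph attached.
with torch.no_grad():
    out_plain = model(x)
print(out_plain.requires_grad)          # False
print(out_plain.grad_fn is None)        # True

# No gradients exist in either case; they only appear after backward().
print(model.weight.grad is None)        # True
```

So nothing is being "accumulated" into `.grad` without `backward()`. What likely grew in the submission loop is that every stored output kept its own graph alive, once per row of the lookup table.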

tropic yarrow Jun 11, 2024, 1:13 PM

#

Thanks for your help!

obsidian bone Jun 11, 2024, 1:13 PM

#

tropic yarrow I gtg now, can we discuss more of this later? (Ill read it)

k

fervent jolt Jun 11, 2024, 3:47 PM

#

Hello everyone, I'm gonna try out one Kaggle competition, but I don't understand what "internet access disabled" means. Can you detail what it means???

obsidian bone Jun 12, 2024, 7:07 AM

#

fervent jolt Hello everyone, I'm gonna try out one Kaggle competition, but I don't understand...

when you are on notebook, on the right side panel there is "session options", under that there is a toggle for "internet on".

and as far as I remember you need to verify your account to have that thing.

stable dragon Jun 12, 2024, 7:30 AM

#

fervent jolt Hello everyone, I'm gonna try out one Kaggle competition, but I don't understand...

Internet will be disabled and only the things on Kaggle can be used

fervent jolt Jun 12, 2024, 7:51 AM

#

obsidian bone when you are on notebook, on the right side panel there is "session options", un...

I already verified my account. Thank you anyway!

obsidian bone Jun 12, 2024, 7:51 AM

#

fervent jolt I already verified my account. Thank you anyway!

Yeah I misread your question, I thought you were asking on where the button is lol

#

Sarvesh answered it

fervent jolt Jun 12, 2024, 7:52 AM

#

stable dragon Internet will be disabled and only the things on Kaggle can be used

@stable dragon Does that mean I can't upload any packages or data to my kernel? Can't I download the datasets offered by Kaggle? If I run the code locally and upload it to my kernel, is that a violation of the rules?

stable dragon Jun 12, 2024, 7:56 AM

#

fervent jolt @stable dragon Does that mean I can't upload any packages or data to my k...

You can use anything that's available on Kaggle. You can also upload stuff to Kaggle and use that; it's just that anything outside it can't be used.

fervent jolt Jun 12, 2024, 7:59 AM

#

@stable dragon Thanks for your quick reply. I'm getting closer.... Can you give me specific examples that are not allowed to enlighten this newbie 😔 ?

stable dragon Jun 12, 2024, 9:27 AM

#

Would recommend checking with a public notebook, that would help you more.

Alternatively to experiment with things, turn off the internet for the notebook that you are working on, and try executing the code

fervent jolt Jun 12, 2024, 10:03 AM

#

stable dragon Would recommend checking with a public notebook, that would help you more. Alt...

@stable dragon Thank you so much!!!

fierce canopy Jun 12, 2024, 2:38 PM

#

Hi! Can I run Stable Diffusion and KoboldAI on Kaggle without being banned?
I'm also recording a course about Stable Diffusion. Can I use Kaggle to teach, and can the students use it without being banned? How does it work?
Can I train Stable Diffusion checkpoints on Kaggle without being banned?

plucky vector Jun 12, 2024, 3:08 PM

#

fervent jolt @stable dragon Thanks for your quick reply. I'm getting closer.... Can yo...

I think there are notebooks that dynamically load stuff from GitHub; this would not be allowed

fervent jolt Jun 12, 2024, 3:28 PM

#

plucky vector I think there are notebooks that load dynamically stuff from GitHub, this would ...

@plucky vector Thank you! Could that be the only restriction?

stable dragon Jun 12, 2024, 8:31 PM

#

Can anyone tell me how to import a python notebook with all the inputs to the local via API.

Right now I need to individually download all the input files

light nest Jun 13, 2024, 8:55 AM

#

Hey guys, I have a question: can we build a sequential model?
Like, is it possible to train one model on inputs X1_i with output Y1_i, and then a second one that runs on X1_i + Y1_i to give output Y2_i?
Context: (I am trying to build this for a product where we are predicting what the user is likely to select. I have learned about supervised learning algorithms, including ensemble techniques.)

obsidian bone Jun 13, 2024, 9:56 AM

#

light nest Hey guys I have a question , can we build sequential model ? Like is it possible...

yeah it's possible
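
A minimal sketch of such a two-stage chain in plain Python. Everything here is illustrative: the `NearestNeighbour` class is a tiny stand-in for any supervised learner, and the toy data and the `predict_chain` helper are made up. scikit-learn's `ClassifierChain` packages a similar idea.

```python
class NearestNeighbour:
    """Tiny 1-nearest-neighbour model, a stand-in for any supervised learner."""

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, X):
        def closest(row):
            dists = [sum((a - b) ** 2 for a, b in zip(row, seen)) for seen in self.X]
            return self.y[dists.index(min(dists))]
        return [closest(row) for row in X]


# Toy data: X1 -> Y1 (first choice), then X1 + Y1 -> Y2 (second choice).
X1 = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
Y1 = [0, 0, 1, 1]
Y2 = [0, 1, 1, 0]

stage1 = NearestNeighbour().fit(X1, Y1)
# Stage 2 is trained on the original features plus the true Y1 as an extra column.
stage2 = NearestNeighbour().fit([x + [y] for x, y in zip(X1, Y1)], Y2)


def predict_chain(rows):
    """At prediction time Y1 is unknown, so stage 2 consumes stage 1's prediction."""
    y1_hat = stage1.predict(rows)
    y2_hat = stage2.predict([x + [y] for x, y in zip(rows, y1_hat)])
    return list(zip(y1_hat, y2_hat))


print(predict_chain([[0.9, 0.1], [0.1, 0.9]]))  # [(1, 1), (0, 1)]
```

The thing to watch is that errors in the first model carry over into the second, since the second only ever sees predicted Y1 values once deployed.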

vernal mist Jun 13, 2024, 4:59 PM

#

So I am new to the whole data science thing and just started working with kaggle. Is there anyone that might be able to give me some pointers?

plucky vector Jun 13, 2024, 6:14 PM

#

vernal mist So I am new to the whole data science thing and just started working with kaggle...

Do the Courses on Kaggle to learn the basics. They are for free and with many hands-on exercises

vernal mist Jun 13, 2024, 6:16 PM

#

plucky vector Do the Courses on Kaggle to learn the basics. They are for free and with many ha...

I started the courses and I think there is a link broken in one of them cause it wont load a dataset that is in the input unless I'm doing something very wrong

plucky vector Jun 13, 2024, 6:17 PM

#

vernal mist I started the courses and I think there is a link broken in one of them cause it...

There is a discussion area for every training in the courses, and also an area where you get hints how to get the right solution. Did you look there?

#

At least in the courses Machine Learning I + II and Neural Nets that I did

vernal mist Jun 13, 2024, 6:20 PM

#

plucky vector There is a discussion area for every training in the courses, and also an area w...

I was doing the Getting staRted tutorial. And its done in a notebook. I'm doing a full career pivot here don't know too much about data science besides what I've learn in a coursera cert so my knowledge in all this is very limited.

plucky vector Jun 13, 2024, 6:27 PM

#

vernal mist I was doing the Getting staRted tutorial. And its done in a notebook. I'm doin...

Ah, I didn't do that one, so I probably can't help specifically. Did the data load in the other training notebooks, in the other lessons?

vernal mist Jun 13, 2024, 6:28 PM

#

plucky vector Ah, I didn't do that one, so I probably can't help specifically. Did the data lo...

the dataset loaded in the previous training notebook and I tried a trick that I did in my own notebook. But even if the dataset shows in the input it won't load with R.

plucky vector Jun 13, 2024, 6:29 PM

#

That's strange. Did you select R as the language instead of Python for your cell?

#

I must admit, I didn't work with R so far on Kaggle

vernal mist Jun 13, 2024, 6:36 PM

#

plucky vector That's strange. Did you select R as the language instead of Python for your cell...

Yea R is the default on it. So it's really confusing me.

vernal mist Jun 13, 2024, 6:37 PM

#

plucky vector I must admit, I didn't work with R so far on Kaggle

On that subject, the only reason I'm working with R right now is that's the programming language that the Google cert I took taught us. But I haven't really come across anyone that uses it for data analytics. Does it matter if I stick with R, or should I switch to learning Python?

plucky vector Jun 13, 2024, 6:43 PM

#

Well, as far as I understand with R you can do only data statistics, and with python you can do almost everything you can do with any other programming language too -- program scripts, for example, and especially on Kaggle also do machine learning stuff.

#

I don't know what your background is, but in the natural sciences python is very widespread -- because it's easy to learn and most scientists aren't programmers. They want to put some code together to do something for them without diving into the concepts of stuff like object-oriented programming

vernal mist Jun 13, 2024, 7:06 PM

#

plucky vector I don't know what your background is, but in the natural sciences python is very...

I'm getting ready to retire from the Marine Corps and I want to pivot into data analytics. I was just in a class where they told me Python is a more preferred skill, so I think I'll go back and do some classes on that and start working some more case studies again.

plucky vector Jun 13, 2024, 7:08 PM

#

You can learn both. The languages are a bit different, but the concepts of the analytics themselves are the same. Like math books in different languages will still contain the same math

vernal mist Jun 13, 2024, 7:16 PM

#

plucky vector You can learn both. The languages are a bit different, but the concepts of the a...

Yea I know. But I'm trying to find a job so I want to start with a programming language that is more widely used... I think.

elder flower Jun 13, 2024, 8:01 PM

#

There is news that the US government will forbid any IT services for Russia after 1st September. Is there any information about Kaggle? Will it stop working for Russian users?

plucky vector Jun 13, 2024, 8:18 PM

#

elder flower There are news that US government forbid any IT services for Russia after 1st Se...

I don't know

#

https://tenor.com/view/anonymous-hacker-hack-hacking-vpn-gif-10891422

vapid mantle Jun 14, 2024, 8:42 AM

#

anyone truly knowledgeable about big data & business intelligence here? i got a uni assignment and im cooked

#

please dm me

muted plover Jun 14, 2024, 2:08 PM

#

I have posted a question on how to become an ML expert with my current knowledge, if anyone wants to help I would appreciate it. Thank you in advance 🙂 https://www.kaggle.com/discussions/general/512308

How to become ML expert | Kaggle

How to become ML expert.

next widget Jun 14, 2024, 2:54 PM

#

Has anyone run into the problem where the kaggle Python package cannot find kaggle.json? I have double- and triple-checked that the kaggle.json file is in the "~/.kaggle" directory, and I have also run the chmod 600 command on the file.

EDIT:
I figured it out. I had to set KAGGLE_CONFIG_DIR to the full path of the .kaggle directory. For some reason paths starting with "~/" do not work. The permissions were set to 600.
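
A short sketch of that fix from Python, assuming the token lives in the default `~/.kaggle` folder. The `~` is shell shorthand, so an environment variable containing it literally is not expanded for you; expanding it first gives the absolute path.

```python
import os
from pathlib import Path

# "~" is shell shorthand; expand it to an absolute path before exporting it.
config_dir = Path("~/.kaggle").expanduser()
os.environ["KAGGLE_CONFIG_DIR"] = str(config_dir)

token = config_dir / "kaggle.json"
print(os.environ["KAGGLE_CONFIG_DIR"])
print("token found:", token.is_file())

# Set the variable before importing the kaggle package in the same process.
```

As far as I know the kaggle package reads its credentials when it is imported, which is why the variable has to be set first.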

midnight oriole Jun 14, 2024, 9:52 PM

#

hello, does anyone know any good resources of understanding the "dataset"? I'm not so good at concepts such as feature engineering and understanding the visualization of the data. So I guess like data preprocessing?

serene pier Jun 15, 2024, 9:52 AM

#

i was training a model and the epoch progress stopped, so i refreshed the page. now the session is stuck on booting the kernel, but when i look at the console it's still running. how do i solve this?

jaunty tangle Jun 16, 2024, 9:56 AM

#

Hello, I'm having a problem with something really simple but I can't seem to make it work.
When I try running the code below, it doesn't recognize min_frequency as a parameter despite it being in the documentation. So I'm trying to upgrade scikit-learn to a newer version but it doesn't seem to be working. I run my code on kaggle. Help would be much appreciated 🙇‍♂️

```python
!pip install --upgrade scikit-learn --use-deprecated=legacy-resolver
from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OrdinalEncoder

# Define the columns to be encoded
categorical_cols = ["Gender"]  # replace with your column names
gender_categories=['M','F']
# Create the encoder
encoder = OrdinalEncoder(min_frequency=10,
                         categories=[gender_categories])
```

#

For more context, I got this error the first time I tried to upgrade:

```
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
spopt 0.6.0 requires shapely>=2.0.1, but you have shapely 1.8.5.post1 which is incompatible.
```

The version also hasn't seemed to change yet
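
One thing worth checking here, sketched with the standard library only: whether the version pip put on disk differs from the one the running kernel already imported. The helper name `installed_version` is made up for this sketch.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Version string pip has on disk for `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


print("on disk:", installed_version("scikit-learn"))

# Compare with what the running kernel has already imported, e.g.
#   import sklearn; print(sklearn.__version__)
# If the two differ, restarting the kernel should pick up the new install.
```

The pip message above reports a dependency conflict with another package, so the upgrade itself may well have gone through even though it is labelled ERROR.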

candid canyon Jun 17, 2024, 12:07 PM

#

I am having a problem while predicting with YOLOv8 on 2 GPUs. I have set a batch size of 16 but I am still having the same problem.

high sigil Jun 18, 2024, 6:13 AM

#

vernal mist On that subject the only reason I'm working with R right now is that's the progr...

yes R is also used for data science, if you are comfortable you can use it

compact temple Jun 18, 2024, 11:14 AM

#

Hi this may sound like a bit of a silly question but what channel is for the discussions on the Titanic competition?

plucky vector Jun 18, 2024, 12:55 PM

#

#🚢┊titanic I would guess

#

If you go to id:customize you can see many more channels than in the standard list

rocky bolt Jun 18, 2024, 9:13 PM

#

Hi I've cloned from a git repository, and i tried to open the files under the output/RadioUNet directory but to no avail. Is the behaviour as expected? I tried googling about it and chatgpt did say it is possible to open the files by clicking on it

marble raptor Jun 18, 2024, 9:13 PM

#

I can't seem to find the channel for the kagglex competition even though I've verified myself on discord, any help?

glacial moth Jun 19, 2024, 1:22 AM

#

```python
import pandas as pd
import plotly.express as px
import matplotlib
import matplotlib.pyplot as plt
import seaborn as sns

medical_df = pd.read_csv('/kaggle/input/test12ssda/medical-charges.csv')

sns.set_style('darkgrid')
matplotlib.rcParams['font.size'] = 14
matplotlib.rcParams['figure.figsize'] = (10, 6)
matplotlib.rcParams['figure.facecolor'] = '#00000000'

fig = px.histogram(medical_df,
                   x='age',
                   marginal='box',
                   nbins=47,
                   title='Distribution of Age')
fig.update_layout(bargap=0.1)
fig.show()
```

#

Someone please help

#

it's been an hour since I started trying to fix it

#

it doesn't show the graph but doesn't give any errors

#

and yes I printed medical_df and it was fine

#

so nothing wrong with the dataframe

#

@ruby pewter

stable dragon Jun 19, 2024, 3:03 AM

#

Use copilot or LLMs to fix these errors, would be faster and easier

undone fulcrum Jun 19, 2024, 1:16 PM

#

Does anyone know good ways to approach fitting classifier models on high dimensional vector embedding data?

#

As part of my research at my university, we are testing the accuracy of ML classifier models in network anomaly detection after the data has been converted to a textual format and embedded using LLM sentence embeddings. The goal is to measure the change in performance of the models with and without LLM embeddings used in preprocessing.

A problem I’m encountering is the datasets I work with are typically 1.5 million instances and when running the data through a sentence transformer model the dimensionality of the data skyrockets (about 350+ embedding features for the current model I’m using) and fitting the models we’re testing (RFC, ET, SVM, ADA, XGB, GB) takes a significant amount of time which is not good as my code gets run in Google Colab and constantly having to re up on compute tokens is not cost efficient at all.

I’ve thought about running the embeddings through PCA but I think the dimension would have to be reduced so much that it would be significantly diminishing the strength of the embeddings. Before I try this route, are there any suggestions on better ways to approach this problem that I don’t know of? Thank you!

glacial moth Jun 19, 2024, 1:41 PM

#

stable dragon Use copilot or LLMs to fix these errors, would be faster and easier

It says it's correct and it locally runs fine

obsidian pulsar Jun 19, 2024, 4:55 PM

#

glacial moth ```import pandas as pd import plotly.express as px import matplotlib import matp...

for numerical variables, use px.histogram with the histfunc parameter set to count or sum

obsidian pulsar Jun 19, 2024, 5:08 PM

#

jaunty tangle Hello, I'm having a problem with something really simple but I can't seem to mak...

😫 Well, as I see it, the problem you are facing is caused by the fact that the OrdinalEncoder in the scikit-learn version installed on Kaggle does not have a min_frequency parameter.
In that version it only has parameters such as categories and handle_unknown.

#

Why not preprocess the data using the counters in the ingest module before encoding the data?
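
A minimal sketch of that preprocessing idea, assuming "counters" means something like `collections.Counter` from the standard library (the "ingest module" is not identified in the message, so that reading is a guess). The helper names `group_rare` and `ordinal_encode` and the `"infrequent"` placeholder are hypothetical, not scikit-learn API.

```python
from collections import Counter

def group_rare(values, min_frequency, placeholder="infrequent"):
    """Replace categories seen fewer than `min_frequency` times."""
    counts = Counter(values)
    return [v if counts[v] >= min_frequency else placeholder for v in values]

def ordinal_encode(values):
    """Map each category to an integer code, in sorted category order."""
    mapping = {cat: code for code, cat in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping

colors = ["red", "blue", "red", "green", "blue", "red", "teal"]
grouped = group_rare(colors, min_frequency=2)
codes, mapping = ordinal_encode(grouped)
print(grouped)
print(codes, mapping)
```

The counts should be computed on the training split only and then reused for validation and test data, so that rare categories are grouped the same way everywhere.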

glacial moth Jun 19, 2024, 5:44 PM

#

obsidian pulsar for numerical variables, use px.histogram with the histfunc parameter set to cou...

Are you sure that parameter exists?

jaunty tangle Jun 19, 2024, 7:55 PM

#

obsidian pulsar Why not preprocess the data using the counters in the ingest module before encod...

Hi, thanks for the reply! It's true that scikit-learn version 1.2.2 doesn't have min_frequency and I can just make it myself, but some newer features like TargetEncoder take a lot more effort to make from scratch. For now I just moved to running the code locally on scikit-learn 1.5.0 and I'm no longer getting errors, but I wish there was a way to run it straight on Kaggle notebooks

obsidian pulsar Jun 20, 2024, 12:25 AM

#

jaunty tangle Hi, Thanks for the reply! It's true that Scikit learn version 1.2.2 doesn't have...

You're welcome!
Kaggle notebooks may not always have the latest versions of popular libraries,
but you can try this:
You can create a custom environment on Kaggle by installing the required packages in your notebook, so that you can use the latest versions of scikit-learn and others.
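
A rough sketch of what installing a newer package inside a notebook session could look like, assuming the notebook has internet access enabled. The helper names are hypothetical; the only real tools used are `pip` and `importlib.metadata`. If the package was already imported in the session, the kernel usually needs a restart before the new version is picked up.

```python
import subprocess
import sys
from importlib.metadata import PackageNotFoundError, version

def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None

def pip_install_command(package, wanted):
    """Build the pip command that pins `package` to the `wanted` version."""
    return [sys.executable, "-m", "pip", "install", "--quiet", f"{package}=={wanted}"]

def ensure_version(package, wanted):
    """Install the pinned version only when it is not already present."""
    if installed_version(package) != wanted:
        subprocess.check_call(pip_install_command(package, wanted))

# Example call (needs internet, so it is left commented out here):
# ensure_version("scikit-learn", "1.5.0")
```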

glacial moth Jun 20, 2024, 1:52 AM

#

so they have to update the libraries for the platform themselves?

#

I thought it was something similar to a VM

glacial moth Jun 20, 2024, 1:53 AM

#

obsidian pulsar You're welcome! they may not always have the latest versions of popular librarie...

How can I create the custom environment?

red gust Jun 20, 2024, 5:37 AM

#

So I'm relatively new to the data science world (experience in Java, but that's pretty much it). I've been trying to learn Python, but from what I have seen, it seems that most data science work involves using Python libraries more than anything; coding jargon isn't too necessary. Correct me if I'm wrong, as I am still learning Python

#

And while we're at it, can someone introduce me to the basic functions (with examples, don't need to be too in-depth) of each python library?

marble raptor Jun 20, 2024, 4:59 PM

#

I had one question, in the leaderboards it says:

This leaderboard is calculated with approximately 20% of the test data. The final results will be based on the other 80%, so the final standings may be different.

how will you calculate the remaining 80% of the data given you dont know which model I used

plucky vector Jun 20, 2024, 6:00 PM

#

I suppose, you have to submit a model or kaggle notebook on this competition

sleek relic Jun 20, 2024, 6:36 PM

#

Is there a good way to propagate image augmentation transforms to a label when loading training data (torchvision)? I am trying to train some model with pixel coordinates on the image as labels, which means I need to figure what transforms are being applied to the image to transform the label as well

thorny marsh Jun 20, 2024, 7:38 PM

#

Hello everyone, I need some advice on where to dedicate my time. Let's say I wanted to start a company down the road in the automotive/consulting business, if I'm working as a automotive mechanical design engineer, during my free time would you guys suggest learning more about the automotive industry/business or machine learning? I enjoy machine learning a decent bit but idk how transferable it will be for a business
I know that that is where the industry is going tho, EVs, auto driving, etc

obsidian pulsar Jun 21, 2024, 7:28 AM

#

thorny marsh Hello everyone, I need some advice on where to dedicate my time. Let's say I wan...

👍

obsidian pulsar Jun 21, 2024, 7:31 AM

#

marble raptor I had one question, in the leaderboards it says: ``` This leaderboard is calcula...

Have you tried using OOB estimation?

obsidian pulsar Jun 21, 2024, 7:34 AM

#

sleek relic Is there a good way to propagate image augmentation transforms to a label when l...

Yes, I've had that experience too.
I used the torchvision library from pytorch.
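
For the coordinate-label case in the question, the general pattern is to write each augmentation so that it takes the image and the points together and returns both. The sketch below shows this for a horizontal flip using plain NumPy; `hflip` and `random_hflip` are hypothetical helpers, and points are assumed to be `(x, y)` pixel coordinates. torchvision's `transforms.v2` API applies transforms jointly to images, bounding boxes, and masks, so it may be worth checking whether the installed version also covers keypoints.

```python
import numpy as np

def hflip(image, points):
    """Flip an (H, W) image left-right and mirror (x, y) points to match."""
    width = image.shape[1]
    flipped = image[:, ::-1]
    new_points = points.copy()
    new_points[:, 0] = (width - 1) - points[:, 0]  # x changes, y is untouched
    return flipped, new_points

def random_hflip(image, points, rng, p=0.5):
    """Apply the joint flip with probability p, so image and label stay in sync."""
    if rng.random() < p:
        return hflip(image, points)
    return image, points

image = np.zeros((4, 6))
image[1, 2] = 1.0                 # mark the pixel at x=2, y=1
points = np.array([[2.0, 1.0]])   # the label points at that pixel
flipped, new_points = hflip(image, points)
print(new_points)
```

The same approach extends to crops (subtract the crop offset) and resizes (scale by the size ratio), as long as the random parameters are drawn once and used for both the image and the label.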

obsidian bone Jun 21, 2024, 2:05 PM

#

was anyone able to download LLM files from huggingface using git clone? mine is stuck

marble raptor Jun 21, 2024, 6:36 PM

#

obsidian pulsar Have you tried using OOB estimation?

never heard about it, will give it a read thanks!

stoic cove Jun 23, 2024, 1:20 AM

#

Hi there, I am working on a competition that requires me to submit a notebook, which is then run in offline mode. I want to import the SentenceTransformer module, how can I import that

#

?

sleek relic Jun 23, 2024, 3:12 AM

#

You need to include the wheel in the notebook input. See this link that describes it: https://www.kaggle.com/code/narnaoot/installing-packages-without-internet-for-kaggle

Installing Packages without Internet for Kaggle

Explore and run machine learning code with Kaggle Notebooks | Using data from multiple data sources
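
A minimal sketch of the offline install step, assuming the wheel files were downloaded beforehand and attached to the notebook as an input dataset. The path `/kaggle/input/my-wheels` is a made-up example and `offline_install_command` is a hypothetical helper; `--no-index` and `--find-links` are standard pip options that stop pip from contacting PyPI and point it at a local folder instead.

```python
import subprocess
import sys

def offline_install_command(package, wheel_dir):
    """Build a pip command that installs only from local wheel files."""
    return [
        sys.executable, "-m", "pip", "install",
        "--no-index",                  # never contact the package index
        f"--find-links={wheel_dir}",   # look for wheels in this folder
        package,
    ]

command = offline_install_command("sentence-transformers", "/kaggle/input/my-wheels")
print(" ".join(command))

# In the offline notebook (left commented out here):
# subprocess.check_call(command)
```

The wheel folder has to contain the package's dependencies as well, since pip cannot fetch anything that is missing.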

arctic glade Jun 23, 2024, 5:27 AM

#

Is there a way to download the dataset of this competition https://www.kaggle.com/competitions/prediction-of-the-cefr-level-of-english-texts/overview ? Im aware that the competition is over, does that mean that the dataset is not downloadable anymore?

Prediction of the CEFR Level of English Texts

Challenge Data - IMT Nord Europe (29/01 - 02/02/2022)

plucky vector Jun 23, 2024, 11:34 AM

#

arctic glade Is there a way to download the dataset of this competition https://www.kaggle.co...

Hm, if I go to the "data" tab I see all the data with a download button

arctic glade Jun 23, 2024, 11:39 AM

#

plucky vector Hm, if I go to the "data" tab I see all the data with a download button

same, but when i click it, it redirects to the rule tab, and it ends there

#

no way to approve the rules, bcuz it says its not required

#

ive tried using Kaggle API also might i add

#

i got 403 permission denied

lapis pendant Jun 23, 2024, 1:46 PM

#

anyone here ? I'm new.... I had a question, maybe in general too... so if I let's say had embeddings for my train set, would I have to run that over every time when I submit or can I save the embeddings on my kaggle directory and just look up the numpy array when I submit? I'm doing the essay prediction learning one...

obsidian pulsar Jun 23, 2024, 3:37 PM

#

lapis pendant anyone here ? I'm new.... I had a question, maybe in general too... so if I le...

Yes, when working with embeddings, it's a good idea to save them separately from your model and recall them when you need them
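
A small sketch of saving and reloading embeddings as a NumPy array. The random array stands in for real train embeddings, and the temporary folder and the file name `train_embeddings.npy` are placeholders. Embeddings for a hidden test set would presumably still have to be computed when the submission runs, since that data is not available beforehand.

```python
import os
import tempfile

import numpy as np

# Stand-in for embeddings computed once from the training set.
rng = np.random.default_rng(0)
train_embeddings = rng.normal(size=(5, 8)).astype(np.float32)

# Save once (a temporary folder is used here as a placeholder location).
out_dir = tempfile.mkdtemp()
path = os.path.join(out_dir, "train_embeddings.npy")
np.save(path, train_embeddings)

# Later, load the array instead of recomputing it.
restored = np.load(path)
print(restored.shape, restored.dtype)
```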

main mica Jun 23, 2024, 6:17 PM

#

Hello guys the data viz course in kaggle

#

How to complete their final project huh

lapis pendant Jun 24, 2024, 6:43 PM

#

how are people in the essay learning scoring one using the LB for their own models? seems like that won't fly in final submission?

lapis pendant Jun 24, 2024, 7:31 PM

#

obsidian pulsar Yes, when working with embeddings, it's a good idea to save them separately from...

can the entire model based on training just be saved as a model so that it can run in a few minutes instead of using embeddings to find the features directly first in the test.csv? wondering if there's a way to just make a model and quickly lookup scores in the test.csv without the intermediate steps?

quiet crystal Jun 25, 2024, 7:34 AM

#

Hey guys, I have one noob question.

I'm trying to make my first submission to a code competition. I did my work on Jupyter Colab by going through some baseline examples.

Pretty much every example uses deberta-v3-base as its model and tokenizer, and they create these by doing something like:

class PATHS:
    model_path = '/kaggle/input/huggingfacedebertav3variants/deberta-v3-base'

self.tokenizer = AutoTokenizer.from_pretrained(PATHS.model_path)
model = AutoModelForSequenceClassification.from_pretrained(PATHS.model_path, num_labels=CFG.num_labels)

I am not sure from where and how you can add the models/tokenizers to the kaggle/input directory. I tried this tutorial (https://www.kaggle.com/code/shravankumar147/save-huggingface-model-to-local-for-no-internet/notebook), but I still don't see deberta_v3_small_pretrained_model_pytorch created in my kaggle/input directory when browsing from the Jupyter notebook created through competition's submission page.

Would love to get help on this. Thanks!

obsidian pulsar Jun 25, 2024, 3:08 PM

#

lapis pendant can the entire model based on training just be saved as a model so that it can r...

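Yes, that's model serialization.
How you save the model depends on the library: Keras models have a save() method, e.g. model.save(path),
while scikit-learn models are usually saved with joblib or pickle. You then load the saved model when you submit instead of training again.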
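
A toy sketch of the save-then-load workflow using `pickle` from the standard library. The "model" here is just a dictionary of least-squares weights so that the example stays self-contained; `fit_linear` and `predict` are hypothetical helpers, not part of any library. A real model would be saved with whatever mechanism its library provides.

```python
import os
import pickle
import tempfile

import numpy as np

def fit_linear(X, y):
    """Fit y ~ X @ weights + bias by least squares; return the parameters."""
    design = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return {"weights": coef[:-1], "bias": coef[-1]}

def predict(model, X):
    return X @ model["weights"] + model["bias"]

# "Training notebook": fit once and write the model to disk.
X_train = np.array([[0.0], [1.0], [2.0], [3.0]])
y_train = np.array([1.0, 3.0, 5.0, 7.0])  # y = 2x + 1
model = fit_linear(X_train, y_train)
path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(path, "wb") as f:
    pickle.dump(model, f)

# "Submission notebook": load the saved model and predict without retraining.
with open(path, "rb") as f:
    loaded = pickle.load(f)
preds = predict(loaded, np.array([[4.0]]))
print(preds)
```

Only unpickle files you created yourself, since loading a pickle can execute arbitrary code.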

lapis pendant Jun 25, 2024, 4:14 PM

#

can you submit a notebook via Google Colab or is that only for files like submission.csv? I ran out of GPU and they won't let me submit with GPU enabled now

deft fox Jun 25, 2024, 4:32 PM

#

arctic glade Is there a way to download the dataset of this competition https://www.kaggle.co...

You have to join the competition before downloading the data - even after the competition is finished.

arctic glade Jun 25, 2024, 4:33 PM

#

deft fox You have to join the competition before downloading the data - even after the co...

i see, so now that the competition is over, there is no way to download the data?

deft fox Jun 25, 2024, 4:33 PM

#

arctic glade i see, so now that the competition is over, there is no way to download the data...

Join the competition in “Data” tab.

arctic glade Jun 25, 2024, 4:40 PM

#

deft fox Join the competition in “Data” tab.

Im not sure how to do that, Ive clicked the license, it redirected me to the rules tab, but there is nothing to accept, when i click download, the same things happen (i.e rule tab). Ive tried the cli script, i got 403 permission denied. Can you guide me on how to join the competition via the data tab? thanks

deft fox Jun 25, 2024, 6:09 PM

#

arctic glade Im not sure how to do that, Ive clicked the license, it redirected me to the rul...

It seems that particular competition is not set up for late submissions, so maybe there’s no way to join and access the data.

arctic glade Jun 25, 2024, 6:10 PM

#

deft fox It seems that particular competition is not set up for late submissions, so mayb...

roger that, thanks for the confirmation

stoic lion Jun 25, 2024, 11:19 PM

#

sorry to double post, but very urgent given two days before competition deadline on aimo: my team requires a dataset upload to submit our notebooks and it seems this is broken. can anyone help? submission deadline in 40 minutes

stray ledge Jun 26, 2024, 5:00 AM

#

Hi Everyone, need help. I am trying to submit my notebook for AIMO competition but every time I am getting "Notebook threw exception" for the submission. I have put most of the code in try, except blocks to catch any exception. I just want to know what exception where it is throwing, please help I have spent days and nights to come this far. Here is the link to my notebook. https://www.kaggle.com/code/kiritidesarkar/aimo-deepseekmath-db-solved-trainingquestions. There are only 2-3 days left for the competition and I am not able to get even a score because of this exception.

stray ledge Jun 26, 2024, 5:01 AM

#

stray ledge Hi Everyone, need help. I am trying to submit my notebook for AIMO competition b...

I have checked the logs but no exceptions I could see there

lost dove Jun 26, 2024, 9:41 AM

#

I am a beginner who just started learning and I had a doubt "Should we do imputation even when more than 50% of the values are missing ?"
if so what will be the best method?

tame aurora Jun 26, 2024, 10:11 AM

#

lost dove I am a beginner who just started learning and I had a doubt "Should we do imputa...

impute when you feel that the variable is extremely critical and imputation won't logically hamper your output in any way

#

but if a variable has 50% of its values missing i'd rather just drop it if it is insignificant anyways

#

try imputing and check the correlation with the dependent variable using
sns.heatmap(df.corr())
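
A NumPy-only sketch of the check described above: measure how much of a column is missing, impute it, and look at the correlation with the target. The arrays are made-up toy data, median imputation is just one possible choice, and the helper names are hypothetical. With pandas, recent versions of `df.corr()` may need `numeric_only=True` if the frame contains non-numeric columns.

```python
import numpy as np

def missing_fraction(column):
    """Share of entries in the column that are NaN."""
    return float(np.mean(np.isnan(column)))

def impute_median(column):
    """Return a copy with NaNs replaced by the median of the observed values."""
    filled = column.copy()
    filled[np.isnan(filled)] = np.nanmedian(column)
    return filled

age = np.array([22.0, np.nan, 35.0, np.nan, 41.0, 29.0])
target = np.array([0.0, 1.0, 1.0, 0.0, 1.0, 0.0])

frac = missing_fraction(age)
filled = impute_median(age)
corr = np.corrcoef(filled, target)[0, 1]
print(f"missing: {frac:.0%}, correlation with target after imputation: {corr:.2f}")
```

A high missing fraction combined with a weak correlation would support dropping the column, as suggested in the messages above.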

flint copper Jun 26, 2024, 5:09 PM

#

hey, I had a doubt related to the Titanic dataset. I have recently started learning ML. Can anyone tell me what columns I should target to give the required output?

#

I am not able to understand what output they want.

graceful axle Jun 26, 2024, 6:24 PM

#

who will be the winner?

stray mango Jun 26, 2024, 6:31 PM

#

flint copper I am not able to understand what output does they want?

the output it wants is a prediction, for each passenger in your test data, of whether they survived or not (0 or 1 in the Survived column)

flint copper Jun 27, 2024, 6:27 AM

#

stray mango the output it wants is, what was the percentage of women and men that survived o...

okay thanks

balmy yoke Jun 27, 2024, 9:20 PM

#

Hello, I have little experience in the world of data science, but I want to participate in a competition to test my skills and acquire new ones. Could you recommend one, please?

graceful axle Jun 28, 2024, 7:09 AM

#

How to get room direction in indoor photo/image.

stray mango Jun 28, 2024, 8:12 AM

#

balmy yoke Hello, I have little experience in the world of data science, but I want to part...

You can find beginner level problems on kaggle

#

Try to build models for those problems

#

It will surely help you gain a better understanding of model

broken nimbus Jun 28, 2024, 5:54 PM

#

Hi, Is there any structured course that gives proper directions to Kaggle beginners on building profile? or any mentor in the group? Kindly guide. TIA!

stray mango Jun 28, 2024, 6:40 PM

#

broken nimbus Hi, Is there any structured course that gives proper directions to Kaggle beginn...

Yes there is, check the beginner projects, they give guidance on how to build the model (ex: titanic)

#

You should be okay once you do these guided projects

broken nimbus Jun 28, 2024, 6:41 PM

#

not the single project, but the structured course

#

cz a lot of people are like me, new to kaggle but dont know how to build the profile. To start with, its the titanic project, but then whats next, as in a series of tasks

sour crystal Jun 29, 2024, 2:53 PM

#

Hi everyone, I’m new here and need some help. Can someone please explain what I need to do for this Skin Cancer competition in a simple way?

What do I need to do? - Develop a program to identify dangerous skin spots from images.
What language and tools should I use? - Use Python and tools like TensorFlow or PyTorch.
Which data should I use? - Use the Training and Test data provided in the competition files or data from outside.

Thanks a lot!

fervent jolt Jun 30, 2024, 12:45 PM

#

pheeww.. another newbie question: When I submit in a Notebook competition, does Kaggle simply replace test.csv with the hidden dataset? 🤔 My submission is taking too much time, so if that’s correct, I need to re-engineer my approach.

wild perch Jun 30, 2024, 1:47 PM

#

Hi everyone, do you know if there exists a dataset for fake (LLM-generated) online reviews in the German language? I am looking for something similar to https://osf.io/tyue9/ but in German.
Thanks!

OSF

Fake Reviews Dataset

The generated fake reviews dataset, containing 20k fake reviews and 20k real product reviews. OR = Original reviews (presumably human created and authentic); CG = Computer-generated fake reviews.
Hosted on the Open Science Framework

small musk Jun 30, 2024, 9:26 PM

#

When I try to get the code, I get the too many requests message

rocky lintel Jun 30, 2024, 10:48 PM

#

I'm looking for anyone interested in collaborating with me on an ML model for turning raster curves into vector (Bezier) curves. Is there a section on this Discord most appropriate for asking?

agile veldt Jul 1, 2024, 1:50 AM

#

I'm calculating pAUC_80 using the implementation found on the ISIC skin cancer competition, but it is not matching up with the outputted score, any ideas on how to fix this? I want to test my model's true value without wasting a submission

wintry shell Jul 1, 2024, 11:47 AM

#

Hello everyone, i have a question and i would really appreciate your assistance.
I have 2 data files of networking and IP addresses in .RR format (ex: myipv6add.RR, myipv6add2.RR) and i want to extract them into a MySQL file. How can i write a script in Python to do that?

wanton orchid Jul 2, 2024, 7:22 AM

#

The kaggle quota of GPU resets every week right?

fervent jolt Jul 2, 2024, 8:20 AM

#

fervent jolt pheeww.. another newbie question: When I submit in a Notebook competition, does ...

I'm sorry if my question is too basic in Kaggle, but I really can't find an answer. Any answer would be appreciated. >> When I submit in a Notebook competition, does Kaggle simply replace test.csv with the hidden dataset? 🤔 My submission is taking too much time, so if that’s correct, I need to re-engineer my approach.

uncut prism Jul 2, 2024, 7:26 PM

#

when will participants for kaggleX fellowship program be notified if they got accepted or not?

verbal crest Jul 2, 2024, 9:53 PM

#

wanton orchid The kaggle quota of GPU resets every week right?

Yep, it resets each week.

agile veldt Jul 3, 2024, 2:14 AM

#

fervent jolt I'm sorry if my question is too basic in Kaggle, but I really can't find an answ...

Are you training your model in your submission?

fervent jolt Jul 3, 2024, 3:01 AM

#

agile veldt Are you training your model in your submission?

Thanks for your comment. Yes. I'm working on the metric competition (USPTO-explainable-AI).

agile veldt Jul 3, 2024, 3:04 AM

#

If you can export your model you might want to do that

#

and import it into another notebook

#

so you arent training it everytime you submit

vale steeple Jul 3, 2024, 3:11 AM

#

hello guys, is there any source that specifically collects data on traffic?

quick crag Jul 3, 2024, 4:59 AM

#

hello i am working on projects across the web and participating in competitions, as well as starting out as a freelancer to enhance my approach in the industry and work on real-world data. DMs are open.

dapper yoke Jul 3, 2024, 3:21 PM

#

Hey guys, how do I access my GPU and TPU in my kaggle notebooks?

#

Do i need to verify my phone number or smth?

sly inlet Jul 3, 2024, 4:31 PM

#

dapper yoke Do i need to verify my phone number or smth?

yes

dapper yoke Jul 3, 2024, 4:33 PM

#

sly inlet yes

alright, thank you very much! and after that, you can modify the settings in the notebook, right?

sly inlet Jul 3, 2024, 4:33 PM

#

yes , you will get option for accelerator

dapper yoke Jul 3, 2024, 4:38 PM

#

thanks man

quick spruce Jul 3, 2024, 5:14 PM

#

hi, I recently got banned from my other account due to using prohibited code. How to get my account unbanned, I really learned my lesson now 😭

verbal hornet Jul 3, 2024, 6:04 PM

#

Hi, i need some support on 1 dataset i've created, because it's very biased

sly inlet Jul 3, 2024, 6:32 PM

#

quick spruce hi, I recently got banned from my other account due to using prohibited code. Ho...

what were you exactly doing ? and what is prohibited code ?

wispy eagle Jul 3, 2024, 9:10 PM

#

how do I team up with a friend?

verbal crest Jul 3, 2024, 9:34 PM

#

@wispy eagle There is a "team" tab that allows you to add friends to your team to work together. You might need to refresh after accepting the rules.

quick spruce Jul 4, 2024, 1:40 AM

#

sly inlet what were you exactly doing ? and what is prohibited code ?

hi, I git cloned facefusion from github, which is not allowed. Now I really want to get my account back (kaggle: imduckman), it has my other valuable code :(, how to contact support, I'm crying rn 😭

sour adder Jul 4, 2024, 10:04 AM

#

I have been working on an image segmentation task, but most of the notebooks I referred to use MSE as the loss function to train the model: they compare the generated mask and the ground truth using the MSE loss. Generally, IoU should be used for image segmentation, but when I tried that, my IoU loss stopped decreasing and stagnated at 90, while the MSE loss decreased quite well, down to 0.02. Can anyone suggest why this is happening? Any help or suggestion would be appreciated, thanks in advance
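
One possible reason the two numbers behave so differently, offered only as a guess since the notebook is not shown: when the foreground is a small part of the image, MSE can be low even for a mask that misses the object entirely, while an IoU-based loss stays high. The sketch below illustrates this with a soft (differentiable) IoU loss on a toy 8x8 mask; `soft_iou_loss` is a hypothetical helper and the exact formulation used in the original notebook is unknown.

```python
import numpy as np

def soft_iou_loss(pred, target, eps=1e-7):
    """1 - IoU computed on probabilities, so it stays differentiable."""
    intersection = np.sum(pred * target)
    union = np.sum(pred) + np.sum(target) - intersection
    return float(1.0 - (intersection + eps) / (union + eps))

def mse_loss(pred, target):
    return float(np.mean((pred - target) ** 2))

# Ground truth: 4 foreground pixels out of 64.
target = np.zeros((8, 8))
target[2:4, 2:4] = 1.0

all_background = np.zeros((8, 8))  # a mask that predicts nothing at all
perfect = target.copy()

print("all background:", mse_loss(all_background, target), soft_iou_loss(all_background, target))
print("perfect mask:  ", mse_loss(perfect, target), soft_iou_loss(perfect, target))
```

It may also be worth checking that the IoU is computed on sigmoid outputs rather than on thresholded masks, since thresholding gives no gradient to learn from.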

quick spruce Jul 4, 2024, 12:10 PM

#

could anyone help me on getting my Kaggle account back 😔, I've been so depressed and stressed... My kaggle account is imduckman, I got banned after using prohibited code. My account has lots of my valuable code that now I cannot access. I really realized my wrongdoing and learned my hard lesson now... I really apologize for my violation, I promise I would never violate the rules again. Thanks a lot

lapis bane Jul 4, 2024, 12:34 PM

#

hii everyone 👋, how do I upgrade to kaggle pro ? and I recently ran out of memory when deploying gamma model in my kaggle notebook, can someone tell how to get more memory from kaggle?

Thanks in advance!!

wooden halo Jul 5, 2024, 3:17 PM

#

Hi, please when is the next application to register as a mentee opening ?

hallow tundra Jul 5, 2024, 7:42 PM

#

how do i start with kaggle? I have done a machine learning course

fervent jolt Jul 6, 2024, 2:42 AM

#

agile veldt If you can export your model you might want to do that

@agile veldt Could you elaborate your explanation? I'm wondering how my notebook submission is scored. The host has hidden dataset. The hidden dataset should follow my algorithm on my notebook to get the results, and the results will be scored by the host's metrics. Then, the hidden dataset should replace test.csv in my notebook. Am I missing something in my logic?

high sigil Jul 6, 2024, 6:37 AM

#

uncut prism when will participants for kaggleX fellowship program be notified if they got ac...

did you get any info?

#

@wind silo Hello and sry for the ping :)
but, any updates on the results of KaggleX

uncut prism Jul 6, 2024, 6:46 AM

#

high sigil did you get any info?

no updates regarding KaggleX fellowship program acceptance or not.

graceful axle Jul 6, 2024, 7:49 AM

#

Hello, everyone
I want to create new similar style music from 100 ambient music.

#

Please help me.

stuck monolith Jul 6, 2024, 8:53 AM

#

hello guys, I have been doing ML for about 1 month and I have no problem understanding the models and maths

#

but whenever i start implementing it myself

#

on any datset

#

i just go blank

#

and dont know what to do

#

can someone please help ??

eager fossil Jul 7, 2024, 6:17 AM

#

Hello everyone, I am new to kaggle. I want to participate in competitions and started learning ml, but as a beginner I don't know how to participate in competitions etc. Please, can anyone guide me?

dapper yoke Jul 7, 2024, 4:02 PM

#

eager fossil Hello everyone,I am new to kaggle .I want to participate in competitions and sta...

use the free courses provided by kaggle

wind silo Jul 8, 2024, 2:17 AM

#

high sigil @wind silo Hello and sry for the ping :) but, any updates on the res...

Hi @high sigil, thanks for your message following up on the application status. We are in the process of reviewing applications. We will be notifying applicants at the end of July. Please visit kaggle.com/kagglex or our discussion board for updates.

Link to KaggleX discussion: https://www.kaggle.com/discussions/general/357233

KaggleX Program Q&A | Kaggle


eager fossil Jul 8, 2024, 7:51 AM

#

thanks @dapper yoke

real shore Jul 8, 2024, 3:20 PM

#

hello everyone, i'm looking for anyone interested in collaborating with me on an ML

dapper yoke Jul 9, 2024, 5:09 PM

#

How can you implement the YOLOv8 or YOLOv5 architecture in tensorflow for image classification?

obsidian bone Jul 9, 2024, 6:34 PM

#

quick spruce hi, I git cloned facefusion from github, which is not allowed. Now I really want...

where is it written that it's not allowed?

#

cause i've seen some notebooks git cloning the repo

shrewd scarab Jul 10, 2024, 1:34 AM

#

Hey guys, does anyone know if there is a data api that I can use to access my user data (ex: competitions, leaderboard rankings, notebooks)? I want to programmatically access my information so that I can display it to others without having to redirect them to kaggle.

quasi holly Jul 10, 2024, 3:30 AM

#

stuck monolith hello guys, I have been doing ML for about 1 month and I have no problem underst...

i had the same problem. do a few guided projects: start to solve, and if u can't, read through the solution entirely once, then implement it while looking at the solution.

stable dragon Jul 10, 2024, 11:38 AM

#

I get this error while downloading the data via API, any suggestions?

403 - Forbidden - You must accept this competition's rules before you'll be able to download files.

PS: I have already accepted the rules for the competition

halcyon island Jul 10, 2024, 6:09 PM

#

hello guys
my add-ons is not showing up in my kaggle notebooks
i have tried to sign out and sign in multiple times and created multiple notebooks, yet I am unable to add my secret keys

obsidian bone Jul 10, 2024, 6:29 PM

#

guys how do I use the attention mask like this when we have batches of sentences with different lengths?

#

like do I multiply attention mask with tokenized input, or do I input it separately to the transformer model?

#

context: I am not using Hugging face library, I built custom transformers based model
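
In the usual setup the mask is passed separately and applied to the attention scores before the softmax, so padded positions receive zero attention weight; it is not multiplied into the token ids. Below is a minimal single-head sketch in NumPy under those assumptions. `masked_attention` is a hypothetical function, the mask uses 1 for real tokens and 0 for padding, and a custom model would do the same thing with its framework's tensor ops.

```python
import numpy as np

def masked_attention(q, k, v, attention_mask):
    """q, k, v: (batch, seq, dim); attention_mask: (batch, seq), 1 = token, 0 = pad."""
    dim = q.shape[-1]
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(dim)   # (batch, seq, seq)
    key_mask = attention_mask[:, None, :].astype(bool)   # broadcast over queries
    scores = np.where(key_mask, scores, -1e9)            # hide padded keys
    scores = scores - scores.max(axis=-1, keepdims=True) # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 4, 8))        # batch of 2 sentences padded to length 4
mask = np.array([[1, 1, 1, 1],
                 [1, 1, 0, 0]])       # second sentence has 2 real tokens
out, weights = masked_attention(x, x, x, mask)
print(weights[1].round(2))
```

Rows that belong to padded query positions still produce outputs here; those are normally ignored later, for example by masking them out of the loss or the pooling step.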

tacit ibex Jul 10, 2024, 7:09 PM

#

How would i gain followers on kaggle and upvoted by people?

jovial leaf Jul 10, 2024, 8:06 PM

#

Hi, i'm relatively new to ML. And i always get this far when i make models (then i make the plot of predicted values vs actual values) but that's it. How does this apply to the real world? What tools do you use? Do you have documentation of that or recommendations?
Because if I go to an interview and show them my script, i'll just sit there and not know how i could implement it into something "real."

candid tinsel Jul 11, 2024, 3:03 AM

#

hey guys any attempt to use get_gcs_path() function just results in an error, does anyone know how to fix?

tacit ibex Jul 11, 2024, 6:14 AM

#

jovial leaf Hi, i'm relatively new to ML. And i always get this far when i make models (then...

Ok... use Flask. You create the model exactly like this, but the work should be in pipeline format, meaning you have to add logging to the code you've done so far. That's just one step closer.

tacit ibex Jul 11, 2024, 6:15 AM

#

tacit ibex Ohk...use flask in which you create model exactly like this but working should b...

And make projects that is for ml .. nothing more it have

#

https://github.com/Utkgitdev-07

GitHub

Utkgitdev-07 - Overview

Utkgitdev-07 has 19 repositories available. Follow their code on GitHub.

#

That's my GitHub follow me and see my diamond price prediction project

#

https://www.kaggle.com/utkarshyadav07

Utkarsh Yadav | Contributor

Kaggle profile for Utkarsh Yadav

#

Also follow me on kaggle please

#

Hey, can anyone help me with how to make my own new dataset?
I completed ML and I am very much familiar with things, but when making a dataset, how do we decide the columns and rows, and especially the features in it and their data values?
I am confused, so anyone who makes datasets, please HELP me

prime tusk Jul 11, 2024, 6:57 AM

#

Can anyone help me find some end-to-end data science industry project?
Mainly I'm looking for finding fraudsters/bill defaulters using their credit history

Or finding out key customer insights of a business store using all their customer transaction history
Mainly who their target customer is, who it could be in the future, and whom they should focus on

idle vortex Jul 11, 2024, 3:16 PM

#

I am not able to download the model file from the working directory

#

any fix for that ?

dull sky Jul 11, 2024, 5:49 PM

#

is there a way to early stop flaml automl? I did find hyperband and early stop, but not a standalone function

simple orchid Jul 11, 2024, 6:00 PM

#

Whenever I want to tune hyperparameter with keras tuner it raises exception saying "RuntimeError: Number of consecutive failures exceeded the limit of 3". Can anyone help me how to solve this problem?

next glade Jul 12, 2024, 3:29 AM

#

Hi, i was wondering if anyone knew any libraries that have a bot in a maze, and the bot has to try and find its way to a goal known that may or may not move, and is unknown, using a neural network?

quasi holly Jul 13, 2024, 6:40 AM

#

Hey guys, i am relatively new, i was thinking if it is possible to
https://www.kaggle.com/datasets/abcsds/pokemon/data
predict pokemon type 1 from its attributes with classical machine learning , i got 25.5 % (better than random guessing )
https://www.kaggle.com/code/anikeetgarg1/pokemon-type-prediction
is it possible? how? what did i do wrong?, can someone also know how to use mutual info?

Pokemon with stats

721 Pokemon with stats and types

#

thank you so much please tag me if there are any answer to the question

tacit ibex Jul 13, 2024, 10:58 AM

#

Can anyone help me how to make dataset..means I have idea about it and column name...but how can I find data for that column

dull sky Jul 13, 2024, 12:31 PM

#

is there an IDE that has conda and jupyter integration and works reasonably well? I tried pycharm community, but the notebook is a paid feature...

golden lotus Jul 13, 2024, 9:53 PM

#

#

Please Why is this not working please
i have successfully imported all the other documents for train and test

quasi holly Jul 14, 2024, 2:00 AM

#

golden lotus

you might not have run the cell above it

warm stream Jul 14, 2024, 4:05 AM

#

how do you guys find unclean datasets on kaggle? I am doing data cleaning on structured data and i am looking for dirty datasets like housing prices or loans

#

or price predictions

#

it can be either classification or regression

subtle nexus Jul 14, 2024, 7:18 AM

#

Hey, could you advise me on your favorite python libraries to quickly try out different model types (SVM, RandomForest, etc) on a dataset? Thank you!

plush oyster Jul 14, 2024, 11:35 AM

#

Sklearn
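
A short sketch of how scikit-learn can be used to compare several model types quickly with cross-validation. The synthetic dataset, the three models, and the settings (3 folds, 50 trees) are arbitrary choices for illustration, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in for a real dataset.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

models = {
    "svm": SVC(),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "logistic_regression": LogisticRegression(max_iter=1000),
}

# Mean accuracy over 3 cross-validation folds for each model.
results = {name: cross_val_score(model, X, y, cv=3).mean() for name, model in models.items()}

for name, score in sorted(results.items(), key=lambda item: -item[1]):
    print(f"{name}: {score:.3f}")
```

Because every estimator shares the same `fit`/`predict` interface, adding another model type is a one-line change to the dictionary.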

lone roost Jul 14, 2024, 1:04 PM

#

hi, i am new to kaggle. Can someone tell me how to make a comment in discussion?
Whenever i open any discussion page, i didn't have the permission to make a comment.

maiden imp Jul 14, 2024, 4:42 PM

#

can i give kaggle competition on my own? no team.

golden lotus Jul 14, 2024, 9:50 PM

#

quasi holly you might not have run the cell above it

i did

quasi holly Jul 14, 2024, 11:52 PM

#

golden lotus i did

The issue cited is that train_data is not defined
Either variables name is wrong or
You might not have run all the cells after you edited them

I don't know if there is any other reason that could happen, try rerunning all the cells

real ravine Jul 15, 2024, 5:32 PM

#

Where can I better understand how a notebook submission should look like for a competition? Thanks ❤️

earnest shell Jul 15, 2024, 6:06 PM

#

Hi everyone, I'm stuck on a part of my code, I need to predict the price of gold in India.
I chose to use the ARIMA model, but for some reason the value is repeated after the ninth index.

I already tried:

Put the ''Date'' column in the index.
Keep the date column in the dataframe.
Turned the ''Price'' column into a list, and it didn't work.

Could someone help me?
PS: Sorry for my bad English, I'm Brazilian, and currently I don't have time to become more fluent

quasi holly Jul 15, 2024, 9:22 PM

#

real ravine Where can I better understand how a notebook submission should look like for a c...

look at other notebook how they submit it, i do that

stuck monolith Jul 16, 2024, 6:26 AM

#

#❓┊ask-a-question guys I do not have a master degree is it good for me to continue pursuing ML (I am currently 2nd year b.tech) ??

#

are there any jobs I can do without a masters degree?

real ravine Jul 16, 2024, 8:56 AM

#

quasi holly look at other notebook how they submit it, i do that

Thanks for the answer. I've been looking through notebooks under the "Code" section of the competition, but I don't get how I could know if this is a submission notebook or just some general exploration or something else. Is there any way to know what an actual submission would look like?

marble raptor Jul 16, 2024, 12:33 PM

#

real ravine Thanks for the answer. I've been looking through notebooks under the "Code" sect...

most notebooks have a score attached to them; even when they don't, I have observed that most notebooks are submission notebooks

real ravine Jul 16, 2024, 12:34 PM

#

Thanks for answering. I think my question is really this:
What is the notebook submission supposed to have?
Does it have to load data, train a model, and write results?
Many thanks

marble raptor Jul 16, 2024, 2:00 PM

#

real ravine Thanks for answering. I think my question is really this: What is the notebook s...

in getting started competitions you will see a sample submission.csv already loaded. a file in that same format is all you have to submit, nothing else

#

you don't submit the notebooks, just that csv file containing the predictions
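
As a rough sketch of what "the same format" means: the snippet below writes a two-column file with one row per test id. The column names `PassengerId` and `Survived` are the ones used by the Titanic sample file; other competitions use different headers, so treat them as an assumption and copy the header from the competition's own sample submission. The helper name `write_submission` and the ids and predictions are made up for illustration.

```python
import csv
import io


def write_submission(ids, predictions, out, id_col="PassengerId", target_col="Survived"):
    """Write one row per test id, using the same two columns as the sample file."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([id_col, target_col])
    for row_id, pred in zip(ids, predictions):
        writer.writerow([row_id, pred])


# Hypothetical ids and predictions, only to show the layout.
buffer = io.StringIO()
write_submission([892, 893, 894], [0, 1, 0], buffer)
submission_text = buffer.getvalue()
```

To produce a real file, pass an open file handle (for example `open("submission.csv", "w", newline="")`) instead of the in-memory buffer.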

real ravine Jul 16, 2024, 2:51 PM

#

But now I'm trying to do the ISIC competition with friends, so I guess I need to submit a notebook so that I can generate predictions on the hidden test set

clear crescent Jul 17, 2024, 8:40 AM

#

hey. i uploaded my prediction csv file. how can i know if it is ok or not? how do i check?

marble raptor Jul 17, 2024, 8:51 AM

#

real ravine But now I'm trying to do the ISIC competition with friends, so I guess I need to...

you predict on the test data in your notebook itself, no need to submit a notebook, as I said

pulsar dagger Jul 17, 2024, 8:59 AM

#

hey everyone, I'm having trouble verifying my account using a phone number, I actually tried two of them but in both I get that "too many requests" orange message, and I've tried more than once in different times of the day but still to no avail, any help would be highly appreciated 🙏

real ravine Jul 17, 2024, 10:23 AM

#

marble raptor you predict on the test data in your notebook itself, no need to submit a notebo...

Thanks for the constant answers. I'm still dumbfounded on how it's possible to predict on the test data and generate responses if that data is hidden from me.
Is there a function that loads the test data?
Thanks so much

golden lotus Jul 17, 2024, 7:36 PM

#

#

Please why am i getting this error

upper panther Jul 17, 2024, 8:08 PM

#

Hi guys, I just submitted my first Titanic prediction with the tutorial, but I am wondering where it goes from here. I do not think it is over, right?!? tks

verbal crest Jul 18, 2024, 7:01 AM

#

upper panther Hi guys I just submit the first prediction titanic with the tutorial, but I am w...

The Titanic competition runs forever since it's a tutorial. After making your first submission you should try to improve your score by trying different techniques, then you can move on to other competitions.

verbal crest Jul 18, 2024, 7:06 AM

#

real ravine Thanks for the constant answers. I'm still dumbfounded on how it's possible to p...

The best way to figure this out is to look at what other public notebooks do to make submissions: https://www.kaggle.com/competitions/isic-2024-challenge/code

ISIC 2024 - Skin Cancer Detection with 3D-TBP

Identify cancers among skin lesions cropped from 3D total body photographs

celest sphinx Jul 18, 2024, 5:26 PM

#

Hey hey hey

#

Ive been trying to apply quantile loss function to my xgboost regression but something weird is happening

#

#

The 0.05 quantile seems to be working fine, but the 0.5 is wayyy off mark and the 0.95 is not even trying

#

(On the picture i made the xgboosts overfit the data just to confirm each quantile was doing its job properly)

#

And getting a flatline when trying to overfit tells me something is weird

celest sphinx Jul 18, 2024, 6:00 PM

#

Ok i think i figured out the problem: the gradient of the quantile loss is always the same value, so if you're doing badly there is no incentive to go up or down, and the xgboost is just stuck at 0

#

So ill try to modify the quantile loss function so it knows it is getting closer
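
A minimal sketch of the effect described above, not the actual fix used here (the chat does not say what the modification was). The plain pinball (quantile) loss has a gradient that depends only on which side of the target the prediction is on, and a Hessian of zero everywhere. The second function is one possible smoothing, an assumption for illustration: it blends the two gradient values linearly inside a band of width `delta`, which gives a positive Hessian near the target. Both function names and `delta` are hypothetical.

```python
def pinball_grad_hess(y_true, y_pred, q):
    """Gradient and Hessian of the pinball (quantile) loss w.r.t. the prediction."""
    residual = y_true - y_pred
    grad = -q if residual > 0 else 1.0 - q
    hess = 0.0  # piecewise-linear loss: no curvature anywhere
    return grad, hess


def smoothed_grad_hess(y_true, y_pred, q, delta=1.0):
    """Same loss, with the kink blended linearly over |residual| <= delta."""
    residual = y_true - y_pred
    if residual > delta:
        return -q, 0.0
    if residual < -delta:
        return 1.0 - q, 0.0
    grad = (1.0 - q) - (residual + delta) / (2.0 * delta)
    return grad, 1.0 / (2.0 * delta)


# Being off by 100 or by 1 produces exactly the same signal.
far = pinball_grad_hess(100.0, 0.0, 0.95)
near = pinball_grad_hess(1.0, 0.0, 0.95)
```

A custom XGBoost objective returns a gradient and a Hessian per row, so an all-zero Hessian gives the booster very little to work with. Recent XGBoost releases (2.0 and later, as far as I know) also ship a built-in `reg:quantileerror` objective, which may be simpler than a hand-written one.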

celest sphinx Jul 18, 2024, 6:52 PM

#

Now it works well ☺️

golden lotus Jul 18, 2024, 7:07 PM

#

golden lotus

still waiting for help

celest sphinx Jul 18, 2024, 10:39 PM

#

golden lotus still waiting for help

Idk, it looks like you're accessing the kernel multiple times or something, but i think i recognize the kaggle interface, so it has to do with your connection to their website

#

ChatGPT works very well when you copy-paste error messages into it

#

Give it a shot

gusty hound Jul 19, 2024, 8:01 AM

#

hello everyone, I am planning to go to university, but I can’t decide on the program, can you tell me how important mathematics is in data science and how many thousands of hours it is best to spend on it in order to study it to the required level to work as a data scientist. Thank you very much in advance

stable dragon Jul 19, 2024, 8:08 AM

#

Data Science is all about Mathematics

clever bay Jul 19, 2024, 3:37 PM

#

In order to work in data science, is there a minimum of what skills you must have?

#

Also is freelance data science a thing?

winter jay Jul 21, 2024, 4:55 PM

#

Hi, guys! Do I have to mention on the Kaggle forum that I'm using external data? I remember there was an External Data Thread in every competition but not anymore.

fierce canopy Jul 21, 2024, 6:11 PM

#

Hi, how can I have more space on Kaggle?

verbal crest Jul 22, 2024, 8:11 PM

#

winter jay Hi, guys! Do I have to mention on the Kaggle forum that I'm using external data?...

You should check in each competition specifically for the rules that apply to it.

stray lichen Jul 23, 2024, 2:03 PM

#

in a project I need to get the following api keys.
LIVEKIT_API_KEY=...
LIVEKIT_API_SECRET=...
DEEPGRAM_API_KEY=...
OPENAI_API_KEY=...
I have difficulty getting the LIVEKIT_API keys. how can I create one?

sly inlet Jul 23, 2024, 3:51 PM

#

i believe you know they are not free

stray lichen Jul 23, 2024, 5:14 PM

#

sly inlet i believe you know they are not free

Aside from openai, the other two are free

maiden imp Jul 24, 2024, 11:47 AM

#

Not related to kaggle

https://www.youtube.com/watch?v=gDF_qGzYEYQ&list=PLqXS1b2lRpYTUHPp2MYkgXS7v6_qA-JsF&index=5

guys i was thinking of replicating this project on my laptop, will my 3050 6gb be enough?

YouTube

Python Simplified

Create GUI App with Stable Diffusion, Docker, and Flask - Beginner ...

In this simple coding tutorial, you will learn how to make your own Generative AI application with Stable Diffusion, Docker, and Flask. The app will take in a user-provided text prompt and convert it into high-resolution images. (Total size of 2048 x 2048 pixels! 😱)

We will also dive into Docker Init, Diffusers, CUDA, FreeU, EDSR, OpenCV, and D...

compact snow Jul 24, 2024, 2:55 PM

#

can I share my kaggle profile updates in #💬┊general ? if not here .. then where should I post them..? thanks..!

tulip surge Jul 24, 2024, 3:48 PM

#

bruh did u do this project

shy flume Jul 24, 2024, 10:05 PM

#

I created an ml model using imagenet v2 on a 12 class trash dataset and since it has achieved a high accuracy, I want to publish it as a notebook. What is, in general, the best format for a kaggle notebook?

lucid summit Jul 25, 2024, 2:01 AM

#

shy flume I created an ml model using imagenet v2 on a 12 class trash dataset and since it...

i would look at the top notebooks for that dataset on kaggle or of a similar type, but it really depends: some people go heavy on the EDA and others prefer a heavy modeling focus with little commentary but great metrics

dusty siren Jul 25, 2024, 5:26 AM

#

hey! I am doing this little project to help detect cardiac arrest and cant find agonal breath datasets anywhere... any tips?

finite bridge Jul 25, 2024, 2:27 PM

#

Hey Kagglers, I'm currently doing the Guided Tutorial for the Petal to the Metal competition. I understand what the error image is saying but I don't understand the long string of letters and numbers that comes after the "object at".

The image without the error message is where I'm assuming the error is pointing me to, but here's the problem: The error wants me to use the DefaultDistributionStrategy but I need to use the TPUStrategyV2 in order for this guided tutorial to actually be meaningful. I have no idea how to make the distribution strategies the same but all I know is that I need to keep using the TPUStrategyV2. Help would be appreciated 🙏

whole gazelle Jul 26, 2024, 1:58 AM

#

🙂Does Kaggle offer an internship or pretraining program like Revature for job placement?🙂

obsidian pulsar Jul 26, 2024, 2:03 AM

#

finite bridge Hey Kagglers, I'm currently doing the Guided Tutorial for the Petal to the Metal...

you're correct that you need to use TPUStrategyV2 instead of DefaultDistributionStrategy. To use TPUStrategyV2 ,
you can create an instance of it and pass it to the distribute method of your dataset.

obsidian pulsar Jul 26, 2024, 2:03 AM

#

dusty siren hey! I am doing this little project to help detect cardiac arrest and cant find ...

PhysioBank or CAD

finite bridge Jul 26, 2024, 2:55 AM

#

obsidian pulsar you're correct that you need to use TPUStrategyV2 instead of DefaultDistribution...

thank you so much asao. although, i'm not sure how i would implement this. do you have any example code i could use or how exactly i would do this?

dusty siren Jul 26, 2024, 3:03 AM

#

obsidian pulsar PhysioBank or CAD

hi! thanks for the quick reply. i checked physiobank but i'm kinda confused about what is on the website… it showed one result, but when downloaded it was a text file of just random text… also not sure what CAD is

maiden imp Jul 26, 2024, 4:43 AM

#

Guys, your views on this book? For a beginner? What should be the prerequisites?

sly inlet Jul 26, 2024, 5:00 AM

#

the only drawback i felt for this book is it uses tensorflow which is pain in the ass

#

try this free pdf version of book, and see if it fits your needs
https://udlbook.github.io/udlbook/

maiden imp Jul 26, 2024, 5:04 AM

#

sly inlet try this free pdf version of book , and see if its fits your need https://udlboo...

Thanks buddy.

brisk cargo Jul 27, 2024, 9:21 AM

#

"Notebook Threw Exception" Error

Hi all, I’m seeing a "Notebook threw exception" error after submitting my notebook for the competition. It runs fine locally. Has anyone encountered and resolved this issue? Any tips would be appreciated!

Thanks,

tender galleon Jul 27, 2024, 10:12 AM

#

anybody knows the reason for this ? I'm doing a simple project from a book I'm reading and this keeps happening which makes running any cell take way more time than it should

clever bay Jul 28, 2024, 2:48 AM

#

2 things:
Do you recommend using Kaggle learn
Is it better do the notebook first and use the learn page for reference or just read the page and then do the notebook

quasi holly Jul 28, 2024, 3:28 AM

#

maiden imp Guys your views on this book? For a beginner? What should be the prerequisites?

Hey, i learnt machine learning via kaggle free courses then went for this book. this book goes into a lot of depth on each topic
Prerequisites - Pandas, NumPy, Matplotlib. you will struggle with ML as a whole if you do not master these 3 topics
Just master Pandas and Matplotlib, then go for it

modern trellis Jul 28, 2024, 11:21 AM

#

Hey, I have a few questions regarding hyperparameter tuning for kaggle contest models:

Where do you tune your hyperparameters? Locally or on kaggle notebooks (or maybe google colab?)?
How long does it generally take? How to decrease tuning time?
What's the best library for hyperparameter tuning? Optuna?

plush cairn Jul 28, 2024, 2:54 PM

#

Hello everyone, hope you are well.
Is there a way to import a public github repository to my Kaggle notebook environment?

dusty mauve Jul 28, 2024, 3:49 PM

#

Hi, does anyone know how to load in a pretrained image classification model in kaggle ?

heady karma Jul 28, 2024, 5:44 PM

#

Hello everyone,

I need help with this project if anyone can help it will be very helpful.

Here is the Link - https://www.kaggle.com/discussions/questions-and-answers/522860

Help Needed: Improving Time Series Prediction Accuracy with Ensemble Models.

vivid sand Jul 29, 2024, 2:36 PM

#

Hey all, this is a long shot, but is anyone willing to have a discussion on ethics in data science? This can range anywhere from privacy to bias. And just full disclosure, I'm genuinely interested in ethics, but this also happens to be part of a data ethics course I'm taking as part of my grad program. Any help or direction is appreciated! I've reached out on many outlets and am not getting many hits, unfortunately

fickle trail Jul 29, 2024, 5:17 PM

#

vivid sand Hey all, this is a long shot, but anyone willing to have a discussion on ethics ...

Can you tell me more about data ethics?

#

This is a new topic to me

vivid sand Jul 29, 2024, 6:08 PM

#

I'm not the best person as I'm still taking the class hehe 😅 but from what I know the name pretty much explains it: any ethical issues that may arise within the field of data science would count. The examples I mentioned seem to be pretty wide-ranging. The first one has to do with not infringing upon people's privacy, with some examples being health data (HIPAA), Amazon Echo potentially listening to private conversations, Apple accessing biometric data such as face scans, etc.

verbal crest Jul 29, 2024, 6:09 PM

#

In case you didn't know, Kaggle also has a short course with some reading and practical exercises on AI Ethics: https://www.kaggle.com/learn/intro-to-ai-ethics

Learn Intro to AI Ethics Tutorials

Explore practical tools to guide the moral design of AI systems.

#

(This is just a subset of the broader topic of ethics in data science)

vivid sand Jul 29, 2024, 6:14 PM

#

Thank you Myles. I should have mentioned that I'm looking to interview a data science professional about ethical dilemmas they've faced in their work as part of a homework assignment. I know this is a bit of shameless promotion on my part, but I've been asking around and I'm really unsure of where else to find interview candidates 😓

finite bridge Jul 30, 2024, 4:44 PM

#

Hey Kagglers, I'm currently doing the Guided Tutorial for the Petal to the Metal competition. I understand what the error image is saying but I don't understand the long string of letters and numbers that comes after the "object at".

The image without the error message is where I'm assuming the error is pointing me to, but here's the problem: The error wants me to use the DefaultDistributionStrategy but I need to use the TPUStrategyV2 in order for this guided tutorial to actually be meaningful. I have no idea how to make the distribution strategies the same but all I know is that I need to keep using the TPUStrategyV2. Help would be appreciated 🙏

heady patio Jul 30, 2024, 9:31 PM

#

Hi everyone! I want to make a facial emotion expression detection model with deep learning or machine learning. I have made a machine learning model that gives a 60 accuracy rate. The model I tried was XGBoost. If you know a better deep learning model, can you suggest one? The main question I need answered is how to extract features from faces, like how your eyebrows get closer together or your mouth moves up a bit when you are angry.

magic gate Jul 31, 2024, 7:16 PM

#

Has anyone managed to run Co-DETR (https://github.com/Sense-X/Co-DETR) or DiffusionVID (https://github.com/sdroh1027/DiffusionVID) on colab? They are object detection models. I haven't been able to run them and can't find any way to do so. I wanted to see if anyone in the world has managed it lol

GitHub

GitHub - Sense-X/Co-DETR: [ICCV 2023] DETRs with Collaborative Hybr...

[ICCV 2023] DETRs with Collaborative Hybrid Assignments Training - Sense-X/Co-DETR

GitHub

GitHub - sdroh1027/DiffusionVID: Official Repository of the paper "...

Official Repository of the paper "DiffusionVID: Denoising Object Boxes with Spatio-temporal Conditioning for Video Object Detection" - sdroh1027/DiffusionVID

viral grove Aug 1, 2024, 4:55 PM

#

Need help answering this question please: https://www.quora.com/unanswered/I-need-help-evaluating-my-options-of-a-masters-degree-in-data-science-artificial-intelligence

Quora

I need help evaluating my options of a master's degree in data scie...

1 person wants answers to this question. Be the first to answer.

jovial leaf Aug 1, 2024, 10:07 PM

#

Is there a way for 2 people to edit a Kaggle notebook live? Something similar to Google Colab
I invited someone to edit, but he doesn't see my changes unless i hit save version, which isn't 100% live

plucky spire Aug 1, 2024, 11:46 PM

#

jovial leaf Is there a way for 2 people to edit a Kaggle notebook live? Something similar to...

You can use a Jupyter notebook in PyCharm and invite the other person, you will see everything live and both can execute code

jovial leaf Aug 1, 2024, 11:48 PM

#

plucky spire You can use a Jupyter notebook in PyCharm and invite the other person, you will ...

I finally used LiveShare in Vscode. But what you told me still works for me, ty

clever bay Aug 2, 2024, 6:36 AM

#

I've learned we use pre-trained models when making convolution networks in Tensorflow, is this the same in PyTorch

#

And how long does it take to train a CNN in Torch

honest vortex Aug 2, 2024, 12:42 PM

#

Hello. I am working on a problem where I need to change the labels (of the training data) in my .png files from the original 0 to 255 range to 0 and 1. I am trying to use an ML model that expects labels in 0/1 form. I have tried many approaches, such as thresholding and using numpy, but I couldn't figure it out. Is there a specific way to do this?

fossil vapor Aug 2, 2024, 1:15 PM

#

honest vortex Hello. I am working on a problem where i need to change the labels(of training d...

I am not aware of any specific function but you can divide all the labels by 255 and get the new values which will be in the range 0 to 1
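
A small NumPy sketch of this idea, assuming the mask is already loaded as an array of values in 0..255 (image loading is left out). Thresholding is used instead of plain division so that stray gray pixels also end up as exactly 0 or 1; the threshold of 127 and both helper names are assumptions for illustration. A 0/1 mask will look almost black in an image viewer because 1 is nearly the darkest of 256 gray levels, so the second helper scales it back up for display only.

```python
import numpy as np


def binarize_mask(mask, threshold=127):
    """Map a 0..255 mask to 0/1 labels: pixels above the threshold become 1."""
    return (np.asarray(mask) > threshold).astype(np.uint8)


def to_displayable(binary_mask):
    """Scale 0/1 labels back to 0/255 so the mask is visible in an image viewer."""
    return np.asarray(binary_mask, dtype=np.uint8) * 255


# Tiny made-up mask: 255 marks a labelled pixel, 0 is background.
mask = np.array([[0, 255, 255], [0, 0, 255]], dtype=np.uint8)
labels = binarize_mask(mask)      # feed this to the model
preview = to_displayable(labels)  # look at this one
```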

honest vortex Aug 2, 2024, 1:55 PM

#

fossil vapor I am not aware of any specific function but you can divide all the labels by 255...

Thanks for the reply. I have tried this before and it makes my image completely black, making my labels (which are in white) not visible. I have noticed that if I change the labels to numbers like 50, 100 etc. my image labels, which are white, get a bit less visible but I can still see them. But when I reach 10 and below it becomes completely dark

#

I'm working on a retinal vessel segmentation problem and this is one of my eye images with its labels

pale lark Aug 2, 2024, 3:44 PM

#

Hey, I have an issue here. I tried to use a dictionary, but it keeps raising a key error even though I actually defined " 6" in my dictionary. Can anyone help?

thick terrace Aug 3, 2024, 11:02 AM

#

how do machine learning models handle nan values in target prediction in the case of decision trees? i don't understand, imagine the root of a tree in a forest doesn't have a value

tribal grail Aug 3, 2024, 6:14 PM

#

Hey, I just installed CUDA, and as I understand it, CUDA lets your neural network use the GPU's power while training. My GPU is being utilized (20-30%), but it also shows that my CPU is being used fully at 100%. Is it supposed to be like that, or have I done something wrong while installing CUDA?
I am running on VS Code btw

finite bridge Aug 3, 2024, 8:51 PM

#

Hey Kagglers, I was looking at the list of competitions on Kaggle and I was wondering how to bridge the gap between only taking Kaggle's micro courses and actually entering real competitions with cash prizes and knowing how to solve the problem of competitions.

verbal crest Aug 5, 2024, 4:11 AM

#

finite bridge Hey Kagglers, I was looking at the list of competitions on Kaggle and I was wond...

A good progression might look like:

Courses
Titanic Competition + Other Beginner Competitions
Playground series competition
Full prize competition

Each competition will require learning new skills and trying new things. Many people jump right to prize competitions and learn that way, there is no one answer, it depends on your skills, learning style, and comfort levels.

haughty wyvern Aug 5, 2024, 4:00 PM

#

Guys, I'm trying to fit my model, but my CPU is always at 100%. How can I fix it?

empty mural Aug 6, 2024, 6:00 AM

#

haughty wyvern Guys, I'm trying fit my model but, my CPU is always 100% . How can I fix it?

You can try changing your accelerator to a P100 or T4 GPU, depending on your work.

Go to settings --> accelerator --> choose accelerator

I think it'll work

haughty wyvern Aug 6, 2024, 10:31 AM

#

Tks

mint bough Aug 8, 2024, 3:06 PM

#

Is anyone willing to take a few newbies under their wing and teach us and work with us in a group of 4?

delicate hill Aug 8, 2024, 4:29 PM

#

can anyone guide me on how to find questions or prompts that can help with analysis in kaggle datasets?

dull sky Aug 9, 2024, 1:19 PM

#

hey! how do I access the output files after a commit?

thick terrace Aug 9, 2024, 2:34 PM

#

navigate in the file explorer to where your csv file is saved

#

i use the everything app

#

Q: can i use more submissions?

dull sky Aug 9, 2024, 3:44 PM

#

@thick terrace at kaggle? okay, thanks

thick terrace Aug 9, 2024, 3:45 PM

#

i was talking about your pc, but if you run the notebook on kaggle, sure, the directory has them

dull sky Aug 9, 2024, 3:45 PM

#

Right, I found it. It turns out my previous attempt failed before it saved the model results.

#

The current one worked.

#

If I commit 2 (or more) notebooks at the same time, do they share the same cpu?

brittle bison Aug 11, 2024, 3:49 PM

#

With regards to Kaggle competitions, I am a little confused about the given test set. Would it be bad to train the models on the training set, then test each model on the test set, submit it, and choose the submission with the lowest test error? I understand we don't want to overfit the test data, but the given test data isn't the "true" test data set anyway - the true one is hidden. So are we free to use it?

placid galleon Aug 11, 2024, 6:56 PM

#

hey, i wanted this dataset, DFL - Bundesliga Data Shootout, but it says that it was for the competition only. is there some way to access it? i want to make a computer vision project

deft fox Aug 12, 2024, 5:23 PM

#

brittle bison With regards to Kaggle competitions, I am a little confused about the given test...

Class labels are not available for test data. One can try pseudo-labeling but that increases the likelihood of overfitting.

brittle bison Aug 12, 2024, 6:17 PM

#

deft fox Class labels are not available for test data. One can try pseudo-labeling but th...

But doesn't kaggle score the submission (as the scores are shown on the leaderboard). So aren't the class labels used for the scoring?

reef meteor Aug 12, 2024, 9:24 PM

#

Hi everyone. I just completed the Titanic competition and am ready to make the submission. However, the help tutorial doesn't seem to match the current UI. Is there a way to actually submit directly from my notebook anymore?

radiant breach Aug 14, 2024, 9:14 AM

#

Hi everyone, I have a dumb question:
the isic-2024 challenge requires internet to be off, right? then if i want to install some libraries, like 'pip install <library>', that won't be possible. what could be the alternative?

sullen lintel Aug 14, 2024, 3:50 PM

#

reef meteor Hi everyone. I just completed the Titanic competition and and ready to make the...

What is the accuracy of your model?

dull sky Aug 15, 2024, 5:46 AM

#

Hey there! I'm looking for EDA books. So far I found one decent, albeit old, one. Could you recommend newer books on the topic?
Exploratory data analysis by Tukey, John W. (John Wilder), 1977

#

there are some medium articles, but I'm looking for in-depth why, how, what kind of books

sacred sonnet Aug 17, 2024, 2:08 AM

#

Guys how do you use hugging face's models in a local environment

deft fox Aug 17, 2024, 3:59 AM

#

brittle bison But doesn't kaggle score the submission (as the scores are shown on the leaderbo...

Class labels are available to Kaggle, but not to us. We only see a score on a fraction of test data.

shy flume Aug 17, 2024, 4:42 AM

#

hey guys i have a question about a dnn implementation

I have the following backward function:
`
import numpy as np

def backward(layer, dA_prev):
    # Calculate dZ by calling the activation_backwards() method from your Layer class
    # and pass to it dA_prev
    dZ = layer.activation_backwards(dA_prev)

    m = dA_prev.shape[1]

    # Calculate dW and db using the input to this layer (ie the activation of the previous layer).
    layer.dW = 1/m * np.dot(dZ, layer.input.T)
    layer.db = 1/m * np.sum(dZ, axis=1, keepdims=True)
    layer.db = np.squeeze(layer.db)

    # Compute gradient for the previous layer
    dA_prev = np.dot(layer.weights.T, dZ)

    return dA_prev

`
but I cant seem to pass the doctests
im wondering if anyone knows whats wrong. if u need extra information in order to get a gauge of the issue, pls ask.
Thanks!

toxic hollow Aug 20, 2024, 7:12 AM

#

Hi, everybody.
I have a question
I need to extract the abstracts of papers without using GPT4; I have to rely on local resources.

from py_pdf_parser.loaders import load_file
from py_pdf_parser.components import ElementOrdering

file_path = 'JPM-2022-Harvey-25-46.pdf'

document = load_file(
    file_path, element_ordering=ElementOrdering.RIGHT_TO_LEFT_TOP_TO_BOTTOM
)

So I parsed the pdf using py_pdf_parser, and I'm going to merge the pieces until I obtain the complete abstract.
Now I am trying to use embedding models for this, but that doesn't work well.
If somebody has a solution for this, please help me.
Thanks!

#

If I have to use LLMs, the model size should be under 2GB.

jade jetty Aug 20, 2024, 9:57 AM

#

Hi all,

Does anyone have information about whether DEFCON will launch a CTF competition on Kaggle this year?

toxic hollow Aug 20, 2024, 4:35 PM

#

Hi, everybody.
When implementing RAG, what are the challenges and their solutions?
If you can provide references, I would be very thankful.
I have read papers that use a BERT model to retrieve the necessary data.
I want to know whether that is still useful for RAG now. I think ChatGPT4 is perfect for this task.
So I want to know about the use of LLMs and ML models in the retrieval process.

brittle bison Aug 21, 2024, 3:14 AM

#

Is there any benefit to multiple submissions to a competition? Isn't there a danger of overfitting if you are going to choose your best performing submission on the leader board? Or do people do it as somehow an approximation of generalization error?

feral coral Aug 22, 2024, 2:36 PM

#

hi guys, does anyone have any experience with roboflow models? I'm doing an ML project for the first time and I was instructed to make my dataset and train my model there, but I'm lost on how to proceed further. if anyone has any experience, please lmk so I can consult you. thank you in advance

hushed dirge Aug 22, 2024, 5:48 PM

#

Hi Everyone, what would be the best data science learning path to get a job as Junior Data Scientist Role ?

deft fox Aug 23, 2024, 7:17 PM

#

brittle bison Is there any benefit to multiple submissions to a competition? Isn't there a dan...

There are always benefits to multiple submissions if one knows how to use the information properly. There is no solution that fits all scenarios, but in many cases it is possible to figure out how well public leaderboard (LB) scores correlate with hidden scores. If they do, then one can trust the LB. If not, it is necessary to develop a rigorous cross-validation (CV) scheme and stick to that. Come to think of it, it never hurts to have a good CV scheme, but sometimes we can trust the outside information as well.
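
The message does not say how to measure this. Since hidden scores are not visible while a competition is running, one proxy people often use (an assumption here, not something stated above) is to check how closely local CV scores track public LB scores across a handful of submissions. The sketch below computes a plain Pearson correlation in pure Python; the scores are invented purely for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for five submissions, made up for illustration only.
cv_scores = [0.780, 0.791, 0.803, 0.812, 0.820]
lb_scores = [0.771, 0.785, 0.792, 0.806, 0.811]
r = pearson(cv_scores, lb_scores)
```

A value of `r` close to 1 suggests that improvements in CV tend to show up on the public LB as well; a weak or negative value is a hint to rely on the CV scheme rather than chase the LB. With only a few submissions the estimate is noisy, so it is a sanity check rather than proof.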

deft fox Aug 23, 2024, 7:20 PM

#

hushed dirge Hi Everyone, what would be the best data science learning path to get a job as J...

If there was a simple answer to that question we would have millions of JDS people out there. There is no single path, nor the best path, to anything in life. For some people learning DS will come as part of their regular work. Others will get a CS degree or take online courses. Yet others will jump into Kaggle competitions and learn DS by copying what other people are doing.

shadow arrow Aug 24, 2024, 5:03 PM

#

hi, i've been working on playground series churn dataset classification
i have done basic data cleaning and OrdinalEncoding
using XGBclassifier with 150 estimators gives me an accuracy of around 75
how can i increase the accuracy of the model

#

i have also tried using PyTorch
it shows a train accuracy of 78%
but when i look at the actual output it's either 0 or 1 for all the predicted values
how do i fix this

wanton orchid Aug 24, 2024, 6:28 PM

#

Anyone familiar with NoBackendError in librosa ? The person who used code seemed to be fine but when I used exactly the same code I got this

deft fox Aug 25, 2024, 2:19 AM

#

shadow arrow i have also tried using the Pytorch it shows an train accuracy of 78% but when i...

Not enough detail to know for certain, but most likely you need to use .predict_proba instead of .predict
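
To illustrate the difference: scikit-learn-style classifiers (XGBClassifier included) return hard 0/1 labels from `.predict` and class probabilities from `.predict_proba`. A plain PyTorch module has neither method, so the sketch below only mirrors those two names in pure Python, assuming the model outputs one raw logit per row. The function names, the 0.5 threshold and the example logits are all assumptions for illustration.

```python
import math


def sigmoid(logit):
    """Squash a raw logit into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-logit))


def predict_proba(logits):
    """Probability of the positive class for each raw model output."""
    return [sigmoid(z) for z in logits]


def predict(logits, threshold=0.5):
    """Hard 0/1 labels: the probabilities cut at the threshold."""
    return [int(p >= threshold) for p in predict_proba(logits)]


# Made-up logits, only to show the two kinds of output side by side.
logits = [-2.0, 0.0, 3.0]
probabilities = predict_proba(logits)
labels = predict(logits)
```

If a competition is scored on probabilities (AUC or log loss, for example), submitting the hard labels throws away information, which is one possible reason to prefer the probability output.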

quick crag Aug 26, 2024, 5:03 AM

#

Radhe Radhe buddies, can anyone suggest some projects or sources or anything to further upskill myself in data science, analytics and machine learning, and some content to add to my resume? Thank you

dark raft Aug 26, 2024, 5:40 AM

#

I have a question regarding EDA. Should we split the data into training and test sets and only do EDA on the training set? I've seen some articles say that this can prevent overfitting and data leakage.

clear stag Aug 26, 2024, 9:52 PM

#

Just a thought. Does anybody feel that AI (i.e. ChatGPT, etc.) is completely overvalued... pure smoke?
It feels like there is FOMO, but when you try to do some stuff it is not so great

deft fox Aug 27, 2024, 2:31 AM

#

clear stag Just a thought. Does anybody feel that AI (i.e. ChatGPT, etc.) is completly over...

Some functionality is still way behind humans, but some things AI can do faster and better. So not pure smoke, but definitely a bit overhyped. It is still early days.

dark raft Aug 27, 2024, 8:33 AM

#

clear stag Just a thought. Does anybody feel that AI (i.e. ChatGPT, etc.) is completly over...

This is a good article that might answer your question https://nicholas.carlini.com/writing/2024/how-i-use-ai.html#intro

How I Use "AI"

I don't think that AI models (by which I mean: large language models) are over-hyped. In this post I will list 50 ways I've used them.

slow narwhal Aug 27, 2024, 4:19 PM

#

Hi !
I was wondering if you could win medals even if the competition has already ended and you beat the bronze score ?

sinful root Aug 29, 2024, 1:32 PM

#

slow narwhal Hi ! I was wondering if you could win medals even if the competition has alread...

Most competitions with a close date won't reconsider if someone gets a higher score later (imagine how it would feel to have your bronze medal taken away months later!) but we have some competitions that are evergreen. Of course, we're always adding new competitions, and you get the satisfaction of seeing your solution as one of the best!

spring junco Aug 30, 2024, 5:17 AM

#

How can I get badges from Kaggle?

empty belfry Aug 30, 2024, 5:42 PM

#

Can I ask here some feedback about PC hardware specifications?

raven thunder Aug 31, 2024, 4:53 AM

#

hi, a kaggle grandmaster, nischay dhankar, an alumnus of my college, gave a session on kaggle. i don't have any background in coding. I am very fascinated by medical imaging. what could be the path?

wraith sun Aug 31, 2024, 5:12 AM

#

Hi, I'm new to machine learning in general and I would like to ask where do I start? Like which specific math should I study first?

#

I want to be able to understand what is actually happening behind the scenes every time I train a model. Maybe with this I can make it perform better

torn vector Sep 1, 2024, 4:12 AM

#

Has anyone here got 100% accuracy on the Titanic unseen data?
I always wonder how people on the kaggle leaderboards get 100% accuracy when I am struggling at 86%

drowsy solar Sep 1, 2024, 12:49 PM

#

Have some valuable course or tutorial for the new start?

meager bone Sep 2, 2024, 4:30 AM

#

drowsy solar Have some valuable course or tutorial for the new start?

I am trying Coursera

toxic hollow Sep 2, 2024, 5:14 AM

#

Hi, everybody. I have a question.
I want to develop a method for designing a neural network architecture for a given real-world problem.
Is this possible?
I mean, can we derive a specific network architecture from neuroscience?
Please help me with an overview of this and of the methods.
Where can I find proper references?

drowsy solar Sep 2, 2024, 5:33 AM

#

meager bone I am trying Coursera

Is Kaggle course available in Coursera?

meager bone Sep 2, 2024, 5:39 AM

#

drowsy solar Is Kaggle course available in Coursera?

Oh mb i thought you were talking about Data Science

drowsy solar Sep 2, 2024, 5:41 AM

#

meager bone Oh mb i thought you were talking about Data Science

ok.Thanks

silver bear Sep 2, 2024, 8:27 AM

#

Hi Everybody,
If you are interested in conducting academic research in the fields of generative AI, NLP, and XAI, Please do not hesitate to contact me. 🙏

shut prairie Sep 4, 2024, 12:37 PM

#

Hi, I've been trying for a couple of days to submit a notebook to a competition and I cannot do it because it says my notebook is using a non-versioned dataset. I have tried multiple times to pin the dataset to a version, but after I open that window it simply doesn't show anything, it just says "Loading…", although I've left it some time to load. My teammate has the same problem. I have also tried to download the dataset through the Kaggle API but it fails.
Any suggestions that I could try?

verbal crest Sep 4, 2024, 9:23 PM

#

shut prairie Hi, I've been trying for a couple of days to submit a notebook to a competition ...

Hey Malina, I passed this to an engineer on our team to look at.

worn herald Sep 4, 2024, 9:46 PM

#

shut prairie Hi, I've been trying for a couple of days to submit a notebook to a competition ...

Hi Malina, sorry for the confusing UX. We need to do a better job of surfacing the reason for the failure. In this case, the problem is that the dataset you're trying to pin a specific version for (https://www.kaggle.com/datasets/gordonyip/binned-dataset-v3) is an "Unversioned" dataset, meaning that the creator of that dataset has determined that they only want the most recent version to be accessible to others. This may also be why the API is failing (though I'm not sure which command you're running/is failing). So the options at this point would be to:

Create a discussion on the dataset requesting that the creator update the settings for the dataset to allow access to all versions.
If the current version of the dataset is what you want to pin, download the source and re-upload as a fixed dataset under your control. You could do this via the UI or if you prefer the API, I can help you figure out why the API isn't working--I just need to know what command you're running.

Calibrated-binned-dataset

Calibrated data product but without linearity correction.

merry thunder Sep 5, 2024, 11:01 AM

#

Guys, I am a beginner in Kaggle and this might be a stupid question, but I pressed File, went to Upload input, uploaded a dataset, and named it taxis-dataset1. Then when I enter it into the code it says file not found. Why is this?

azure dove Sep 5, 2024, 5:23 PM

#

merry thunder Guys i am a beginner in kaggle and this might be an stupid question but I presse...

add the file extension, .csv or .xlsx

worn herald Sep 5, 2024, 11:21 PM

#

merry thunder Guys i am a beginner in kaggle and this might be an stupid question but I presse...

I recommend using the Copy file path feature in the sidebar. Click that, and then paste the value into the read_csv call (as a string). It should be something like /kaggle/input/taxis-dataset1/path/to/your_file.csv.
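
If the sidebar is not handy, a rough alternative is to list the files from code. This is only a sketch: the helper name find_csv_files is made up for illustration, and it assumes uploaded inputs are mounted under /kaggle/input as in the path above.

```python
import os


def find_csv_files(root="/kaggle/input"):
    """Return the full path of every .csv file under root, sorted.

    find_csv_files is a hypothetical helper, not part of any Kaggle API.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(".csv"):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


# Print every CSV the notebook can see; prints nothing if root does not exist.
for path in find_csv_files():
    print(path)
```

Any printed path can then be passed to read_csv as a string, file extension included.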

merry thunder Sep 6, 2024, 8:05 AM

#

thanks guys

warm umbra Sep 6, 2024, 8:09 AM

#

Hello everyone, I want to analyse log files with ML.

I have some timestamped log files from the controllers, drivers, etc. of a device. I have seen some error codes in the log files whose causes are not known. I would like to analyze these log files and identify patterns, such as whether errors occur sequentially or whether one error depends on another. Errors could also be triggered by other errors, potentially from a day ago or from just a second ago. What would be the best and simplest approach to this? If there are solutions to similar problems, please let me know.

jagged spear Sep 6, 2024, 2:33 PM

#

Hello

This is a very basic question which in my defence will probably be easy to answer, hehe.

I'm a soon to be second year AI and Datasci student engaged in the RSNA 2024 Lumbar Spine Degenerative Classification purely for the learning curves.

https://www.kaggle.com/competitions/rsna-2024-lumbar-spine-degenerative-classification

A peer of mine, perhaps correctly, says that we have to split the images into training, test and validation classifications. He wants to do this using code that randomly selects images and puts them into any one of the 3 categories.

However, the competition already provides training and test datasets, plus (I'm fairly sure I remember this correctly, though I couldn't find the documentation that details it) a final unseen set of images on which the classification is run to determine the effectiveness of the model.
Also, nowhere in the EfficientNet sample can I see anything that does that classification.

I think I am right here in that, in terms of testing and validation, the images are already classified, and it's only through a dictionary that some of the images need the conditions and planes added to them.

Thanks for any and all help, any clarification will help a great deal.

RSNA 2024 Lumbar Spine Degenerative Classification

Classify lumbar spine degenerative conditions

zenith lodge Sep 6, 2024, 9:09 PM

#

Does anyone work in or have experience with fraud detection? I'm super interested in the topic and have a lot of questions

torpid dock Sep 7, 2024, 1:26 PM

#

Guys, where are the autocomplete settings in the Kaggle notebook? I need to write the code step by step, but if I press Tab, all of the code is written at once.

obsidian pulsar Sep 8, 2024, 3:07 PM

#

zenith lodge Does anyone work in or have experience with fraud detection? I'm super intereste...

Yes, I've had that happen to me. How can I help you?

hard gust Sep 9, 2024, 2:51 AM

#

Hi - how do I create a team for challenges and add team members? It seems the Teams links for the challenges (Titanic and House Prices) have expired: https://www.kaggle.com/c/titanic/team https://www.kaggle.com/c/house-prices-advanced-regression-techniques/team

Titanic - Machine Learning from Disaster

Start here! Predict survival on the Titanic and get familiar with ML basics

House Prices - Advanced Regression Techniques

Predict sales prices and practice feature engineering, RFs, and gradient boosting

terse badger Sep 9, 2024, 1:18 PM

#

Hello everyone, I would like suggestions on how to approach this type of dataset: https://www.kaggle.com/datasets/ashisparida/amazon-ml-challenge-2023
This is the previous year's Amazon ML Challenge dataset. I am interested to know how product_length is predicted from text data.

Amazon ML Challenge 2023

#

this is the cleaned and merged version of the columns of title bullet_points and description

#

I am unable to understand what preprocessing or feature engineering to do

#

Can anyone suggest something to start with?

toxic mantle Sep 10, 2024, 11:17 AM

#

I want to ask if I need to reinstall the dependencies every time I restart?

fervent glen Sep 11, 2024, 9:17 PM

#

I'm trying to do the Intermediate ML course, and I'm dealing with this error in the missing values exercise. Any ideas?

fervent glen Sep 11, 2024, 9:20 PM

#

toxic mantle I want to ask if I need to reinstall the dependencies every time I restart?

Restart what? I found that, for course exercises at least, it seems you have to rerun the setup, yes.

wild garden Sep 11, 2024, 10:24 PM

#

hi, this might be a stupid question but do you need to become a data analyst first before becoming a data scientist?

fervent glen Sep 12, 2024, 2:36 AM

#

wild garden hi, this might be a stupid question but do you need to become a data analyst fir...

https://www.data-mania.com/blog/data-science-vs-data-analytics-which-to-learn-first/#:~:text=By starting with data analytics,in studying data science first.
https://www.reddit.com/r/datascience/comments/11abupo/were_you_a_data_analyst_before_becoming_a_data/
https://www.youtube.com/watch?v=kr59DGtWDTs

You could read into it yourself. There's a couple of things online about that.

From the datascience community on Reddit

Explore this post and more from the datascience community

YouTube

Sundas Khalid

Should you become a Data Analyst to become a Data Scientist?

Is data analyst a pre-requisite to become a data scientist? In this video, we are discussing common skillset between the two roles and what makes them different. What do you think? Share in comments if you think data scientist should first become data scientist.

My Self-Taught Data Science Journey: https://youtu.be/34r9OwjysDM
How I Would Lear...

Data-Mania, LLC

Data-Mania Writer's Guild

Data Science vs. Data Analytics: Which To Learn First?

On the fence between choosing to learn data science vs. data analytics? Learn why analytics acts as a prerequisite to being a data scientist.

cunning thunder Sep 13, 2024, 2:05 PM

#

You are working on a binary classification ML algorithm that detects whether an image of a classified scanned document contains a company’s logo. In the dataset, 96% of examples don’t have the logo, so the dataset is very skewed. Which metric would give you the most confidence in your model?

I'm torn between recall and F1 score. Which is right in this scenario? I lean towards recall, because F1 gives recall and precision equal weight, and since my positive class is only 4% of the data and catching the positives is the priority, recall makes more sense here. I would love any help, thanks!

worn gale Sep 14, 2024, 1:35 AM

#

Experienced data scientists, do you always write all your Python code from memory?
Or do you use online tools or AI, as we newbies do?

rich vortex Sep 14, 2024, 7:08 AM

#

cunning thunder You are working on a binary classification ML algorithm that detects whether an ...

The F1 score is the harmonic mean of recall and precision. Recall measures the fraction of actual positives that the model catches, which matters when positives are rare, as seems to be your case. So, from that perspective, recall seems the sensible option. However, I suggest reporting both metrics, since together they give a quantitative view of your model's strengths and weaknesses and therefore more diagnostic options.
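
To make the trade-off concrete, here is a small sketch that computes the three metrics by hand on made-up labels with the same 4% positive rate. The data and the "always predict logo" baseline are illustrative assumptions, not anything from the original question.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall and F1 for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up data: 4 logo images out of 100, mirroring the 96% / 4% skew.
y_true = [1] * 4 + [0] * 96

# A trivial baseline that flags every image as containing the logo.
always_positive = [1] * 100

precision, recall, f1 = precision_recall_f1(y_true, always_positive)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

The baseline reaches perfect recall while its precision stays at the positive rate, which is why looking at recall alone can be misleading on skewed data and why F1 (or both metrics side by side) is worth reporting as well.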

real ravine Sep 14, 2024, 12:08 PM

#

cunning thunder You are working on a binary classification ML algorithm that detects whether an ...

You're right. As long as catching the positives is your priority, you should definitely use recall.

#

How do teams typically collaborate when working towards a Kaggle competition? GitHub, shared Kaggle notebooks, or something else?

dire cloud Sep 15, 2024, 5:59 PM

#

I am taking a machine learning course that will be about 70% theory for exams. Does Kaggle have anything that I can practice regarding theory problems, or is there any resource that would make sense?

fast crown Sep 16, 2024, 4:04 AM

#

Is anybody working on a project like a comprehensive "Social Listening" tool, or does anyone have experience with such technology?

normal pier Sep 17, 2024, 10:06 AM

#

Hello Everyone,

I've used the MLC-LLM chat APK from the link below:
https://blog.mlc.ai/2024/06/07/universal-LLM-deployment-engine-with-ML-compilation

In the MLCChat app, only the models shown in the screenshot are listed. How can I access my own models, which are on my device?
Is it possible to use my local models in this MLCChat app?
How can I do this with Flutter?

Please help me with this!

plain mural Sep 17, 2024, 10:49 AM

#

No module named 'kaggle_evaluation'
[2:40 PM]
This is my first Kaggle competition. How can I import files into the notebook to be able to use such import statements?

uncut burrow Sep 19, 2024, 4:43 AM

#

Hello everyone, I am new to Kaggle. How can I download csv files on kaggle ?

tepid shuttle Sep 19, 2024, 5:20 PM

#

For Kaggle competitions we need to have internet disabled. Now I need to pip install some dependencies or updated libraries. How do I store them permanently in the notebook so I can access those libraries with internet access disabled?

verbal crest Sep 19, 2024, 8:08 PM

#

tepid shuttle for any kaggle competition we need to have internet disable, now i need to pip i...

Lucky for you we just launched a new feature that solves this problem very neatly: https://www.kaggle.com/discussions/product-feedback/532336

[Feature Launch] Introducing Package Manager | Kaggle

[Feature Launch] Introducing Package Manager.

gleaming cradle Sep 19, 2024, 8:16 PM

#

Hey I'm working on a dataset of Forest fires, model has to predict the probability of fire in the forest after getting some inputs from users. I've done most of the part but I'm getting a high MAE. Let me know if anyone can help!

spring cobalt Sep 20, 2024, 5:08 AM

#

Can anyone help me? Whenever I submit, it shows an inference error and the submission gets rejected.

empty mural Sep 20, 2024, 7:16 AM

#

Hello everyone

Please I have a problem.

I fine-tuned Gemma 2 in a Kaggle notebook, and now I would like to save the fine-tuned model and share it on Kaggle Models under my account. I don't know how to do that.

Can somebody please help me? 🤲

uneven flint Sep 21, 2024, 10:53 AM

#

Hi

ripe blaze Sep 22, 2024, 1:07 PM

#

Can someone explain how I know my model found the global minimum when tuning the weights? I just learned about gradient descent, and it's looking for the global minimum. But there are also local minima... How do I know gradient descent found the global minimum rather than a local one? Is there a way to visualise it, as proof that my model is using the best weights possible?

daring kiln Sep 22, 2024, 2:47 PM

#

Did Kaggle change its UI? The notebook is now confusing, with big text. Can someone please help?

ripe blaze Sep 22, 2024, 3:03 PM

#

lol try ctrl + scroll down?

open dock Sep 22, 2024, 4:28 PM

#

Hi. I am an absolute beginner in Kaggle (just started). I have a 6th-gen Core i3 laptop with 4 GB of RAM running Linux. Would that be enough to continue with Kaggle, or do I need a graphics card and more RAM to get by? I am on a budget, so I need some suggestions.

eager fossil Sep 23, 2024, 2:56 AM

#

Hey I tried kaggle course from start to learn ml

#

But I get stuck at some points

fading swift Sep 24, 2024, 9:05 PM

#

Hello everyone, I’m looking for suggestions on how to become job-ready for a Machine Learning Engineer position. I’ve completed my certification and worked on various projects, but my resume hasn’t been getting selected. If anyone could share a strong resume example, I would greatly appreciate it.
Although I have over four years of experience in Operations, I am eager to transition into Data Science. I have been searching for a job since December last year, and I would welcome any ideas on how to land a position and kickstart my career.
Additionally, I’m interested in projects I can build to enhance my portfolio and skills I should focus on to become job-ready as soon as possible. So far, my projects have been fairly generic, and I want to stand out.

analog shale Sep 25, 2024, 8:32 AM

#

open dock Hi . I am absolutely beginner in kaggle (just started) . I have a core i3 6th la...

Those stats are not really great for data science on your local machine. If you want to do serious experiments on your local machine, a GPU + more RAM + a decent CPU (depends on how much data preprocessing you need) is a must. I would say, your current laptop will not suffice for more than basic exploratory data analysis.
But luckily, Kaggle does provide everyone with 30h/week of GPU and TPU compute and there are more cloud providers with free/affordable GPU quota (e.g. GoogleColab). Thus, I would not rush with buying a new setup, but instead try what you can achieve with the provided free cloud resources.

frank granite Sep 25, 2024, 2:17 PM

#

hi all, im pretty new to using kaggle, but i have a few notebooks working fine but my current notebook is giving me an issue when trying to !git clone:

Cloning into 'testing'...
Username for 'https://github.com':

and it just hangs here

#

ok solved this - mistyped the url

stray lily Sep 25, 2024, 6:03 PM

#

#help I'm a beginner in ML. So how can I learn ML? I know I should start with data preprocessing, but I don't have resources available for that. Can you guys share a resource link?

dull canyon Sep 26, 2024, 11:43 AM

#

Hello 👋 I'm training an autoencoder on signal data using LSTMs for anomaly detection.
For normalization I'm using sklearn.preprocessing.StandardScaler. For .fit(), should I pass in only the cleaned data without any deviating signals, or the entire data?

final galleon Sep 26, 2024, 11:50 AM

#

dull canyon Hello 👋 I'm training and autoencoder on signal data using LSTMs for anomaly det...

Why did you clean your data?
I guess it was to train the model only on clean data, without any noise. Therefore, in my opinion, fit the normalization only on the cleaned data (the data you will use in training and testing).
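
A minimal sketch of that idea, using plain NumPy to mimic what StandardScaler does (per-feature mean and population standard deviation). The synthetic signal values and the helper names fit_scaler and transform are made up for illustration.

```python
import numpy as np


def fit_scaler(train):
    """Return per-feature mean and standard deviation of the training data."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)  # population std (ddof=0)
    std = np.where(std == 0, 1.0, std)  # avoid dividing by zero
    return mean, std


def transform(data, mean, std):
    """Scale data with statistics that were fitted elsewhere."""
    return (data - mean) / std


# Synthetic "clean" signal values (an assumption for this sketch).
rng = np.random.default_rng(0)
clean = rng.normal(loc=10.0, scale=2.0, size=(1000, 1))

# Fit on the clean data only, then reuse the same statistics everywhere.
mean, std = fit_scaler(clean)
scaled_clean = transform(clean, mean, std)

# A deviating value seen later is scaled with the clean statistics.
scaled_anomaly = transform(np.array([[30.0]]), mean, std)
print(scaled_clean.mean(), scaled_clean.std(), scaled_anomaly)
```

One consequence worth keeping in mind: any value far outside the clean training range maps to a large scaled magnitude. That is expected behaviour for a scaler fitted this way rather than a bug in itself.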

dull canyon Sep 26, 2024, 11:51 AM

#

final galleon Why you cleaned your data ? I guess to train the model only on with the clean d...

Yes that's what I've been doing. But on deployment after scaling I'm getting large values which are ruining the model performance

final galleon Sep 26, 2024, 11:56 AM

#

dull canyon Yes that's what I've been doing. But on deployment after scaling I'm getting lar...

Hmm, I have never worked on models in deployment, so maybe I am not the right person to answer. However, if I understood you correctly, the predictions of your model should follow the scale of your label. Therefore, the outputs/predictions should not be in the range [-3, 3] (roughly the range of standard-scaled data).

Regarding the performance, sorry man, I don't have any ideas and don't know how to help. In fact, it happened to me once that I got low performance after training and testing compared with my test scores. I assumed this was because my model does not generalize well (I still haven't fixed the issue).

final galleon Sep 26, 2024, 11:59 AM

#

stray lily #help Im a beginner in ml . So how can i learn ml? I know i should start from pr...

In my humble opinion, pre-processing is not an ML skill. It is in fact required to prepare data for ML models. Therefore, you don't need resources/courses on pre-processing. You just need to check the state of your raw data, spot the "noise", and work out which pre-processing you need to do. For instance, maybe you need to convert "15 years old" to 15 for the age feature.

This kind of pre-processing requires only basic programming skills.
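
As a rough illustration of the age example above, here is a small sketch; the helper name parse_age and the sample values are made up.

```python
import re


def parse_age(raw):
    """Return the first integer found in raw, or None if there is none."""
    match = re.search(r"\d+", str(raw))
    return int(match.group()) if match else None


# Made-up raw values for an "age" column.
ages = ["15 years old", "42", "unknown"]
cleaned = [parse_age(a) for a in ages]
print(cleaned)  # [15, 42, None]
```

Values with no digits come back as None, so they can be handled as missing data afterwards.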

thin pelican Sep 27, 2024, 5:44 AM

#

Hello, can anyone please help me with an idea, or even a walk-through, for an AI-related final-year project for my bachelor's degree in IT?

young fern Sep 27, 2024, 5:54 PM

#

Hi, I wonder how much time I should spend on tuning the model, i.e. finding the best parameters? I have spent some time on feature engineering, but I'm not sure whether I should stick with my current model or keep trying new models / ensemble methods. Many thanks

cerulean sage Sep 30, 2024, 12:08 AM

#

who is good with SQL? I have a little problem, i have a table called companies but it is not connecting to my database

vital venture Sep 30, 2024, 2:55 AM

#

Hey, I want to learn machine learning using Python. I have completed the beginner stage. Is there anyone who can guide me?

steady vessel Sep 30, 2024, 5:15 AM

#

Hey everyone! I have a competition coming up, and the first round is focused on data structures. I'm honestly starting from scratch and not sure where to begin. I'd really appreciate any suggestions for lectures or resources to help me get started. By the way, I prefer Python, but if you think C++ or another language would be better for this, please let me know!

solid pulsar Oct 2, 2024, 1:22 AM

#

Hi, Who can help with the task of learning how to determine the value of a diamond?

misty hearth Oct 2, 2024, 3:42 AM

#

Hello Kaggle community, I have a question. When I save a version of my notebook that has output files and then cancel it to save some hours,
if I go back to this version and press Edit, how do I start the notebook with those output files? 🤔
Info I'm seeing:
Clicking Output, the message says:
Notebook canceled
View the status under the logs tab
On the Logs tab, under Environment,
I can click Latest Container Image, and the output size is shown there 😮
How do I start this notebook with that data?

Thanks !

solid pulsar Oct 2, 2024, 9:58 AM

#

Hi, I have a question: who has worked on ‘Diamonds Price Prediction’? I need to get MAE < 200, but when I computed the correlation matrix, I saw that Price has negative correlations with many features. Please help.

winged slate Oct 2, 2024, 1:37 PM

#

Here's a fun one: can one learn to use and build ML models without having linear algebra and calculus under their belt?

real patio Oct 2, 2024, 8:59 PM

#

Hello, @tardy lodge the requirement for those seeking for kaggle mentorship is high sort of

tardy lodge Oct 2, 2024, 9:11 PM

#

real patio Hello, @tardy lodge the requirement for those seeking for kaggle mento...

hi @real patio i'm not someone who works on our mentorship program, but @wind silo is the right person to talk to

real patio Oct 2, 2024, 9:13 PM

#

Thanks Ma @tardy lodge

empty bluff Oct 4, 2024, 2:35 PM

#

Does anyone know of a data source for the US CIA World Factbook? https://www.cia.gov/the-world-factbook/

empty bluff Oct 4, 2024, 2:37 PM

#

winged slate Here's a fun one: can one learn to use and build ML models without having linear...

I'm in the same situation. The answer is yes. However, be aware that the learning curve is even steeper without it because you will have to learn it at some point to understand how you got your results.

wind silo Oct 4, 2024, 4:12 PM

#

tardy lodge hi @real patio i'm not someone who works on our mentorship program, b...

Thanks @tardy lodge.

Hi @real patio Happy to answer any questions you may have. I've sent you a direct message as well if you would like to chat more.

graceful axle Oct 4, 2024, 8:15 PM

#

I'm getting submission file not found error even tho everything seems alright could somebody pls help
kindly help me with this somebody

uneven fractal Oct 5, 2024, 9:52 AM

#

Hello! I’m an undergraduate student researching image reconstruction using a diffusion model. I know this is not Kaggle-related, but I'm encountering an issue with my diffusion model and wanted to seek some advice.

In my research, I’m trying to reconstruct one type of brain image from another using a diffusion model. Before using the diffusion model (when I directly used a U-Net for reconstruction), the training worked well. However, when I switched to predicting noise instead of the image itself using the diffusion model and then performing the denoising process, the training doesn't seem to work. The training loss decreases, but metrics like PSNR and SSIM (which evaluate how well the image is reconstructed) do not improve at all or even degrade.

Is this about my dataset being too small (I have 800 images in the training set)? I set the noising steps to 1000.

Has anyone experienced something similar when working with diffusion models? Any advice would be greatly appreciated. Please help..

Below is the code I use for training.

t = diffusion.sample_timesteps(n).to(config.device)
x_t, noise = diffusion.noise_images(latent_target_image,t)
predicted_noise = Unet(x_t, conditioning_3d_image, t, diag)
loss = L2(predicted_noise, noise)

sharp sun Oct 5, 2024, 11:44 AM

#

Hello everyone, can anyone tell me exactly how I can start and move forward with learning AI and ML?

rain oar Oct 5, 2024, 1:34 PM

#

sharp sun Hello every one can anyone tell me how exacly i can start and move forword for l...

what interests you about AI and ML? what educational background do you have? There's lots of different ways to go about it and there are lots of guides already written out there. If you're here it means you're on your way so check out Kaggle Learn and Kaggle competitions

sharp sun Oct 5, 2024, 1:45 PM

#

rain oar what interests you about AI and ML? what educational background do you have? The...

I am a full-stack developer and I am exploring AI and ML, currently learning Python libraries such as numpy, pandas, matlab, scikit, etc. I have a basic theoretical understanding of how ML models work, such as supervised, unsupervised, and reinforcement learning.

Kaggle is a place where we can compete by building or improving models with the provided datasets.

#

Correct me If am wrong because I am new to kaggle

#

But the main question is what are the correct prerequisite I should learn and understand so that i can participate

#

Tech like ChatGPT, GitHub Copilot, and text-to-video generation really amazes me

rain oar Oct 5, 2024, 1:49 PM

#

Go look at Kaggle Learn "Intermediate Machine Learning" you will understand how to submit to competitions in two quick lessons

#

You will be able to understand the code very quickly and the general format for a basic ML model. Then iterate

#

If you're curious about chatGPT and other LLMs, then look up videos/tutorials on how to build it from scratch

#

It's all software dev: sometimes you need to look something up to understand it, sometimes you need to actually go study and read up a lot more to understand it, but in the end it's doing projects and finding out what you need to learn to progress with your projects

#

What advice would you give someone new to software dev on how to begin? 😄

sharp sun Oct 5, 2024, 3:05 PM

#

rain oar What advice would you give someone new to software dev on how to begin? 😄

For software dev I would ask them to build some learning projects like a blog app, chat app, or video streaming app. In software dev the most important things are reliability, scalability, availability, and security. And for senior devs, go through system design.

And try to avoid tutorial hell and build apps, no matter how small or whether they have any purpose.

All the AI and ML tech is used in the form of software, so software dev is a must-have skill for building AI/ML apps

rain oar Oct 5, 2024, 3:08 PM

#

sharp sun for software dev I would aske to build some learning projects like blog app, cha...

Same goes for ai and ml: build some learning projects for ai and ml. Avoid tutorial hell. Build projects no matter how small. You got this 😄

sharp sun Oct 5, 2024, 3:10 PM

#

yes sir thank you and can you suggest some very basic ai ml projects I can build and learn on the go.

rain oar Oct 5, 2024, 3:12 PM

#

Look up learning projects online to get a list, pick one that sounds the most interesting. Or just do kaggle competitions, as they are also considered learning projects. Projects in "Getting Started" and "Playground" categories are good place to start

wintry birch Oct 5, 2024, 7:56 PM

#

Hello guys,
Do you know of any interesting datasets to refresh my pandas skills? I am mostly a beginner who has done two ML projects.

finite galleon Oct 7, 2024, 5:10 AM

#

Hello, I am working on log data analysis. Most of the log parsing work that I have found so far uses more or less similar methods, with very heavy ML/DL algorithms or heuristics.
I want to know more about log analysis using primitive methods like a PCFG parser or some unsupervised parser that takes the whole data into account.
My ultimate goal is to generate good-quality templates. Please point me to any resources if you know of some.

split orchid Oct 7, 2024, 9:39 AM

#

sharp sun Hello every one can anyone tell me how exacly i can start and move forword for l...

Before pivoting, one should always consider - why you want to ?
Do you know your WHYs?

sharp sun Oct 7, 2024, 9:39 AM

#

split orchid Before pivoting, one should always consider - why you want to ? Do you know you...

I am not pivoting I am just learning new tech

#

say it as exploring

split orchid Oct 7, 2024, 9:42 AM

#

sharp sun I am not pivoting I am just learning new tech

cool then, you can start with a problem statement and then trace back from the end goal to the data
Example -
step1 - you want to identify if a given pic is a cat or a dog
step2 - you get to know that it is done by some model ( which is an artifact )
step3 - how was that model built
step4 - what does training mean and what is the role of data here
step5 - get the data

after this go forward again with the help of some already existing guided work- like kaggle notebooks

#

learning from scratch helps - but since you are exploring - this approach is more practical and will help you even more with hands-on work

sharp sun Oct 7, 2024, 9:45 AM

#

split orchid cool then, you can start with some problem statement and then traverse back from...

thank you for guiding steps

for building models

i should understand ml learning models right like k means, decision tree, etc

split orchid Oct 7, 2024, 9:49 AM

#

sharp sun thank you for guiding steps for building models i should understand ml learn...

not necessarily - there are packages like scikit-learn, XGBoost, and LightGBM for classical models
you just need to read the documentation and start implementing - having algorithmic knowledge is definitely a plus - but knowing how to use them is an even bigger plus to start with
so you can start using them from documentation, guided code books, etc.

near plank Oct 8, 2024, 10:34 AM

#

Hello everyone 🤗

slim storm Oct 8, 2024, 12:13 PM

#

Instead of showing "Copy path" it is showing this in my kaggle notebook, can someone help ?

split orchid Oct 8, 2024, 1:19 PM

#

near plank Hello everyone 🤗

Hello !

split orchid Oct 8, 2024, 1:19 PM

#

thin pelican hello can anyone please help me with AI related end degree bachelor IT project ...

Hi, what is the complexity of the project you are expecting?

split orchid Oct 8, 2024, 1:22 PM

#

fading swift Hello everyone, I’m looking for suggestions on how to become job-ready for a Mac...

Hi. These days almost all ML/DL/AI roles require some prior experience. But this can be bypassed by showing a real AI project. You can build a project portfolio around a real-world problem you helped solve. You should be able to demonstrate the problems you faced and how you overcame them.
You can ping me if you want a more personalized answer.
Thanks

slim basalt Oct 8, 2024, 7:29 PM

#

WHHHHHHHHHHhY

Why can't private competitions be set public later?

lilac venture Oct 8, 2024, 9:17 PM

#

Hello, I am working on the housing regression competition and I am fairly new. For a column such as Street, which has a range of string values such as gravel, pavement... how should I encode it?

halcyon spindle Oct 8, 2024, 9:50 PM

#

someone has an automatic1111 notebook working for kaggle?

lucid wren Oct 9, 2024, 12:37 PM

#

#❓┊ask-a-question Hello, I wanted to know how I can increase my storage capacity? Thank you

real patio Oct 9, 2024, 2:17 PM

#

Hello @wind silo, I have DMed you the questions and am still waiting for your response, thanks

fierce gulch Oct 9, 2024, 10:50 PM

#

Hello, I'm French, I'm training to be a developer, and I'm a beginner at using AI through APIs. I want to go further than the simple ChatGPT chat or other AI-assisted chats. I'm looking for a partner, or a team to join, to learn and to spend sleepless nights racking my brain (I don't speak English, so I'll manage in writing, thanks)

random narwhal Oct 10, 2024, 4:06 PM

#

I wrote a classification program for Logistic Regression, but why does the cost function (j(w,x,y,b)) become larger after gradient descent

fierce gulch Oct 10, 2024, 4:49 PM

#

random narwhal I wrote a classification program for Logistic Regression, but why does the cost ...

def gradient_w(w, x, y, b):
    m = len(x)
    h_wb = h(w, x, b)
    grad_w = (1/m) * np.dot(x.T, (h_wb - y))
    return grad_w

def gradient_b(w, x, y, b):
    m = len(x)
    h_wb = h(w, x, b)
    grad_b = (1/m) * np.sum(h_wb - y)
    return grad_b

# Update the weights and bias

while T:
    grad_w = gradient_w(w, x, y, b)
    grad_b = gradient_b(w, x, y, b)

    w = w - a1 * grad_w
    b = b - a1 * grad_b

    if j(w, x, y, b) < 1e-15:
        break
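
For context, here is a self-contained sketch of the same update loop. The helpers `h` and `j` are not shown in the snippet above, so the sigmoid and cross-entropy definitions below are assumptions, as are the toy data, the learning rate `a1 = 0.1`, and the fixed iteration count. With a small enough learning rate the cost should go down; a cost that grows usually points to a learning rate that is too large or a sign error in the update.

```python
import numpy as np

def h(w, x, b):
    # Assumed hypothesis: sigmoid of the linear score.
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

def j(w, x, y, b):
    # Assumed cost: mean binary cross-entropy (clipped to avoid log(0)).
    p = np.clip(h(w, x, b), 1e-12, 1 - 1e-12)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

def gradient_w(w, x, y, b):
    return x.T @ (h(w, x, b) - y) / len(x)

def gradient_b(w, x, y, b):
    return np.sum(h(w, x, b) - y) / len(x)

# Toy data (hypothetical): label is 1 when the two features sum to a positive number.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 2))
y = (x[:, 0] + x[:, 1] > 0).astype(float)

w, b, a1 = np.zeros(2), 0.0, 0.1
costs = [j(w, x, y, b)]
for _ in range(500):
    grad_w = gradient_w(w, x, y, b)
    grad_b = gradient_b(w, x, y, b)
    w = w - a1 * grad_w  # subtract the gradient; adding it would increase the cost
    b = b - a1 * grad_b
    costs.append(j(w, x, y, b))

print(f"cost before: {costs[0]:.4f}, after: {costs[-1]:.4f}")
```

Logging the cost every iteration, as `costs` does here, makes it easy to see at which step it starts to grow.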

random narwhal Oct 10, 2024, 5:00 PM

#

fierce gulch def gradient_w(w, x, y, b): m = len(x) h_wb = h(w, x, b) grad_w = (1...

Thank you, brother. I'll understand it.

orchid spoke Oct 11, 2024, 7:51 AM

#

Hello everyone,

Can anyone help me figure out if it is possible to link a local codebase and use it in any hosted competition?

glass bronze Oct 11, 2024, 2:29 PM

#

I am working on this competition, and it appears the file upload button is disabled. Is this normal? https://www.kaggle.com/competitions/child-mind-institute-problematic-internet-use I downloaded the data and have been working in a local notebook

#

do i have to use the kaggle notebook ?

spiral mesa Oct 12, 2024, 12:58 PM

#

Hi all,

I recently saw a blog post about fine-tuning PaliGemma for a receipt scanner.

I have a few questions:

How can I create my dataset? Is there a tool I can use to define the boxes and the text?
I ran the model with Gradio locally but it's very slow. Why? Do I need to convert the model to a specific format to use it locally?

Where can I find a tutorial, or which resources can I read to learn this easily?

Many thanks

zinc cloud Oct 12, 2024, 1:21 PM

#

Hello everyone, I am an engineering student (mechanical) and I am interested in Math, AI and coding. I have been learning ML algorithms and EDA little by little for some time and I want to learn more. I don't have a particular goal in mind like I don't know if I want to work as an ML engineer for a company or maybe do research work, I am just learning and exploring this field. What would be your advice for someone like me in order to learn and level up in this field?

wooden folio Oct 12, 2024, 6:38 PM

#

hey, I don't know if this is the right place to ask this question, but anyway: I have learned a little ML and I want an ML job, but most entry-level jobs at popular companies require at least 2 years of experience. The problem, obviously, is that I want the job in order to gain experience. I am willing to take a part-time remote job for as little as 10k dollars a year just so I can really get into the field, or I'll even work for free. I don't know if "work" is the right term for it, but I want to learn from someone who knows more than me.

wet fjord Oct 13, 2024, 6:19 AM

#

hi, im new to ml and ive been trying to make a prediction model to identify handwritten numbers. ive just been kind of stuck since my models accuracy is always 10-13% and all the tips ive seen online have been exhausted for days. its kind of my first project in this stuff, so sorry if i seem clueless. Ive had great results using pretrained models which is why i wanted to go a little further and make one but im cooked.

im just noticing the beginner digit section, and it seems easy to just look at it for the answer. but i dont really get what the others did, mainly looks like they added more filters and used other tactics

if anyone can answer, what models are generally recommended and why, ive only touched adam.
and the filters and why u use them, i used 32,64,and 128. with some dropout. but i noticed more filters, as well as followups for them. also heavy dropouts. idk it seems crazy, maybe im a bad teacher

final question: why dont i see best model used a lot, is it not good in the long run? sorry

flat swan Oct 13, 2024, 11:57 AM

#

someone can help me?

willow relic Oct 13, 2024, 7:14 PM

#

Hello everyone,

I'm an ML engineer apprentice, currently working as an intern for a startup on a GenAI project. I've done some academic projects before and now I want to gain real world experience by working on real projects. Just like @wooden folio , I'm looking for a part time remote job to work even for free. PS: I'm motivated and willing to get my hands on stunning projects.

If you are interested or need additional information, feel free to DM me.

Kind regards,

Stephane

timber ocean Oct 15, 2024, 4:32 PM

#

I am curious about how you guys manage computing requirements?

Whats your set up?

potent niche Oct 15, 2024, 4:47 PM

#

Hi guys 🙂 I am working on a prototype of a motion sensor with an API to extract already-labelled information over Wi-Fi in real time, like a data collector but smaller, so it doesn't bias the movement. Do you guys have any suggestions, based on your experience, for additional features?

empty bluff Oct 15, 2024, 6:31 PM

#

Hi all. I have a kaggle.com question: The first version of this notebook, https://www.kaggle.com/code/michaelstelly/cia-world-factbook-analysis?scriptVersionId=201169542
has an incorrect import. It's been stuck in processing for 21 hours. Selecting "stop the session" does nothing. How do I kill the session?

hoary warren Oct 15, 2024, 11:28 PM

#

hey, when is the sgd book going to get fixed? there's a problem with seaborn 😩

tender light Oct 16, 2024, 12:49 AM

#

Hi everyone!

I'm super new to all of this stuff, and was attempting to make a simple model that can detect the position of a basketball in an image! I was attempting to use https://www.kaggle.com/datasets/trainingdatapro/basketball-tracking-dataset/data this set of data to train. While setting everything up (just plotting the bounding boxes for the images I already have), I noticed a small issue when rescaling the points to the images given in the files.

If you look at the xml file for the data, it says the original size of the image was 1280 by 720, yet on some of the images the values for the boundingbox seemed incorrect, as they were extremely small.

(these are x1, y1, x2, y2)

For example, the values for the second image were 966, 408, 987, 429. These just seemed off for an image that looked like this (can't send here, can send in DMs)

Also for like the first image for example, the basketball is just outside of the image, how does that work?

please help either here or in dms!!

tender light Oct 16, 2024, 3:44 PM

#

🙏

subtle willow Oct 17, 2024, 7:41 AM

#

HI Everyone!, hope everyone is good, I'm new to Kaggle
Just started with the Titanic tutorial and am a bit stuck
For "Part 3: Your first Submission"

i managed to get the correct output for % of women who survived
i also managed to get the correct output for % of men who survived

But when i add the last part in order to get an output of "Submission was successfully saved!"
i get this error message:

NameError Traceback (most recent call last)
Cell In[16], line 7
5 features = ["Pclass", "Sex", "SibSp", "Parch"]
6 X = pd.get_dummies(train_data[features])
----> 7 X_test = pd.get_dummies(test_data[features])
9 model = RandomForestClassifier(n_estimators=100, max_depth=5, random_state=1)
10 model.fit(X, y)

NameError: name 'test_data' is not defined

Any advice?
Thanks!

thorn summit Oct 17, 2024, 3:43 PM

#

subtle willow HI Everyone!, hope everyone is good, I'm new to Kaggle Just started with the Tit...

To resolve this error, make sure you rerun the test_data cell where you uploaded the file, and also check the places where you use this file for any syntax error. For better help, I suggest you just send me the code.
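
A minimal sketch of why rerunning helps, assuming the usual tutorial layout: `test_data` only exists after the cell that reads the test CSV has run in the current session. The two tiny inline CSVs below are made-up stand-ins for the tutorial's train and test files, so the snippet runs anywhere.

```python
import io
import pandas as pd

# Made-up stand-ins for the tutorial's train.csv and test.csv.
train_csv = io.StringIO(
    "Survived,Pclass,Sex,SibSp,Parch\n"
    "0,3,male,1,0\n"
    "1,1,female,1,0\n"
    "1,3,female,0,0\n"
    "0,2,male,0,1\n"
)
test_csv = io.StringIO(
    "Pclass,Sex,SibSp,Parch\n"
    "3,male,0,0\n"
    "1,female,1,0\n"
)

# These two assignments are what the earlier notebook cells do.
# If the session restarted and they were not re-run, the names do not exist.
train_data = pd.read_csv(train_csv)
test_data = pd.read_csv(test_csv)

features = ["Pclass", "Sex", "SibSp", "Parch"]
X = pd.get_dummies(train_data[features])
X_test = pd.get_dummies(test_data[features])  # works now that test_data is defined

print(list(X_test.columns))
```

In a notebook, "Run All" from the top is the simplest way to make sure every earlier cell has defined its variables.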

subtle willow Oct 17, 2024, 6:01 PM

#

thorn summit To resolve This Error make sure You rerun The test_data cell where You apploaded...

thanks for the response, after running it again it submitted!

rain oar Oct 17, 2024, 6:14 PM

#

tender light Hi everyone! I'm super new to all of this stuff, and was attempting to make a ...

Values for the bounding box should be extremely small, as the bounding box is for a tiny basketball on a large image. The gif on the home page for that dataset shows the full size image (1280 x 720) and a tiny blue bounding box around the basketball. So the x1, y1, x2, y2 values you provided should be right, since the basketball is only 20 x 20 pixels.
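
As a quick check of the numbers quoted in the thread, the arithmetic can be sketched like this. The variable names are just for illustration; the coordinates and the 1280 x 720 size are the ones mentioned above. The box works out to 21 x 21 pixels, in line with the rough 20 x 20 estimate.

```python
# Coordinates quoted earlier in the thread: x1, y1, x2, y2.
x1, y1, x2, y2 = 966, 408, 987, 429
img_w, img_h = 1280, 720  # original image size from the XML file

box_w = x2 - x1
box_h = y2 - y1
area_fraction = (box_w * box_h) / (img_w * img_h)

# The box must lie fully inside the image for the annotation to be plausible.
inside = 0 <= x1 < x2 <= img_w and 0 <= y1 < y2 <= img_h

print(f"box: {box_w} x {box_h} px, inside image: {inside}")
print(f"share of image area: {area_fraction:.4%}")
```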

tender light Oct 17, 2024, 7:28 PM

#

rain oar Values for the bounding box should be extremely small, as the bounding box is fo...

Oh I see that's interesting,

I'm a little confused about something then. In the files most of the images are just not like as wide and high as the big image, so are those cutouts of the big image? and if they are, how do I know like what "big image" the cutout refers to

I'm a bit confused all around harold

rain oar Oct 17, 2024, 7:44 PM

#

tender light Oh I see that's interesting, I'm a little confused about something then. In the...

You might be able to answer your own question by studying the "Dataset Structure" and "Data Format" sections in the description. It also looks like a pretty small dataset as it's only a sample of the actual dataset which apparently is for sale on another website...

smoky scarab Oct 18, 2024, 5:25 AM

#

Hello. I'm starting to use kaggle for the first time.
Even though my PC is connected to the Internet, Kaggle notebook cannot connect to the Internet. Below is what I tried. The result is that the internet is not available. How can I connect to the internet?

#

import requests
try:
    response = requests.get('https://www.google.com')
    if response.status_code == 200:
        print("OK")
    else:
        print(f"OK_But: {response.status_code}")
except requests.ConnectionError:
    print("No")

livid kestrel Oct 18, 2024, 9:31 AM

#

Hi. I think the last time I submitted anything on Kaggle was probably the first ever Jane Street competition, about 6 or 8 years ago. This is a pretty naïve question, but for the submissions, does our submissions notebook have to also train the models we use, or is it possible to train the models separately, load them into the notebook and just do the inference/predictions?

#

Actually I'm specifically asking for the Jane Street competition so this might not apply to other competitions?

last fern Oct 18, 2024, 1:52 PM

#

smoky scarab Hello. I'm starting to use kaggle for the first time. Even though my PC is conne...

Why do you want the notebook to connect to the Internet? That may be unsafe.

last fern Oct 18, 2024, 1:55 PM

#

smoky scarab import requests try: response = requests.get('https://www.google.com') i...

smoky scarab Oct 18, 2024, 2:01 PM

#

last fern

Since I want to create content that supports Japanese, I would like to use "rinna/llama-3-youko-8b". To do that I first need to install transformers, which requires an internet connection. How can I do it without using the Internet?

last fern Oct 18, 2024, 2:06 PM

#

smoky scarab Since I want to create content that supports Japanese, I would like to use "rinn...

I understand. I tried connecting to the internet from a Kaggle notebook and it works. Can you take a screenshot of the whole page so I can see the exception stack trace?

smoky scarab Oct 18, 2024, 2:06 PM

#

last fern

sterile barn Oct 18, 2024, 2:07 PM

#

https://www.kaggle.com/datasets/jakubkhalponiak/phones-2024
https://www.kaggle.com/code/jakubkhalponiak/a-study-of-smartphones-available-in-2024

I have web-scraped phones from gsmarena.com and published a notebook and the dataset. I would appreciate any feedback on this, as it's my first time posting anything on Kaggle

last fern Oct 18, 2024, 2:09 PM

#

smoky scarab

change the website address, for example youtube, x.com.

languid fern Oct 18, 2024, 2:12 PM

#

Hi, I am currently going through the book Deep Work by Cal Newport.

Now I am interested – how does your learning/working process look like? What are your habits? 🙂

For example, you participate in Kaggle competitions or watch/read ML tutorials.

smoky scarab Oct 18, 2024, 2:13 PM

#

last fern change the website address, for example youtube, x.com.

smoky scarab Oct 18, 2024, 2:14 PM

#

last fern change the website address, for example youtube, x.com.

I can't connect.

last fern Oct 18, 2024, 2:17 PM

#

check this out, I have reproduced your problem

smoky scarab Oct 18, 2024, 2:27 PM

#

last fern check this out, I have reproduced your problem

There is no Internet mark on my screen. Where do I set it to display the Internet mark?

smoky scarab Oct 18, 2024, 2:29 PM

#

last fern check this out, I have reproduced your problem

I can't even save.

last fern Oct 18, 2024, 2:31 PM

#

https://www.kaggle.com/code
create a new notebook

last fern Oct 18, 2024, 2:34 PM

#

smoky scarab There is no Internet mark on my screen. Where do I set it to display the Interne...

I don't know. Perhaps the website displays differently in different regions. You can google for help. I think once you find the correct setting, you can connect to the internet

smoky scarab Oct 18, 2024, 2:39 PM

#

last fern https://www.kaggle.com/code create a new notebook

Thank you!!!, I was able to connect!

smoky scarab Oct 18, 2024, 2:48 PM

#

last fern I don't know. Perhaps the website displays differently in different regions. You...

When I tried to run the program I wanted to run, it stopped immediately. I'm trying Kaggle because Colaboratory doesn't work for me due to lack of GPU.

#

I am trying again.
The GPU is not working, only the CPU is working.

#

And finally this happens. How should I solve it?

tranquil gull Oct 18, 2024, 8:19 PM

#

Hi Guys,

I’m a CS student currently doing the final project for my course. The problem I’m addressing is predicting whether a stock is going to go up or down, so that I can either sell or buy.

I was wondering if anyone has ever come across a dataset that fits this description?

Thanks in Advance!

pseudo storm Oct 18, 2024, 11:09 PM

#

Why do people use MSE rather than MAE in most cases? Isn't MSE only better than MAE when we want fewer big errors, even at the cost of having more error on average? Is it usual to care more about big errors than about errors in general? To me it looks like MSE is a bit niche and MAE should be used in most cases instead
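
To make the trade-off concrete, here is a small sketch with made-up numbers (the two "models" and their predictions are purely hypothetical). Model A is off by a moderate amount everywhere; model B is perfect except for one large miss. MAE prefers B, while MSE prefers A because squaring makes the single big error dominate.

```python
import numpy as np

y_true = np.zeros(10)

# Model A: the same moderate error on every sample.
pred_a = np.full(10, 2.0)
# Model B: perfect on nine samples, one large miss.
pred_b = np.zeros(10)
pred_b[0] = 15.0

def mae(y, p):
    return np.mean(np.abs(y - p))

def mse(y, p):
    return np.mean((y - p) ** 2)

for name, pred in [("A", pred_a), ("B", pred_b)]:
    print(f"model {name}: MAE={mae(y_true, pred):.2f}  MSE={mse(y_true, pred):.2f}")
```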

tender dagger Oct 19, 2024, 5:58 PM

#

hello, i am an absolute beginner in machine learning. i just learned about linear regression and error metrics and wanted to get my hands on a small project using the techniques i learnt. So i started with the famous boston-housing-prices dataset on kaggle and would appreciate it if you could take a look at my code: https://www.kaggle.com/code/khalidhelmy55/boston-housing-prices and guide me on what is missing or what could be done better.
according to the metrics i calculated, the model is not performing well.

compact snow Oct 20, 2024, 6:13 PM

#

Question 1:
I’m running my error metrics locally (e.g., RMSE, MAE) on my validation set while participating in a Kaggle competition. Since the test set lacks target values, can anyone help clarify which error metrics I can use to assess my model locally, and if possible, could you list some commonly used ones?

Question 2:
Also, I’m noticing that the error metric scores I compute locally are different from the Kaggle leaderboard score. How are these related? Are the scores directly or inversely proportional, or is there another relationship I should be aware of? Any insights would be greatly appreciated!

rain oar Oct 20, 2024, 9:19 PM

#

compact snow Question 1: I’m running my error metrics locally (e.g., RMSE, MAE) on my validat...

You assess your model using validation data, not testing data. Your local scores are using the validation data, while I believe the scores on the leaderboard are using new data that is not in the testing dataset.
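
A rough sketch of what local assessment on validation data can look like, since the competition test set has no targets. Everything here is an assumption for illustration: the synthetic data, the 80/20 split, and the plain least-squares model. The point is only that RMSE and MAE are computed on labeled rows the model did not train on.

```python
import numpy as np

rng = np.random.default_rng(42)

# Made-up training data with known targets: y = 3*x0 - 2*x1 + noise.
X = rng.normal(size=(500, 2))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(scale=0.5, size=500)

# Hold out 20% of the labeled rows as a validation set.
idx = rng.permutation(len(X))
n_val = len(X) // 5
val_idx, train_idx = idx[:n_val], idx[n_val:]

# Fit ordinary least squares on the training rows only.
A_train = np.c_[X[train_idx], np.ones(len(train_idx))]
coef, *_ = np.linalg.lstsq(A_train, y[train_idx], rcond=None)

# Score on the held-out rows, which the model never saw.
A_val = np.c_[X[val_idx], np.ones(n_val)]
pred = A_val @ coef
rmse = np.sqrt(np.mean((y[val_idx] - pred) ** 2))
mae = np.mean(np.abs(y[val_idx] - pred))

print(f"validation RMSE={rmse:.3f}  MAE={mae:.3f}")
```

Local and leaderboard scores are computed on different rows, so they will rarely match exactly; what matters is whether they move in the same direction when the model changes.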

last fern Oct 21, 2024, 1:40 AM

#

pseudo storm Why in most cases people use MSE and not MAE, isn't MSE only better than MAE the...

GPT answer

drowsy plank Oct 22, 2024, 2:42 AM

#

Hi I'm looking to improve in the field of Marketing, is there any type of dataset or competition to make a recommendation system? thanks

velvet tinsel Oct 23, 2024, 4:04 AM

#

I am a 3rd-year CS undergrad with intermediate knowledge of all the data structures. I study hard and put in the effort. It looks like I need DSA to land any high-package placement, so I am practicing my DSA skills on LeetCode. But it is getting difficult for me to figure out the logic by myself (especially for medium and hard problems), and even when I look at the solution and understand it, it does not stick in my mind. Why is this happening, and how do I tackle it? Will I ever get better? Could you also guide me on what approach or steps I should follow while solving a particular question?

wraith sparrow Oct 23, 2024, 4:40 AM

#

velvet tinsel i am an 3rd year CS undergrad. i have Intermediate knowledge about all data stru...

You're in the wrong place, this is Kaggle, not a DSA server

#

Anyone using allennlp ? Is it not maintained anymore by allenai ?

hollow nova Oct 23, 2024, 5:17 PM

#

hello. I am currently working on Pandas section on Kaggle and I got a question. Can I ask it here?

#

nvm thank you i figured it out

stone anchor Oct 23, 2024, 5:53 PM

#

I'm trying to learn more about Bayesian optimization techniques for machine/deep learning. Any good YouTube series recommendations?

lost crag Oct 23, 2024, 6:06 PM

#

hey guys, I want to become a data analyst and perhaps transition to data scientist later on. Would focusing just on Python and R be enough? The programming for this field seems very different compared to other fields like web development, etc.

rain oar Oct 23, 2024, 8:33 PM

#

lost crag hey guys, I want to become a data analyst and perhaps a transition to data scien...

for starting in data analytics, focus on Python and SQL, but it depends on how deep you want to go.

lost crag Oct 23, 2024, 8:33 PM

#

I'm aiming to go pro on this stuff.

rain oar Oct 23, 2024, 10:37 PM

#

start small, learn the basics, take on challenging tasks that make you get out of your comfort zone, learn more, take on more projects, learn more, rinse and repeat until you're a pro 😄

earnest cliff Oct 24, 2024, 1:01 AM

#

Where can I find conferences or talks about data and all that world?

lost crag Oct 24, 2024, 1:30 AM

#

is there a guide about the math I have to learn that you guys recommend? I've been learning on my own, but it won't hurt to learn from well known free resources

errant glen Oct 24, 2024, 2:37 AM

#

Lin alg

#

And calc 3

#

Obviously

#

More complex versions of that combination exist

#

But

#

Most ML require those two