#playground-series-s4e10 | Kaggle | Page 1

sleek vapor Oct 1, 2024, 10:28 AM

#

Hello, world!

past fern Oct 1, 2024, 12:37 PM

#

sleek vapor Hello, world!

Hello, Mr. Sanwal

turbid sparrow Oct 1, 2024, 2:39 PM

#

Hello Guys !

lethal nest Oct 2, 2024, 8:09 AM

#

hi

unkempt yew Oct 2, 2024, 1:15 PM

#

is this the one for the loan payment?

sour light Oct 2, 2024, 1:33 PM

#

I am getting the error "Evaluation metric raised an unexpected error" while submitting my CSV file.

#

Can anyone please help me

rose lark Oct 4, 2024, 2:36 PM

#

HI, I am a data analyst beginner. Can anyone explain why the sample output given in the dataset has 'loan_status' values of 0.5 when they should be either 1 or 0? Please help me understand what I am missing.

fast mulch Oct 5, 2024, 4:42 AM

#

sour light Can anyone please help me

have you found a solution for this? I have had it before, and I solved it by removing the index in my submission.
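A minimal sketch of the fix described above, assuming the submission is written with pandas. The `id` column name and the prediction values are made up for illustration; only `loan_status` comes from this thread. By default `to_csv` writes the DataFrame index as an extra unnamed first column, and `index=False` leaves it out.

```python
import io

import pandas as pd

# Hypothetical predictions; "id" is an assumed column name,
# "loan_status" is the target column mentioned in this thread.
submission = pd.DataFrame(
    {"id": [101, 102, 103], "loan_status": [0.12, 0.87, 0.05]}
)

# Default behaviour: the index is written as an extra unnamed first column.
with_index = io.StringIO()
submission.to_csv(with_index)

# index=False drops that extra column from the file.
without_index = io.StringIO()
submission.to_csv(without_index, index=False)

header_with = with_index.getvalue().splitlines()[0]
header_without = without_index.getvalue().splitlines()[0]
print(header_with)     # ,id,loan_status
print(header_without)  # id,loan_status
```

For a real submission the same call would take a file path, for example `submission.to_csv("submission.csv", index=False)`.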

sour light Oct 5, 2024, 1:00 PM

#

fast mulch have you found a solution for this? I have had it before, and I solved it by rem...

Ohh, I'll try it, wait

sour light Oct 5, 2024, 1:00 PM

#

rose lark HI, I am a data analyst beginner. Can anyone explain why the sample output given...

It's just a sample, actual would be 1 or 0 only

sour light Oct 5, 2024, 1:06 PM

#

fast mulch have you found a solution for this? I have had it before, and I solved it by rem...

It worked, thanks a lot !!!

fast mulch Oct 5, 2024, 6:44 PM

#

sour light It worked, thanks a lot !!!

perfect! Glad it helps!

tough frigate Oct 6, 2024, 11:58 PM

#

sour light It's just a sample, actual would be 1 or 0 only

If you want to improve your score with the same model, try submitting the predicted probabilities (predict_proba) instead of the classes (predict). Note that predict_proba returns probabilities, not logits. The finer granularity of the submission often scores higher by several percentage points.

sour light Oct 7, 2024, 8:36 AM

#

tough frigate If you want to improve your score with the same model, try submitting the logits...

Oh, Interesting! Worth a try, Thanks !!

vague yoke Oct 7, 2024, 5:56 PM

#

rose lark HI, I am a data analyst beginner. Can anyone explain why the sample output given...

Hi! It's because we can predict the probability of the outcome rather than restricting the predictions to 0 or 1.

You can obtain predictions as probabilities with the {model}.predict_proba function.

In conclusion, feel free to try both approaches and see which gives the better score from the same model. I have primarily used probabilities.

ivory finch Oct 7, 2024, 9:55 PM

#

hello, I joined the competition Loan Approval Prediction. Is there somebody who can help me to understand how to get the original data? Spanish speakers are welcome.

vague yoke Oct 11, 2024, 9:27 AM

#

ivory finch hello, I joined the competition Loan Approval Prediction. Is there somebody who ...

The link to the original data is given on the competition page.

This is the link:
https://www.kaggle.com/datasets/chilledwanker/loan-approval-prediction

ivory finch Oct 11, 2024, 11:18 AM

#

Thank you @vague yoke

agile rapids Oct 13, 2024, 5:52 PM

#

hello all, I'm new to ML and trying to understand how to get a decent score in this competition. I have a basic model with 2 hidden layers (16 and 10 features). I use ReLU for the hidden layers and sigmoid for the output layer. (There is also a batch norm before the activation function.) After a fairly short while training, my training loss stopped decreasing and started oscillating. What are some ways to get around this? I've tried increasing the number of hidden layers and the number of features in each hidden layer, but I'm not making progress; the loss still fluctuates around the same value. I'm using the Adam optimizer, so I'm not sure if I really need to fine-tune the learning rate more. If I am at the point where I am overfitting the training data, how can I tell whether that is the case or whether I hit some local minimum? For reference, when submitting from this model, I am at around 83-87% on my submissions. Probably a dumb question, but if I'm already overfitting the training set, does that mean my only other option is to add some regularization to my loss function? I'm using a cross_entropy loss function from PyTorch.

vague yoke Oct 14, 2024, 9:31 AM

#

agile rapids hello all, I'm new to ML and trying to understand how to get a decent score in t...

Hi! You can try using callbacks such as ReduceLROnPlateau, which lowers the learning rate if the loss doesn't go down any further for X epochs. Additionally, use the EarlyStopping callback with the restore_best_weights parameter set to True, which restores the model parameters from the best epoch, so you don't need to worry about the fluctuation issue.
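The callbacks named above come from Keras, while the question mentions PyTorch. PyTorch does ship `torch.optim.lr_scheduler.ReduceLROnPlateau`, but early stopping is usually written by hand there. Below is a rough, framework-agnostic sketch of the early-stopping idea with "restore best weights" behaviour. The `EarlyStopper` class, the loss values, and the `state` dict are all hypothetical stand-ins, not part of any library.

```python
import copy


class EarlyStopper:
    """Hand-rolled early stopping that remembers the best state seen so far."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience      # epochs to wait without improvement
        self.min_delta = min_delta    # minimum decrease that counts as improvement
        self.best_loss = float("inf")
        self.best_state = None
        self.bad_epochs = 0

    def step(self, val_loss, state):
        """Record one epoch and return True when training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.best_state = copy.deepcopy(state)  # snapshot of the best epoch
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Hypothetical validation losses: they improve, then oscillate.
val_losses = [0.60, 0.45, 0.40, 0.42, 0.41, 0.43, 0.44]

stopper = EarlyStopper(patience=3)
stopped_at = None
for epoch, loss in enumerate(val_losses):
    state = {"epoch": epoch}  # stand-in for something like model.state_dict()
    if stopper.step(loss, state):
        stopped_at = epoch
        break

print(stopped_at, stopper.best_loss, stopper.best_state)
```

In a real PyTorch loop, `state` would be the model's `state_dict()`, and after stopping you would load `stopper.best_state` back into the model. Tracking a validation loss rather than the training loss also helps tell overfitting apart from a plateau.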

Lastly, take a look at Kaggle's free "Intro to Deep Learning" course that will give you more idea about neural networks.

Our objective in this competition is to maximize ROC-AUC score.

vague yoke Oct 14, 2024, 9:31 AM

#

ivory finch Thank you @vague yoke

You're welcome

eternal lotus Oct 14, 2024, 3:47 PM

#

Hello everybody?

#

What are you doing now?

desert flame Oct 21, 2024, 12:30 PM

#

hi, I'm new to this
I have a question about the dataset

Is person_income annual income?
Is person_emp_length the person's employment length in years?
Are the columns loan_intent, loan_amnt, and loan_int_rate for the loan they are applying for, the one whose approval we are trying to predict?
Is there anything about the loan tenure?
How is loan_grade determined?

buoyant verge Oct 21, 2024, 4:24 PM

#

The playground Dataset is synthetic. Explore relationships, but don't go too far down the rabbit hole of the data generating process.

vagrant hill Oct 22, 2024, 11:09 PM

#

rose lark HI, I am a data analyst beginner. Can anyone explain why the sample output given...

Is the answer we are supposed to give a binary 0 or 1, or can it be a value in the range between 0 and 1?

waxen tiger Oct 23, 2024, 3:43 PM

#

Hi, anyone using polynomial features in tree-based models? Does the score improve? I am asking because I see lots of people using polynomial features, but I don't really think it would help, since tree-based models pick up interactions naturally.

waxen tiger Oct 23, 2024, 3:45 PM

#

vagrant hill is the answer we are supposed to give a binary 0 or 1 is can it be a range b/w 0...

A float, not an int: the probability that a person gets the loan approved (label 1). That way it can be used to compute ROC AUC.
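A small sketch of why probabilities can score higher than hard 0/1 labels under ROC AUC. The labels and probabilities are made up, and `pairwise_auc` is a hand-written helper for illustration (in practice something like scikit-learn's `roc_auc_score` would be used). ROC AUC equals the fraction of (positive, negative) pairs where the positive gets the higher score, with ties counting as half.

```python
def pairwise_auc(y_true, scores):
    """ROC AUC as the share of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie carries no ranking information
    return wins / (len(pos) * len(neg))


# Hypothetical ground truth and model probabilities.
y_true = [0, 0, 1, 1, 1]
probs = [0.1, 0.6, 0.4, 0.7, 0.9]

# Thresholding at 0.5 collapses the scores to 0/1 and creates many ties.
labels = [1 if p >= 0.5 else 0 for p in probs]

auc_probs = pairwise_auc(y_true, probs)
auc_labels = pairwise_auc(y_true, labels)
print(round(auc_probs, 4), round(auc_labels, 4))
```

With these toy numbers the probabilities rank 5 of 6 pairs correctly, while the thresholded labels lose ranking information through ties and score lower.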

buoyant verge Oct 23, 2024, 11:55 PM

#

waxen tiger Hi, anyone using polynomial features in tree-based models? Does the score improv...

CV might improve a little. Find a way to iterate quickly and prove it for yourself.

abstract pasture Oct 27, 2024, 6:27 PM

#

waxen tiger Hi, anyone using polynomial features in tree-based models? Does the score improv...

I never saw a score improvement from polynomial features. person_income looks closer to a normal distribution after taking the log, but that doesn't increase scores much in the end.
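A quick illustration of the log transform mentioned above, using made-up income values (not taken from the competition data) and a hand-written `skewness` helper. It only shows that the log pulls in a long right tail; as noted, that does not necessarily help the score.

```python
import math


def skewness(values):
    """Population skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical right-skewed incomes standing in for person_income.
incomes = [20000, 30000, 35000, 40000, 50000,
           60000, 80000, 120000, 300000, 1000000]

# log1p(x) = log(1 + x), which also stays defined for zero incomes.
log_incomes = [math.log1p(v) for v in incomes]

skew_raw = skewness(incomes)
skew_log = skewness(log_incomes)
print(skew_raw, skew_log)
```

Tree-based models split on thresholds, so a monotonic transform like this rarely changes their predictions much, which fits the observation that scores barely move.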

ashen yacht Oct 27, 2024, 9:40 PM

#

Anybody else try upsampling with SMOTE/ADASYN to fix the class imbalance? The difference between the two methods wasn't statistically significant for me (chi-squared test), but I'd be interested to see how other people approached this. My public score was ~95%

buoyant verge Oct 28, 2024, 11:01 PM

#

ashen yacht Anybody else try upsampling with SMOTE/ADASYN to fix the class imbalance? The di...

A waste of time here.

barren flax Oct 31, 2024, 8:52 PM

#

Hi guys, I'm currently trying to get my models to 96 on the last day if possible. I achieved a 93 with a random forest model, and I've been trying to build a blend of random forest, XGBoost, and CatBoost with logistic regression as the meta-model, but its performance keeps coming out poor. I'm not sure what I'm doing wrong, but I would deeply appreciate any tips if possible!

#

https://www.kaggle.com/code/williamodonnell/loan-prediction-comp

#

Apologies for it being very disorganized, I'm trying to get better about keeping things tidy