#🏆┊competition-general
Let's get this channel kicked off! I'd love to hear what everyone's first competition was like - the good, the bad, and the ugly 🙂
I thought my first submission would take me to 1st place.
It actually took me the whole day to get the submission format correct 😅
Expectation | Reality 🥲
Ohh I agree @clever flax. I did the playground series and couldn't understand the point of splitting the train set into train/test when there is already a test set for submission. It took me days to figure out, and I had to go through many notebooks that were already submitted to finally understand.
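A minimal sketch of why the extra split helps, assuming toy data and a hypothetical `holdout_split` helper (neither comes from any competition): the official test set has no labels, so the only way to score a model before submitting is to hold back part of the labeled training data.

```python
import random

def holdout_split(rows, valid_fraction=0.2, seed=0):
    """Shuffle labeled rows and hold back a fraction for validation."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_valid = int(len(shuffled) * valid_fraction)
    return shuffled[n_valid:], shuffled[:n_valid]

# Toy labeled training data: (feature, label) pairs. Hypothetical values.
train_rows = [(x, int(x > 50)) for x in range(100)]
fit_rows, valid_rows = holdout_split(train_rows)

# Toy "model": predict 1 when the feature exceeds the mean feature of the fit rows.
threshold = sum(x for x, _ in fit_rows) / len(fit_rows)

# The validation rows have labels, so accuracy can be measured locally.
accuracy = sum(int(x > threshold) == y for x, y in valid_rows) / len(valid_rows)
print(len(fit_rows), len(valid_rows), round(accuracy, 3))
```

The competition test set is then only used for the final predictions that go into the submission file.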
I loooove the competitions co-hosted by The Learning Agency Lab! They are so very related to our daily lives and are very beginner friendly. My first competition was the Feedback 3, and I was able to get a solo silver in it. Really having fun in CommonLit!
Those initial learning curves are so rough, but they feel so good once you overcome them!
Hah, that's awesome! Solo silver in your first comp is amazing 😮
Did anyone participate in the HuBMAP competition which just ended? Any thoughts on it? https://www.kaggle.com/competitions/hubmap-hacking-the-human-vasculature/overview
Segment instances of microvascular structures from healthy human kidney tissue slides.
Let's start with the bad: my first shake-up! The final leaderboard of "Microsoft malware prediction" was terrible for me and many Kagglers, but I learnt a lot ... I think LOFO feature importance was one of the research results of this competition. (It was all about the "Centos_OSVersion" feature, see the GitHub link)
I am a bit of an outlier as I haven't completed any competitions properly despite being on Kaggle for many years... 🤣 What usually happens is that I get too hung up on data munging/learning how to work with unusual data formats such as satellite imagery, and by the time I get to modelling I am either bored of the problem or the competition is over... However, I do learn a lot just from this sort of experimenting, and since most real-world data science problems start with messy data, I would say that even if you don't like competitions there is a lot you can learn from them :))
100% agree
I can very well relate to your situation 😉 But I want to overcome this
I agree, this sometimes happens to me, usually in CV-based competitions where either the pre-processing or the modelling is kind of boring....
How do you all stay motivated for the long duration of 3 months?
I think the currently active playground series competition, Season 3, Episode 20 (Predict CO2 Emissions in Rwanda, started 2 days ago), is a good place to practice a full regression pipeline. I remember I started my journey on Kaggle with the "titanic" and "house price" playground competitions. Like many novices I was a bit confused about submissions, cross-validation, model selection and so on, but I didn't give up and decided to read other participants' notebooks (initially EDA notebooks), which helped me a lot. After that I participated in featured competitions and got my first solo silver medal in LANL Earthquake Prediction. So be persistent. Don't give up on your training and practice. There will be times when things get tough, but you need to keep going and enjoy your "Aha..." moments!
Playground Series - Season 3, Episode 20
Can you predict upcoming laboratory earthquakes?
Thanks @sharp carbon for the insight
Nice one.
My first competition was a playground competition S3E14, the result wasn’t good as I was still a beginner and learning how to approach them
My first is G2Net Detecting Continuous Gravitational Waves.
I deeply realized the fun of kaggle and decided to stick to it.
Mine was the Cassava
Nice, playgrounds are a great place to get started
Yep, making small tweaks and seeing how your LB position changes is fun ><
Oh nice!
😂It's interesting to be able to work on a lot of topics that you wouldn't normally be able to.
but yes, The increase of LB is beneficial to mental health lol
We are going to be running a competition close event for the ICR Competition! Hope to see you there
https://discord.com/events/1101210829807956100/1139331657766277152
I was already hoping to take it on myself. excited to be part of it!
So, what was the thing people did to achieve 0.06 public LB and got >5 for private LB for ICR competition 
Excited 😁
What a wild shake up
I almost participated in that one for a little bit
I spent roughly 5 days on the competition. I was pretty sure this would happen from the very start. Anyway, I learned something new in the competition as well, so worthwhile 🙂
I might take a day to just poke around and see if there is some principled approach someone could've taken to get a consistently good result.
My thought is that someone could've probably done something with very smoothed, unconfident predictions that would've scored consistently. Maybe something where you only move the prediction away from the default if there is wide consensus among many models, and otherwise just default to 0.5 or some other safe bet to minimize the damage when something is wrong.
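A rough sketch of that idea, using plain binary log loss as a stand-in (the actual competition metric may differ) and a hypothetical `shrink` helper: pulling a prediction toward 0.5 costs a little when the model is right but caps the damage when it is confidently wrong.

```python
import math

def log_loss_one(y_true, p, eps=1e-15):
    """Binary log loss for a single prediction, with clipping to avoid log(0)."""
    p = min(max(p, eps), 1 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

def shrink(p, strength):
    """Pull p toward 0.5; strength=0 keeps p, strength=1 returns 0.5."""
    return (1 - strength) * p + strength * 0.5

# The true label is 1 in every case below.
confident_wrong = log_loss_one(1, 0.01)
shrunk_wrong = log_loss_one(1, shrink(0.01, 0.5))
confident_right = log_loss_one(1, 0.99)
shrunk_right = log_loss_one(1, shrink(0.99, 0.5))

print(round(confident_wrong, 3), round(shrunk_wrong, 3))
print(round(confident_right, 3), round(shrunk_right, 3))
```

Whether the trade is worth it depends on how often the model is confidently wrong on the private data, which is exactly what nobody could see during the competition.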
These ~1000-row tabular comps always surprise me, and I can't stop myself from joining
The most curious thing that has happened to me: if we treat CV + public LB as training, private LB as validation, and submissions as epochs, and plot everything as if it were a learning curve, we can see where the overfitting starts.
Quite a few lessons learned after handling such a small dataset.
How did you see their solutions?
Hi raddar, your findings and notebooks helped me analyse the problem formulation of this competition. I am eager to understand whether the results have been beneficial for the host. As far as I can see from comparing models, the more complex the model, the worse the results ...
I have a strong feeling that the competition will be of little use to the host. However, it is hard to say, as it seems the private LB data was completely different from the training and public LB datasets.
Definitely true
Use Machine Learning to detect conditions with measurements of anonymous characteristics
The main thing is that doing that thresholding stuff to boost the public score completely obliterates your model
I don't get why that makes sense to do
I get that it sounds insane to drop data based on time values in a medical competition, but is it really that insane when these data points were clearly way off? They were really far from all the other data when I looked at them through a UMAP representation.
and it wasn't some random thing, I tried 100 different seeds and these data points were always largely separated from the main data
Yeah, I too now think that it was ballsy to drop these rows, didn't think about it when I did it, I guess these data points were poorly labelled or something, I'm pretty sure they appear somewhere in the beginning of covid pandemic too
It also seems that most top solutions are general-case CV solutions. The top-1 solution is really great too, I gotta check it out later
I tried removing these rows in my models. It really holds... https://www.kaggle.com/competitions/icr-identify-age-related-conditions/discussion/430963
his solution is public, you could try it out
but yeah that would be one hell of a notebook
going to run it 🙂
Btw, did anyone else try to understand tabpfn source code?
I might be tripping, but I don't think it's training anywhere, it just checks if data is ok to pass to tabpfn and then it just concats whatever you passed in fit with whatever you want to predict and just predicts it with pretrained model
I don't think their models are available to run in anything other than inference mode
at least I tried to load it to try fine-tuning but it said the model was inference-only or something, I can't remember
yes..
I just wonder with certain things like the null epsilon dropping if there was any possible way to know that was a good idea with the information we originally had? Like if it didnt show anywhere on the training set or public lb set that this was beneficial and only showed up on the private test set because data was sampled in a different way and we didnt know about this then how could one have possibly made this bet?
seems intuitive for me to drop data thats insanely far from most observations and also has a clear feature that distinguishes it from everything else
although I get what you are saying
One key question is "why does a submission error count as 1 submission when we only have 1 submission a day"
The cause of such a submission error is not easy to find either [like forgetting to set index = False]
There are hundreds of discussion posts on this
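One way to catch the index problem before spending a submission is to compare the file against the sample submission locally. This is only a sketch: `check_submission` is a hypothetical helper and the CSV contents are made up. With pandas, the usual fix is writing the file with `DataFrame.to_csv(path, index=False)`.

```python
import csv
import io

def check_submission(sample_csv_text, submission_csv_text):
    """Return a list of format problems; an empty list means the files line up."""
    sample = list(csv.reader(io.StringIO(sample_csv_text)))
    submission = list(csv.reader(io.StringIO(submission_csv_text)))
    problems = []
    if submission[0] != sample[0]:
        problems.append(f"header {submission[0]} != expected {sample[0]}")
    if len(submission) != len(sample):
        problems.append(f"{len(submission) - 1} rows != expected {len(sample) - 1}")
    return problems

sample = "id,target\n1,0.5\n2,0.5\n"
good = "id,target\n1,0.1\n2,0.9\n"
# What a file looks like when the DataFrame index was written too:
# an extra unnamed first column.
with_index = ",id,target\n0,1,0.1\n1,2,0.9\n"

print(check_submission(sample, good))
print(check_submission(sample, with_index))
```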
Well, those data points have an unknown date and all targets are 0. Not so weird to drop them in general
I actually have a wilder hypothesis regarding data quality. The points are not independent: some patients appear multiple times, first as healthy and then as ill. This is relatively classic (it helps with collecting data and lets you see the evolution over time). But for the ML algorithm this is not good, as you have points with close features but different targets. This is also why we have unstable CV: it depends heavily on how the observations of the same patient are split.
(This is the target over time.) See the pattern repeating between 2012 and 2016. It's like we had a bunch of patients as 0, then as 1 one year later.
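If the repeated-patient hypothesis held and a patient identifier were available (the competition data did not obviously provide one, so the `patient` field below is purely hypothetical), the usual remedy would be a group-aware split that keeps all rows of one patient on the same side. A minimal sketch in plain Python; scikit-learn's `GroupKFold` covers the same idea for cross-validation.

```python
import random
from collections import defaultdict

def group_split(records, group_key, valid_fraction=0.25, seed=0):
    """Split so that all records sharing a group id land on the same side."""
    groups = defaultdict(list)
    for record in records:
        groups[record[group_key]].append(record)
    group_ids = sorted(groups)
    random.Random(seed).shuffle(group_ids)
    n_valid = max(1, int(len(group_ids) * valid_fraction))
    valid_ids = set(group_ids[:n_valid])
    train = [r for g in group_ids if g not in valid_ids for r in groups[g]]
    valid = [r for g in group_ids if g in valid_ids for r in groups[g]]
    return train, valid

# Hypothetical data: each patient appears twice, first healthy (0), later ill (1).
records = [{"patient": p, "visit": v, "target": v} for p in range(8) for v in (0, 1)]
train, valid = group_split(records, "patient")

train_patients = {r["patient"] for r in train}
valid_patients = {r["patient"] for r in valid}
# No patient should appear on both sides of the split.
print(sorted(valid_patients), train_patients & valid_patients)
```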
Anyone wants to team up with me for the housing problem?
Sure
Glad to see you all teaming up! I see that you also posted in our https://discord.com/channels/1101210829807956100/1130572338182762657 channel, Remi. Reminder to others that our looking-for-a-team channel is where you should post for finding team members.
Hey there, thanks for posting in https://discord.com/channels/1101210829807956100/1130572338182762657 . I'm going to delete this message as it's a duplicate.
Is there any schedule for future competitions? I am hoping to join the next competition right at the beginning
Unfortunately there is no schedule for future competitions, you just need to pay attention to the site / the emails and tweets we make about new competitions.
I see. Can we reasonably expect there to be a competition starting when another one ends, or is it just random?
Each competition comes together through a series of complex events and coordination with the hosts and data scientists involved, so I wouldn't call it random, but it's definitely not the case that a new one starts when another one ends.
Alright, thanks for the info
Playground competitions start every two weeks, I believe on Monday. There should be one starting today at 7 PM EST.
Here https://www.kaggle.com/competitions/playground-series-s3e20
https://www.kaggle.com/competitions/playground-series-s3e21 (will be available soon)
@real delta I don't seem to see any nulls in the sample submission
I am getting error--- script file 'C:\Users\HP\anaconda3\Scripts\conda-script.py' is not present.
How to solve this
Is this in the kaggle kernel ?
How do I get an accuracy of 1 on the Titanic dataset?
Only with cheating https://www.kaggle.com/code/tarunpaparaju/titanic-competition-how-top-lb-got-their-score
I was hoping for that too.
Thanks for the info. May I suggest that, unless there is a reason for secrecy (like an internal policy), a tentative list of competitions be published without any commitment to dates or any assurance that the competitions will take place? This may help people plan and target specific competitions. Just a thought.
Unfortunately the main reason is indeed due to internal policy, a lot of competitions get delayed or sometimes even cancelled before they launch and hosts typically want to control the announcement for their competition. So while we know it would be very helpful, it's just not possible.
Understandable.
Hello all !! https://www.kaggle.com/competitions/predict-ai-model-runtime There is a new competition posted today. Is anyone open to work together?
Predict how fast an AI model runs
Yes I'm ready ☺️
Hi All. I am struggling to do one thing. If I train a model on my local machine, how can i use it in Kaggle competition notebooks if they do not have access to internet?
A general answer to that question is that you upload the files into a Kaggle dataset - you can create as many of those as you want. Then link to that dataset from a notebook by clicking on "Add data" from the right-side menu.
Thanks! But won't it become a public model?
You decide when creating a dataset whether to make it public or private. If you go private, it can later be changed to public. Once you make it public, I don't think it can be converted to private.
Got it. Many thanks!
In the past years, I have actively participated in multiple Tabular Playground Series competitions. During this process, I have built a Kaggle Pipeline for my personal use.
Here is my Kaggle Pipeline (https://github.com/arnabbiswas1/kaggle_pipeline_tps_aug_22). This is an open-source, Python-based pipeline (193 ⭐ on GitHub) for Kaggle tabular data competitions. Although it is customized for Kaggle TPS August 2022, with limited code changes this project can be used as a pipeline for any tabular data competition. It includes APIs for most ML competition tasks:
- data processing
- visualization
- feature engineering
- training
- ensembling
- feature selection
- hyperparameter optimization
- experiment tracking
- submission of prediction to kaggle
Here is the discussion at Kaggle: https://www.kaggle.com/competitions/titanic/discussion/435856
Start here! Predict survival on the Titanic and get familiar with ML basics
Yes
Hello everyone. Great to be here
Hello guys. I'm new here in discord channel, I would like to share with you guys two community competitions with fully sponsored visit to Italy for the first place winner. Please feel free to participate in them: https://kaggle.com/competitions/oemc-hackathon-global-fapar-modeling
https://kaggle.com/competitions/oemc-hackathon-eu-land-cover-classification
Challenge for modeling Fraction of Absorbed Photosynthetic Active Radiation (FAPAR) using ground stations and Earth Observation data
ML challenge for classifying 72 multi-annual land cover classes for Europe using LUCAS ground-truth samples and Earth Observation data
@wide scroll i want to join you man,,I want to learn n grow
so I just completed the intermediate machine learning course on Kaggle, and I believe I can participate in all the beginner competitions, ngl
still far away from participating in actual competitions but I will get there-
I think you can already go for any competition you want, as there is no particular competency requirement for participation. Supposedly I know how to compete after having done it 30+ times, yet I still placed #2555 in one of the recent competitions. It has no negative effect on me, and I simply moved on to the next one. As long as you don't take things too seriously, I think you can start right now with featured competitions.
damn, that does make sense. I was thinking that if I participate I should at least get a good score, but at the end of the day what matters is that I get practice even if I get a bad score. Gonna participate in high-level competitions as well now, even if I do garbage
Hey guys! Does anyone know any really good resources to learn Data Science and Machine Learning theory?
Anybody working on kaggle competition ribonaza RNA ?
What is the longest time you have waited for your submission to be scored after submitting it ?
Hey Mukesh! In case you're not aware, we have competition-specific channels for each new launch. RNA's is https://discord.com/channels/1101210829807956100/1149448638054015087
Hello Dear kagglers , i am looking for someone to collaborate with me in competitions , anyone who wants to join dm me. Thank You ❤️
I do....message me
2 minutes(playground series)
Does timeout count 😂
In one of the competitions I completed I got a good MSE, and it keeps improving as I experiment with adding features, but I need someone with domain knowledge so that I can achieve the lowest MSE
Hey Mukesh, MSE might be misleading: we had 1 percent MSE and 10 percent RMSE. Just check RMSE or MAE for a better scale!
I got 0.22 for content and 0.46 MSE in the CommonLit competition. Is this good, or do I need to improve?
0.46 MSE is around 0.68 RMSE. I would say that's mostly high, but it depends on the data and context ofc
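The arithmetic behind those numbers, as a small sketch with made-up predictions: RMSE is the square root of MSE, so it is on the same scale as the target, which is why a 1 percent MSE corresponds to a 10 percent RMSE.

```python
import math

def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    return math.sqrt(mse(y_true, y_pred))

def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Converting the scores mentioned above.
print(round(math.sqrt(0.46), 3))  # 0.678
print(round(math.sqrt(0.01), 3))  # 0.1

# Hypothetical targets and predictions to compare the three metrics.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.0, 4.5]
print(mse(y_true, y_pred), round(rmse(y_true, y_pred), 3), mae(y_true, y_pred))
```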
hi guys..anyone want to join me in forming a team for google mobile decimeter challenge?
Hello everyone!
I have looked over many competitions and found that topics like NLP, image processing, and deep learning are crucial for success in competitions. I have completed several courses on Kaggle, but they are not enough to get a good grounding in statistics and coding. Please suggest good free resources to learn from.
^ additionally, for a more in-depth look into deep learning: https://youtube.com/playlist?list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ&si=w0b0A3IftxhaPbYR (Andrej Karpathy's videos) and http://introtodeeplearning.com/ (MIT's mini intro to deep learning course)
Is there any chance that an existing competition will be reworked? I'm talking about the NeurIPS unlearning challenge and its flawed metric. If not, may the final leaders be reconsidered after the competition is over?
Hello mate. I want to ask about Kaggle competition submissions. Some of the test data contains missing values. If I delete those rows and make predictions, the submission file will be shorter than the sample submission file (some observations will be missing). My question is: how does Kaggle treat these missing lines in my prediction file, and how does it affect my score?
So the only way to deal with missing values in the test data is to impute them, or to remove the column if it was dropped by dimensionality reduction on the training set
Right?
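A minimal sketch of the imputation route, assuming the competition expects one prediction per test row (the usual case) and using a made-up numeric column: fill the gaps in the test data with a statistic computed on the training data, so no test rows have to be dropped.

```python
import statistics

def impute_with_train_median(train_values, test_values):
    """Replace None in test_values with the median of the non-missing train values."""
    median = statistics.median(v for v in train_values if v is not None)
    return [median if v is None else v for v in test_values]

# Hypothetical column with missing entries marked as None.
train_age = [22.0, 35.0, None, 41.0, 29.0]
test_age = [30.0, None, 50.0]

filled = impute_with_train_median(train_age, test_age)
print(filled)  # [30.0, 32.0, 50.0]
```

The filled column has the same length as the original test column, so the prediction file keeps one row per test observation.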
Thank you for the insight
Hello, I have participated in Bengali.AI competition. I ran on CPU and my notebook has been running for more than 5 hours now. Is this normal?
If not, What should I do. This is my first competition
Hi guys I am looking for a team for HackCBS 6.0
Please drop a ping if anyone has a team
hi, is everyone getting into that ML ctf competition ?
Hi Maskman, I am trying to get started. Not sure how to get past the test. I will figure it out though!
I'm in the same place, trying to figure out Cluster 1
I might have to do lots more googling
Hi Maskman, I'm in. Currently solved 9
hey ! im new here, been studying data analytics for 1 month and have been doing python for the last 2 weeks. im looking for people to give me feedbacks on how to improve my competition submissions overall. here is my spaceship titanic submission : https://www.kaggle.com/code/sebastienmotionstats/spaceship-titanic-sub-motionstats
Hi, did anyone work on https://www.kaggle.com/competitions/rsna-2022-cervical-spine-fracture-detection/overview? I need help in creating the submission
Identify cervical fractures from scans
Hi, is it legal to use a competition dataset in a publication such as a paper, journal article or student thesis?
There is no universal answer to that question. You should ask either competition organizers or Kaggle admins.
Hello! I've been using Kaggle a lot as an online Jupyter notebook for my private data, but I've been thinking about starting to do some competitions.
I'm more specifically looking for time series prediction competitions.
Do you have some to recommend? 🙂
(please @ me if you answer i dont always check the channel)
@nova cape titanic and space titanic are always good ones to begin with
I dont think those are time series tho 🤣
@nova cape Here are competitions tagged with "time series analysis" on Kaggle: https://www.kaggle.com/competitions?tagIds=13209
(There are probably others that aren't listed here, but this should be a good start)
Kaggle is the world’s largest data science community with powerful tools and resources to help you achieve your data science goals.
Right thanks!
ig i'll just start from most popular from those!
What is input data format in Cluster 1?
I have a similar question about using one of the competition datasets in a book.
Each competition has specific data use licensing. Look for the Rules tab where you can see the competition data license terms. Scroll through and look for Section 7, "Competition Data" and also at the top of the rules after competition specific terms you can see "Data Access and Use" for licensing. We have also started adding licensing to the Data tab.
Hi guys, am I missing something or there is no group created for the competition data-rush-2023
It is a kudos competition, is that the reason no-one participates in that, I can see 0 discussions in that competition
@patent dirge It means competitions without cash prizes. Kudos means "thanks".
👋
Hi!!
I need to find a competition about finding your ideal payment by changing the loan amount, interest rate and term and seeing the effect on the payment amount.
Do you know one?
@spring sluice just a formula will do 😊
😆 Reality, I need an optimizer to find the best price scenario
@spring sluice take a loan, I guess 😂
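For reference, the formula alluded to above is presumably the standard fixed-payment amortization formula. A small sketch, assuming monthly compounding and made-up loan figures, showing how the payment reacts to the term and the rate:

```python
def monthly_payment(principal, annual_rate, months):
    """Fixed payment of a standard amortizing loan with monthly compounding."""
    r = annual_rate / 12  # monthly interest rate
    if r == 0:
        return principal / months
    return principal * r / (1 - (1 + r) ** -months)

# Hypothetical scenarios: vary one input at a time.
base = monthly_payment(10_000, 0.06, 12)
longer_term = monthly_payment(10_000, 0.06, 24)
higher_rate = monthly_payment(10_000, 0.09, 12)
print(round(base, 2), round(longer_term, 2), round(higher_rate, 2))
```

An optimizer is only needed once there are constraints or costs beyond this closed form.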
Hello everyone i am new on kaggle and i want a team to start LLM Detect ai Generated text
Hey, just found out about this server
Hey everyone. I'm new to Kaggle, and I've been getting warmed up with the Titanic competition. I've experimented with several models and methods of eda, but haven't been able to get past 76%. Is there a channel or discussion that anyone can recommend?
Hey guys, we are currently a team of 2, trying to take on the LLM - Detect AI Generated Text competition.
Anyone thinking of joining?
Hi, how do I join a competition?
I notice a lot of competition winners use stacking /ensembles of high powered machine learning algorithms. Why does this technique work so well?
Mathematically, if two models are created independently of each other and have the same accuracy, an ensemble of both models will have identical bias and lower variance relative to either model alone
It's sometimes difficult to create truly "independent" models, but essentially combining model predictions is almost never a bad thing.
I'm sure there are some Youtube videos on the intuition behind this, most ML/SL textbooks will also have a detailed mathematical explanation for why ensemble modeling is statistically favorable as well
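A quick simulation of that argument, under the idealized assumption of two unbiased models with independent errors of equal variance (all numbers are made up): the variance of their average is half that of either model, since Var((A + B) / 2) = (Var(A) + Var(B)) / 4.

```python
import random
import statistics

rng = random.Random(0)
truth = 10.0
n = 20000

# Two hypothetical unbiased models with independent Gaussian errors (std 1.0).
model_a = [truth + rng.gauss(0, 1) for _ in range(n)]
model_b = [truth + rng.gauss(0, 1) for _ in range(n)]
blend = [(a + b) / 2 for a, b in zip(model_a, model_b)]

var_a = statistics.pvariance(model_a)
var_blend = statistics.pvariance(blend)
# The mean stays at the truth (same bias) while the variance roughly halves.
print(round(statistics.mean(blend), 2), round(var_a, 2), round(var_blend, 2))
```

With correlated errors the reduction is smaller, which is why diverse models tend to blend better than near-copies.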
Yes, agreed, thank you. However, the "bad" thing would be "increased CPU/memory usage".
Unless maybe, if you have a large number of features, and you give each inner model only access to a small subset of the total number of features.
It's an interesting strategy for sure.
Anyone care to explain the "Hill Climbing" algorithm intuitively? I notice many top Kagglers include it in their solutions. It seems to be becoming a popular algorithm of choice recently, especially during this latest playground competition
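One common reading of "hill climbing" in Kaggle write-ups is greedy forward selection of models for a blend; that interpretation is an assumption here, and the model names and predictions below are made up. The sketch starts from the best single model and keeps adding whichever model improves the validation score most, stopping when nothing helps.

```python
def mse(preds, target):
    return sum((p - t) ** 2 for p, t in zip(preds, target)) / len(target)

def hill_climb(model_preds, target, max_steps=20):
    """Greedy ensemble selection (with replacement) on validation predictions."""
    names = list(model_preds)
    best = min(names, key=lambda n: mse(model_preds[n], target))
    chosen = [best]
    total = list(model_preds[best])  # running sum of the chosen predictions
    best_score = mse(total, target)
    for _ in range(max_steps):
        candidates = []
        for name in names:
            blend = [(s + p) / (len(chosen) + 1)
                     for s, p in zip(total, model_preds[name])]
            candidates.append((mse(blend, target), name))
        score, name = min(candidates)
        if score >= best_score:  # no candidate improves the blend: stop climbing
            break
        chosen.append(name)
        total = [s + p for s, p in zip(total, model_preds[name])]
        best_score = score
    # A model picked k times out of m gets blend weight k / m.
    weights = {n: chosen.count(n) / len(chosen) for n in names}
    return weights, best_score

# Hypothetical out-of-fold predictions from three models.
target = [1.0, 2.0, 3.0, 4.0]
model_preds = {
    "too_high": [1.5, 2.5, 3.5, 4.5],
    "too_low": [0.6, 1.6, 2.6, 3.6],
    "noisy": [2.0, 1.0, 4.0, 3.0],
}
weights, score = hill_climb(model_preds, target)
print(weights, round(score, 4))
```

In this toy case the two biased models cancel each other out and the noisy one gets zero weight. The weights are fitted to the validation data, so they can overfit it just like any other parameter.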
where can i find the channel for Mohs Hardness Regression
guys i need some help
Could someone help me speed up a computation?
There's a specific chat for this competition: https://discord.com/channels/1101210829807956100/1173791189233844318
you need to be a bit more specific when asking for help....
For the mohs-hardness regression data set, where can I find what the acronyms of the features actually mean, e.g. what is "el_neg_chi_Average"? Am I just supposed to google this, or is there somewhere I can find this for future competitions too?
https://www.kaggle.com/competitions/playground-series-s3e25/data?select=train.csv
Playground Series - Season 3, Episode 25
Just curious does kaggle only have playground series as competitions ?
no, have you never visited the website? there are many other competitions active
Hi everyone, I'm new to Kaggle but I have done a few projects on Forecasting, Prediction modules.
I want to make some friends to work with and pursue my career.
Ok
Has anyone else been stuck on "Scoring..." after submitting? I haven't done anything very resource-intensive, so I don't understand why it's taking so long.
Could always try refresh and re-submit? Also worth checking your submission is meeting the expected format
My last submission worked with the same format, I just changed some hyperparameters, and as it was my last submission and I'll resubmit tomorrow 🥲
hello ! i am new here
I wanted to know, on a kaggle competition leaderboard, if you have the same score with someone, how is the one in front and the one behind calculated?
the displayed values are rounded off, but the ranking is done on the full precision scores
is there any info when santa 2023 will start? im really excited 😊
If you have identical scores the tiebreaker is whoever submitted their solution first.
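The two rules described above (ranking on full-precision scores, earlier submission wins a tie) can be sketched as a sort with a compound key. The teams, scores and timestamps are hypothetical, and "higher is better" is an assumption that depends on the metric.

```python
from datetime import datetime

# Hypothetical entries: (team, full-precision score, submission time).
entries = [
    ("team_a", 0.812341, datetime(2023, 12, 1, 10, 0)),
    ("team_b", 0.812349, datetime(2023, 12, 2, 9, 0)),
    ("team_c", 0.812341, datetime(2023, 11, 30, 8, 0)),
]

# Sort by score descending, then by earlier submission time.
ranked = sorted(entries, key=lambda e: (-e[1], e[2]))

for rank, (team, score, _) in enumerate(ranked, start=1):
    # All three display as 0.8123, yet the ranks differ.
    print(rank, team, f"{score:.4f}")
```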
Ok, thanks 🙂
Hello
I would appreciate any assistance in my discussion here:
https://www.kaggle.com/competitions/playground-series-s3e26/discussion/460096
Playground Series - Season 3, Episode 26
you should probably look at other people's notebooks and try to infer, im doing the same
Have you seen any notebook concerned with the unit of measurement or the normal range of features?
you should refer to the dataset for that, it has the range as well as the units
ill link the dataset down below
wait they don't have the normal rate but they do have the units
I’m aware and did attach the link of the original dataset and the competition dataset.
As you mentioned, it doesn't include the normal range, but that isn't my issue at all
I listed the 3 problems I’m facing in the discussion, please check it.
My question has been answered
https://www.kaggle.com/competitions/playground-series-s3e26/discussion/460096#2554186
great!!
How do i make my score visible on my notebook?
after running your notebook and saving its version, you can submit the output or submission file to the competition by clicking on the three dots on the competition page and choosing the submit option in the drop-down
Hi guys, I'm looking for online learning resources (courses/books/articles) and would like to know which have been crucial for your career and Kaggle "mastery"?
hey everyone, I got an RMSE of 0.4 with this notebook. Quick question: how can I improve it?
https://www.kaggle.com/code/ayeshairshadcoder/house-price-prediction-competition/
Hi, I'm new to data science. I'm doing the House Prices prediction competition, but I can't remove the NaN values in the columns with float data type. Can someone help me?
why can you not remove?? did you try googling the error? try using google, chatgpt, bard. they help solve most of the problems
thanks @lean quail I have resolved it, it works when I select the columns and do fillna
i was creating a model for image classification but the code keeps on running out of memory if i try running it on GPU or takes superrrr long if i run it on TPU, i tried fiddling with batch size, image size in pixels, reducing the conv2d and maxpooling layers but nothing can make it faster, i dont understand if it is normal for the program to take close to an hour's time for 2gb of data or am i doing something wrong ?
Hey, I'm looking for a project to do for my regional science fair. Any suggestions/ideas?
Hi, I want to do the LLM - Detect AI Generated Text project, but I am confused about how my model will be graded. I did other competitions a while ago, where I converted the data to a CSV called submission.csv and my model got graded. Can someone please explain how to submit a model and how it gets graded?
Please help me on how to submit my notebook for scoring . Has been failing for the last three days. This is my first time. Please 🙏
look at the sample submissions that are provided in the contest data, then save your output in the same format at the end of your code, then hit submit.
You can try the titanic contest as it will walk you through each step
there may be a problem with your submission file, like the submission file has negative values but it is not supposed to have, so that might lead to failing of submission
if you can brief me with what exact problem you're facing maybe i can help
yea it has a step by step guide of what to do
My submission file and sample submission are the same. The testing file provided contains only one testing image and also my submission file contains only one row. But they are the same
Anyone with a clue on how to access logs in Kaggle? Can I see the logs of my notebook?
Yes, you can. Check the "Logs" tab in the notebook preview.
I have seen it but there is not much useful info. The logs there are not from when the notebook is submitted.
Hello Dears,
I want to submit a UBC Ovarian Cancer Subtype notebook, but there is a problem with the submission: it failed after about 30 seconds and showed the message "Notebook Threw Exception".
That is a really short time, so I think it is definitely not related to the submission itself.
please help, just 2 days left
👋 Hi, everyone!
🤝 I'm looking for friends to:
💻 Join me in coding.
🏆 Participate in Kaggle competitions.
📩 If you're interested, please DM me.
🕒 Preferably in the Vienna timezone (GMT +2).
🗣️ Fluent in English language only.
I believe teaming with someone on the other side of the globe is more wise
You now work all day long on code.
Acknowledging
Hi!
I'm looking for friends to:
Join me in coding.
Participate in Kaggle competitions.
If you're interested, please DM me.
Fluent in English language only.
||def didnt copy paste this||
Hey guys Paulos here. I'm pretty hardcore studying about #🚀┊spaceship-titanic and approaching it in R. I'm aiming to get a >= 90 % score in the competition. Only 2 people have done it (as far as the leaderboard is right now, remember that only 2 month old max submissions are still in the leaderboard)
Anyways, if you are interested in approaching this problem the way I'm doing it just ping me and I will contact you and talk about it. I'm really excited and I'm learning a lot along the way. Love it.
@hallow wing Hi I'm interested in team up
hi gyz ||:')||
i'm a new kaggler
just know the basics
interested to participate in Kaggle competitions, if you tho, shout a dm out to me
Hi, anyone want to collaborate on kaggle, I am new to it.
Hi everyone, I'm new here i would love to participate in Kaggle competitions, hit me up!
Is there anyone looking for a teammate for the playground competitions? I have a small amount of experience and am looking for teams that are willing to work on the more introductory competitions. Please DM me if you would like to collaborate!
Hi, I am an aspiring data scientist, I believe we can form a team
Yes, me. I'm working on #🚀┊spaceship-titanic with R and trying to use non-classical machine learning models in #🛍┊store-sales-time-series-forecasting, where I am at around 500th place using linear regression
Ok forget about it, those are getting started competitions not playground ones
Hi, I recently completed the Machine Learning A-Z course on Udemy and the Andrew Ng course (twice), and I am studying for a B.Tech in CS (AI-ML). I want to form a group of people who really want to do well in this field and can give 3-4+ hours daily to projects
Hi, I want to team up to participate in kaggle competitions
Hello Jami
Would you like to team up for the WIDS hackathon.
I'm Ashhadul Islam a data scientist looking for a team.
https://www.linkedin.com/in/ashhadul-islam-b508581a?utm_source=share&utm_campaign=share_via&utm_content=profile&utm_medium=android_app
Link to competition
https://www.kaggle.com/competitions/widsdatathon2024-challenge1
Equity in Healthcare
Anyone know a good beginner competition for image classification?
Best computer vision competitions on Kaggle (for beginners).
Hi , I want to know what are the best beginners competitions for machine learning , I have done the Titanic and the spaceship Titanic one is there any other recommendations on what competitions should I go next would be so helpful
I enjoyed doing disaster NLP: https://www.kaggle.com/competitions/nlp-getting-started
Predict which Tweets are about real disasters and which ones are not
Thanks i will give it a go
Anyone know a good beginner competition for quant?
hello guys
The Learning Agency Lab - PII Data Detection anyone working on this project??
Develop automated techniques to detect and remove PII from educational data.
Can anyone suggest a beginner competition for data processing and data visualisation?
yes me
you can do the playground competitions
Hello, I'm new here and looking for any team willing to accommodate a new member for data science & machine learning. I'm ready to join right away.
If there is one, kindly share the link.
Hello! Looking for a team for HMS Harmful Brain Activity. I'm aware it's almost over but would like to use the data while it's available. Please dm if interested.
Hello I am interested in participating in fine-grained image recognition (FGIR) related competitions such as geolifeclef-2024 and planttraits2024 or any of the others mentioned here:
https://sites.google.com/view/fgvc11/competitions?authuser=0
https://www.imageclef.org/LifeCLEF2024
I am a PhD student doing research in this area for a while and I hope I can apply it to more practical scenarios
Please DM me if interested
Does anyone know about the Kaggle Spaceship Titanic competition?
Hey guys, I wanted to ask which classifiers I should use for Titanic prediction to get higher accuracy. So far the highest accuracy I have got is 0.7799, with an optimized Decision Tree. I have also tried Logistic Regression, KNN, SVM and Random Forest, but they got lower accuracy.
Try tuning your model's hyperparameters using scikit-learn's grid search or random search.
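For anyone unsure what grid search actually does, here is a rough library-free sketch: try every combination in a parameter grid and keep the one with the best cross-validated accuracy. The toy k-NN model, the synthetic data and all function names are assumptions made up for illustration; in practice scikit-learn's `GridSearchCV` and `RandomizedSearchCV` automate this loop for real estimators.

```python
import itertools
import random


def knn_predict(train_rows, point, k, p):
    """Majority vote of the k nearest rows (Minkowski distance of order p)."""
    def dist(row):
        return sum(abs(a - b) ** p for a, b in zip(row[0], point)) ** (1 / p)

    nearest = sorted(train_rows, key=dist)[:k]
    return 1 if 2 * sum(label for _, label in nearest) > k else 0


def cv_accuracy(rows, n_folds, **params):
    """Mean accuracy over n_folds held-out folds."""
    scores = []
    for fold in range(n_folds):
        valid = rows[fold::n_folds]
        train = [r for i, r in enumerate(rows) if i % n_folds != fold]
        hits = sum(knn_predict(train, x, **params) == y for x, y in valid)
        scores.append(hits / len(valid))
    return sum(scores) / n_folds


def grid_search(rows, grid, n_folds=5):
    """Evaluate every combination in the grid and return the best one."""
    names = list(grid)
    best_params, best_score = None, -1.0
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = cv_accuracy(rows, n_folds, **params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Synthetic two-feature data (an assumption, not Titanic): label is 1 when x0 + x1 > 0.
rng = random.Random(0)
rows = []
for _ in range(100):
    x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
    rows.append((x, 1 if x[0] + x[1] > 0 else 0))

best_params, best_score = grid_search(rows, {"k": [1, 3, 5, 7], "p": [1, 2]})
print(best_params, round(best_score, 3))
```

The score used to pick the winner comes from held-out folds, not from the rows the model was fit on, which is what keeps the tuning from just rewarding overfitting.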
Hey! I have a multilabel image classification problem with 2 labels, where the share of 1s is about 95%. I think this will harm the training. How can I increase the share of 0s?
Generally when I do data augmentation I do it randomly using an image data generator, but how can I augment only the images that do not contain a person or machine?
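One simple option is random oversampling: draw the rare class more often so each epoch sees a balanced mix, then apply the usual random augmentations to whatever gets drawn. A minimal sketch, assuming a single 0/1 label per image (the multilabel case would need a rule for combining labels); `oversample_indices` is a made-up helper name, not a library function.

```python
import random
from collections import Counter


def oversample_indices(labels, seed=0):
    """Return sample indices in which every class appears as often as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    target = max(len(idxs) for idxs in by_class.values())
    balanced = []
    for idxs in by_class.values():
        balanced.extend(idxs)
        # Draw extra indices with replacement until this class reaches the target size.
        balanced.extend(rng.choices(idxs, k=target - len(idxs)))
    rng.shuffle(balanced)
    return balanced


labels = [1] * 95 + [0] * 5          # 95% ones, as in the question
balanced = oversample_indices(labels)
print(Counter(labels[i] for i in balanced))
```

The repeated minority images are exact copies at this point, so the random augmentation applied afterwards is what makes them differ from one another.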
Hey, how do I change competition team?
Btw, how can I deal with the name column of the Titanic prediction? I see various notebooks dropping it, but felt that it would provide some information if parsed properly 🧐
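A common way to get something out of the name column is to pull out the title (Mr, Mrs, Miss, ...), which sits between the comma and the first period. A small sketch, assuming names follow the "Surname, Title. Given names" pattern; the grouping of rare titles under "Rare" is an illustrative choice, not a rule.

```python
import re

# Title = the text between the comma and the first period.
TITLE_RE = re.compile(r",\s*([^.]+)\.")
COMMON_TITLES = {"Mr", "Mrs", "Miss", "Master"}


def extract_title(name):
    """Return the title found in a passenger name, or 'Unknown' if none matches."""
    match = TITLE_RE.search(name)
    return match.group(1).strip() if match else "Unknown"


def group_title(title):
    """Collapse infrequent titles into one bucket so the feature stays small."""
    return title if title in COMMON_TITLES else "Rare"


names = [
    "Braund, Mr. Owen Harris",
    "Heikkinen, Miss. Laina",
    "Rothes, the Countess. of (Lucy Noel Martha Dyer-Edwards)",
]
for name in names:
    title = extract_title(name)
    print(title, "->", group_title(title))
```

The grouped title can then be one-hot encoded like any other categorical feature.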
@scenic barn @scenic barn I have, please take this as my interest in the same
Hey,
Looking for a team mate for the competition : hms-harmful-brain-activity-clas…
I'm familiar with eda...and sklearn machine learning models...
If interested, then plz respond...
i know the competition is about to end...but 10-15 days are sufficient to make a significant contribution
I am a noobie in that field, but would like to try my best. If you want we could link up and try it 🙂
good
But anyways I have a problem with the "House price prediction" competition. I can not use the csv or download it in any possible way. My error is 'ParserError: Error tokenizing data. C error: Expected 1 fields in line 9, saw 2'. Does anyone know how to solve it?
I've gotten something similar while using TensorFlow a lot. In some cases the dimensions of the data structure I was passing to a function were not the dimensions it was "expecting". Maybe verify the dimensions of the data you are passing, in this case that your data is 1D and not 2D. You might be passing a vector of vectors (a matrix) instead of just a vector.
Personally, I'd recommend "Deep Learning: A Visual Approach" by Andrew Glassner and "Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow" by Aurélien Géron as starter books. The first visually explains core ML techniques and algorithms. The second is technical, but well explained, with copious and succinct examples. My personal copies are dog-eared.
Maybe try to specify engine like “python” that works sometimes
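Before changing parser options it can help to look at what is actually in the file. "Expected 1 fields in line 9, saw 2" often means the first lines contain no commas at all, which can happen when the download is not really the CSV (an HTML page or a text header, for example) or when the delimiter is not a comma. A small sketch for peeking at the first lines; `describe_lines` is a made-up helper and the sample content is invented.

```python
import io


def describe_lines(handle, max_lines=10, delimiters=",;\t|"):
    """For the first lines of a text file, count candidate delimiters and keep a preview."""
    report = []
    for number, line in enumerate(handle, start=1):
        if number > max_lines:
            break
        counts = {d: line.count(d) for d in delimiters if d in line}
        report.append((number, counts, line.rstrip("\n")[:60]))
    return report


# Invented example of a "CSV" that is really an HTML page.
sample = io.StringIO("<!DOCTYPE html>\n<html>\n<head><title>Sign in, please</title>\n")
for number, counts, preview in describe_lines(sample):
    print(number, counts, preview)
```

For a real file, pass `open("train.csv", encoding="utf-8")` instead of the `StringIO` object. If the delimiter counts are not the same on every line, the file is probably not the CSV you expected.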
Hey, I plan to participate in GeoLifeCLEF. I have yet to explore the data and will start the work after completing the ongoing challenges.
PS: I am a PhD student too but I work in fundamental RL (Multi Armed Bandits if you have heard of)
Hello I am completely new to data science and programming, I am doing the Intro to programming course ( + started learning python through youtube very recently ) and was wondering if I should wait until I get the hang of basics to understand the code for titanic or just follow the tutorial and do it
Thank you ❤️
Still one month to go in my community competition aiming at understanding and improving ML on tabular data: https://www.kaggle.com/competitions/bench-tab-v1/leaderboard
Multi-task benchmark to evaluate the performance of ML models. Competition based on: https://arxiv.org/pdf/2207.08815.pdf
Hello dear members. Am new to Data Science and to Kaggle too. Just started the competition. Hope to make some new cute friends here.❤️
im looking for beginners 2 🥲
@midnight parrot @weary tide Hey guys i am new. i would like to join you !!
Hi....
I just started using my kaggle account this week actively...I was inactive but now I am trying to stay as active as possible...
There is a competition whose objective is to convert 2D to 3D images using ML...if anyone has any experience with this or has done it...pls DM me, I would be really grateful. I just want to learn and grow as much as possible...you can even DM me if you want to discuss anything regarding ML or AI...let's be friends...
Me 2
what r u learning these days, i'm still new in the field
Im trying my first competition. do you need to clean up the test data like you did the training data?
i am focusing on learning about machine learning.
no you dont need to do that.
I cleaned up all the test data, then i did feature scaling excluding the id and the field i am supposed to be "guessing" . then i split the data not 100% sure i understand what that does. then i did a tuning using LinearRegression followed by fit and predict.
I guess the question is what next and how do i know if it worked.
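On "how do I know if it worked": the usual answer is to hold back part of the training data, fit on the rest, and measure the error on the held-back rows, which is what the split is for. The sketch below also scales the validation and test rows with the statistics learned from the training rows, since a model fit on scaled features expects inputs scaled the same way. Everything here (the synthetic data, the one-feature least-squares fit, the helper names) is an assumption for illustration only.

```python
import random
import statistics


def fit_scaler(values):
    """Learn mean and standard deviation from the training values only."""
    return statistics.mean(values), statistics.pstdev(values) or 1.0


def scale(values, mean, std):
    return [(v - mean) / std for v in values]


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx


# Synthetic data standing in for a competition's train and test files.
rng = random.Random(0)
xs = [rng.uniform(0, 100) for _ in range(200)]
ys = [3.0 * x + 10.0 + rng.gauss(0, 5) for x in xs]
x_test = [rng.uniform(0, 100) for _ in range(50)]   # no targets for these rows

# Hold out the last 20% of the training rows for validation.
cut = int(0.8 * len(xs))
x_train, x_valid = xs[:cut], xs[cut:]
y_train, y_valid = ys[:cut], ys[cut:]

mean, std = fit_scaler(x_train)                      # statistics come from train only
slope, intercept = fit_line(scale(x_train, mean, std), y_train)

valid_preds = [slope * x + intercept for x in scale(x_valid, mean, std)]
rmse = (sum((p - y) ** 2 for p, y in zip(valid_preds, y_valid)) / len(y_valid)) ** 0.5
print("validation RMSE:", round(rmse, 2))

# The test rows go through the same scaling before prediction.
test_preds = [slope * x + intercept for x in scale(x_test, mean, std)]
```

If the validation error is far worse than the training error, the model is overfitting; if it looks reasonable, the test predictions are what goes into the submission file.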
I'm really surprised I got this far 🙂
@snow egret you are doing great as A Newbie. Kudos 👏🏻
Which competition would I learn the most as practice if I am going to build a machine learning model for predicting how likely horses are to win a race in horse racing?
And would i learn more from an active or completed competition?
Hi, i'm new to Kaggle but the classic Titanic competition is a great first one to try and it should have cross over to your project. Beyond that i would be surprised if there isn't already a competition for race predictions you could try.
Hi Everyone! I am starting my journey in ML- AI, starting off with statistics for ML. Would anyone like to join me? It will be healthy competitive way to learn, track and explore.
Hi, Do you like to work on computer vision competitions?
oh same starting off right now, what you studying for statistics?
Same here, I am new to ML as well and would love to learn in a healthy competitive way too.
🌟 Hey everyone! 🌟
I'm thrilled to present my latest notebook: "Gemma, LLM, Kaggle Solution Writeup Agent"! 🚀
Ever wondered how top performers crack the code in Kaggle competitions? Join me as we'll unlock the secrets behind winning solutions, powered by Gemma, Google's Open Model.
We'll unravel the strategies, techniques, and secrets behind those winning submissions. With Gemma we'll dive deep into the heart of data science excellence. 🌐
Throughout the notebook, we'll:
🔍 Deep dive into Kaggle solution write-ups, dissecting the approaches of top performers.
🎓 Explore innovative methods for data generation, few-shot prompting, and fine-tuning LLMs to boost model performance.
💡 Break down complex concepts with real-life examples, making them accessible to everyone interested in data science.
Discover how to upload your fine-tuned models to the Kaggle Model Hub, opening up opportunities for collaboration and knowledge-sharing within the community.
Check out the notebook right here: https://www.kaggle.com/code/ianakoto/gemma-llm-kaggle-solution-writeup-agent
Is there a way to submit a notebook to competition without it consuming GPU quota?
Yes, here are a few options to try https://www.kaggle.com/competitions/kaggle-llm-science-exam/discussion/440811
Use LLMs to answer difficult science questions
Thank you so much
Somenone can tell me a project with source code that i can put on my resume
is there a sequence of competition on kaggle that I can follow from beginner to advanced?
I think I can start from here
but on leetcode we can sort easy to hard
I feel like there should be a pinned message which gives us or guides us as to how to generally approach a competition including doing Research, Data collection etc, it will be really helpful for us beginners!
hey, for a notebook competition, once I submit a notebook, I am getting an error that it's not able to find my custom model which I have supplied with the notebook.
I have ran the notebook myself in kaggle and it seems to work there. Any tips, suggestions?
You can select based on the topic of the competition. There’s filters for it. No such filter for easy to hard though
Good day! I am an Applied Computer Science student, currently in my second year with a focus on Artificial Intelligence. For our Deep Learning Project, we have to choose a competition on Kaggle where we have to use at least one of the following:
- MLP
- CNN
- RNN
- Auto-encoder
- NLP
As we are very limited in time (only 10 days), they have advised us against using CNN, as training would take too long. So my question is which of the current competitions would you suggest me to choose? Thank you all!
Hey everyone !!
I hope everyone is doing great
This is my first ever Kaggle competition, hence I don't have prior experience with respect to submissions. I just wanted to ask how does submission work ? Do we have to upload the model or a pipeline (script) somewhere ? (I could just see an option to upload the notebook) . Apologies if the question is repeated
It would be of great help if anyone of you could provide some necessary information about the same
you submit the notebook
that code on kaggle
you can code locally but
have to upload the notebook to kaggle
in some competitions you can submit the submission.csv (I think in the playground competitions)
rnn and auto encoders are a good bet
also, does your project have to be a prize competition, or are you okay with the non-prize playground series?
you have a lot more options in non prize comps
Yes, it may be both prize and non prize series!
MLP: Titanic, Prosumers, Home Credit Default Risk, Digit Recognizer, Regression with an Abalone Dataset, Store Sales Time Series Forecasting, Leash Bio, and possibly others.
RNN: Prosumers, Store Sales Time Series Forecasting, BirdCLEF 2024, and possibly others.
Auto-encoder: Predict Energy Behavior of Prosumers, Leash Bio, and potentially others.
NLP: Titanic, PII Detection & Removal from Educational Data, NLP Getting Started, Automated Essay Scoring 2, and Contradictory, My Dear Watson.
you can try these on the above competitions
Thank you so much!
anyone know of a time series competition where exogenous data was useless and where the given time series to predict was disgustingly bad?
(Thats what the data im working on looks like so im trying to find winning notebooks to see how they went about it )
Is there a channel for the WiDS Datathon 2024 Challenge #1 for cancer diagnosis
great stuff
thanks
Good day everyone, I have a few questions about my model for the Titanic competition. I extracted the titles of each person and added them as extra features. Then I normalized my data and did PCA. Then I took the 10 best principal components and fit a few models on these. All of my models are performing very badly, even after an extensive grid search with each model. Does anyone have any tips?
what do you mean by badly exactly?
I am planning to participate in "Regression with an Abalone Dataset" kaggle competition. Any suggestions and guidance please.
Hi! I'm participating in my first Kaggle Code Competition, so I'd be grateful if you could help me, a beginner, understand the requirements for an eligible submission.
From what I've read in the Kaggle documentation, the submission notebook must run "top to bottom" in less than 9 hours of CPU / GPU runtime. That means that in my submission I should train the model on the training dataset and also predict on the test dataset in less than 9 hours, right? Or am I getting it all wrong?!
I'm asking this question because I've noticed in this and other competitions' code tabs that there are public inference-only notebooks that import model(s) trained elsewhere (uploaded as Kaggle datasets) and use them directly to predict on the test dataset. This shortens the total runtime of those notebooks. Is this kind of notebook allowed to be a final submission? Or is this just a way to avoid exhausting the weekly GPU quota while also letting people see how well their predictions perform on the public leaderboard and making certain notebooks public for the community without revealing too much of the training process used?
If this isn't allowed, then how are my submissions supposed to compete with these sped-up notebooks, with high public scores, especially in the efficiency section of the contest?
Thank you in advance!
Hi everyone! I'm excited to announce that I'm participating in my first Kaggle Competition, WiDs Datathon Challenge #2! I've just begun my journey into machine learning and modeling, and I'm eager to learn more.
I'm wondering if anyone would be available to chat about the fundamentals of training a model and share some knowledge with me. I believe that by learning and teaching each other, we can all improve our skills together.
If you're interested, please don't hesitate to reach out to me.
Thank you in advance!
Hi can anyone like give some idea regarding structure to motion model...like from basics how can I implement it...I have a month time
It would be of great help...pls dm me
It is complicated to say in advance what is easy and what is hard, because it depends on the progress one wants to achieve. If you need more user friendly competition breakdown, you can check Interactive Timeline of Active Kaggle Competitions: https://www.kaggle.com/code/kononenko/interactive-timeline-of-active-kaggle-competitions
https://www.kaggle.com/competitions/rsna-2024-lumbar-spine-degenerative-classification/discussion/505673 Is anyone participating in the RSNA hackathon?
Classify lumbar spine degenerative conditions
Hi, If there is anyone from Kaggle team.
I wonder why can't we make private competitions public again?
Hi Everyone! I just posted my work in my very first competition. Can you all please have a look at it and let me know of any improvements/suggestions?
Playground Series - Season 4, Episode 5
Does anyone know why I cannot submit my notebook?
Disable the internet
There's option
Can the XGBClassifier model handle the class imbalance problem on its own, without me doing the scaling? Here is a model I just made; could I kindly ask you for feedback in the comment section? https://www.kaggle.com/code/mohamedlazaar2/basic-xgbclassifier
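On the imbalance part: feature scaling and class imbalance are separate things. Tree boosting generally does not need scaled features, and for imbalance XGBoost has a `scale_pos_weight` parameter whose commonly suggested starting value is the number of negative rows divided by the number of positive rows. A small sketch of computing that ratio; the helper name and the example labels are made up.

```python
from collections import Counter


def suggested_scale_pos_weight(labels):
    """Ratio of negative to positive examples, a common starting value for scale_pos_weight."""
    counts = Counter(labels)
    if counts[1] == 0:
        raise ValueError("no positive examples in labels")
    return counts[0] / counts[1]


labels = [0] * 90 + [1] * 10     # invented 90/10 split
weight = suggested_scale_pos_weight(labels)
print(weight)
```

The value would then be passed as `XGBClassifier(scale_pos_weight=weight)` and treated as one more hyperparameter to validate rather than a guaranteed fix.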
hi
Do we know what new competitions are coming up in following months?
Hello Kagglers! 
If you are about to start a competition, I offer you to join my team to solve "LMSYS Chatbot Arena Human Preference Prediction". Because more minds are better than one! 
Apart from solving the LMSYS competition I want to create some kind of short-term "mastermind group". A mastermind group, in the context of building a career in IT, is a peer-to-peer mentoring concept used to help members solve their problems with input and advice from the other group members. The idea is to bring together like-minded individuals who are focused on mutual growth and success within the IT field. 
The activities of the mastermind group will include working together on the LMSYS competition, checking in on each other’s individual projects, providing feedback and help each other grow! 
If you are interested, feel free to DM me. 
@vivid rose interested
@vivid rose - I would like to join you
I’m interested
Hello people! My name is paul and I'm currently on going for the RSNA 2024 lumbar spine.... competition, but I need some help. I understand the data given to me by the competition, but I don't get how to format the x_train/test and/or the y_train/test. Basically, I don't know what to do with the data I'm given, I'm stuck with what I should load the model and what the output should be. Please help if you can.
Well here's what I did:
I iterated over the train.csv, fetched the series_id from train_series.csv and loaded the image in the corresponding folder path in a tensor
You have to make a prediction for each study id
Each series consists of a 3d image that can be reconstructed from the list of 2d mri scans
You don't necessarily have to reconstruct it as a 3d image though. You might find it easier to treat it similar to video classification
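Roughly, the bookkeeping described above looks like the sketch below: group the series rows by study so that each study becomes one training example with several image folders. The column names (`study_id`, `series_id`, `series_description`), the inline rows and the `train_images/<study>/<series>` folder layout are all assumptions for illustration; check them against the actual competition files.

```python
import csv
import io
from collections import defaultdict

# Invented rows standing in for the series file; column names are assumptions.
SERIES_CSV = """study_id,series_id,series_description
100,1001,Sagittal T1
100,1002,Axial T2
200,2001,Sagittal T1
"""


def series_by_study(handle, image_root="train_images"):
    """Map each study id to a list of (description, folder path) pairs."""
    grouped = defaultdict(list)
    for row in csv.DictReader(handle):
        folder = f"{image_root}/{row['study_id']}/{row['series_id']}"
        grouped[row["study_id"]].append((row["series_description"], folder))
    return dict(grouped)


studies = series_by_study(io.StringIO(SERIES_CSV))
for study_id, series in studies.items():
    print(study_id, series)
```

Each folder would then be read into a stack of 2D slices (the model input), and the label row for the same study id becomes the target.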
i see
it's the solution that i could think of so im trying it for now
im just doing it to learn but probably the best way is to use a pretrained neural network for MRI images
Starting off with a pretrained backbone can save you time for sure, but you will most likely still need to do quite a bit of training
Hi guys, I am pretty new to DL. I built an MLP and a CNN from scratch using NumPy, but I'm not sure how to apply them to the various datasets. I want to have a good foundation before building models in the RSNA 2024 lumbar competition, so does anyone have any recommended resources for this image competition?
I am interested in collaborating with a team for some upcoming data science competitions and hackathons. Anyone up?
i'm interested too , please if you got a team please tell me , or we can create a team
hey , I wanted to ask, is it possible to solve "Titanic - Machine Learning from Disaster" problem with neural network ?
@rocky zinc You may search on YT there a lot of tutorials over there for the same
Thanks gupta
I have my LinkedIn set up, but I don't have any ideas on how to apply for internships. I need to join urgently, and I'm unable to find any good opportunities yet. Any suggestions?
are you looking for winter internships?
me
Applying on LinkedIn honestly requires a lot of skills (you will understand when you see it), so make a resume. Search for your field (for example, front-end development jobs/internships) in the LinkedIn search bar and you will see tons of jobs; in the right corner there is an option to select the country where you're looking for a job.
Hope this helps
Where is the channel for Cancer detection competition
Hi everyone, I’m new here and need some help. Can someone please explain what I need to do for this Skin Cancer competition in a simple way?
- What do I need to do? - Develop a program to identify dangerous skin spots from images.
- What language and tools should I use? - Use Python and tools like TensorFlow or PyTorch.
- Which data should I use? - Use the Training and Test data provided in the competition files or data from outside.
Thanks a lot!
I'm calculating pAUC_80 using the implementation found on the ISIC skin cancer competition, but it is not matching the score I get back. Any ideas on how to fix this? I want to test my model's true value without wasting a submission.
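Some gap is expected no matter what, because a local score is computed on your own validation rows while the leaderboard uses hidden test rows. To sanity-check the metric itself, here is a plain-Python sketch, assuming pAUC_80 means the area under the ROC curve that lies above TPR = 0.8 (so a perfect model scores 0.2). This is not the competition's official code, and the function names are made up.

```python
def roc_points(labels, scores):
    """ROC curve as (FPR, TPR) points, grouping tied scores into one step."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    ranked = sorted(zip(scores, labels), reverse=True)
    points, tp, fp, i = [(0.0, 0.0)], 0, 0, 0
    while i < len(ranked):
        threshold = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == threshold:
            tp += ranked[i][1]
            fp += 1 - ranked[i][1]
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def pauc_above_tpr(labels, scores, min_tpr=0.8):
    """Area between the ROC curve and the horizontal line TPR = min_tpr."""
    points = roc_points(labels, scores)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x1 == x0:
            continue  # vertical step: no width, no area
        h0, h1 = max(y0 - min_tpr, 0.0), max(y1 - min_tpr, 0.0)
        if y0 < min_tpr < y1:
            # The segment crosses the line: only the part right of the crossing counts.
            x_cross = x0 + (min_tpr - y0) * (x1 - x0) / (y1 - y0)
            area += 0.5 * h1 * (x1 - x_cross)
        else:
            area += 0.5 * (h0 + h1) * (x1 - x0)
    return area


labels = [1, 1, 0, 0]
print(pauc_above_tpr(labels, [0.9, 0.8, 0.2, 0.1]))   # perfect ranking
print(pauc_above_tpr(labels, [0.1, 0.2, 0.8, 0.9]))   # fully reversed ranking
```

Running both this and the notebook implementation on the same out-of-fold predictions is a quick way to tell a metric bug from an ordinary validation/leaderboard gap.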
Meee
You will use Python and Machine learning.
Do you recommend deep learning?
I have tried both but lgbm seems to be the best method
Resnet got me terrible scores
On tabular data, tree boosting seems the way to go
Has anyone used image data with high score yet?
From what I can see most people are using lgbm
I think people are just copying each other to get the highest score for now tho xd
it's all extremely close to each other with slightly different params if you look at the open code
You just summarized the entire field lol
It's basically distributed Graduate Student Descent. People grab the best performing publicly available approach, tweak some hyperparams and hope for the best
Sometimes one of those hyperparams puts the performance at a new order of magnitude and then becomes part of the baseline for future GSD iterations
😭 I tried with other solutions and nothing I made compares to the open code
Open to collaborate buddy
On July 1, The Global Multimedia Deepfake Detection Challenge 2024 (The Challenge) officially launched. This competition is co-hosted by the INCLUSION·Conference on the Bund and Ant Group, and organized by Ant Digital Technologies.
The Challenge aims to invite participants to develop, test, and further evolve more accurate, effective, and innovative detection models against various types of Deepfake attacks in real-world scenarios, as well as to motivate innovative defense strategies and improve Deepfake recognition accuracy.
https://www.kaggle.com/organizations/inclusionconference-on-the-bund
can i dm a competitions master/expert, i want guidance on how to get better in competitions...
Anyone has any idea how i could improve on public score
i working on Leash belka SMILES prediction
I get this error while downloading the data via API, any suggestions?
403 - Forbidden - You must accept this competition's rules before you'll be able to download files.
PS: I have already accepted the rules for the competition
@lofty bear, @random umbra or any Kaggle Staff can you please help
you need to participate in the competition before able to download the files , see on the top right corner there will be written "join competition" once you click on it, rules will pop up on the screen, accept the rules and then you will be able to download the data(training/testing - data) files
I am already into the competition, and have accepted the rules
I am interested in collaborating with a team for some upcoming data science competitions and hackathons. Anyone up?
I am up and looking forward to enter competitions
Check dms
Are there channels dedicated to community competitions here ? Such as https://www.kaggle.com/competitions/rohlik-orders-forecasting-challenge/overview
I couldn’t find any
Use historical data to predict customer orders
hey
Hi
We don't automatically create channels for community competitions, but if hosts contact us asking for one we are currently experimenting with creating them to see if they get traction.
Thanks!
hi i had a question about my work on the current playground series if anyone is around to help me out it would be much appreciated
I am wondering why no competitions started in the last month and why there are only five active competitions recently
I was wondering that too.
Hi ! I'm a first-timer in a Kaggle competition. Getting my head around the skin cancer dataset. Can I ask a stupid question?
The AI model we are developing is for an end user to take a mobile photo at home and submit this. So there won't be any other information provided to our model for inference. So is there anything useful to us about all the metadata in the dataset? Surely we just focus on the photos and labels?
I went and read it carefully. It doesn't seem to be about creating a system for an end user to take a mobile photo and submit it. Rather, it is to analyze "3D total body photos (TBP)" that happen to have the quality of "close-up smartphone photos". From this perspective, the metadata from the user will be available to a clinician and is potentially important.
Hi all, I worked with Claude 3.5 Sonnet to solve the problem and this article explains what we did. Would love to hear comments. https://www.linkedin.com/posts/james-bockrath-2ab6422a0_activity-7226616370333933568-JZ8X?utm_source=share&utm_medium=member_desktop
Hi, Kaggle community!
My friend Alex and I are developing a tool to make deep neural network training interactive, user-friendly, and deterministic.
Are you tired of the guesswork and never-ending experimentation?
Are you seeking an alternative methodology that makes the process less painful and more understandable?
We'd love to hear from you! Please share your thoughts and help us to upgrade the current methodology.
LINK: https://calendly.com/graybx/30min <<
Making things deterministic is not at all possible unless you fix seeds. Talking about interactive, wondering what you are doing different from weights and biases?
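For anyone wondering what "fixing seeds" means in practice: a seeded random generator replays the same sequence, so two runs make the same "random" choices. A tiny standard-library illustration (the `sample_run` helper is made up); ML frameworks have their own seed calls, such as `numpy.random.seed` and `torch.manual_seed`, and some GPU operations can stay non-deterministic even with seeds set.

```python
import random


def sample_run(seed):
    """Stand-in for a training run: draws a few 'random' numbers from a seeded generator."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(3)]


print(sample_run(42) == sample_run(42))   # same seed, same sequence
print(sample_run(42) == sample_run(43))   # different seed, different sequence
```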
The difference between our solution and Weights&Biases is that during training, you can tweak and change hyperparameters, change the architecture, set the learning rate individually, re-initialize dead/low impact neurons, and deny-list/allowlist data samples. This is supported by suggestive neuron-level statistics, so the changes you make are rather informed than guesses. For instance, you can grow the learning capacity of the network (a.k.a adding neurons to a layer) every time it plateaus, thus reaching convergence with overall fewer FLops.
During training? How is that technically happening can you please explain
You can take a look at our demo for a better understanding of the technical aspects
LINK: https://www.loom.com/share/a5c3b934d65f413da023282d64b42d70?sid=41c57ae3-eb03-4854-b638-df37250287ab
hello everyone! about 10 months ago I remember seeing some competitions about quantifying uncertainty of predictions. If I'm not mistaken, it was the first episode of a new series of playground competitions. The problem is I can't find them anymore. Does anyone know what happened to them?
Same lol, I'm looking forward to the next NLP competition; there hasn't been one launched in the last 3 months
hi I just started my first nlp competition about tweet disaster. I saw a bunch of people reached the highest score. I wonder which kind of model did the job for you? Was it deep learning? machine learning? or already trained model?
So I am training a model on the skin cancer detection dataset, which has images with respect to the target variable. When I use any pretrained model such as ResNet or EfficientNet it gives me a below-par accuracy of 56%, but when I use a random CNN it gives 90%. Why?
I have been waiting for time series competition for like 7 months 🥲💔
Now there are only 4 active competitions which really make me wonder if there is something going now
I believe we usually used to have like 7 or 8 active competitions at least
Are you using only the algorithm or using the Pretrained weights too?
Pretrained weight too with unfreezing only last 10 layers
I read once that models trained on normal images, e.g. images of people, fruit, animals, cars, etc., don't work well when fine-tuned on medical images
Also, I didn't work on this competition so I don't know how good a score of 90% is
But it is possible that there was data leakage in the second case, where you used the random CNN, which led to this optimistic score
Of course, if the score was obtained from the LB then you can eliminate this possibility
there have been barely any tabular comps this decade 😅
at least, the obvious way is to use tabular data
How much time will a submission take for this skin cancer detection competition?
Is there any new competition on kaggle, that we can join
Regression of Used Car Prices seems to be good for newbies.
Actually, not that much lately
Does anyone have information about if DEFCON will launch a CTF competition at Kaggle this year?
Is there any ongoing competition?
There are 4
Is there any course on coursera that can help me get started with the kaggle competitions?
How do teams typically share code while developing an analysis and solution for a competition?
Do they use a GitHub repository, or is some Kaggle notebook more typically used?
Did you find this competition page helpful? Doing this will get you knowledgeable
https://www.kaggle.com/competitions/titanic/overview
Anyone working in Amazon ml challenge
Is anyone working on the Store Sales - Time Series Forecasting competition, I am just beginning, and I would like to join
@main storm yep I am
as a beginner, I started working on the store sales.
Hello folks anyone interested in taking part in any upcoming kaggle competition pls dm me asap
I am, which competition you are planning for?
Can anyone help me? Whenever I submit, it shows an inference error and gets rejected.
Hi @vivid kite. Are you still looking for mates for competition. I am in. Which competition are you planning for?
I got a position of 2000 in a competition with 2500 members 😭😭
finna go jump from a tower
Hi everyone,
I'm planning to participate in the "Child Mind Institute — Problematic Internet Use" Competition. This is my first ever Kaggle competition. I'm not sure how to get started tho?
Hey friends! 🌱👩🏽💻👨🏽💻
My team and I recently participated in a Climate Risk Research Challenge, and we developed DEWSClim, an early warning system app to help farmers tackle climate risks and improve food security. 💡🌾
We’ve entered the Gemini API Developer Competition, and I need your help! Please vote for our project to help us take DEWSClim from idea to reality. Voting ends September 30th, so your support means everything!
Link to vote: https://ai.google.dev/competition/projects/dewsclim
Thanks so much!
when winning competition, do we need to presented our solutions to the hosting company?
Just started my second competition. I noticed that on the leaderboard there is an rf_benchmark.csv entry with a different icon. Does anyone know what this is? I am guessing it means random forest benchmark. Is this an autogenerated prediction?
Hi guys, is anyone interested in teaming up with me for the BrisT1D Blood Glucose Prediction Competition?
Feel free to contact me for teaming up remember I'm still a learner
im a learner as well been doing datascience for about 7 months now im willing to team up if anyone is available
looking for a team mate for the Child Mind Institute competition to learn more anyone wanna teamup ?!
I'm Zeineb, currently pursuing my master's in data science. I'm still quite new to generative AI, but I've been diving in! I recently worked with LLMs and built a little AI Q&A chatbot for workout and diet recommendations using Mistral 7B, which I fine-tuned with some data scraped from Quora and Reddit. I also built a facial recognition system using Python and DeepFace.
If there is an interesting project, I would love to be a collaborator 🤗🚀
Hello, I’m Sarvesh, a PhD student in AI at IIT Bombay and currently working as a Student Researcher at Google. I am seeking collaborators for various Kaggle challenges and conference competitions, as well as research collaborations.
To collaborate, you should be well-versed in coding, particularly with experience in large language models (LLMs) and frameworks like PyTorch or JAX. For research collaborations, a strong mathematical background is also necessary.
Please note that due to time constraints, I will primarily provide guidance rather than direct involvement. If you meet these criteria and are interested, feel free to get in touch.
I was participating in the EEDI competition, any idea why the cell randomly stops running while loading the model?
is this normal I cannot use the file upload feature? do i have to use a kaggle notebook vs local dev ?
the kaggle notebook runs out of memory, do i have to pay to use google notebooks to compete?
ayo, I solved ARC but I need to wait 9 hours, what do
if you ever wanted to know what it feels like to stare at 600k, ask me anything
8 hours now...
What are we looking at
600k?
??
This is my model doing the thing with no GPU acceleration
what arc is this? housing?
I don't understand how to generate my submission.csv file for the competitions.
Hi everyone, here is a link to challenging kaggle competition in healthcare.
Those who are experts in fine tuning with small datasets should definitely give it a try
https://www.kaggle.com/competitions/artificial-intelligence-in-respiratory-sounds
Using machine learning to classify chronic respiratory diseases from human generated sounds like cough, breath, Vowel O etc.
Can anyone tell me what project-based learning is, what the requirements are to start building anything, and how people follow this path?
does it have any awards
hello, i am doing my first competition and i have a question. when i joined the competition i created a notebook for it where it's linked to the competition, and i see that there's a code section where people show their EDA and stuff. i know that if you're high on the leaderboard you're probably not showing how you're doing it, so how do you submit without showing others your code? do people have a public notebook for EDA and then submit a different way?
Notebooks you make in competition are private until you make them public. Best of luck.
@glacial pike thank you!!
Check out Part 3 in the Titanic tutorial: https://www.kaggle.com/code/alexisbcook/titanic-tutorial It shows you how to submit to competitions
Project based learning means you pick a project and learn the skills and concepts you need to complete the project, when the need to learn naturally occurs while you're trying to complete the project. This can be tough because you don't know what you don't know, so sometimes you just have to jump off the deep end and pick something out of your comfort zone and get started researching.
Looking to join a team for this math competition!
I'm in! How can I join your team?
Thanks, this answer taught me some very good things and I will definitely start implementing them asap
has anyone done the House Prices - Advanced Regression Techniques competition?
Not yet, did you start that one on your own or is it part of a kaggle course
Hey everyone... I have decent experience in ML and AI but haven't ever done any sort of challenges and would like to start with the Kaggle ones. I'm not an expert for sure, but how do I find challenges that are not too easy and not too difficult?
https://www.kaggle.com/competitions/playground-series-s4e10 try this one. If it's too hard, do the beginner competitions. If it's too easy, do the advanced competitions.
Thank you very much
https://www.kaggle.com/code/turabiyldrm/easy-catboost-tutorial-0-69-score-in-3-minutes
I’ve shared a notebook on Kaggle—feel free to check it out!
Can we use cohere llms ?
hello,
I am working on a loan approval prediction project and am considering modifying features by removing less important ones or engineering new ones based on their importance. If I apply these feature modifications consistently to the training and test datasets, will my notebook be eligible for acceptance within the competition?
Additionally, as I am new to Kaggle competitions, I would appreciate an overview of the typical roles and responsibilities that participants might take on.
Hello everyone... I am learning ML, but do not have any experience yet. I am new to Discord and Kaggle. How can I use it to improve my learning and kickstart my career? Please guide me about this platform
I wrote a discussion post on how to get started using ML using Kaggle Learn as well as competitions: https://www.kaggle.com/discussions/getting-started/541178
30 Days of ML Assignments with working links!.
https://www.kaggle.com/code/turabiyldrm/eda-and-feature-choosing-with-randomforest
can you please upvote? it is the last bronze medal I need to become an Expert
do some kaggle courses to familiarise urself first
Hello everyone! Although the WiDS competition has ended for a while now, I’m excited to finally share my work. You can find the details in my GitHub and Kaggle links below. I’d love to hear your feedback!
https://github.com/drkbluescience/WiDS2024_Challenge2_MetastaticDiagnosisRegression
https://www.kaggle.com/code/enisezengin/wids-24c-2-metastaticdiagnosisregression-80-154
Thank you!
This notebook presents an exploratory data analysis (EDA) and regression modeling approach for the WiDS Datathon 2024 Challenge #2. - drkbluescience/WiDS2024_Challenge2_MetastaticDiagnosisRegression
done
not yet but planning to start
Why not start with nlp ?
i have already started pandas
I would still recommend joining that AI club after you register (or, even if you're not considering registering, do install PyCharm; there are nice built-in numpy/pandas and other data science courses in the IDE)
Does anyone know why plagiarism is allowed in kaggle competitions? In every competition you look at, the top 10% in public leaderboard always have the same scores because they've forked the highest scoring solution
and this behaviour is even more encouraged because you gain notebook medals when sharing these types of notebooks
yeah, in ARC, the entries on ranks 42-813 (772 entries in total) all have a score of 26 😂
With medals being given out until rank 134, the majority of the medals will go to copycats.
so you have to beat rank 42 or else youre in the bottom 50% 😂
rank 42 with 6 days to go as well, how depressing
Is anyone taking part in Meesho Competition?
The goal of Kaggle competitions is to get the best possible solutions, public sharing during the start and middle of a competition can help accelerate all teams and leads to better learning experiences for participants and higher quality final top solutions.
Your goal should be to learn from what is public and build on it to secure higher results. Usually this process will push a lot of simple "fork and submit" users out of the medal range, but unfortunately that doesn't always happen. It's unfortunate, but the alternatives (eg. blocking public sharing) would be worse for most participants and most hosts.
In that case you should completely rework the medal and rank system. It's a joke that the best way to earn titles is by forking and submitting.
"Usually this process will push a lot of simple "fork and submit" users out of the medal range, but unfortunately that doesn't always happen." - It literally always happens in every competition because people just try to farm medals. In the scenario from the user above, you get pushed to the bottom 50% if you don't beat the highest scoring notebook.
I understand the point about knowledge sharing, but it completely destroys the point of competing and trying to get medals imo
If you look at the data (check out the metakaggle dataset) I think you'll find that isn't the case. For one thing, the results from the public leaderboard often don't result in good outcomes on the private leaderboard (due to overfitting etc).
Bronze medals are certainly possible to get with a little luck and forking, silver medals are very rarely awarded this way, and gold never are. To get to master or grandmaster tiers (the ones that actually matter) you need gold medals, which are still very hard to get.
Bronze medals and the expert tier represent effort at Kaggle, but you are right that sometimes people will find easier ways to win things. It doesn't tend to help them though, ultimately a bronze medal on Kaggle is only really valuable if you learned something in the process or can explain your project to a recruiter to get a job. Simply forking won't get you very far in the end. We would love to prevent this sort of behavior but ultimately we are making a tradeoff to support the benefits of sharing.
If you have specific suggestions for changes you'd like to see to the system we are always open to listening to feedback.
Is anyone’s notebooks being stuck on “Queued” while committing for quite some time ?
Queue times are up due to the large number of people on Kaggle for the Gen AI Course. We're increasing capacity to help keep them down.
hello guys i have a question , is it possible to import ollama to my external ssd ?? and train my model using my external ssd
anyone has recommendations of EDA handbooks? new guy needs help
Heyy guys, I'm looking for a hackathon to participate in, especially in LLMs. You know any?
@slim plaza
It's a scam message, he sent it in all channels
Which other channels? @true trench
Hello guys!! I had a question: who determines what medals we receive? The organizer or Kaggle?
I recently got a bronze medal on the long context gemini competition, I wanted to ask how I can improve this, since this competition is not ranking based
wonder if there's a long queue in the submissions...I have a sub that is still scoring after X hours although I expect it to complete in X-2 hours...and a commit that has been stuck on queue for 15 minutes
What steps would a parent need to take in order to allow a minor to participate in a contest?
my parent filled out a google form but is there anything we should do after that?
uh literally nothing different
no but like do they send you a form or smth
or can you just participate
Hello Everyone. My name is Bukenya Lukman. Reaching out to everyone who can help. I am a Ugandan student.
Looking for ways to raise tuition such that I can graduate (Ms in Software Engineering and Data Communication) this coming January , 2025.
Any support is highly appreciated. Help me Share with Friends and colleagues
I have explained my story in my GoFundMe campaign here
https://gofund.me/a598102e
Any help, even sharing will be highly appreciated
Hello everyone, I am new to Kaggle competitions. I am getting started with the Spaceship competition. Can anyone help with how to handle "nan" values in different columns? Should they be removed, or can the model be trained with them?
Hello, I am setting up a kaggle community competition, but I would like it to have two stages, one stage where participants train their models and submit their methodology and the second stage where participants submit their solutions each week to predict the outcome of games. So the submission and evaluation will be ongoing and live. Can I host such a competition? It seems the "launch checklist" has certain requirements that need to be met, which are not relevant here. Thank you in advance
Let's connect, We have some project to discuss AI related
Anyone with a Kaggle competition medal, DM me! Let’s team up and try to snag a medal together ASAP.
Guys, I am stuck with the "Regression with an Insurance Dataset" competition. It has 1,200,000 rows, so when I visualize the data I cannot understand anything, and so I cannot remove or handle the outliers. This is my first time dealing with such big data, so any suggestions?
Complex solution: each of those columns is turned into three columns, throughout. The first is set to 0 for a NaN, or to the number if one is present. The other two form a two-dimensional binary input indicating whether the value is a NaN (0,1) or an actual value (1,0). Simple solution: just set NaN to zero; it will not provide any input values to the net while calculating.
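A minimal sketch of the "complex solution" above, assuming pandas and a numeric column. The helper name `encode_with_nan_flags` and the column name `Age` are only illustrative, not something from the competition starter code.

```python
import numpy as np
import pandas as pd


def encode_with_nan_flags(df, column):
    """Replace one numeric column with three: filled value, is_value flag, is_nan flag."""
    is_nan = df[column].isna()
    out = df.drop(columns=[column]).copy()
    out[f"{column}_filled"] = df[column].fillna(0.0)   # 0 for NaN, the number otherwise
    out[f"{column}_is_value"] = (~is_nan).astype(int)  # 1 when a real value is present
    out[f"{column}_is_nan"] = is_nan.astype(int)       # 1 when the value was NaN
    return out


# Toy frame; "Age" is a hypothetical column name used for illustration.
toy = pd.DataFrame({"Age": [39.0, np.nan, 24.0]})
encoded = encode_with_nan_flags(toy, "Age")
print(encoded)
```

With this layout a missing row carries the flag pair (0, 1) and a present row carries (1, 0), so the model can tell a genuine zero from a filled-in one.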
hi, how do I proceed with the problem? Do we consider the state data of the policy and treat it like time series data, and then predict the insurance amount by creating windows? Can someone help me with this?
Well, I hesitate to ask this question, but if there is anybody who knows Kaggle well, could you tell me how I can get the prize after winning a competition?
Payoneer, or paypal, or crypto?
Sorry,
hi, so i was wondering, what does the score represent in competition leaderboards? is it the accuracy on test data? or something else? or is it up for each competition to define? (if so then what is it for the spaceship titanic competition?)
The "evaluation metric" described for each competition clarifies exactly what the score means. It's different for every competition.
what can be the possible reason for a failed submission in a competition, if saving the version gave no error?
error: Notebook Threw Exception
I really can't understand much from the explanation given in the FAQ.
It threw an exception, and you have no code to handle it
Hi @lofty bear
I am very eager to know if there will be season5 for playground series starting Jan 2025.
Can you enlighten us about it ?
Happy holidays and a very happy new year 
is that normal?
depends on the competition I guess
Bro, you first need to upload a prediction csv file (the predictions from your model). You will find the test file alongside your train data.
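A minimal sketch of writing such a prediction file with pandas. The helper `make_submission`, the ids and the predictions are made up for illustration; the column names here are placeholders, and the required ones should be taken from the competition's own sample submission file.

```python
import pandas as pd


def make_submission(test_ids, predictions, id_col, target_col, path="submission.csv"):
    """Write one prediction per test row as a two-column CSV."""
    if len(test_ids) != len(predictions):
        raise ValueError("need exactly one prediction per test row")
    submission = pd.DataFrame({id_col: list(test_ids), target_col: list(predictions)})
    submission.to_csv(path, index=False)  # index=False avoids an extra unnamed column
    return submission


# Hypothetical ids and predictions standing in for a real model's output.
sub = make_submission(["0013_01", "0018_01"], [True, False],
                      id_col="PassengerId", target_col="Transported")
print(sub)
```

In a real notebook the ids would come from the test file and the predictions from the trained model, in the same row order.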
If it is a kaggle notebook it is perfectly fine.
It can run up to 9 hours if I am not wrong
I figured out what the problem was: I forgot to turn on the GPU
Hello everyone, I am just a rookie in this field. I want to know how to prepare for this if I only have a little background. I don't have any high goals, I just want a basic understanding of it. Thank you for your suggestions and advice!
Hello, kagglers.
I have one question.
I need to make a news sentiment analysis tool.
Now, I have 2 methods, first method is to use open source models such as deberta or roberta and second method is to use HMM(I heard IBM used this method).
Which method should I select?
it depends on ur work. i recommend the first one if u have a lot of work to complete
Hi, I have an imbalanced dataset. The target column has two categories: one is 1.5% and the other is 98.5%. The dataset shape is (100000, 32), so there are 1 lakh rows. If I use SMOTE, the number of rows increases a lot. So what do you all think I should do?
Hi @solemn compass. Without SMOTE, a simple way is to duplicate the minority-class records about 65 times (98.5 / 1.5 ≈ 65.7) to balance the dataset for testing.
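A rough sketch of that duplication idea as random oversampling with pandas. The helper name `oversample_minority` and the toy columns are assumptions for illustration; it is usually applied to the training split only, so that duplicated rows do not leak into validation.

```python
import pandas as pd


def oversample_minority(df, target_col, random_state=0):
    """Randomly resample every smaller class (with replacement) up to the majority size."""
    counts = df[target_col].value_counts()
    majority_size = counts.max()
    parts = []
    for label, count in counts.items():
        rows = df[df[target_col] == label]
        if count < majority_size:
            rows = rows.sample(n=majority_size, replace=True, random_state=random_state)
        parts.append(rows)
    # Shuffle so the duplicated rows are not grouped together.
    return pd.concat(parts).sample(frac=1.0, random_state=random_state).reset_index(drop=True)


# Toy data with the same 98.5% / 1.5% split as in the question.
toy = pd.DataFrame({"feature": range(2000), "target": [1] * 30 + [0] * 1970})
balanced = oversample_minority(toy, "target")
print(balanced["target"].value_counts())
```

Many models also accept class weights, which give a similar effect without growing the dataset.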
hello. where can i find datasets that are in demand? looking to contribute
I'm a 17 year old who has been doing ML projects for about a year now, but don't know how to improve from here and get better at competitions. I am willing to pay masters, grandmasters, or anyone who can prove that they can teach me how to improve my ML skills for an hour of their time.
Please dm me if interested
Currently in school ?
Dmed
Hi, everybody.
I'm looking for an AR-related project that can detect a plane and augment objects on it.
If there is anybody who knows, please tell me.
Thanks.
Hi everyone!
I’m looking to form a team for the WSDM Cup - Multilingual Chatbot Arena Challenge and would like to collaborate with 2-3 motivated individuals. If you're interested in joining forces and taking on this exciting challenge together, feel free to reach out to me via message!😋
I am motivated but not that skillful to properly help you but I have good understanding if you guide me basic basic then I will do work for you
Well, "processing" news is trivial. So the question is what for?
hello @everyone, I have joined the AI Mathematical Olympiad and I am a little confused about the dataset and the other files showing in the sidebar. I want to know how to start with this problem and where I can find the dataset. There are two files: one is the reference, showing only 3-5 problems, and the other is the test set, showing I think approximately 10 problems, plus one PDF containing 10 problems. Can anyone guide me on how to start working on this and where I can get the dataset? Kindly help me, thanks.
Hi,
A question for "WSDM Cup - Multilingual Chatbot Arena"
Can I pretrain a model before creating the notebook to submit?
The pretraining dataset is not related to the competition
It would appear so if you make it publicly available. (from Code Requirements section)
Thanks,
"GPU Notebook <= 4.75 hours run-time during the training phase"
When using the train dataset from the competition for pretraining, does it have to be under 4.75 hours right? Or is that just the run-time of the submission
Hello! Does anyone have a recommended way to install peft? I used pip, but when importing peft I got this error: ImportError: cannot import name 'GatedRepoError' from 'huggingface_hub.errors' (/usr/local/lib/python3.10/dist-packages/huggingface_hub/errors.py)
after trial and error: huggingface_hub=0.25.0 is compatible with peft=0.14
Any Kaggle mod can help?
I'm participating in a competition that ends on January 18th. Do you know what time the competition finishes? I've looked for it on the Kaggle site but I can't find it. I live in a UTC+1 time zone.
I ran into this issue, and it turns out that I had been logging in with two different email accounts, inadvertently. You know, when asked to sign in, you are given a plethora of ways to: gmail, facebook, email, etc. Signed off and logged in with only one, and things worked smoothly.
I got this error message too when it was a problem. Click on your PFP when on Kaggle site to see if it's there.
can I join Kaggle competition via this channel?
No, this is just a place to talk about competitions. You can join competitions on the website.
so this competition gives reward?
yes, competitions have their rewards. differs from competition to competition.
cool, so what is the average reward?
Hi everyone I hope all are doing fine, I need suggestion for next simple project for beginners after the famous Titanic. can i have a list for for beginners. Thank you
for ML
The MNIST challenge is another classic. Sort of the ML "Hello World" project
Thank you yes indeed it is a good 👍 one for beginners
Hi, everybody.
I'm trying to fine tune BERT model with datasets.
If Someone who has similar project please help me.
Thanks.
Hello Everyone , I am new to kaggle competitions, i was looking at sample notebooks for https://www.kaggle.com/competitions/ai-mathematical-olympiad-progress-prize-2/overview this competition , i see people loading models from input directory , but i dont see any language models on my directory , is there anything i am missing or do i have to train my own model and upload for inference or can load from huggingface?
Hi, @lyric rune
I am also new here.
I want to work with you on kaggle competition.
If you want, plz contact me.
Hi, @hollow tundra
I am new here as well and would be interested in working with other people.
Hi, I am new here and am interested in working with others
Hi! I’m new to this field and have recently completed a course in data science, machine learning, and deep learning. I’m excited to get started and eager to join a team where I can gain real-world experience. Let’s roll!
Looking forward to joining a team if anyone is open to having me on board!
@robust inlet @pure bear
Hello
I am also new to kaggle and a beginner in machine learning, let's decide a competition and participate in one, starting with playground competition would be good
how to download whole checkpoint-9422 directory quickly?
oh, it is kaggle kernels output <user>/<version-name> -p /path/to/dest
Hi
after saving a version of my notebook, and submit it to competition, can i shutdown my system while notebook is still running for submission??
hi
can someone help me with starting a competition?
i am new to kaggle and i wanna do some projects but don't know where to start!
Hey! Do you have a team that I could join?
sure, everything that runs in background runs till it ends or you cancel it
If I join a competition as an individual and others later join my team, and there's a limit on the number of submissions per day, can each teammate submit that many times individually, or can the team submit that many times in total per day?
I don't know actually, but I assume limit submissions is per team
GPU/TPU quota per person
So it's better not to merge teams at the very start, and to merge closer to the deadline?
I guess, but there is a limit on added submissions too, so you have to think about it carefully
or go for the learning experience and just join a team when you can
the GPU/TPU quota does not change, but there is a limit on number of submissions
https://www.kaggle.com/competitions/thapar-kaggle-hack-v02/overview
i participated in this competition and the accuracy rate throughout the competition is very low like around 25.8% is the top of the leaderboard
can someone help me to figure out what is actually wrong with accuracy rates???
i myself got around 15% accuracy
No, but I'm interested in creating one!
Any new competitions coming soon?
Let's make one!
i want to join a team too
Hi, everyone! I am new here. I want to participate in March Machine Learning Mania 2025 competition. Can I know what to do in this. I am first time participating in the competition
Hey, anyone interested in march mania competition
i am
i'm interested but i'm also new
me too
yeah, me too
I'm currently creating a team of 4, you both are welcome to join. @obtuse night
Hey everyone. I'm looking for people to learn, discuss and create together!
surely the top three people in a competition having the exact same score means something is off
Me too
Even if the score is same, the one who uploaded earlier will get the upper position.
hello @everyone, my query is regarding the WSDM Cup. I need your serious help, I can't tolerate this. Whenever I try to train any LLM on Kaggle, the kernel just throws an OOM error. This is really frustrating; I see people using the same configuration and they are able to train their models.
I know someone will say you should lower the batch size, etc. But even with a reduced batch size the error still persists. Please help out, or join my sorrow if you have experienced the same
class Config:
    # Model
    model = "unsloth/gemma-2-9b-it-bnb-4bit"
    # Data params
    max_len = 2048
    per_device_train_batch_size = 2
    per_device_eval_batch_size = 4
    # Training params
    gradient_accumulation_steps = 2
    n_epochs = 1
    freeze_layers_cnt = 15
    lr = 3e-5
    warmup_steps = 20
    optimizer = "adamw_8bit"
    # LORA Params
    lora_r = 16
    lora_alpha = 32
    lora_dropout = 5e-2
    lora_bias = "none"
    seed = 42
Hello guys!
hi guys my first ever submission on kaggle, is this normal running time ?
It's your 1st submission & if this one is not a large dataset then,
that's totally not a normal running session!
It could be kaggle backend issue(happens sometimes)
Try restarting the kernel
In lux-ai S3, We can only have two submission notebooks running at any time for competing with other agents, how do i decide which two those notebooks should be, right now it disabled the old ones, and run only last 2.
these bottom 2 submissions are disabled
Hey, i need help
We have a long conveyor, about 5 km long,
with around 4 to 5k idlers on it.
We divided the conveyor into sections with imaginary lines, let's say every 20 m; each section has around 10-12 idlers of the same dimensions.
We have normal data, recorded after replacement of the idlers, and abnormal data, recorded before replacement. The dataset is in the form of real positive FFTs, i.e. each row contains a list of 5k integers.
If I train a 1D-CNN based autoencoder or VAE, it works section-wise: I can see higher reconstruction error on abnormal data. But it is impossible to create a model for every section, as it would be computationally very expensive. I want a single model that works for the entire conveyor, but when I combine all the data and train, it doesn't generalise well.
I also tried extracting statistical features like kurtosis, skewness etc. and trained a dense VAE, but no luck. What can I do?
Note: I can see abnormality in the normal data too. Even after cleaning, it becomes more sensitive to normal data as well. Tell me a better approach if you have any experience with a similar problem.
try quantizing your model, and using quantization aware fine-tuning
I have no idea what you are talking about, but if you are using series (vectors) of data, finding the average of the series for each dimension, and then measuring the distance from it for each point, is a way to normalize that doesn't involve ad hoc MinMax scaling, and it deals with outliers in a way that doesn't skew everything
let's say we have an idler rotating and we record its vibration in normal and abnormal condition. If I train an autoencoder on normal data and test it on abnormal data, it shows the anomaly, but when I train on the normal data of all idlers, it is no longer able to detect anomalies in the abnormal data
Are you using autoencoding to reduce the amounts of dimensions you are processing? Otherwise, not of much use.
I tried AE and VAE using 1D-CNN on the data, also tried a supervised way, labeling 1 for normal and 0 for abnormal, tried pretrained computer vision models by converting the lists into images, mapping each frequency amplitude to a different color, and tried a regression approach by assigning one target variable and predicting the error. Not much luck, it gets worse as we combine multiple idlers, but the thing is we want a single model to work all over
Okay. Imagine I'm trying to vectorize Spatial or Temporal Data. I average all values, and then for each unique one, I calculate the distance from that (collective) average, for each. Auto scales and auto normalizes (inherent in the Mathematics of it). Make sense?
ok thanks it should work, i will try, tell me other approaches
This will guarantee that your data will scale between 0 and 1, or -1 and 1. I take it you are working with Artificial Neural Nets.
sure thanks bro, instead of scaling, what if I divide it by the max amplitude? It would preserve the FFT structure and the values relative to other idlers. I tried scaling, but it scales with a varying range, and when I inverse scaled it didn't get back to the same values
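A small sketch of the divide-by-max-amplitude idea, assuming NumPy and non-negative FFT magnitudes with one spectrum per row. The function names are hypothetical. Keeping the per-row peaks is one way to make the scaling exactly reversible.

```python
import numpy as np


def normalize_by_max_amplitude(spectra, eps=1e-12):
    """Divide each FFT magnitude row by its own peak so every row tops out at 1."""
    spectra = np.asarray(spectra, dtype=float)
    peaks = spectra.max(axis=1, keepdims=True)
    # eps guards against division by zero for an all-zero row.
    return spectra / np.maximum(peaks, eps), peaks


def restore_amplitude(normalized, peaks):
    """Undo the scaling using the stored per-row peaks."""
    return normalized * peaks


# Two toy spectra with different overall amplitudes.
spectra = np.array([[4.0, 2.0, 1.0], [10.0, 5.0, 20.0]])
normalized, peaks = normalize_by_max_amplitude(spectra)
print(normalized)
```

Because each row is only multiplied by a constant, the shape of the spectrum (smooth decay versus a peak in between) is unchanged.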
Are you using LSTM layers in your processing NN?
nope we cant use lstm or temporal based layers as we are mixing all idlers ffts data it will destroy temporal order
But LSTM are designed to excel at spatiotemporal processing, such as say FFT bin sequences
Its kind of their Raison D'etre
It would help if you clearly stated your problem state, desired solution, and approach to it
I know that if we go for LSTM it will be better for a single idler, but it becomes a challenge and computationally expensive for a large number of idlers. Currently we are detecting anomalies based on the FFT structure: normal FFTs have a smooth decay, while abnormal ones have a peak in between or a skewed decay
I tried a dense VAE too, extracting statistical features, but no luck, as we only have the real positive part of the FFT, DC, and peak-to-peak values
Again, rather than dealing with absolute values, use my method and values become relative to a norm. You can even change the size of windows (bins) to establish that norm.
It doesn't introduce artifacts (autoencoders do), nor additional computations (except for processing the data)
Outliers will tend towards 1, or if directionality is taken into account, -1 too. Then you can do whatever you want with data
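One possible reading of this method as a NumPy sketch. The function name is hypothetical, and the final division by the largest absolute deviation per dimension is an added assumption: it is what bounds the output to [-1, 1] here, since raw deviations from the mean are not bounded on their own.

```python
import numpy as np


def deviation_from_mean(series_batch, eps=1e-12):
    """Express each point as its signed distance from the per-dimension average."""
    series_batch = np.asarray(series_batch, dtype=float)
    mean_profile = series_batch.mean(axis=0, keepdims=True)  # average per dimension (bin)
    deviation = series_batch - mean_profile
    # Assumption: scale by the largest absolute deviation so values land in [-1, 1].
    scale = np.abs(deviation).max(axis=0, keepdims=True)
    return deviation / np.maximum(scale, eps)


# Three toy series with two dimensions each; the last row is an outlier in dimension 0.
batch = np.array([[1.0, 10.0], [2.0, 10.0], [9.0, 10.0]])
scaled = deviation_from_mean(batch)
print(scaled)
```

In this toy batch the outlier maps to 1 in the first dimension, while the constant second dimension maps to 0 everywhere.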
ok lets do it, thanks for your valuable inputs
Here, a sample application w/ a trading bot I've been developing: metrics['open_high_dist'] = (ohlcv_data['High'] - ohlcv_data['Open']) / (ohlcv_data['High'])
i have developed so many, im getting 100% rate but number of trades are rare
in some im getting 66% rate and high profit too
this is the sample
I'm shooting for a system that's mid point, of predicted highs and lows, always falls within band of target high and low candlestick band. And then figuring how low I can go (buying) from predicted midpoint, and how high (selling). A system that, while not making maximum profit, always makes one 😁
A bullish bear 😂
u can go for anomaly based feature incorporate them into lstm meta model time series forecasting, it gives nice forecast but sometimes wrong too
also u can pretrain selfsupervised model on lots of stock if u have gpu memory like 80gb
I created a smaller one; sometimes it's consistent and sometimes it isn't, so I made losses too
What time scales are you using for entering and exiting positions? Over a day, a week, months? It matters
i created pretrained for daily frame and use forecasting models 5, 15, 30, hourly for confirmation direction. and i combine it with indicator that i have created
getting close to training the system this week. my categories
See this one: RSI and RSI anomaly. When the anomaly score is high and RSI is low, it's a buy signal
When RSI is high and the anomaly score is high, it's a buy-put signal, like that
damn crazy
The plan is to train multiple NNets and then, Ensemble Method style, combine their outputs when predicting for an asset. At least, that's the gist.
do you know any server where people actively discuss stock related stuffs using ml/dl ?
Not really. And my approach, not at all.
i want to develop something for options , i tried using forecast but thing is it is not consistent
Are you using monolithic Neural Nets. A single net. With one big massive input layer? Hoping something helpful comes out?
i engineered features using vae, saved in csv and used that for time series model like i use lstm variants and created meta model out of it
also for cross verification i created 50 small models which will follow iid condition of clt, using variable features in each models, to get forecasting closeer to expected true value
Okay, let's imagine you have 13 weeks of data. You could have multiple input layers. One processes 2 weeks' worth of data, one 3, then 5, then 8, then 13. Separately at first. Then those "channels", you concatenate them and process that. Each would learn about different influences. Short, medium and long term
yes thats what i incorporated in clt
but not consistent
different windows
But the windows are all part of the same Net. Just at the beginning of the pipeline, they are first processed in parallel, not together. Are you doing this? In Keras, you'd make use of a Functional Model
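A sketch of only the data side of this idea, assuming NumPy: one 13-week history is cut into the nested lookback windows that would each feed its own input branch. The network itself is not shown. The helper name and the 5 steps per week (trading days) are assumptions for illustration.

```python
import numpy as np

WINDOW_WEEKS = (2, 3, 5, 8, 13)  # window lengths named in the message above


def split_into_windows(history, window_weeks=WINDOW_WEEKS, steps_per_week=5):
    """Return the most recent slice of `history` for each window length."""
    history = np.asarray(history, dtype=float)
    needed = max(window_weeks) * steps_per_week
    if len(history) < needed:
        raise ValueError(f"need at least {needed} steps, got {len(history)}")
    return {weeks: history[-weeks * steps_per_week:] for weeks in window_weeks}


# 13 weeks of synthetic daily values standing in for real prices.
history = np.arange(65, dtype=float)
branches = split_into_windows(history)
for weeks, window in branches.items():
    print(weeks, window.shape)
```

Each array in `branches` would go to a separate input of a multi-input model, with the branch outputs concatenated before the shared layers.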
so it is different than lookback window right, interesting
my forecasting model would do better if I got order book data, but it is impossible to get and also costly
yfinance API perhaps?
it provides only historical data not order book data
you can get hourly data for past 730 days on assets. better than nothing.
im getting in all frames of data from my broker
thing is i need order book data which is costly like millions
I'm trying to build a predictive tool, that each week sets up buy-sell pairs, before the week has started
great, you have nice approach
anyone here participating in CIBMTR competition, need to ask a query about, prediction submission.
I have trained model to predict 'efs' and 'efs_time', how do i calculate risk scores for prediction col.
I tried simplest way like using classification, although it gives lb score of 0.64 something
It provides historical data
in this competition
https://www.kaggle.com/competitions/aptos2019-blindness-detection/overview
I made a notebook and tried to submit it, but it tells me "Notebook out of memory"
I was wondering how the notebook can be out of memory when I already saved a version of the notebook successfully without any problems; the problem only appears in submissions to the competition
The notebook version ran and saved successfully, so I need some help with this problem
maybe the hidden test data is big enough that it causes your GPU to run out of memory.
which is not the case with the public test dataset, that's always small
What do you advise me to do?
preliminary checks: free old stuff from GPU memory when it's not needed anymore, and use T4x2 for more available VRAM.
lastly, check at which step your VRAM is blowing up. Is it preprocessing or inference? Just divide that process into multiple steps. It will increase notebook time, but will get the job done.
PS: make sure you are using both T4 GPUs
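A minimal sketch of "divide this process into multiple steps" for inference, assuming NumPy and any callable that maps a batch to per-row predictions. `predict_in_chunks` and `toy_model` are placeholders, not part of any competition code.

```python
import numpy as np


def predict_in_chunks(predict_fn, inputs, chunk_size=256):
    """Run predict_fn on slices of inputs and stitch the results together."""
    outputs = []
    for start in range(0, len(inputs), chunk_size):
        chunk = inputs[start:start + chunk_size]  # only this slice is processed at once
        outputs.append(np.asarray(predict_fn(chunk)))
    return np.concatenate(outputs, axis=0)


def toy_model(batch):
    """Stand-in for a real model: returns one number per row."""
    return batch.sum(axis=1)


data = np.ones((1000, 8))
preds = predict_in_chunks(toy_model, data, chunk_size=300)
print(preds.shape)
```

A smaller `chunk_size` lowers peak memory at the cost of a longer run, which matters when the hidden test set is much larger than the public one.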
what is the submission file size limit in competitions?
is this the right channel to ask questions/help for my competitions?
Any latest updates?
updates on what? 🙂
updates on u guys hopping to rank 1 in aimo public lb he meant
Quick question on notebook internet requirements: if I have to install something not present in the stock environment (eg some more torch modules), how will that qualify for no internet competitions?
I think, no internet connection refers to a running program. I think when running it, you can install modules from multiple sources. For instance, running a routine that makes a pip install call. I think no calls to Internet mean none after setup and program running. Checking with chatGPT now:
Yeah, you’re mostly on point.
Here’s how it works on Kaggle competitions:
✅ “No internet connections” generally means:
• Once your code is running in the Kaggle environment (during training or inference), it cannot access the internet—no external API calls, no downloading files from the web, etc.
• Security sandbox – They want to ensure your solution is self-contained and reproducible.
✅ BUT Kaggle does allow certain things beforehand:
• You can install Python libraries via !pip install or !apt-get install in Kaggle Notebook cells only while the notebook has internet access enabled. In a submission run with internet disabled, those calls fail unless the packages are available locally.
• You can upload custom .whl, .tar.gz, or .zip packages as Kaggle “Dataset” files and load them locally.
So, if Kaggle doesn’t have a package pre-installed:
1. Try doing !pip install package_name in a notebook cell. This only works while internet is enabled for the notebook; with internet disabled, pip cannot reach PyPI.
2. If it’s a custom/private or less common package, download the library manually on your machine, zip it up, and upload it to Kaggle as a dataset. Then, you can do sys.path.append("/kaggle/input/your-uploaded-library-folder") and import it directly.
💡 TL;DR:
• Kaggle blocks network access during runtime, but you can still prep your environment with installs and uploads before execution.
Thanks for the help! I ended up uploading the wheel files as a dataset and all is well now, submission worked fine
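For anyone landing here later, the wheel-files-as-a-dataset approach can be sketched roughly as below. This is an illustration only: the dataset path and the package name are hypothetical, and the helper merely builds a pip command that uses the standard `--no-index` and `--find-links` options so pip looks in a local folder instead of PyPI.

```python
import sys


def offline_pip_command(package, wheel_dir):
    """Build a pip command that installs `package` from local wheels only."""
    return [
        sys.executable, "-m", "pip", "install",
        "--no-index",                   # never contact PyPI
        f"--find-links={wheel_dir}",    # look for wheels in this folder instead
        package,
    ]


# Hypothetical dataset path and package name, for illustration only.
command = offline_pip_command("nibabel", "/kaggle/input/my-wheels")
print(" ".join(command))
```

In a notebook the command could then be executed with `subprocess.run(command, check=True)`; the wheels for the package and all of its dependencies need to be present in the uploaded folder.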
One more question, when making a submission is run time deducted from my total allocated GPU time?
no
does anyone have space to add me to their team, i am kinda new to this, and i dont understand what to do
https://www.kaggle.com/learn/intro-to-machine-learning
This is kinda my saving grace since I just started recently too.
Learn the core ideas in machine learning, and build your first models.
yes i took this and the intermediate course already
Competition specific question already asked in #stanford-rna-3d-folding but I'm asking here as well since I think this can apply to all competitions:
Can I use a separate notebook to build my dataset to avoid reducing my available training time? Given I make the result public
Yes, you can, unless the rules or a discussion post restrict it.
Can anyone please help me make a model that is over 90% accurate with this data? I am having no more than 80% accuracy. Is there anyone who can help me?
what this data about? mnist or something. what architecture of models are you trying?
It's probably overfitting; try running fewer epochs.
FashionMNIST
thanks
With this classification data, my model is performing terribly: I only scored 0.0932 in the Kaggle competition. RandomForestClassifier gives me the highest score, but I am struggling to get a better score like those at the top of the leaderboard.
Can anyone help me to do well?
is this the official BDAIO?
They are at kaggle 💀
No other national AIO is even doing practice contests/comps on Kaggle. BD is great.
What imbalanced class methods do you guys prefer for maximizing AUC? I have tried SMOTE and NearMiss, and also used scale_pos_weight in Xgboost
Typically we rebalance only the training set (by resampling or class weights) and leave the validation/test distribution untouched, so that the AUC we measure reflects realistic data.
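On the scale_pos_weight option mentioned above: the XGBoost documentation suggests the ratio of negative to positive training examples as a typical starting value. A minimal sketch, assuming binary labels encoded as 0 and 1 (the helper name is made up for illustration):

```python
def suggested_scale_pos_weight(labels):
    """Return count(negative) / count(positive) for 0/1 labels."""
    positives = sum(1 for label in labels if label == 1)
    negatives = sum(1 for label in labels if label == 0)
    if positives == 0:
        raise ValueError("no positive examples in the training labels")
    return negatives / positives


# 90 negatives and 10 positives give a weight of 9.0.
train_labels = [0] * 90 + [1] * 10
print(suggested_scale_pos_weight(train_labels))
```

The value should be computed on the training split only and then treated as a starting point to tune against validation AUC.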
Hopefully, ARC-AGI competition comes back soon. They said it would launch Q1 2025, but that's about to end in 5 days. 😅
All the participants (individuals n organisations) r grinding like hell for it 💀
From a long time 🤧
:)
Its launched
Please, I was recently working on a 'fraud detection' dataset and I needed to fill in the missing values for a qualitative variable: my 'Devise_used' column had 50100 rows, but the number of anomalous values is 2498. I don't know whether I should fill them in with the mode or just leave the anomalous values as meaningful values, since I think it must be a very important column.
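The two options raised in this question can be compared with a small sketch. It assumes the problematic entries are represented as `None`; the helper names and the "Unknown" label are illustrative, not part of the dataset.

```python
from collections import Counter


def fill_with_mode(values):
    """Replace None entries with the most frequent observed category."""
    observed = [value for value in values if value is not None]
    if not observed:
        return list(values)
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if value is None else value for value in values]


def fill_with_label(values, label="Unknown"):
    """Keep missingness as its own category instead of guessing a value."""
    return [label if value is None else value for value in values]


devices = ["mobile", None, "mobile", "desktop"]
print(fill_with_mode(devices))
print(fill_with_label(devices))
```

Keeping a separate category preserves the information that a value was missing, which can itself be predictive in fraud data; mode filling discards it. Comparing both on a validation split is the safest way to decide.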
@flint wyvern
Regarding the topic you posted a few weeks ago. Email competitions-spotlight@google.com looks dead . Possible typo
Maybe competitions-spotlight@kaggle.com ?
Edit: it was my mistake
Did the official imagenet comp get discontinued after 2017 ?
hello
anyone want to do Image Matching 2025?
hi
hi
hi
Yeah
Hey, anyone interested in ML competition
Today I didn't get any email for the day 5 tasks and assignments.
I'm interested
i am interested
I'm interested
I'm interested
And I haven't gotten the email detailing the capstone project! There's a problem happening at the end of the event. 🤬
@ Divine Tree I haven't gotten any email either.
🚨 Final Call – Capstone Team Form 🚨
Hey everyone!
This is the last reminder to fill out the form if you're interested in joining my Capstone project team focused on AI/ML-based real-world solutions.
🗓️ Deadline: April 9, 2025 - 11:59PM (EST)
📄 Form Link: https://docs.google.com/forms/d/1UKhtfdgH8Z1CkGXXRIsQ2uaDYC5WBJiXhshn6fj3U4g
Looking forward to connecting with like-minded folks!
📧 Reach out if you have questions: snkp.globalcollege@gmail.com
Hi! I'm Shashank Pandey, forming a team of 3–4 members for our Capstone project focused on AI/ML-based end-to-end solutions in domains like finance, healthcare, and supply chain.
If you’re interested in collaborating, please fill out this quick form to introduce yourself.
⏳ Deadline: April 9th 2025
📬 I’ll reach out to selected individ...
Hello, I have uploaded this notebook to the competition: 'Stock_Research_With_LangChain_Gemini'. Could you confirm you got it? It's my first competition and I just clicked on 'Submit to competition'.
🚨 URGENT HELP NEEDED: Rossmann Store Sales Kaggle Final Project
Hey everyone 👋 I'm a student working on my final project using the Rossmann Store Sales dataset. I need a Google Colab notebook that gets me into the top 5% on the private leaderboard.
🎯 Target: Private Score < 0.11700
💻 Must be:
- Fully working Colab notebook (clean & optimized)
- Feature engineered with lags, rolling means, holidays, etc.
- Optuna hyperparameter tuning
- Final model: LightGBM or ensemble (LGBM + XGB + CatBoost)
- No bugs / ready to submit
🕒 Deadline: Today
💸 Willing to pay for the right notebook
Please DM me if you have a strong solution or are available to help 🙏
Hey guys,
My teammate isn't able to join the Kaggle competition; it says what is shown in the image. Can someone please help? He did register and attend the webinars, and he can access the assignment notebooks, yet he isn't able to join the competition. Can someone advise on what he can do? Thanks.
Completed and submitted my project - AI Meta Data Generator for images
Here is the step by step guide and blog of my project
Video - https://www.youtube.com/watch?v=iReYY5vYzqQ
Blog - https://yaitoolbox.com/ai-driven-art-metadata-generation-build-your-free-model-on-kaggle/
AI-Driven Art Metadata Generation: Build Your Free Model On Kaggle
Meta Data Generator Notebook : https://www.kaggle.com/code/veeramallavijaigopal/ai-metadata-generator-for-images
To Explore Kaggle - https://kaggle.com
Google AI Studio - https://aistudio.google.com/
Step by Step Guide - https://yaitoolbox.com/ai-driven-art-metadata-generation-b...
AI-Driven Art Metadata Generation: Build Your Free Model on Kaggle. Step-by-step guide to creating your model with Gemini AI for art sales
Can I submit the Colab notebook link rather than a Kaggle notebook?
Revolutionizing Learning with AI: YouTube Video Analyzer How Gemini-Powered Concept Maps & Quizzes Transform Video Learning Introduction In...
Hey, did you use Napkin AI in the blog?
That code block decorator style, how did you do that? I couldn't find any such option.
Insert the image in Napkin and select the image to get the decoration options.
I see, too tedious.
Today generative ai capstone projects will be announced by kaggle officials @ 9:30 pm IST
Good luck every one
If I want to use C/C++ in a competition, should I build the C/C++ code in the notebook, or is using pre-built binaries possible?
Hello there, I am new to competitions. Is it all right if I discuss ways to solve a particular competition, or is it against the rules?
Currently I am working on LLM Classification Finetuning and I have some doubts, as I am a beginner.
Can I ask those here?
It's fair to discuss, but it's better to discuss and ask in the respective competition's channel.
Okieee
Try posing that question to chatGPT. That's what LLMs are for.
Hello people !
I have a general question: do you usually use Kaggle notebooks, or do you prefer Google Colab or some other platform?
Any tips for organizing notebooks when starting a competition?
I'm a beginner and I find it hard to resume my work whenever I pause it (I have to re-run all the cells, etc.).
Thanks 
Should I encode categorical features of my dataset into numerical features and then standard scale those too or should I leave the categorical features out?
Is your notebook too big? Mine runs pretty fast, even though I'm not really that deep into building my model yet.
Its my first competition as well
It depends on your dataset: if the categorical features affect the dependent variable, then yes, you need to encode them.
Would you mind telling me what a dependent variable is?
Something that you're predicting from all your features.
In an experiment, it is the variable that is measured as the outcome. Independent variables are changed for each instance of an experiment. It's a term from scientific research methodology. The goal is to see the effect of independent-variable variations on the measured outcome (the dependent variable).
Categorical data is discrete and doesn't follow the same mathematical laws as continuous data. Categorical: apples or oranges. Continuous: how sweet is this piece of fruit, from 1-10. So you want binary vectors for categorical values and numerical values for continuous ones.
It depends on whether these features have an effect on the target and whether they are correlated with other features. There is also a judgment side: you should have a solid understanding of the topic and of the data, and based on that you decide whether these features are important and need an encoder, or are unimportant and can be dropped.
Hi everyone! I'm new to Kaggle and excited to start participating in competitions. Could anyone share what skills are most important for getting started and doing well? I'm learning Python, data analysis, and basic ML. Any advice or beginner-friendly tips would really help thanks
These might help jumpstart your journey. The first is full of diagrams, visually illustrating ML concepts (what's really going on) ; the second chockfull of techniques to get you exploring creating custom algorithms (and brief to boot)
Thank you
They're good for "learning to swim"
So they are good for starting not going in depth right?
Deep learning will give you a very visual book into understanding what the algorithms you use, actually do. Hands on is like a journal , trial and error based experience on how to structure and apply the vast range of techniques out there.
Both books will allow you to get deep, w/o needing a deep understanding b4 hand, if that is your question
So to answer your question, yes
thanks
i also wanted to ask about the difference between tensorflow and pytorch,
I prefer tensorflow, but you can do the same with both
is there a big difference or just the difference in deployment?
Syntax is different, but you can build , say a NN in both
Neither one is better when it comes to ANNs.
maybe this will help
so @calm oak , how did you get into AI/ML
ok so there is no big difference
thanks for your help
Yeah, I think PyTorch is still used by people who learned it and don't want to relearn how to make a wheel. Oddly, it's popular with people developing Stable Diffusion algorithms (text-to-image stuff).
i am in digital egypt cups initiative, i learnt data science concepts and libraries like data wrangling and these stuff and then for more knowledge i am exploring the AI/ML right now
right now i am learning pytorch
Keeping it a stack, in AI, data wrangling and preprocessing is 75% of the work 😅
the rest is easy
Master it. If you need to switch to TensorFlow, you'll still understand the fundamental concepts. Code just will be a bit different
As in, you could code in C++ or Python, but the principles remain the same.
ok ,thank you
learning how to code 🤣
true 😂
I tried OFC
is there any famous competition about video ai in recent years?
Me as a tester for code
General ML question: If i have 2 columns of categorical data, would it make sense to one hot encode both of those columns? I'm asking cus now there will be a bunch of columns with 0s and 1s for the features of both columns and im wondering if that would be enough to differentiate them
Yes, this is relatively common practice. Whether or not it's the best approach for your problem depends on many factors.
You can also use -1 and 1, instead of 0 and 1. Or for each column, use two values, ie; 1 and 0 , or 0 and 1, respectively, to designate True or False for the individual columns. They are all strategies for representing discrete / binary inputs, in a manner of speaking. Experiment and see what gets you a better result. I'm assuming you are passing them to an ANN.
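To make the two-column case concrete, here is a small plain-Python sketch of one-hot encoding. The column names and values are invented for illustration; in practice `pandas.get_dummies` or scikit-learn's `OneHotEncoder` would normally do this. Prefixing each new column with its source column is what keeps the two features distinguishable, even when they share a value.

```python
def one_hot_encode(rows, columns):
    """One-hot encode the given categorical columns of a list of dict rows."""
    # Collect the sorted set of categories seen in each column.
    categories = {col: sorted({row[col] for row in rows}) for col in columns}
    encoded = []
    for row in rows:
        out = {}
        for col in columns:
            for category in categories[col]:
                # The "<column>_<category>" name keeps the columns apart.
                out[f"{col}_{category}"] = 1 if row[col] == category else 0
        encoded.append(out)
    return encoded


rows = [
    {"fruit": "apple", "color": "red"},
    {"fruit": "orange", "color": "orange"},
]
for encoded_row in one_hot_encode(rows, ["fruit", "color"]):
    print(encoded_row)
```

Here "orange" appears in both columns but ends up as `fruit_orange` and `color_orange`, so the model sees them as separate inputs.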
Is anyone doing the Stanford RNA folding comp this year?
I’d like to!
Hello. I have to finish a project based on a Kaggle competition from 2019. The goal of the project is to classify 5 stages of an eye disease. I already have working code that scores 0.88 QWK. Now I have to use different color spaces and convert the dataset images from traditional 3-channel RGB to a 6-channel color space. I have tried this and got a result of 0.59 QWK, so I think I am doing something wrong. The goal is to reach 0.90+ QWK. Is anyone interested in helping me with this task?
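The message does not say which color spaces are being combined, so the following is only a sketch under the assumption that the six channels are RGB stacked with HSV. It converts a single 8-bit pixel using the standard-library `colorsys` module and keeps all six values on the same 0 to 1 scale; mismatched scaling between the two halves is one thing worth ruling out when a 6-channel input underperforms.

```python
import colorsys


def rgb_to_six_channels(r, g, b):
    """Map an 8-bit RGB pixel to (R, G, B, H, S, V), each scaled to [0, 1]."""
    rf, gf, bf = r / 255.0, g / 255.0, b / 255.0
    h, s, v = colorsys.rgb_to_hsv(rf, gf, bf)
    return (rf, gf, bf, h, s, v)


# Pure red: full R, hue 0, full saturation and value.
print(rgb_to_six_channels(255, 0, 0))
```

For whole images an array library would be used instead of a per-pixel loop; the point here is only the channel layout and the consistent scaling.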
hello. I am trying to take part in OpenAI to Z Challenge. I am unable to verify my kaggle account using persona. It failed and now says to contact support but i haven't received a response from them. Any help?? Also I don't see an option to retry either.
Hi Everyone
I’m putting together a high-impact AI project — AgriIntel — a Smart Farming Assistant built to solve real problems faced by small-scale farmers like my own family.
It’s not just a demo or a college project. AgriIntel is being developed as a serious portfolio product to help all team members land strong remote roles in ML/Data Science by showcasing real-world impact, engineering ability, and product thinking.
We're building:
🌾 Crop Recommendation Engine (Soil, pH, Rainfall, etc.)
🧠 Crop Price Forecasting using Time Series (Prophet/LSTM)
🌿 Leaf Disease Detection using Computer Vision (YOLOv5 or MobileNet)
🗣️ Hindi Voice Assistant (Whisper + gTTS)
📊 Insightful Dashboard (Streamlit/React)
This is your chance to:
✅ Build a project recruiters will ask about
✅ Add something unique to your GitHub & resume
✅ Work in a sprint-style, outcome-driven team
✅ Contribute to something that creates real-world impact
🔗 Full project pitch & roadmap:
https://docs.google.com/document/d/1bY6L_xb5gjAUrqiewkz6BcTMldzJd-kZvS6dn1lNazY/edit?usp=sharing
If you're serious about standing out and want to be part of something meaningful, drop me a message. Let’s build AgriIntel together 🌱
Best,
Dinesh Kumar
🔗 LinkedIn: https://www.linkedin.com/in/dinesh-kumar-775575222/
This is js open source u guys showcasing it like smth else ...
seems like this would have been more appropriate posted under #💬┊general as opposed to #🏆┊competition-general because it isn't specific to actual Kaggle competitions
They r trying to give competition to everyone
@bitter lintel hey, super sorry for the tag, it's actually my first competition. I'm really struggling to make a submission because it keeps showing me an error. I believe my submission.csv file looks exactly the same as the sample submission. The values in there are not good at all, but I don't think there's any actual error in there. Would you mind helping me out a bit to get a submission score of some kind?
Hard to tell. There could be a slight error in your submitted file itself, related to formatting. A missing comma, colon, bracket, or key. Make sure for one particular answer that makes part of your submission, it is well formatted. Also between the answers. Check the csv file you are generating to verify. BTW, you may want to code a python class that generates the csv itself from your predicted data, consistently, and ensures that all answers are always correctly formatted, individually and across each other. Basically, it can create the sample submission to a tee, if required to.
Or maybe you just aren't submitting correctly (not the submission itself, but the code that submits the submission). Check that out too. Search "Code" for the competition to find examples of this (usually you can find a few examples; use the search tool if necessary). Anyway, that's the best advice I can give. And you needn't apologize for asking for help. That's the whole point, amongst others, of this server. Good luck.
Thanks a lot! So my error looks like this, submission scoring error means that there's a format issue with my submitted csv file right?
and not in the code itself
I was giving you two things to check for, but yes, it's probably not the code itself. If you want to verify this, if you can run the code itself in a notebook, WITHOUT the part that handles submitting it, you can assure yourself that this is NOT the problem. However it may be formatting, and a few others. Here's a list of things to check up on:
1. Wrong file format
You submitted a file with the wrong format (e.g., wrong extension or structure).
Fix: Match the sample submission exactly, including file extension.
2. Invalid or missing columns
Your submission is missing required columns or has incorrect column names.
Fix: Use the exact column names provided in the sample submission file.
3. Wrong number of rows
Your submission has more or fewer rows than expected.
Fix: Make sure there’s exactly one row per required ID.
4. Non-numeric values where numbers are expected
You have invalid values like “NaN”, “undefined”, or text in numeric columns.
Fix: Ensure all prediction values are valid numbers.
5. Duplicate IDs or malformed rows
Your file contains duplicate or missing IDs, or rows that aren’t formatted properly.
Fix: Ensure all IDs are present, unique, and ordered as required.
6. Wrong delimiter or encoding
Your file uses the wrong delimiter (e.g., semicolon instead of comma) or incorrect encoding.
Fix: Save your file as UTF-8 encoded CSV using commas as separators.
7. For code competitions: file too large or timeouts
Your notebook exceeds allowed time or memory limits.
Fix: Optimize your code and reduce resource usage.
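Several items on this checklist can be checked automatically against the sample submission. The sketch below is a rough illustration, assuming a comma-separated UTF-8 file whose first column is the ID and whose remaining columns are numeric; the function names are made up, and reported line numbers assume the file has no blank lines.

```python
import csv
import math


def read_rows(path):
    """Return (header, data_rows) from a comma-separated UTF-8 file."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        rows = [row for row in reader if row]  # skip blank lines
    return header, rows


def check_submission(submission_path, sample_path):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    sub_header, sub_rows = read_rows(submission_path)
    sample_header, sample_rows = read_rows(sample_path)

    if sub_header != sample_header:
        problems.append(f"header {sub_header} differs from sample {sample_header}")
    if len(sub_rows) != len(sample_rows):
        problems.append(f"row count {len(sub_rows)} differs from sample {len(sample_rows)}")

    sub_ids = [row[0] for row in sub_rows]
    if len(set(sub_ids)) != len(sub_ids):
        problems.append("duplicate IDs found")
    if set(sub_ids) != {row[0] for row in sample_rows}:
        problems.append("ID set differs from sample")

    for line_no, row in enumerate(sub_rows, start=2):
        if len(row) != len(sample_header):
            problems.append(f"line {line_no}: expected {len(sample_header)} fields, got {len(row)}")
        for value in row[1:]:
            try:
                number = float(value)
            except ValueError:
                problems.append(f"line {line_no}: non-numeric value {value!r}")
                continue
            if not math.isfinite(number):
                problems.append(f"line {line_no}: non-finite value {value!r}")
    return problems
```

Calling `check_submission("submission.csv", "sample_submission.csv")` (file names assumed) before uploading will not catch every scoring error, but it covers items 2 to 5 of the list.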
Wait, wdym by without the part of the code that handles submitting it?
i believe i ensured all of these tbh, im still not sure what's going wrong... ://
I also put my submission file and and sample submission file on llms to check if there are any minor differences in anything, couldn't find any
Okay I think I found it, the data on my produced csv are all float 32 type of data whereas on sample submission its float 64 type of data. That could be the problem right? @bitter lintel
Nvm that didn't work
They make lots of mistakes, all the time. Remember, they give a simulacrum of intelligence, but have NONE (no understanding of what comes in, or goes out). They are probabilistic, pattern-matching parrots, of sorts. Also, I gave you a checklist of things to look for. Unless you asked your LLM to check for these explicitly, it won't necessarily check for these.
I don't know what to tell you. Try making a submission with dummy values. Create and use a python class that uses random values for answers, and creates your csv. You can then use it to generate a submission that includes your predicted values. As i said b4.
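The dummy-submission idea can be sketched as follows. It copies the header and IDs from the sample submission and fills every other column with a random number, so a scoring error on this file would point at the format or the submission process rather than the model. It assumes the first column is the ID and that values between 0 and 1 are acceptable; the function name and file names are illustrative.

```python
import csv
import random


def write_dummy_submission(sample_path, output_path, seed=0):
    """Write a submission with the sample's header and IDs but random values."""
    rng = random.Random(seed)
    written = 0
    with open(sample_path, newline="", encoding="utf-8") as src, \
            open(output_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        writer.writerow(next(reader))  # copy the header exactly
        for row in reader:
            if not row:
                continue  # ignore blank lines in the sample
            # Keep the ID column, replace every prediction column.
            writer.writerow([row[0]] + [f"{rng.random():.6f}" for _ in row[1:]])
            written += 1
    return written
```

Once the dummy file scores without an error, swapping the random values for real predictions while keeping the same writer isolates any remaining problem to the predictions themselves.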
Yeah, I know that about LLMs; it was more of a desperate check, tbh, since I haven't been able to find any dissimilarities after checking things on my own...
Ok that's a good idea actually, let me try that
It's just frustrating cus i've made 7 attempts so far, it keeps showing this error
Sorry for pings again @bitter lintel are negative values in my predictions a problem? Since I think there are logs involved in the scoring criteria. If it somewhere takes in a negative log that could result in an error. Do you think that's possible? What to do in that case
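Whether negatives break scoring depends on the competition's metric, which is not stated here. As an illustration only, assuming a log-based metric such as RMSLE, the sketch below shows why a negative prediction can fail (the logarithm is undefined at or below -1) and one common workaround, clipping predictions at zero before scoring or submitting.

```python
import math


def rmsle(y_true, y_pred):
    """Root mean squared log error, with predictions clipped at zero."""
    total = 0.0
    for truth, prediction in zip(y_true, y_pred):
        clipped = max(prediction, 0.0)  # avoid log1p of a negative number
        total += (math.log1p(clipped) - math.log1p(truth)) ** 2
    return math.sqrt(total / len(y_true))


# Without clipping, math.log1p(-5.0) raises ValueError (math domain error).
print(rmsle([0.0, 3.0], [-5.0, 3.0]))
```

If the target can never be negative, clipping the predictions before writing the CSV is a cheap thing to try.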
@gloomy zodiac please check DM
i am working on a major project..
i need your suggestion for progressing further..
it will hardly take 10 seconds..
NutriAI — Your FREE AI-Powered Nutrition & Calorie Tracker 🍎
Are you someone who wants to eat healthier, lose/gain weight, or just understand what’s on your plate?
We’re building NutriAI, a completely FREE and intelligent calorie tracking app designed for everyone—from fitness beginners to health enthusiasts.
What makes NutriAI speci...
Nice form
Opensource project ?
Sorry for a negative response in form 💀
It won't work, guys; you lost when you mentioned "free". Also, Cal AI already exists, and there's no real competition to it even though it's premium.
bro..
i value your response..
but there is much scope..
Bro 💀
specially all those apps are.. much for western food..
I should ve stayed anonymous 😅
no worries
and when your plate is loaded with 7 different items...
calculating its calories based on volume and ingredients..
bro..
do you really think there is no scope..
There is scope tho depends on ..
Do u mean it open source or closed source ?
Btw they might be for Western but they r valued n their MRR is in millions just in few months
-# so there's definitely scope they proving (maybe time for Asian food now)...
yhh
I was trying to find the guy's traces in GitHub couldn't find anything related to him on GitHub 💀
someone vanished him..
Anyways u don't 've any repo yet for it ?
Hello, I am new to Data Science and I want to learn it. Could you please help me with that?
Work through the free lessons here: https://www.kaggle.com/learn. It's comprehensive. Also, these books are great to deepen your knowledge. Good luck.
Thanks
Hello
heyo oyo
Hi I'm trying to use Qwen3 with vLLM, can anyone share a starter notebook for this?
why is my name not showing up on the leaderboard? I thought kaggle would have auto-selected my submissions?
I tried everything I could to debug it. Is there anything else I can do? Any help would be greatly appreciated.
@zealous terrace not sure if you are right person to ping, but maybe remove this user, because he post scam
He post this in all channel
Where can someone find these books?
Amazon or google their titles + pdf and you should find links to ahem "free copies". Allegedly
Thanks
hlo bro can anyone suggest me some video resources to learn mlops for free
hello , is there like a chill discord group of beginers who have started doing KAGGLE?
This one
me when bot
Hola
🚀 Team Formation for DRW - Crypto Market Prediction 🧠📈
Hey everyone!
I’m putting together a strong, hard-working team for the DRW Crypto Market Prediction competition. The goal is clear: at least a medal, ideally more.
Looking for folks who are:
• serious about pushing for a top spot
• ready to share ideas and commit to regular iterations
If you’re motivated, eager to participate, and would like to be a part of the team, I’d love to see a starter repeater notebook with valuable insights. It’s totally fine if it’s public or builds on others’ work — but please don’t just copy-paste someone else’s notebook. What matters most is that I can clearly see your own thought process, effort, and dedication to the competition.
If that sounds like you, feel free to DM me or reply here. Let’s build something sharp together. 💪
Still 1-2 slots available. Going to finalize the team before 25.06.
After that hard joint work towards the medal 💪 💪 💪
was up guys
https://vision.hack2skill.com/event/bah2025?utm_source=hack2skill&utm_medium=homepage
hello everyone!
this is hackathon by isro! ( Indian Space Research Organisation)
Welcome to the Bharatiya Antariksh Hackathon 2025 a nationwide innovation challenge organised by the Indian Space Research Organisation ISRO with Hack2skill as the innovation partner We are calling upon undergraduates graduates postgraduates and PhD scholars from across India to bring their creativity passion and problem solving skills...
help people help you - don't provide the code in this form, please
Ahh gotcha — will post it properly next time. Thanks!
best way is github repo or a notebook where people can see and test your code on their machine
Thanks for the info, man! I really appreciate it^_^
what do they mean by offline-first
🚨 Collab Call: Gemma 3n + Education Hackers Wanted!
Hey folks — I’m Apreddy, solo builder of Lab2Life — a global hands-on science learning platform that just shook up Bolt Hackathon 2025. Now, I'm extending it to Gemma 3n (Kaggle Hackathon) to bring offline, privacy-first, multimodal science learning to underserved kids around the world.
Looking for 1 solid collaborator who’s got:
✅ Experience running Gemma 3n locally (preferably with Kaggle or on-device setup)
✅ Knowledge of multimodal AI (images, voice, text fusion)
✅ Comfort with Edge deployment, privacy-first AI, or education tech
✅ Bonus if you know lightweight backend/devops for packaging
💼 What You’ll Do:
Help build and demo a voice+vision AI Lab Assistant powered by Gemma 3n
Assist with integration (voice prompts, object recognition from webcam, etc.)
🎯 Goal: Build something powerful. Ethical. Fundable.
DM me or drop your GitHub/Kaggle/Portfolio if you're game.
Let’s bring science to every child — even offline.
i made these submissions but I forgot to select them, but I thought it would be fine because of auto-selection. However, my name does not show up on the leaderboard
If u don't mind just curious the production implications of Gemma 3n based apps are the apps would be 2-3gb download size and is it gonna be usual ?
It means that it should work without an internet connection
Hi, is anyone here also doing the DRW crypto competition? Please feel free to message me if you are. We can discuss some strategies and maybe ways to improve our model.
hey, I also working in the competition, do you want to discuss more
yes sure!
Hello all! I am free too. I have tools but not team.
Hello! Are you also doing DRW? If so, do you want to discuss in more detail?
You mean "DRW" as in Web3 on Python, etc.? Or delayed read/write? Web3, yes: smart contracts, etc. But I prefer to use Python for everything on blockchain.
This Gemma seems very good. The competition is about applying the models in the real world, right? I don't know exactly which area I will probably go for.
Based on what I have seen on the Google site, the best shot is science. It's easy to integrate Gemma with chemistry or proteins. Most real-world tools are made for scientists, and they always use Python.
If you mean "DRW" as in blockchain, there are too many solutions already. Only a few have solved real-life problems, like Thor, BitTorrent, IPFS, and Helium. Maybe on blockchain you can use IoT via IPFS and build a decentralized model to predict climate, etc.
That's just my opinion after 8 years in Web3.
Another tip: Web3 (if that's what you mean by DRW) has a terrible problem with non-human behavior (bots, scripts, etc.). If you build a model that predicts whether an address or user is human or not, you will solve a very important real-life issue, both in trading and in Web3 games.
Yes. Crypto market. Just check here.
Go ahead. I can help you. I already have many notebooks about it: sniper bots, pushing ABI contracts, listening to open contracts to check whether they leave an open proxy to trigger the pool, etc. But I don't use AI for it. You can add me and I can help you. Not sure if I will go all in on crypto.
My app is downloading the model during startup. The APK doesn't include the model
I'm curious, how fast is Gemma 3n running in your devices? In my Pixel 8, it takes almost a minute to generate an answer from the input image
i didnt download it yet btw its meant to run better on pixel ig
btw cant access gemma3n from ai studio
I downloaded it from Hugging Face
I am working on this, would love some help my score is 0.06 right now
Added you
BTW, what's the point of cheating in the DRW competition? Do these guys actually think that {insert the hedge fund they're dreaming about} will seriously consider their application?
Colleges might
Some high schoolers use it as an extracurricular
And get into really good schools like Berkeley through the method
Not through cheating lol
Kaggle changed the tier system and I lost some notebook gold and the GM title. I've shared a list of notebooks where I lost gold. Hopefully with your help it can be resolved. https://www.kaggle.com/discussions/accomplishments/589506 I am pretty sure some of them can be usefull for the DRW competition. Best, Lucas
Every system changes with time
Can anyone help me with the submission for the CMI detect behaviour competition? I'm unable to generate the necessary parquet file from the predict function, and hence unable to make the submission.
has anyone been able to run gemma3n on a Pi?
I have submitted a notebook for the MAP - Charting Student Math Misunderstandings competition, but it's been running for 35 mins lol
Is this normal ?
Yes
It's normal; on average it takes a few hours.
thanks!
I feel we understand a lot when we go through others solutions as well
I want to reach Expert on Kaggle; can anyone help me?
I am facing the same problem here. The notebook ran successfully within a few minutes, but it is taking forever for Kaggle to score my submission. It has been 5 hours and it is still running.
Then the score probably won't be as good, I'm afraid.
Mine ran for 7 hours and gave the lowest score 💀
yeah mine gave timeout error 😦
I am looking for someone to help me with how to do the exercises in the [Intro to Machine Learning] course on Kaggle.
[https://www.kaggle.com/code/wonderfulexcellent/exercise-your-first-machine-learning-model/edit]
The new Gemini Embedding model is now live in the Gemini API. Start building better Retrieval-Augmented Generation (RAG), classification, and search today.
It’s our most powerful and versatile text embedding model ever, achieving top-ranking scores on the Massive Text Embedding Benchmark (MTEB) leaderboard, and is priced at $0.15 USD per million tokens. Gemini Embedding supports 100+ languages and has a controllable embedding size.
You can generate text embeddings by using the embed_content method:
from google import genai
client = genai.Client()
result = client.models.embed_content(
    model="gemini-embedding-001",
    contents="What is the meaning of life?"
)
print(result.embeddings)
Gemma is too slow even on a PC... We need a Raspberry Pi with superpowers, haha. Maybe someone on Hugging Face has modified it. I have no idea why they are asking for mobile. The price of an embedded system that can run this kind of thing is sky-high...
Awesome development
Ok this whole time I've been trying to make something that can run on mobile. But ultimately, there's gotta be a way, it's just not cracking for me
Running these models on a cell phone is not easy. When I started as an insider (before Gemini was online for everyone), Gemini Nano was the only version able to run on mobile, and it didn't use the full model. Gemma 3n will not work well unless you call it through an API. Running it locally is insane.
If they say "mobile devices", only a new Raspberry Pi (the most common mobile device) can run it. So you can use it there, trigger some sensors, and you're done. Only IoT devices can run this model. Intel also has good IoT devices, but the prices are sky-high... This model is definitely not made to run as-is on a cellphone. Think of "mobile devices" as any IoT system that is close to a computer.
Hi all!
I’m forming a team for the RSNA Intracranial Aneurysm Detection competition.
🔹 Skills we’ll use: Python, TensorFlow, Google Colab (AWS optional for architecture).
🔹 Goal: collaborate on data preprocessing, modeling, and optimization.
If interested, drop a 👍 or DM me! 🚀
Hi, I am new to Kaggle and had a question regarding the submission criteria. What does it mean that my submission has to have a runtime of less than 11 hours? Does it mean that the entire 2500 inferences should be completed in less than that?
And also the no-internet rule. If my code relies on some packages like nibabel or, idk, torch, then how does the no-internet rule work?
Thank you for your help
I think the no Internet rule is about accessing things live, not about using packages
Yes, the entire pipeline — including all 2,500 inferences — must complete within the 11-hour runtime limit.
Regarding the no-internet rule: any form of external access (e.g., API calls, cloud-based data fetching, or live searches) is strictly prohibited. To enforce this, internet connectivity is disabled during runtime, so all required packages and data must be pre-installed and bundled with your submission
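If it helps, one common way to "bundle" a package is to attach a dataset of wheel files to the notebook and install from that folder with pip's `--no-index` and `--find-links` options. A rough sketch under those assumptions: the wheel folder path is hypothetical, and `offline_pip_command` is just an illustrative helper that builds the command.

```python
import sys


def offline_pip_command(package, wheel_dir):
    """Build a pip command that installs `package` only from local wheels.

    --no-index stops pip from contacting PyPI; --find-links points it at a
    local folder of wheel files instead.
    """
    return [
        sys.executable, "-m", "pip", "install",
        "--no-index",
        f"--find-links={wheel_dir}",
        package,
    ]


# Hypothetical path: a dataset of wheels attached to the notebook.
cmd = offline_pip_command("nibabel", "/kaggle/input/my-wheels")
```

You would then run it with `subprocess.run(cmd, check=True)` at the top of the notebook, before importing the package.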
Thank you for the answersss
Looking for experienced experts to work on a medical AI/ML project and a paper for ICML
Is anyone getting "notebook threw exception" when using kagglehub ?
I suffered multiple "notebook threw exception" errors today due to this and it was fixed by hardcoding the competition directory rather than using kagglehub.competition_download 🤔
or maybe it's because I submitted shortly after refreshing my kaggle key
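A sketch of the "hardcode the directory" workaround, assuming competition data is mounted under `/kaggle/input/<competition-slug>` inside the notebook. `resolve_competition_dir` is a hypothetical helper; `kagglehub` is only imported as a fallback when the local folder is missing.

```python
import os


def resolve_competition_dir(slug, input_root="/kaggle/input"):
    """Prefer the locally mounted competition folder; fall back to kagglehub."""
    local_dir = os.path.join(input_root, slug)
    if os.path.isdir(local_dir):
        # Already attached to the notebook, so no download or credentials needed.
        return local_dir
    # Lazy import so offline scoring runs never touch kagglehub.
    import kagglehub
    return kagglehub.competition_download(slug)
```

During a scored run the data should already be mounted, so the kagglehub branch would never execute.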
anyone in the AeroClub RecSys 2025 competition ?
how do i make a submission
do i need to show the notebook or just the final file ?
cause i can't handle the whole dataset in the notebook
@everyone I built my app before submitting to the competition. Should we keep the API key in it, or should we redact it so the competition judges use their own API keys?
Hey everyone!
Long before the launch of Kaggle’s Game Arena, I developed a real-time AI chess battle platform where you can watch different AI models play against each other.
🔗 Try it here: https://adaptive-ai-chess-game.streamlit.app/
🎮 Modes available:
Human vs Adaptive AI
Adaptive AI vs Hyperbolic AI
It’s a fun and insightful way to observe how different AI strategies behave on the board — ideal for anyone into AI, reinforcement learning, or game theory.
📩 If you're interested in collaborating, expanding, or integrating it into more formal benchmarks, feel free to connect:
👤 Karthikeya Guduru
📧 karthikeyaguduru19@gmail.com
🔗 https://www.linkedin.com/in/karthikeya-guduru-70227b262/
@indigo jasper Hi Melissa, I’m with the Hailuo AI (MiniMax) team. We’re running an AI hackathon with a $100k prize pool and would love to partner with you. Who’s the best person to speak with? Thanks, Eric
Hi, my team and I are participating in the Gemma 3n Hackathon competition. When we tried to submit our writeup on the Kaggle competition website, an "Internal Error" was shown and our writeup could not be submitted, even though we tried submitting it minutes before the deadline...
Here is the topic we created on the discussion presenting our solution and the error we faced: https://www.kaggle.com/competitions/google-gemma-3n-hackathon/discussion/597689
We invested a lot of effort into this competition throughout the whole month and would like to kindly ask the Hackathon organizers and Kaggle support team to take a look at this and take our submission into consideration. If anyone can help me out I would be very grateful! Thank you.
@covert heath @terse blaze @tight grove Appreciate any help on this. Thanks
We just launched the MiniMax $150,000 AI Agent Challenge, hope that clarifies what I’m after. If this isn’t the right spot, I’d appreciate a redirect. Thanks. https://minimax-agent-challenge.devpost.com/
Thank you @pulsar mantle for your message. For partnership related queries, please reach out to bflynn@google.com. Thank you!
Not Comp, but I encourage able users to look into Rett Syndrome research if you'd like to help some kids. This directly benefits a user within the red team community, and I'm sure they'll appreciate any effort put toward helping those with the disorder. @covert heath delete if inappropriate. 🐉
did u redirect it from devpost
its not there
Yes, on luma now https://lu.ma/2u17h1zw
Btw why not on kaggle
We need a hybrid hosting strategy for both the Developer and Creator tracks. Kaggle is great for the Developer track.
However, one of our team leads found that we may not be eligible to post a competition since we’re not asking users to solve a specific problem.
Right btw what advantages do companies usually get hosting these hackathons
Branding, awareness, networking, etc. Imo.
Right, but compared to the $1M ones, how do you plan to differentiate the $150k ones? To participants, $150k ones are quite common
The company runs several competitions, however, the major ones, as you mentioned, are mostly for innovation, ecosystem growth, talent acquisition, etc.
This one, minimax, is on a smaller scale but will still contribute.
Hi, I am facing a "Notebook threw error" error during competition evaluation. The notebook runs fine when I run it manually. Does anybody know how to solve it? (Jigsaw competition)
There's likely something in the hidden dataset that crashes it, e.g., NaN values.
I found it useful to comment out pieces of code and submit the incomplete result. It should help you narrow down which parts of the code cause the issue.
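Another option, as a rough sketch only: wrap the per-row inference so that a bad row in the hidden data produces a fallback prediction instead of crashing the whole run. `safe_predict`, `is_missing` and the fallback value are hypothetical, and `predict_fn` stands in for whatever your model call is.

```python
import math


def is_missing(value):
    """True for None or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def safe_predict(rows, predict_fn, fallback=0.0):
    """Predict row by row, substituting `fallback` when a row is bad.

    Returns the predictions and the number of rows that fell back.
    """
    predictions = []
    failures = 0
    for row in rows:
        try:
            if any(is_missing(v) for v in row):
                raise ValueError("missing value in row")
            predictions.append(predict_fn(row))
        except Exception:
            # Keep the submission alive; count failures for later inspection.
            failures += 1
            predictions.append(fallback)
    return predictions, failures
```

If the submission scores once this is in place, the hidden set probably contains rows your code couldn't handle.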
