Starting this channel off and hoping to set the tone a bit, I want to share a little easter egg inside our unit test code: WithLaunchDate(new DateTime(9999, 1, 1)); // TODO: change when getting close to that date. I find having a few jokes like this helps make writing unit tests more fun, and if we're having more fun we'll hopefully write more tests!
#๐ปโask-a-dev
1 messages ยท Page 1 of 1 (latest)
That Y10k bug will be the end of the world. ๐
Old world has long ended, we are in the new world!
Random behind the scenes info, some of you may have seen the linked roles process (or had to go through it) and you'll notice now it's no longer needed. We originally used the Discord Linked Roles feature which lets you link to 3rd party applications and have roles dynamically set through that. While being a neat feature and easy to integrate with on our backend, it had a few downsides. Biggest one was that it was a lot of extra steps for people to go through. We now manage the roles directly with our bot.
Iterative development means we get stuff out quickly and can adapt quickly, but of course comes with the downside of some of you may having to use the more complex version first!
I thought the linked roles connected our Discord accounts with our Kaggle accounts. I am curious how that is automated by bots now. Or was that lost in the trade-off for a simpler experience?
The accounts are still linked, it's just the roles themselves that is now automated. So you can link your account through https://kaggle.com/discord/confirmation, and on Kaggle we store that link, and then update everything relevant (what roles you should have, and the discord nickname)
Ahhh I get it now.
The one thing we did lose in the trade-off is the additional little note that appeared with a link to your profile. It's an unfortunate loss but the simplicity of setting it up is worth it. And we're looking at a way to replace that (like a /who .nathanforyou command)
If anyone has ideas on alternative ways to show that link, definitely open to them. We considered using the game presence API but it's really not meant for that and would require constant access to your discord account (whereas right now we just need that access when it's first linked)
Here's a fun behind-the-scenes tidbit. Later this week we're releasing Meta Kaggle for Code, a dataset of public, Apache 2.0 licensed notebooks from Kaggle. And this is the very first public notebook ever created (KernelVersionId = 1). An R script I'd guess written by Ben Hamner, Kaggle cofounder / former CTO. SDS stands for "social data science", an early name for the product.
One thing I'm excited about with Discord is an opportunity to be a bit more experimental. The nature of the platform means we can try things out and change as needed. To that end @foggy oyster had a great suggestion to automatically publish job posts and we spun up a new jobs-feed channel where they can publish jobs through a webhook. Huge thanks to @narsil for adding something that will hopefully provide a lot of value and I'm excited to see what else our community thinks of. Know that we're very keen to trying new things out here
Just set up an event for tomorrow, a live "fix-it-friday" where I'll be working on small changes hopefully suggested by you! If you got a good idea, or a small thing that's been bugging you, join in and ๐ค you might see your change on kaggle.com just a few hours later: https://discord.gg/kaggle?event=1139305139186970754
Hello all I would like to know if it's possible for me to download and save a model from hugging face on the kaggle notebook once and for all so I don't have to download the model every time I want to use it, if there's please could you point me to a resource I could use to achieve this. Thanks
If the model is also on Kaggle Models, you don't need to deal with the hassle of downloading every time -- it just "attaches" to your notebook the same way Kaggle Datasets, do. What's the model?
Thank you @pine narwhal for your prompt response,I plan to load the CompVis/stable-diffusion-v1-4 model from hugging face using the StableDiffusionPipeline from the diffusers liveary
Got it -- unfortunately we don't have any RAIL-licensed models yet ๐ฆ Unfortunately you may need to continue to deal with the download slowness. An alternative which I feel bad recommending because it's pretty messy is to upload the model files to a Kaggle Dataset (or search and see if someone else has already done it) -- I believe a lot of people use this workaround.
We'll keep working on support for RAIL-licensed models and hopefully we can offer something improved in the future.
Thanks a lot for your response I will try uploading the model to dataset hopefully that works ๐ญ if any other issues arise I will head back to this channel. Thank!
Does anyone have any idea as to why this errors out on kaggle
With a ConnectionError
ConnectionError: HTTPConnectionPool(host='www.robots.ox.ac.uk', port=80): Max retries exceeded with url: /~vgg/data/pets/data/annotations.tar.gz (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7bb293d67160>: Failed to establish a new connection: [Errno -3] Temporary failure in name resolution'))
But seems to work perfectly fine on Colab?
Both use tfds version 4.9.2
I thought this the best channel to ask since it seems a kaggle-specific thing, but if not, please let me know and I'll ask it in a different channel :)
@red oar the post that you linked in your mass downvotes topic had 4 downvotes, not just 1
I have save and run all in notebook kaggle . But when download notebook. That notebook not appear output. How to solve it? . Isnt bug or something else?
Can someone please review my code?
It is about a puzzle of priests and devils that is meant to be solved using Breadth First Search. Currently it seems to be working. But I am just learning about the concepts, so I may have looked over some things.
Hey, I wanna ask Kaggle team.
Why there is no conda in tpu kernel?
It's frustrating to manually configuring everything...
I am looking for partner for learn ML together. I am beginner
I implemented a model using keras.Sequential and then I used the model.predict method to get some predictions of the training set. Then I compared the predictions and the positive class of the true labels and I noticed that the model gives a probability bellow 50% sometimes, does this means that the model misclassified those points?
Or those numbers mean something else and I am misinterpreting them?
I assume those are class 1 predicted probabilities in the left column (I can't tell based on the screenshot, you need to verify). Depends on how you convert the class 1 predicted probabilities to predicted classes. If you decide to predict class 1 if the probability is above 50% then yes, those points would be misclassified. If you want, you could lower the critical probability threshold to maybe 35% and all of the points on the screenshot would be predicted class 1 correctly. However some other points you don't show here might be misclassified.
Interesting. And thanks that makes sense. But in terms of having an accurate representation of my model uncertainty, should I do that? And lower the threshold? ๐ค I guess it depends on the application right? How can I get the recall in keras?
No clue what you are predicting here, can't help with that. But in terms of evalution metrics, the first choice you need to make is whether to convert the probabilities to predicted classes in the first place. There are metrics (like logloss) that compare the predicted probabilities to the true classes directly so you might not even need to choose a critical probability. If you need a predicted class, choose an evaluation metric best for your problem (accuracy, recall, precision, f score, etc), tune the critical probability and figure out which p_crit optimizes your metric.
Thanks you very helpful
anyone here have any updates on the current firebase auth situation, from kaggles end? Or is the team just waiting for firebase to figure their side out?
I'm having issues pip installing statsforecast on Kaggle, anyone else?
nope, new notebook gets it done in about 15 seconds
Hello,
My friends and I are working on creating a project which should be able to detect dark patterns on the internet, particularly let the user know of its type and how it works, I know that detection can be done by means of machine learning models, but since the user would be checking on live websites, we'll be giving it as a kind of browser extension.
My question is, how do I let the user know in detail? Alternatively, is it possible to use an LLM as an extension to give a chatbot kind of vibe???
Please help us out, and lemme know if you need a deeper understanding
NOTE: we are pretty much beginners but just wanted to take the chance the make a bigger project, to truly learn diverse skillset for a hackathon
Hi
Are there any plans to make the "k" on the tab of the website Red when the notebook gives an error?
it would be cool.
Haha, I like the way you think. I think for branding purposes we might not do it that way, but it would be cool if it's possible to have an indicator on the tab icon of your notebooks run status.
Is it possible with kaggle cli or python api take indiviusal files from the output of kernels?
Does anyone know a ml model that we can use to count the number of people in a photo? Like a model from hugging face to demonstrate the impact of ml
On colab after giving certain permission, when notebooks are executed successfully or an error occurred, a Windows notification pops up with a sound and an indicating message
Will we have something similar? Instead of a tiny โ>โ running and โkโ finished tab icons? ๐
I also wanted to report an issue where when i copy(drag) a plot image from inside the notebook to my desktop, the notebook freeze and i have to refresh.
What is the definition of output refresh? Often after a period of fine-tuning, the model is deleted and cannot be downloaded. Is there any way to solve this problem?
Hmm, I'm not sure about the specific issue you are encountering - but it can probably be solved by saving a version "run all and save" button - then you will be able to always download the output from the viewer page rather than in the editor.
I am locked out of my main kaggle account, the one I am using on this discord is a new one I made so I could get support on here. I have done the forgot password prompt multiple times over multiple days and I have not gotten a single email with the code it says I would get. I know my email isn't blocking Kaggle because I get forum updates + I got a multiple account warning when I tried to phone verify and just ditch the old account.
I literally cannot do any work, because I need to be verified to get internet but my phone number is verified with the account I am locked out of
We can't provide any support on discord. Emailing support from here https://www.kaggle.com/contact is the only way to get help. They should be able to fix things up (but please be patient).
A lot of the common inquiries we receive are listed below. Please click on the one that applies to you to learn more.
Is there like a general support email? None of the links seem to fit my issue
@meanmob - Yeah it's a little annoying, honestly just pick the closest and explain the situation. I'd look at the "i
- I'd like to change my password one
That's what I did - thanks!
It might take a few days for support to get back to you, thanks for being patient and sorry you are having issues blocking you from doing work.
All good - I'm still working through a textbook rn so I wasn't gonna be doing any competitions or using datasets soon
To put it simply, after training the model, it is often too late to store it because Kaggle is not used for a while, is there a recommended way to store these models permanently?
@still spear in the future it will be possible to upload models to kaggle.com/models , right now some people get around this lack of functionality by saving model weights as datasets.
OK
Hey guys! I'm working on a sales forecasting project using lstm and cnn (multivariate and multistep). When i forecasted the data for 12 months with a lookback of 24 there is a lag of 1 month in my forecasts seasonality.
sequence splitting code :
def split_sequences(sequences, n_steps_in, n_steps_out):
X, y = list(), list()
for i in range(len(sequences)):
# find the end of this pattern
end_ix = i + n_steps_in
out_end_ix = end_ix + n_steps_out
check if we are beyond the dataset
if out_end_ix > len(sequences):
break
# gather input and output parts of the pattern
seq_x, seq_y = sequences[i:end_ix, :-1], sequences[end_ix:out_end_ix, -1]
X.append(seq_x)
y.append(seq_y)
return array(X), array(y)
The model :
callback = tf.keras.callbacks.EarlyStopping(monitor='root_mean_squared_error',patience=6)
model = Sequential()
model.add(tf.keras.layers.InputLayer(input_shape=(24, 3)))
model.add(Conv1D(filters=32, kernel_size=5, activation=tf.nn.leaky_relu))
model.add(Conv1D(filters=64, kernel_size=5, activation=tf.nn.leaky_relu))
model.add(MaxPooling1D())
model.add(Conv1D(filters=128, kernel_size=5, activation=tf.nn.leaky_relu))
model.add(MaxPooling1D())
model.add(LSTM(256, activation=tf.nn.leaky_relu, return_sequences=True))
model.add(Flatten())
model.add(Dense(200, activation=tf.nn.leaky_relu))
model.add(Dense(100, activation=tf.nn.leaky_relu))
model.add(Dense(12, activation='linear'))
model.compile(loss='mse', optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.001), metrics=tf.keras.metrics.RootMeanSquaredError())
model.fit(X, y, epochs=100, verbose=1, batch_size=1, shuffle=False, callbacks = [callback])
Green is predicted
Blue is actual
This is a chart where I tried forecasting the test data
Hello all,
I'm just a begineer to this field. I'm facing a problem or in simpler words stuck in a loop.
I'm pretty well aware about the theory and conceptual knowledge required of py, kaggle, maths, ml and all, but I'm not able to put things together to build my FIRST ML MODEL. Can anybody of you help me out with this.
Hi, I have a question in Python I need help with
I have a list of integers:
[252944, 385492, 12345, 47274, 567583]
For any 5 digit number, I want to put a '0' in front to make it 6 digits
So 12345 -> 012345
How do I go about that?
If anyone can share sample code for this would be appreciated
โค๏ธ
if you're ok with keeping the items in the list as strings:
list_of_integers=[252944, 385492, 12345, 47274, 567583, 12, 3] nums=[str(num).zfill(6) for num in list_of_integers]
outputs:
['252944', '385492', '012345', '047274', '567583', '000012', '000003']
Oh thank you so much!!
I appreciate this help a lot!!
I didn't realize there was a zfill method that is so handy!
Glad I could help!
Please can someone rate my code?
and give feedbacks and give tips to improve
i need a team!
In my opinion Kaggle leaderboards are meant for people who have used eda and feature engineering to the best of their abilities, not for people who have gotten lucky using things like Optuna, may I know if there are any plans to handle libraries like that? Thanks!
why does kaggle end as soon as my cells execute?
numbers = [252944, 385492, 12345, 47274, 567583]
new_numbers = [f"{num:06d}" if len(str(num)) == 5
else str(num) for num in numbers]
you can use the format specifier and f-string
To verify the number, I entered my cell phone number, but I didn't receive a verification number, so I can't verify it. What can I do?
@torn cargo Contact Kaggle support (we can't help through discord)
For some reason, ydata_profiling is not executable in the new notebook.ใHere is the old ydata_profiling I wrote. Do you know why?ใใhttps://www.kaggle.com/code/risakashiwabara/easy-data-visualization-eda/edit/run/165414369
Hi! Is there any UI/UX DESIGNER here? Can anyone here help me with my web design?
gpts enough though
How is that, are you saying that gpts can replace UI/UX designers?
not rly but replace a large amount of them
you know given a pic it could replicate front-end code
if tasks easy, gpts enough
Yes maybe UI designers to some extant, but it won't be able to replace UX designers because it needs human interaction
agreed. UX depends on ingenuity
Hello everyone! Could you suggest what I could change or modify in its UI? https://www.figma.com/file/xzl7sDZFJZPnhMHWTk2KY1/Bird-Global?type=design&node-id=0%3A1&mode=design&t=MFKDqexNZua3HVMZ-1
Just comment on it! Thanks
who needs AI cold call agent?
hi, anybody could tell me how to make this kind of bot?
Hello guys! Guys
can someone help me pleaseeee
after the model trains, i cant find it on the working dir
Here is my syntax im using:
Save the best estimator using pickle
with open('/kaggle/working/model_rf_overpowered.pkl', 'wb') as f:
pickle.dump(best_estimator, f)
print("Best estimator saved using pickle.")
# Save the best estimator using joblib
joblib.dump(best_estimator, '/kaggle/working/model_rf_overpowered.joblib')
print("Best estimator saved using joblib.")
return best_estimator
Please help kaggle is the only free platform i can use
thank you guys!!
I have already quicksaved the file with output files, and the new version when i open it no files in the working dir of either the version before trainning, and the quicksaved version in the working dir:
Why does this code not work?
Why is my classification report cursed and why is the score so low
I'm getting ValueError when loading data from the Hugging Face dataset hub and haven't been able to resolve it. Is something at the backend amiss?
!pip install datasets
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
import pandas as pd
from datasets import load_dataset
dataset = load_dataset("glnmario/ECHR")
full_data = pd.DataFrame(dataset['train'])
I was able to fix this by not specifiying a file path to save , just the model name
Hi I'm new to here, I got some rookie question to as about DS, If I said something weird or inappropriate, I'm sorry about that.
background: highschool graduated, 19years old. going for college in septemper this year
Goal: I wanna build a data analyst portfolio for my first DS job
progression: I just finish a Coursera IBM DS course, learned about basic SQL, matplotlib, pandas...
questions:
- Shoud I learn powerBI, Tableau? I saw some job requirements talk about
You should have:
a degree or recognised qualifications in Data Analytics, FinTech, Information Technology, Computer Science or related disciplines
fresh-graduates are welcome
experience in using data visualization software like Tableau and Power Bi is a plus
novice level in SQL, VBA, Python, R or Advanced Excel is preferred
knowledge in banking, finance or treasury is an advantage but not a must
- what projects should I build that could show all my skills? I google some, but it's too complicated
I'm a sophomore computer science major, and I've been really grinding with my studies. Between coding questions on platforms like LeetCode, keeping up with coursework, and trying to learn new skills like data science and machine learning - I've been putting in a ton of effort.
I'm not in it only for money - I have a genuine interest in CS and want to make a positive impact in this field. However, with all the noise around generative AI, the market recession, layoffs happening, etc., I can't help but feel anxious. It makes me wonder if all my hard work will pay off in the end ?
I know I'm still early in my journey, but the current conditions make me question whether I'm on the right path. Is specializing in an area like AI/ML still a viable long-term option? Or will roles be significantly impacted by AI ? I don't want to spend years mastering skills that could become non existent.
Has anyone else been grappling with similar concerns? I'd really appreciate some perspective from other CS students and professionals in the field. Am I overreacting to the headlines, or is a career pivot something I should be seriously considering? I'm determined to follow my passion, but feeling a bit lost on the best way forward given the rapid changes happening in tech.
Let me know your thoughts! I'm all ears for any advice or insights you can share.
โIโve developed a project in Python
kindly check and give feedback on areas for improvement or necessary code changes.
https://github.com/Manavalan2517/Real-Time-Authentication-System-in-Python
use joblib instead
and try and use ! ls -alf to check ur working dir
Is it possible to transmit text from an external source (a file run using Python in VSCode) to Kaggle (as the input terminal for executing code), receive the results, and send them back to VSCode in a loop rather than all at once?
Can AI Learn Forever? Tackling Catastrophic Forgetting.
Is there a way to delete past versions of a notebook?
I created 3 consecutive failed versions of my notebook because I forgot 3 different imports. It's a bit strange to keep those versions because now it's a v7 when it should really be a v4.
In case it's relevant, this is the notebook I've been playing with where I find the issue: https://www.kaggle.com/code/adriadejuan/eda-anova-xgboost-for-insurance-claim-predict
Seems like you are using python 3.9, but your pip3 is for python 3.12
Is there a way to hide cell output in editor?
I am using LlamaIndex, and when creating an Index (or some other operations), there is a "batch" progress in the output. I can "enable output scrolling", to somewhat "hide" the output, but when processing very large data, the "batch" is generated in very large amount and it lags out my laptop.
try putting ; at the end of the line that is producing unwanted output
Nope, still produces the same output
then try:
!ls > /dev/null 2>&1
or
import subprocess
subprocess.run(["ls", "-l"], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
Still no change
in the same cell
Hi,
Anyone have knowledge pytorch_forecasting library?
I have a question about it.
Hi everyone
Can you help me please select csv dataset for my study project
It must contain uncleared data, that will be used to create regression or classification model
I am working on a data set regarding large stocks in kaggle. I am trying to find the trend by year. I am using this excel formula: =IFERROR(IF(YEAR(INDEX(A:A, ROW()))=2005, AVERAGE(XLOOKUP(N2, INDEX(I:I, ROW()):INDEX(I:I, 992512), INDEX(K:K, ROW()):INDEX(K:K, 992512))), ""), "")
A - Dates, I - Stock name list, N - Stock name to seach/apply to, K - current trend
here is a screen shot
I need an excel formula that is =IF(YEAR(A2:A992512)=2005 then Xlookup stock name from N2, find all stock names in I2:I992512 and then average all K2:K992512. How can I write this?
help me
I am making a software aka web app to segment floor and apply pre-made texture. The texture will be in png formate and square shape tile.
Now by using segmentation model I segmented the floor, but now it time to apply texture in cm with dept in image that where i am stuck from many days the linear look of texture dosent look real so we need to do pov scale (near tile bigger further tile smaller) but here issue is of room direction again please sugggese me what I do in this
I tried prespective scaling but room direction is another issue i tried using inpainting but they never listen to prompt and never took my texture as parameter
hello everyone .
is there anyone who can help me improve my background extraction code, which extracts the front garments from any type of background where it is located?
There is my code :
def background_extraction(image):
if image.shape[2] == 4:
image = image[:, :, 0:3]
gray = rgb2gray(image)
#img_equalized=exposure.equalize_hist(gray)
#gauss=gaussian_filter(img_equalized,1)
gauss_top=white_tophat(gray,disk(2))
outline = extraction_class.outline_detection(gauss_top, sigma=1)
#outline = white_tophat(gauss, disk(2))
filled_outline = binary_fill_holes(outline)
structurant_element=disk(2)
smoothing=erosion(filled_outline,structurant_element)
#smoothing=dilation(filled_outline,structurant_element)
extraction = np.copy(image)
extraction[~smoothing] = 0
return extraction
def outline_detection(image,sigma):
gauss = gaussian_filter(image,sigma)
sobelX = np.array([[-1,0,1],[-2,0,2],[-1,0,1]])
sobelY = np.array([[-1,-2,-1],[0,0,0],[1,2,1]])
derivX = convolve(gauss,sobelX)
derivY = convolve(gauss,sobelY)
gradient = derivX+derivY*1j
G = np.absolute(gradient)
threshold = threshold_li(G)
s = G.shape
for i in range(s[0]):
for j in range(s[1]):
if G[i][j]<threshold:
G[i][j] = 0.0
return G
Um. Using advanced edge detection methods like Canny or deep learning-based methods can improve accuracy๐
IF(YEAR(A2)=2005, AVERAGEIFS(K:K, I:I, N2, YEAR(A:A), 2005), "")
Please why am i getting this error
did u import pandas as pd
what do you mean
Pandas is a Python library useful for exploring your dataset. You need to import it first before using it.
import pandas as pd
So when you type pd.read_csv('train.csv') you are telling pandas to read the csv file and you can use the variable train_data to access it.
Thamks sSir
I've been learning ML for a month or so and i am working with a churn prediction dataset from kaggle and there i can see that there are a lot of class imbalances. especially Balance, Has credit card and Exited.
as a result of this the recall scores for the 4 of the models i have tried look like a joke(they don't go beyond 50 roughly) and the f1 scores are as bad too
also the credit score is hard-capped at some value around 850 and it also has a bunch of data and idk if i should cap the whole thing at 800 or something because the number of instances at the hard-capped end are decently high
How am i supposed to work around all of these. I am not sure if I should add synthetic data using SMOTE to all of it...well because its synthetic . I also don't know if its wise to lose data by cutting the dataset
dataset for ref
Hey,
Who you setup arize account to run this google colab: https://colab.research.google.com/gist/PubliusAu/3c0b73ecf5558e4dda2a483693be2b93/arize-wide-llm-kaggle-example-v2-0-1.ipynb?authuser=2#scrollTo=3lgvY4Vdx3N_ ?
Please Why is this not working please
i have successfully imported all the other documents for train and test
- Could you have defined your training data with a name other than "train_data"?
- Or you may need to restart the notebook if time has passed.
I am not very knowledgeable either, but these are the scenarios that come to my mind at first look.
Hey everyone, I am trying to build a model similar to Maskformer for semantic segmentation with 4 classes (class 0 is background). I am confused about two things:
First
Can you check if the way I perform learnable queries for each batch image is correct in this code?
self.queries = nn.Embedding(4, 768) # 4 queries, each with 768 dimensions, matching the shape dimension of Fe
Then in the forward method, I repeat it to match the batch size:
queries = self.queries.weight.unsqueeze(0).repeat(batch_size, 1, 1)
updated_queries, updated_features = self.transformer(Fe, img_pe, queries)
updated_features = updated_features.transpose(1, 2).reshape(bs, 768, 8, 8)
Second
Given the shapes of Fo and Qo, can anyone help me perform the sigmoid between Fo and Qo to get the binary masks for each class? Also, can I use softmax, and if so, how should I use it?
Qoshape:[8, 4, 768]# (batch_size, num_class, queries_dim)Foshape:[8, 4, 256, 256]# (batch_size, num_class, height_original_image, width_original_image)
Thank you for your help!
sigmoid_output = torch.sigmoid(torch.matmul(Fo.view(-1, Fo.shape[2], Fo.shape[3]), Qo.transpose(1, 2)))
And, I would suggest adding a bias term to the queries by adding a learnable vector queries_bias and applying it to the queries using torch.nn.functional.bias_add
Hello everyone,
I'm building a summarization system. Here are some challenges and questions I've.
- Don't have ground truth summarization for my dataset. Created synthetic summaries using Llama 3.1 405B, but not sure how to measure it using metrics and which metrics to use.
- What would be the best approach to solve this type of problem? Fine tuning, Pretrained with Prompt Enginerring with DSPY?
- Which metrics will be better to compare GT with Generation? Currently using Rouge and BERT Score
Need urgent advice to work on. If someone have built a system like this I'm open to any advice
Thanks for reading
JUST SCROLLING UP HERE in this chat thread. ๐ง WOW. For all you having specific technical problems, why don't you use AI to solve the particular aspects of your problems. I mean not in the overall algoritm to solve your problems, but particular aspects. I come up with overall, ON MY OWN, and use Ai to implement the specifics. TRY IT!!!
Is there a way using softmax to get the segmentation mask by multiply Fo and Qo ?
Help me understand, Importance of homogeneity of variance before Annova test.
I have been learning about Annova test. i have read in order to perform Annova test, it is necessary to check groups have normal distribution, homogeneity of variances, groups are independent.
Then, I saw tutorials of using Anova test in python. I have never seen anyone checking for homogeneity of groups. ls it not important?
Hi, everybody.
I have a question
I need to extract the abstract of papers not using GPT4, I have to rely on local resource.
from py_pdf_parser.loaders import load_file
from py_pdf_parser.components import ElementOrdering
document = load_file("JPM-2022-Harvey-25-46.pdf")
file_path = 'JPM-2022-Harvey-25-46.pdf'
document = load_file(
file_path, element_ordering=ElementOrdering.RIGHT_TO_LEFT_TOP_TO_BOTTOM
)
So I parsed the pdf using py_pdf_parser, and I'm going to merge the pieces until obtain the compelete abstract.
Now I try to use embedding models for this. But that doesn't work well.
If somebody has solution to about this, please help me.
In the case that I have to use the LLM models, the size should be under 2GB.
Thanks!
I think you can instead of using an embedding model, you can create a smple word-to-vector mapping using techniques like TF-IDF or Word2Vec
This will help you capture the semantic meaning of the words in the abstract
Thanks for your response, @cobalt raptor
I solved this with the gguf model of Gemma of which size is 1.5 GB.
ok, I have one suggestion. Could you please accept me?
Hi, everybody. I have a question.
I want to make a method to architecture the neural network for given real problem.
Is this possible?
So, I mean can we make the certain arhictecture of network based on neuro science?
Please help me overview of this and methods.
Where I can find the proper references?
Has anyone worked with video classification using convlstm? Want to connect for some help.
can anyone help me ? whenever i went for submit it show inference error and got reject
Hello, I have a question, can I import a csv that is in kaggle from google colab for example? I ask since routes are generally relative
buenas tengo una consulta, puedo importar un csv que esta en kaggle desde google colab por ejemplo? pregunto ya que las rutas por lo general son relativas
i would download the csv and upload it to google colab
Hey everyone,
This is Harsh and I am currently working as an AI researcher and have experience in machine learning. I recently started participating in Kaggle and would really appreciate it if you could take a look at my notebook for the Used Car Regression competition and provide feedback. Hereโs the link: https://www.kaggle.com/code/harshsharma1128/used-car-regression/notebook?scriptVersionId=199146028.
Any insights or suggestions would be incredibly helpful. Thanks in advance!
@pine narwhal I feel so strongly about Hugging Face datasets x Kaggle datasets integration I decided to tell you here too ๐ This would be an absolutely fantastic feature
amazing! if you want to chat with me about it some more, feel free to send me an email too: meg@kaggle.com
No bias here but it would be cool indeed ๐ค
Ha! HAI ๐ค
A bit sad though HuggingFace -> Hugging Face ๐ฆ
๐ค
๐คฏ Meg what are you doing here? ๐
lol Meg is the one who sent out the survey ๐
where am i right now ๐
in my defense i was on vacation when the survey was drafted but you're right i totally missed this typo -- it also used to drive me and others nuts at Stack Overflow when people would say StackOverflow ๐
Also you don't get opportunities to use emojis in survey every day
do you see @real dew on the right? ๐
o7
@jade dagger 
the note books dont work with seabor do to an up date so ๐ฉ
o thars a bug chanel sory
@hardy quail @jade dagger i shared the results of our survey here -- HF integration was #3 most important according to survey respondents ๐ https://www.kaggle.com/discussions/product-feedback/540751
[Feedback Requested] Sharing results of our Datasets feature survey.
Very cool! Thank you for sharing ๐ค
Really surprised "Dataset Creation Competitions" didn't score higher ๐ฆ
@pine narwhal One thing that came to mind and that wasn't in the survey: the ability for the dataset preview to handle parquet files. (The HF implementation for example is open source: https://github.com/huggingface/dataset-viewer)
This occurred to me in the context of uploading org datasets to kaggle and having to switch the parquet files to csv to have the viewer work.
@hardy quail that's awesome thanks for sharing. we actually have a new eng starting next week and this is likely to be their starter project so this timing is perfect. i'll share this reference with the team!
also re: less interest than expected in dataset creation competition, my hypothesis is that it's a result of the reality that there's actually not a huge overlap between users of Datasets and ppl who participate in competitions. since we targeted the survey at the former, that may be why. the other thing i didn't mention in my post which i probably should is that overall ppl rated pretty much everything as important. so it's just a relative ranking. ppl kind of liked all the ideas (i joked we should have thrown in some intentionally bad ideas ๐ )
NICE! Very excited about this ๐
ME TOO ๐
@hardy quail PS lmk if you have any other feedback on uploading org datasets - that's another thing we want to improve
@latent plank please share your work in channels like #๐พโdata ๐ this channel is for discussions w/ kaggle dev team
Ohh okay soory for that
Super smooth process all round!
One thing that confused me initially with the New Dataset process was needing to first provide data files before being able to set the creator of the dataset to be the Org profile
Hi everyone. Iโm looking for suggestions on tackling this problem. I have about a 100,000 unlabeled job description data that Iโm trying to use to determine the category of job. For example, from a job description text I want to know if itโs in IT, Software, Admin/Clerical etc. I tried using pre trained models from hugging face transformers but it didnโt work well. I have thought about labeling the data but it would take time to do it for a 100,000.
Hi!
I understand the problem. In order to have 100k job descriptions get labelled, you can first of all train a model based on labelled data, and then use that model to classify the 100k job descriptions.
Here are some datasets (labelled) I found for this problem:
-
Job Title & Description (4.6MB)
-
Predicting Job Titles from Resumes (163.94kb)
You can train classification model over this data if it suits your needs, and can label those 100k descriptions
Is it normal that datasets that I just made public don't immediately show up in an organization's overview? This is the org: https://www.kaggle.com/organizations/lichess
The overview currently " has no content yet"
Alright never mind, it just needed time ๐ (ca 10 mins)
yes it's a caching thing, sorry about that!
hello,
I am working on a loan approval prediction project and am considering modifying features by removing less important ones or engineering new ones based on their importance. If I apply these feature modifications consistently to the training and test datasets, will my notebook be eligible for acceptance within the competition?
Additionally, as I am new to Kaggle competitions, I would appreciate an overview of the typical roles and responsibilities that participants might take on.
this channel is for asking Kaggle developers questions about the site. you'll have a better chance of getting a response by asking in a more appropriate channel like #โโask-a-question where the community can answer.
it might be cool to add options not related to time to the update frequency drop-down. We have a dataset that's updated occasionally, i.e. whenever someone sends a PR (new data, typos, new concept)
nice. we've talked on and off about how to improve this. i'll file something to our backlog.
cool! I'd vote for "it will happen when it happens" 
e.g., another obv. way to improve it is based on datasets that are maintained using notebook pipelines. you can schedule a notebook to run > generate an output that's used as a dataset version. so each time the notebook runs on a schedule, it triggers a dataset update.
"vibes"
haha
"solemnly swear to update when the vibes tell me to"
There's a 24 hour community competition hosted by ODSC and Nvidia and lots of participants are reporting website crashes and can't get into the competition, we're looking for quick technical help. Who from Kaggle team could help me with some information?
How can we use our fine-tuned model on Kaggle if it's saved locally?
@hardy quail
I don't work at Kaggle, but I assume you can upload it here: https://www.kaggle.com/models (+ New Model)
you can also use kagglehub client library to push your model weights to Kaggle Models: https://github.com/Kaggle/kagglehub
so sorry i missed this message (but fortunately they got in touch with us via some other channels and we got things resolved i believe)
anyone know how to troubleshoot a notebook saying "failed to save draft"? when I try to quick-save the same notebook, it looks like it attempts to run the whole thing, but the "execution" throws a KeyError: 'state' in some nbconvert code and ultimately the notebook shows up as a script (JSON dump) instead of a notebook - though my own draft view renders fine
I'm guessing that these notebooks are imported from Colab? There is more information in this issue, but TL;DR is that you'd need to manually clear some metadata from the ipynb.
That fixed it ๐ I had been meaning to fix up the formatting too, so this forced my hand. It was nice and easy to download and re-upload back into the same notebook too, after running nbfmt, pyink, etc.
nice
Hi, I can not verify my Kaggle account through Phone. Even on my first attempt it gave me a message of "Too many request" I am from Pakistan
Same on my side too
same here
Contact Kaggle support. I did and just received email from them that they have verified my account manually.
What are the performance implications and optimization constraints of running PySpark on Kaggle's infrastructure, particularly regarding local vs. distributed execution modes?
Would I dare dreaming, featured lazy evaluation, task scheduling, Driver to cpu with optimized GPU offloading (with CUDA, my beloved Tensorflow, XLA, cuDNN, Cupti)...
I have invites on other gmail account but I use another gmail account for Kaggle and github. what to do link both gmail accounts or use other account
Hi! I have a few questions.
It seems that you can't change usernames on Kaggle. I recently created an account using a Gmail login. If I delete the account, will the username be available for reuse and will I be able to use the same Gmail account for creating a new account with a different username?
Is this possible? Can this be done immediately? Or are there restrictions in place for email and/or username reuse? Is there a waiting period for reusing email accounts and usernames?
if i wanted to make something to visualize ML models that people can use to learn about them, would streamlit be a good idea? if not streamlit, the only other option i could think of would be manually making everything with javascript
yeah streamlit is good for that!
Gradio is another popular option which needs very little time to set up, but I'd also go with Streamlit.
hello every one
i am vaibhav my skillsets - MERN stack & Flutter
now i want to move towards the GenAI
@here can you guys give me tips or from where should i start learning ?
@main zinc @finite crest spammer
cc @pine narwhal they also posted it in multiple other channels
thank you - i reported to our mod team
please read the channel descriptions before you post and don't try to use "at.here". you will have more luck posting in a channel like #๐โgetting-started
sure
Is there a way to know if a notebook is being run in "submit mode" for code competitions?
E.g. in a code competition, you usually run your notebook once to submit it, with the real test data unavailable. Then, you can submit it with the actual data, but the results of the run invisible to you.
I would like to do cross-validation in the case that the notebook is not being submitted, so is there a flag (like an ENV_VARIABLE) that tells you that a notebook is being run as a submission?
Hello everyone
yes! take a look here: https://www.kaggle.com/discussions/product-feedback/315792
[Product Update] Notebooks environment variable for Competition Reruns.
there is environment variable KAGGLE_IS_COMPETITION_RERUN for reruns
UI question: what's the shortest way to get from my kaggle profile to the profile of an org i manage?
I can currently hover on notifications and click on the "your org has been approved" one to go to the profile, or I can go to my datasets and choose an org dataset, then click on the org name from that page.
Is there a shorter way I'm missing?
i think we need to make a shorter path. probably we should add a list of your orgs to your profile. we also don't have a kaggle.com/organizations global list of all accessible orgs...
would adding orgs you're a member of to your profile fulfill what you're looking for?
absolutely! Thank you ๐
I think it might also make sense to add it to the collapsible profile panel.
The thing I can access from any page when clicking on my profile photo. Not sure what the correct term is ๐
mine has not
@pine narwhal please help
Support is the best way to get help. I'm not on our moderation team and don't have the ability to assist with this.
all of the resources are gathered here in a self-paced learn guide: https://www.kaggle.com/learn-guide/5-day-genai#GenAI
i did but they never replied
like i worte 3 weeks ago and i am still waiting
i worte one today and i am waiting
Can anyone help me with submitting my prediction? my notebook is linked but submit button seems to be disabled for some reason, even added a description
you need to share more details, ideally a notebook example, so people can try to reproduce. recommend asking in #โโask-a-question because probably the community can help you with this.
figured it out, thanks though
I'm building my first language. It's an Universal Scripting Language, that uses .usl files to let you use 466+ coding languages in one place. I've been asked to make a transpiler for everything so I'm working on that but there is a github to try out a test script in .usl through python3. If any of you have any advice I'm open to the help. I'm a beta dev for OpenAI (I program the reasoning systems each night) so I built it into all of the levels I had access to. Oh the langauge was given its MIT license already. It also grows more advanced as time goes on. But while I learn to build this transpiler, what else would you be neededing? I have syntax and commands on the github.com/jordan-townsend/usl and the book is 486 pages cover ready, but I need to get the code fully running as intended before go through all those script and cookbook examples. I'm also porting it onto my watch since I'm upgrading the systems throughout that build. Trying to figure out which one to record first/who to talk to about such things.
I'm a hobbyist diving into a data analysis/ science project close to my heart. I have little knowledge in coding (used php many years ago as a hobbyist and have used python, pycharm and jupyter Notebooks for a year or so.), but my topic revolves around standardising CGM (constant glucose monitor) data from various devices, and cleaning, aligning and quality scoring the user's records and providing a list of possible analyses depending on the level of information kept by the user. I am a type 1 diabetic, wearer of a cgm and have access to two years of my own data and lots of technical knowledge in this field. I'm interested in anyone's input on exactly how to structure a project like this that starts with a simple parser for one device / api, with the idea of adding modular functions in various stages after defining a standardised data format. I have began the logic for my use case, Xdrip+ sqlite backup, but I'm looking for advice on building my structure for such a project. Here's my work so far: https://github.com/Warren8824/cgm-data-processor
And a notebook using the source code: https://github.com/Warren8824/cgm-data-processor/blob/main/gastro_eda_book%2Fcontent%2Fx_gastro_eda.md
The dataset was very messy which I know is what 'real' data tends to look like. I've done a lot of work trying to get the data in a suitable format for further analyses but with this being my first solo project I'm concerned about moving forward with any analysis if my data is 'bad'.
If you wouldn't mind having a quick look and pointing out any rookie errors or overlooked aspects of my process.
Thanks in advance.
GitHub
Python package for processing CGM, insulin, and meal data from diabetes management systems. Features robust gap detection, data alignment, and quality assessment. Built by a T1D developer. - Warren...
This is my multi gpu fine tuning notebook.
It gives of memory error after 5 steps of training , I've tried many things still can figure out.
Pls help
Link :- https://www.kaggle.com/code/shaswatsingh69420/ddp-sft-trainer
I tried to verify my phone number only to get "too maby request" error on first click and then it refuses to send me any code. It's been days and I need to verify to be able to use internet connection in notebook.
Hey guys
I am looking for a metric which check similarity between sentences with semantic understanding.
ground_truth = [
ย ย "There is no pleural effusion or pneumothorax.",
ย ย "The heart size is within normal limits.",
ย ย "Mild interstitial prominence is noted.",
ย ย "No acute abnormality detected."
]
ย
generated1 = [
ย ย "No Pleural effusion or pneumothorax is observed.",
ย ย "Heart size appears normal.",
ย ย "There is mild interstitial prominence.",
ย ย "No significant acute abnormality seen."
]
ย
generated2 = [
ย ย "Pleural effusion or pneumothorax is observed.",
ย ย "Heart size appears abnormal.",
ย ย "There is massive interstitial prominence.",
ย ย "Very significant acute abnormality seen."
]
Here generated1 is similar to ground truth and generated2 contradicts ground truth.
So
Basically I need a metric which give good score for ground truth and generated1, low score for ground truth and generated2
can anyone help me with this issue?๐ฅน
https://www.kaggle.com/code/sidharthad/custom-training
hey guys can you help me build a model for my eeg analysis you can find the notebook here - https://www.kaggle.com/code/pramitroy/data-processing i have an interview on sunday would love your support
Hi, I am new to computer vision and currently I am trying a multiclass image segmentation using PyTorch, from scratch. From past couple of days, I am stuck with it. Anyone could help me, please? Here is the link to the notebook https://www.kaggle.com/code/amitsur20012009/aerial-image-segmentation-using-pytorch
if im making a movie recommendation system, how do i make sure it's fast? i'd want it to look at a users 100 movies and ratings, and be able to spit out recommendations in under 2-3 seconds
but i just don't know how it would be fast. if i did something like embeddings, i'm assuming that putting in 100 new movies for a user to get results back would take a minute to do
To make your movie recommendation system fast avoid computing everything from scratch. Instead, precompute movie embeddings and store them in a FAISS or Annoy index for quick similarity search.
Hello guys , Can anyone tell me how I can import data using Spotify API ? I am trying to import data using Clientcredentials call but it's telling me that there's an error in my request.
so do you mean like basically pre-compute every movies similarity to each other, and then if someone says they like a movie it would just look up it's similarities?
@hardy quail it was awesome to see Lichess sharing the datasets i believe you helped to upload on Kaggle! ๐ out of curiosity, did you work directly with Lichess? would love to ask some more follow-up Qs, too, about dataset publishing if you are open to emailing me: meg@kaggle.com https://x.com/lichess/status/1886022562959839271
Lichess is now on @kaggle!
Use our puzzles, openings, and engine evaluation datasets directly in your kaggle notebooks: https://t.co/2Q8qrrprcb โ๏ธ
Hi Meg! I'm glad you liked the release ๐
Yes, I joined Lichess at the end of last year to help out with data management.
Happy to talk more and answer questions! I'll email you in the morning ๐
(EU morning)
Hello everyone, Iโm currently interning and exploring how AI and ML can be applied to Neuro-Linguistic Programming (NLP). Iโm still figuring out what innovative ideas could come from this combination. Iโd really appreciate your insights on potential applications or directions to explore. Looking forward to your thoughts!
He meant to say that store those 100 movies embeddings in the FAISS or ANNOY index like a VectorDB and when the user Asks for the query just applied embedding on the query and then apply the similarity search of the query embedding with those stored 100 movies embeddings.
okay thank you!!! i'll try that outp
I really love the Kaggle identicons. By far the prettiest I've seen. Any chance the code (or a description of the underlying code) is available somewhere? ๐
no pressure whatsoever, but I just wanted to let you know that I sent out the email ๐ค
I made these myself back in 2017, they were made by hand in photoshop, no code I'm afraid haha
hahaha sorry I assumed they were machine generated. They're so pretty! Thank you for replying ๐
Man
Just change ur persona verification process, literal shit
Always fails, even after removing glasses no glare
Please just bring another verification
Same +1
Fr been trying for 2 days now
They have fixed it now
Hi @hardy quail Thank you for letting us know about the error. It has been fixed. Can you please refresh the link and try again. ๐
works great now, thank you!
guys i need help setting up the kaggle gpu, for some reason it's not working even when it's set on the session
if i believe that someone from this group is trying to scam me what should i do?
i can see theres some scam people comming to talk in private, i would like to know how to report it for some administrator
Feel free to dm me names of spam accounts and I can ban them. Sadly Discord doesn't give us many tools to stop this kind of spam/scam.
ok i will do it next time, for now brenda flynn helped me, thnx
hi ppl here's my linkedIn post for my project BHP | ML
checkout and share ur thoughts
https://www.linkedin.com/posts/huzaifawatto_datascience-machinelearning-python-activity-7303725708785205248-B_zx?utm_source=share&utm_medium=member_desktop&rcm=ACoAADpAOFcBfIUcnVcB_B3BGegaJiHW1oulA34
I think #๐โsharing-projects would be a better fit for this. ๐ This channel is for questions to the devs
hello 
I was wondering if there was a sharding recommendation for larger datasets. Is it better to have one big parquet file or smaller shards? How big should the shards be?
yeah i didn't know that before now i do
I mean to start with, a simple angle between the vectors should be good right? If you want you can have an SLM give the score too.
HIRE ME | Full Stack Developer
I am currently exploring new opportunities where I can leverage my expertise in full-stack development to drive innovation and create impactful solutions.
With over 7 years of experience, I have developed scalable, high-performance applications with a strong focus on security, efficiency, and user experience.
What I Bring to the Table:
-Fullstack Expertise: JavaScript, TypeScript, PHP, Python, Node.js
-Frontend Development: React, Vue, Angular, Next.js, Tailwind CSS
-Backend & APIs: Express.js, Laravel, Django, FastAPI, RESTful APIs
-Blockchain & Web3: Ethereum, Solidity, Web3.js, Ethers.js, Smart Contracts, DeFi, DEX, NFTs
-AI & Agent Development: LangChain, LLMs, OpenAI APIs, Rasa, AutoGPT, AI Workflow Automation
-Data Engineering: Data Pipelines, Apache Kafka, Apache Spark, Airflow, ETL, SQL, BigQuery
-Cloud & DevOps: AWS, Google Cloud, Docker, Kubernetes, Terraform, CI/CD
-Database Technologies: MySQL, PostgreSQL, MongoDB
I am passionate about scalable web development, DeFi, blockchain innovations, AI-powered agent systems, and data-driven solutions.
I am seeking opportunities to contribute to cutting-edge projects in Web3, AI automation, and immersive technologies while collaborating with forward-thinking teams.
Let me know if youโd like to refine anything further!
AYOOO you made the โkโ red
Took long but was happy to see
Thanks
@sweet grove ๐๐ธ
Still in the bug backlog, sadly it hasn't bubbled to the top
Im not sure what you mean.. is it not working with you?
Oh lol, nevermind then, I guess it was introduced afterall haha
-> we need a red version of this now ๐
I want everyone's opinions about this topic in the linkedin comments.
๐ ๐๐ ๐ถ๐ ๐๐ ๐ฝ๐น๐ผ๐ฑ๐ถ๐ป๐ดโฆ ๐๐๐ ๐๐ฟ๐ฒ ๐ช๐ฒ ๐๐ผ๐ฟ๐ด๐ฒ๐๐๐ถ๐ป๐ด ๐๐ต๐ฒ ๐ฅ๐ฒ๐ฎ๐น ๐ง๐ต๐ฟ๐ฒ๐ฎ๐?
As an ๐๐ ๐๐ป๐๐ต๐๐๐ถ๐ฎ๐๐, ๐ฃ๐ฟ๐ผ๐ฑ๐๐ฐ๐ ๐๐ป๐ด๐ถ๐ป๐ฒ๐ฒ๐ฟ, ๐ฎ๐ป๐ฑ ๐๐ ๐ฅ&๐...
can anyone tell me where the #5dgai-announcements channel is? For GEN-AI announcements?
How do I calculate the MSE?
The link you have above worked for me - are you having trouble accessing it?
check ur DM
Hi,
I used things like device map/max-memory, auto, balanced . But still I could see 2 of the 4 GPUS (L4*4) underutilized through session metrics. I can see all 4 GPUS utlized if I manually move the layers but I don't want to do that with QWEn 32b model
Can someone please help. I am really struggling
Can someone please please help
How to install java development kit on kaggle TPU...
i am running h2o python module which needs java
with no accelerator i did nto run into any issue however when i switch to TPU i am unable to install jdk
!sudo apt-get update
op:
/usr/bin/sh: 1: sudo: not found```
Keep on trying ...
Thank you!
Do I need to upload my dataset on Kaggle if I want to train my model using the gpus provided by Kaggle
it doesn't have to be hosted as a public dataset on Kaggle, but it will at some point have to be uploaded somehow so the notebook can access it
Is there any compulsion to create project in .ipynb format file
thanks
I believe yes
From the Gen AI capabilities that we have discussed, which capability would be best to use for Satellite Images?
Wildfire prediction via satellite images
Genai could be used in numerous ways
GenAI has numerous applications in wildfire prediction via satellite images, including ยน ยฒ ยณ:
- *Early Detection and Monitoring*: GenAI can analyze satellite imagery to detect thermal anomalies, identifying new fires and monitoring existing ones in real-time.
- *Fire Spread Prediction*: By leveraging machine learning algorithms and satellite data, GenAI can predict the spread of wildfires, enabling emergency response teams to plan and allocate resources effectively.
- *Fire Susceptibility Assessment*: GenAI can assess areas at risk of wildfires using high-dimensional weather and terrain data, capturing fire dynamics to deliver pixel-based predictions of burned areas.
- *Wildfire Forecasting*: GenAI models, such as FireCNN, can be trained on satellite imagery and weather data to identify areas of high fire risk, potentially preventing up to 76% of wildfires.
Some notable GenAI models and techniques used in wildfire prediction include ยน ยฒ โด:
- *Deep Learning Architectures*: CNN-LSTM and U-Net models are used for spatial and temporal prediction of wildfires.
- *Random Forest Surveys*: These surveys estimate tree mortality by analyzing satellite data, such as surface texture and plant health.
- *Hybrid Wildfire Forecast Systems*: Combining deep learning algorithms with existing weather forecast models can improve wildfire predictions by up to 7 days.
The benefits of using GenAI in wildfire prediction include ยฒ:
- *Improved Fire Prevention and Resource Allocation*: GenAI can help identify areas at risk of wildfires, enabling proactive measures to prevent fires and allocate resources effectively.
- *Computational Efficiency*: GenAI models can process large datasets quickly, enabling rapid predictions and decision-making.
- *Enhanced Predictive Accuracy*: GenAI can improve the accuracy of wildfire predictions, reducing the risk of false positives and false negatives.
@hollow lynx sorry for the ping, but the scam above was posted in multiple channels
cc @sweet grove as well ๐
Thanks for the report, I've deleted them now
Hi, I'm new to Kaggle and I'm unable to click on the 'GPU (100 )' option. Can someone please help me with this?
Have u verified ur account with phone number ?
if no ; once u verifiy it and close it then reopen it, the issue will be resolved .
Yea that's the issue , i get it
I havenโt got separate email with capstone project. How should I proceed?
I seem to be using a considerable large resource: data from soccernet. I am at the point of unzipping files but I got a run cancel. What does this mean and how can I resolve the error, please?
@sweet grove same scam again ^ ๐ซฉ
Thanks
can I run cuda on my gtx 1650 laptop
why it is so hard to upload csv file in mysql ?
i am tired
Is there a way to automate the update of dataset descriptions and metadata? I tried to look in the kagglehub repo but couldn't find a way. We have datasets that update monthly and i was wondering if i could somehow update the "about" section for example when pushing new data.
Hello,I was working on a trend analysis, but the dataset I have presents data with a singular temporal pattern. Given that, during the time period covered by my dataset, specific situations occurred that modified the data, making it slightly different from reality. Does anyone know how I can handle this type of data to perform a realistic trend analysis?
anyone recently worked with openai's api key?
before we used to get "sk-...." type keys, now we getting "sk-proj...." type keys. idk why are they not working or getting authenticated in langchain.
is this problem just with me or anyone else? any solution for this?
feels like openai changed their api key structure and langchain havent made any updates according to that new openai api key structure
Hi I'm trying to use Qwen3 with vLLM, can anyone share a starter notebook for this?
How is the submission time for hackathons determined?
I was trying to make a last minute update to my submission, but now it is stuck as a draft and I can no longer submit it.
It seems the submissions close at 23:58 instead of 23:59 UTC, as I (and many others) understood.
Quite a few people in the OpenAI to Z hackathon fell afoul of this subtlety.
See screenshot (CET time zone) showing that I could no longer submit.
Could the last draft just be auto-submitted? I had submitted a version earlier.
are you getting 429 error, saying limit exhausted or something similar to it??
Hi, I've opened a pull request to add a new package to the base image a week ago. So far there has been no reaction, would you mind having a look? https://github.com/Kaggle/docker-python/pull/1487
GitHub
This adds anywidget, which is basically a custom ipywidget that allows the creation of other custom widgets without the need to install additional Jupyter extensions. According to the website, it&a...
Hello, I'm a very new Kaggle user, but my "All of Your Work" page shows 30,175 items. This number seems impossible to me.
The vast majority are closed competitions I've never seen or participated in.
I've recently used a CLI tool (made with Claude Code) that interacts with APIs. Could I have accidentally triggered API calls that caused this, or is this a normal display issue related to filters for new users?
Any insights are much appreciated!
P.S. I'm also concerned I have made unintentionally excessive API calls...
hi ppl hope ya'll good
quick question
Why is trainer.train() taking so long to start or show progress, even though I'm using a T4 GPU on Kaggle? My dataset has only ~2.3k TurkishโUrdu sentence pairs. Is there a way to speed up the initialization or view training progress earlier?
its been 2 hours and am waiting for result!
Refresh the page and use the default engine
bro its been 6 hours now r u sure i should refresh it
also am yesterday i tried it without any accelerator
and after 2 hours i accidently refreshed the page nd it was all gone no progress i didn't get my model
Whenever you are doing coding, always have a downloaded copy saved somewhere accessible. Also, start doing it habitually. It will help you when you land your job. Always keep backups of your code.
where did this come from bro
am not asking for job lol๐
am stuck in training my model i need suggestions for that
I was just saying in a general way. Due apologies. Coming back to your code structure. Here's what i think (i could be wrong.. but i'll try, since i'm learning stage as well)
am in learning stage as well so try no need to worry
-
trainer = Seq2SeqTrainer(
tokenizer = tokenizer, #needs to be replaced by code below
instead use --> processing_class = tokenizer -
training_args = Seq2SeqTrainingArguments(
output_dir="./model_tr2ur",
run_name="my_first_seq2seq_run_20250702"
right now this is how far i could stretch my brain.. 
so i should refresh and try this
save your current code first in a notepad or download the notebook
edit the code... if it's possible.
It could depend on many factors like the number of tokens per training example, the model size (number of parameters) and how you have configured training and logging
thanks for reply but i have solved it
subscribe this channel and see the latest update and learning video from here :https://www.youtube.com/@AuraaiX/videos
YouTube
A U R AโโขโA I
Welcome to Aura AI, where we explore the subtle yet powerful presence of Artificial Intelligence shaping our world. We go beyond the headlines to uncover the essence of AI, demystify complex concepts, and illuminate its future impact. Join us to understand the technology that's defining tomorrow. Subscribe for deep di...
Hi, I have recently made a notebook with plotly graphs, it shows perfectly in Editor mode but it is not showing on Kaggle's notebook viewer, even when I used .show(renderer='iframe').
Notebook: https://www.kaggle.com/code/stevensio/kaggle-journeys-cohorts-and-competition-shifts
I tried to look at it on my girlfriend's macbook and it shows, but somehow, the graphs don't show when viewed on my windows laptop. Could this be an OS issue?
Would really appreciate if anyone could find the issue in my code
is kaggle still working on converting the forks/copies of notebooks to upvotes? curious because after the new update i seem to have 11 pvt copies for a notebook of mine but the upvote count is still 3
Unless i got something wrong in the progressions guide, would love to be corrected: https://www.kaggle.com/progression/code
not sure where to report this, but i'm seeing ElasticSearch errors when trying to access user profile tabs
specifically the following when accessing the discussions tab:
Error: ElasticSearch query returned an error: Connection refused (10.42.0.56:9200)
at h (https://www.kaggle.com/static/assets/app.js?v=7c912cf30766839ec9e0:2:658661)
at https://www.kaggle.com/static/assets/app.js?v=7c912cf30766839ec9e0:2:457285
at https://www.kaggle.com/static/assets/app.js?v=7c912cf30766839ec9e0:2:453658
at Object.next (https://www.kaggle.com/static/assets/app.js?v=7c912cf30766839ec9e0:2:453763)
at j (https://www.kaggle.com/static/assets/app.js?v=7c912cf30766839ec9e0:2:452204)
at a (https://www.kaggle.com/static/assets/app.js?v=7c912cf30766839ec9e0:2:452407)
Can't attach photos
sup devs! ๐
real talk - what made you quit your last online course?
we're collecting failure stories to build something that doesn't suck
๐ spill your learning trauma here: https://tally.so/r/w2GzZe (2 mins)
please share with your friends too - we need your support! ๐
appreciate y'all! โจ
Are trying to build a LLM? That learning rate tells the fact. I have never developed one.
I have one interesting question about AI engineering.
This may be the wrong Discord for this question. Please advise me on the right one.
I have a collection of order documents, and I need to be able to answer questions about the data.
This looks like a perfect solution for rag, but no. Because the questions could be like this:
"What did we purchase from the electronics department last month, and for what amount?", "Which seller sold most garden equipment?"...
This requires some data aggregation to categorise and perform operations of sum.
Please advise on what solution would be good for such a task.
I have a suspicion that a good solution would be to digitise all the documents to a SQL and to have some agentic framework with an MCP tool that gets resources from SQL dynamically generating SQL queries.
Please advise Better solutions or resources for such a task.
Hey there
Hope you are doing well
I am a senior full stack developer who has full experience in web and AI development
So if you have some projects, please let me know
thank you
https://github.com/typhon0130
https://figma.com/@typhon0130
i was trying to build a language dubber model like english to turkish u give input as video in english it will dubbed the auido of video in turkish
Check out my approach to the NeurIPS - Ariel Data Challenge. I focused on engineering features from the raw FGS1 and AIRS time-series data to predict transit depth. The results from the Ridge model are quite promising (Rยฒ > 0.92)!
An upvote would mean a lot. Please let me know your thoughts!
Link: https://www.kaggle.com/code/mayuringle8890/neurips-ariel-data-challenge
Who here understand docker (beginner or expert).
Need help with dockerizing an ai project
reach out to me , can help u
Job Title: Part-Time Senior AI/ML Engineer (Remote)
We are seeking a skilled and experienced Senior AI/ML Engineer to join our remote team on a part-time basis. The ideal candidate will have a strong technical background, excellent communication skills, and the ability to work independently in a fast-paced environment.
Requirements:
-Minimum of 7โ10 years of professional software development experience
-Proven experience working effectively in a remote environment
-Advanced English proficiency (C1 or higher); an American accent is preferred
-Availability to work 10โ15 hours per week during EST or CST business hours
If you're a highly motivated engineer with a passion for building high-quality software and can commit to a flexible part-time schedule, weโd love to hear from you.
You can connect with me on WhatsApp: +1 (567) 469-5384
Extract data from order documets and store in relational DB and then use text-to-Sql LLM Agent that translates Human query to sql query
Use report_to="none" in training_args. In the Vs code or Cursor , it doesnโt seem to work, but it asks for WandB API keys to monitor the modelโs metrics on Wandb.ai. Without it, training waits for input and lasts until the session ends.
thanks for reply man appriciate it
these are the times when cross enterprise workflows look so handy but never a possibility. One organization has data neatly in a database and fed to ERP. It even has order management system from which a B2B or B2C customer purchases items. Now the customer has invoices in PDF or in printed form. The customer then reaches out to RPA, Image Recognition, uses libraries to OCR and reconvert that order back to a table. The customer meanwhile also uses confidence score to check if the data fetched is correct. If it is RPA, and OCR, then the format of the script changes with the location of the document that has name, order number etc etc.
In a parallel universe, the two systems talk through an API call, or XML, or worst case EDI (Electronic Data Interchange). That's all that was needed.
Hey I am new to data science! I am eager to learn more. Can someone put me in the right direction
[URGENT] System bug: Writeup submission cancelled after deadline due to edit button issueHello Kaggle Dev Team,
I'm reporting a critical system bug that caused my writeup submission to be incorrectly cancelled after the deadline.
Bug Report:
- Account: giryun288@gmail.com
- Competition: BigQuery AI - Building the Future of Data Hackathon
- Project: "BigQuery AI Hackathon: Multimodal Health Analysis"
Technical Issue Details:
- I successfully submitted my writeup at 8:20 AM (40 minutes before the 9:00 AM deadline)
- After the deadline (at 9:00:02 AM), I accidentally clicked the "Edit" button to make minor text corrections
- The system incorrectly interpreted this as a submission cancellation, changing my status to "Deadline Missed"
Evidence of Pre-Deadline Submission:
- Complete writeup was submitted at 8:20 AM (before deadline)
- All project files were completed and uploaded before deadline
- GitHub repository: https://github.com/mkmlab-hq/bigquery-ai-hackathon-submission
- Browser history shows successful submission at 8:20 AM
Request:
- Please restore my original submission to its pre-deadline state
- This appears to be a UI/UX bug where the edit button should be disabled after deadline
- The submission was completed before the deadline and should not be marked as "Deadline Missed"
Project Status:
- Complete multimodal health analysis system
- All technical requirements fulfilled
- Production-ready code with comprehensive documentation
- Full GitHub repository with all source code
Technical Details:
- The edit button functionality after deadline appears to be causing submission cancellations
- This could affect other participants who accidentally click edit after deadline
- The system should either disable the edit button after deadline or not cancel the submission
Thank you for your assistance in resolving this technical issue.
Best regards,
์ค์๋ฏผ (familyunion)
Kaggle ID: giryun288@gmail.com
can anybody tell me about standard and minmax scaling i have a problem with standard deviation
Hello, I'm wondering if there's any way to get the current Kaggle version name from inside the notebook - this would help a lot with integrating with external monitoring tools
Try this:
!pip show kaggle
It gives output like
Name: kaggle
Version: 1.6.14
Summary: Kaggle API
cc @sweet grove
I am working on a cGAN project for skin disease classification using the HAM10000 dataset. I am facing a significant problem: overfitting occurs during GAN training and the FID (Frรฉchet Inception Distance) score never drops below 100. Please advise on the best approach I should take to overcome overfitting and lower the FID score.
https://www.kaggle.com/code/akbariffianto/val-cgan-ham10000-6
Can something be done to improve the experience when having the same notebook open in multiple tabs? Right now it's like the way it works is the opposite of realtime collaboration in every way
I edit the notebook, click "Save Version" and come back 3 hours later to find that because I had the notebook open somewhere else, it actually saved the same version as I last ran (??!!!)
Hi, I have a problem with Azure Machine Learning Deployment
I'm trying to deploy a model to Azure Machine Learning for the first time. It's a language model. My problem is that when I invoke the endpoint, I get an error related to dependencies in the image creation job.
It's an MLflow model. I've already tried manually modifying the conda.yaml and requirements.txt files, adding NumPy version 1.26.4, and using the same environment I use with Jupyther Notebook in the deployment creation, but it still doesn't work.
ERROR: Ignored the following versions that require a different python version: 1.21.2 Requires-Python >=3.7,<3.11; 1.21.3 Requires-Python >=3.7,<3.11; 1.21.4 Requires-Python >=3.7,<3.11; 1.21.5
Requires-Python >=3.7,<3.11; 1.21.6 Requires-Python >=3.7,<3.11
ERROR: Could not find a version that satisfies the requirement numpy==1.21.3 (from versions: 1.3.0, 1.4.1, 1.5.0, 1.5.1, 1.6.0, 1.6.1, 1.6.2, 1.7.0, 1.7.1, 1.7.2, 1.80...
ERROR: No matching distribution found for numpy==1.21.3
ERROR: failed to solve: process "/bin/sh -c ldconfig /usr/local/cuda/lib64/stubs && conda env create -p /azureml-envs/azureml_759b376b50be14779b3978d6ae3ce445 -f azureml-environment-setup/mutated_conda_dependencies.yml && rm -rf "$HOME/.cache/pip" && conda clean -aqy && CONDA_ROOT_DIR=$(conda info --root) && rm -rf "$CONDA_ROOT_DIR/pkgs" && find "$CONDA_ROOT_DIR" -type d -name pycache -exec rm -rf {} + && ldconfig" did not complete successfully: exit code: 1
Hi can I get rewards
i linked my kaggle account with discord but still cannot msg
check a mail
How to creat api using Ai studio
- Go to the website:
๐ https://aistudio.google.com/ - Sign in with your Google account (your Gmail).
- Click your profile picture in the top-right corner.
- Select โGet API keyโ (or โAPI Keysโ option).
- Click โCreate API key.โ
- Copy the key that appears โ it looks something like this:
AIzSyA7g************abc123 - Keep it safe and private.
Donโt share it publicly or post it online. - You can now use this API key in your code to connect with Googleโs Gemini AI models.
Thatโs it! ๐
Is there a quick YouTube tutorial for using API within kaggle for different platforms and projects?
Adding API key to your kaggle notebook + creating your first kaggle notebook : https://youtu.be/4q816Qgkl0A
What is api
Iโm an EEE student and donโt have much coding knowledge yet, though Iโm really interested in it. Do you think this course would be beneficial for me? Is it suitable for beginners, or mainly for advanced learners?
yes
Hi everyone
Did you know about api.key and how to use that' anybody knows that' pls.explain about it ๐
Hii
Yes I know
- Go to the website:
๐ https://aistudio.google.com/ - Sign in with your Google account (your Gmail).
- Click your profile picture in the top-right corner.
- Select โGet API keyโ (or โAPI Keysโ option).
- Click โCreate API key.โ
- Copy the key that appears โ it looks something like this:
AIzSyA7g************abc123 - Keep it safe and private.
Donโt share it publicly or post it online. - You can now use this API key in your code to connect with Googleโs Gemini AI models.
Thatโs it! ๐
@arctic crow thank u ๐๐
can anyone help me ?, in the initial conditions it was mentioned that i need to be able to create a api key via google's ai studio but i am blocked from doing so as i am still 17.
Halo เดฎเดฒเดฏเดพเดณเตเดธเต
Yes bro
How can we leverage the concepts from this 5-Day AI Agents Intensive to design autonomous systems that can adapt and improve in real-world, dynamic environments beyond the lab setting?
How can i ask quetion for Tommorom 5 days agent course
hi
Hello bro can u please answer my question like how do I generate the api key if I am under 18
I'm finding a US developer for the collaboration. If anybody interested, please dm me.
Hello I am trying to verify my kaggle account but unfortunately my persona link has expired, does anyone know how can I get a new link?
Yes I need also to find this same problem plz anyone?
@honest aspen @smoky bone
You guys should check your account settings page first:
https://kaggle.com/settings
Look for the "Identity verification" section. You should see an option there to restart the process or get a new link.
If that option is disabled, you need to contact Kaggle Support directly. They can manually reset your verification status so you can try again.
Support Link:
https://www.kaggle.com/contact#/account/verify/rejected
(Fill out this form)
Tip when you re-verify:
This is a common issue and their system often fails because of glare. Take off your glasses and make sure you're in a well-lit room but with no strong light or glare on your face or in the background.
๐ Hello Kaggle Experts! ๐
Iโve poured a lot of effort into my 3 notebooks. One of them even received 141 votes within just 3 days of release! The other two have over 100 votes as well, yet I havenโt earned a medal.
Iโd really appreciate your expert insights on how to improve them:
https://www.kaggle.com/code/norikokono/spicy-ai-debates-updated-analysis
https://www.kaggle.com/code/norikokono/spicy-ai-debates
https://www.kaggle.com/code/norikokono/sentiment-analysis-on-pizza-hut-reviews
Any tips, suggestions, or critiques would be super helpful! ๐๐ก
Thanks so much for taking a look! ๐
Hi guys, i'm a day behind and i got no clue on what to do! can anyone help me figure it out pls?!!
Hey Keerthana,
You can catch up.
Here's the playlist for all the course livestreams. The first video is the Day 1 kick-off, which explains the basics:
https://youtube.com/playlist?list=PLqFaTIg4myu9r7uRoNfbJhHUbLp-1t1YE
Since you're a day behind, you should focus on the Day 1 stuff first:
- Watch the Day 1 Livestream.
- Read the Day 1 Whitepaper ("Introduction to Agents").
- Complete the two Day 1 Codelabs (โbuild your first agentโ & โbuild your first multi-agent systemโ).
Then, to catch up, you can start the Day 2 material. Today's topic is "Agent Tools & MCP", so definitely check out the Day 2 Whitepaper when you're ready.
You can find all the links on the main Kaggle course page. Here:
https://kaggle.com/learn-guide/5-day-agents
How I can complete my day 1
Hi, when adding API key as a secret when opening ADK web page and start sending a message an error a raised because of invalid API key , please is the following command synatx correct:!adk create sample-agent3 --model gemini-2.5-flash-lite --api_key GOOGLE_API_KEY
Hello
โHey everyone, Iโm 16 and I donโt have a credit card. Is there a free access link or student setup for the AI Agent Challenge?โ
Hi, You can watch the live sessions (and recorded sessions) on the YouTube playlist:
https://youtube.com/playlist?list=PLqFaTIg4myu9r7uRoNfbJhHUbLp-1t1YE
And also check out the files and codelab notebooks here:
https://kaggle.com/learn-guide/5-day-agents
Hi
Thank you, solution provided in Day2 trouble shooting colab, I will try it
When to use MCP and when not to use it in terms of Agents ?
If I am creating tools that my agent will be interacting, these tools are only specific to my project so should I use MCP protocol there for communication with agent ? My thought is to use MCP protocol when for MCP servers we are connecting with our agents like slack mcp server, github mcp server etc. Correct me if I am wrong please.
Did anyone finish exercise of day 2_b? The tiny image generator can only retrieve 1 image, so trying to ask more is a problem
is there any way to get yourself unblocked in this server. One of my friend got blocked for no reason. He joined only to see updates for the Agent AI Course but server blocked him. Same for me I got my IP blocked for no absolute reason. We didn't spam, or abuse anyone nothing at all.
Hi
Try quick save without running
can anyone suggest how to delete orevious agents and multiagent and start from scratch in code lab?can anyone help
hi everyone,im new to AI/ML . I have been give a task of fine tuning a SLM of size 1B on a domain specific dataset.It should be conversational.
what should the dataset loot like for conversation flow in small models 1 B size,
should it be like multiturn conversation on human language+ domain specific data
or only multi turn domain specific data
only suggest a good model for this.
Will participants in the course receive two separate certificates?
One for attending and completing the course itself, and another specifically for the final Capstone Project on Kaggle?
Hi, this is Shawn
We have just launched the new management interface for our Seedream and Seedance APIs: https://www.siray.ai/
You could sign up to use it completely for free!
In return, we are actively seeking your valuable feedback. Your real-world experience is crucial for our next major update, covering everything from product features and API documentation to the overall UI/UX design.
All input is welcome!
We look forward to your tests and suggestions!
I have no credit or debit card, so how can i get access Google cloud service, please help me!
If you're a student at a recognized college, you can access GCP completely free with Google Cloud Student Credits
try to highlight something novel abt your approach
Thanks
Hi @pale cove
I think If the tool should be a service, use MCP.
If the tool is just logic and internal stuff, donโt use MCP.
https://www.linkedin.com/feed/update/urn:li:activity:7399828693520662528/
Hi guys, I posted my project in Linkedin, please have a look at it and Review, I'm Excited to receive your feedback and while you are there small like and comment would be helpful ๐
I'm facing challenges to submit my project, specifically it's taking long time but still not get submitted. Someone help please.

Hi buddies, I'm Jai, just completed my 7th semester in my Computer Science course and specialised in Machine Learning.
hey guys, can i have some feedback on the diabetes prediction competition? I cant really seem to improve my score. would love some insights on reasons why my scores aren't improving.
Hey everyone! ๐
I'm currently looking for part-time remote opportunities or project-based work as a Junior AI Engineer.
What I bring:
- 1+ year building production agentic AI systems with LangGraph/LangChain
- Full-stack development (React, FastAPI, AWS, Docker)
- Proven experience in RAG systems, multi-agent architectures, and LLM applications
- Currently working on enterprise AI recruitment and HR automation systems
I'm open to freelance projects, contract work, or part-time positions in AI/ML development. If you know of any opportunities or are working on something interesting, I'd love to connect!
Resume: https://drive.google.com/file/d/1uJjhUuCPQO2-W_ih6lQ_n9G0puqbfs3X/view?usp=sharing
Feel free to DM me or drop any leads here. Thanks! ๐
I don't know how to push my notebook to kaggle via API, can you help me with that?
I love that parquet files are now parsed by the dataset detailed view!
How can I find out what the size limit is?
is there any discord bot?
small bug though: https://www.kaggle.com/datasets/lichess/chess-puzzles
It seems to be counting all columns across shards. (It's the same dataset split into 3, so 10 columns and not 30 columns like the UI says)
related: i have to name each file and its columns, but it's just different shards of the same data. not sure if there's a way to specify that by grouping all shards into a folder or something
Hey everyone ๐ I just rebuilt my full-stack + AI developer portfolio and would love some quick feedback.
๐ https://syeda-hoorain-ali.vercel.app
Iโm especially looking for suggestions on how itโs built (tech / structure) and how to make it more appealing to clients.
Some sections are still in progress, but honest critiques are very welcome.
Happy to answer questions too ๐
Iโm a student learning ML and kept getting stuck jumping between random resources.
I built a small free MVP (for personal use initially) that turns any topic into a structured learning path โ including mixed fields like ML + X.
My question: does this kind of structure actually help when learning ML, or does it feel too artificial?
Link (only for context): https://omniscientailearningg.in
Would really appreciate honest feedback โ whatโs confusing / useless is more valuable than praise.
Hi everyone,
I am currently working on a skin disease classification project and using ACWGAN-GP (Auxiliary Classifier Wasserstein GAN with Gradient Penalty) to augment the HAM10000 dataset, specifically for the minority classes.
Despite implementing several best practices, I am struggling with a high FID score (poor image quality/diversity). I would appreciate any insights or suggestions from the community.
Dataset: HAM10000preprocessed to 128x128
The Issue:
The FID score remains high even after many epochs. The generated images look blurry and lack the fine textures necessary for skin lesion features (like pigment networks).
Questions:
- Are there specific architectural tweaks for ACWGAN-GP that work well with medical/skin imaging?
- How do you balance the Auxiliary Classifier loss versus the Adversarial loss to prevent the "garbage" output?
- Would you recommend specific data augmentation on the real images before feeding them into the GAN for this specific dataset?
Iโve attached a snippet of my training loop below. Any help would be greatly appreciated!
https://www.kaggle.com/code/akbariffianto/acwgan-ham100000
fix FID first
compute FID with identical preprocessing for real,fake and use โฅ5kโ10k samples
128ร128 is too low for pigment texture
so if possible train at 256ร256
and then make the discriminator more texture-aware
use a Patch or multi-scale discriminator
and donโt let the aux classifier dominate
Crop to lesion ROIso the GAN learns lesion detail, not background skin
augment real data lightly
now getting:
Unable to show preview
We don't have metadata for this file
is the feature documented anywhere i can follow along? ๐
Need help to create a chatbot.
Please go through this reddit post :
https://www.reddit.com/r/learnmachinelearning/s/3NUzEOqVB8
Need to create this .
Can anyone help me .
Thnx for the idea
Hi @everyone
๐ Python Loops & Strings โ Kaggle Notebook ๐
This notebook explains Python loops (for, while) and strings in a detailed and easy-to-understand way, with clear examples.
Itโs especially helpful for beginners ๐
Please check it out and leave a vote โญ and a comment ๐ฌ โ your feedback is highly appreciated! ๐
https://www.kaggle.com/code/dastgeerjutt/3-loops-and-strings-detailed
Hii guys can anybody help . I am new in kaggle competition and unable to make submission . I am getting this red flag
Cannot submit
Your Notebook cannot use internet access in this competition. Please disable internet in the Notebook editor and save a new version.'
You have to Open your Kaggle notebook โ go to Settings (right side) โ turn Internet = OFF โ Save Version (make sure itโs saved with internet disabled) โ then try submitting again.
Don't think this is what this channel is for
@rustic blade it's kings command not opinion so u better follow
Assalam o alikum
@everyone @Muhammad Aammar Tufail @M. Abubakar Ansari @Muhammad Usman
HBL PSL 2026 Cricket Players Auction Analysis
https://www.kaggle.com/code/hammadansari7/hbl-psl-2026-cricket-players-auction-analysis
Please check it out, point out my errors, and let me know how I can improve.
Assalam o alikum @everyone
please vote my notebook
https://www.kaggle.com/code/hammadansari7/hbl-psl-2026-cricket-players-auction-analysisAssalam o alikum @everyone
please vote my notebook
https://www.kaggle.com/code/hammadansari7/hbl-psl-2026-cricket-players-auction-analysis
Assalam o alikum @everyone
please vote my notebook
https://www.kaggle.com/code/hammadansari7/hbl-psl-2026-cricket-players-auction-analysis
Hi everyone!
I need some help.
A notebook I deleted is stuck in "Version 1, cancelling" for over 14 hours. The notebook was imported from GitHub and deleted only on Kaggle. There is no option to stop sessions or delete it from the active events list.
How can I stop it?
https://www.kaggle.com/datasets/mabubakrsiddiq/developer-stress-simulation-dataset
please explore this dataset and upvote...
I'm having the same problem as well, mines at 15 hours.
Also, while I'm here check out my code in my notebooks from this past month. I'm trying to push more upvotes for it when i can: https://www.kaggle.com/code/gastondana/heart-disease-recirculation-loop-auc-0-964
Thx.
The download resume is a chefs kiss! Nice job!
Hello, I would like to ask professionals to help me and provide feedbackโam I doing this correctly? Has it been done properly, or is AI just leading me in circles? How should I proceed further? Any recommendations would be greatly appreciated. https://www.kaggle.com/code/kiza123123/heart-disease-s6e2-protocol
Mine got fixed, took just over a day and a half or something
I guess they do cleanup at a regular interval
@left jewel Yeah it was all weekend for me, mines updated now. I noticed safari was fixed first before chrome was, but either way I'm glad. Now I got to pick my flow back up this afternoon that i established before the wave of dead kernels.
Hello everyone, I'm having problems with inference performance on the kaggle platform. I was trying to submit a U-Net to a challenge and, the first time (some days ago) each inference was taking around 30 seconds to complete. Now it is taking more than 2 minutes and 30 seconds. Do you have/had the same problem? Thanks!
Hello, Devs i have completed Data Science and Machine/Deep Learning, and i also done some projects with integrating machine learning, so what is my next step to prepare for job, so can someone guide me on my next steps
If you are interested about digital twins I have some libaries on my page, I'm a PhD student rn but I am doing some notebooks as a Teaching Assitant on that kind of stuff, its useful if you want to replicate irl systems, i.e. simulate a factory etc https://www.kaggle.com/petrumihaicraciun
You might be interested in this amazing new dataset that got published on Kaggle: https://www.kaggle.com/organizations/sordi-ai/
SORDI.ai is the Worldโs Largest and Most Comprehensive Collection of Multimodal Synthetic Datasets for Industries.
This does look like a pretty cool data set thank you ๐
I am looking for a data scientist
We are seeking a Senior Data Scientist to serve as the analytical architect of our SaaS platform. This role is not about general data analysis; it is about building the predictive core that dictates operational efficiency for mission-critical environments. You will be responsible for developing high-precision forecasting models, curating industry-specific constraint databases (e.g., healthcare hygiene regulations), and architecting scalable data pipelines. Working at the intersection of operations and finance. You will turn raw spatial and historical data into automated, optimized cleaning plans that define the gold standard for hygiene logistics.
Hi there,
I found some possibly odd glitches during 1 of my sessions earlier in the week mainly in an active notebook, I documented all the weird kernel mishaps during it. I can provide a 1 page quick summary of what I found, & if you need images/videos I can provide them as well. If you'd like of course.
Thank you.
Gaston D.
Would anyone here know where would be the best place for a new graduate in Data Analytics to start here in Kaggle to further develop my skills in AI? I look forward to hearing from you soon!
You should start doing some kaggle beginner competitions, like #๐ โhouse-prices-advanced-regression-techniques
Hey Everyone! I'm a CS student and a football referee. I'm exploring the idea of building an AI project/startup to help the referee community (training tools, match incident review, etc.).
If anyone is interested, I would like to collaborate or share ideas, feel free to DM me.
Hello hackers,
I need some help. Iโm training a conversation disentanglement model using this repo: https://github.com/jkkummerfeld/irc-disentanglement
. It will be used to prepare a conversation dataset for a project.
I donโt have access to compute resources that can run continuously for five days. Iโm using Google Colab, but sessions eventually stop when the tab closes or times out. I also canโt afford a cloud provider right now.
If anyone has a home setup that can run uninterrupted for several days and is willing to help, I would really appreciate it. Thanks!
Hi devs, I am looking for a mid/senior engineers in various fully remote roles (Frontend, Backend, Full-stack, AI/ML, Data, DevOps). it's full-time freelancing work and the pay is up to $5/hr for mid level and up to $8 for senior level. Please DM me.
@slow rover
Hello,
My name is Pius Shedrach, Frontend engineer.
I saw you post on #๐ปโask-a-dev message
Am available to start working right away. If you still need a frontend dev I can send link to my works.
Thanks
Hello everyone! ๐
If you want to upgrade your IT skills and learn more about the Microsoft ecosystem (Azure, AI, Cloud, etc.), come join the Microsoft Elevate Training Center! ๐
This program is great for those who want to prepare for official certifications or simply stay updated with the latest technologies together with Dicoding.
Register for free through this link: https://www.dicoding.com/elevate/registration?referrer_id=5510036
Letโs go while the opportunity is still there!
Hey @everyone
I have just completed my first project by using python,numpy,pandas,matplotlib,seaborn...
Just view it once and and give me your feedback.
Your feedback really matters a lot to me.
https://www.linkedin.com/posts/zia-ur-rehman63_icodeguru-dataanalytics-python-activity-7437385372671688704-MyBu?utm_source=share&utm_medium=member_desktop&rcm=ACoAAFi9ZNsBS730JlvcudUp_BZUGk5XmwWSkaM
If you found this project valuable, feel free to like or share your thoughts.
Hi @slow rover plz check DM
Great work
Nice
Hello everyone @everyone , I am a Grade 10 student passionate about Aerospace Engineering. I have decided to to learn ML to integrate it with aerospace as i believe this is going to be the way moving forward in terms of job skills. This is my first ML project related to this and hope you find it interesting
I used Claude for the SMOTE code as the dataset had too many class imbalances
Really looking forward for your feedbacks!
HELLO ADHARVA great to here u are into ml I am also a tenth grader and just started lerning it , its confusing but interesting
Hey guys! Iโve been working on Nexbook.tech
, an EdTech "academic super-app" for college students that provides 5000+ notes, PYQs, and a book marketplace. Iโm trying to optimize the UI/UXโspecifically the PDF viewer experience and the marketplace navigation.
I would love some professional feedback from this community: Does the layout feel too cluttered for a student-focused app? Any suggestions on improving the PWA performance? ๐
hey, great project, I hope you deactivated your KGAT token which is committed in the repo
Hi, your work is showing Kaggle tokens; please fix the issue quickly.
That key needs to be deactivated and a new one obtained; otherwise, it's possible to view the key again through the old settings.
i did that
Alright, then I wish you success in your endeavors.
Thanks!
Help me decide? ๐ง
We're tweaking the interface for Nexbook.tech and Iโm stuck between two styles. Would love to get some fresh eyes on the current UI.
Link: https://nexbook.tech
Whatโs the first thing youโd change?
Any one here know python
I need help with a small project
class Pokemon():
def __init__(self, PokemonName, PokemonLevel, PokemonType, PokemonStrength, PokemonWeakness, ImagePath):
self.Name = PokemonName
self.Level = PokemonLevel
self.Type = PokemonType
self.Strength = PokemonStrength
self.Weakness = PokemonWeakness
self.ImagePath = ImagePath
def displayInfo(self):
print("------------------")
print("Pokemon Name : ", self.Name)
print("Pokemon Level : ", self.Level)
print("Pokemon Type : ", self.Type)
print("Pokemon Strength : ", self.Strength)
print("Pokemon Weakness : ", self.Weakness)
def showImage(self):
img = Image.open(self.ImagePath)
img.show()
pikachu = Pokemon("Pikachu", 100, "electric", "water, flying", "ground",
"/sdcard/Android/data/ru.iiec.pydroid3/files/pokesprites/25.PNG")
pikachu.displayInfo()
pikachu.showImage()
charizard = Pokemon("Charizard", 100, "fire/flying", "grass, bug, ice, steel, fighting", "water, rock, electric",
"/sdcard/Android/data/ru.iiec.pydroid3/files/pokesprites/6.PNG")
charizard.displayInfo()
charizard.showImage()
dragonite = Pokemon("Dragonite", 100, "dragon/flying", "dragon, bug, fighting, grass", "fairy, dragon, rock",
"/sdcard/Android/data/ru.iiec.pydroid3/files/pokesprites/149.PNG")
dragonite.displayInfo()
dragonite.showImage() ```
It works fine but the images don't show
is this the poke api thing ???
Nah its just a small project I'm doing
Im doing it more for learning than achieving an end goal with it
But idk why the pictures won't show
What pokeapi thing are you talking about tho?
so i built a single window application using pyside6, where i fetched data of Pokรฉmon name entered by the user from a free api available online named pokeapi it showed name type and picture of the pokemon
Oh i just searched it it's cool
Did u use python?
yes
But my program isn't really about the pokemon it's mainly for me to learn and for fun
Do you know why the pictures might not be loading tho?
are u running it on laptop???
the path u r trying to get the image from only exists for phones not for windows or mac
Pydroid 3
On my phone
It's basically the python for androids
man idk then most probably the path is wrong or the folder/files dont exist, cuz the code block is almost correct
I would have gotten an error if the path was wrong
thats true :3
Hey @Eveyone is here anyone knows MS Adams software (advance level) I need some help in my project please contact me or DM if you can
Happy Weekend!
Hello Everyone!
If you know someone who have good skills in Python and Machine Learning, Please invite me!
Our Company is open to hire Python and Software Engineer.
Requirements:
- 2+ years of Software Engineering Experience
- C1 or Native English Level
- Good vision of Software Trent
Benefits:
- Competitive Income
- Supporting Several roles and chances
- Multiple Role Working is enable
Important:
- Our company is designed for Capability Person.
Questions:
- For Junior Persons?
Do not give up, strong enthusiasm is also big point and our company also focus on the person's enthusiasm.
Thanks again.
Sophia
Selected at Stanford University as SL for the 2nd time โค๏ธ๐ฅ
it means a lot if I get a supportive comment from your side
Hi guys rate my website from my profile, any improvement, recommendation etc?
do people not read channel descriptions?
this channel is for questions about Kaggle as a platform 
Ask Kaggle developers questions about the site and we will do our best to answer you! This is a place to ask about how things work, or any behind the scenes information that we can share with you. Feel free to ask questions about anything our developers may know, and we will do what we can to satisfy your curiosity. Fun questions are welcome!
After long evaluation and thought.... I am pretty confident in saying no, no one reads the channel descriptions before posting. LOL. I will remind everyone it is a violation of server rules to post the same thing across many channels.
Hello Everyone!
If you know someone who have good skills in Python and Machine Learning, Please invite me!
Our Company is open to hire Python and Software Engineer.
Requirements:
2+ years of Software Engineering Experience
C1 or Native English Level
Good vision of Software Trent
Benefits:
Competitive Income
Supporting Several roles and chances
Multiple Role Working is enable
Important:
Our company is designed for Capability Person.
Questions:
For Junior Persons?
Do not give up, strong enthusiasm is also big point and our company also focus on the person's enthusiasm.
How to apply?
DM with resume and 1min's record of your English Speaking
Thanks again.
Sophia
@violet locust Can you please help me, Asking since you are staff here
Hi Kaggle Team , I need your support, when i am trying to verify my account using my mobile number , it says that i am using multiple accounts. Can you guys please help me know what that other account is so that i delete it , or can you please delink my mobile number from it ?
Please .
cc:- @pine narwhal , @knotty bobcat Can someone please look into this ? Please , I have also raised this issue at email support@kaggel.com
@lyric bane support@kaggle.com is the right place to get help with this (note that you have a typo in your message so you may want to double check you sent an email to the right address).
Thx @pine narwhal my bad I mailed at the right address but shared wrong one here ! I sent them mail but didnโt get any positive response back from there ! Can you please escalate it ?
Hey folks,
i am taking this course https://www.kaggle.com/code/markishere/day-1-prompting
and in the very first question - i am hitting gemini api usage
'You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits.e
how to resolve that?
Hi everyone, I'm currently working on my Masters in Data Science with a particular interest in data visualization.
Are there any data visualization tools, programs, or best practices that you could share to help me on my journey?
Hey, I'm teaming up in a team competition for the first time, but experiment tracking, model sharing is now a big chaos, I saw a few solutions like mlflow, kedro, dagshub, but I don't really understand what should I actually chose for, I tried a few things but everuthing is still messed up, all the 4 member right now have multiple notebooks, some forked, some dead and lost on kaggle, we all have our own leaderboards...
Yeah I know, it's more of a team bonding issue than a tool issue, still can you share some advices?
Hi everyone, I'm currently working on my Masters in Data Science, IS there any dataset about GASOIL for free
Yeah I know someone.
I sent you a friend request.
Seaborn is a data visualization tool . In seaborn website and kaggle dataset help you in data visualization practice
Hi everyone
I built WhyAnalyst AI because I was tired of having $10 problems that required $1000 workflows to solve.
A simple question like:
โWhy did sales drop last month?โ
usually turned into:
exporting CSVs
cleaning spreadsheets
writing SQL
opening dashboards
debugging charts
wasting hours
The problem itself was tiny.
The workflow around it was insane.
So I built WhyAnalyst AI.
Now I can:
upload a dataset
ask questions in plain English
get charts + insights instantly
No dashboards.
No spreadsheet chaos.
No analyst bottleneck.
AI wonโt replace analysts.
But it will replace a huge amount of repetitive analysis work.
That shift is coming much faster than people think.
