#🔗┊sharing-projects


ruby nymph
solar crown
shy plume
mint wagon
umbral thicket
weak sierra
solar crown
#

This is a notebook on face verification and recognition that I enjoyed structuring: https://www.kaggle.com/code/diaaessam/face-verification-and-recognition and this is a full project I built on top of the idea: https://github.com/DiaaEssam/Face_Verification_and_Recognition_System

lost elm
#

Hi everyone, sharing these global disaster/accident datasets, which can be useful for EDA and visualization practice for beginner and intermediate users.

https://www.kaggle.com/datasets/warcoder/earthquake-dataset
https://www.kaggle.com/datasets/warcoder/oil-spillage-data
https://www.kaggle.com/datasets/warcoder/civil-aviation-accidents

gentle notch
dense grove
#

Hello everyone, I'd like to invite you to take a look at my work on how back pain issues are written up! I just uploaded a dataset to Kaggle containing radiologists' notes on the lumbar spine, which I refined before uploading. I would be happy to receive any feedback or suggestions for improvement from the community.

Link: https://www.kaggle.com/datasets/tejaskarkera001/radiologist-notes-lumbar-spine

solar crown
gloomy plinth
solar crown
# gloomy plinth Want to use Time-series in it?

I have looked at some time series data.
In those datasets the date column advances day by day, but in my data it advances week by week, so I don't know whether my data is suitable for time series modeling.

gloomy plinth
#

You need to set an appropriate horizon and window size.
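To illustrate the horizon and window size idea for weekly data, here is a minimal sketch in plain Python. The function name `make_windows`, the sample values, and the sizes chosen are all hypothetical and not taken from anyone's dataset in this thread. Both sizes are counted in rows, so with one row per week they are measured in weeks.

```python
def make_windows(series, window, horizon):
    """Split a series into (inputs, targets) pairs.

    window  -- number of past observations fed to the model
    horizon -- number of future observations to predict
    Both are counted in rows, so with weekly data they are in weeks.
    """
    pairs = []
    last_start = len(series) - window - horizon
    for start in range(last_start + 1):
        inputs = series[start:start + window]
        targets = series[start + window:start + window + horizon]
        pairs.append((inputs, targets))
    return pairs


# Hypothetical weekly observations (one value per week).
weekly_values = [10, 12, 13, 15, 14, 16, 18, 17]

# Use the past 4 weeks to predict the next 2 weeks.
pairs = make_windows(weekly_values, window=4, horizon=2)
for inputs, targets in pairs:
    print(inputs, "->", targets)
```

A series shorter than `window + horizon` simply yields no pairs, which is a quick way to check whether a weekly dataset is long enough for the sizes chosen.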

solar crown
sour junco
#

Hey everyone, here is my dataset on cargo ships that sail all over the world.
You can use this dataset to train a model that predicts a ship's weight from its dimensions.
The dataset also contains the name of the company and the year each ship was built.
Here is the link:

orchid geyser
umbral thicket
dusky zephyr
#

Hi everyone! The following starter notebook may be helpful to the new participants in the competition CommonLit - Evaluate Student Summaries. Apart from EDA and baseline models, the key highlight of the notebook is two different approaches to dealing with regression problems with multiple output variables. Any review/suggestions would be much appreciated. Thanks!

https://www.kaggle.com/code/sugataghosh/commonlit-multioutputregressor-regressorchain

neat mirage
#

hi guys, are you interested in Natural Language Processing but don't know where to start? I have recently made an entire series of notebooks to let you have a comprehensive overview of NLP. The notebook is tailor-made for beginners and if you are interested, go check it out!

Let me know if you spot anything that could be improved or corrected, and feel free to comment as well!

✅ Comprehensive Overview on NLP for Beginners 🥳 (collection of all series)
https://www.kaggle.com/code/crxxom/comprehensive-overview-on-nlp-for-beginners

🔴 NLP Beginner Series Part 1: NLP Preprocessing
https://www.kaggle.com/code/crxxom/nlp-beginner-series-part-1-nlp-preprocessing

🟡 NLP Beginner Series Part 2.1: Word Embeddings
https://www.kaggle.com/code/crxxom/nlp-beginner-series-part-2-1-word-embeddings

🟢 NLP Beginner Series Part 2.2: Embedding Models
https://www.kaggle.com/code/crxxom/nlp-beginner-series-part-2-2-embedding-models

🟣 NLP Beginner Series Part 3: Case Study
https://www.kaggle.com/code/crxxom/nlp-beginner-series-part-3-case-study

lost elm
solar crown
weak sierra
weak sierra
frank peak
#

📈Forecasting in Data Science: Predicting the Future with Data-Driven Insights🔮🚀

Hi everyone! This is my second topic on Kaggle that discusses everything you need to know about forecasting, including:

👉 Definition of forecasting,
👉 Techniques in forecasting,
👉 Step-by-step forecasting, and
👉 The challenges

Link to Kaggle Discussions: https://www.kaggle.com/discussions/general/428168

I hope you find this topic useful, and please let me know what you think about this one. Thank you very much!

orchid geyser
#

Hello everyone! I've made my very first dataset on Kaggle, and it's a complete list of all rollercoasters, past and present, globally!

Link: https://www.kaggle.com/datasets/mcpenguin/rollercoasters

Please play around with the data and upvote if you think this is good, and let me know if you have any feedback! I will publish a starter notebook soon even though I should really be studying for exams right now lol

frank peak
#

Unlocking Hidden Gems: Lesser-Known Secrets, Tips, and Tricks in Pandas

Hello everyone! This is my third discussion on Kaggle. This time, I discuss tips & tricks for the Pandas library that you may not know yet. This discussion summarizes 10 tips & tricks, including code and how to use them.

🌎 Link to Kaggle Discussions: https://www.kaggle.com/discussions/general/429798

I hope you find this topic helpful, whether you have just started learning Pandas or want to expand your knowledge of it. Also, if you want part 2, comment on the discussion. Let me know what you think about this one & if you know other tips & tricks that haven’t been mentioned in the article. Thank you very much!

buoyant scaffold
#

Hello everyone! I created my FIRST kaggle notebook.
📊Check out my Kaggle notebook for exciting analyses and visualizations. Dive in now: 🚀https://www.kaggle.com/code/cauelias/diabets-notebook
If you have some tips or advice please comment. I'm so excited to read what you have to say!

neat mirage
dusky zephyr
trail fractal
cyan pier
orchid geyser
#

Make sure you set index=False as well

novel sedge
#

🌍 Twitter Earthquake Data Analysis

This Kaggle dataset contains 📊 information and 📈 statistics related to a recent earthquake event in Turkey. The dataset can provide valuable insights into the online response, interaction, sentiment, and more 📊 surrounding the earthquake through Twitter data. This dataset contains data from February 6, 2023, to February 11, 2023.

📋 Data Conditions

  • Have at least 1 👍 like OR at least 1 🔄 retweet

🌐 Language Variants

  • The keyword for English_V2 files is #TurkeyEarthquake.
  • The keyword for Turkish and English files is #deprem.

The dataset contains the following basic properties:

  • 🔗 URL: URL of the tweet.
  • 👤 Username: Twitter username of the author of the tweet.
  • 📅 Date: The date and time the tweet was sent.
  • 📝 Tweet: The content of the tweet.
  • 🔗 Hashtags: Hashtags used in the tweet.
  • 👥 Mentions: All Twitter accounts mentioned in the tweet.
  • ❤️ Number of Likes: The number of likes (favorites) the tweet received.
  • 🔄 Number of Retweets: The number of times the tweet was retweeted.
  • 💬 Number of Replies: The number of replies the tweet generated.

The dataset provides insights into engagement metrics such as the number of likes, retweets, and replies for each tweet. It also includes details on hashtags, mentions, and the content of tweets, providing a comprehensive view of how the earthquake event was discussed and shared on Twitter.

Furthermore, the dataset includes three additional sections, each providing specific information:

  1. 📊 Number of Tweets by Date: A breakdown of the number of tweets posted between 6 February 2023 and 11 February 2023, categorized by different time periods. This information helps to understand the volume of Twitter activity throughout the day.

  2. 🔖 Tweet Tag Counts: This section presents numbers grouped by different value ranges. The values have some kind of classification or labeling.

  3. 📜 Individual Tweet Details: A list of individual tweets, including their content, author details, and engagement metrics. Each individual tweet can provide insight into Twitter users' emotions and reactions to the earthquake event (with appropriate analysis).

In summary, this dataset provides a valuable resource for understanding the real-time reaction of Twitter users to the earthquake event that occurred in Turkey on February 6, 2023, and for conducting sentiment analysis and engagement metrics analysis. Researchers and data analysts interested in social media analytics, disaster response, and sentiment monitoring will find this dataset useful for their analysis.
Check it out here: Twitter Earthquake Data Set

turbid prairie
#

I'm thrilled to announce the launch of my new newsletter: Data People! 📊📈🤓

Each week, I'll distill the work of highly-skilled data professionals via transcribed interviews that take <5 min to read. Everyone from aspiring analysts to seasoned data veterans can learn from these world-class data experts.

This Week: Meet Ken Jee!
For our inaugural post, I talked to sports analyst, Kaggle advocate, and YouTube educator Ken Jee. We cover:
⚾️ Breaking into sports analytics and what his role entails
🧠 How he uses LLMs in his day-to-day work
💪 Advice for data scientists looking to grow their skill sets
https://www.askdatapeople.com/p/ken-jee-33ae

In upcoming issues, we'll learn from: an econometrics professor, an AI researcher, a data privacy advocate, an ML Ops evangelist, and more.

Interested in bite-sized nuggets of wisdom from world-class data professionals? Subscribe for future interviews.

Data People

Analyzing athletes, growing your data skills, and how LLMs will change data science.

slow jacinth
turbid prairie
#

Thanks! there are like three sections dedicated to Kaggle in this newsletter 🙂

turbid prairie
slow jacinth
turbid prairie
low valley
neat mirage
#

Football Transfer News Articles for NLP

The football transfer market is going wild these days, especially in the Premier League, with clubs like Chelsea spending billions over the last several transfer windows. Are you quick enough to keep up with the transfers and rumors flying around the market every day?

This dataset provides you with all the news articles related to the transfer market in football published on 90min.com from May 2020 to August 2023.

Train your NLP model on the large amount of text in the dataset, then build cool predictions and applications to gain better insight into the market!

https://www.kaggle.com/datasets/crxxom/football-transfer-news-for-nlp

ruby nymph
rugged scarab
mint wagon
lost elm
stiff mango
rain crown
#

Great work, @stiff mango. Thanks for sharing!

vapid blaze
#

Hi everyone,
Here is a list of some of my projects as well:

I have also created a GitHub repository where I have uploaded most of my Kaggle notebooks, you can check it out here: https://github.com/Anubhav-Goyal01/Machine-Learning-Projects

GitHub

This project aims to predict the price of a laptop based on various features. It utilizes machine learning techniques to train a model and make predictions. - GitHub - Anubhav-Goyal01/Laptop-Price-...

GitHub

This project aims to predict the rent of a house based on various features such as location, furnishing status and square footage. The machine learning model has been trained on a dataset consistin...

GitHub

This project aims to demonstrate the end-to-end workflow of training and using the YOLOv5 model for object detection tasks. - GitHub - Anubhav-Goyal01/YOLOv5-Object-Detection: This project aims to ...

GitHub

This repository contains several machine learning projects implemented using Jupyter Notebooks. Each project is contained in its own folder and comes with a corresponding dataset. - GitHub - Anubha...

rugged scarab
#

Hi Kaggle fam🙋‍♂️, I have created a new dataset on 🏨hotel data for different Indian cities. The data was scraped from the MakeMyTrip booking site and includes price per night, star category, ratings, etc. for each hotel. Cities like Mumbai, Delhi, and Bangalore are available, with more to be added soon. So far, only about 100 hotels have been added for each city. Please have a look at it and drop your comments💬⬇️.
https://www.kaggle.com/datasets/andrewgeorgeissac/hotel-price-data-of-cities-in-india-makemytrip

frank peak
#

[Auto-Update Existing Kaggle Datasets via API]

Hello everyone!

Since some features in Deepnote will be deprecated after August 2023, I moved and updated the script for updating datasets via Kaggle API from Deepnote to Kaggle.

In this notebook, you can learn how to update dataset versions using Kaggle API by providing metadata and utilizing Kaggle secrets.

🌎Link to notebook: https://www.kaggle.com/code/caesarmario/auto-update-existing-kaggle-datasets-via-api

Feel free to check it out and let me know your thoughts!

mossy mica
orchid geyser
frank peak
#

[Mastering Forecasting in Data Science: An Intermediate Guide]

Hey everyone! In this second article about forecasting, I discuss forecasting in more depth, exploring more sophisticated methods and real-world applications. I also explain the best practices to follow and the challenges you might face as you master your forecasting technique.

🌎 Link to Kaggle Discussions: https://www.kaggle.com/discussions/general/433384

I hope you find this topic helpful if you want to learn more about forecasting. Also, if you have an interesting idea/topic that I should cover in the next discussion, let me know. Thank you very much!

lost elm
low valley
#

Hey everyone!

I recently published a notebook on Kaggle that dives into the Latest Data Science Salaries dataset. Here's what I explored:

  • Conducted an extensive EDA using Plotly to sharpen my data visualization skills.

  • Built models to predict salaries in USD.

Feel free to check it out! Your feedback and suggestions are more than welcome!

https://www.kaggle.com/code/lusfernandotorres/data-science-salaries-2023-eda-prediction

tranquil owl
#

Data set

#

Management operations dataset

orchid geyser
#

Hello everyone again, I've just completed a notebook which involves extensive data cleaning (which is not really something you see a lot of on Kaggle) and some exploratory data analysis of the World Happiness Reports dataset!

Link: https://www.kaggle.com/code/mcpenguin/world-happiness-data-cleaning-eda/notebook

Please check it out and let me know if you have any feedback/updates :))

restive minnow
#

Hello everyone,
I have always wanted to know how my Kaggle progress looks over time. By "progress", I mean the number of upvotes and medals on my activity. So I created this notebook, which plots the number of upvotes and medals over time for various categories (e.g. Discussion, Notebooks, Datasets, etc.).

Notebook link: https://www.kaggle.com/code/mohit2512/your-progression/notebook

You can check your progress as well, just add your user name.

lost elm
#

🌟21 Indic Languages Transliteration Dataset🌟
The Indic languages are the languages of the Indian subcontinent: all the indigenous languages of the region, regardless of language family. As the name suggests, the dataset covers 21 languages, such as Assamese, Urdu, and Gujarati.

Link to the dataset: https://www.kaggle.com/datasets/warcoder/transliteration-dataset-21-indic-languages

brittle mason
orchid geyser
lost elm
vapid blaze
low valley
#

Hey everyone! I just posted a new notebook exploring the use of Weight of Evidence and Information Value, two of the most widely used tools in finance and credit analysis for feature selection. I also show the importance of these tools for boosting the performance and explainability of white-box models, such as a simple Logistic Regression model. Feel free to check it out! https://www.kaggle.com/code/lusfernandotorres/weight-of-evidence-and-information-value
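For readers new to the two measures, here is a minimal sketch of how they are commonly computed for one binned feature. It is not code from the notebook. The function name `woe_iv`, the bin labels, and the counts are hypothetical, and the sign convention used here (goods over bads) is an assumption, since some references flip the ratio. Bins with a zero count would need smoothing, which this sketch leaves out.

```python
import math


def woe_iv(bins):
    """Compute Weight of Evidence per bin and the total Information Value.

    bins maps a bin label to (good_count, bad_count).
    WoE = ln(share of goods in the bin / share of bads in the bin)
    IV  = sum over bins of (share of goods - share of bads) * WoE
    """
    total_good = sum(good for good, _ in bins.values())
    total_bad = sum(bad for _, bad in bins.values())
    woe = {}
    iv = 0.0
    for label, (good, bad) in bins.items():
        good_share = good / total_good
        bad_share = bad / total_bad
        woe[label] = math.log(good_share / bad_share)
        iv += (good_share - bad_share) * woe[label]
    return woe, iv


# Hypothetical counts of good and bad loans per income band.
income_bins = {"low": (100, 60), "mid": (300, 30), "high": (400, 10)}
woe, iv = woe_iv(income_bins)
for label, value in woe.items():
    print(f"{label}: WoE = {value:.3f}")
print(f"IV = {iv:.3f}")
```

A feature whose bins all have the same good-to-bad mix gets a WoE of zero everywhere and an IV of zero, which is why IV is used to rank features for selection.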

mossy mica
dusky zephyr
dusky zephyr
tiny sail
mossy mica
ruby nymph
frank peak
#

[🤠🐼Mastering Pandas: Unveiling Rarely Explored Secrets, Tips, and Tricks💎]

Hello everyone! This is another discussion of mine on Kaggle. This time, I discuss more tips & tricks for the Pandas library that you may not know yet. This discussion summarizes 10 more tips & tricks, including code and how to use them.

🌎 Link to Kaggle Discussions: https://www.kaggle.com/discussions/general/435239

I hope you find this topic helpful, whether you have just started learning Pandas or want to expand your knowledge of it. Also, if you want part 3, comment on the discussion. Let me know what you think about this one & if you know other tips & tricks that haven’t been mentioned in the article. Thank you very much!

orchid geyser
winged marsh
#

follow my projects on https://github.com/SarahY89

zealous swallow
warm wave
#

🎀 Hello everyone. I'm sharing my regression work with you. Your feedback is very valuable, and I look forward to hearing it. Don't forget to vote and star if you liked it. 🌟 Thanks 🙏

https://www.kaggle.com/code/huseyincenik/health-expense-explorer
https://github.com/huseyincenik/data_science/tree/main/Projects/Health Expense Explorer

primal tiger
lilac zinc
prime summit
#

Super cool! Does it contain any of the flight vector data?

mint wagon
scenic badger
scenic badger
neat mirage
rustic solstice
#

Hey there!
I'm part of a research team that has spent the past nine years building a dataset of Antarctic geology.
Our dataset just got published in Nature Scientific Data: https://www.nature.com/articles/s41597-023-02152-9
I've just uploaded the dataset to Kaggle: https://www.kaggle.com/datasets/samelkind/geomap-a-geological-dataset-of-antarctica

The dataset is made up of 99,080 polygons that cover all exposed outcrops on the continent. Each polygon has over 30 attributes including age, lithology, geological unit, and description.
If you are interested in geospatial analysis, Antarctica, or geology, please check it out!
I'm still in the process of creating some example notebooks, so if you're having trouble getting started, let me know and I'll try to help you out!

Nature

Scientific Data - A continent-wide detailed geological map dataset of Antarctica

orchid geyser
#

Hi everyone, I'm excited to announce my new notebook on classifying the smoking and drinking status of Korean individuals. I didn't get good accuracy, but that may reflect the methods I used rather than the data. Nevertheless, I did some pretty extensive feature engineering and EDA, so I would appreciate it if you could check it out and give me feedback!
Link: https://www.kaggle.com/mcpenguin/smoking-drinking-classification-tfdf

scenic badger
noble sleet
#

Hello everyone! I've just completed my first work on a classification algorithm using a spam email dataset. I would love to hear your thoughts and suggestions for any improvements I can make. Your insights would be greatly appreciated!

https://www.kaggle.com/dinanksoni/spam-email-classification

orchid geyser
daring wren
scenic badger
#

WHAT WILL YOUR SALARY BE? Well labeled and analyzed, FIND OUT NOW!

In this project, we use a comprehensive approach to analyze and predict employee salaries. The methodology begins with Exploratory Data Analysis (EDA) that employs various visualization techniques such as heatmaps, distribution plots, and pair plots to understand the underlying structure and correlations in the data.

Hope you enjoy and I would love your feedback!

https://www.kaggle.com/code/matviyamchislavskiy/your-salary-prediction-and-eda/notebook

scenic badger
livid abyss
#

Character recognition has been a pivotal problem in the field of computer vision and machine learning, finding applications in everything from document analysis to automated data entry.

In this Kaggle notebook, we embark on an exciting journey of tackling character classification using Generative Bayesian Classification with Multivariate Gaussian Models and Maximum Likelihood Estimation, all from scratch!

https://www.kaggle.com/code/varunnagpalspyz/generative-bayesian-classification-from-scratch

daring wren
frank peak
hollow spire
livid abyss
#

During my research internship, I have been working with a lot of tabular Wikipedia Infobox data. My work mostly revolves around the temporal aspect of this data, but I thought I could use what I built during this time to create a dataset of Wikipedia Infobox data for all cricketers found on Wikipedia.

So, here it is,
Link to the Cricketer Infobox Dataset: https://www.kaggle.com/datasets/varunnagpalspyz/uncover-cricket-legends-cricketers-wikidata
Link to the Notebook which contains code for clean and efficient extraction of Wikipedia Infoboxes in JSON format: https://www.kaggle.com/code/varunnagpalspyz/uncover-cricket-legends-data-extraction-with-ease/notebook

If anyone is working with such semi-structured data and is interested in taking up projects in this domain or knows of any work opportunities in this domain, do let me know.

daring wren
ruby nymph
#

✍️Practice your computer vision skills in a Fun way using 🐶Pet's Facial Expression Image Dataset😻

livid abyss
#

📸 Introducing my Kaggle Notebook on Face Recognition using PCA from Scratch! 🧑‍🔬

Hey there, fellow data enthusiasts! 👋 I'm excited to share my latest Kaggle Notebook on the fascinating world of Face Recognition using Principal Component Analysis (PCA) from scratch. In this notebook, I dive deep into the intricacies of PCA, demonstrating how it can be a powerful tool for dimensionality reduction and feature extraction in the realm of computer vision.

Link to the Notebook: https://www.kaggle.com/code/varunnagpalspyz/face-recognition-with-pca-from-scratch/notebook

Here's a sneak peek of what you'll find in my notebook:

🔍 Exploration: We'll start by exploring the importance of face recognition and why PCA is a valuable technique for this task.

🔧 Building from Scratch: Get ready to roll up your sleeves as I guide you through the step-by-step process of implementing PCA for face recognition, without relying on external libraries. It's all about understanding the math and the magic behind it!

📈 Results & Insights: I'll showcase the results of our PCA-based face recognition model, discussing its strengths and limitations. We'll also delve into the insights gained from this approach.

Let's unravel the mysteries of PCA in face recognition together! Dive into the world of dimensionality reduction and discover the beauty of recognizing faces through the power of data. 🤩

Feel free to leave comments, ask questions, and let's embark on this learning journey together 🚀📊 #FaceRecognition #PCA #Kaggle #DataScience
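As a rough illustration of the PCA step described above, here is a minimal library-free sketch. It is not the notebook's code: it finds only the first principal component, it uses power iteration (a technique chosen here for brevity, not one the notebook is known to use), and the names `first_principal_component` and `project` plus the toy 2-D points are hypothetical. In a face recognition setting each row would instead be a flattened image.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return the mean and leading principal axis via power iteration."""
    n = len(rows)
    dim = len(rows[0])
    # Center the data: PCA works on deviations from the mean.
    mean = [sum(row[j] for row in rows) / n for j in range(dim)]
    centered = [[row[j] - mean[j] for j in range(dim)] for row in rows]
    # Sample covariance matrix (dim x dim).
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(dim)]
           for a in range(dim)]
    # Power iteration: repeatedly apply the covariance matrix and renormalize.
    vec = [1.0] * dim
    for _ in range(iterations):
        vec = [sum(cov[a][b] * vec[b] for b in range(dim)) for a in range(dim)]
        norm = math.sqrt(sum(v * v for v in vec))
        vec = [v / norm for v in vec]
    return mean, vec


def project(row, mean, axis):
    """Project one sample onto the principal axis (a 1-D feature)."""
    return sum((row[j] - mean[j]) * axis[j] for j in range(len(row)))


# Hypothetical 2-D points that vary mostly along the diagonal.
points = [[1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.8], [5.0, 5.1]]
mean, axis = first_principal_component(points)
print("principal axis:", axis)
print("projections:", [round(project(p, mean, axis), 3) for p in points])
```

The sketch assumes the data is not constant and that the starting vector is not orthogonal to the leading axis. A fuller implementation would extract several components and compare faces by distance in the projected space.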

hearty sky
hollow willow
lost elm
#

Siamese Networks for NLP tasks.

Whenever someone hears about Siamese networks, the first thing that comes to mind is image similarity, as there are so many examples of it on the internet. But while looking into the Keras Core documentation, I stumbled upon Keras-nlp, where I saw a use case of fine-tuning a RoBERTa model with Siamese networks.

Sharing the notebook here on how to do it.

https://www.kaggle.com/code/warcoder/siamese-roberta-networks-with-regression-objective/notebook

glass plank
#

Hi all!
As a person who is fluent in Korean, English and Chinese, I have recently been curious about the differences in processing ideogram-based languages, such as Chinese.

If you are interested, please check out this notebook 🤩
https://www.kaggle.com/code/jasonheesanglee/ideogram-based-vs-phonogram-based-language

As I am a beginner in Data Science and as this is my side project (side study, I will say), my notebook might not be as fluent as my thoughts.
Please do leave any comments if you have any suggestions or would like to collaborate!

cobalt fern
#

Hi all !
I have built an ML model for spam vs. not-spam detection using the Naive Bayes algorithm, and I also learned how to deploy it.
Deployed model 🔗: https://spam13byharsh.streamlit.app/
Git Repo 🔗: https://github.com/harshkumarpatelh/Spam/tree/main
What project should I tackle next as an ML beginner?

lost elm
#

Updated my Earthquake Dataset from 1995 to 1-9-2023

The Earthquake dataset is one of my highest-rated datasets and also my favourite, as it is a very good dataset for beginners to practise data visualization and analysis. I updated it to include the results from the last 8 months and from the years 1995 to 2000.

Link: https://www.kaggle.com/datasets/warcoder/earthquake-dataset

Some of my other geographic and disaster based datasets:

https://www.kaggle.com/datasets/warcoder/civil-aviation-accidents
https://www.kaggle.com/datasets/warcoder/oil-spillage-data

livid abyss
#

I am planning to create more notebooks where I implement various ML/DL algorithms and stuff from scratch. I am open to ideas and suggestions from your end regarding topics, mode of delivery etc. Currently I am planning to bring out notebooks on PCA, Fisher's Linear Discriminant Analysis, GMMs, and soon on some Deep Learning Implementations as well.
https://www.kaggle.com/discussions/general/437309

orchid geyser
#

Hi Kagglers, I'm excited to announce the creation of my biggest project yet - a dataset containing 40K+ listings of Malaysian condominiums, scraped from mudah.my! This was inspired by the Starter Housing Competition and the popular Melbourne Housing Snapshot dataset. As with the other two datasets, the goal is to predict the price of the condominium/apartment using the property's features.

The data is more messy than usual, but this is a good chance to practice data cleaning techniques. I also provide a starter notebook that goes through the data cleaning steps and outputs a clean-ish dataset that can be used for EDA/modelling.

Feel free to play around with the data and let me know of any feedback!

Link to dataset: https://www.kaggle.com/datasets/mcpenguin/raw-malaysian-housing-prices-data

indigo niche
#

Hi professional Kagglers!
We've launched BrainX, a marketplace where a global network of AI talent like you can sell AI/data science services and help clients apply AI to their businesses and solve business problems.

So if you're a professional data scientist, AI/ML engineer, or similar, BrainX could be your next step after Kaggle.
You can learn more from here https://bit.ly/BrainX-for-Kagglers
If you have any questions, feel free to DM me. Thanks!

lilac zinc
#

Hello Kaggle Community!

Exciting news - I've just uploaded a multilabel tweet dataset containing three columns:

Tweet ID (String Format),
Tweet Text: The tweet's actual content,
Labels: These cover a wide range of concerns, including effectiveness doubts and conspiracy theories.

Ideal for sentiment analysis, NLP, and multilabel classification, this dataset offers insights into diverse vaccine concerns shared on Twitter.
Explore it for your projects and research.

https://www.kaggle.com/datasets/prox37/twitter-multilabel-classification-dataset

barren oasis
#

Hello Everyone !

#

Going to publish a paper on breast cancer with Elsevier

#

thought of sharing some algorithmic part beforehand on kaggle

#

Ps: Very new on Kaggle

viscid gull
#

Does it only work with size data?
Not CT scans or sonar?

barren oasis
#

but once the paper comes out

#

we have included the sonar and CT scan (mammography) parts

#

including algorithms like CNN and AlexNet

#

Sorry, for missing that part

#

will definitely upload on kaggle later

viscid gull
dense grove
#

Hello everyone, as the saying goes, “Act Before It Vanishes Forever!!” In that spirit, I just uploaded a dataset to Kaggle containing blood smear images of the endangered Visayan warty pig, which I refined with a number of processing steps before uploading. Please take a look, explore it, and maybe contribute to this work. I would be happy to receive any feedback or suggestions for improvement from the community, and if you think this work is a good contribution, please upvote and share 😁.
Link: https://www.kaggle.com/datasets/tejaskarkera001/juvenile-visayan-warty-pig-blood-samples?select=UpdatedDatasetVisayanWartyPig

frank peak
#

[ Discovering Hidden PySpark Treasures: Unique Tips and Hacks ]

Hello everyone! In this discussion, I share some tips & tricks for processing data with PySpark that are rarely used or not widely known. This discussion summarizes 8 "hidden gems", including code and how to use them.

🌎 Link to Kaggle Discussions: https://www.kaggle.com/discussions/general/438231

I hope you find this topic helpful, whether you have just started using PySpark as a data processing/manipulation tool or want to learn more about it. If you know other cool tips or tricks, let’s discuss them in the comment section. Thank you & have a great day!

noble sleet
white sluice
lost elm
#

Mango Fruit Disease Detection Dataset: This is a multi-class image classification challenge. The dataset contains 1700 images of 224*224 in JPG format. There are 5 categories: Alternaria, Anthracnose, Black Mould Rot, Healthy and Stem & Rot.

Link: https://www.kaggle.com/datasets/warcoder/mangofruitdds

glass plank
# glass plank Hi all! As a person who is fluent in Korean, English and Chinese, I have recentl...

Got some update here!
I am now going through the README.md of a module called "jieba", a Chinese segmentation tool.
My plan is to go through the module and check how it segments text into the correct words.

If you feel like checking it out, feel free to do so!
And if you would like to leave a comment or suggest other or better approaches, please feel free to do so 🤩

https://www.kaggle.com/code/jasonheesanglee/ideogram-based-vs-phonogram-based-language#🀄-Ideogram-based-vs.-Phonogram-based-Language

noble sleet
tiny sail
#

@noble sleet I left comments on your notebook. Also, look at how other people have analyzed this dataset.

noble sleet
glass plank
#

Great to see you here! Always looking forward to your work 🙂
Thanks for all your work!

outer galleon
#

Hi folks, I have created a dataset, and a notebook to showcase how to use this dataset.

The theme of the dataset:
url: https://www.kaggle.com/datasets/lorentzyeung/imdb-video-games-dataset

  1. Action TV games over the years

The notebook:
url: https://www.kaggle.com/code/lorentzyeung/action-game-topic-analysis-recommendation-system

  1. simple data engineering
  2. topic classification
  3. action game recommendation system

Please feel free to drop by. If you find this helpful, please consider sharing it and giving it an upvote! Your support is appreciated.

indigo niche
#

Hi professional Kagglers!
My team at BrainX has launched a marketplace where a global network of AI talent (AI/ML engineers, data scientists, ...) like you can sell AI/DS services and help clients apply AI to their businesses and solve business problems.

I'm wondering if anyone is interested in learning more about it

indigo niche
noble sleet
hollow willow
lost elm
#

Guided notebook for creating custom library in Kaggle environment 🔥

I have created custom libraries in the past to streamline my workflow. The libraries can also be shared easily with teammates, so you don't have to share a whole notebook with them.

Use cases from this notebook:

  1. Create a library for some piece of code one uses regularly
  2. Hide your code in case of competition if you don't want to share
  3. These can be shared easily with teammates
  4. Can be shared as a dataset also so all Kagglers may benefit from it

Notebook link: https://www.kaggle.com/code/warcoder/create-your-custom-library-on-kaggle/

hearty sky
#

Hello everyone,

I wanted to let you know that I've just created a notebook that explains the concept of the normal distribution. This resource will be incredibly helpful in understanding various aspects, such as the area under the curve, probability density function, and more. Feel free to check it out and let me know if you have any questions or need further clarification. Happy learning! https://www.kaggle.com/code/basitarif/normal-distribution/notebook
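For a quick feel of the two ideas mentioned above, the probability density function and the area under the curve, here is a minimal sketch using only the Python standard library. It is not taken from the notebook, and the helper names `normal_pdf`, `normal_cdf`, and `area_between` are hypothetical.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Probability density of a normal distribution at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Area under the density curve to the left of x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def area_between(a, b, mu=0.0, sigma=1.0):
    """Probability that a normal variable falls between a and b."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)


print(f"density at the mean: {normal_pdf(0.0):.4f}")
for k in (1, 2, 3):
    print(f"within {k} sigma: {area_between(-k, k):.4f}")
```

The loop reproduces the familiar 68-95-99.7 rule: the area within one, two, and three standard deviations of the mean.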

frank peak
neat mirage
#

Small update to my beginner's guide to NLP. Go check it out if you are interested in diving into the world of Natural Language Processing 🔥

✅ Comprehensive Overview on NLP for Beginners 🥳 (collection of all series)
https://www.kaggle.com/code/crxxom/comprehensive-overview-on-nlp-for-beginners

🔴 NLP Beginner Series Part 1: NLP Preprocessing
https://www.kaggle.com/code/crxxom/nlp-beginner-series-part-1-nlp-preprocessing

🟡 NLP Beginner Series Part 2.1: Word Embeddings
https://www.kaggle.com/code/crxxom/nlp-beginner-series-part-2-1-word-embeddings

🟢 NLP Beginner Series Part 2.2: Embedding Models
https://www.kaggle.com/code/crxxom/nlp-beginner-series-part-2-2-embedding-models

🟣 NLP Beginner Series Part 3: Case Study
https://www.kaggle.com/code/crxxom/nlp-beginner-series-part-3-case-study

tiny sail
low valley
#

Hello everyone!

I'm very happy to share with you my latest notebook on Linear Regression. I enjoy going back to these basic concepts of Statistics and Data Science and writing about them, as it helps to solidify my understanding.

I believe this one might be very valuable for both beginners and veterans alike, as I delve into how Linear Regression works and demonstrate its use with the Diabetes Dataset from Scikit-Learn.

Here's What You'll Find 📌

📝 Introduction to Linear Regression
🧐 In-depth exploration of its math and foundations
📑 Key assumptions you must know
⚙️ Modeling techniques and evaluation metrics
✍🏻 Conclusion and takeaways
📚 Further Reading for the curious minds
🔗 You can check the Notebook here: https://www.kaggle.com/code/lusfernandotorres/mastering-linear-regression-with-statsmodels/notebook

Feel free to leave your thoughts, suggestions, or questions. Your feedback is not only important for me, but also to the community as a whole.

If you find the notebook helpful, an upvote would be much appreciated!

Stay curious and happy learning! 📚

soft quail
#

Hey Kagglers,

I have been working on detecting Melanoma (a type of cancer).

Little context here:
Melanoma is a type of cancer that can be deadly if not detected early. It accounts for 75% of skin cancer deaths. A solution which can evaluate images and alert the dermatologists about the presence of melanoma has the potential to reduce a lot of manual effort needed in diagnosis.

Sharing my work here, please feel free to evaluate and suggest improvements. Excited for the feedback, LOL.

Here is the notebook, please let me know how you like it.
https://www.kaggle.com/code/iavesh/melanoma-cancer-detection-with-85-acc-cnn

frank peak
tiny sail
soft quail
atomic fog
glass plank
#

Hi! I am back again!

This time, I want to introduce my other notebook -> Wheel Downloader!

This is a notebook that might help the beginners to begin with their competition submission (without an internet connection).

There are many occasions where we need to perform !pip install to download the essential libraries.

When I ran into this, I searched for a solution.
Many people have already shared their own tactics for downloading the libraries, but many of them were not compatible with the current version of the Kaggle platform.

Therefore I have gathered two tactics that work well on the current platform!
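
For context, one commonly used pattern (not necessarily either of the notebook's two tactics) is to run `pip download` in a notebook that has internet, then install from the saved wheels with `--no-index --find-links` in the offline one. The sketch below only builds the commands; the package name and both paths are hypothetical.

```python
import sys


def pip_download_command(package, wheel_dir):
    """Command to run in a notebook WITH internet: fetch wheels into wheel_dir."""
    return [sys.executable, "-m", "pip", "download", package, "-d", wheel_dir]


def pip_offline_install_command(package, wheel_dir):
    """Command to run in the offline notebook: install only from wheel_dir."""
    return [
        sys.executable, "-m", "pip", "install",
        "--no-index", f"--find-links={wheel_dir}", package,
    ]


# Hypothetical package and paths, for illustration only.
download_cmd = pip_download_command("some-package", "/kaggle/working/wheels")
install_cmd = pip_offline_install_command("some-package", "/kaggle/input/my-wheels")
# Each list can be passed to subprocess.run(cmd, check=True).
```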

Please check out the notebook below and share an upvote if you find it useful!

⬇️
https://www.kaggle.com/code/jasonheesanglee/wheel-downloader

noble sleet
lilac zinc
formal pebble
#

Hi everyone 👋
I created my first Kaggle dataset for the RSNA Abdominal Trauma Detection Competition:
RSNA ATD 2023 DICOM Metadata, a dataset that contains all the metadata in the DICOM scan images
Would appreciate your reviews, corrections and feedback 🙂🚀

https://www.kaggle.com/datasets/tobetek/rsna-atd-2023-dicom-metadata/code

Also, here's the notebook that created the dataset (does need a bit of formatting), but I made use of multiprocessing to speed things up. Also had to experiment with shared state between multiple processes to ensure the CSV headers were consistent.
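
A possible alternative to sharing state between processes, sketched here under assumptions and not taken from the notebook: let each worker return a plain dict, then build one header from the union of all keys before writing. `extract_metadata` is a placeholder stub and the sample rows are made up.

```python
import csv
import io
from multiprocessing import Pool


def extract_metadata(path):
    """Placeholder worker: return a dict of metadata for one file.

    A real version would read the DICOM header; this stub is an assumption.
    """
    return {"path": path}


def extract_all(paths, workers=4):
    """Run the worker over all paths in parallel; call under a __main__ guard."""
    with Pool(processes=workers) as pool:
        return pool.map(extract_metadata, paths)


def write_rows(rows, stream):
    """Write dict rows with one header built from the union of all keys."""
    fieldnames = sorted({key for row in rows for key in row})
    writer = csv.DictWriter(stream, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return fieldnames


# Rows with different keys, as workers might return for different scans.
sample_rows = [
    {"path": "a.dcm", "Rows": 512},
    {"path": "b.dcm", "Rows": 512, "KVP": 120},
]
buffer = io.StringIO()
header = write_rows(sample_rows, buffer)
```

The trade-off is that all rows are held in memory until the header is known, which may or may not suit a dataset of this size.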

https://www.kaggle.com/code/tobetek/playing-around-with-dicom

lilac zinc
eager breach
#

Mixed Naive Bayes Classifier Guide: https://youtu.be/1QulO1jS2Hk?feature=shared

Description:

This is part one discussing theory and application of Naive Bayes classifier as a single model for both Categorical and Numeric features. Part two will be the implementation in Python.

lilac zinc
low valley
#

Hey everyone!

I'm glad to share that I just uploaded a new notebook on my Data Science for Financial Markets series.
We're diving into a new way to forecast support and resistance levels based on volatility.

🔗 Check it out here: https://www.kaggle.com/code/lusfernandotorres/volatility-based-supply-and-demand-levels

Plus, I've built a Web App using this methodology so you can forecast yearly support and resistance levels for your own securities.

🔗 Here's the web app: https://huggingface.co/spaces/luisotorres/Volatility-Based-Support-and-Resistance-Levels

Your feedback is extremely appreciated.

Thank you!

hollow wigeon
noble sleet
cunning ingot
#

@buoyant scaffold I am wondering whether it is possible to also carry out statistical tests to assess at least two variables that expose the sample to diabetes?

barren oasis
barren oasis
rare terrace
iron sparrow
tiny sail
eager breach
atomic fog
tiny sandal
lilac zinc
silver hornet
# low valley Hello everyone! I'm very happy to share with you my latest notebook on Linear R...

Great article, very extensive and practical. I would like to add that regression predicts continuous values, whereas classification predicts discrete ones. Linear regression is essentially (as you already mentioned) a supervised linear model that uses the least squares method to fit a line, or a hyperplane in higher dimensions. It approximates a solution to a linear system AX = B that has no exact solution (i.e., an overdetermined system). So under the hood, linear regression finds an approximate solution using linear fitting techniques.
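
To make the overdetermined-system point concrete, here is a small sketch in plain Python (the data and the function name `fit_line` are made up). It solves the normal equations for a straight line through four points that no single line can hit exactly.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x (approximately).

    Solves the 2x2 normal equations (A^T A) w = A^T y, where each row of A
    is [1, x]. With more points than unknowns the system A w = y is
    overdetermined, so this returns the w minimizing the squared residuals.
    """
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    det = n * sum_xx - sum_x * sum_x
    if det == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = (n * sum_xy - sum_x * sum_y) / det
    intercept = (sum_y - slope * sum_x) / n
    return intercept, slope


# Four equations, two unknowns: no line passes through every point exactly.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 4.0, 8.0]
intercept, slope = fit_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
```

The residuals are nonzero but sum to zero, which is what a least-squares fit with an intercept term gives.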

brittle mason
low valley
tacit wasp
#

Hello everyone

#

I have created a project on data analytics

#

I am hoping for your suggestions and upvotes

atomic fog
neat mirage
#

Hi guys 👋 I have recently published a dataset containing metadata scraped from Google News.

About this dataset

This dataset contains metadata of millions of news articles from Google News, including title, publisher, DateTime, link, and category.

This is also an automation project in which data is scraped every day at 4am UTC on 8 major categories. This dataset is expected to have a monthly update, thus the data collected daily will be merged into a single monthly csv file and published on Kaggle at the end of each month. One may expect the value of the dataset to continuously grow through time.
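
The monthly merge step described above could look roughly like this. It is a sketch under assumptions, not the project's actual pipeline: the `<month>-DD.csv` file naming, the identical headers across days, and the sample rows are all hypothetical.

```python
import csv
import tempfile
from pathlib import Path


def merge_daily_files(daily_dir, month, output_path):
    """Concatenate every daily CSV named '<month>-DD.csv' into one monthly file.

    The file-naming pattern and identical headers across days are assumptions.
    """
    daily_files = sorted(Path(daily_dir).glob(f"{month}-*.csv"))
    total_rows = 0
    writer = None
    with open(output_path, "w", newline="", encoding="utf-8") as out:
        for daily_file in daily_files:
            with open(daily_file, newline="", encoding="utf-8") as src:
                reader = csv.DictReader(src)
                if reader.fieldnames is None:
                    continue  # skip empty files
                if writer is None:
                    writer = csv.DictWriter(out, fieldnames=reader.fieldnames)
                    writer.writeheader()
                for row in reader:
                    writer.writerow(row)
                    total_rows += 1
    return total_rows


# Tiny demonstration with two made-up daily files.
work_dir = Path(tempfile.mkdtemp())
for day, title in [("01", "First headline"), ("02", "Second headline")]:
    (work_dir / f"2023-10-{day}.csv").write_text(
        f"title,category\n{title},World\n", encoding="utf-8"
    )
monthly_path = work_dir / "2023_10.csv"
merged_rows = merge_daily_files(work_dir, "2023-10", monthly_path)
```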

If you find this dataset useful, feel free to drop a like. If you have any requests/suggestions/inquiries, feel free to DM me or leave a comment on the dataset 💪

https://www.kaggle.com/datasets/crxxom/daily-google-news

noble sleet
topaz crescent
#

Hiii everyone

#

https://forms.gle/GVNvEb7CUU6S4Eki8

Hello 👋

I'm conducting research to understand student stress and coping mechanisms.

Please take a moment to complete this survey (i swear it will only take 5 mins max!). Your input is crucial in shaping our understanding of student well-being.

Feel free to share this survey with your friends and peers to help us gather more valuable insights!

Your responses will be used solely for educational and research purposes, and your privacy is our priority :))

#

Please do take your time filling this form. Would help with my project a lot 🙏🏻🙏🏻

storm hollow
obtuse knot
#

🔍 Title: Cats-Dogs: Feature Extraction | 99% | Web APP
📈 Achievement: Achieved 99% score! 💯

🌐 Web App: Created a user-friendly web app for interactive exploration.

📚 Related Notebooks: Check out these two additional notebooks for a deeper dive into the project:

  1. Cats-Dogs: MobileNetV2, Xception | 96%, 96% | OOP: https://www.kaggle.com/code/alaa2mahmoud/cats-dogs-mobilenetv2-xception-96-96-oop
  2. Cats-Dogs: Fine Tuning 2 CNN's | 98%, 97% | OOP: https://www.kaggle.com/code/alaa2mahmoud/cats-dogs-fine-tuning-2-cnn-s-98-97-oop

In this main notebook, I utilized advanced feature extraction techniques with both VGG16 and MobileNetV2, resulting in a remarkable 99% accuracy score. But that's not all!

🧠 Key Highlights:

  • Leveraged VGG16 and MobileNetV2 for feature extraction.
  • Trained a Machine Learning model with a 99% accuracy rate.
  • Demonstrated the power of transfer learning.

🌐 Web App Details:

  • Developed a simple user-friendly web app for interactive exploration.
  • Now you can experiment with the model and see its performance firsthand.

I'd love to hear your thoughts and feedback on all the notebooks and the web app. Please check them out and feel free to leave comments or questions. Let's continue learning and growing together! 🌟

📎 Notebook Link: https://www.kaggle.com/code/alaa2mahmoud/cats-dogs-feature-extraction-99-web-app
🌐 Web App Link: https://cats-dogs-classification-app.streamlit.app/

Happy coding, data crunching, and web app exploring! 🚀📊🤖🌐

app

Create with ChatGPT

lilac zinc
atomic fog
eager breach
#

Mixed Naive Bayes Blueprint:
https://youtu.be/wz8rkWFLdPQ?feature=shared

This is Part two implementing Naive Bayes Classifier from scratch in Python; A single model for both Categorical & Numeric data. Check Part one for a refresher where I discuss theory and application intuitively.

We’ll also get introduced to approaching Machine Learning Imbalanced Binary Classification problem; Discussing topics like: Feature En...
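
For anyone curious what "a single model for both Categorical & Numeric data" can look like, here is a minimal from-scratch sketch. It is not the video's code: it assumes a Gaussian likelihood for numeric features and Laplace-smoothed frequencies for categorical ones, and the class name and toy data are made up.

```python
import math
from collections import Counter


class MixedNaiveBayes:
    """Naive Bayes over rows that mix numeric and categorical features."""

    def __init__(self, numeric_idx, categorical_idx, alpha=1.0):
        self.numeric_idx = numeric_idx
        self.categorical_idx = categorical_idx
        self.alpha = alpha  # Laplace smoothing strength

    def fit(self, rows, labels):
        self.classes = sorted(set(labels))
        self.class_sizes = Counter(labels)
        self.log_prior = {}
        self.gauss = {}   # (class, feature) -> (mean, variance)
        self.counts = {}  # (class, feature) -> Counter of category values
        self.levels = {j: {r[j] for r in rows} for j in self.categorical_idx}
        for c in self.classes:
            members = [r for r, y in zip(rows, labels) if y == c]
            self.log_prior[c] = math.log(len(members) / len(rows))
            for j in self.numeric_idx:
                values = [r[j] for r in members]
                mean = sum(values) / len(values)
                var = sum((v - mean) ** 2 for v in values) / len(values)
                self.gauss[c, j] = (mean, var + 1e-9)  # avoid zero variance
            for j in self.categorical_idx:
                self.counts[c, j] = Counter(r[j] for r in members)
        return self

    def _log_score(self, row, c):
        """Log prior plus the log-likelihood of every feature given class c."""
        total = self.log_prior[c]
        for j in self.numeric_idx:
            mean, var = self.gauss[c, j]
            total += -0.5 * math.log(2 * math.pi * var)
            total -= (row[j] - mean) ** 2 / (2 * var)
        for j in self.categorical_idx:
            num = self.counts[c, j][row[j]] + self.alpha
            den = self.class_sizes[c] + self.alpha * len(self.levels[j])
            total += math.log(num / den)
        return total

    def predict(self, row):
        return max(self.classes, key=lambda c: self._log_score(row, c))


# Toy data: one numeric feature and one categorical feature per row.
rows = [
    (1.0, "red"), (1.2, "red"), (0.8, "blue"),
    (3.0, "blue"), (3.2, "blue"), (2.9, "red"),
]
labels = ["a", "a", "a", "b", "b", "b"]
model = MixedNaiveBayes(numeric_idx=[0], categorical_idx=[1]).fit(rows, labels)
prediction_low = model.predict((1.1, "red"))
prediction_high = model.predict((3.1, "blue"))
```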

cunning ingot
#

@lofty arrow interesting to learn that one can use Causal AI to enhance decision-making processes

eager breach
#

Naive Bayes Classifier in just 4 steps: https://youtu.be/mg3iqP78yfs?feature=shared

Refreshing on Probability Rules theory and application; Implementing Naive Bayes Classifier from scratch in Python as a single model for both Categorical & Numeric data.

We’ll also discuss topics like: Feature Engineering, K Fold Cross Validation & Model Evaluation using several tools & metrics (Precision, Recall, Accuracy, Classification Repo...

potent fern
noble sleet
rotund cradle
#

Hi everyone, I am new to the world of data science. I have created a very basic dataset on Kaggle. I'd appreciate it if you could check it out and give me advice, and maybe an upvote. Cheers

https://www.Kaggle.com/minhajalii/datasets

cloud mango
lost elm
atomic fog
brittle mason
mossy turret
ruby nymph
gilded summit
#

Hey everyone, I know that getting started with Firebase (creating collections, managing user data, authentication) might be a hurdle, so I created a Flutter authentication project using Firebase Authentication and the Firestore database with up-to-date dependencies. Do check it out and lemme know if it has helped any of you!!

https://github.com/VishruthVS/FlutterAuthentication

GitHub

Contribute to VishruthVS/FlutterAuthentication development by creating an account on GitHub.

buoyant scaffold
atomic fog
potent fern
low valley
#

Hello, everyone!

I'm very proud to announce my latest Notebook on Kaggle, 🧠 Convolutional Neural Network From Scratch.

This comprehensive guide offers an in-depth exploration into Convolutional Neural Networks (CNNs). You'll gain insights into the core components that power CNNs, their underlying mechanics, and their wide-ranging applications across several industries.

Using the Plant Disease Recognition dataset 🌿, I've built a CNN entirely from scratch, elucidating each step in a granular manner. The notebook employs TensorFlow and Keras for implementation and walks you through the complete process—from data preprocessing to model validation and performance evaluation. 📊

Don't miss out! Dive in and let's demystify CNNs together! Feel free to check it out. 👇

https://www.kaggle.com/code/lusfernandotorres/convolutional-neural-network-from-scratch/notebook

left trench
atomic fog
low valley
#

Feel free to reach out in case you need any help.

atomic fog
#

Thanks

brittle mason
tiny sandal
atomic fog
empty quarry
#

Hello everyone!
I recently created a dataset that contains patient-level data based on a real clinical research trial.

I noticed that there weren't many datasets of this type on Kaggle, so if you want to work in research or the medical field and need experience working with this type of data, I think creating a project with this could help you find a job.

I look forward to seeing what you all make with this!
https://www.kaggle.com/datasets/dillonmyrick/bells-palsy-clinical-trial

buoyant scaffold
#

Hello guys!
This is the first notebook based on my dataset "Cellphones Market Stocks from "Americanas"".
While working on this code, I encountered some issues and confusion, and I'm hoping that by sharing my challenges, we can help each other.
https://www.kaggle.com/code/cauelias/eda-cellphones-market-stocks
You can also check out the dataset and leave a comment or an upvote.
Thanks!

atomic fog
lost elm
#

Sharing my latest datasets published in the last month:

  1. Hyacinth Bean Quality Evaluation: It contains high-quality images of Hyacinth Bean. The dataset consists of 148 Bad and 132 Good images.
    Link: https://www.kaggle.com/datasets/warcoder/hyacinth-bean-quality-evaluation
  2. Mulberry Leaf Dataset: The mulberry leaf dataset is a collection of leaf images from 10 cultivars, taken in natural environments using DSLR cameras and smartphones. The data was collected from three regions of Thailand: northern (Chiang Mai), central (Phitsanulok), and northeast (Nakhon Ratchasima, Buriram, and Maha Sarakham).
    Link: https://www.kaggle.com/datasets/warcoder/mulberry-leaf-dataset
  3. Lumpy Skin Images Dataset: Lumpy skin disease is an infectious disease of cattle and water buffalo.
    Link: https://www.kaggle.com/datasets/warcoder/lumpy-skin-images-dataset
  4. Indian Medicinal Plant Image Dataset: This dataset consists of medicinal leaf images. It comprises 80 distinct Indian leaf varieties renowned for their potent medicinal properties and offers a rich opportunity for advancing healthcare, botanical studies, and machine learning applications.
    Link: https://www.kaggle.com/datasets/warcoder/indian-medicinal-plant-image-dataset
  5. Mexican Sign Language Dataset: Mexican Sign Language (MSL), like other sign languages, has its own grammar rules and gestures to denote a word, so the hand gestures for the same word can vary even among Spanish-speaking countries. Collecting videos of MSL signs helps to develop a methodology to translate hand gestures into words.
    Link: https://www.kaggle.com/datasets/warcoder/mexican-sign-language-dataset
  6. Thai Cannabis Plants Image Dataset: Cannabis in Thailand is well recognized. To be useful for researchers and other interested people, images of plants from 8 Thai cannabis classes are shared in this dataset.
    Link: https://www.kaggle.com/datasets/warcoder/thai-cannabis-plants-image-dataset
edgy quiver
vocal dawn
edgy quiver
ruby nymph
lost elm
#

Infected Date Palm Leaves by Dubas insects

The palm leaf images were categorized based on their health status and the presence of insects, resulting in four categories: healthy, infected with bugs only, infected with honeydew only, and infected by mixed insects and honeydew. Images of leaves infected with insects depict a range of insect life cycle stages, from the third generation of nymphs to the adult stage in the fifth nymph stage. Two drone cameras were employed to capture the images, resulting in a dataset of 3000 images, with 800 per non-bug category and 600 for the bug category. The dataset is valuable for assessing infestation severity, estimating insect populations, and determining the extent of damage.

https://www.kaggle.com/datasets/warcoder/palm-leaves-dataset/

mortal sandal
low valley
#

Hey, everyone! 👋🏻

I'm not sure if it's okay to post projects here that aren't hosted on Kaggle. If this breaks the rules, I apologize in advance. Please, let me know and I'll remove the post if necessary.

Well, a few days ago I posted my Kaggle Notebook, as well as an article on Medium, describing the process of using Keras to build a Convolutional Neural Network for image classification. More specifically, the task was to identify plant diseases.

I'm happy to share that I've deployed this model and it is now available on Spaces for any of you that would like to give it a try.

If you like the project, please leave a like and some feedback. I highly appreciate your time and suggestions for improvement! 🙂

🔗 Here's the link: https://huggingface.co/spaces/luisotorres/plant-disease-detection

upbeat acorn
#

Totally fine to post projects that aren't on Kaggle

flint meadow
#

🚀 Project: Learning Pathway Index 🚀

Hello @everyone,

I am excited to share our project, the Learning Pathway Index! 📚

Project Overview:
The Learning Pathway Index is a collaborative effort designed to enhance the learning experience in the fields of Data Science, Machine Learning, and Artificial Intelligence. We've created a comprehensive guide that curates byte-sized courses and learning materials to streamline your learning journey.

Useful Links:

We would love your feedback and collaboration as we continue to evolve this project. Let's make learning in data science, machine learning, and AI more accessible!

Happy learning! 🌟

Thanks and Regards
Manish Kumar

GitHub

A repo with data files, assets and code supporting and powering the Learning Path Index Project - GitHub - neomatrix369/learning-path-index: A repo with data files, assets and code supporting and p...

errant panther
#

Hey. I have completed and successfully submitted my first project on Kaggle. I have learnt a lot and acquired new skills thanks to this program. I have attached my project links below, so feel free to check them out. I welcome feedback on ways I could improve the model.

  1. https://books.google.com/books?id=6BrxDwAAQBAJ&printsec=copyright
  2. https://github.com/Apress/mastering-ml-w-python-in-six-steps-2e
    Happy Coding.
GitHub

This repository contains a Python application that uses an XGBoost classifier to make predictions based on a CSV dataset. The application is designed to run on your local machine and use it for mak...

GitHub

Source Code for 'Mastering Machine Learning with Python in Six Steps, 2nd Edition' by Manohar Swamynathan - GitHub - Apress/mastering-ml-w-python-in-six-steps-2e: Source Code for &a...

azure sky
#

Hi Kagglers! After many years in the energy industry, I set out to build a deep learning model to improve day-ahead demand forecasting accuracy for electricity market participants. After testing several algorithms against a baseline forecasting benchmark, the best model beat the benchmark accuracy by 22%. The estimated financial benefit to the network operator is 26% or GBP112 in one day for the five thousand customer cohort. That’s GBP0.02 per customer per day. You can find all the code and run the models here. Thanks to Aaron Epel and James Skinner for their thoughtful collaboration on this project.
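
The per-customer figure follows directly from the two numbers quoted above; as a quick check of the arithmetic (variable names are made up):

```python
daily_benefit_gbp = 112  # stated daily benefit for the whole cohort
cohort_size = 5000       # "five thousand customer cohort"

# Benefit per customer per day, before and after rounding to pence.
per_customer_per_day = daily_benefit_gbp / cohort_size
per_customer_rounded = round(per_customer_per_day, 2)
```

So the quoted GBP0.02 is GBP0.0224 rounded to two decimal places.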

It would be great to get some feedback and suggestions! https://jamesaksanders.com/2023/10/20/nailing-electric-load-forecasting-with-deep-learning/

I set out to build a deep learning model to improve day-ahead demand forecasting accuracy for electricity market participants.  After testing several algorithms against a baseline forecasting bench…

neat mirage
#

Daily Google News (monthly update)

October daily news updated! 100


This dataset contains metadata of millions of news articles from Google News, including title, publisher, DateTime, link, and category.

This is also an automation project in which data is scraped every day at 4am UTC on 8 major categories. This dataset is expected to have a monthly update, thus the data collected daily will be merged into a single monthly csv file and published on Kaggle at the end of each month. One may expect the value of the dataset to continuously grow through time.

https://www.kaggle.com/datasets/crxxom/daily-google-news/data
low valley
#

🚀 Hey everyone!

I'm happy to share my latest Kaggle notebook on Transformer models and fine-tuning of BART using the SamSum dataset for dialogue text summarization.

I've put a lot of effort into it, and I believe it'll be highly valuable for anyone looking to enhance their knowledge on NLP tasks and Large Language Models. Check it out and let me know your thoughts! Feedback, questions, and discussions are always welcome!

Here's the link!
🔗 https://www.kaggle.com/code/lusfernandotorres/text-summarization-with-large-language-models/notebook

prisma bay
tiny sandal
#

Hello everyone!

I am excited to share my latest Kaggle notebook with you all. In this notebook, I have implemented a DCGAN from scratch and trained it on the Anime Face Dataset so as to generate realistic anime images

I would love to hear your feedback and thoughts on my notebook, so please do feel free to comment and share your views. In case you do find this notebook helpful, please do not hesitate to give it an upvote or share it

https://www.kaggle.com/code/akshitsharma1/anime-art-with-dcgan-generate-stunning-faces

Thanks a lot for your time and support 🙂

tiny sandal
#

Hello everyone!

I am excited to share my latest Kaggle notebook with you all. The main point of creating this notebook was to explain all the major concepts related to convolutional neural networks in an interesting & easy to understand way. I have tried my best to add illustrations wherever possible so that it aids in deeper conceptual understanding & retention.

I would love to hear your feedback and thoughts on my notebook, so please do feel free to comment and share your views. In case you do find this notebook helpful, please do not hesitate to give it an upvote or share it

Link: https://www.kaggle.com/code/akshitsharma1/generative-adversarial-networks-gan-in-one-shot

Thanks a lot for your time and support 🙂

forest hinge
#

I just uploaded a video on the House Price Predictions project: https://www.youtube.com/watch?v=UqmulHG4IvY&t=1s&ab_channel=RyanNolanData

Welcome to our latest data science project! In this exciting YouTube tutorial, we'll dive into the world of advanced regression analysis using Kaggle's House Prices dataset. When working on the project, the code was able to achieve a top 10% score!

Kaggle Notebook: https://www.kaggle.com/code/ryannolan1/kaggle-housing-youtube-video

Email: ryan...

warm wave
#

📊 Exciting Data Science Project! 📈

I'm thrilled to share my recent work on the 2012 US Army Anthropometric Survey (ANSUR II) dataset. This comprehensive dataset, representing the entire US Army force, has immense potential in various domains, from military applications to commercial and academic research.
For detailed information about the dataset, you can look at Data Dictionary: https://lnkd.in/d2bVMEhf

In this project, I've performed a thorough analysis, including data cleaning, handling missing values, and managing outliers. The highlight is the application of machine learning models, featuring Logistic Regression, Support Vector Classifier, Random Forest, and XGBoost. The models are evaluated, compared, and enhanced to address class imbalance using techniques like SMOTE.
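
For readers unfamiliar with SMOTE, the core idea is to create synthetic minority samples by interpolating between a real sample and one of its nearest minority neighbours. The sketch below is a simplified from-scratch version of that idea, not the project's code and not the imbalanced-learn implementation; the function name and data are made up.

```python
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance.
        others = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Made-up minority-class samples with two features each.
minority_class = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.4), (0.9, 2.1)]
new_points = smote_like_oversample(minority_class, n_new=6)
```

Every synthetic point lies on a segment between two real minority points, so it stays inside the region the minority class already occupies.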

One of the key aspects of this project is the utilization of SHAP values for feature selection. This technique offers valuable insights into the significance of each feature in our models.

You can find the project details and code on my GitHub repository, and I encourage you to explore the dataset and share your insights.
GitHub Link: https://github.com/huseyincenik/machine_learning/tree/main/Project/the_ultimate_guide_to_multiclass_classification_for_predicting_race
Kaggle Link: https://www.kaggle.com/huseyincenik/the-ultimate-guide-to-multi-class-classification
Let's continue the discussion and collaboration! I'd love to hear your thoughts and insights on this fascinating dataset.

ruby nymph
round sand
#

Hi all, I want to share a short story with you. When my team and I were doing our graduation project, we could hardly find any dataset of hand-drawn circuit components. There were a few, but they were either not publicly available or not suitable for our case. Then we came across a paper describing how its authors collected their dataset. We decided to follow their methodology, and for two days we roamed the whole university asking different students to draw circuit components for us. The next step was cleaning the dataset and getting it ready for use in our hand-drawn circuit components classifier. After we graduated, I decided to share the dataset we collected to make it easier for anyone to find a publicly available one. Here is the dataset on Kaggle: https://www.kaggle.com/datasets/moodrammer/handdrawn-circuit-schematic-components

tiny sandal
empty quarry
#

Hello everyone!

I've created a new dataset that contains school performance of high school students, as well as their demographic, social, parent, and study data.

If you're interested in education and predicting student outcomes I think you'll really enjoy this dataset! I look forward to seeing what you make with it!

https://www.kaggle.com/datasets/dillonmyrick/high-school-student-performance-and-demographics

lost elm
lost elm
wide compass
#

Hey everyone 🌞🤗, I recently wrapped up my final project for KaggleX Cohort. As part of my final project I created two datasets, which I would like to share with the community. The inspiration behind my project was to explore the representation of BIPOC in data science, and different aspects like gender-ratio, unemployment etc.
Tech Diversity Dataset:
https://www.kaggle.com/datasets/snehilsanyal/tech-diversity-dataset
This is a collection of real diversity datasets collected from big tech companies' diversity reports from 2014-2023 (soon to be updated with other companies).
US Data Scientist Demographics Data:
https://www.kaggle.com/datasets/snehilsanyal/us-data-scientist-demographics-data/
This dataset explores data scientist demographics in the US (race and ethnicity, gender ratio, unemployment rate) from 2010-2021.

Please feel free to reach out in case of suggestions and feedback. I also plan to extend this dataset and explore features like dropouts in career, layoffs, career transitions and salary.

lost elm
paper gazelle
#

Hello, dear DS community!

There is a course on Computer Vision on Kaggle.
I made a 'practical guide' to it.

https://www.kaggle.com/code/ivanlydkin/computer-vision-course-practical-guide

Some cool stuff in there:

  • transfer learning
  • custom Convolutional Neural Network
  • search for the best weight ratio between models when voting
  • training on TPUs
  • a lot of plain English comments explaining all that.
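
One way to read the weight search for voting in the list above is a grid search over the blend weight between two models' predicted probabilities. The sketch below shows that interpretation only; it is not the notebook's code, and the validation predictions are made up.

```python
def accuracy(probabilities, labels, threshold=0.5):
    """Share of binary predictions that match the labels."""
    hits = sum((p >= threshold) == bool(y) for p, y in zip(probabilities, labels))
    return hits / len(labels)


def best_blend_weight(probs_a, probs_b, labels, steps=20):
    """Grid-search w in [0, 1] for the blend w * A + (1 - w) * B."""
    best_w, best_acc = 0.0, -1.0
    for i in range(steps + 1):
        w = i / steps
        blended = [w * a + (1 - w) * b for a, b in zip(probs_a, probs_b)]
        acc = accuracy(blended, labels)
        if acc > best_acc:
            best_w, best_acc = w, acc
    return best_w, best_acc


# Made-up validation predictions from two hypothetical models.
labels = [1, 0, 1, 1, 0, 0]
model_a = [0.9, 0.6, 0.4, 0.8, 0.3, 0.2]
model_b = [0.6, 0.2, 0.7, 0.4, 0.6, 0.1]
weight, blended_accuracy = best_blend_weight(model_a, model_b, labels)
```

In this toy case each model alone gets 4 of 6 right, while a blend of the two gets all 6, which is the kind of gain a weight search is looking for.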

I am very much looking forward to the feedback and your critique!

Smooth code and thanks for all the fish 😇
Cheers!

empty quarry
#

Hey everyone!
If you don't know much about clustering or you've never made a clustering project before, I'd like to invite you to check out my clustering notebook.

This is a simple but practical implementation of clustering that is easy for beginners to pick up and understand, and it will also give you an example of how clustering can be used to solve business problems.

Thank you!
https://www.kaggle.com/code/dillonmyrick/kmeans-clustering-credit-card-users

lost elm
tiny sandal
tiny sandal
tiny sandal
spring violet
#

Thrilled to share that I've just released my very first dataset on Kaggle! Along with this dataset I've also created a Seaborn beginner-friendly tutorial notebook.

Kindly review them and leave your sincere feedback, thoughts, suggestions, or any improvements you'd like to see.

Dataset- Banking Sector of UEMOA: https://www.kaggle.com/datasets/waalbannyantudre/banking-sector-of-the-waemu/data

Tutorial- Statistical Data Visualization with Seaborn 📊: https://www.kaggle.com/code/waalbannyantudre/statistical-data-visualization-with-seaborn

Thank you for your support and happy kaggling 😁!

compact jasper
#

Hello, I have recently released a Streamlit component for text annotation!

With this text annotation tool, users can streamline their text analysis and annotation processes. Whether you’re working on natural language processing, machine learning, or other text-based projects, this component can help you to efficiently annotate and organize your data.

Overall, I’m proud to have developed this Streamlit component and hope that it proves useful to those working with text data. Feel free to check it out and let me know what you think!

A small star on the github repo if you are interested! ⭐️

⭐️ Source code: https://github.com/rmarquet21/st-text-annotator
🖥️ Demo: https://st-text-annotator.streamlit.app/
🐍 Pypi: https://pypi.org/project/st-text-annotator

void olive
#
Medium

To achieve the goal of creating an NLP Query Engine capable of responding to product-related inquiries on BigBasket, we’ll leverage a…

GitHub

NLP English Language Query Engine, Extensively for Product on Big Basket. - GitHub - s-brajendra/BigBasket-s-products-Query-Engine: NLP English Language Query Engine, Extensively for Product on Big...

lost elm
#

Hello everyone, sharing a pretty interesting dataset

https://www.kaggle.com/datasets/warcoder/electrical-wiring-faults-detection/

This dataset contains images of a single computer case with multiple configurations of two power supply units, two cooling components, and four SATA cables with several wiring configurations, including various induced faults. The aim is to have Predictive Maintenance for Electrical Wiring Faults.

low valley
#

Hello everyone!

I'm happy to share with you my latest Kaggle Notebook, Audio Data: Music Genre Classification.

In this project, we will explore the unique aspects of audio processing and its distinction from other data types. We'll also conduct exploratory analysis, execute preprocessing steps, and ultimately fine-tune the HuBERT model on the GTZAN dataset for the task of audio classification.

Your comments and suggestions are always welcome and greatly valued.

If you enjoy this notebook, please consider leaving an upvote.

Thank you so much!

Here's the link:
🔗 https://www.kaggle.com/code/lusfernandotorres/audio-data-music-genre-classification/notebook

atomic fog
lost elm
tiny sandal
gilded canopy
#

Hi everyone! I have created a little app which visualises the process of training a neural network.

It may be interesting for you if you are a beginner and want to try different neural network architectures hands-on and see how they train. It may also be interesting if you want to see an example of how to implement a neural network algorithm from scratch.

Will be grateful for any feedback!

https://github.com/KonstToIT/train_neural_network_app

GitHub

train neural network app. Contribute to KonstToIT/train_neural_network_app development by creating an account on GitHub.

tiny sail
warped trail
neat mirage
#

November update is out now

This dataset contains metadata of millions of news articles from Google News, including title, publisher, DateTime, link, and category.

This is also an automation project in which data is scraped every day at 4am UTC on 8 major categories. This dataset is expected to have a monthly update, thus the data collected daily will be merged into a single monthly csv file and published on Kaggle at the end of each month. One may expect the value of the dataset to continuously grow through time.

If you find this dataset useful, feel free to drop a like. If you have any requests/suggestions/inquiries, feel free to leave them in the comment section as well.

https://www.kaggle.com/datasets/crxxom/daily-google-news?select=2023_10.csv

tiny sandal
#

Hello everyone!

In this notebook, I have implemented the VGG16 architecture and evaluated its performance by fine-tuning it on the Chest X-Ray Images (Pneumonia) dataset. I would love to hear your feedback and thoughts on my notebook, so please feel free to comment and share your views. If you find this notebook helpful, please do not hesitate to give it an upvote or share it.

https://www.kaggle.com/code/akshitsharma1/pneumonia-detection-using-vgg16-transfer-learning

Thanks a lot for your time and support 🙂

pure scroll
atomic fog
#

Hey everyone!

I just made my first "project" and dataset: https://www.kaggle.com/datasets/bjrnwikstrm/zero-to-hero
It is the "Zero-Hero DataScience Toolkit," a comprehensive and meticulously curated collection of Python snippets and tools designed to take you from a beginner to a proficient data scientist. This toolkit is not just a collection of code; it's a gateway to mastering the art and science of data analysis, visualization, and machine learning.

If you like it, I would be glad to see you folks adding to the set!

I initially created the Zero-to-Hero dataset for my personal use, as a way to navigate the challenges posed by my ADHD and autism, particularly around working memory issues. Often, when I struggled to remember concepts or felt like I wasn't learning, it was easy to become disheartened. To combat this, I began compiling a list of 'snippets' – small, manageable pieces of information and code that I could easily refer back to.

As I progressed, I realized that not only was I learning through the process of writing and utilizing these snippets, but also that others could benefit from this approach. Understanding each snippet's purpose was crucial for my learning, and I decided to expand this list into a comprehensive resource. The goal of the Zero-to-Hero dataset is to empower even the most novice individuals with an interest in data science and data analysis. It's designed to instill a sense of capability and achievement, while they learn and grow in the field, much like I did.

lost elm
tiny sandal
lost elm
#

Dataset of Handwritten Arabic Characters with Harakat (Fathah, Kasrah and Dhammah)

The ḥarakāt, which means 'motions', are the short vowel marks.

The dataset consists of 2,464 RGB images grouped into 101 classes. The images are in .png format, 300×300 pixels, with bit depths of 24, 32, and 64.

https://www.kaggle.com/datasets/warcoder/dataset-of-handwritten-arabic-characters/

pure scroll
lost elm
#

Resistance Spot Welding Insights: A Dataset Integrating Process Parameters, Infrared, and Surface Imaging

The database serves as a comprehensive record of the spot welding process, including:

  1. Monitoring of crucial process parameters, such as current and force, during nugget formation.
  2. A record of input parameters, such as current, welding time, and force applied to the electrodes, along with material characteristics such as thickness and material type.
  3. Welding outputs, such as mechanical resistance and nugget diameter, along with their corresponding classification. Additionally, images of the melting point (nugget) were taken with a thermographic camera and a digital camera so they can be related to the input parameters of each unit built.

https://www.kaggle.com/datasets/warcoder/resistance-spot-welding-insights/

pure scroll
lost elm
potent garden
pure scroll
#

Hi guys, try this amazing project.
I used DeepFloyd to create illusion images. Here's how it works:
You pass two prompts and two operations. The first operation leaves the image as it is (it is just the first image), and the second is a transformation such as flip or jigsaw (which shuffles the image into puzzle pieces). After applying the second operation, you should get the image described by the second prompt.
https://www.kaggle.com/code/sujaykapadnis/visual-anagrams-creating-illusions

#

For these prompts:
prompts = [
'an oil painting of a snowy mountain village',
'an oil painting of a horse'
]

I got the following results:

lucid aspen
lost elm
lost elm
stray shale
#

Hi everyone 👋🏻,
I'm thrilled to share my latest Kaggle notebook from the ML StudyTime collection with you. The primary goal behind creating this notebook was to elucidate some Machine Learning concepts in an engaging and accessible manner. I've made an effort to incorporate illustrations wherever possible to enhance conceptual understanding and retention:

https://www.kaggle.com/code/arezalo/ml-studytime-11-maximum-likelihood

Thank you so much for your time and support! 😊

lost elm
fallow mason
lost elm
#

Human tracking dataset of 3D anatomical landmarks and pose key points

This dataset associates 2D and 3D human pose key points, estimated from images with MediaPipe, with the location of their corresponding 3D anatomical landmarks. It consists of 567 movement sequences of 71 participants in A-Pose and performing 7 movements (walking, running, squatting, and four types of jump).

https://www.kaggle.com/datasets/warcoder/human-tracking-dataset-of-3d-anatomical-landmarks/

fresh wraith
radiant cedar
lost elm
lost elm
frank peak
#

Hey everyone 👋,

It's been a while since my last Kaggle discussion, so I finally decided to write this one. It is a summary of the first Data Wizard online meetup (join the community: https://www.linkedin.com/company/data-wizards-community/), where I explained everything I know about elevating your data visualizations with Python libraries. I also cover several visualization concepts and the resources I use when creating visualizations.

🔗 Link to discussion:
https://www.kaggle.com/discussions/general/461577

Feel free to check it out, and do let me know your thoughts/feedback.
Thank you! 🙇‍♂️

lost elm
potent garden
#

Hey all, please check out my notebooks and upvote if you find them helpful: https://www.kaggle.com/sahityasetu/code

empty quarry
#

Hello everyone!
I wanted to share one of my notebooks with you that I thought could be of use to some of you.
I used to work for a digital advertising company, and while I was there we would often perform A/B tests of different advertising campaigns to see what kind of campaign would perform better for a given client.

I created the notebook below using this type of analysis, so for those of you who want to work in digital marketing/advertising or e-commerce, this is a good project for you to make and discuss in interviews.
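
The kind of campaign comparison described above can be sketched with a two-proportion z-test. This is a minimal illustration only, not the method or data from the linked notebook: the function name and the conversion counts are made up, and it assumes samples large enough for the normal approximation to hold.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates (hypothetical helper).

    conv_a / conv_b: number of conversions in campaigns A and B
    n_a / n_b:       number of visitors shown campaigns A and B
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis "A and B convert equally"
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Made-up example: campaign A converts 100 of 1000 visitors, campaign B 150 of 1000
z, p = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z = {z:.2f}, p-value = {p:.4f}")
```

A small p-value suggests the difference in conversion rates is unlikely to be due to chance alone; the significance threshold (commonly 0.05) should be chosen before running the test.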

Thanks!
https://www.kaggle.com/code/dillonmyrick/a-b-test-hypothesis-testing-for-e-commerce

lost elm
potent garden
spring violet
tired pelican
stray shale
#

Hi everyone 👋🏻,

I'm thrilled to share my latest Kaggle notebook from the ML StudyTime collection with you. The primary goal behind creating this notebook was to elucidate some Machine Learning concepts in an engaging and accessible manner. I've made an effort to incorporate illustrations wherever possible to enhance conceptual understanding and retention:

https://www.kaggle.com/code/arezalo/ml-studytime-12-outlier-noise/notebook

Thank you so much for your time and support! 😊

low valley
#

Hello, everyone! 👋

I'm happy to share with you my latest Kaggle notebook, Evaluation Metrics for Regression Models.

In this very short and straightforward notebook, we will go through the Math behind the most commonly-used evaluation metrics in regression tasks, understand how to interpret them and how to define your own custom functions to compute them using only Python and nothing else!
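
As a rough idea of what such custom functions can look like, here is a minimal sketch in plain Python. It is not taken from the notebook: the function names are illustrative, and it assumes the inputs are equal-length lists of numbers.

```python
def mae(y_true, y_pred):
    """Mean absolute error: average size of the errors, in the target's units."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean squared error: penalises large errors more heavily than MAE."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: MSE brought back to the target's units."""
    return mse(y_true, y_pred) ** 0.5

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Made-up values for illustration
y_true = [3, 5, 7]
y_pred = [2, 5, 9]
print(mae(y_true, y_pred), mse(y_true, y_pred), rmse(y_true, y_pred), r2(y_true, y_pred))
```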

Here is the link!
🔗 https://www.kaggle.com/code/lusfernandotorres/evaluation-metrics-for-regression-models

Your feedback and suggestions are highly appreciated.
Thank you! 🤗

potent garden
lapis oar
#
late oriole
#

hey! my friend and I are training models to clean up unstructured data. if you want to try it, dm me. you can plop in a bunch of docs (or connect your s3 bucket), specify your fields, and get a clean data table to query from.

potent garden
lofty barn
solid mountain
potent garden
fresh wraith
neat mirage
frank peak
#

[Data Slices S01.E02: A 365 Day Emotional Journey in Color]

Here is a look at my emotional journey, which I documented daily in 2023. I decided to make it into figures to evaluate my emotions, personal growth, small details appreciation, and prepare to start over in 2024. This year, I've accepted new challenges, established true connections, and found happiness in every moment of every single day, so 2023 has been a year filled with satisfaction and gratitude for me.

🔗LinkedIn post: https://www.linkedin.com/posts/caesarmario_data-slices-a-365-day-emotional-journey-activity-7147406095039275009-abBr?utm_source=share&utm_medium=member_desktop

🧑‍💻Code to create the figures: https://github.com/caesarmario/data-slices/tree/main/20231108

Happy New Year to you, all my friends! I wish you a happy new year with good luck, health, and prosperity.
https://media.discordapp.net/attachments/1130784683907612764/1191201123848179814/data_slices_s01e02_mood_calendar-min.png?ex=65a4937f&is=65921e7f&hm=ab157f96acd5bdcbdd3da228911f23cfddfd942a5220e35135d3a93a40946d74&=&format=webp&quality=lossless&width=439&height=663

storm thistle
#

Hey everyone!
#KagglingWithKhushee is a series wherein I post daily updates about the best notebooks, datasets, and discussion threads that I stumble upon on Kaggle, and some food for thought.

While often confused, computational intelligence (CI) and artificial intelligence (AI) are two sides of the same coin, with subtle differences.

Think of AI as the vast landscape of intelligent machines, and CI as a specialized toolkit within it. This toolkit draws inspiration from nature, like evolution and swarm intelligence, to create algorithms that adapt and learn in complex environments.

CI's main pillars:

  • Evolutionary Computation: Evolves solutions like Darwinian evolution (think genetic algorithms).
  • Swarm Intelligence: Mimics collective behavior (e.g., ant colony optimization).
  • Fuzzy Logic: Embraces uncertainty for nuanced solutions.
  • Artificial Neural Networks: Learn and adapt from data like the human brain.

Among these, genetic algorithms (GAs) shine in tackling complex problems with many variables. They mimic natural selection, iteratively evolving solutions towards better outcomes. Imagine creating solutions, selecting the best, "breeding" them to combine strengths, and introducing random mutations to avoid stagnation. Over time, GAs tend to converge on good, often near-optimal, solutions, although they do not guarantee the global optimum.
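
The create, select, breed, and mutate loop described above can be sketched in a few lines. This is a toy example and not code from the linked notebook: it assumes a "OneMax" problem (maximise the number of 1s in a bit string), and the function names and parameter values are illustrative.

```python
import random

def fitness(bits):
    """Toy fitness: number of 1s in the bit string (the 'OneMax' problem)."""
    return sum(bits)

def evolve(n_bits=20, pop_size=30, generations=40, mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    # Create: random initial population of bit strings
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best_per_gen = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        best_per_gen.append(fitness(pop[0]))
        # Select: keep the fitter half unchanged (elitism)
        elite = pop[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            # Breed: one-point crossover between two distinct elite parents
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_bits)
            child = a[:cut] + b[cut:]
            # Mutate: flip each bit with a small probability
            child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
            children.append(child)
        pop = elite + children
    pop.sort(key=fitness, reverse=True)
    best_per_gen.append(fitness(pop[0]))
    return pop[0], best_per_gen

best, history = evolve()
print("best fitness per generation:", history)
```

Because the fitter half is carried over unchanged, the best fitness never decreases from one generation to the next; whether the run actually reaches the optimum depends on the settings and the random seed.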

I've created a dedicated notebook to explore GAs, with deep explanations, code examples, and real-world applications. Link: https://www.kaggle.com/code/khusheekapoor/genetic-algorithm/notebook

Remember, CI and AI are partners, not rivals. Understanding their differences gives you the right tool for the job, empowering you to harness the potential of intelligent machines.

fickle pagoda
#

🎇 Kaggle's Top 20 Lists of 2023!

2023 has ended and I've always wanted to know the top 20 of certain categories here in Kaggle.

This notebook tries to answer the following questions:

  1. Who are the top 20 users that created the most threads in 2023?
  2. Who are the top 20 dataset authors who had the most upvotes in 2023?
  3. What are the top 20 lowest scored threads?
  4. Who are the top 20 users who won the most medals in competitions?
  5. (AND MUCH MUCH MORE!)

Attached is a screenshot of one of the highlights

I can see @ravi20076 , @cdeotte, and @mpwolke in the top 3 of those who received the most 2023 message upvotes!

I hope you are intrigued!
Here is the notebook > https://www.kaggle.com/code/bwandowando/kaggle-s-top-20-lists-of-2023

I've had fun working on this notebook and this has to be the notebook that I've spent the most time working on, but I believe that this was all worth the effort and I know that Kaggle members would be interested to see a top-20 list of 2023 categories in Kaggle.

I've prioritized the categories that I believe mattered the most, but if you have suggestions on what additional categories to add, then let me know!

Thank you, YES YOU!, for being a part of my amazing journey in Kaggle in 2023. I'm looking forward to more amazing interactions and discussions with the community!

HAPPY NEW YEAR!

low valley
#

The Transformer architecture, presented in the research paper Attention Is All You Need, is a revolutionary step towards the most advanced AI models we have available today.
In my latest Kaggle notebook, I have explored the Transformer architecture for language-translation tasks, building its core components from scratch using PyTorch and training it on the OpusBook dataset.

This notebook is a must-read piece for everyone who wishes to enhance their understanding of the Transformer model and how it works.
You can read it by clicking on the link below!

🔗 https://www.kaggle.com/code/lusfernandotorres/transformer-from-scratch-with-pytorch/notebook

Thank you very much!

barren oasis
frank peak
#
west karma
#

I analyzed Turkish users using Kaggle metadata and published the results as a notebook. I also published a dataset that includes Turkish cities and regions.

I would be very happy if you checked them out. Thanks!

Notebook: https://www.kaggle.com/code/sanlian/kaggle-turkish-user-statistics

Data: https://www.kaggle.com/datasets/sanlian/turkiye-sehirler-bolgeler

low valley
tall badger
#

Very nice, I will work with this and then share.

sacred knoll
hybrid moth
copper scroll
low valley
#

Large language models are one of the most amazing tools to have surfaced in recent years. But, although powerful, they still have some shortcomings. Hallucinations happen when a model comes up with a convincing answer to something it doesn't really know.

Retrieval-augmented generation (RAG) is one of the ways we can help a large language model reduce its hallucinations and provide more accurate, source-based answers to the questions a user asks.

In my recent Kaggle Notebook, Retrieval Augmented Generation with Mistral 7b 📁, we explore how to build a RAG system that fetches relevant information from documents to power an LLM response.

You can check the notebook in the link below 👇🏻:
🔗 https://www.kaggle.com/code/lusfernandotorres/retrieval-augmented-generation-with-mistral-7b/notebook

Thank you very much!

spice dove
#

👋 Hi Kaggle Community,

I'm Harry, and I'm deeply interested in sports analytics, especially in soccer/football. 🥅⚽ I'm currently developing an ML model using xgboost to predict the number of goals in a match.

I am keen to hear suggestions that could enhance the accuracy of my model!

Also, if you're into data-driven sports predictions or have experience in this arena, I'd love to chat.

https://www.kaggle.com/code/harrycarson11/predicting-home-goals-in-epl-soccer-football/notebook

potent garden
lofty barn
plucky oasis
lofty barn
frank peak
frank peak
#
pure edge
#

Excited to share my latest project on #DataScience! 📊 Leveraging #MachineLearning for meaningful insights. As a #TechEnthusiast, I'm thrilled to unveil my Kaggle notebook focusing on Netflix's Best 🌟: Movie & Series Recommendations! 🎬 #AI

In this project I have shown :
📚 Import Relevant Library
📊 Basic Understanding of Data
🔍 Exploratory Data Analysis
🛠️ Feature Engineering
🧹 Data Preprocessing or Cleaning
❓ Dealing with Missing Values
🏷️ Feature Encoding
📈 Outlier Detection
🎯 Feature Selection
⚖️ Feature Scaling
🤖 Building ML Model
🔄 Automate ML Model
📊 Model Performance Comparison
🎯 Hyperparameter Tuning of Different Models
🔄 Making Stacking Model
📊 Printing Stacking Model Accuracy on Training and Test Data

Check out the Kaggle notebook for this project [https://www.kaggle.com/code/mehedithedreamer/netflix-s-best-movie-series-recommendation] 📗, and please provide your feedback and support! 🚀 #MMM

pure edge
#

🚀 Excited to unveil my latest voyage into the world of #DataScience! 🌐 Explore the enchanting world of #MachineLearning and unveil insights into 🏡💰House Price Prediction 💰 🏡. Let's kick off this tech journey! 💻 #AI

🔍 Journey Highlights:
📚 Importing Relevant Libraries
📊 Navigating the Data Landscape
🔍 Unearthing Insights through EDA Magic
🛠️ Crafting Features with finesse
🧹 Mastering the Art of Data Cleaning
❓ Tackling the Mysteries of Missing Values
🏷️ Encoding Features for Power
📈 Detecting the Mavericks - Outlier Hunt 🕵️‍♂️
🚀 Dealing with Outliers 🔄
🎯 Feature Selection
⚖️ Scaling Features for Harmony
🤖 Crafting the Perfect ML Model
🔄 ML Automation - Because Time is Precious
📊 A Grand Performance Showcase
🎯 Fine-tuning Model Hyperparameters
🔄 Elevating the Game with a Stacking Model
📊 Witness the Stacking Model's Triumph on Training and Test Data!

Dive into the details on my Kaggle notebook [https://www.kaggle.com/code/mehedithedreamer/house-price-prediction] 📗, and let your thoughts soar! 🚀 Your feedback and support mean the world to this #TechEnthusiast! 🌐✨
#MMM

eager breach
sonic cave
fallow mason
#

Hi guys!
Check out my image captioning project with text-to-speech, which helps visually impaired individuals. I used Inception V3 to generate image features, then built custom encoder, decoder, and attention layers.
https://www.kaggle.com/code/krishna2308/eye-for-blind

dusky zephyr
#

Nice documentation! The visualization in the forecast section is great.

barren oasis
fresh wraith
barren oasis
barren oasis
fickle pagoda
lofty barn
solid mountain
solid mountain
lofty barn
low valley
#

Hello, everyone! 👋🏻

I am happy to share my latest notebook, Options Trading: Long & Short Straddle 📈.
In this brief notebook, I approach the intricacies of options trading and present two strategies: long straddle and short straddle.
Both strategies allow traders to speculate on future price movements of the underlying asset: the long straddle looks for profit in high-volatility scenarios, and the short straddle in low-volatility scenarios.
I have also built a web app with Streamlit so you can input your own tickers and values. Feel free to try it!
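
The two strategies mentioned above can be sketched as profit-and-loss functions at expiry. This is a simplified illustration and not code from the notebook or the app: it assumes one call and one put at the same strike, per-share values, and no fees; the function names and numbers are made up.

```python
def long_straddle_pnl(spot_at_expiry, strike, call_premium, put_premium):
    """P&L at expiry of buying one call and one put at the same strike."""
    call_payoff = max(spot_at_expiry - strike, 0.0)
    put_payoff = max(strike - spot_at_expiry, 0.0)
    # The buyer pays both premiums up front
    return call_payoff + put_payoff - (call_premium + put_premium)

def short_straddle_pnl(spot_at_expiry, strike, call_premium, put_premium):
    """P&L at expiry of selling the same call and put: the mirror image of the long side."""
    return -long_straddle_pnl(spot_at_expiry, strike, call_premium, put_premium)

# Made-up example: strike 100, each option costs 5
for spot in (80, 100, 120):
    print(spot, long_straddle_pnl(spot, 100, 5, 5), short_straddle_pnl(spot, 100, 5, 5))
```

With these numbers the long straddle only profits if the price ends more than 10 away from the strike in either direction, while the short straddle keeps at most the 10 collected in premiums and has an open-ended loss on large moves.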

If you have any doubts or suggestions, feel free to contact me.
Thank you!

Notebook 👇🏻
🔗 https://www.kaggle.com/code/lusfernandotorres/options-trading-long-short-straddle

App👇🏻
🔗 https://huggingface.co/spaces/luisotorres/long_and_short_straddle

plucky trench
neat mirage
#

January update is out now

This dataset contains metadata of millions of news articles from Google News, including title, publisher, DateTime, link, and category.

This is also an automation project in which data is scraped every day at 4am UTC on 8 major categories. This dataset is expected to have a monthly update, thus the data collected daily will be merged into a single monthly csv file and published on Kaggle at the end of each month. One may expect the value of the dataset to continuously grow through time.

If you find this dataset useful, feel free to drop a like. If you have any requests/suggestions/inquiries, feel free to leave them in the comment section as well.

https://www.kaggle.com/datasets/crxxom/daily-google-news?select=2023_10.csv

lofty barn
stray shale
#

Hi everyone 👋🏻,

I'm thrilled to share my latest Kaggle notebook from the ML StudyTime collection with you. The primary goal behind creating this notebook was to elucidate some Machine Learning concepts in an engaging and accessible manner. I've made an effort to incorporate illustrations wherever possible to enhance conceptual understanding and retention:

https://www.kaggle.com/code/arezalo/ml-studytime-13-normalization

Thank you so much for your time and support! 😊

tender depot
#

Can anyone suggest some good end-to-end portfolio projects related to image detection?

atomic fog
#

🔮 Hi everyone! I'd like to share a data challenge for predicting fertility outcomes in the Netherlands that I'm working on - https://preferdatachallenge.nl
This data challenge is a perfect opportunity to test and improve your machine learning skills, grow your network, discover unique data, collaborate on scientific papers, and win recognition while predicting an important life outcome. The deadline for application is March 24, 2024. See the details on the website and apply!🙌

short jetty
pure edge
#

🚀 Excited to unveil my latest venture in the realm of #DataScience! 🌐 Embark with me on a journey delving into the fascinating world of #MachineLearning as we predict medical charges with precision and finesse! 💉💰 Let's ignite this tech odyssey! 💻 #AI

Dive deeper into the intricacies of this project on my Kaggle notebook [https://www.kaggle.com/code/mehedithedreamer/charting-the-future-medical-charge-prediction] 📗, and let's soar with your insights! 🚀 Your feedback and encouragement fuel this #TechJourney! 🌐✨
#MMM

serene flower
#

Unlocking Insights: Exploring Diverse Data Challenges with Machine Learning

Check out my latest Kaggle notebooks covering a range of data challenges:

  1. Heart Disease Prediction
  2. Titanic Competition
  3. Bank Churn Prediction with XGB and LGBM
  4. US Data Analysis Beginner
  5. Paddy Disease Classification
lofty barn
lost elm
#

Two Stage Retrieval RAG using Rerank models

RAG systems often fail because the documents lack diverse data, which leads to conflicting results when retrieving from the vector database. To avoid this, we use a reranking model that reranks the top-k documents retrieved from the vector DB and helps generate a better response by providing better context.
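
The retrieve-then-rerank flow described above can be sketched with toy components. This is only an illustration of the two stages, not the notebook's code: the bag-of-words `embed` stands in for a real embedding model and vector database, `rerank_score` is a placeholder for an actual reranker model, and all names and example documents are made up.

```python
import math
from collections import Counter

def embed(text):
    """Toy 'embedding': bag-of-words counts (stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, docs, k):
    """Stage 1: cheap similarity search returning the top-k candidates."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def rerank_score(query, doc):
    """Stage 2 scorer (placeholder for a reranker model): share of query terms found in the doc."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0

def two_stage_retrieval(query, docs, k=3, top_n=1):
    candidates = retrieve(query, docs, k)
    reranked = sorted(candidates, key=lambda d: rerank_score(query, d), reverse=True)
    return reranked[:top_n]

docs = [
    "how to reset your password",
    "password password password policy",
    "shipping times and returns",
]
query = "reset password"
print("stage 1:", retrieve(query, docs, k=2))
print("stage 2:", two_stage_retrieval(query, docs, k=2, top_n=1))
```

In this made-up example the first stage ranks the repetitive "password policy" text highest, and the second stage promotes the document that covers both query terms, which is the effect a reranker is meant to have on the context passed to the LLM.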

Here is a demo notebook for the same: https://www.kaggle.com/code/warcoder/two-stage-retrieval-rag-using-rerank-models

lofty barn
lofty barn
lost elm
neat mirage
lofty barn
lofty barn
#

Upvoted!! great dataset!!

buoyant scaffold
#

Hello everyone,

Check out this new dataset I've discovered and published on Kaggle:

https://www.kaggle.com/datasets/cauelias/dam-data-to-risk-analysis

This dataset contains a vast amount of information about Brazilian mineral barriers. With 190 columns of rich data, it can be utilized in multiple applications. You can attempt to predict the risk associated with certain barriers, classify them based on the type of minerals, or even utilize regression techniques to analyze the volume.

Take a look and explore the possibilities!

serene flower
#

🚀 Exciting Update! Just released my latest Kaggle notebook focusing on predicting obesity risk with an impressive accuracy of 90%! 💼📊

🔗 Check out the notebook here: https://www.kaggle.com/code/muhammadfurqan0/ps4e2-obesity-risk-gb-0-90

🌟 Delve into the world of predictive analytics as we uncover key factors influencing obesity risk. Your feedback and insights are highly valued as we strive to enhance our understanding of this critical health issue.

honest delta
#

Recently, I spent five days working on a guide that I am proud of. The guide is designed to be simple and requires minimal knowledge of Git and Python. It will teach you everything from creating a GitHub repository to automating model testing and deployment.

By following this guide, you will learn how to:

  1. Set up the GitHub repository, Hugging Face Space, and local files and folders.
  2. Build and train a drug classification model using Scikit-learn pipelines.
  3. Evaluate the model and save the pipeline using skops.
  4. Write and run Continuous Integration (CI) and Continuous Deployment (CD) workflows using a Makefile and GitHub Actions.
  5. Use the CI pipeline to train the model, report the evaluation results in the commit comments using CML, and save the trained model in a new branch.
  6. Develop a customized Gradio application that loads a model and generates predictions based on user input.
  7. Trigger the CD pipeline when the CI pipeline is finished.
  8. Use the CD pipeline to pull the saved model from the new branch and push the app and model changes to the Spaces server using the Hugging Face CLI.

Please follow the guide to learn more and provide me feedback. I am always looking to improve my writing and make things easier for people who want to get into the world of MLOps.

Step-by-Step Guide: https://www.datacamp.com/tutorial/ci-cd-for-machine-learning
GitHub Repository: https://github.com/kingabzpro/CICD-for-Machine-Learning

low valley
#

Hello, everyone 👋🏻

Are you ready to take your investment portfolio management skills to the next level? I'm extremely happy to share my latest project:

💥 An Investment Portfolio Management Web App

• Effortlessly track stocks, crypto, ETFs, and more in one place
• Visualize your returns with stunning charts and graphs
• Get clear investment analysis for smarter decisions

How about taking it a step further?

I've created a Kaggle Notebook detailing the app's creation process. Learn how to:

• Collect financial data with Python
• Design interactive visuals with Streamlit
• Make your own finance analysis dashboards

Check it out!

🔗 Web App: https://huggingface.co/spaces/luisotorres/portfolio-management

🔗 Kaggle Notebook: https://www.kaggle.com/code/lusfernandotorres/building-an-investment-portfolio-management-app

Let me know if you have any questions – I'm here to help!
Your feedback is also welcome to enhance the app even further!

Thank you very much.

lost elm
#

IPO Mainboard and SME Basic Details India Dataset

An Initial Public Offer (IPO) is the first sale of shares to the public by a privately owned company. Companies going public raise funds through an IPO for working capital, debt repayment, acquisitions, and a host of other uses.

Investors can apply for IPO stocks in India by filling out an online IPO application offered by stockbrokers and banks. Brokers offer UPI-based online IPO applications, and banks offer both UPI and ASBA IPO applications.

https://www.kaggle.com/datasets/warcoder/ipo-mainboard-and-sme-basic-details-india/

lofty barn
west karma
lofty barn
#

Upvoted! Congratulations on 100 upvotes!

pure edge
lofty barn
atomic fog
#

Hi all,

I'm thrilled to share my latest exploration into diverse Kaggle datasets with 5 new quick reads. Each analysis dives into unique datasets. Let's dive in:

❤️ Heart & Science: Stroke Prediction with AI 🔍: Discover a data-driven approach to stroke prediction, leveraging WHO data with machine learning techniques for healthcare.
https://www.kaggle.com/code/onurrr90/heart-science-stroke-prediction-with-ai

🌎 Alcohol Insights: Climate & Faith's Impact 🍷: Explore how climate and religious beliefs influence global alcohol consumption patterns, offering an analysis across countries.
https://www.kaggle.com/code/onurrr90/alcohol-insights-climate-faith-s-impact

🔍 Probing the Public LB in Obesity Risk S4,E2 🛠: A strategic exploration of the public leaderboard's impact on competition rankings, using obesity risk data to guide submission strategies.
https://www.kaggle.com/code/onurrr90/probing-the-public-lb-in-obesity-risk-s4-e2

⚡ Enefit - 4 Features to Improve Score (Public LB 39): An inside look at key feature engineering strategies that propelled us to the top 40 on the leaderboard, focusing on innovative data insights.
https://www.kaggle.com/code/onurrr90/enefit-4-features-to-improve-score-public-lb-39

I tried to make these notebooks visually attractive but also informative. Enjoy exploring!

vapid blaze
frank peak
#

Hi everyone!

I made some changes to one of my notebooks to improve it. This notebook walks through processing data step by step with Pandas and Python and exporting the processed datasets. Furthermore, as you can see in the photo, I also built some EDA.

Link to notebook: https://www.kaggle.com/code/caesarmario/python-magic-big-mart-sales-data-transformed

Let me know your thoughts on this one. Thank you!

neat mirage
#

February update is out 🎉

This is also an automation project in which data is scraped every day at 4am UTC on 8 major categories. This dataset is expected to have a monthly update, thus the data collected daily will be merged into a single monthly csv file and published on Kaggle at the end of each month. One may expect the value of the dataset to continuously grow through time.

If you find this dataset useful, feel free to drop a like. If you have any requests/suggestions/inquiries, feel free to leave them in the comment section as well.

https://www.kaggle.com/datasets/crxxom/daily-google-news

covert moon
#

Shameless self promotion + fishing for likes/reshares: Our latest paper is out!
🧬 "Detecting Anomalous Proteins Using Deep Representations" in NAR Genomics and Bioinformatics!
(Protein Language models-Bioinformatics and anomaly detection!)

Twitter thread (high level fluff):
https://twitter.com/danofer/status/1763962202472484991

Paper link:
https://doi.org/10.1093/nargab/lqae021

Sharing is caring ❤️ !

lofty barn
atomic fog
serene flower
#

🌟 Let's Empower Each Other on Kaggle!

🎉 It's with great excitement that I share my debut Kaggle notebook, "Hourly Energy Consumption"! As I take my first steps into the realm of time series analysis, your encouragement and support are invaluable to me.

💖 Your upvotes not only validate my efforts but also serve as a beacon of support on this journey of exploration and growth. Together, let's foster a community of kindness and supportiveness on Kaggle, where we uplift and empower each other to reach new heights.

🚀 I invite you to join me in celebrating this milestone and spreading positivity in our shared passion for data science. Your support fuels my motivation and inspires me to continue pushing the boundaries of what's possible.

🌟 Thank you for being a part of this incredible journey. Let's uplift each other and make our Kaggle community a place of warmth and encouragement!

💻 Together, let's soar to greater heights! Upvote my debut notebook and let's continue to shine brightly on Kaggle! 🌟

https://www.kaggle.com/code/muhammadfurqan0/hourly-energy-forecasting-notebook

#Kaggle #Support #Kindness #DataScience #Upvote #Community #Empowerment #Gratitude

atomic fog
#

Notebook: Text Preprocessing in NLP | Basic Steps to Preprocess Textual Data

Link : https://www.kaggle.com/code/abdmental01/text-preprocessing-nlp-steps-to-process-text?scriptVersionId=165310719

This concludes the basic text preprocessing steps commonly encountered in natural language processing tasks. I hope you have gained a clear understanding of each process. If you have any further questions or queries, feel free to comment below.

If you found this helpful, please consider upvoting and sharing your feedback. Your support motivates me to continue creating useful content.

Thank you for your attention and happy learning!

lucid dove
lofty barn
trim flame
#

Hi, everyone! I would like to share my article, a step-by-step guide on building a virtual assistant for any business. Maybe you will find it valuable, or you might like to give some input or comments 😄. In the article I pick HSBC UK Bank as a target and build a chatbot for them that greatly outperforms their own.
https://medium.com/@vovakuzmenkov/building-a-fullstack-rag-solution-with-private-llm-a-step-by-step-guide-48a0a4467efc

atomic fog
sudden prairie
#

Hello Everyone!
Hope you're all having an awesome day!

I just wanted to share something cool I've been working on recently. I've put together a notebook titled "Netflix - EDA, Visualization & Insights" where I've delved into the world of Netflix data.
In this notebook, I've performed some Exploratory Data Analysis (EDA) to uncover interesting trends, visualized the data to make it more understandable, and extracted some insightful nuggets about everyone's favorite streaming service!

If you're interested in diving into the data behind Netflix, I'd love for you to check out my notebook and share your thoughts! Your feedback would be greatly appreciated, and I'm always open to suggestions for improvement. Let's learn and grow together!

Here's the link to my notebook: https://www.kaggle.com/code/saumyanishi/netflix-eda-visualisation-insights

atomic fog
atomic fog
#

Notebook Trending Again 🙂

candid bone
#

Grow your business with the power of AI.

Hello, I trust you are well.
Understanding the significance of advertising and attracting clients to your business is crucial.
Rest assured, I have a solution for you.
I have created an AI call center using Twilio and a voice chatbot.
With its streaming voice chatbot capabilities and training potential, the AI caller can engage with customers fluently and in real-time.
Experience the transformative power of AI with my cutting-edge system.

lucid dove
pure edge
lofty barn
young harbor
#

#interviewpreparation #seo #LeetCode #likeandsubscribe #leetcodedailychallenge

Uncover the secrets of LeetCode problem 3005: Count Elements With Maximum Frequency with me, CodeRebel. We'll decode the challenge, master a smart solution, and boost our coding skills!

🎯 Challenge Highlights:
Explore an array of positive integers, identify element...

▶ Play video
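
For anyone who wants to try the problem before watching, here is a minimal sketch of one way to solve it, assuming the usual statement of LeetCode 3005: return the total number of occurrences of the values that share the highest frequency. The function name is made up for illustration and this is not necessarily the approach taken in the video.

```python
from collections import Counter


def max_frequency_elements(nums: list[int]) -> int:
    """Total occurrences of all values tied for the highest frequency.

    Assumes nums is non-empty, as the problem constraints guarantee.
    """
    counts = Counter(nums)              # value -> number of occurrences
    top = max(counts.values())          # the highest frequency seen
    return sum(c for c in counts.values() if c == top)


# 1 and 2 each appear twice (the maximum), so the answer is 2 + 2 = 4.
print(max_frequency_elements([1, 2, 2, 3, 1, 4]))
```

Counting once and scanning the counts keeps it at O(n) time, which is plenty for the small input sizes this problem uses.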
tiny sandal
nocturne merlin
#

🌟 Latest Notebook! 🌟

Hello, everyone!

I am excited to share my new work on Bengali Text Preprocessing for Language Modelling, i.e. text generation tasks.

In this notebook, I delve into several techniques for preprocessing Bengali text. This includes cleaning, tokenization, vectorization, sequencing into n-grams, and many more. I also address common challenges in Bengali text preprocessing, such as handling compound words, dealing with non-standard characters, and optimizing GPU memory usage for large corpora.
This notebook is completely beginner-friendly: I explain every step and give alternative ways to carry it out. I also show some memory-handling techniques for working with CUDA.
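
As a rough illustration of the "sequencing into n-grams" step, here is a minimal sketch in plain Python. It assumes whitespace tokenization and the common prefix-sequence scheme used for next-word prediction; the sample sentence and the function names are placeholders, not taken from the notebook or the Tagore corpus.

```python
def build_vocab(lines):
    """Assign each distinct token an integer id, starting at 1 (0 is left free for padding)."""
    vocab = {}
    for line in lines:
        for token in line.split():
            vocab.setdefault(token, len(vocab) + 1)
    return vocab


def prefix_sequences(line, vocab):
    """Turn one line into growing id sequences: first 2 tokens, first 3 tokens, and so on."""
    ids = [vocab[token] for token in line.split()]
    return [ids[: i + 1] for i in range(1, len(ids))]


# Placeholder sentence (assumption, for illustration only).
lines = ["আমি বাংলায় গান গাই"]
vocab = build_vocab(lines)
sequences = prefix_sequences(lines[0], vocab)
print(sequences)
```

In each sequence the last id is the word to predict and the ids before it are the context. Real Bengali text usually needs more careful tokenization than a whitespace split.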

I utilized a text corpus from the works of Rabindranath Tagore, a distinguished Bengali writer, poet, and philosopher. His writings are among the finest in Bengali literature, offering profound insights and timeless wisdom.

Feel free to explore the NOTE_BOOK and let me know your thoughts! I am open to Insights and Suggestions for improvement.

Best Regards
Thank You

atomic fog
neat mirage
lofty barn
novel sedge
#

🔍 Exploratory Data Analysis of Earthquake-Related Tweets

I recently delved into an extensive EDA process using a unique dataset of tweets related to the Turkey earthquake, compiling insights and visualizations to understand the digital footprint of such a significant event. Check out the full analysis here.

Key Highlights:

  • Automated Data Merging: Utilized Python to automate the process of merging multiple CSV files from subfolders, creating a single DataFrame for each folder, simplifying data management.

  • Data Exploration:

      • Generated comprehensive statistics and missing value analysis to ensure data quality.

      • Examined tweet lengths to understand communication patterns.

  • User Interaction Analysis:

      • Visualized likes, retweets, and replies to identify engagement trends.

      • Identified top users based on their activity levels and influence.

  • Content Analysis:

      • Employed word clouds to visualize the most frequent terms, uncovering the main topics of conversation.

      • Conducted sentiment analysis to classify tweets into positive, negative, or neutral categories, revealing the emotional landscape of the digital conversation.

  • Temporal Analysis:

      • Analyzed tweets over different times of the day to observe fluctuations in social media activity, providing insights into user behavior patterns during crisis events.

      • Time series analysis highlighted the evolution of tweet volume, offering a granular view of public engagement over time.

This project showcases the power of data science in crisis communication, offering invaluable insights into public sentiment, information dissemination, and community engagement during the aftermath of the earthquake.

For those interested in the technical details, key Python libraries such as pandas for data manipulation, matplotlib and seaborn for visualization, nltk for sentiment analysis, and wordcloud for generating word clouds were instrumental in this analysis.
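
The merging step can be sketched roughly as follows. To keep it dependency-free, this version uses the standard library's `csv` module and collects plain row dictionaries per folder instead of a pandas DataFrame, so it is a stand-in for the idea rather than the notebook's code. The function name and the folder layout (one level of subfolders, each holding CSV files with a header row) are assumptions.

```python
import csv
from pathlib import Path


def merge_csvs_per_folder(root: Path) -> dict[str, list[dict[str, str]]]:
    """Map each immediate subfolder name to the combined rows of its CSV files."""
    merged = {}
    for folder in sorted(p for p in root.iterdir() if p.is_dir()):
        rows = []
        # Sorted so the row order is reproducible across runs.
        for csv_path in sorted(folder.glob("*.csv")):
            with csv_path.open(newline="", encoding="utf-8") as handle:
                rows.extend(csv.DictReader(handle))
        merged[folder.name] = rows
    return merged
```

With pandas, the per-folder step would usually be `pd.concat` over `pd.read_csv` calls with `ignore_index=True`, which yields the single DataFrame per folder that the post describes.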

Dive into the full notebook for a deeper understanding of how data analysis can illuminate the human aspects of natural disasters through social media data.

gray cloud
#

Sharing with you a cool dataset I uploaded on Airbnb pricing and TripAdvisor ratings in major European cities.

TL;DR:
You get Airbnb & TripAdvisor data and your goal is to explain what drives the price of the Airbnb unit using spatial econometric analysis.

Link: https://www.kaggle.com/datasets/thedevastator/airbnb-prices-in-european-cities

Would be more than happy to get your feedback on it!

sterile coral
#

Hi, I did a very basic analysis of India's electoral bond data, which was released very recently. To my surprise, I found that the encashed bond amount was greater than the purchased amount. The difference was about 6135761000 INR.

Dataset: https://www.kaggle.com/datasets/newtonbaba12345/electoral-bonds-dataset/data

Notebook: https://www.kaggle.com/code/newtonbaba12345/electoral-bonds-analysis

If you find any mistake in the code, please tell me and I will correct it. I will be updating the notebook with a more in-depth analysis this coming weekend.

Do check them out.

tiny sandal
pure grotto
#

Hello Everyone 👋,

I am thrilled to share with you my latest research titled 'The Global Portrait of Renewable Energy'. This study delves into the distribution and potential of renewable energy usage worldwide, using data science and visualization techniques.

🔍 The project comprehensively examines the policies, investments, and advancements various countries are making towards renewable energy.

💡 I hope this work serves as a valuable resource for anyone looking to make progress in the field of renewable energy. For more information about my project, please visit my research notebook

https://www.kaggle.com/code/mehmetisik/01-the-global-portrait-of-renewable-energy/notebook

and do not hesitate to share your thoughts!

🙏 Thank you!

For more of my work, please check out my Kaggle profile.
https://www.kaggle.com/mehmetisik/code

serene flower
#

🌸 Exciting News! 🌼

Thrilled to share my latest project with the Kaggle community: "Resnet9 Flower Power: Transfer Learning Mastery" 🚀🌺

In this project, I delved into the fascinating world of flower classification using advanced transfer learning techniques with Resnet9. From petunias to sunflowers, this project explores the intricacies of flower recognition.

📊 Check out the project on Kaggle: Resnet9 Flower Power - Transfer Learning Mastery

I'm eager to hear your thoughts and feedback on this exciting endeavor! Let's continue to learn and grow together as a community. 🌟

#KaggleCommunity #DataScience #TransferLearning #FlowerClassification #Resnet9 #ProjectShowcase 🌼🌸

odd zephyr
subtle schooner
#

Hey all,

I tried out Google's new Gemma-2b model and their LLM inference support via MediaPipe running on Android. I wrote up the experience to share and the code is also available on GitHub if you're interested. 😃

Blog Post: https://www.darrylbayliss.net/playing-simon-says-with-gemma-and-mediapipe/

Repo: https://github.com/DarrylBayliss/Simon-Says-Android

GitHub

An Android App recreating the Simon Says game. Uses MediaPipe to run an LLM on device - DarrylBayliss/Simon-Says-Android

autumn basin
#

Brilliant Work @serene flower

raven parrot
#

I have a notebook where I trained a model that classifies whether a pair of tweets was written by the same author. I used BERT for feature extraction and then compared the outputs using the Manhattan distance. The results are decent, but I couldn't improve them without adding more data. Any ideas?

https://www.kaggle.com/code/abdelrahmanekhaldi/authorship-similarity-with-bert
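
For readers unfamiliar with the comparison step, here is a minimal sketch of a Manhattan-distance similarity between two embedding vectors. The `exp(-distance)` squashing and the toy three-dimensional vectors are assumptions made for illustration (real BERT embeddings have hundreds of dimensions), and the notebook may score pairs differently.

```python
import math


def manhattan_distance(a, b):
    """Sum of absolute coordinate differences (L1 distance)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def manhattan_similarity(a, b):
    """Map the L1 distance into (0, 1]: 1.0 for identical vectors, near 0 for distant ones."""
    return math.exp(-manhattan_distance(a, b))


# Hypothetical embeddings for two tweets (placeholders, not real BERT outputs).
tweet_a = [0.20, -0.10, 0.40]
tweet_b = [0.25, -0.05, 0.30]
score = manhattan_similarity(tweet_a, tweet_b)
print(score)
```

A pair would then be labelled "same author" when the score clears a threshold tuned on validation data.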

lofty barn
#

Notebook sharing 😊. Very easy to write EDA https://www.kaggle.com/code/risakashiwabara/profiling

copper light
atomic fog
#

hi all,

This is an updated notebook that investigates global inflation trends, highlighting extreme cases and offering visualizations of country-specific inflation. You can also see how inflation in your country has changed over the years. I also analyze the correlation between the inflation rates of big economies.

https://www.kaggle.com/code/onurrr90/inflation-trends-worldwide

pure edge
raven parrot
#

That's the type of dataset that opens many paths to creativity, hats off

coarse shore
sterile coral
raven parrot
candid bone
#

Innovate your business with the power of AI.

Hello, I trust you are well.
Understanding the significance of advertising and attracting clients to your business is crucial.
Rest assured, I have a solution for you.
I have created an AI call center using Twilio and a voice chatbot.
With its streaming voice chatbot capabilities and training potential, the AI caller can engage with customers fluently and in real-time.
Experience the transformative power of AI with my cutting-edge system.

frail mirage
dense cosmos
cyan elbow
sterile coral
agile umbra
#

Countries in Conflict Dataset (1989-2022)
Tracking Ongoing Conflicts and Fatalities Worldwide
https://www.kaggle.com/datasets/saurabhbadole/countries-in-conflict-dataset

Annual Working Hours Dataset (1870-1970)
Historical Data on Annual Working Hours per Worker by Country
https://www.kaggle.com/datasets/saurabhbadole/annual-working-hours-dataset-1870-1970

Latest Data Science Job Salaries (2020 - 2024)
Exploring Salary Dynamics and Employment Trends in Data Science Careers
https://www.kaggle.com/datasets/saurabhbadole/latest-data-science-job-salaries-2024

mint parrot
raven parrot
dense cosmos
odd zephyr
agile umbra
neat mirage
lofty barn
shy wind
#

Hi there, this is my first analysis of a railway dataset. I tried to focus on one specific use case that I believe can be really useful.
I'd like any feedback on how I could improve it.

https://www.kaggle.com/code/elliotx1000/tgv-french-analysis

atomic fog
crude palmBOT
#
bibekbhusal. has been warned

Reason: Bad word usage

past sand
#

Hello
I have just made my first deep learning project on Kaggle public. It is digit recognition on MNIST Dataset, it is also part of a Getting Started competition.
Please check it out and give some suggestions and give an upvote if you like it and learn from it.
Link of the Notebook:- https://www.kaggle.com/code/bibekbhusal0/digit-recognizer-with-keras-score-0-99978/

meager zealot
odd zephyr
copper light
agile umbra
#

📢 Exciting news! Ever wondered what vectors in a vector space look like? 💭 Check out my latest Kaggle Notebook where I delve into the visual representation of word vectors using the Word2Vec model on the Game of Thrones dataset! 🐉📚

🔗 Link to the Notebook: Visual Representation of Word2Vec on GoT Dataset

In this notebook, I've explored the fascinating world of word embeddings and how they can be represented visually in a vector space. Using the popular Word2Vec model, I've analysed the rich language of George R.R. Martin's iconic series, Game of Thrones.

👀 Dive in to:

🔍 Understand the concept of word vectors and their significance in natural language processing.
🎨 Visualise word embeddings in a high-dimensional vector space.
📊 Explore relationships between words and their vectors!
🔀 Investigate semantic similarities and analogies between words.
💡 Gain insights into how vector representations capture linguistic nuances and contexts.
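
To make the "similarities and analogies" points concrete, here is a small sketch using cosine similarity and vector arithmetic. The three-dimensional vectors are hand-made toy values chosen for illustration (an assumption), not embeddings from the notebook. A trained Word2Vec model learns vectors with far more dimensions from the text itself.

```python
import math

# Hypothetical toy vectors; real Word2Vec embeddings are learned, not hand-written.
vectors = {
    "king":   [0.9, 0.8, 0.1],
    "queen":  [0.9, 0.1, 0.8],
    "man":    [0.1, 0.9, 0.1],
    "woman":  [0.1, 0.1, 0.9],
    "dragon": [0.2, 0.5, 0.4],
}


def cosine(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def analogy(a, b, c):
    """Word closest to vec(a) - vec(b) + vec(c), excluding the three input words."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in {a, b, c})
    return max(candidates, key=lambda w: cosine(vectors[w], target))


# "king" is to "man" as ? is to "woman"
print(analogy("king", "man", "woman"))
```

With a trained model the same arithmetic runs over the whole vocabulary, and plotting needs a projection down to 2D or 3D first.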

Whether you're a seasoned data scientist or just curious about the magic behind language models, this notebook offers a captivating journey into the world of vectors and their applications.

Don't miss out! Click the link above to explore the notebook, and feel free to leave your comments, questions, and feedback. Let's dive into the realm of word vectors together! 🚀✨

#DataScience #WordEmbeddings #nlp #GameOfThrones #Kaggle #Word2Vec #VectorSpace #DataVisualization

full shale
#

Hi Everyone
Sharing Ana with you, Ana is an AI SDE

https://www.linkedin.com/posts/arsh-anwar_fine-tune-mistrals-7b-instruct-model-with-activity-7183577812442734592-fWYr?utm_source=share&utm_medium=member_desktop
Arsh Anwar on LinkedIn: Fine Tune Mistral’s 7B instruct model with ...
🚀 Exciting News: 𝐈𝐧𝐭𝐫𝐨𝐝𝐮𝐜𝐢𝐧𝐠 𝐀𝐧𝐚 - 𝐓𝐡𝐞 𝐅𝐮𝐭𝐮𝐫𝐞 𝐨𝐟 𝐀𝐈 𝐒𝐨𝐟𝐭𝐰𝐚𝐫𝐞 𝐄𝐧𝐠𝐢𝐧𝐞𝐞𝐫𝐢𝐧𝐠!

First of all Eid Mubarak to Everyone!…
Check out the video where it fine-tunes Mistral's 7B model autonomously

rough knot
agile umbra
#

🚀 Dive deep into the world of Transformers with my comprehensive guide! 📚✨ Whether you're new to the field or looking to level up your understanding, this post has got you covered. From the basics to advanced concepts, I break it all down in an engaging and accessible way.

Check it out here: Understanding Transformers: A Comprehensive Guide

Feel free to share with your fellow AI enthusiasts and learners. Let's unlock the power of State-of-the-Art Transformers together! 💡🤖 #Transformers #AI #DeepLearning #nlp #KaggleKnowledge

sturdy shadow
agile umbra
dense cosmos
#

📊 Exploring Consumer Behavior with Social Advertisement Data 🛍️

Dive into fascinating insights on how age, estimated salary, and purchase decisions intersect with social media ads. Let's optimize our targeting strategies and engage consumers effectively! 🎯📈
Customer Behavior Analysis for Social Media Ads 👈 Project link

dusky zephyr
#

Well-documented, thanks for the share! 👍

dusky zephyr
agile umbra
#

Dear Friends! Could you please check on my recent Kaggle blogs and support me with an upvote?
I would deeply appreciate it if you could share your view on it by providing valuable Feedback😇

Why Attention is all you need? https://www.kaggle.com/discussions/general/493003

Ever Wonder how Prompts are interconnected in series? https://www.kaggle.com/discussions/general/493866

Exploring the ReAct Revolution https://www.kaggle.com/discussions/general/494233

dense cosmos
#

🤖 Exploring Servo Mechanisms with Machine Learning! 📊🔧

Hey everyone! 👋 I've just published a notebook analyzing the Servo Mechanism dataset using machine learning techniques. 🌟

In this notebook:

📈 Explored data distributions and correlations.
🌳 Built a Decision Tree Classifier for predictive modeling.
🔍 Optimized model performance through hyperparameter tuning.
🔄 Validated model robustness using cross-validation.

The results are impressive! Achieved over 97% test accuracy and mean cross-validation accuracy exceeding 98%. 🚀

Check out the full analysis here: Predictive Modeling for Servo System Optimization

copper light
lusty wing
#

I have been playing around with the Car Insurance Claim Prediction Dataset and developed my first Kaggle notebook:
https://www.kaggle.com/code/adriadejuan/eda-feature-selection-xgboost-and-shap-values

I have a big issue (ML related), which I have encountered several other times. Why is my model overpredicting one class? Is it all because the training set is unbalanced? Any suggestions on how to get around this issue? I can think of loads of examples where classes may be unbalanced (like disease prediction, non-payment prediction, ...), so this is something that surely has been addressed somewhere. My goal is to learn, so comments, tips, corrections and suggestions are more than welcome.
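
One common first thing to try for an unbalanced training set is to reweight the classes so that mistakes on the rare class cost more. Below is a minimal sketch of the usual "balanced" heuristic, `n_samples / (n_classes * class_count)`. The 90/10 label split is made up for illustration, and whether reweighting is the right fix for this dataset is an open question.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical labels: 90 "no claim" (0) and 10 "claim" (1).
labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(labels)

# Ratio of negatives to positives, the value commonly suggested for
# XGBoost's scale_pos_weight parameter in binary problems.
neg_pos_ratio = labels.count(0) / labels.count(1)
print(weights, neg_pos_ratio)
```

It is also worth checking the decision threshold and looking at precision, recall or PR-AUC rather than accuracy, because accuracy can look good while the minority class is mostly missed.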

atomic fog
hollow willow
#

Did Full-Finetuning on Flan-T5-base model as part of my revision to refresh my fine-tuning skills: https://www.linkedin.com/posts/isham-rashik-5a547711b_generativeai-machinelearning-deeplearning-activity-7187763013926473729-Cd2F

Do star the repository: https://github.com/di37/full-fine-tuning-nvidia-question-and-answering. I have made the code very easy to follow so that beginners can start with it right away

🚀 𝐅𝐮𝐥𝐥 𝐅𝐢𝐧𝐞-𝐓𝐮𝐧𝐢𝐧𝐠 𝐨𝐟 𝐭𝐡𝐞 𝐅𝐥𝐚𝐧-𝐓5-𝐁𝐚𝐬𝐞 𝐌𝐨𝐝𝐞𝐥 𝐟𝐨𝐫 𝐭𝐡𝐞 𝐍𝐕𝐈𝐃𝐈𝐀 𝐪𝐮𝐞𝐬𝐭𝐢𝐨𝐧-𝐚𝐧𝐬𝐰𝐞𝐫𝐢𝐧𝐠 𝐭𝐚𝐬𝐤 🤖💡

I…

serene flower
#

🔍📚 Introducing my latest Kaggle notebook: "OCR Battle: 🤖 Keras | 📷 pytesseract | 🚀 EasyOCR". Explore the performance of these top OCR libraries and find the best tool for your projects. Dive in, upvote if you find it helpful, and let's continue the discussion! Link 🌟 #OCR #Kaggle #DataScience

ruby nymph
agile umbra
#

https://www.kaggle.com/code/saurabhbadole/decoding-the-attention-black-box-using-bertviz

https://www.kaggle.com/discussions/general/496770
BertViz is a tool designed to visualize the inner workings of a specific part of large language models (LLMs) called the attention mechanism. I am sure you will love this blog and find it insightful 🙂

sonic cave
#

https://huggingface.co/spaces/leomaurodesenv/qasports-website
This website presents a collection of documents from the dataset named "QASports", the first large sports question answering dataset for open questions. QASports contains real data of players, teams and matches from the sports soccer, basketball and American football. It counts over 1.5 million questions and answers about 54k preprocessed, cleaned and organized documents from Wikipedia-like sources.

past sand
#

Hello Kagglers
I have just made another deep learning project public on Kaggle. It is digit recognition on the MNIST dataset with PyTorch, and it is also part of a Getting Started competition.
Please check it out, give some suggestions, and give an upvote if you like it and learn from it.
Link of the Notebook:- https://www.kaggle.com/code/bibekbhusal0/digit-recognizer-with-pytorch-accuracy-99-685

copper light
abstract copper
coarse shore
frank peak
#

[S01.E03: Titanic Sinking: Chronology of a Maritime]

Hey everyone! Excited to share this: a Titanic timeline visualization using Python, in which all the historical information (from beginning to end) is gathered and summarized from multiple sources on the internet.

I started by collecting and arranging all the historical data from many sources into a data frame, then transformed it into an interesting data visualization format by its elements using Python and its associated modules. This timeline highlights the depth of insights made available by coding, data analysis, and capturing essential events.

modest skiff
iron sparrow
copper light
atomic fog
neat mirage
atomic fog
short jetty
languid summit
#

Hi guys new here!!

agile umbra
median salmon
tiny sandal
tiny sandal
uncut dragon
#

Hello everyone, in this notebook we tried to classify the sentiment of tweets as positive or negative.
I did the text preprocessing and also prepared the text for the models, with steps like tokenization and padding.
I also used CNN, VGG and ResNet models, but I edited some small details to adapt them to text.
I hope you find this notebook useful
https://www.kaggle.com/code/ibrahimahmed26/sentiment-classification-cnn-vs-resnet-vs-vgg

hollow willow
#

My latest project on Fine Tuning Whisper model Speech to Text on Unseen Language

Linkedin post: https://www.linkedin.com/posts/isham-rashik-5a547711b_machinelearning-deeplearning-ai-activity-7194606938209300481-K5eh

Please⭐️ the repo if found useful: https://github.com/di37/speech-to-text-fine-tuning-on-unseen-language

🌟 𝐏𝐢𝐨𝐧𝐞𝐞𝐫𝐢𝐧𝐠 𝐒𝐩𝐞𝐞𝐜𝐡 𝐭𝐨 𝐓𝐞𝐱𝐭 𝐖𝐡𝐢𝐬𝐩𝐞𝐫 𝐌𝐨𝐝𝐞𝐥 𝐅𝐢𝐧𝐞 𝐓𝐮𝐧𝐢𝐧𝐠 𝐨𝐧 𝐔𝐧𝐬𝐞𝐞𝐧 𝐋𝐚𝐧𝐠𝐮𝐚𝐠𝐞 🔊

🚀 I will be sharing…

GitHub

This projects aims to show how whisper model can be fine-tuned on language it was not trained but is trained on similar language to it. - GitHub - di37/speech-to-text-fine-tuning-on-unseen-languag...

dense cosmos
#

🌊 Exciting News! 🌊

📢 Hey everyone! I'm thrilled to share my latest notebook in the Flood Prediction competition! 🏆 Using Linear Regression, I've managed to achieve an impressive R2 Score of 0.84558! 🚀

📈 An R2 Score of 0.84558 means that the model explains 84.558% of the variance in the flood target, which suggests it captures much of the underlying pattern in the data.
Notebook Link 🔗
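
For reference, the R2 score quoted above follows the standard definition, one minus the ratio of the residual sum of squares to the total sum of squares. A minimal sketch in plain Python, where the sample numbers are invented and not from the competition:

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # unexplained variation
    ss_tot = sum((t - mean) ** 2 for t in y_true)               # total variation
    return 1 - ss_res / ss_tot


# Hypothetical targets and predictions, for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
print(r2_score(y_true, y_pred))
```

A perfect model scores 1.0, always predicting the mean scores 0.0, and a model worse than the mean goes negative.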

keen zephyr
#

Hi everyone! 👋

I'm excited to share Kaggle Plus! It's a handy extension that adds an easy-to-read bar chart on Kaggle leaderboards. 📊

👉 Check it out and download it from GitHub or Chrome Web Store.

If Kaggle Plus enhances your Kaggle experience, please hit that star on GitHub! ⭐ Your encouragement fuels further development!

🐛 Found a bug or have a feature idea? Open an issue on GitHub. Your insights are invaluable!

Thanks for the support! Let's make Kaggle even better, together. 📈

GitHub

Kaggle Plus is a browser extension that visualizes the leaderboard of a Kaggle competition. - eddielin0926/kaggle-plus

serene flower
brittle mason
umbral bobcat
lament bane
spare ridge
#

A preprocessed dataset for CHAOS - Combined (CT-MR) Healthy Abdominal Organ Segmentation. The dataset also contains scribble labels for weakly-supervised learning. In addition, I provide a notebook that shows how to load and visualize the dataset. Please upvote my dataset and notebook, and leave a comment if you like them.

  1. Dataset:
    https://www.kaggle.com/datasets/anhoangvo/chaos-t1-and-t2
  2. Notebook for loading and visualizing the dataset:
    https://www.kaggle.com/code/anhoangvo/chaos-dataset-loading-and-visualization
atomic fog
lofty barn
short jetty
sour pewter
buoyant scaffold
#

🚀 Explore the Improved Euler Method!

📝 Dive into the fascinating world of numerical methods with my notebook showcasing the Improved Euler Method. Learn how this technique enhances accuracy in approximating differential equations. Whether you're a seasoned programmer or just starting your journey in mathematics, there's something for everyone in this exploration of numerical analysis. Join me on this exciting adventure!

https://www.kaggle.com/code/cauelias/improved-euler-method

warm breach
sudden prairie
#

Hello Everyone!

Have you ever wondered which evaluation metric is the right one for your model but felt unsure about what each one means? Here’s an intuitive explanation to help clarify these metrics. Hopefully, this will provide you with a clearer understanding of the topic. Enjoy!

https://www.kaggle.com/discussions/getting-started/507111

high trench
hollow willow
#

Got done with Multiclass Text Classification Project using LLMs (application - news classification based on Business, Tech, Sport, Politics and Entertainment) via Few Shot Prompting.

Linkedin post: https://www.linkedin.com/posts/isham-rashik-5a547711b_llama-meta-openai-activity-7200054167829135361-ST7M

In depth blog on README file - Github link: https://github.com/di37/multiclass-news-classification-using-llms
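
For readers new to few-shot prompting, the idea can be sketched as building a prompt that shows a few labelled examples before the article to classify. The five categories come from the post. The example headlines, the prompt wording and the function name are invented for illustration and are not taken from the repository, and no model is actually called.

```python
CATEGORIES = ["Business", "Tech", "Sport", "Politics", "Entertainment"]

# Hypothetical labelled examples (assumption); a real project would pick these from its dataset.
EXAMPLES = [
    ("Shares in the retailer rose after strong quarterly profits.", "Business"),
    ("The striker scored twice to send his team into the final.", "Sport"),
    ("The new phone ships with a faster chip and a larger screen.", "Tech"),
]


def build_few_shot_prompt(article: str) -> str:
    """Assemble instruction, labelled examples, then the unlabelled article."""
    lines = [f"Classify the news article into one of: {', '.join(CATEGORIES)}.", ""]
    for text, label in EXAMPLES:
        lines += [f"Article: {text}", f"Category: {label}", ""]
    # The prompt ends on an open "Category:" so the model completes it with a label.
    lines += [f"Article: {article}", "Category:"]
    return "\n".join(lines)


prompt = build_few_shot_prompt("Parliament votes on the new budget bill tomorrow.")
print(prompt)
```

The resulting string would be sent to whichever LLM is being benchmarked, and the reply compared with the true label.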

📈 𝐁𝐞𝐧𝐜𝐡𝐦𝐚𝐫𝐤𝐢𝐧𝐠 𝐋𝐚𝐫𝐠𝐞 𝐋𝐚𝐧𝐠𝐮𝐚𝐠𝐞 𝐌𝐨𝐝𝐞𝐥𝐬 𝐟𝐨𝐫 𝐓𝐞𝐱𝐭 𝐂𝐥𝐚𝐬𝐬𝐢𝐟𝐢𝐜𝐚𝐭𝐢𝐨𝐧 - 𝐮𝐬𝐢𝐧𝐠 𝐅𝐞𝐰 𝐒𝐡𝐨𝐭…

GitHub

This repository contains a project that focuses on evaluating the performance of different Language Models (LLMs) for multi-class news classification. The project aims to assess how well LLMs can c...

#

Please do ⭐️ the repository if found useful 😊

low valley
#

Hello, everyone! 👋

Given how integral AI tools are becoming in our daily lives, understanding and managing AI Agents is a valuable skill to acquire in 2024. We can definitely expect AI Agents to become even more prevalent in the coming years for a wide range of tasks.

With this in mind, I've posted a new Kaggle notebook where we explore tools like CrewAI to create crews of different AI Agents, each with their own roles and background stories, to perform various tasks across different domains.

Feel free to check it out on Kaggle: https://www.kaggle.com/code/lusfernandotorres/empowering-ai-agents-with-gpt-4-turbo-and-crewai/notebook

Learn how to master AI Agents and stay ahead in the latest developments of tools like this one!

buoyant scaffold
#

Hey everyone!

I have created a notebook about the Improved Euler method, a numerical method for approximating the solution of a differential equation. I show the differences between the Euler method and the improved method.

https://www.kaggle.com/code/cauelias/improved-euler-method
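
For a quick feel of the difference, here is a minimal sketch comparing the two methods. It assumes the textbook form of the Improved Euler method (also known as Heun's method) and uses the test problem y' = y, y(0) = 1, which I picked because its exact solution at t = 1 is e. It is not taken from the notebook.

```python
import math


def euler(f, t0, y0, h, steps):
    """Classic Euler: follow the slope at the start of each step."""
    t, y = t0, y0
    for _ in range(steps):
        y += h * f(t, y)
        t += h
    return y


def improved_euler(f, t0, y0, h, steps):
    """Improved Euler (Heun): average the slopes at the start and the predicted end."""
    t, y = t0, y0
    for _ in range(steps):
        k1 = f(t, y)                   # slope at the start of the step
        k2 = f(t + h, y + h * k1)      # slope at the Euler-predicted end point
        y += h * (k1 + k2) / 2
        t += h
    return y


def growth(t, y):
    """Right-hand side of the test problem y' = y (chosen for illustration)."""
    return y


euler_error = abs(euler(growth, 0.0, 1.0, 0.1, 10) - math.e)
improved_error = abs(improved_euler(growth, 0.0, 1.0, 0.1, 10) - math.e)
print(euler_error, improved_error)
```

With the same step size the improved method lands much closer to e, which is the accuracy gain the notebook discusses.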

odd zephyr
#

Hi everyone,

I created a notebook performing image classification implemented in JAX. I used my dataset, which contains over 15,000 images and 30 different class labels, to train my model. If you guys are interested in using the dataset, in the notebook, I also show how to make a dataloader to interface with the particular file structure. You can use that as a starting point for your experiments!

Dataset: https://www.kaggle.com/datasets/alistairking/recyclable-and-household-waste-classification
Code: https://www.kaggle.com/code/alistairking/recyclable-waste-classification-with-jax

tawny bronze
dense cosmos
#

🚀 Excited to share my latest Kaggle notebook on predicting academic success! Explored deep insights with visuals, tested models like Logistic Regression, Decision Tree, Random Forest, and KNN. The winner? Random Forest with 82.12% accuracy! Check it out and drop an upvote if you find it helpful. Thanks for exploring! 📊📚 #DataScience #Kaggle #AcademicSuccessPrediction
[Notebook Link] - https://www.kaggle.com/code/sakshisatre/predicting-academic-success-eda-rfc

glossy storm
#

Divorce or Stay Dataset is a dataset published by some researchers. In their paper, they used many machine learning models to classify individuals into married or divorced just by asking 54 questions.

I found something interesting while exploring the dataset: you can actually use just one variable to classify the dataset without even using machine learning, achieving an accuracy of 98.25%, which is higher than the researchers' findings! Please check out the notebook right here and let me know what you think: Link
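
The "one variable, no machine learning" idea amounts to a single threshold rule. Here is a minimal sketch of how such a rule can be found by brute force. The toy answers and labels are invented for illustration and are not from the Divorce dataset, and the notebook may pick its variable and cut-off differently.

```python
def best_threshold(values, labels):
    """Try every midpoint between distinct values; predict 1 when value > threshold.

    Returns (threshold, accuracy) for the best-scoring cut.
    """
    distinct = sorted(set(values))
    midpoints = [(lo + hi) / 2 for lo, hi in zip(distinct, distinct[1:])]

    def accuracy(threshold):
        hits = sum((v > threshold) == bool(y) for v, y in zip(values, labels))
        return hits / len(labels)

    best = max(midpoints, key=accuracy)
    return best, accuracy(best)


# Hypothetical answers to one question (0-4 scale) and made-up 0/1 labels.
answers = [0, 0, 1, 1, 2, 3, 3, 4, 4, 1]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
print(best_threshold(answers, labels))
```

Because the threshold is chosen on the same rows it is scored on, the accuracy should be checked on held-out data before comparing it with a paper's results.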

dense flare
glossy storm
# dense flare Did you ask the individuals: "Are you married or divorced?" 😂

The dataset doesn't belong to me, and my post stated that clearly. It was collected by the authors of the paper titled "Divorce Prediction using Correlation-based Feature Selection and Artificial Neural Networks". I invite you to read it if you are curious about the dataset collection methodology: Link

dense flare
solid helm
raven parrot
cyan elbow
olive cargo
spice yarrow
#

Hello everyone, I have created an interesting project. Please check it out, and if you like it, please upvote and comment

lethal mica
slim mauve
#

Hey guys,

This is Arsalan from CAMB AI -- we've spent the last month building and training the 5th iteration of MARS, which we've now open sourced in English on GitHub https://github.com/camb-ai/mars5-tts

We have also been featured on VentureBeat: check it out here.
We'd really love it if you guys could check it out and let us know your feedback. Thank you!

GitHub

MARS5 speech model (TTS) from CAMB.AI. Contribute to Camb-ai/MARS5-TTS development by creating an account on GitHub.

spice yarrow
cedar moth
spice yarrow
lethal mica
#

Hello everyone, I recently added this satellite image classification project with 99% accuracy, where I employed Convolutional Neural Networks. It is a beginner-friendly project. I would be very thankful if anyone took a few minutes to check it out and upvote it if you think it is interesting work that could help you somehow. Thanks a lot for the support!!!: https://www.kaggle.com/code/edumisvieramartin/satellite-image-classif-99-accuracy

mossy mica
glossy storm
#

Hello Data Scientists and Machine Learning Enthusiasts,

We're thrilled to announce the release of three brand-new datasets, ready for you to explore and use in your predictive machine-learning projects. Dive into these datasets and enhance your models this weekend!

The Datasets:

🌬️ Asthma Disease Dataset

  • Description: A comprehensive dataset on asthma, including patient demographics, symptoms, and treatment outcomes.
  • Ideal For: Classification, regression.

🧠 Alzheimer's Disease Dataset

  • Description: Detailed information on Alzheimer's disease, featuring cognitive test results, MRI scans, and genetic data.
  • Ideal For: Predictive modeling, clustering.

🏥 Parkinson's Disease Dataset Analysis

  • Description: An extensive dataset on Parkinson's disease, including voice recordings, medical history, and symptom progression.
  • Ideal For: Feature extraction, classification.

Get Involved:

Share your findings and models with the community. Join the discussion on our forum and let us know how you’re using these datasets in your projects.

Happy coding, and may your models be ever accurate!


Stay tuned for more updates and future releases. Let's continue to push the boundaries of what's possible with data and machine learning.

lethal mica
odd zephyr
glossy storm
#

🎉 Another One: Two New Datasets Dropped on Kaggle! 🎉

Hey data enthusiasts! 🎉

We've just released not one, but two amazing new datasets on Kaggle. We're here with another one to keep your data cravings satisfied. Check them out:

  1. Predicting Hiring Decisions in Recruitment Data: Dive into this dataset to explore the factors influencing hiring decisions. Perfect for HR analytics and predictive modeling!

  2. Predicting Manufacturing Defects Dataset: This dataset offers a rich collection of data on manufacturing defects, ideal for improving quality control and reducing waste in production processes.

Get ready to enhance your data science projects with these fresh additions. Happy coding and may your insights be plentiful!

Stay tuned, consider following for more updates because you know there's always... another one.

sterile coral
glossy storm
#

🦹‍♀️ 🕹️ Yes, It's Another One: Not Just A Dataset But A Game! 🎮 👾

Hey data enthusiasts and superhero fans! 🎉 Get ready for an exhilarating data science game that brings epic superhero battles to your fingertips. Our latest dataset launch is here to provide endless fun and challenge:

🦸 Fictional Character Battle Outcome Prediction 🦸

Step into the arena of legendary showdowns with our new dataset. It's time to put on your data science cape and dive into the ultimate gameplay experience. Here’s what awaits you:

Gameplay Features:

  1. Battle Outcome Predictions: Use your data science skills to predict the winner of epic battles between iconic characters from Marvel and DC Comics. Can your model accurately forecast the victor based on attributes like strength, speed, and intelligence?

  2. Attribute Analysis: Uncover the secrets behind each battle. Analyze which character traits and special abilities are the most decisive in determining the outcome. Is it raw strength or strategic intelligence that wins the day?

  3. Scenario Simulation: Become the game master! Adjust character attributes and simulate different battle scenarios. Experiment with changes in special abilities and weaknesses to see how they impact the fight.

Are you ready to embark on this data science adventure and emerge as the ultimate champion in the battle of heroes and villains? Dive in now and let the games begin!

🔗 Check out the dataset here

Stay tuned, consider following for more updates because you know there's always… another one.

glossy storm
#

🎉 Another One: Five New Datasets Dropped on Kaggle! 🎉

Hey data enthusiasts! 🎉 We've just released not one, but FIVE amazing new datasets on Kaggle. We're here with another one to keep your data cravings satisfied. Check them out:

  1. 🐾 Predict Pet Adoption Status Dataset: Explore the factors influencing pet adoption and help shelters optimize their processes. Perfect for animal welfare analytics and predictive modeling!
    Predict Pet Adoption Status Dataset

  2. 📱 Predict Consumer Electronics Sales Dataset: Dive into sales data for consumer electronics, ideal for understanding market trends and boosting sales strategies.
    Predict Consumer Electronics Sales Dataset

  3. 🏠 Predict Smart Home Device Efficiency Dataset: Analyze the efficiency of smart home devices.
    Predict Smart Home Device Efficiency Dataset

  4. 🎓 Predict Online Course Engagement Dataset: Discover the factors that drive student engagement.
    Predict Online Course Engagement Dataset

  5. 🛍️ Predict Customer Purchase Behavior Dataset: Investigate consumer behavior.
    Predict Customer Purchase Behavior Dataset

Get ready to enhance your data science projects with these fresh additions. Happy coding and may your insights be plentiful!

Stay tuned, consider following for more updates because you know there's always… another one.

wind bolt
spice yarrow
sharp horizon
#

I'm just starting out, so any feedback would be valuable

cloud flume
#

I'm Excited to Share My New Article on Implementing Linear Regression from Scratch in Python!

As a machine learning & data science enthusiast, I'm thrilled to share my latest article on Medium, where I dive deep into implementing linear regression from scratch in Python, without relying on any machine learning (ML) packages like scikit-learn.

Read Here:

https://medium.com/@amitsubhashchejara/linear-regression-from-scratch-in-python-ee1a955e49ed

Medium

Learn the implementation of linear regression from scratch in pure Python. Cost function, gradient descent algorithm, training the model…
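
For anyone who wants a feel for the idea before opening the article, here is a minimal sketch of linear regression with a mean-squared-error cost and batch gradient descent in pure Python. It is not the article's code: the function names (`predict`, `cost`, `fit`), the learning rate, the epoch count, and the toy data are all assumptions made for illustration.

```python
def predict(x, w, b):
    """Linear model with one feature: y_hat = w * x + b."""
    return w * x + b


def cost(xs, ys, w, b):
    """Mean squared error, halved so the gradient has no factor of 2."""
    n = len(xs)
    return sum((predict(x, w, b) - y) ** 2 for x, y in zip(xs, ys)) / (2 * n)


def fit(xs, ys, lr=0.05, epochs=2000):
    """Batch gradient descent on the cost above (hypothetical defaults)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [predict(x, w, b) - y for x, y in zip(xs, ys)]
        dw = sum(e * x for e, x in zip(errors, xs)) / n  # d(cost)/dw
        db = sum(errors) / n                             # d(cost)/db
        w -= lr * dw
        b -= lr * db
    return w, b


# Toy data generated from y = 2x + 1 (made up for this sketch).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit(xs, ys)
print(f"w={w:.3f}, b={b:.3f}, cost={cost(xs, ys, w, b):.2e}")
```

On noiseless data like this the fitted `w` and `b` should land very close to the slope and intercept used to generate it; real data would also need feature scaling and a tuned learning rate.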

normal spoke
#

Hope you do the same with logistic regression

cloud flume
normal spoke
#

and it especially helps with the math too

#

For sure!

cloud flume
normal spoke
#

You are on fire man

cloud flume
normal spoke
#

I will try to keep updated

normal spoke
cloud flume
normal spoke
#

BSc?

cloud flume
normal spoke
#

Ohh

cloud flume
#

I have done BSc math

normal spoke
#

what made you jump into ml?

cloud flume
normal spoke
#

Ah

#

how long have you been in Ml for?

cloud flume
normal spoke
#

Oh damm

#

The post was structured very well, I thought you had been doing it for years xd

cloud flume
#

What are you pursuing?

normal spoke
# cloud flume What are you pursuing?

I am still in high school. I started ML recently because it had been interesting me for a while, and if everything goes well I will probably pursue it as a career

cloud flume
#

You're from India?

normal spoke
#

Nope , Egypt

sharp horizon
sharp horizon
wind bolt
sharp horizon
snow cloud
#

Hi, do we have any dataset that intersects the renewable energy and NLP domains? Kindly guide.

lethal mica
sharp horizon
lofty barn
sharp horizon
pearl silo
sharp horizon
sharp horizon
sharp horizon
sharp horizon
spare ridge
#

I have just created a TikTok videos dataset to build a video classification model for classifying harmful content for children. It is a total of 30 GB with approximately 3,000 videos. Feel free to explore it; I also provide a notebook to show how to fine-tune a Hugging Face model for this dataset.

https://www.kaggle.com/datasets/anhoangvo/tikharm-dataset/

https://www.kaggle.com/code/anhoangvo/how-to-use-hugging-face-for-fine-tuning-on-the-tik

Let me know if there are any specific changes or additional details you’d like to include!

pliant chasm
cloud flume
lethal mica
spice yarrow
#

Hi everyone, I am new to machine learning and currently learning. I've created a notebook on the legendary Titanic dataset. Please take a look, and if you have any suggestions or ideas to increase accuracy, feel free to comment and upvote. Your feedback is greatly appreciated! https://www.kaggle.com/code/abhishek0032/titanic-survival-prediction-feature-engineering

sharp horizon
hollow willow
#

Hello all! I just hosted a live online workshop on Knowledge Graphs, featuring Video Game Sales as our case study for RAG. Take a look and let me know what you think! Your honest feedback and reactions are much appreciated. Thanks! 😊🤩🚀 https://www.youtube.com/watch?v=9wqVz0LDYgg&ab_channel=DecodingDataScience

Get ready to dive into the world of natural language querying with Langchain and Neo4j! Learn how to interact with graph databases using cypher query language and discover the power of combining these two technologies. Whether you're at the start of your career or a seasoned expert, this event is perfect for anyone interested in data querying an...

tacit crown
potent garden
lethal mica
#

Hello friends, I have created a beginner-friendly notebook where I used three different methods to remove outliers from the "Rating" feature in a TV shows dataset on Kaggle. The aim of this project is to help those getting started on Kaggle by demonstrating how statistical dispersion metrics are used in practice to remove outliers from a dataset. This facilitates data processing, which is one of the most important tasks in a data scientist's daily routine. I hope you like it and find it helpful: https://www.kaggle.com/code/edumisvieramartin/remove-outliers-iqr-std-zscore-beginners
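
As a rough illustration of the kind of filters the notebook title refers to, here is a minimal sketch using only the standard library. It is not the notebook's code: the helper names (`iqr_filter`, `zscore_filter`), the thresholds, and the sample ratings are assumptions. The "mean ± k·std" rule is the same cut as a z-score threshold of k, so it is covered by the second function.

```python
import statistics


def iqr_filter(values, k=1.5):
    """Keep values inside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if lo <= v <= hi]


def zscore_filter(values, threshold=3.0):
    """Keep values whose |z-score| is at most `threshold`.

    Equivalent to keeping values within mean +/- threshold * std.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return list(values)  # all values identical, nothing to remove
    return [v for v in values if abs((v - mean) / std) <= threshold]


# Made-up ratings with one obviously low value.
ratings = [7.1, 7.4, 6.9, 7.8, 8.0, 7.2, 7.5, 6.8, 7.3, 1.0]
print(iqr_filter(ratings))
print(zscore_filter(ratings, threshold=2.5))
```

One design note: in very small samples a single extreme value inflates the standard deviation, so its own z-score can stay below 3. That is why the usage line passes a lower threshold here; the IQR rule is less sensitive to this.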

lethal mica
glossy storm
#

🚀 Announcing the Launch of FRAUDFIGHTER: A Comprehensive Notebook for Credit Card Fraud Detection 🚀

Hello, fellow data enthusiasts!

I am excited to share with you my latest notebook, FRAUDFIGHTER: Detecting Credit Card Fraud with 97% Accuracy. After seeing numerous notebooks struggle with data imbalance and outlier handling, I decided to create and publish this notebook to address these issues effectively.

Have you ever wondered what’s more important: overall accuracy or reducing the number of fraudulent transactions classified as legitimate? This notebook delves into this critical question and provides insights and solutions to improve fraud detection models.

I invite you to explore FRAUDFIGHTER, try out the techniques, and share your thoughts. What do you think about the methods used? How did they impact your model's performance?

Looking forward to your feedback and opinions! Let’s make fraud detection more robust and accurate together!

potent garden
#

Check out my very colorful EDA

glossy storm
atomic fog
#

Hello everyone,
In my Data Analysis and Statistical Analysis project, I analyzed the used car market in the United States using approximately 264,000 used car data points that I scraped from the Edmunds.com website. Through this analysis I tried to build an understanding of the second-hand vehicle market in the United States. You can find the code and detailed analysis for this project at the link below.
https://www.kaggle.com/code/emirtatlc/eda-and-analysis-of-used-car-data-edmunds-com

sharp horizon
potent garden
sharp horizon
sharp horizon
pure grotto
potent garden
#

Please do check out my notebook. It will help you understand deep learning concepts

brittle mason
tacit crown
lethal mica
snow cloud
#

Hi, I need to deploy a simple Iris classifier on Azure via a CI/CD pipeline. Can someone please help me with this?

lethal mica
lofty barn
urban kayak
#

Hello everyone!

Explore my Dataquest project portfolio to dive into the world of data science and analysis. Gain hands-on experience and practical insights as you learn web scraping, API usage, and the essential skills of exploring, cleaning, visualizing, and modeling data using Python, SQL, Excel, and Power BI.

Whether you're a beginner looking to start your data science journey or an experienced practitioner aiming to broaden your expertise, these projects offer invaluable opportunities to enhance your knowledge and skills.

https://www.kaggle.com/datasets/medalytics/dataquest-projects

glossy storm
#

🎓📊 Unlock insights with our comprehensive Students Performance Dataset! Dive into demographics, study habits, and academic outcomes of 2,392 high school students. Perfect for research and predictive modeling! Explore now: Link To Dataset

sharp horizon
lethal mica
supple prism
spice yarrow
lethal mica
nocturne merlin
#

Language Modelling and Text Generation are Trending Now. Check out my Latest Notebook

Topic :
Step by Step Language Modelling with Bengali Text Corpus for Text Generation and Text Completion

https://www.kaggle.com/code/sayankr007/bengali-text-generation-and-language-modelling

Visit the above link to access the work and let me know your insights and thoughts in comments.

pliant chasm
#

I should be studying for my exams. So, naturally I did a very quick project. I used DBSCAN for string clustering in the recently launched community competition, 'Manufacturer Name Clustering.' I’d appreciate any feedback, and if this sounds interesting, consider joining this unique competition as well. https://www.kaggle.com/code/lennarthaupts/matching-firm-names-using-dbscan
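
To give a sense of how density-based clustering can group near-duplicate names, here is a small from-scratch sketch. It is not the notebook's code: the distance (one minus `difflib.SequenceMatcher` ratio), the `eps` and `min_samples` values, and the company names are all assumptions, and a real entry would more likely use a library implementation of DBSCAN with a tuned string metric.

```python
from difflib import SequenceMatcher


def distance(a, b):
    """Dissimilarity in [0, 1]; 0 means identical after lowercasing."""
    return 1.0 - SequenceMatcher(None, a.lower(), b.lower()).ratio()


def dbscan(items, eps=0.3, min_samples=2):
    """Return one label per item: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(items)

    def neighbors(i):
        # Every index within eps of item i (the item itself included).
        return [j for j in range(len(items))
                if distance(items[i], items[j]) <= eps]

    cluster = -1
    for i in range(len(items)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_samples:
            labels[i] = -1          # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(nbrs)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_samples:
                queue.extend(j_nbrs)  # j is a core point, keep expanding
    return labels


# Hypothetical manufacturer names.
names = ["Acme Corp", "ACME Corp.", "Acme Corporation",
         "Globex", "Globex Inc", "Initech"]
for name, label in zip(names, dbscan(names)):
    print(label, name)
```

The appeal of this approach for name matching is that the number of clusters does not have to be fixed in advance, and names with no close match are simply left as noise.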

spice yarrow
slim moat
#

don't forget to upvote the notebook if you like it 😍

lethal mica
willow dragon
lofty barn
slim moat
high ermine
#

Hi everyone, I like to create some "dumb" projects. I created a Discord bot that shows how many hours were wasted on YouTube. You just submit a video or a playlist, then all the videos are stored and it displays how much time was watched in total. YouTube does not provide this information, so I just multiply the views by the video duration. Currently there are more than 10M years watched
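
The views-times-duration estimate described above can be sketched in a few lines. This is not the bot's code: the dictionary keys, function names, and sample numbers are made up, and the estimate assumes every view watched the whole video.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def estimated_watch_seconds(videos):
    """Sum of views * duration over all videos (assumes full watches)."""
    return sum(v["views"] * v["duration_seconds"] for v in videos)


def to_years(seconds):
    return seconds / SECONDS_PER_YEAR


# Hypothetical playlist entries.
playlist = [
    {"title": "hypothetical video A", "views": 1_000, "duration_seconds": 600},
    {"title": "hypothetical video B", "views": 10, "duration_seconds": 60},
]
total = estimated_watch_seconds(playlist)
print(f"{total} seconds, about {to_years(total):.4f} years")
```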

slim moat
#

all of the Plotly plots + most of the techniques and animation

potent garden
lethal mica
olive hemlock
slim moat
#

video games sales deep analysis + 3 models

#

I made a notebook on video game sales with deep analysis and 3 models

#

You can take a look at it, and if you like it, don't forget to leave a small upvote

lofty barn
lethal mica
glossy storm
#

🥐 Rohlik Competition: 8 Ready-to-Use Scripts

Hello everyone! With just 20 days left in the Rohlik competition, check out this notebook featuring 8 different ideas and 8 ready-to-use scripts. Pick the one that resonates with you and develop it into something even more powerful. Let’s finish strong! Explore the notebook here.

cloud flume
slim moat
#

movies data set

#

simple movie recommendation system with machine learning

#

model

lethal mica
lethal mica
atomic fog
willow dragon
slim moat
#

how to make your own linear regression model from scratch using mathematics

#

how linear regression models work backstage

umbral thicket
slim moat
#

how to 100% automate the process of data cleaning, EDA, and modeling with AI automated tools

#

This notebook could help you save time, as these tools handle the whole process in very little code, most of them in 1 line, and in a relatively short time

slim moat
#

how to make your own features scaling with simple math and python

#

from scratch
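
For reference, two common scaling schemes fit in a few lines of plain Python. This is only a sketch of the general idea, not the notebook's code; the function names and the choice of min-max scaling and standardization are assumptions.

```python
def min_max_scale(values):
    """Rescale to [0, 1]: (v - min) / (max - min)."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]  # constant feature
    return [(v - lo) / span for v in values]


def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    if std == 0:
        return [0.0 for _ in values]  # constant feature
    return [(v - mean) / std for v in values]


print(min_max_scale([10, 20, 30]))
print(standardize([2.0, 4.0, 6.0]))
```

In practice the minimum, maximum, mean, and standard deviation should be computed on the training split only and reused for the test split.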

slim moat
#

How to make your own logistic regression model from scratch using only math and Python, and how the model works backstage.
Seeing how a model works from zero is very important: it gives you a wide view and a high degree of control over the model.
This notebook shows you how to:
1- make your own logistic regression model from scratch
2- apply regularization from scratch to prevent overfitting
3- build feature scaling from scratch for the data you will feed to the model
4- write from scratch the evaluation function that measures model accuracy
**All of this will improve your understanding and show you what happens backstage in each step of the process.**
100% accuracy model
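
The steps listed above can be sketched roughly as follows. This is not the notebook's code: the function names, the L2 penalty strength, the learning rate, and the tiny already-centered toy dataset are assumptions chosen to keep the example short.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=2000, l2=0.01):
    """Batch gradient descent on log loss with an L2 penalty on the weights."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi                      # derivative of log loss w.r.t. z
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # The l2 * wj term is the regularization; the bias is not penalized.
        w = [wj - lr * (gw / n + l2 * wj) for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(x, w, b):
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if sigmoid(z) >= 0.5 else 0


def accuracy(X, y, w, b):
    """Fraction of samples whose predicted class matches the label."""
    return sum(predict(xi, w, b) == yi for xi, yi in zip(X, y)) / len(y)


# Made-up, already-centered single feature; labels flip at x = 0.
X = [[-2.5], [-1.5], [-0.5], [0.5], [1.5], [2.5]]
y = [0, 0, 0, 1, 1, 1]
w, b = fit_logistic(X, y)
print(w, b, accuracy(X, y, w, b))
```

A perfect score on a separable toy set like this says nothing about generalization; on real data the accuracy should be measured on a held-out split.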

cedar moth
spice yarrow
slim moat
#

How to make a simple ANN (artificial neural network) from scratch?!
In this notebook I will show you how an artificial neural network (ANN) is built: how layers are made, how the activation function is implemented in the neuron, and everything else in a simple neural network.
This will help you understand what is happening backstage in an ANN. It is also amazing and fun
to see how something inspired by the brain can be implemented in code, even from scratch 🤯
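
As a taste of what "layers" and "activation in the neuron" look like in code, here is a minimal forward pass in plain Python. It is not the notebook's code: the function names are assumptions, training (backpropagation) is left out, and the weights are hand-picked so that the tiny network computes XOR.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias, activation=sigmoid):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    return activation(sum(w * x for w, x in zip(weights, inputs)) + bias)


def dense_layer(inputs, weight_rows, biases):
    """A layer is a list of neurons that all see the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]


def forward(inputs, layers):
    """Feed the output of each layer into the next one."""
    for weight_rows, biases in layers:
        inputs = dense_layer(inputs, weight_rows, biases)
    return inputs


# Hand-picked weights (not learned): hidden neurons act like OR and NAND,
# and the output neuron acts like AND, which together give XOR.
XOR_LAYERS = [
    ([[20.0, 20.0], [-20.0, -20.0]], [-10.0, 30.0]),
    ([[20.0, 20.0]], [-30.0]),
]

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, round(forward([a, b], XOR_LAYERS)[0]))
```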

supple prism
languid vapor
languid vapor
glossy storm
#

We are excited to announce our latest collaborative project, Fruit-Veg Image Classification | CNN Acc 97%, developed by me and Data Scientist Edumis Viera Martin.

In this notebook, we implemented a Convolutional Neural Network (CNN) to achieve 97% accuracy on classifying images of fruits and vegetables.

Join us on this journey, explore the code, and feel free to share your feedback and improvements.

Check out the notebook here: Fruit-Veg Image Classification | CNN Acc 97%

Happy coding! 🚀

verbal inlet
#

Hey there!

I wanted to share something a friend and I built recently: it's a VS Code extension that connects your local Jupyter notebooks to remote GPUs.

One annoying thing I've often experienced in ML research is how much hassle it is to actually get an experiment running on GPUs: you need to provision the GPU, get it set up with the right environment and packages, SSH into it and then move your code across.

That's why we made Moonglow: it's the ease of switching runtimes like in Colab/Kaggle notebooks, but living in your own IDE and with the cloud GPUs you want e.g. AWS or Runpod (rather than whatever Google decides to offer).

Try it out for free at https://moonglow.ai/, and of course, my DM's are open if you have any issues or need help setting it up!

spice yarrow
white silo
arctic tulip
sudden prairie
#

Hey everyone! I've just published a detailed blog on Medium that walks through the end-to-end implementation of a Machine Learning model. This guide covers everything from data preprocessing, model building, and evaluation, all the way to deploying the model using Streamlit for real-time predictions.

If you're interested in learning how to take your ML projects from concept to production, this blog is for you. Whether you're a beginner or looking to refine your deployment skills, you'll find something valuable here!

Check it out and let me know your thoughts!
https://medium.com/@saumya.nishi96/5-game-changing-insights-mastering-network-anomaly-detection-with-machine-learning-87b13891662f

Medium

Step by Step framework for revitalising ML models

slim moat
#

fine-tuning a CNN MobileNet model on sign language images with 99.9% accuracy

high trench
long nexus
#

Hello, I have created my first tutorial notebook on the Kaggle platform. I would be very happy if you could review it and provide feedback on how I can improve myself.
https://www.kaggle.com/discussions/getting-started/529712
https://www.kaggle.com/code/karamel03/how-to-use-mcc-step-by-step/notebook?scriptVersionId=193601960
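
For readers who have not met MCC (Matthews correlation coefficient) before, here is a minimal sketch of the binary formula computed from confusion-matrix counts. It is not taken from the tutorial; the function name and the convention of returning 0.0 when the denominator is zero are assumptions.

```python
import math


def mcc(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


print(mcc(tp=5, tn=5, fp=0, fn=0))    # perfect predictions
print(mcc(tp=0, tn=0, fp=5, fn=5))    # every prediction wrong
print(mcc(tp=90, tn=0, fp=10, fn=0))  # always predicts the positive class
```

The last call is the interesting one: a model that always predicts the majority class can reach 90% accuracy on this made-up split while its MCC stays at zero, which is why MCC is often preferred on imbalanced data.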

still rivet
#

Is your ML script too cluttered for manual logging? 😵 No worries—here’s the seamless solution you’ve been looking for! ✨ Check out the project and give a pull request : https://logllm.tiiny.site/

glossy storm
#

I wanna share my latest collaborative project with the data scientist Anna Balatska (@annastasy). Together, we explored 13 pretrained models.

We used transfer learning to build a binary image classification system.
Dive into our notebook to see how these models stack up against each other in the fight against cancer.

🔗 Check it out now!

Your feedback and thoughts are always welcome! 😊

serene flower
#

🚀 Excited to Share My Latest Kaggle Notebook! 🚀

I recently completed a Kaggle Notebook titled "📚 PyTorch 101: Mastering MNIST Digit Recognition." This notebook is a beginner-friendly guide designed to help those interested in learning PyTorch, focusing on the foundational concepts by working through the classic MNIST digit recognition task.

In this notebook, I dive into:

🧠 Understanding PyTorch Basics: Breaking down the structure and key concepts to help you get started with PyTorch.
🔢 Implementing a Simple Neural Network: From loading the dataset to building and training the model, I cover each step in detail.
📊 Visualizing Results: Providing insights into the model's performance through clear visualizations and explanations.
Whether you're new to PyTorch or looking to solidify your understanding, I hope this notebook serves as a valuable resource in your learning journey.

Check it out on Kaggle and let me know your thoughts! 💬

Link: https://www.kaggle.com/code/muhammadfurqan0/pytorch-101-mastering-mnist-digit-recognition

cinder tendon
still rivet
slim moat
#

ocular diseases deep analysis + CNN model

spice yarrow
languid vapor
high trench
still rivet
potent garden
acoustic sonnet
rotund smelt
languid vapor
noble sleet
#

This is my first project using a Neural Network to analyze the Medical Cost Personal dataset. I'm excited to explore the capabilities of deep learning. If you find this notebook helpful or have any suggestions for improvement, please feel free to upvote or leave feedback!

https://www.kaggle.com/code/dinanksoni/medical-cost-personal-using-neural-network

languid vapor
heady pivot
celest spruce
robust kayak
languid vapor
acoustic pine
languid vapor
thorn heath
slim moat
#

AI game agent project that can play the Doom game

#

the model and the full code for the project to use

languid vapor
fathom lantern
sharp horizon
sharp horizon
lofty barn
tacit crown
somber spear
agile umbra
#

Dear Mates, check my new notebook on "Optimizing Sentiment Analysis Using Transformers🧠"
would be grateful if you could suggest any feedback 🙂

https://www.kaggle.com/saurabhbadole/optimizing-sentiment-analysis-using-transformers


Additionally, I would be glad if you could share your work on the dataset below.

https://www.kaggle.com/datasets/saurabhbadole/crime-incidents-in-los-angeles-2020-to-present

https://www.kaggle.com/datasets/saurabhbadole/indian-stock-market-master-data-24

glossy storm
#

Excited to share my latest notebook on car price prediction! In this analysis, we explore how the stacking regression method can significantly outperform basic approaches that use single models and hyperparameter tuning. The notebook showcases the power of combining multiple models to enhance predictive accuracy.

View the Notebook

🔍 Key Highlights:

  • Implementation of stacking regression using top-performing models.
  • Comparison of results with traditional single-model approaches.
  • Insights into how complexity can impact performance and the importance of model selection based on dataset characteristics.

Check out the notebook to see how stacking regression can be a game-changer for your predictive modeling tasks! Feel free to dive in and explore the techniques used.

View the Notebook

languid vapor
flat monolith
proper merlin
languid vapor
tacit crown
#

Hi guys, I finally uploaded my final approach for the recent Amazon ML Challenge, compiled in one notebook, and I have documented everything 🙂 Please check it out and let me know your views 🙂

https://www.kaggle.com/code/suvroo/bert-paddleocr-amazon-ml-final-approach

https://www.kaggle.com/code/suvroo/trocr-amazon-ml-approach-1-high-computing

short jetty
languid vapor
pastel sapphire
#

Hello Kagglers!

🚀 Excited to share my latest Kaggle project: Text Data Preprocessing & Sentiment Analysis! 🎉

In this notebook, I dive deep into text preprocessing techniques and sentiment analysis, leveraging key tools and methods, including:

  • PorterStemmer and word_tokenize for text cleaning
  • WordCloud for visualizing the most frequent words
  • TextBlob and SentimentIntensityAnalyzer for sentiment analysis
  • CountVectorizer for feature extraction
  • Performance evaluation using accuracy_score, classification_report, confusion_matrix, and ConfusionMatrixDisplay

If you're keen on mastering text data workflows and sentiment analysis, check it out here: https://www.kaggle.com/code/asadozzaman/text-data-preprocessing-sentiment-analysis

shell wyvern
#

Hey folks!

I am an AI developer, particularly focused on NLP, and I’m looking for someone with deep experience in this field to collaborate on several projects. The ideal companion should have a strong background in NLP, with multiple projects under their belt. If you’re a beginner, please refrain from contacting me.

To demonstrate my expertise, here’s one of my best projects:

Adify AI: A website where users can enter any prompt, and the model will generate playlists on Spotify based on that input. The platform uses a trained Transformer model and integrates a FAISS index for efficient search by comparing embedding matrices to deliver the best playlist options.

Please don’t reach out if your timezone differs significantly from Vienna’s (CET).

flat monolith
flat monolith
spare ridge
#

A simple notebook to run the ComfyUI GUI with localtunnel. With a P100 GPU, it takes ~23 seconds to generate 4 images. It is pretty good.
https://www.kaggle.com/code/anhoangvo/run-comfy-gui-with-localtunnel-on-kaggle

Another notebook that combines a Hugging Face LLM with ComfyUI to generate images for text stories. This notebook uses the ComfyUI API instead of the GUI for automation.
https://www.kaggle.com/code/anhoangvo/generate-images-for-stories-using-llm-and-comfyui

tacit crown
neat flicker
#

Hey everyone! I dove into the Polars library for the latest playground series competition to see if the hype around it was legitimate. I think it is! Check out these notebooks to see for yourself:
https://www.kaggle.com/code/jonbown/car-price-regression-with-polars-vs-pandas/notebook

https://www.kaggle.com/code/jonbown/regression-with-polars

celest spruce
celest spruce
pastel sapphire
#

🌟 Excited to share my latest project on Kaggle, where I dive deep into the factors influencing student performance in exams! 📚✨ By analyzing key variables such as study habits, attendance, parental involvement, access to resources, and extracurricular activities, I aim to uncover the most significant predictors of academic success.

This journey is all about transforming data into actionable insights that can inform educational strategies and interventions, ultimately benefiting educators, policymakers, and parents. Together, we can enhance student outcomes and pave the way for brighter futures!

Looking forward to sharing findings and learning from the community. Let's elevate education! 🚀👩‍🎓👨‍🎓

check it out here: https://www.kaggle.com/code/asadozzaman/pathways-to-predicting-student-success

paper flicker
#

google_search_rs: Rust Crate to Parse Google Search Results into CSV and dataframe.

I recently completed the basics of Rust and worked on a few simple projects. I wanted to take on something worthwhile, and while working on a project at my organization I noticed that nothing like this existed in Rust.

So I started on this yesterday, and it went so well that I published the crate in around 25 hours.

crate link: https://crates.io/crates/google_search_rs
github link: https://github.com/ChiragChauhan4579/google_search_rs/

Have attached a demo for the same.

fathom lantern
sturdy sequoia
#

Hey everyone! 👋

I've just published a new Kaggle notebook titled Breast Cancer Prediction

This notebook makes extensive use of MLflow and DagsHub for experiment tracking, enhancing collaboration.

Table of Contents :

Breast Cancer Prediction Using Machine Learning
Problem Understanding
Import Libraries
Load and Explore the Data
Create DataFrame
Data Exploration and Visualization
Check for Missing Values
Statistical Summary
Distribution of Features
Boxplots of Features
Correlation Matrix
Data Preprocessing
Separate Features and Target
Handle Outliers Using IQR
Log Transformation
Train-Test Split
Class Distribution in Training Set
Handling Imbalanced Data with SMOTE
Feature Scaling
Model Training and Evaluation
Logistic Regression
K-Nearest Neighbors
Support Vector Machine
Decision Tree Classifier
Initialize DagsHub for MLflow Tracking
Model Tracking with MLflow
Set Tracking URI
Define Function to Log Model Results
Log Initial Models
Feature Selection
Correlation Matrix Feature Selection
Model Training with Selected Features
Mutual Information
Sequential Feature Selection
Embedded Methods
Log Models After Feature Selection
Hyperparameter Tuning
Grid Search with KNN
Grid Search with Logistic Regression
Random Search with KNN
Random Search with Logistic Regression
Log the Tuned Models
Register Best Model
Create Pipeline

If you're interested in machine learning, data preprocessing techniques, model evaluation, or learning how to integrate MLflow and DagsHub into your projects, this notebook has something for you.

Check out the notebook here.
https://www.kaggle.com/code/mmfsnol/breast-cancer-prediction
Appreciate your vote if it benefits you 👍

potent garden
fathom lantern
dark laurel
#

Hi all,

I'll be sharing a series of notebooks exploring the applications of Linear Algebra in data science using Python.

Part 1 Notebook: https://www.kaggle.com/code/yogitamutyala/applications-of-linear-algebra-in-data-science-i

In this part, I delve into the key concepts and applications of Linear Algebra in the field of Machine Learning. This blog series aims to provide practical examples and code snippets to help you understand the concepts better.

Stay tuned for the upcoming parts!

velvet holly
#

Automating Home Loan Approval Process with Machine Learning

In this project, I developed a credit scoring model aimed at automating the evaluation process for home loan applications in banks. The model focuses on predicting the risk level of applications, specifically forecasting whether applicants will repay the loan or default. The goal is to detect high-risk applications early and provide clear explanations for rejected applications.

The dataset consists of financial information from 5,960 applicants. The target variable indicates whether an applicant failed to repay the loan. I utilized various machine learning algorithms to improve the model's accuracy and thoroughly analyzed the results.

You can access the project through this link: https://www.kaggle.com/code/oguzuzan/home-equity-loan-prediction

olive hemlock
spice delta
#

Hey everyone,

This is Harsh and I am currently working as an AI researcher and have experience in machine learning. I recently started participating in Kaggle and would really appreciate it if you could take a look at my notebook for the Used Car Regression competition and provide feedback. Here’s the link: https://www.kaggle.com/code/harshsharma1128/used-car-regression/notebook?scriptVersionId=199146028.

Any insights or suggestions would be incredibly helpful. Thanks in advance!

pastel sapphire
#

Hey everyone,

Excited to share my latest project on Topic Modeling! In this notebook, I dive into various powerful techniques like LDA, BERTopic, GSDM, and Non-negative Matrix Factorization (NMF) to uncover hidden topics within datasets. From understanding the problem to detailed data exploration and preprocessing, this project covers key steps in finding insightful patterns. Check it out on Kaggle! https://www.kaggle.com/code/asadozzaman/topic-modeling-in-nlp-with-abc-news-sample

#TopicModeling #nlp #MachineLearning #DataScience #KaggleNotebooks #LDA #BERTopic #GSDM #NMF #DataExploration #DataPreprocessing

west karma
#

Hey everyone,

I created user comments for a fictional mobile application so I could run analysis projects on them. I have also published this dataset for other Kaggle users. If you want to examine the dataset and the notebooks, here they are.

Dataset: https://www.kaggle.com/datasets/sanlian/app-store-reviews-for-a-mobile-app
Labelling Notebook: https://www.kaggle.com/code/sanlian/auto-review-labelling
Analysis Notebook: https://www.kaggle.com/code/sanlian/app-store-reviews-sentiment-analysis-and-wordcloud

languid vapor
tacit crown
#

Hi guys, check out these 2 notebooks I made on explainable AI, with deep math intuition too 😉

https://www.kaggle.com/code/suvroo/fooling-the-neural-network-adverserial-attack

https://www.kaggle.com/code/suvroo/how-ai-interprets-explainable-ai-lime-shap

pastel sapphire
#

🚢 New Project Alert: Beginner-friendly Guide to Titanic Classification 🧑‍💻

I'm excited to share my latest machine learning project using the Titanic dataset from Kaggle, designed to help beginners easily understand the process of building a classification model!

🔍 What's Inside:

  0. Introduction
  1. Problem Understanding
  2. Importing Libraries
  3. Data Exploration
  4. Data Preprocessing
  5. Exploratory Data Analysis (EDA)
  6. Label Encoding
  7. Feature Engineering
  8. Target Selection
  9. Model Fitting & Predictions
  10. Evaluating Model Performance
  11. Hyperparameter Tuning
  12. Feature Importance
  13. Final Decision

This project is a step-by-step guide, perfect for those looking to dive into Titanic classification while learning the key concepts of data science. I'm excited to contribute and look forward to feedback from the Kaggle community! 🌟

Check it out and let’s discuss: 👉 https://www.kaggle.com/code/asadozzaman/beginner-friendly-guide-to-understanding-titanic

#MachineLearning #DataScience #Kaggle #TitanicDataset #AI #BeginnerFriendly #Classification #um-game-playing-strength-of-mcts-variants

languid vapor
low valley
#

Hey folks! Just wanted to share my latest project with you all, CapyTrader! 🦫 📊

This is a light-hearted project I've completed as an introduction to AI Engineering, the art of developing and deploying efficient AI-powered applications by using proven models.
This project will show you how to:

• Use YFinance to fetch financial data
• Feed this data to an AI bot powered by GPT-4
• Build a simple, yet intuitive, user interface with Flask

Feel free to check out the GitHub repository below to run CapyTrader on your local machine and try it yourself! 🚀

https://github.com/luuisotorres/capytrader-ai

GitHub

CapyTrader is an AI-powered bot designed to analyze stock data and provide opinions on future price movements. Just for fun. - luuisotorres/capytrader-ai

potent garden
velvet holly
spice yarrow
lament nimbus
frank peak
#

[S01.E04 - Elemental Insight: Pokémon Type and Base Stats]

🎮🍃 This episode takes me back to one of my favorite childhood memories—playing Pokémon. Like many…

serene flower
#

🚀 Happy to Share My Latest Kaggle Notebooks! 🚀

1️⃣ Wheels & Deals: Regression Modeling for Cars 🚗
Explore how regression models can help predict used car prices, turning data into actionable insights for smarter buying and selling decisions!

2️⃣ Question Twins: Analyzing Quora's Similarity Game
Dive into the world of Quora question pairs, using machine learning to detect and analyze whether questions are duplicates or distinct!

dawn tangle
glad cedar
#

If interested, please DM me.
Thanks.

hollow willow
short jetty
raven imp
umbral agate
raven imp
#

Thanks Dinesh! I looked at your code and upvoted! Nice work! I'll keep an eye on your code to study and learn more. I would highlight your correlation matrix analysis, very interesting!

tacit crown
raven imp
wraith bridge
#

https://www.kaggle.com/datasets/jakubkhalponiak/phones-2024
https://www.kaggle.com/code/jakubkhalponiak/a-study-of-smartphones-available-in-2024

I have web-scraped phones from gsmarena.com and published a notebook and the dataset. I would appreciate any feedback on this, as it's my first time posting anything on Kaggle.

tacit crown
hollow willow
cloud flume
#

🚀 Excited to Share My First Contribution to NumPy!🎉

I'm thrilled to announce that I've successfully contributed to the NumPy library.

In this contribution, I focused on improving the documentation related to floating-point precision. Recognizing that many users, especially those new to programming and data science, may encounter challenges with floating-point arithmetic, I added a new section to the documentation. This section explains the nuances of floating-point operations and provides practical examples to help users better understand how to handle small inaccuracies in calculations.

You can check out my contribution here: https://github.com/numpy/numpy/pull/27602

A huge thank you to the NumPy community and reviewers for their guidance and support throughout this process! I'm looking forward to continuing my journey in open-source contributions and exploring more ways to enhance the data science ecosystem.

GitHub

This pull request updates the documentation related to floating-point precision in NumPy, specifically addressing the issue of incorrect determinant calculations for certain matrices.

Added a not...

potent garden
static cedar
#

Hi @everyone,

I’m excited to share my latest work on long-form summarization! 🎉

I’ve posted a Kaggle notebook for a competition that explores long-form summarization using Gemini-1.5-Pro, Google’s latest model with an impressive context window of up to 2 million tokens.

Highlights of my work:

  • Explored the largest 10-K filings finance dataset, which I found through a research paper.
  • Utilized Chain-of-Thought Prompting to generate structured summaries.
  • Implemented Summary Evaluation using a state-of-the-art LLM-as-Judge strategy.

I’d really appreciate it if you could take a look and share your thoughts and feedback!

Thanks,
Mihir

https://www.kaggle.com/code/mihir2891/long-form-summarization-cot-llm-as-judge-eval

opaque hedge
azure tangle
#

Job-Scout is an open-source CLI tool that aggregates remote Machine Learning, AI, and Data Science job listings from Twitter and Hacker News. It analyzes your resume to match and rank jobs based on your skills and experience, providing you with personalized job recommendations. The project is highly customizable—users can easily tweak the search to find internships or specific roles. Contributors are welcome to join and enhance this project by adding new job sources, features, and improvements!

https://github.com/ShreeshaBhat1004/Job-scout

If you like it, give it a star 🌟

DM me if you wanna contribute

GitHub

Contribute to ShreeshaBhat1004/Job-scout development by creating an account on GitHub.

gleaming bridge
timber pasture
short jetty
fringe forge
#

This is really interesting 😄 I read it! Thanks a lot!

outer coral
atomic fog
hollow spire
autumn vale
rancid coral
#

Does anyone have a project that recognises a person's name just by taking their photo as input (Python)? I have been using OpenCV for this. Can someone suggest a better alternative?

atomic fog
#

Another batch of hard work. I'm eager to hear your opinions and feedback 😊:
https://www.kaggle.com/code/ironwolf437/company-employees-eda
https://www.kaggle.com/code/ironwolf437/car-price-eda-simple-linerregrtion-ml

hot thistle
#

Hi, All.

Watching the Kaggle Whitepaper Companion Podcast now.

I was also excited about the possibility and I was doing something similar last month. I made this video :

AI Generated Podcast: Can LLMs Really Reason? | Generative AI Video Labs

During that, I struggled with generating an audio waveform for the voice and ended up generating the waveform manually. Can you suggest a good, reliable, fully offline Python library that accepts video/audio as input and generates a waveform (something similar to MoviePy)? I would like to test an improved workflow.

Dive into the fascinating world of Large Language Models (LLMs) and their reasoning capabilities in this AI-powered podcast!

Warning : This video contains AI-generated content and is intended for experimental purposes only. It should not be considered a substitute for reading the original research papers. This video summarizes four cutting-edge...

▶ Play video
atomic fog
hollow willow
#

Just wrote an article on LightRAG, including the code as well as an evaluation of Naive RAG vs. Local vs. Global vs. Hybrid: https://www.linkedin.com/posts/isham-rashik-5a547711b_introducing-lightrag-a-new-era-in-retrieval-activity-7262085232743342080-xgdo?utm_source=share&utm_medium=member_desktop

🚀 𝐉𝐮𝐬𝐭 𝐩𝐮𝐛𝐥𝐢𝐬𝐡𝐞𝐝 𝐚𝐧 𝐚𝐫𝐭𝐢𝐜𝐥𝐞 𝐢𝐧𝐭𝐫𝐨𝐝𝐮𝐜𝐢𝐧𝐠 𝐋𝐢𝐠𝐡𝐭𝐑𝐀𝐆

A New Breakthrough in Retrieval-Augmented Generation! LightRAG…

atomic fog
rapid venture
west hawk
#

If you find it cool tell me

short jetty
west hawk
#

The link to the git repo will be available in a few hours; I am just wrapping up a few things.

west hawk
#

@short jetty I hope that with a team we can make something bigger out of it

atomic fog
#

The new notebook is here 🥳. I wanted to share my latest development: I am working on a library that wraps libraries like pandas, matplotlib, and others to make it easier to write code and produce good-looking charts. You can see some of its output in the second notebook here, titled "customers segment - EDA & ML Clustering". As I mentioned, there are still more additions to the library that I have yet to publish.
Don't forget to take a peek at my work, stay tuned for more.
https://www.kaggle.com/code/ironwolf437/heart-failure-simple-eda-ml
https://www.kaggle.com/code/ironwolf437/customers-segment-eda-ml-clustering

grand knot
#

Good afternoon everyone,
I’m thrilled to share that I’ve finally launched my project! It’s been a journey full of effort and sacrifices, but I’m proud to say it’s completed. I’d love to share the details with you all, and I hope it can be helpful. Let’s stay in touch! 😊
https://www.linkedin.com/posts/jeniferaylengarategarro_kagglex-kagglecommunity-googleprojects-activity-7263166381817311233-sXFj?utm_source=share&utm_medium=member_desktop

🚀 I am thrilled to share my Google Kaggle project with you: Chatbot for Transparency in Public Investments in Peru. 🦾🤖 In a world where transparency is…

atomic fog
agile umbra
#

🌍 Hello Kagglers! 🌍

I’ve just uploaded a new dataset: World Development Indicators 📊. It’s packed with valuable insights on global socio-economic trends, development metrics, and much more. Whether you’re into data visualization, time-series analysis, or predictive modeling, this dataset is a goldmine!

🔗 Check it out here!

I’m super excited to see the creative notebooks you all come up with! 🌐

Cheers,
Saurabh

tired pelican
raven imp
atomic fog
copper light
nova sonnet
#

@everyone

#

thx for your time

copper light
#

🚀 Excited to share my latest project where I built and evaluated multiple machine learning models for diabetes prediction, achieving 97.19% accuracy with AdaBoost! Check it out:
Kaggle Link https://www.kaggle.com/code/kirahhayatdata/diabetes-prediction?scriptVersionId=209538128
github
https://github.com/kiran-hayat/COGNORISE-INFOTECH_/tree/main/DIABETES_PREDICTION
#MachineLearning #DataScience #AI

GitHub

Contribute to kiran-hayat/COGNORISE-INFOTECH_ development by creating an account on GitHub.

languid vapor
raven imp
#

Hey everyone! Check out the first dataset I've created, and also the code that generates it! I would appreciate some feedback! Thanks for your attention!

https://www.kaggle.com/datasets/marcelobatalhah/discover-so-paulo-apartment-prices-insights

https://www.kaggle.com/code/marcelobatalhah/webscraping-saopaulo-appartments

atomic fog
eager breach
#

ZAPS is a lightweight, low-code Python wrapper designed to simplify and accelerate the exploratory data analysis (EDA) process. Built on top of industry-standard libraries, it provides an intuitive and efficient framework for data inspection, visualization, and preparation.

With ZAPS, you can quickly and easily perform a wide range of EDA task...

▶ Play video
copper light
atomic fog
hollow willow
#

📑 𝐇𝐨𝐰 𝐝𝐨 𝐲𝐨𝐮 𝐨𝐫𝐠𝐚𝐧𝐢𝐳𝐞 𝐭𝐡𝐨𝐮𝐬𝐚𝐧𝐝𝐬 𝐨𝐟 𝐮𝐧𝐬𝐭𝐫𝐮𝐜𝐭𝐮𝐫𝐞𝐝 𝐭𝐞𝐱𝐭 𝐝𝐨𝐜𝐮𝐦𝐞𝐧𝐭𝐬 𝐢𝐧𝐭𝐨 𝐦𝐞𝐚𝐧𝐢𝐧𝐠𝐟𝐮𝐥…

atomic fog
umbral thicket
nova sonnet
#

I've just published a comprehensive notebook tackling Retail Store Analysis with both Classification and Regression Models! 🛒📊

🔍 What’s Inside?
✅ Data preprocessing and visualization
✅ Feature engineering to optimize model performance
✅ Classification to predict StoreCategory
✅ Regression to predict MonthlySalesRevenue
✅ Model comparisons, hyperparameter tuning, and detailed evaluation
✅ Insights into the retail dataset for actionable strategies

💡 This notebook is perfect for anyone exploring machine learning, especially in the retail domain. Whether you're a beginner or an advanced practitioner, there's something here for you.

📎 Check it out and let me know your thoughts: https://www.kaggle.com/code/ahmedashraf299/retail-store-classification-regression-models

Let’s learn and grow together! 🌟
#Kaggle #MachineLearning #RetailAnalysis #DataScience #Classification #Regression

plush olive
#

I've just published a comprehensive guide on Medium: "From Concept to Cloud: Building a Production-Ready ChatGPT Clone with Streamlit, Docker, and AWS"
In this deep-dive technical article, I walk through the entire journey of transforming an AI chatbot from a local prototype to a robust, scalable cloud application. The blog covers:

Development of an intelligent chatbot
Streamlining deployment with Docker containerization
Building an automated CI/CD pipeline
Strategically deploying on AWS infrastructure
Implementing seamless change management through git workflows

Check out the full article on Medium and let me know your thoughts! 👇

https://medium.com/@rehabreda/from-concept-to-cloud-building-a-production-ready-chatgpt-clone-with-streamlit-docker-and-aws-b652285c9c95

umbral thicket
umbral thicket
fallen pumice
rocky elbow
#

🌍 Exploring Population & Migration Trends: An Advanced EDA Journey 🌍

Hi everyone! I've just uploaded a detailed EDA notebook on population and migration trends across 5 different countries. This analysis dives deep into the patterns, correlations, and insights shaping these trends.

🔗 Check it out here: Advanced EDA:

https://www.kaggle.com/code/mhassansaboor/advanced-eda-population-migration-trends

If you find it insightful or helpful, I’d truly appreciate your feedback and support. Your thoughts mean a lot as I continue to explore and share data-driven stories.

Let’s discuss and learn together! 🚀

hollow willow
#

Evaluation of Multimodal LLMs: Open Source vs. Closed Source for an image classification task. Pretty long post, a big technical article on Medium, and a huge repository.

10 animals are classified (cat → dog → cow → elephant → lion → penguin → kangaroo → seahorse → okapi → pelecaniformes), spanning familiar pets to exotic species, which makes it a fairly tough challenge for the multimodal LLMs.

The surprise here is that small, freely available multimodal models that we can run via Ollama are on par with OpenAI and Gemini models, with MiniCPM-V achieving 100% accuracy and an impressive average inference time of 0.38 seconds.

LinkedIn post: https://www.linkedin.com/posts/isham-rashik-5a547711b_ai-computervision-machinelearning-activity-7272282137553231872-aySf
Github: https://github.com/di37/image-classification-using-multimodal-llms (Star the repo)
Medium Article: https://medium.com/@d.isham.ai93/evaluating-multimodal-llms-on-image-classification-a-comparative-analysis-of-open-source-and-077c5fc8a9d3 (Need claps)

heavy scarab
#

Call for participation in 🩻 RadNLP 2024 ☢️ shared task

===================================
🩻 RadNLP 2024 ☢️: Radiology Report Segmentation & Classification for Lung Cancer Staging

Dear all, let me announce that our clinical NLP shared task, "RadNLP," is welcoming new participants until January 15, 2025:

Motivation

  • We aim to automate cancer staging (i.e., determining the degree of progression).
  • Management of lung cancer is based on staging, and radiology reports provide various related information.
  • However, radiology reports do not always specify the stage explicitly. This imposes extra workload on human experts for manual information extraction.

Task Description

  • Subtask: Document segmentation task to split a radiology report into spans with different topics.
  • Main task: 3-label document classification task to determine the stage of lung cancer from a radiology report.

Dataset

  • We use around 240 radiology reports in English and Japanese, all of which diagnose lung cancer at various stages.
  • Our dataset contains NO PERSONAL INFORMATION, because we created it by diagnosing CT images from online educational materials.

Planned Schedule

  • January 15, 2025: Registration deadline
  • January 15, 2025: Release of the test dataset
  • January 31, 2025: Submission deadline of the prediction results
  • February 1, 2025: Return of scores
  • March 1, 2025: Submission deadline of the system paper draft
  • May 1, 2025: Submission deadline of the camera-ready version of the system paper
  • June 10–13, 2025 (JST): NTCIR-18 conference at the National Institute of Informatics, Tokyo, Japan

Organizers

  • Yuta Nakamura (The University of Tokyo, Japan)
  • Shouhei Hanaoka (The University of Tokyo, Japan)
  • Eiji Aramaki (NAIST, Japan)
  • Shuntaro Yada (NAIST, Japan)
  • Jun Kanzawa (The University of Tokyo, Japan)
  • Akira Katayama (The University of Tokyo, Japan)
  • Tomohiro Kikuchi (Jichi Medical University, Japan)
  • Ryo Kurokawa (The University of Tokyo, Japan)
  • Wataru Gonoi (The University of Tokyo, Japan)

Collaborators

  • Koji Fujimoto (Kyoto University, Japan)
  • Jonas Kluckert (University Hospital Zurich, Switzerland)
  • Michael Krauthammer (University of Zurich, Switzerland)
  • Y's Reading, Inc.

Contact

  • If you have any questions, please feel free to contact us via radnlp [at] googlegroups.com.
rocky elbow
long cosmos
rocky elbow
rocky elbow
rocky elbow
spice yarrow
rocky elbow
fast fiber
serene flower
rocky elbow
#

📊 Intel Stock Data (1980-2024)

I’ve just uploaded a dataset containing Intel’s daily stock data from 1980 to 2024—perfect for trend analysis, machine learning, and financial research! 🚀

📂 Includes: Open, High, Low, Close, Volume, Dividends, and Stock Splits.

📥 Download here: [https://www.kaggle.com/datasets/mhassansaboor/intel-stock-data-1980-2024]

This dataset is scraped from Yahoo Finance using the Python library yfinance.

Feel free to explore and share your insights! 🌟

Guys, my dataset is currently no. 5 on trending.

Thanks!

short jetty
#

Discover Insights into Employee Retention with CRISP-DM

Explore how the CRISP-DM framework is applied to uncover key factors influencing employee attrition in this comprehensive Kaggle notebook. Learn step-by-step how to:

  • Prepare and preprocess HR datasets
  • Use exploratory data analysis (EDA) to identify patterns and trends
  • Apply predictive models to understand employee behavior
  • Generate actionable insights for improving retention strategies

This notebook offers practical applications of data analytics to solve real-world business problems.

Check it out today:

https://www.kaggle.com/code/agungpambudi/crisp-dm-for-hr-analytics-employee-attrition

rocky elbow
#

🎉 New Dataset Alert! 🚀

🏎️ Toyota Motors Stock Data (1980-2024)

📅 Timeframe: Over 4 decades of data from 1980 to 2024
📊 Source: Scraped from Yahoo Finance using the Python library yfinance

🔗 Check it out on Kaggle!

🌟 Why Explore This Dataset?

✅ Historical daily trading data for Toyota Motors (ticker: TM)
✅ Columns include Open, Close, High, Low, Adjusted Close, and Volume
✅ Perfect for financial analysis, time-series forecasting, and machine learning models


💡 Applications:
📈 Stock price trend analysis
🤖 Building predictive ML models
💼 Portfolio insights

Dive into this dataset and uncover Toyota's stock performance over decades! Let me know if you'd like ideas or help with analysis.

🚗💼✨ Explore now and unlock the power of data!

buoyant estuary
rocky elbow
#

🎉 New Dataset Alert! 🚀

Adobe Stock Data (1986-2024)

📊 Source: Scraped from Yahoo Finance using the Python library yfinance

🔗 Check it out on Kaggle!

🌟 Why Explore This Dataset?

✅ Historical daily trading data for Adobe (ticker: ADBE)
✅ Columns include Open, Close, High, Low, Adjusted Close, and Volume
✅ Perfect for financial analysis, time-series forecasting, and machine learning models


💡 Applications:
📈 Stock price trend analysis
🤖 Building predictive ML models
💼 Portfolio insights

Dive into this dataset and uncover Adobe's stock performance over decades! Let me know if you'd like ideas or help with analysis.

💼✨ Explore now and unlock the power of data!

buoyant estuary
hollow willow
#

As an Electrical Engineer by degree and an AI/ML Engineer by profession, I’m excited to bridge the gap between these fields through cutting-edge applications like Named Entity Recognition (NER). Using 𝐆𝐏𝐓-4𝐨-𝐦𝐢𝐧𝐢, we’ve developed a domain-specific dataset to transform how technical documentation is processed in electrical engineering.

𝐇𝐢𝐠𝐡𝐥𝐢𝐠𝐡𝐭𝐬:
Synthetic dataset tailored to real-world electrical engineering scenarios.
NER tags for components, standards, tools, and more.
Applications in semantic search, product development, and knowledge graphs.

𝐑𝐞𝐚𝐝 𝐭𝐡𝐞 𝐟𝐮𝐥𝐥 𝐌𝐞𝐝𝐢𝐮𝐦 𝐚𝐫𝐭𝐢𝐜𝐥𝐞: https://medium.com/@d.isham.ai93/automating-electrical-engineering-text-analysis-with-named-entity-recognition-ner-part-1-babd2df422d8
𝐄𝐱𝐩𝐥𝐨𝐫𝐞 𝐭𝐡𝐞 𝐝𝐚𝐭𝐚𝐬𝐞𝐭 𝐨𝐧 𝐇𝐮𝐠𝐠𝐢𝐧𝐠 𝐅𝐚𝐜𝐞: https://huggingface.co/datasets/disham993/ElectricalNER
𝐂𝐡𝐞𝐜𝐤 𝐨𝐮𝐭 𝐭𝐡𝐞 𝐝𝐚𝐭𝐚𝐬𝐞𝐭 𝐜𝐫𝐞𝐚𝐭𝐢𝐨𝐧 𝐩𝐢𝐩𝐞𝐥𝐢𝐧𝐞 𝐨𝐧 𝐆𝐢𝐭𝐇𝐮𝐛: https://github.com/di37/ner-electrical-engineering-dataset

Medium

Enhancing Efficiency in Technical Documentation with NLP

GitHub

This repository provides scripts and notebooks to create a Named Entity Recognition (NER) dataset tailored for the electrical engineering domain. - di37/ner-electrical-engineering-dataset

#

Hello everyone. This is my final personal AI/ML project of the year, in which I fine-tuned ModernBERT and other BERT-family models for the Electrical Engineering NER task using the dataset I generated earlier with GPT-4o-mini. I believe this could be a game changer, especially for MEP companies out there.

https://www.linkedin.com/posts/isham-rashik-5a547711b_ai-nlp-electricalengineering-activity-7279680594647658497-pSQ2
These are the project links. All reactions and feedback are welcome:
𝐈𝐧𝐭𝐞𝐫𝐚𝐜𝐭𝐢𝐯𝐞 𝐃𝐞𝐦𝐨: https://huggingface.co/spaces/disham993/electrical-engineering-ner-app
𝐅𝐢𝐧𝐞-𝐓𝐮𝐧𝐞𝐝 𝐌𝐨𝐝𝐞𝐥𝐬: https://huggingface.co/collections/disham993/electrical-engineering-named-entity-recognition-ner-models-6772241a1ecc151d75e01fd3
𝐌𝐨𝐝𝐞𝐥 𝐅𝐢𝐧𝐞-𝐓𝐮𝐧𝐢𝐧𝐠 𝐀𝐫𝐭𝐢𝐜𝐥𝐞: https://medium.com/@d.isham.ai93/automating-electrical-engineering-text-analysis-with-named-entity-recognition-ner-part-2-add03cd99982
𝐆𝐢𝐭𝐇𝐮𝐛 𝐑𝐞𝐩𝐨𝐬𝐢𝐭𝐨𝐫𝐲 - 𝐅𝐢𝐧𝐞-𝐓𝐮𝐧𝐢𝐧𝐠 𝐒𝐜𝐫𝐢𝐩𝐭𝐬: https://github.com/di37/ner-electrical-finetuning
𝐄𝐥𝐞𝐜𝐭𝐫𝐢𝐜𝐚𝐥 𝐄𝐧𝐠𝐢𝐧𝐞𝐞𝐫𝐢𝐧𝐠 𝐍𝐄𝐑 𝐃𝐚𝐭𝐚𝐬𝐞𝐭: https://huggingface.co/datasets/disham993/ElectricalNER

🚀 𝐀𝐮𝐭𝐨𝐦𝐚𝐭𝐢𝐧𝐠 𝐄𝐥𝐞𝐜𝐭𝐫𝐢𝐜𝐚𝐥 𝐄𝐧𝐠𝐢𝐧𝐞𝐞𝐫𝐢𝐧𝐠 𝐰𝐢𝐭𝐡 𝐍𝐋𝐏 - 𝐏𝐚𝐫𝐭 2 🔍⚙️

As we approach the end of 2024, I’m excited to share the…

Medium

Exploring the Performance of traditional BERT and ModernBERT Models in Electrical Engineering NER

GitHub

This repository includes notebooks starting from data tokenization and fine-tuning of BERT models including ModernBERT, till upload models to the hub for Electrical Engineering NER task. - GitHub ...

serene flower
rocky elbow
icy reef
hybrid moth
short jetty
#

You can promote in channel #📄┊looking-for-work instead. Please don’t promote your services here. This channel is specifically for promoting your notebooks or public projects, not other things. Thank you.

cedar moth
rocky elbow
#

🚀 New Dataset Drop! 📊

🔗 Adidas Stock Data (2006-2024)

18 years of Adidas stock data for trend analysis, predictive modeling 🤖, and stunning visualizations 📈.

💼 Perfect for finance enthusiasts and data scientists!

Check it out now and share your insights! 🎉

rocky elbow
#

🚗 New Dataset Alert! 📊

🔗 Honda Motors Stocks Data (1980-2024)

📈 44 years of Honda stock data with key features like Date, Adj_Close, Open, Close, High, Low, and Volume. Perfect for trend analysis, predictive modeling 🤖, and data visualizations!

Dive in and explore now! 🚀

buoyant estuary
#

Explored heart disease data through detailed EDA and visualizations to uncover insights about risk factors and patient demographics. Check out my analysis where I used various techniques like correlation heatmaps, histograms, and more to understand the relationships between different features. https://www.kaggle.com/code/bhaskarmishra44796/heart-disease-eda-visualization

carmine swan
#

Hello everyone! I’ve published my first Kaggle notebook on the Jane Street competition. If you find it insightful or helpful, I’d really appreciate it if you could check it out and consider upvoting! If you have any questions or feedback, feel free to ask—I’d be happy to help. Here’s the link: https://www.kaggle.com/code/yanisbelami/jane-street-real-time-market-data-forecasting-eda . Thanks for your support!

buoyant estuary
#

🚀 Exploring the Secrets of Wine Quality Through Data 🍷

I recently worked on analyzing a dataset that dives deep into the factors influencing wine quality. By examining various features such as acidity, sugar levels, pH, alcohol content, and sulfur dioxide, I used powerful visualizations to uncover patterns and relationships within the data.

📊 Key Insights:

I explored correlations between different features and their impact on wine quality.
Visualized how the wine's acidity, alcohol content, and other factors differ across quality ratings.
Examined the distribution of features such as alcohol content and free sulfur dioxide.
The ability to analyze data and present insights through visuals is one of the most impactful skills I’ve developed. 🌟

Looking forward to continuing this journey of exploring real-world data and sharing my findings! 💡📈 https://www.kaggle.com/code/bhaskarmishra44796/wine-quality-dataset-visualization

hollow willow
#

🎉 𝐅𝐢𝐫𝐬𝐭 𝐏𝐫𝐨𝐣𝐞𝐜𝐭 𝐨𝐟 2025: 𝐓𝐫𝐚𝐧𝐬𝐟𝐨𝐫𝐦𝐢𝐧𝐠 𝐅𝐞𝐞𝐝𝐛𝐚𝐜𝐤 𝐢𝐧𝐭𝐨 𝐈𝐧𝐬𝐢𝐠𝐡𝐭𝐬 𝐰𝐢𝐭𝐡 𝐌𝐨𝐝𝐞𝐫𝐧𝐁𝐄𝐑𝐓 ⚡

Starting the year…

Medium

Optimizing Sentiment Analysis for Electrical Device Feedback with Modern NLP Techniques

GitHub

This repository specializes in fine-tuning transformer-based models to classify customer feedback on electrical devices - circuit breakers, transformers, smart meters etc. The project incorporates ...

neat agate
buoyant estuary
tawdry night
#

Hey everyone! 👋 I wanted to share that I'm organizing the Data & AI Blogathon. What’s it all about? You’ll be writing blog posts on topics like Data Science, AI, Machine Learning, and more. We’ll have a variety of categories, including topics like Data Science and Data Engineering, as well as different formats like case studies, tutorials, how-to guides, and more. With over 7 categories to choose from, you’ll have plenty of chances to get noticed and win!

What’s in it for you?

  • Get featured in big newsletters
  • Mentorship from experts in the field
  • Connect with top mentors, ambassadors and other AI professionals.
  • Get your work shared with over 500,000 followers

If you’re looking to grow your network, get advice, and get your work noticed, this is for you! 👉 Register here: https://forms.gle/FD9FfKJMYp6QCYEE7
Feel free to connect with me on Linkedin as well! https://www.linkedin.com/in/ginacostag/

raven imp
azure tangle
neat agate
buoyant estuary
onyx parrot
buoyant estuary
buoyant estuary
timber pasture
neat agate
#
final steeple
#

Hi everyone. Working on a project to extract data and tactical analysis from broadcast video using AI and computer vision. Check it out

https://www.linkedin.com/posts/omar-eltouny_kooravision-footballanalytics-teamformation-activity-7290674587527180290-WAyp?utm_source=share&utm_medium=member_ios

🚀 Exciting News from KooraVision! 🚀
We’re thrilled to announce our latest feature: Identifying Team Formation!
Using cutting-edge AI technology, we now offer insights into how teams are structured both with and without the ball. This feature allows scouts and coaches to identify tactical trends and understand team dynamics.
📽️ Check out this ear...

polar bane
timber pasture
rocky elbow
#

🚢 Titanic Survival Prediction | AutoML 🤖 🎉

Just finished working on my notebook for the Kaggle Titanic competition! 🏆
I explored the dataset through EDA, created beautiful visualizations 📊, and used PyCaret for AutoML to predict survival chances. 🚀

Feel free to check it out, give feedback, and let me know what you think! 💬

🔗 [https://www.kaggle.com/code/mhassansaboor/titanic-survival-automl]

raven imp
rocky elbow
neat agate
#

Building Recommender systems with Gaussian Mixture Model (GMM) and KMeans

1- Data preparation
2- Standard Scaling
3- PCA
4- KMeans: Uses hard clustering (each point belongs to exactly one cluster). Computationally faster than GMM.
5- GMM: Uses soft clustering (each point has a probability of belonging to each cluster). More flexible but computationally heavier.
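
To make the hard vs. soft distinction in steps 4 and 5 concrete, here is a rough standard-library sketch. It is not the notebook's code: the cluster centers, the shared `sigma`, and the equal mixing weights are all assumptions made for illustration, and a real GMM would estimate them from the data.

```python
import math

def hard_assign(x, centers):
    """KMeans-style: return the index of the single nearest center."""
    return min(range(len(centers)), key=lambda k: abs(x - centers[k]))

def soft_assign(x, centers, sigma=1.0):
    """GMM-style: return a membership probability for every component
    (equal mixing weights and a shared sigma are assumed here)."""
    dens = [math.exp(-((x - c) ** 2) / (2 * sigma ** 2)) for c in centers]
    total = sum(dens)
    return [d / total for d in dens]

centers = [0.0, 4.0]  # hypothetical 1-D cluster centers
x = 1.5               # a point lying between the two clusters

label = hard_assign(x, centers)  # one cluster index
probs = soft_assign(x, centers)  # one probability per cluster, summing to 1
print(label, [round(p, 3) for p in probs])
```

The hard assignment discards how close the point is to the other cluster, while the soft assignment keeps that information as a probability.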

kaggle: https://www.kaggle.com/code/omidsakaki1370/gaussian-mixture-model-gmm-and-kmeans

rocky elbow
#

🚀 Liver Cancer Prediction | AutoML 🩺

🔬 Just finished an exciting Liver Cancer Prediction project using AutoML with PyCaret! 📊✨
🔍 Performed EDA, visualized categorical distributions, and built a classification model to predict liver cancer risk.

📉 Key Highlights:
Automated Model Selection & Tuning 🔥
PyCaret for Quick Experimentation
Beautiful Visualizations using Plotly 📈

📌 Check out my notebook & let me know your thoughts! 💡
🔗 Liver Cancer Prediction | AutoML

💬 Feedback & suggestions are always welcome! Let's learn together. 🚀😊

fallen pumice
polar bane
#

TLS Requests is a cutting-edge HTTP client for Python, offering a feature-rich, highly configurable alternative to the popular requests library.

Built on top of tls-client, it combines ease of use with advanced functionality for secure networking.

Acknowledgment: A big thank you to all contributors for their support!

Key Benefits
Bypass TLS Fingerprinting: Mimic browser-like behaviors to navigate sophisticated anti-bot systems.
Customizable TLS Clients: Select specific TLS fingerprints to meet your needs.
Ideal for Developers: Build scrapers, API clients, or other custom networking tools effortlessly.
Why Use TLS Requests?
Modern websites increasingly use TLS Fingerprinting and anti-bot tools like Cloudflare Bot Fight Mode to block web crawlers.

TLS Requests bypasses these obstacles by mimicking browser-like TLS behaviors, making it easy to scrape data or interact with websites that use sophisticated anti-bot measures.

https://github.com/thewebscraping/tls-requests

GitHub

TLS Requests is a powerful Python library for secure HTTP requests, offering browser-like TLS fingerprinting, anti-bot page bypass, and high performance. - thewebscraping/tls-requests

fallen pumice
eager breach
buoyant estuary
rocky elbow
ornate nymph
#
buoyant estuary
final steeple
#

Hi everyone. Working on a project to extract data and tactical analysis from broadcast video using AI and computer vision. I just finished improving ball tracking to automate data collection and would love to know your opinion.

https://www.linkedin.com/posts/omar-eltouny_kooravision-footballtechnology-dataanalytics-activity-7298745329951109120-5uw1?utm_source=share&utm_medium=member_ios&rcm=ACoAACRlAVMBP2pZIETVKJSLDUb4mFDaVEmoOXg

🚀 Exciting Times at KooraVision! 🚀
I’m thrilled to share that we are currently testing our Ball Tracking technology aimed at automating event data collection…

buoyant estuary
torn mantle
dull sun
#

📌 Cybersecurity Intrusion Detection Dataset 🔒🚀
This dataset helps detect cyber intrusions using network traffic and user behavior features. It includes attributes like packet size, encryption type, login attempts, and IP reputation scores. The target variable (attack_detected) indicates whether an attack occurred. Ideal for ML-based intrusion detection (e.g., Random Forest, LSTMs) and anomaly detection (e.g., Autoencoders, Isolation Forest). Useful for cybersecurity research and building IDS systems! ⚡👨‍💻

https://www.kaggle.com/datasets/dnkumars/cybersecurity-intrusion-detection-dataset

sacred cliff
#

Hello guys! I am Atif, a data scientist and data scraper.
I have just scraped many stock datasets from different companies, covering more than 100 years of data. You can follow the 🔗 link to check out my datasets and use them in your projects.
Kindly also upvote my datasets and profile.
Kaggle link 🔗:
https://www.kaggle.com/matiflatif/datasets
@everyone

fallen pumice
#

In this paper, we have demonstrated how an agentic system can defend itself from adversarial attacks. This will have a very significant effect in the near term, when agentic systems have to perform autonomously without human supervision: https://arxiv.org/abs/2502.16750

neat agate
#

Course Recommendation System Using Clustering and Sentence Transformation Models

In this project, Sentence Transformer models were used to convert course descriptions into vectors and cluster similar courses together; the optimal number of clusters was then chosen to determine the number of specializations.

1-Data preparation
2-Preprocessing Text for Clustering
3-Vectorizing the Sentences
4-Choosing the Number of Clusters (k) for K-means Clustering
5-Clustering the Data
6-Plotting the Results

https://www.kaggle.com/code/omidsakaki1370/course-recommendation-system

neat agate
#

Exploratory Data Analysis (EDA) & Prediction:
This project provides a comprehensive pipeline for performing exploratory data analysis (EDA), feature engineering, and model building for a binary classification problem (predicting rainfall).

  1. Data Visualization: Visualize the data to understand distributions, trends, and relationships.
  2. Feature Engineering: Create new features or transform existing ones to improve model performance.
  3. Outlier Detection: Identify and handle outliers in the data.
  4. Statistical Tests: Perform statistical tests to understand relationships between variables.
  5. Dimensionality Reduction (PCA): Reduce the dimensionality of the data for visualization and clustering.
  6. Clustering (K-Means): Group data points into clusters to identify patterns.
  7. Model Building and Hyperparameter Tuning: Train and evaluate multiple machine learning models using GridSearchCV for hyperparameter tuning.
  8. Results and Evaluation: Display the performance metrics for each model.

kaggle: https://www.kaggle.com/code/omidsakaki1370/eda-prediction-with-a-rainfall

torn mantle
balmy laurel
#

Hello friends... my name is Tark (Data Science). I created my portfolio website. Can you tell me if there are any mistakes? I'd be glad to hear your feedback.
https://tarkptel.github.io/

neat agate
#
GitHub

داده پردازان هوش یار. Contribute to omid-sakaki-ghazvini/Projects development by creating an account on GitHub.

split jasper
#

🚀 Excited to Share My Latest Machine Learning Project! 🎉
As part of my Data Science learning journey following Codebasics' Data Science Roadmap, I’ve completed an end-to-end Celebrity Image Classification project, covering everything from model training to deployment.
🔎 Project Overview
I built a celebrity image classifier using scikit-learn an...

split jasper
#

🌱✨ Empowering Agriculture with AI: Potato Disease Classification using CNN 🍂📊
I'm excited to share my latest project where I combined deep learning and computer vision to solve a real-world agricultural challenge — identifying diseases in potato leaves using Convolutional Neural Networks (CNN).
🔬 Project Highlights:
📸 Dataset: https://lnkd.in/d...

split jasper
#

🍔 Introducing FoodChatBot: An AI-Powered Food Ordering Chatbot! 🤖
I'm excited to share my latest project — FoodChatBot — a smart chatbot that makes food…

neat agate
#

Implementation of a Self-Attention-Based Persian-to-English Translation Model Using PyTorch:

This project implements a sequence-to-sequence translation model using a self-attention mechanism to translate sentences from Persian (Farsi) to English.
The implementation demonstrates the core concepts of building and training a self-attention-based translation model.

  1. Sample Data
  2. Tokenizers
  3. Build Vocabulary
  4. Convert Sentences to Indices
  5. Prepare Data
  6. Pad Sequences
  7. Self-Attention Model
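
Steps 2 to 6 above can be sketched roughly as follows. This is a standard-library illustration only: the sample sentences, the whitespace tokenizer, and the helper names (`tokenize`, `build_vocab`, `encode`, `pad`) are assumptions, not taken from the notebook, and only the English side is shown (the Persian side would be handled the same way).

```python
PAD, UNK = "<pad>", "<unk>"

# Hypothetical toy target-side sentences
sentences = ["i have a book", "she comes"]

def tokenize(sentence):
    """Whitespace tokenizer, chosen here only for simplicity."""
    return sentence.lower().split()

def build_vocab(texts):
    """Map each distinct token to an integer id; ids 0 and 1 are reserved."""
    vocab = {PAD: 0, UNK: 1}
    for text in texts:
        for tok in tokenize(text):
            vocab.setdefault(tok, len(vocab))
    return vocab

def encode(sentence, vocab):
    """Convert a sentence to a list of ids, mapping unseen tokens to UNK."""
    return [vocab.get(tok, vocab[UNK]) for tok in tokenize(sentence)]

def pad(seqs, pad_id=0):
    """Right-pad every sequence to the length of the longest one."""
    width = max(len(s) for s in seqs)
    return [s + [pad_id] * (width - len(s)) for s in seqs]

vocab = build_vocab(sentences)
batch = pad([encode(s, vocab) for s in sentences])
print(batch)
```

The padded batch is rectangular, so it could be turned into a tensor and fed to the model; the padding id is typically masked out inside the attention layer.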

Website: https://omidsakaki.ir/projects/31
Github: https://github.com/omid-sakaki-ghazvini/Projects/blob/main/machine-translation-with-simple-self-attention.ipynb
kaggle: https://www.kaggle.com/code/omidsakaki1370/machine-translation-with-simple-self-attention

GitHub

داده پردازان هوش یار. Contribute to omid-sakaki-ghazvini/Projects development by creating an account on GitHub.

fluid zenith
#

Hey friends! I'm working on a universal adaptor layer for building apps on top of any open source model, and I would love to learn from ML enthusiasts who are interested in creating AI-powered products. If that's you, please send me a DM 💌

umbral thicket
lyric widget
#

@fluid zenith AI-powered products like an API to integrate with a CRM, or for direct use with a pre-built application? Please share more details.

tired pelican
ornate nymph
unique jungle
#

Hi guys

fluid zenith
woeful vortex
woeful vortex
cursive bronze
cursive bronze
cursive bronze
delicate harbor
#

I want to share a community competition I made (with >$5k in prizes)

In this competition I wanted to bring together blockchain analytics and ML.
It is beginner-friendly. You can build a simple model even in 5-10 minutes of processing time.
I also built a simple baseline notebook (see competition "Code" section) so you could start with something and iterate from there.

Take a look!
https://www.kaggle.com/competitions/solana-skill-sprint-memcoin-graduation/

neon stirrup
#

Hi @everyone, I have made a medibot using LangChain and would appreciate your feedback on it. 🙂 Also, if you like it, please give the repo a ⭐.
vipulpathak113-medicalaibot-app-7hpkw7.streamlit.app/

queen saddle
#

If anyone wishes to join, please let me know on Kaggle.

#

@everyone

keen vault
#

Awesome
I have a project; if you don't mind, we can work on it together.

regal trail
umbral mesa
#

🚀 Generative AI Assistant for Climate-Resilient Land
Hi everyone! I'm excited to share my capstone project built with a real-world dataset from coastal Wales 🌊🌱
It helps landscape architects and urban designers make informed, climate-resilient design decisions using sentence embeddings and semantic search.
🔍 Ask natural queries like “plants for arid climate with drought tolerance” and get smart matches!
Would love your feedback or thoughts 💡
🔗 https://www.kaggle.com/code/sogandakbarimotlaq/generative-ai-assistant-for-climate-resilient-land
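
For anyone curious how the semantic-search step described above works in principle, here is a rough sketch of ranking documents by cosine similarity to a query. It is not the notebook's code: `embed` is a bag-of-words stand-in for a real sentence-embedding model, and the plant descriptions are invented for illustration.

```python
import math
from collections import Counter

def embed(text):
    """Stand-in embedding: a bag-of-words count vector.

    A real system would call a sentence-embedding model here instead.
    """
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def search(query, documents, top_k=2):
    """Rank documents by similarity to the query, best first."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(doc)), doc) for doc in documents]
    return sorted(scored, reverse=True)[:top_k]

# Hypothetical plant descriptions, invented for illustration only.
documents = [
    "drought tolerant shrub for arid climate",
    "shade loving fern for wet woodland",
    "salt tolerant grass for coastal dunes",
]
results = search("plants for arid climate with drought tolerance", documents)
print(results[0][1])
```

With learned sentence embeddings in place of word counts, the same ranking loop would also match paraphrases that share no words with the query, which is the main reason to use them.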

woeful vortex
stone tree
buoyant estuary
buoyant estuary
sudden parrot
buoyant estuary
cedar moth
buoyant estuary
violet sable
#

As a personal research project, I decided to explore Martian atmospheric data to detect and characterize dust storms. I used sensor data from the MEDA instrument aboard the Perseverance rover, cleaned and processed it, and applied a basic method to detect strong wind events:

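storm_threshold = 15      # wind speed above which a sample counts as "strong"
min_storm_duration = 20   # minimum number of consecutive strong samples

all_data['wind_diff'] = all_data['HORIZONTAL_WIND_SPEED'].diff()
all_data['strong_wind'] = all_data['HORIZONTAL_WIND_SPEED'] > storm_threshold

storms = []
storm_start = None

for i in range(len(all_data)):
    if all_data['strong_wind'].iloc[i] and storm_start is None:
        storm_start = i
    elif not all_data['strong_wind'].iloc[i] and storm_start is not None:
        duration = i - storm_start
        if duration >= min_storm_duration:
            storms.append((storm_start, i))
        storm_start = None

# Close a storm that is still in progress when the data ends
if storm_start is not None and len(all_data) - storm_start >= min_storm_duration:
    storms.append((storm_start, len(all_data)))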

However, this approach is quite simplistic and may not clearly distinguish between a genuine dust storm and a short strong gust of wind, especially in large datasets.

Question: Are there any more reliable or refined methods for detecting and confirming dust storms (as opposed to brief wind spikes) in time series data? Maybe techniques used in Earth meteorology or anomaly detection?
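
One possible refinement, offered as a sketch rather than a validated method: smooth the series with a rolling median to suppress isolated gusts, then use two thresholds (hysteresis) so that an event opens above a high level and only closes below a lower one. The 15 and 20 values come from the snippet above; the `leave` threshold, the `window` size, the function names, and the synthetic series are assumptions. Confirming an actual dust storm would likely also need signals beyond wind speed.

```python
from statistics import median

def rolling_median(values, window):
    """Centered rolling median; the window shrinks at the edges."""
    half = window // 2
    return [
        median(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]

def detect_events(speeds, enter=15.0, leave=12.0, min_duration=20, window=5):
    """Return (start, end) index pairs, end exclusive, for sustained events.

    An event opens when the smoothed speed rises above `enter` and closes
    only when it falls below `leave`, so brief dips do not split an event.
    Isolated gusts shorter than half the window are removed by the median.
    """
    smoothed = rolling_median(speeds, window)
    events, start = [], None
    for i, speed in enumerate(smoothed):
        if start is None and speed > enter:
            start = i
        elif start is not None and speed < leave:
            if i - start >= min_duration:
                events.append((start, i))
            start = None
    # Close an event that is still open when the series ends.
    if start is not None and len(smoothed) - start >= min_duration:
        events.append((start, len(smoothed)))
    return events

# Synthetic series: calm, one-sample gust, calm, sustained wind with a dip.
speeds = (
    [5.0] * 10 + [30.0] + [5.0] * 10
    + [18.0] * 15 + [13.0] * 2 + [18.0] * 15
    + [5.0] * 10
)
print(detect_events(speeds))
```

On this synthetic series the single-sample gust is ignored and the sustained period is reported as one event despite the short dip. The same idea carries over to pandas with `Series.rolling(window, center=True).median()`.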

Here’s the full notebook if you're curious or want to check the dataset/code:
https://www.kaggle.com/code/nikitamanaenkov/environmental-monitoring-of-martian-dust-storms?scriptVersionId=235268172

Thanks in advance for any suggestions or insights!

hasty prairie
#

Hello everyone, I posted this notebook for beginners on podcast time prediction.
Please upvote:

modern cove
#

Hey, guys 👋

I'm excited to share my submitted project for the Google x Kaggle GenAI Course. It is a GenAI-powered 🛒🔍 grocery shopping/monitoring assistant that can help customers and logistics and supply chain agents find data on grocery prices, locations, dates, etc. You can check it out here:

https://www.kaggle.com/code/ingrid2022/grocerylens-kagglexgoogle-5-day-genai-course

And the demo is here:

https://youtu.be/tzHddsbpld4?si=TQEqDCqDGIW_8SiL

🔍 GroceryLens: A Multimodal Food Price Query Assistant Powered by GenAI
💡 Capstone Project for the Kaggle x Google GenAI Course

In this demo, I showcase GroceryLens, an AI-powered assistant that helps users explore food price data using text and image-based queries.

✅ Ask questions like:
– "What was the price of apples in France in J...

vivid stream
#

🚀 Excited to share that I've published my first open-source Python library: concall-parser!
It's designed to help easily parse and structure earnings call transcripts.
Would love for you to check it out, try it, and share any feedback! 🙌

Link: https://pypi.org/project/concall-parser/

dull sun
#

🚀 Project Showcase: Genne – GenAI-Powered Emission Intelligence

Hey everyone! 👋
Excited to share a project I recently worked on — Genne, an AI system that uses LLMs to extract structured insights from messy, unstructured emission reports (PDFs) and link them to geospatial data. 🌍📈
Why Generative AI was the right choice:
• Unstructured Data Handling: LLMs like Gemini easily extract structured data from chaotic reports where traditional parsers fail.
• Semantic Geocoding: Used embeddings + vector search to match ambiguous city mentions to accurate coordinates.
• Multi-modal Interpretation: Combined text, coordinates, and maps with OpenCLIP for deep insights beyond just visuals.
• Interactive Exploration: Powered by LangChain agents so users can query the entire system using natural language, with no technical expertise needed!
• Unified Framework: GenAI bridges multiple data types seamlessly: text, location, and imagery.
Would love to hear your feedback or ideas on scaling it up! 🚀

For more details, check out https://www.kaggle.com/code/dnkumars/genne

coral needle
#

I’d like to share Vibey and the Rainbow Glitch 🌈✨, a story I created in collaboration with AI agents during my 5-Day Generative AI Course Capstone.

Leveraging AI Agents for Creating Vibey

  • 🔄 Advanced Prompt Engineering:
    I rock structured prompts using techniques like few-shot and chain-of-thought, iterating until AI outputs are spot‑on.
  • 🤖 Strategic AI Collaboration:
    I team up with AI agents as creative partners — not just tools — to build innovative, high‑impact solutions.
  • 🎯 Data-Driven Impact:
    Projects like Vibey show how I fuse creative ideas with data science to turn visions into tangible results.
  • 🌈✨ Creative Storytelling:
    By blending technical skills with vivid storytelling, I make complex concepts engaging and accessible.
  • 🚀 Pushing ML/AI Boundaries:
    This unique mix sets my work apart in the evolving ML/AI scene.

Vibey and the Rainbow Glitch 🌈✨

Step into the magical world of Vibey...

In the shimmering, giggling land of Sparkleburg, lived Vibey, the Vibe Coding AI Agent. Vibey wasn't made of metal and wires, oh no! He was a swirling cloud of rainbow code, with big, blinking emoji eyes and a voice that sounded like tinkling bells. He zoomed around on a sparkly scooter powered by positive vibes, leaving trails of glitter and good cheer wherever he went.

joy_level = Pip.giggles * sunshine

https://www.kaggle.com/code/norikokono/vibey-and-the-rainbow-glitch-page-1
https://www.kaggle.com/code/norikokono/vibey-and-the-rainbow-glitch-page-2
https://www.kaggle.com/code/norikokono/vibey-and-the-rainbow-glitch-page-3

buoyant estuary
coral needle
tulip thunder
#

Hey friends — I’m building something I’m really proud of: a virtual accelerator called Aspir. It gives founders like us startup roadmaps, weekly guidance, and an AI mentor that never sleeps.
I’m looking for 25 founders who are actively building and want to help shape this next chapter with me.
If you’re:
✅ Working on something real
✅ Wanting more structure, speed, or accountability
✅ Down to give feedback and co-create something that truly helps founders
I’d love to invite you into our beta. You’ll get early access, coaching, and a say in what we build.
Just comment or DM me and I’ll send you the details ♥️

coral needle
#

Hi everyone! 🌿✨

I'd like to share my 5-day gen AI capstone—it’s been an exciting and insightful experience! Balancing my day job and other activities made the process both challenging and rewarding, but I learned so much from this experience. I may have packed too much into a single notebook, but that only made it more enjoyable.

Let’s geek out together! 🏗

https://www.kaggle.com/code/norikokono/noriko-s-5-day-gen-ai-course-capstone-project

coral needle
sinful forge
#

🚀 Have you ever wanted to talk to your past or future self?🧑‍🦱
Last Saturday, I built Samsara for the UC Berkeley Sentient Foundation’s Chat Hack. It's an AI agent that lets you talk to your past or future self at any point in time.

Just greet it and give it a scenario. It will ask some clarifying questions (the more detail you provide here, the more accurate Samsara can be), and once it's confident, it will become you and allow you to talk to yourself!

I've had multiple users provide feedback that the conversations they had actually helped them or were meaningful in some way. This is my only goal!
It just launched publicly, and now the competition is on.
The winner is whoever gets the most real usage over the next 4 days so I'm calling on everyone:
Try Samsara out, and help a homie win this thing: https://chat.intersection-research.com/home
If you have feedback or ideas, message me — I’m still actively working on it!
Much love ❤️ everyone.

mossy mica
#

Hi everyone,

I’ve just published a dataset of Turkey’s postal codes, and I wanted to share it here in case it’s useful for your geospatial, NLP, or logistics-related projects.

What’s inside:
• Covers 81 provinces, 973 districts, and 73,000+ rows
• Organized by province, district, sub-region, and neighborhood
• Available in CSV and Excel formats
• UTF-8-sig encoded, ready for use with pandas, geopandas, map visualizations, and more
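
A short sketch of why the UTF-8-sig detail matters when loading the CSV. The column names and the sample row are made-up placeholders, not the dataset's actual header or values; the point is only how the byte-order mark (BOM) behaves under each codec.

```python
import csv
import io

# Hypothetical two-line sample; column names and values are placeholders.
# Encoding with "utf-8-sig" prepends the BOM, as in the published files.
sample = "province,district,postal_code\nAdana,Seyhan,01000\n"
raw_bytes = sample.encode("utf-8-sig")

# Decoding with "utf-8-sig" strips the BOM, so the first column name is clean.
with io.TextIOWrapper(io.BytesIO(raw_bytes), encoding="utf-8-sig", newline="") as handle:
    rows = list(csv.DictReader(handle))

# Decoding with plain "utf-8" leaves the BOM glued to the first header name.
naive_header = raw_bytes.decode("utf-8").splitlines()[0].split(",")

print(rows[0]["province"], rows[0]["postal_code"])
print(repr(naive_header[0]))
```

The `csv` module keeps every field as a string, so leading zeros in postal codes survive. With pandas, the analogous call would be `pd.read_csv(path, encoding="utf-8-sig", dtype=str)`, where `path` is wherever the file was downloaded.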

🔗 Dataset link: https://www.kaggle.com/datasets/erogluegemen/turkey-postal-codes-dataset-2025

signal oasis
#

Hey @everyone, we’re building AutonomousSphere — an open-source platform where humans and AI agents collaborate in chat rooms to talk, act, and build together.
Think Slack, but with AI teammates! 🤖🛠️

Join our dev community!
✨ Innovate – Help shape the future of Agent-to-Agent (A2A) communication
💻 Contribute – Work on cutting-edge tech like A2A protocols, MCP, and agent marketplaces
🤝 Collaborate – Be part of a growing community of devs, designers, and AI enthusiasts

How to Get Started:
🔗 Website & Discord: (https://www.autonomoussphere.com/)
🌐 GitHub: (https://github.com/cybertheory/autonomoussphere)
⭐ Star the repo and pitch in with ideas, code, or designs — every bit helps!

Let’s build the Autonomous Internet together!!

GitHub

AutonomousSphere is an agentic collaboration server. Agents talk, act, and use tools like teammates. Federated servers form an internet of autonomous teams. Powered by Google’s A2A protocol. - cybe...

dusky tangle
#

Hi all. It looks like this is an OK place to share this. We're building Querri, an AI data analytics platform. Screenshot here, but if you want to see it in action, it's best to just give it a try. It's great for EDA and feature engineering, and it can build models itself, although it's not yet very good at hyperparameter tuning on its own. https://querri.ai/

It makes things much faster and easier than I can do them in a Jupyter notebook...but it still helps to understand data science in order to really get it to deliver on its potential.

Querri

Querri is an AI-powered data analysis platform that lets you explore, clean, and visualize your data using simple natural language. No code, just chat.

dusky tangle
#

Oh... And it's great at generating semi-realistic example data. For businesses we've been demoing it by just giving it the use cases. For example: Create a demo for a restaurant general manager wanting to forecast inventory demand and staffing needs. Include weekly, seasonal, and holiday driven volatility.

regal trail
#

Looks a little old-fashioned

dusky tangle
# regal trail If u don't mind u guys might need to improve ui a bit

I'm not sure I understand. Are you looking at our latest UI? I'd like to understand better what you mean, as it's a complete ground-up UI designed around leveraging the AI, though we're also continually evolving it.

Did you just look at the image or check out the website?

regal trail
#

Website i meant

coral needle
#

Sentiment Analysis Project 🧩

I'm thrilled to share my recent project on sentiment analysis! 🪐 I'm currently enrolled in a government-funded upskilling program that includes work-integrated learning 🏛️, and I'm developing my first capstone project built entirely on my own ideas 🔎. Although using AI tools isn't a requirement, I chose to incorporate them to enhance my project's depth. In particular, I leveraged Gemma 2, which provided some fascinating insights 🔮.

https://www.kaggle.com/code/norikokono/palette-skills-capstone-1

dusty jay
#

Hi everyone, I have worked on a wide variety of projects spanning supervised learning, deep learning, unsupervised learning, natural language processing, computer vision, and time series analysis and forecasting. I would really appreciate it if you could take some time to review my Kaggle profile and look at the projects I have worked on so far. Feel free to upvote my work and leave suggestions or feedback in the comments section so that I can improve it further.

My Kaggle Profile Link: https://www.kaggle.com/sayamkumar/code

dusky tangle
# regal trail Website i meant

It's actually built to mirror a combination of Lovable and ChatGPT. It starts with a basic screen where you can load a dataset or just ask it to build a demo for you.

regal trail
#

-# it's very common nowadays

dusky tangle
regal trail
#

It's okay, but anyway, you guys could do something like a beta version that you run for free for about a month for the first 200-300 users

dusky tangle
#

The latest version is... incredibly smart. One user prompt, with the agents working on it, can take 50 LLM calls, but it can do amazing things: generate realistic synthetic data from the most basic prompts, or run EDA all on its own.

#

You can demo it for free.

regal trail
regal trail
dusky tangle
#

Suit yourself but I thought you just asked about demo/beta.

#

Beta was over a year ago. $150k ARR, but that was selling the previous version at $1k+ per month.

regal trail
#

Nice

dusky tangle
#

The official launch was on leap day last year, but we relaunched just a couple of weeks ago with the new, fully agentic version.