#computer-vision | Kaggle | Page 1

zinc plaza Sep 1, 2023, 9:32 PM

#

Hey all, How are we doing? Would love to hear what you are working on 🙂

past terrace Sep 3, 2023, 11:05 PM

#

zinc plaza Hey all, How are we doing? Would love to hear what you are working on 🙂

a brain tumor mri classifier

crisp meadow Sep 5, 2023, 1:14 AM

#

Maybe try asking in #❓┊ask-a-question ? Or even Streamlit's community forums. I'm not able to tell what part of your question is relevant to computer vision.

obtuse veldt Sep 5, 2023, 1:15 AM

#

crisp meadow Maybe try asking in #❓┊ask-a-question ? Or even Streamlit's community forum...

My app is based on computer vision but you are right

zinc plaza Sep 5, 2023, 9:25 PM

#

past terrace a brain tumor mri classifier

Exciting, I've worked in Radiography based classifiers

ancient swift Sep 6, 2023, 7:47 AM

#

Hi @ Everyone, I am working on a use case related to detecting the fluid level of a battery from images and calculating the percentage of fluid. Any help in this regard is highly appreciated. Please let me know if you have any ideas/suggestions for handling this use case.

icy pebble Sep 6, 2023, 5:18 PM

#

I’m working on building a CV model to aid farmers. Would deploy on streamlit and mobile 🙂

eternal remnant Sep 6, 2023, 5:24 PM

#

Hello Everyone great to be here😃👍

azure nymph Sep 6, 2023, 7:12 PM

#

Hi everyone
I am currently working with a library called OpenPose
Need help with the basic concepts

grave cradle Sep 8, 2023, 7:49 PM

#

azure nymph Hi everyone I am currently working over a library called openpose Need help re...

Sorry. Haven't heard of it.

visual quartz Sep 10, 2023, 11:50 AM

#

azure nymph Hi everyone I am currently working over a library called openpose Need help re...

can you be a bit more specific about what help you need?

azure nymph Sep 10, 2023, 12:44 PM

#

@visual quartz actually I am working on a live attendance system
Which would recognise the head and hand position, calculate the time and mark system.

visual quartz Sep 10, 2023, 5:46 PM

#

azure nymph @visual quartz actually I am working on live attendance system Which wou...

okay, and what is not working for you? Or is everything working and you want someone to review your code?

icy pebble Sep 13, 2023, 6:35 AM

#

can visual prompting be used for image classification

pearl sequoia Sep 23, 2023, 8:48 AM

#

Hello everyone, I'm working on a pose estimation model that can be used to detect/diagnose early-onset neurodegenerative conditions. I have a working model but am finding it really hard to get datasets/videos; one particular platform, Dementias Platform UK, has a 28-day waiting list. Any leads on where I can get video data would be awesome.

rich temple Sep 26, 2023, 5:14 AM

#

Anyone working on remote sensing + deep learning?

rare current Sep 29, 2023, 7:21 PM

#

Hi.
I'm looking for an image dataset of small plastics on the beach.
The idea is to identify plastic straws, candy wrappers, popsicle packaging, plastic bags, bottle caps, plastic labels, etc.
Can someone point me to a project about this, or similar to it?
Thank you.

drowsy sundial Oct 11, 2023, 10:02 AM

#

In a hurry: I need a satellite dataset for a segmentation graduation project!

zenith rampart Oct 12, 2023, 8:18 PM

#

Anyone working on https://www.kaggle.com/competitions/rsna-2022-cervical-spine-fracture-detection/overview?

RSNA 2022 Cervical Spine Fracture Detection

Identify cervical fractures from scans

drowsy sundial Oct 18, 2023, 2:34 PM

#

I'm going to do segmentation on the xBD dataset, which contains images and masks, each with a pre-disaster and a post-disaster version. How do I preprocess this data and make it ready for my model? And what should the model look like so that I can give it the before and after images with their labels and get the final mask?

heady sage Oct 23, 2023, 2:11 PM

#

edgy wedge Nov 2, 2023, 2:13 AM

#

Hello Kagglers. Who can help me with the point cloud reconstruction?

boreal grove Nov 2, 2023, 3:53 PM

#

edgy wedge Hello Kagglers. Who can help me with the point cloud reconstruction?

what technique are you looking to use

#

there's this really cool paper called RL-GAN-Net

edgy wedge Nov 2, 2023, 3:57 PM

#

boreal grove what technique are you looking to use

Is it the best approach?

boreal grove Nov 2, 2023, 4:01 PM

#

I don't think so, it was a paper from 2019 iirc and 4 years is a long time in deep learning
But it's definitely very interesting, and holds a lot of potential imo

#

if you're going for a production use tho, something like samplenet or yolo3D might be better

#

I haven't done a benchmark study of this area but these are some interesting things to check out ig ^^

edgy wedge Nov 3, 2023, 2:22 AM

#

boreal grove I don't think so, it was a paper from 2019 iirc and 4 years is a long time in de...

ty

edgy wedge Nov 4, 2023, 6:47 AM

#

@everyone I have a las file. Who can help me to reconstruct this point cloud data using AI?

vocal whale Jan 6, 2024, 1:20 PM

#

Man this has not been active in a looooong time

wispy galleon Jan 6, 2024, 6:25 PM

#

Hi everyone, if you are into computer vision we can have a separate group to form a team for competitions and projects. Just write to me.

eager wagon Jan 18, 2024, 11:40 PM

#

Hey, in training a cv model, how much variation in mAP50 and mAP90 values is considered ok?

drowsy sundial Jan 25, 2024, 2:07 AM

#

I want to fine-tune the SAM model on a dataset of 256×256 images, but SAM only takes 1024×1024 inputs. How can I solve this problem?

grand moat Feb 4, 2024, 4:15 PM

#

Hey everyone,

I hope you're all doing well. I'm currently facing a challenge with an object detection dataset, specifically related to class imbalance. My dataset is in YOLOv5 format. I'm exploring image augmentation techniques to address it. Although I can generate augmented images, the missing piece is the corresponding annotations, specifically creating annotation files like label.txt.

I'm a bit unsure about the best practices for generating these annotations for augmented images. If anyone has insights or guidance on this matter, I'd really appreciate your help!

Thanks a ton!

Latifur Rahman Zihad
Undergrad student
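
For geometric augmentations, the label file can usually be derived from the original one instead of being re-annotated. A minimal sketch for a horizontal flip, assuming the common YOLO txt layout of `class x_center y_center width height` with coordinates normalized to [0, 1]; the function name is hypothetical and only for illustration:

```python
def flip_yolo_labels_horizontally(label_text: str) -> str:
    """Return YOLO label lines matching a horizontally flipped image.

    Assumes each line is `class x_center y_center width height`,
    with all coordinates normalized to [0, 1].
    """
    flipped_lines = []
    for line in label_text.strip().splitlines():
        cls, x_c, y_c, w, h = line.split()
        # A left-right mirror only moves the box centre; the size is unchanged.
        new_x_c = 1.0 - float(x_c)
        flipped_lines.append(
            f"{cls} {new_x_c:.6f} {float(y_c):.6f} {float(w):.6f} {float(h):.6f}"
        )
    return "\n".join(flipped_lines)


original = "0 0.25 0.4 0.1 0.2\n2 0.7 0.5 0.3 0.3"
print(flip_yolo_labels_horizontally(original))
```

The flipped text would be written next to the flipped image under the same base name. Augmentation libraries such as Albumentations can also transform boxes together with the image, which may be simpler for rotations and crops.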

lament perch Feb 7, 2024, 5:45 PM

#

grand moat Hey everyone, I hope you're all doing well. I'm currently facing a challenge in...

did you solve your issue? i think i can help you

drowsy sundial Mar 6, 2024, 5:09 PM

#

!!!! URGENT

Is this the normal or the right format for the images to go to the model after doing normalization to them ??

tawny saffron Mar 12, 2024, 5:48 AM

#

Hi everyone. I am working with the ROCO dataset and trying to produce captions for radiology images. #llms I need help on this
I am looking for expert advice so that I can get a solution. #llms #llm-detect-ai-generated-text
Here is the link of my code. https://colab.research.google.com/drive/1SLHq31HcjhUK1dIuNaCNP4dkR66TMd71?usp=sharing

Google Colaboratory

crystal sequoia Apr 3, 2024, 7:08 PM

#

Hi everyone! Check my notebook about Melanoma cancer detection. Your support is greatly appreciated!

https://www.kaggle.com/code/seifwael123/melanoma-cancer-efficientnet

Melanoma Cancer 🦠| EfficientNet

Explore and run machine learning code with Kaggle Notebooks | Using data from 🔬Melanoma Cancer Image Dataset

drowsy sundial Apr 14, 2024, 4:21 PM

#

Has anyone here tried to fine-tune Meta's Segment Anything model on satellite images?

blazing plinth Apr 19, 2024, 3:52 PM

#

I replicated the TinyVGG architecture for the CIFAR dataset, but it didn't work and gave very low accuracy. Is there a good model for this task?

chilly tulip Apr 25, 2024, 1:40 PM

#

For those interested in Medical Image Segmentation, I'm sharing two preprocessed benchmark datasets for cardiac segmentation.
Additionally, weakly-supervised learning, particularly scribble-supervised learning, has been gaining popularity in recent years. This is due to the high cost and difficulty of traditional labeling, especially in the medical field where data sensitivity is paramount.
Therefore, each image in my datasets also comes with corresponding scribble labels, facilitating superior learning in cardiac segmentation.
Moreover, I've included notebooks to guide you on how to load and visualize the data. To learn more about these datasets and access the code, feel free to visit the links below:
https://www.kaggle.com/datasets/anhoangvo/acdc-dataset
https://www.kaggle.com/datasets/anhoangvo/mscmrseg

ACDC Dataset

Data for Automated Cardiac Diagnosis Challenge (ACDC) [Segmentation Task]

MSCMRSeg

Data for MS-CMRSeg 2019: Multi-sequence Cardiac MR Segmentation Challenge

cinder charm May 3, 2024, 4:59 PM

#

drowsy sundial I want to fine-tune the SAM model on a dataset of 256 image sizes, but SAM only ...

I'd just resize to 1024x1024
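
A small sketch of the arithmetic behind that suggestion, assuming a longest-side resize to 1024 followed by bottom/right padding to a square (my understanding of how the SAM repo prepares inputs, not a copy of its code). Both helper names are hypothetical:

```python
def target_shape_and_padding(height: int, width: int, target: int = 1024):
    """Scale so the longest side equals `target`, then report the padding needed."""
    scale = target / max(height, width)
    new_h = int(height * scale + 0.5)
    new_w = int(width * scale + 0.5)
    # Padding (bottom, right) that makes the resized image target x target.
    return scale, (new_h, new_w), (target - new_h, target - new_w)


def scale_box(box, scale: float):
    """Scale an (x0, y0, x1, y1) box prompt by the same factor as the image."""
    return tuple(coord * scale for coord in box)


scale, shape, padding = target_shape_and_padding(256, 256)
print(scale, shape, padding)            # 4.0 (1024, 1024) (0, 0)
print(scale_box((10, 20, 100, 200), scale))
```

For square 256×256 images the factor is exactly 4 and no padding is needed. Masks would need the same resize (nearest-neighbour, so label values stay intact), and any box or point prompts scale by the same factor.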

cinder charm May 3, 2024, 5:04 PM

#

drowsy sundial !!!! URGENT Is this the normal or the right format for the images to go to the...

For SAM they have a preprocess function which will get the image into the correct format; see their GitHub here: https://github.com/facebookresearch/segment-anything/blob/6fdee8f2727f4506cfbbe553e23b895e27956588/segment_anything/modeling/sam.py#L164 . Also here is a code example: image_for_sam = sam_model.preprocess(image)  # here image would be a tensor in format (B,C,H,W) where C=3 and the pixel/color values are in the range 0-255 (it is kinda odd)

GitHub

segment-anything/segment_anything/modeling/sam.py at 6fdee8f2727f45...

The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model. -...

flat summit Jun 12, 2024, 8:45 AM

#

Hello everyone.
Is there anyone who can help me improve my background extraction code, which extracts the garment in the foreground from any type of background it is on?
Here is my code:
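# Imports assumed from the function names used below (SciPy / scikit-image)
import numpy as np
from scipy.ndimage import binary_fill_holes, convolve, gaussian_filter
from skimage.color import rgb2gray
from skimage.filters import threshold_li
from skimage.morphology import disk, erosion, white_tophat

def background_extraction(image):
    # Drop the alpha channel if the image is RGBA
    if image.shape[2] == 4:
        image = image[:, :, 0:3]
    gray = rgb2gray(image)

    #img_equalized=exposure.equalize_hist(gray)
    #gauss=gaussian_filter(img_equalized,1)
    gauss_top = white_tophat(gray, disk(2))
    # Originally extraction_class.outline_detection; called directly since it is defined below
    outline = outline_detection(gauss_top, sigma=1)
    #outline = white_tophat(gauss, disk(2))
    # Fill the closed outline to get a foreground mask
    filled_outline = binary_fill_holes(outline)

    structurant_element = disk(2)
    smoothing = erosion(filled_outline, structurant_element)
    #smoothing=dilation(filled_outline,structurant_element)
    # Zero out every pixel outside the mask
    extraction = np.copy(image)
    extraction[~smoothing] = 0

    return extraction

def outline_detection(image, sigma):
    gauss = gaussian_filter(image, sigma)
    sobelX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
    sobelY = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]])
    derivX = convolve(gauss, sobelX)
    derivY = convolve(gauss, sobelY)
    # Gradient magnitude computed as |dx + i*dy|
    gradient = derivX + derivY * 1j
    G = np.absolute(gradient)
    threshold = threshold_li(G)
    # Vectorized form of the per-pixel loop: suppress weak edges
    G[G < threshold] = 0.0
    return G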

slate granite Jun 22, 2024, 6:24 PM

#

Hello guys

#

I wanna ask

#

How to do object detection in java huh

carmine flame Jul 14, 2024, 11:44 AM

#

anyone want to work in collab for solving the Petal classification on TPU challenge?

shell kite Jul 19, 2024, 3:45 PM

#

carmine flame anyone want to work in collab for solving the Petal classification on TPU challe...

Me.

carmine flame Jul 22, 2024, 6:55 AM

#

shell kite Me.

Thanks for replying. Are you on twitter?

shell kite Jul 22, 2024, 7:00 AM

#

carmine flame Thanks for replying. Are you on twitter?

No.

mystic heath Jul 28, 2024, 1:34 AM

#

Hello, I want to build some computer vision AI projects to put on my resume. Do you have any great computer vision projects that you would recommend?

slate granite Aug 3, 2024, 2:36 AM

#

Ohh ye, I have tried a hand recognition calculator like the one Apple just introduced; it uses MediaPipe for hand gesture and movement tracking, plus character recognition and some logic to make it work

rancid hemlock Aug 9, 2024, 5:34 PM

#

Hi, Kaggle community!

My friend Alex and I are developing a tool to make deep neural network training interactive, user-friendly, and deterministic.

Are you tired of the guesswork and never-ending experimentation?
Are you seeking an alternative methodology that makes the process less painful and more understandable?

We'd love to hear from you! Please share your thoughts and help us to upgrade the current methodology.

LINK: https://calendly.com/graybx/30min <<

Calendly

Demo | Graybox - Luigi d'Ovidio

dapper path Aug 12, 2024, 4:16 PM

#

Hey guys, is the ImageDataGenerator in Keras deprecated? If so, what's the latest standard way of doing image data augmentation?
Can we use the augmentation layers directly before the main model, or is it better to create a separate Sequential pipeline for augmentation?

stone seal Aug 20, 2024, 6:33 PM

#

Is it better to use OpenCL or just OpenCV

finite nova Aug 22, 2024, 2:34 PM

#

hi guys, does anyone have any experience with roboflow models? I'm doing an ML project for the first time and I was instructed to make my dataset and train my model there, but I'm lost on how to proceed further. if anyone has any experience, please lmk so I can consult you. thank you in advance

fathom crater Aug 27, 2024, 5:59 PM

#

https://www.kaggle.com/datasets/alanjo/gpu-scores-with-cuda-metal-opencl-vulkan
sharing OpenCL, Metal and Vulkan scores dataset

💥GPU - CUDA, Metal, OpenCL, Vulkan Scores📊

Graphics Card Benchmarks with Modern Graphics APIs

river bobcat Oct 17, 2024, 6:15 PM

#

Hello fellow Computer Vision enthusiasts 😄

hot perch Oct 18, 2024, 4:12 AM

#

river bobcat Hello fellow Computer Vision enthusiasts 😄

Hi! how's the day going?

vagrant moon Oct 26, 2024, 3:24 PM

#

hey guys, I know this is a dumb question

#

but does anyone know how to take the (pretrained) YOLO model out of ultralytics and train it with a self-defined training loop?

#

sad_panda unsettled_tom

robust reef Oct 31, 2024, 4:27 AM

#

Hello @here guys, I'm starting my project on air quality detection and I'm searching for image datasets. If you know any source please let me know

vast horizon Nov 6, 2024, 5:59 PM

#

robust reef Hello @here guys i'm starting my project in AIR quality detection and i'm searc...

https://www.kaggle.com/datasets/adarshrouniyar/air-pollution-image-dataset-from-india-and-nepal/data

Air Pollution Image Dataset from India and Nepal

Please Fork and Star our work by visiting our GitHub Repository, Link in About.

fallen saddle Nov 11, 2024, 10:54 AM

#

Hi, I am Abdullah. I am an ML engineer and want to join a team to participate in Kaggle competitions

inner willow Nov 13, 2024, 10:46 PM

#

river bobcat Hello fellow Computer Vision enthusiasts 😄

Hi Tom

river bobcat Nov 13, 2024, 10:47 PM

#

Howdy!

inner willow Nov 13, 2024, 10:52 PM

#

Hey, please help me find out a good reference doc for CV (like whitepapers in 5dgai session), especially focusing on implementing OCR.

#

Would love to connect Tom. Can you help me with this stuff?

river bobcat Nov 13, 2024, 10:54 PM

#

What have you found from your own searching? Have you tried any of the many OCR tutorials found on youtube/google search/etc?

inner willow Nov 13, 2024, 10:57 PM

#

Yes, I can search the internet, but I'm worried that I'd end up with non-relevant docs that would confuse me more. That's why I'm seeking expert help

#

I have done course on the basic architecture of the CNN models. I need help with the implementations

river bobcat Nov 13, 2024, 10:58 PM

#

My advice: try one of the "non-relevant docs" that specifically says it's a tutorial on how to implement OCR. If that doesn't work, try another one and see what the differences are. If you don't understand something in the tutorial, search specifically on that topic and figure out what you don't understand.

inner willow Nov 13, 2024, 11:00 PM

#

Why don't I start with one suggested doc?

#

BTW, great advice buddy, thanks

river bobcat Nov 13, 2024, 11:03 PM

#

I apologize if it's not exactly what you were looking for, but it's the way to learn on your own. In my opinion, it's better to seek advice after you tried something and have specific questions, rather than asking for someone to walk you through something you haven't even tried.

inner willow Nov 13, 2024, 11:04 PM

#

Yeah, got it

lapis cosmos Nov 14, 2024, 12:18 PM

#

Hi everyone! I'm exploring ways to use image-to-image models to predict deformations and failure loads in stone arch structures based on initial geometry. If anyone has worked on similar structural prediction tasks, used GANs for FEA simulations, or has some insights and paper references, please send me a DM.

austere arch Nov 15, 2024, 10:25 PM

#

river bobcat I apologize if it's not exactly what you were looking for, but it's the way to l...

Beautifully said

#

♥️

river bobcat Nov 15, 2024, 10:29 PM

#

austere arch Beautifully said

That's kind of you to say -- it's a lesson I learned the hard way, too

austere arch Nov 15, 2024, 10:29 PM

#

😁

drowsy jacinth Dec 7, 2024, 1:49 AM

#

I made a cv script that shows the viewers their emotions in real time. Got it running on a buggy intel tablet but with a proper gpu behind it there’s a lot of fun to be had. Some people freak out when it shows their emotion changing and others laugh. It’s an interesting night.

digital zenith Dec 14, 2024, 5:10 PM

#

drowsy jacinth I made a cv script that shows the viewers their emotions in real time. Got it ru...

that sounds so cool!

drowsy jacinth Dec 16, 2024, 1:26 AM

#

Do you want to see it running? I think it’s got enough power to do a choppy screen record.

shell kite Dec 17, 2024, 12:00 PM

#

fallen saddle HI, I am Abdullah I am an ML engineer want to join any team to particapte in k...

me as well can you check your dm

topaz bough Dec 24, 2024, 12:31 PM

#

Hi, what is the best model for human fall detection? Any help would be appreciated

worthy obsidian Dec 27, 2024, 4:54 PM

#

topaz bough Hi, what is the best model for human fall detection? Any helps would be apprecia...

That depends how you want to detect human falls

glossy willow Dec 30, 2024, 2:23 AM

#

drowsy jacinth I made a cv script that shows the viewers their emotions in real time. Got it ru...

✍️

frigid jay Dec 31, 2024, 12:30 PM

#

Hi everyone, I am looking for research ideas in computer vision. Can you help me by suggesting some ideas?

karmic cedar Jan 15, 2025, 1:03 PM

#

I'm stuck with my college minor project. Can anyone help me?🥹
There's this persistent bug that I've been stuck on for hours:
ValueError: It looks like you are using a PerReplica object while not inside a replica context, which is not supported. Try running your op or function inside a replica context by using strategy.run
https://www.kaggle.com/code/sidharthad/custom-training

Custom_training

Explore and run machine learning code with Kaggle Notebooks | Using data from lupus class

hazy tapir Jan 23, 2025, 5:38 PM

#

Handwritten Persian numerals - Generative Adversarial Networks: DCGAN, CycleGAN

DCGAN : https://www.kaggle.com/code/omidsakaki1370/handwritten-persian-numerals-dcgans-pytorch
CycleGAN : https://www.kaggle.com/code/omidsakaki1370/handwritten-persian-numerals-cyclegans-pytorch

Website: https://omidsakaki.ir/Projects

Handwritten Persian numerals-DCGANs & PyTorch

Explore and run machine learning code with Kaggle Notebooks | Using data from Handwritten Persian numerals

Handwritten Persian numerals-CycleGANs & PyTorch

Explore and run machine learning code with Kaggle Notebooks | Using data from Handwritten Persian numerals

داده‌پردازان هوش‌یار

Web site created using create-react-app

round stream Jan 25, 2025, 6:37 AM

#

Hi, I am new to computer vision and I am currently trying multiclass image segmentation using PyTorch, from scratch. I have been stuck on it for the past couple of days. Could anyone help me, please?

trail pebble Jan 31, 2025, 5:16 PM

#

I am working with YOLO and I am new to it, so I want to start by understanding its architecture. I understand YOLO has a backbone, neck, and head. I want to train the head first and then train the backbone and neck to fine-tune the model, so how can I find out which layers count as the backbone and which as the neck?

rough night Feb 17, 2025, 2:09 PM

#

Hello everyone, I recently added some YOLO notebooks related to object detection, image segmentation, and image classification. If you're interested, you can click the link below.

https://www.kaggle.com/discussions/accomplishments/562631

Excited to Share My YOLOv8 Projects: Object Detection, Image Classi...

Excited to Share My YOLOv8 Projects: Object Detection, Image Classification, and Segmentation.

tough heath Mar 1, 2025, 2:39 PM

#

I need a little help in my project anyone who is having good experience in image enhancing and preprocessing

mellow forge Mar 4, 2025, 2:11 AM

#

guys i need help setting up the kaggle gpu, for some reason it's not working even when it's set on the session

hazy tapir Mar 7, 2025, 8:07 PM

#

Generative AI in Computer Vision - Fake Human Face Generator

Website: https://omidsakaki.ir/projects/29
Github: https://github.com/omid-sakaki-ghazvini/Projects/blob/main/generative-ai-face-image-dcgans-cyclegans.ipynb
kaggle: https://www.kaggle.com/code/omidsakaki1370/generative-ai-face-image-dcgans-cyclegans

داده‌پردازان هوش‌یار

Web site created using create-react-app

GitHub

Projects/generative-ai-face-image-dcgans-cyclegans.ipynb at main · ...

داده پردازان هوش یار. Contribute to omid-sakaki-ghazvini/Projects development by creating an account on GitHub.

Generative AI - face image - DCGANs & CycleGans

Explore and run machine learning code with Kaggle Notebooks | Using data from Face Mask Lite Dataset

leaden wraith May 30, 2025, 5:36 PM

#

I'm having trouble with multi-label classification on medical datasets with 15 classes. I am using BCEWithLogitsLoss to combat heavy class imbalance but am still getting very low F1 scores and strict accuracies. Any advice to improve my performance?

fading jetty Jun 22, 2025, 4:24 PM

#

<@&1303433601177751593> scam?

small flare Jun 30, 2025, 5:15 AM

#

Is anyone interested in reviewing the code I wrote for a custom CNN (it is a Colab notebook)? I'd like to know what I need to improve and how much I have got right. It would also be helpful if anyone could guide me on the next steps. Currently I have been able to create a feature map from multiple neurons which slide over the image and do convolution, but all the neurons in the same layer are producing the same output. Is this correct, or is there anything I need to change here?

here is the github link:-
https://github.com/TronYash10101/Custom_CNN

GitHub

GitHub - TronYash10101/Custom_CNN

Contribute to TronYash10101/Custom_CNN development by creating an account on GitHub.

rigid jetty Jul 11, 2025, 8:41 AM

#

Hi, I'm currently working on a CNN built from scratch, and I'm training it on a dataset with MNIST-style digits plus operator symbols. I've trained for 160 batches of 64 images out of 1 epoch, and my cross-entropy loss has plateaued at 2.639, which is ln(14), the loss of a uniform 1/14 guess, so my CNN is effectively guessing at random. I spent the past 7 days debugging this and still can't find the issue. If anyone can help, I have more details, just DM 😄
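
The 2.639 figure can be checked directly: a model that spreads probability evenly over K classes has cross-entropy -ln(1/K) = ln(K). A minimal sketch, taking K = 14 from the "1/14 chance" in the message; the helper name is hypothetical:

```python
import math


def chance_level_loss(num_classes: int) -> float:
    """Cross-entropy (natural log) of a uniform prediction over num_classes."""
    return -math.log(1.0 / num_classes)


print(round(chance_level_loss(14), 3))  # 2.639
```

A loss that sits exactly at this value usually means the outputs carry no information about the input, which points at things like gradients not reaching the weights or labels being misaligned with images, rather than at model capacity.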

quaint needle Jul 12, 2025, 9:41 AM

#

from scratch means numpy only stuff?

#

😭

left magnet Aug 6, 2025, 6:21 PM

#

Hi everyone,

I just open-sourced YOLOv1-PyTorch, a from-scratch PyTorch reimplementation of the original YOLOv1—complete with a hands-on notebook that walks you through every detail:

YOLO-V1-Explanation.ipynb
A comprehensive tutorial that covers:

Environment & Data: setting up PyTorch, downloading Pascal VOC 2007/2012, inspecting class distributions and annotation formats
Data Loader & Augmentation: parsing XML to YOLO’s S×S grid, handling edge cases, and applying on-the-fly transforms (flips, color jitter)
Model Architecture: building each convolutional layer and prediction head exactly as in the original paper, with tensor-shape diagrams
Loss Function: step-by-step derivation of localization, confidence, and classification losses, directly tied to code
Training Loop: configuring hyperparameters, real-time plotting of total vs. per-term losses, checkpointing
Evaluation & Inference: computing IoU/mAP, visualizing ground-truth vs. predictions, implementing non-max suppression, and generating inline GIF demos

YOLO-V1-Pure-Code.ipynb
The same pipeline stripped of commentary—ideal for quick experimentation or integration into your own projects.

Live examples
Pre-rendered outputs (sheep, bicycle) so you can see detection quality before running a single cell.

https://github.com/franciszekparma/YOLOv1-PyTorch

Whether you’re teaching, researching, or prototyping in classic object detection, this repo guides you through both the “why” and the “how.” Feel free to clone, star, file issues, or send PRs!

GitHub

GitHub - franciszekparma/YOLOv1-PyTorch: Comprehensive guide to YOL...

Comprehensive guide to YOLOv1 using PyTorch, built from Scratch - franciszekparma/YOLOv1-PyTorch

primal plinth Aug 26, 2025, 9:48 PM

#

Hello everyone,
I need some guidance regarding deep learning for our Final Year Project (FYP).

Our project heavily depends on model training using both images and text, but currently, we are focusing on medical image model training. The problem is—we are not sure about the proper direction to follow. When we try to follow GPT instructions, the model doesn’t train effectively, and we lack clarity on how to evaluate and decide what to apply at each step.

Could anyone experienced in deep learning please guide us on the correct path? Specifically, we need clarity on:

Understanding graphs/visualizations (loss vs accuracy, confusion matrix, PR curves, etc.)

Model selection (e.g., ResNet vs DenseNet vs EfficientNet, pretrained vs from scratch)

Evaluation metrics (accuracy, precision, recall, F1, AUC, etc.)

Data augmentation techniques (rotation, flipping, normalization, mixup, etc.)

Overfitting/underfitting (how to detect and solve using dropout, regularization, early stopping)

Hyperparameter tuning (learning rate, batch size, optimizers, etc.)

Training workflow (train/val/test split, cross-validation, reproducibility)

Basically, GPT gives us the “what to do” list, but not the how to properly evaluate, compare, and decide among these options. What’s the right way to structure and approach this whole process so we can train and evaluate our model effectively?

snow bluff Aug 29, 2025, 2:17 AM

#

Job Title: Part-Time Senior AI/ML Engineer (Remote)

We are seeking a skilled and experienced Senior AI/ML Engineer to join our remote team on a part-time basis. The ideal candidate will have a strong technical background, excellent communication skills, and the ability to work independently in a fast-paced environment.

Requirements:
-Minimum of 9–10 years of professional software development experience

-Proven experience working effectively in a remote environment

-Advanced English proficiency (C1 or higher); an American accent is preferred

-Availability to work 10–15 hours per week during EST or CST business hours

If you're a highly motivated engineer with a passion for building high-quality software and can commit to a flexible part-time schedule, we’d love to hear from you.
You can connect with me on WhatsApp: +1 (910) 386-2433

vast horizon Sep 3, 2025, 1:27 PM

#

primal plinth Hello everyone, I need some guidance regarding deep learning for our Final Year ...

Make a dl agent inspired from mle star that can guide u better

snow bluff Sep 5, 2025, 4:10 PM

#

Job Title: Part-Time Senior AI/ML Engineer (Remote)

We are seeking a skilled and experienced Senior AI/ML Engineer to join our remote team on a part-time basis. The ideal candidate will have a strong technical background, excellent communication skills, and the ability to work independently in a fast-paced environment.

Requirements:
-Minimum of 7–10 years of professional software development experience

-Proven experience working effectively in a remote environment

-Advanced English proficiency (C1 or higher); an American accent is preferred

-Availability to work 10–15 hours per week during EST or CST business hours

If you're a highly motivated engineer with a passion for building high-quality software and can commit to a flexible part-time schedule, we’d love to hear from you.
You can connect with me on WhatsApp: +1 (567) 469-5384

shadow musk Oct 12, 2025, 7:41 PM

#

Does anyone know what's the best way to detect characters/symbols on cube-shaped objects?

#

Like, what's the best way to augment the data? 2D augmentation or 3D augmentation?

true sedge Oct 23, 2025, 8:21 PM

#

shadow musk Does anyone know what's the best way to detect characters/symbols on cube-shaped...

Hi! Hope you're doing well, actually that's pretty blurry for me, I'd say that if your data comes from a real 3d you could simply render each face and apply simple matching

If it comes from an image, you'll need to apply transformation to your original matching dataset, i'd say if you could detect the cube and its edges you'll be able to determine the transformation you'll need to do

You'll always be free to use huffman strategies or whatever to optimize your model, but the core goal is for your model to work, not to be perfectly optimized. If your model has a broader utility, you'll have to teach it similar methods. Think from first principles. Hope I didn't miss the point

narrow spoke Oct 31, 2025, 4:00 PM

#

Hi! I just finished the Intro to DL and CV courses + the ML Zero to Hero series from TensorFlow. What competitions would you recommend I try, other than the getting-started ones, to get some experience?

rose zealot Nov 3, 2025, 5:07 PM

#

hi

plucky elk Nov 3, 2025, 8:14 PM

#

narrow spoke Hi! I just finished the Intro to DL and CV courses + the ML Zero to Hero series ...

if you find anything please tell me too

wary zinc Nov 10, 2025, 3:44 PM

#

Hi, just joined the server

kind knoll Nov 10, 2025, 5:51 PM

#

I'm looking for a US developer to collaborate with. If anybody is interested, please DM me.

rocky flax Nov 11, 2025, 4:05 AM

#

Hello!

rocky flax Nov 11, 2025, 9:04 AM

#

I'm proficient in CNNs. Do you all know any free courses with certification?

reef bluff Nov 12, 2025, 3:00 PM

#

Hi I'm a CV engineer from Nigeria

obtuse rain Dec 12, 2025, 7:44 PM

#

reef bluff Hi I'm a CV engineer from Nigeria

Hi, i'm newbie in LM from Belarus. I want to learn the CV

reef bluff Dec 14, 2025, 2:08 PM

#

obtuse rain Hi, i'm newbie in LM from Belarus. I want to learn the CV

🫸

slim olive Dec 26, 2025, 6:24 PM

#

Hey everyone 👋
Quick question for folks building chatbots / LLM apps here —
how are you currently handling long-term user memory beyond a single session?
Curious what’s actually working in practice (RAG, DB, custom hacks, etc).

raven zephyr Dec 30, 2025, 11:58 AM

#

Newbie in PyTorch from Pakistan. Excited to get to know y'all

tacit idol Jan 28, 2026, 1:13 PM

#

https://www.kaggle.com/code/dastgeerjutt/generative-ai-full-map-kaggle-professional

Please upvote @everyone

lean latch Jan 30, 2026, 9:14 PM

#

Can I use the MS COCO dataset for training a model for commercial use? As I read it, the annotations are under a license that allows it, but since the images themselves were taken from Flickr, some of them are under other licenses... so I don't know whether I am allowed to use it or not?

#

I read on some website that I am allowed to use it for training in commercial use, but I dont know if i want to trust some random website

lean latch Feb 1, 2026, 1:05 PM

#

so no one was ever concerned about it, or we don't have many computer vision engineers who have worked on commercial projects

#

or i am just being ignored

twilit meadow Feb 2, 2026, 10:14 AM

#

slim olive Hey everyone 👋 Quick question for folks building chatbots / LLM apps here — how...

Hello,
Did you find anything by any chance?

lean latch Feb 2, 2026, 6:29 PM

#

are all datasets on Roboflow intended for commercial use?

#

Like, I see some subsets of MS COCO with a license that in theory allows commercial use, but again, is it there just because someone put it there, or does Roboflow own the rights to put any kind of license on datasets posted on their website?

plain belfry Feb 6, 2026, 2:01 PM

#

Thanks to everyone for supporting me!
I uploaded a new notebook, "Retail Store Product Sales Simulation—Exploratory Data Analysis (EDA)."
Link: https://www.kaggle.com/code/hammadansari7/retail-store-product-sales-exploratory-analysis
@everyone

slim olive Feb 9, 2026, 3:35 AM

#

twilit meadow Hello, Did you find anything by any chance?

Yeah, a few patterns seem to work in practice:
• Explicit memory layers outside the LLM (DB / KV store) for user facts & preferences
• Embedding-based recall (RAG) for long-term, fuzzy memory
• A light memory manager that decides what to store vs what to forget
In one of my projects (orbmem, www.orbmem.online), we separate:
short-term conversational state
long-term user memory (facts, preferences, habits)
and retrieve only relevant memories per turn instead of stuffing everything into context.
RAG alone wasn’t enough — combining structured memory + embeddings worked better for us.
Curious what others here are using in production.
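
A toy sketch of the split described above, with short-term state kept apart from long-term memory and only relevant memories pulled per turn. Keyword overlap stands in for embedding similarity here, and every class and method name is hypothetical; this is not orbmem's API:

```python
from collections import deque


class ToyMemory:
    def __init__(self, short_term_turns: int = 4):
        # Short-term conversational state: only the most recent turns are kept.
        self.short_term = deque(maxlen=short_term_turns)
        # Long-term user memory: durable facts, preferences, habits.
        self.long_term = []

    def add_turn(self, text: str) -> None:
        self.short_term.append(text)

    def remember(self, fact: str) -> None:
        self.long_term.append(fact)

    def recall(self, query: str, top_k: int = 2):
        """Return up to top_k facts sharing words with the query (embedding stand-in)."""
        query_words = set(query.lower().split())
        scored = []
        for fact in self.long_term:
            overlap = len(query_words & set(fact.lower().split()))
            if overlap > 0:
                scored.append((overlap, fact))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [fact for _, fact in scored[:top_k]]

    def build_context(self, query: str) -> dict:
        # Only relevant memories go into the prompt, not the whole store.
        return {"recent_turns": list(self.short_term), "memories": self.recall(query)}


memory = ToyMemory()
memory.remember("user prefers vegetarian food")
memory.remember("user lives in Berlin")
memory.add_turn("hi there")
print(memory.build_context("what vegetarian food should I cook"))
```

In a real system the `recall` step would be a vector search and `remember` would sit behind some policy deciding what is worth storing, which is the "memory manager" role mentioned in the message.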

prisma zealot Feb 19, 2026, 4:13 PM

#

anyone working in computer vision and space applications?

grim dragon Mar 5, 2026, 6:18 PM

#

What kind of space application ?