#data-science-and-ml | Python | Page 163

agile cobalt Apr 8, 2025, 4:15 AM

#

yeah the only reason for using it I could imagine would be portability (taking the Raspberry and BCI gear with you)

limber spear Apr 8, 2025, 6:32 AM

#

Grab a base nvidia gtx 1000 series to start. That is what I am starting with. Most of my model training is on a 10 year old intel Xeon cpu and a 10 year old nvidia quadro m1000m gpu, which pretty much has like 20 CUDA cores

#

Just barely bought it

#

It’s like $60 @ Walmart ~750 cuda cores give or take

deep bough Apr 8, 2025, 9:53 AM

#

Hi, I need some help. I have to build an invoice categorizer. I’ve already created one, but it breaks when I upload a PDF with a different template. Is there something I can do about that? I’m thinking I should start from scratch.

past meteor Apr 8, 2025, 10:46 AM

#

I think a lot of the BCI stuff is done with SVMs and other algos running on CPU

#

You might actually be fine with a raspberry pi in that case

#

Cause the (typically) tiny sample sizes and extremely high dimensionality are where SVMs still reign supreme

#

A bigger concern is whether or not all the libraries you want to use will support ARM

#

Yeah, KNN is an algorithm you probably should never use. You should be able to train the net on Google Colab using their GPUs and do inference on your CPU. Especially for image processing, and given that you have 1000 images, you should go down that route imo

serene grail Apr 8, 2025, 11:20 AM

#

past meteor Yeah, KNN is an algorithm you probably should never use. You should be able to t...

Why is KNN something you should never use? Is it because there are better modern algorithms or something else? Just curious

past meteor Apr 8, 2025, 11:29 AM

#

serene grail Why is KNN something you should never use? Is it because there are better modern...

It’s very slow and prone to overfitting

#

Amongst other problems

#

By default it also doesn’t consider 1 feature more important than others

serene grail Apr 8, 2025, 11:33 AM

#

I see, thank you!

past meteor Apr 8, 2025, 11:33 AM

#

Sure you can probably assign weights but the default implementations just do the unweighted (typically Euclidean) distance between the 2 vectors and that’s it. In reality the distance of 1 feature should contribute more to the overall distance than others, and we also want to learn this scoring and not hard code it. If you add this requirement then you’ve arrived at how most real ML algos work

#

The idea that we want to learn this scoring is exactly what the coefficients of logistic regression mean btw
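
A minimal sketch of the feature-weighting point, in plain Python. The toy points, labels, and weights are made up for illustration and are not taken from any library's defaults. It shows that an unweighted Euclidean distance treats both features equally, and that down-weighting one feature can change which neighbor is nearest.

```python
import math

def weighted_distance(a, b, weights):
    """Euclidean distance where each feature's squared difference is scaled by a weight."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

def nearest_label(query, points, labels, weights):
    """1-nearest-neighbor: return the label of the closest training point."""
    distances = [weighted_distance(query, p, weights) for p in points]
    return labels[distances.index(min(distances))]

# Hypothetical toy data: feature 0 is informative, feature 1 is mostly noise.
points = [(0.0, 5.0), (3.0, 0.0)]
labels = ["a", "b"]
query = (0.5, 0.0)

# Unweighted: the noisy feature dominates the distance, so the query lands on "b".
unweighted = nearest_label(query, points, labels, weights=(1.0, 1.0))
# Hand-picked weights that down-weight feature 1: the query lands on "a".
weighted = nearest_label(query, points, labels, weights=(1.0, 0.01))
print(unweighted, weighted)
```

The weights here are hard-coded by hand. The messages above are about learning them from data instead, which is roughly the role the coefficients play in logistic regression.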

serene grail Apr 8, 2025, 12:29 PM

#

past meteor Sure you can probably assign weights but the default implementations just do the...

"we want to learn this scoring" here means that we want the model during training to learn how much weight to assign to this 1 feature vs others?
Or does scoring mean something else?

past meteor Apr 8, 2025, 2:45 PM

#

serene grail "we want to learn this scoring" here means that we want the model during trainin...

Correct

dense pivot Apr 8, 2025, 3:16 PM

#

Is it possible to train custom ai model on virtual machines for a short period of time, or does the training have to run continuously?

serene scaffold Apr 8, 2025, 3:18 PM

#

dense pivot Is it possible to train custom ai model on virtual machines for a short period o...

if it's a neural network, you usually want to keep training until the loss stops decreasing. but you can pause training if you have to.

#

you don't just keep training a model forever, though.
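
A rough sketch of "train until the loss stops decreasing", in plain Python. The names `train_until_plateau`, `patience`, and `min_delta` are hypothetical, and the "training step" is a stand-in (gradient descent on a one-parameter quadratic), not a real neural network.

```python
def train_until_plateau(step_fn, max_steps=1000, patience=5, min_delta=1e-6):
    """Call step_fn() repeatedly; stop once the loss has not improved for `patience` steps."""
    best_loss = float("inf")
    steps_without_improvement = 0
    history = []
    for _ in range(max_steps):
        loss = step_fn()
        history.append(loss)
        if best_loss - loss > min_delta:
            # Meaningful improvement: remember it and reset the counter.
            best_loss = loss
            steps_without_improvement = 0
        else:
            steps_without_improvement += 1
            if steps_without_improvement >= patience:
                break
    return best_loss, history

# Hypothetical stand-in for a real training step: gradient descent on (w - 3)^2.
state = {"w": 0.0}

def step_fn(lr=0.1):
    grad = 2.0 * (state["w"] - 3.0)
    state["w"] -= lr * grad
    return (state["w"] - 3.0) ** 2

best_loss, history = train_until_plateau(step_fn)
print(len(history), best_loss)
```

In practice the loss being monitored is usually a validation loss rather than the training loss, and the stopping rule is often provided by the training library.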

dense pivot Apr 8, 2025, 3:23 PM

#

serene scaffold if it's a neural network, you usually want to keep training until the loss stops...

If I want to train it for a long time, is raspberry pi ok or not?

serene scaffold Apr 8, 2025, 3:23 PM

#

dense pivot If I want to train it for a long time, is raspberry pi ok or not?

you probably can't train at all, for any amount of time, on a raspberry pi.

dense pivot Apr 8, 2025, 3:23 PM

#

serene scaffold you probably can't train at all, for any amount of time, on a raspberry pi.

I got no spare pc to do that

#

How can I train if I want to

serene scaffold Apr 8, 2025, 3:24 PM

#

dense pivot How can I train if I want to

google colab, maybe

viscid urchin Apr 8, 2025, 3:49 PM

#

There's also https://vast.ai/ if you just need a short-term rental

dusk fractal Apr 8, 2025, 7:08 PM

#

What model from https://pypi.org/project/g4f/ can I use to create creative&scientific writings of around 6000-8000 words? Considering I won't have to pay anything

#

I've tried the paid deepseek api and it can only return 1500-2000 word texts at a time

#

And I've only tried that because it is the cheapest

#

All the others are too expensive, considering I'll be needing about 400k-500k tokens a day of output, my max monthly budget is 20$, that's why I'm looking at G4F models

agile cobalt Apr 8, 2025, 7:13 PM

#

dusk fractal What model from https://pypi.org/project/g4f/ can I use to create creative&scien...

That package is literally a collection of exploits targeting vulnerable websites to hijack their APIs and steal from them, don't use it

If one of the first things you see when you open a package's description is a Legal Notice, you probably should notice there is something weird about it

dusk fractal Apr 8, 2025, 7:15 PM

#

I don't care about ethics considering I'm poor and those companies are rich...

#

If I had the money I would've just paid for the o1 api but I don't have it

viscid urchin Apr 8, 2025, 7:15 PM

#

Sure but

#

!rule 5

arctic wedgeBOT Apr 8, 2025, 7:15 PM

#

Rules

5. Do not provide or request help on projects that may violate terms of service, or that may be deemed inappropriate, malicious, or illegal.

agile cobalt Apr 8, 2025, 7:16 PM

#

You can try using some of the "free" models under https://openrouter.ai instead, it's mostly companies voluntarily offering spare compute to test their infrastructure or trying to put their name out there to attract customers

dusk fractal Apr 8, 2025, 7:18 PM

#

agile cobalt You can try using some of the "free" models under https://openrouter.ai instead,...

Thanks!

past meteor Apr 8, 2025, 7:55 PM

#

dense pivot Is it possible to train custom ai model on virtual machines for a short period o...

Yes and no

#

It's definitely possible but it may not be practical

#

Many ML algorithms are online by default. They're trained iteratively, think everything that uses gradient descent

#

That means you can always persist the weights to disk and resume training whenever. Whether this process is nice or very annoying depends on the library you're using
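
A small sketch of persisting weights and resuming, using only the standard library. The model (a single weight fitted by gradient descent), the JSON checkpoint layout, and the helper names are assumptions for illustration. Real libraries ship their own checkpoint formats and usually need optimizer state saved as well.

```python
import json
import os
import tempfile

def sgd_steps(w, data, lr, n_steps):
    """Run n_steps of gradient descent for the 1-D model y = w * x with squared error."""
    for _ in range(n_steps):
        grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def save_checkpoint(path, w, step):
    """Write the current weight and step count to disk."""
    with open(path, "w") as f:
        json.dump({"w": w, "step": step}, f)

def load_checkpoint(path):
    """Read back the weight and step count written by save_checkpoint."""
    with open(path) as f:
        ckpt = json.load(f)
    return ckpt["w"], ckpt["step"]

# Hypothetical toy data generated from y = 2x.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

# Uninterrupted run: 40 steps in one go.
w_full = sgd_steps(0.0, data, lr=0.05, n_steps=40)

# Interrupted run: 15 steps, checkpoint to disk, then resume for the remaining 25.
ckpt_path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
w_partial = sgd_steps(0.0, data, lr=0.05, n_steps=15)
save_checkpoint(ckpt_path, w_partial, step=15)

w_loaded, step_loaded = load_checkpoint(ckpt_path)
w_resumed = sgd_steps(w_loaded, data, lr=0.05, n_steps=40 - step_loaded)
print(w_full, w_resumed)
```

Because each update depends only on the current weight, the resumed run ends up at the same place as the uninterrupted one, which is what makes short rented sessions workable.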

pale thunder Apr 8, 2025, 9:00 PM

#

people who have been in a team in kaggle and tried to get actual work done,how?

#

We're suffering.

#

the versioning locks you out while things are running, there is no meaningful merge screen that I can find, ...

glacial root Apr 8, 2025, 10:34 PM

#

is this kind of progression normal for a neural network or did i do something wrong

safe agate Apr 8, 2025, 11:11 PM

#

pale thunder people who have been in a team in kaggle and tried to get actual work done,how?

Yeah I think most people don't use their notebook until submission and bring their own tools instead.

neat pasture Apr 8, 2025, 11:59 PM

#

Can someone here familiar with BCI and AI works help me?

I’m a masters student and I’m currently looking into research ideas for my thesis topic.

Initially I wanted to use brainwaves as input data to be used for controlling the mouse cursor in a PC, using AI as the interpreter of the brain wave data and further develop it to try to form texts from imagined words

But once I saw the prices of sensors that are available especially for the high requirements I have, I realised I have no chance of doing anything.
OpenBCI Daisy sensors or any good sensors with 14-16 channels cost ~$2000 if not more, and as a master’s student whose entire monthly allowance is about that (covering rent, food, and utilities), I can’t afford to work with sensors.

So I’ve elected to instead work with existing datasets.

Is it really not possible to buy a bunch of cheap EEG sensors and then feed them into an Arduino to accomplish what OpenBCI Cyton+Daisy can do? Essentially a DIY version that would be cheaper but of course lots of wire and more stuff to program.

Or is dataset my only option?

Currently I have a good PC with a Ryzen 7600X3D cpu and 32GB RAM and a RTX 4070 Ti Super with 16GB VRAM to be used for AI training and coding

viscid urchin Apr 9, 2025, 12:15 AM

#

neat pasture Can someone here familiar with BCI and AI works help me? I’m a masters student ...

Yeah I guess they are pretty damn expensive even at the entry level https://imotions.com/blog/learning/product-guides/eeg-headset-prices/

#

and I guess 5 channels isn't enough for what you want to do? https://www.emotiv.com/products/insight

glacial root Apr 9, 2025, 12:31 AM

#

is it practical to implement a cnn with just numpy?

#

also is an ocr scanner a good resume project

neat pasture Apr 9, 2025, 12:37 AM

#

viscid urchin and I guess 5 channels isn't enough for what you want to do? https://www.emotiv....

5 channel I don’t know if I can do much. Especially since I’d prefer to do data to text conversion but I’ll look into them

pale thunder Apr 9, 2025, 6:31 AM

#

safe agate Yeah I think most people don't use their notebook until submission and bring the...

makes sense

wooden sail Apr 9, 2025, 7:40 AM

#

glacial root is it practical to implement a cnn with just numpy?

not really unless you're making something very small

#

any changes to the architecture (especially number of layers) require appropriate management of the derivatives, and the optimizers would also have to be implemented by you

#

there is certainly a time and place for that, but if you like numpy and wanna do a nitty gritty low level design of a network, i'd suggest jax

#

comes with autodiff and also some optimizers (e.g. in the optax module) if you wanna make custom architectures
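
To illustrate what managing the derivatives by hand involves without autodiff, here is a tiny plain-Python sketch. It uses a single tanh neuron with squared-error loss, chosen purely for illustration; a real CNN needs the same bookkeeping for every layer. The hand-derived gradients are checked against finite differences.

```python
import math

def forward(w, b, x, target):
    """One tanh neuron with squared-error loss."""
    y = math.tanh(w * x + b)
    return (y - target) ** 2

def manual_grads(w, b, x, target):
    """Hand-derived gradients of the loss with respect to w and b (chain rule)."""
    y = math.tanh(w * x + b)
    dloss_dy = 2.0 * (y - target)
    dy_dz = 1.0 - y ** 2  # derivative of tanh at z = w * x + b
    return dloss_dy * dy_dz * x, dloss_dy * dy_dz

def numerical_grads(w, b, x, target, eps=1e-6):
    """Central finite differences, used only to check the hand-derived gradients."""
    dw = (forward(w + eps, b, x, target) - forward(w - eps, b, x, target)) / (2 * eps)
    db = (forward(w, b + eps, x, target) - forward(w, b - eps, x, target)) / (2 * eps)
    return dw, db

# Arbitrary example values.
w, b, x, target = 0.5, -0.2, 1.5, 0.3
gw, gb = manual_grads(w, b, x, target)
nw, nb = numerical_grads(w, b, x, target)
print(abs(gw - nw), abs(gb - nb))
```

Every change to the architecture means re-deriving and re-checking functions like `manual_grads`, which is the work an autodiff library takes over.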

quick igloo Apr 9, 2025, 12:16 PM

#

Should i learn python or java

glacial root Apr 9, 2025, 1:32 PM

#

https://tenor.com/view/bako-zintyn-megadeth-java-minecraft-gif-25794465


grand minnow Apr 9, 2025, 2:11 PM

#

quick igloo Should i learn python or java

either or

quick igloo Apr 9, 2025, 2:33 PM

#

Is it hard to learn java

serene scaffold Apr 9, 2025, 2:41 PM

#

Hello @quick igloo, this is the data science channel on Python Discord.
Java is not used in data science.

quick igloo Apr 9, 2025, 2:42 PM

#

serene scaffold Hello @quick igloo, this is the data science channel on Python Discord....

Really?

karmic pond Apr 9, 2025, 2:43 PM

#

quick igloo Should i learn python or java

First python, Because it's easier

#

But Java is not used for AI

muted vine Apr 9, 2025, 5:48 PM

#

genetic algorithms with a random entry, is considered reinforcement learning too?

flat token Apr 9, 2025, 5:56 PM

#

muted vine genetic algorithms with a random entry, is considered reinforcement learning too...

?

muted vine Apr 9, 2025, 6:36 PM

#

because GAs use an error/loss value to select the better setups for the next iteration

#

and it's not supervised
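
A sketch of the select-by-loss loop described above, in plain Python. It is a simplified, mutation-only evolutionary loop (no crossover), and the loss function, population size, and other parameters are hypothetical. It only illustrates the mechanism and does not settle whether this counts as reinforcement learning.

```python
import random

def loss(x):
    """Hypothetical error function to minimize; the optimum is at x = 3."""
    return (x - 3.0) ** 2

def evolve(pop_size=20, generations=50, keep=5, mutation_scale=0.5, seed=0):
    rng = random.Random(seed)
    # Random initial population (the "random entry").
    population = [rng.uniform(-10.0, 10.0) for _ in range(pop_size)]
    initial_best = min(loss(x) for x in population)
    for _ in range(generations):
        # Selection: rank by loss and keep the best candidates unchanged (elitism).
        survivors = sorted(population, key=loss)[:keep]
        # Mutation: refill the population with perturbed copies of the survivors.
        children = [rng.choice(survivors) + rng.gauss(0.0, mutation_scale)
                    for _ in range(pop_size - keep)]
        population = survivors + children
    best = min(population, key=loss)
    return best, initial_best

best, initial_best = evolve()
print(best, loss(best))
```

Because the best candidates are carried over unchanged, the best loss can never get worse from one generation to the next.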

knotty breach Apr 9, 2025, 6:51 PM

#

karmic pond But Java is not used for AI

Java is used for AI but too bad

glacial root Apr 9, 2025, 7:54 PM

#

knotty breach Java is used for AI but too bad

https://tenor.com/view/bako-zintyn-megadeth-java-minecraft-gif-25794465


knotty breach Apr 9, 2025, 7:58 PM

#

glacial root https://tenor.com/view/bako-zintyn-megadeth-java-minecraft-gif-25794465

bro im not fan of java but yeah its used for AI

tame monolith Apr 9, 2025, 8:07 PM

#

Is there a roadmap I can follow for learning data science?

#

Sorry if it's asked too many times

karmic pond Apr 9, 2025, 8:08 PM

#

knotty breach bro im not fan of java but yeah its used for AI

Would you learn java for AI?

knotty breach Apr 9, 2025, 8:08 PM

#

karmic pond Would you learn java for AI?

absolutely not

karmic pond Apr 9, 2025, 8:08 PM

#

tame monolith Sorry if it's asked too many times

No problem

karmic pond Apr 9, 2025, 8:09 PM

#

tame monolith Is there a roadmap I can follow for learning data science?

Do you speak spanish?

tame monolith Apr 9, 2025, 8:09 PM

#

No I know english and hindi

karmic pond Apr 9, 2025, 8:12 PM

#

tame monolith No I know english and hindi

Ask on reddit, maybe you'll find something about it

glacial root Apr 9, 2025, 8:39 PM

#

tame monolith Is there a roadmap I can follow for learning data science?

you can search one up on roadmap.sh

worldly wagon Apr 9, 2025, 10:32 PM

#

question about polars, so i've purely been using it over pandas for coming on a year now, and i notice 1-2 issues like tuples/lists not being the best supported compared to pandas etc

#

my question is how does polars performance compare to pandas with rapids?

#

also, to build on that question, what's the viability (if any) of polars with rapids
@me when u reply i dont look at my laptop much

iron basalt Apr 9, 2025, 11:40 PM

#

worldly wagon my question is how does polars performance compare to pandas with rapids?

Rapids is its own thing. Pandas, Polars, and Rapids use the same memory model (Apache Arrow; in Pandas only through its optional Arrow-backed dtypes), so they should all be able to transfer between each other.

#

https://rapids.ai/polars-gpu-engine/


worldly wagon Apr 9, 2025, 11:46 PM

#

iron basalt Rapids is its own thing. Pandas, Polars, and Rapids use the same memory model (A...

ahh thx for informing me will look more into that, i use the tool rather than question much lol
by this logic using rapids polars is faster than pandas?

iron basalt Apr 9, 2025, 11:53 PM

#

worldly wagon ahh thx for informing me will look more into that, i use the tool rather than qu...

If the operations being done are supported by Rapids (probably numeric stuff), and your GPU is faster than your CPU (probably), then yes.

#

In addition, this is better performance in terms of throughput, but worse latency (GPUs have higher latency).

#

I don't know whether it matters if you use Pandas or Polars, probably not.

#

Since both are just sending their tables to Rapids.

#

However, Polars has that integration to make it easier to use.

#

According to that page, it makes use of Polars' query optimizer, which AFAIK Pandas does not have.

agile cobalt Apr 10, 2025, 12:06 AM

#

worldly wagon question about polars, so i've purely been using it over pandas for coming on a ...

in pandas, lists and tuples are just stored as generic python objects
in polars, there are dedicated Array, List and Struct data types for nested data you should use instead of generic Object

placid drum Apr 10, 2025, 1:02 AM

#

hello!

#

can someone help me

serene scaffold Apr 10, 2025, 1:10 AM

#

placid drum can someone help me

Hello, remember to always always ask your actual question and never wait for someone to commit to answering before you actually ask it.

placid drum Apr 10, 2025, 1:11 AM

#

ok sorry

serene scaffold Apr 10, 2025, 1:11 AM

#

No problem, but be sure to always remember that every time forever

placid drum Apr 10, 2025, 1:11 AM

#

alright so this is to calculate atomic mass from molecular formula

#

but when i compile it

worldly wagon Apr 10, 2025, 1:12 AM

#

agile cobalt in pandas, lists and tuples are just stored as generic python objects in polars,...

struct is a bit cumbersome for my usecase so i chose to do object conversions,
basically storing the tuple (6, 7, 'ed')
polars list/array require a single (homogeneous) element type

placid drum Apr 10, 2025, 1:12 AM

#

its idle and doesnt prompt me for input

#

I cant debug it because I get this error Terminal exits with code 3221225786 (or similar)

worldly wagon Apr 10, 2025, 1:13 AM

#

placid drum its idle and doesnt prompt me for input

send a code snippet

placid drum Apr 10, 2025, 1:13 AM

#

loop_value = 0 # for infinite loop
while loop_value == 0: # indentation for loop
atomic_number_mass = { # dictionary

worldly wagon Apr 10, 2025, 1:13 AM

#

use backticks

serene scaffold Apr 10, 2025, 1:14 AM

#

placid drum loop_value = 0 # for infinite loop while loop_value == 0: # indentation for loop...

If you're asking a basic python question, open a thread. The instructions are in #❓｜how-to-get-help . Read every word of the instructions.

worldly wagon Apr 10, 2025, 1:14 AM

#

i.e

```py
auto_review = load_dataset(
    "McAuley-Lab/Amazon-Reviews-2023",
    "raw_review_Automotive",
    streaming=True,
    trust_remote_code=True
)
```

placid drum Apr 10, 2025, 1:15 AM

#

serene scaffold If you're asking a basic python question, open a thread. The instructions are in...

alright thank you

viscid urchin Apr 10, 2025, 1:19 AM

#

worldly wagon i.e ```py auto_review = load_dataset( "McAuley-Lab/Amazon-Reviews-2023", ...

That's the Huggingface API/lib right? I should probably try it sometime.

worldly wagon Apr 10, 2025, 1:35 AM

#

viscid urchin That's the Huggingface API/lib right? I should probably try it sometime.

yea lol feel free
https://huggingface.co/datasets/McAuley-Lab/Amazon-Reviews-2023


crisp salmon Apr 10, 2025, 11:58 AM

#

I have the following code:

```py
# frame = cv2.resize(frame, (0,0), fx=scaleFactor, fy=scaleFactor)
overlay = frame.copy()
for idx, detection in enumerate(extracted_boxes, 1):
    bbox, text, confidence = detection
    cv2.rectangle(overlay, (self.scoreBoxCoords['x']+int(bbox[0][0]), self.scoreBoxCoords['y']+int(bbox[0][1])),
            (self.scoreBoxCoords['x']+int(bbox[2][0]), self.scoreBoxCoords['y']+int(bbox[2][1])), (255, 0, 0), thickness=2)
    cv2.putText(overlay, str(text), (int(self.scoreBoxCoords['x']+int(bbox[0][0])), int(self.scoreBoxCoords['y']+int(bbox[0][1])-8)),
                cv2.FONT_HERSHEY_PLAIN, 1.5, (0, 0, 255), 1, cv2.LINE_AA)

# Blend the overlay with the original frame
alpha = 0.6  # Transparency factor
frame = cv2.addWeighted(overlay, alpha, frame, 1 - alpha, 0)

# Show the frame with text overlay
cv2.imshow("Video Text Overlay", frame)
```

It works fine with one caveat: the results are for a scaled image. However, when removing the comment in the first line only the image is shown and no overlay is visible. What is the reason for this? Thanks for your help!

terse bone Apr 10, 2025, 12:39 PM

#

anyone on kaggle

grand minnow Apr 10, 2025, 1:10 PM

#

terse bone anyone on kaggle

Why do you ask?

terse bone Apr 10, 2025, 1:15 PM

#

grand minnow Why do you ask?

I'm new to Kaggle and I'm unable to select the GPU (P100) option. Not sure what I'm missing—any guidance would be great!

grand minnow Apr 10, 2025, 1:16 PM

#

terse bone I'm new to Kaggle and I'm unable to select the GPU (P100) option. Not sure what ...

You've probably used 30 hours a week already

terse bone Apr 10, 2025, 1:17 PM

#

grand minnow You've probably used 30 hours a week already

no , i just login it today

grand minnow Apr 10, 2025, 1:17 PM

#

🤷

terse bone Apr 10, 2025, 1:25 PM

#

grand minnow You've probably used 30 hours a week already

see

grand minnow Apr 10, 2025, 1:26 PM

#

terse bone see

Try make another notebook? I have no idea tbh

terse bone Apr 10, 2025, 1:26 PM

#

grand minnow Try make another notebook? I have no idea tbh

done , tbh i had stuck on that for 2 hr , and exploring internet for solution

karmic pond Apr 10, 2025, 7:45 PM

#

terse bone done , tbh i had stuck on that for 2 hr , and exploring internet for solution

Did you find a solution?

terse bone Apr 10, 2025, 7:46 PM

#

karmic pond Did you find a solution?

Yea , I didn't verify my phone number

weary timber Apr 10, 2025, 8:35 PM

#

terse bone no , i just login it today

you gotta get verified to get the gpu access

hollow pagoda Apr 10, 2025, 9:55 PM

#

is anyone familiar with fbProphet time series forecasting library

serene scaffold Apr 10, 2025, 10:26 PM

#

hollow pagoda is anyone familiar with fbProphet time series forecasting library

Hello, remember to always ask your actual question. Don't ask to ask

glacial root Apr 10, 2025, 11:35 PM

#

if i made a project where i did all the under the hood stuff with cython instead of using libraries, would that look better to recruiters or would they not care

serene scaffold Apr 10, 2025, 11:51 PM

#

glacial root if i made a project where i did all the under the hood stuff with cython instead...

If the job involves using cython then sure.

Generally speaking, writing code with self-imposed constraints isn't impressive to employers. They don't care if their developers are "hard core"--they want people who will write maintainable code that works

glacial root Apr 11, 2025, 12:04 AM

#

serene scaffold If the job involves using cython then sure. Generally speaking, writing code wi...

how often is it that mle roles involve usage of cython

glacial root Apr 11, 2025, 12:07 AM

#

serene scaffold If the job involves using cython then sure. Generally speaking, writing code wi...

would it be good to do if i don't have a gpu though?

#

i'm trying to implement my own cnn and am using numpy but just setting up the architecture and getting the cost takes super long to run, and so i'm not sure how astronomically long the gradient descent part would take

#

and also i think it would speed up my first ann implementation, which i've never been able to test cause it never stops running

stone night Apr 11, 2025, 12:41 AM

#

AI i been working on for a year and a half training constantly

serene scaffold Apr 11, 2025, 12:41 AM

#

glacial root how often is it that mle roles involve usage of cython

I used cython exactly once to make an open source contribution to spaCy. Users of spaCy do not need to know or use cython.

Cython is not a viable alternative to GPU-accelerated code.

#

you can use a GPU for free on google colab.

hollow pagoda Apr 11, 2025, 1:17 AM

#

serene scaffold Hello, remember to always ask your actual question. Don't ask to ask

when i do they get ignored

serene scaffold Apr 11, 2025, 1:17 AM

#

hollow pagoda when i do they get ignored

if you don't ask your actual question, it's guaranteed that it won't get answered.

hollow pagoda Apr 11, 2025, 1:17 AM

#

as if nobody understands the data science questions i have

hollow pagoda Apr 11, 2025, 1:18 AM

#

serene scaffold if you don't ask your actual question, it's guaranteed that it won't get answere...

when i do ask them they still do whats the point

#

i assume maybe people are busy, seeing if someone may have knowledge and wanna answer later

serene scaffold Apr 11, 2025, 1:18 AM

#

hollow pagoda when i do ask them they still do whats the point

you have to be willing to front the effort of actually asking your question if you want to get free help.

hollow pagoda Apr 11, 2025, 1:18 AM

#

i did

serene scaffold Apr 11, 2025, 1:19 AM

#

is anyone familiar with fbProphet time series forecasting
unless you're doing a survey of who knows about what, this isn't your actual question.

hollow pagoda Apr 11, 2025, 1:19 AM

#

its a data science library

serene scaffold Apr 11, 2025, 1:20 AM

#

why do you care if anyone knows about it

hollow pagoda Apr 11, 2025, 1:20 AM

#

because asking the question will get it overlooked and pushed out of chat

#

someone that knows is more likely to answer starting with the subject

serene scaffold Apr 11, 2025, 1:21 AM

#

That is false.

#

You need to always ask your actual question. If you're not willing to do that, for whatever reason, I have to ask that you refrain from asking at all.

hollow pagoda Apr 11, 2025, 1:22 AM

#

ok ill ask it in python discussion

serene scaffold Apr 11, 2025, 1:23 AM

#

Okay, but the same rule applies. If you ask "does anyone know about X", I'm going to tell you again that you need to ask your actual question.

hollow pagoda Apr 11, 2025, 1:23 AM

#

not if its a discussion

serene scaffold Apr 11, 2025, 1:24 AM

#

hollow pagoda not if its a discussion

but it isn't. there's a question you want help with. If you're not willing to say what it is, you need to not waste the time of our volunteers.

hollow pagoda Apr 11, 2025, 1:24 AM

#

how am i wasting time im not forcing anyone to do anything

#

my real questions already get ignored how can their time be wasted

serene scaffold Apr 11, 2025, 1:24 AM

#

hollow pagoda how am i wasting time im not forcing anyone to do anything

because people are going to tell you to ask your actual question, and this whole conversation is going to happen over again.

serene scaffold Apr 11, 2025, 1:25 AM

#

hollow pagoda my real questions already get ignored how can their time be wasted

then copy/paste the real question that you already wrote.

hollow pagoda Apr 11, 2025, 1:25 AM

#

so u rather me spam then see if i should even continue asking the ignored question

serene scaffold Apr 11, 2025, 1:25 AM

#

I would rather that you repost your actual question than ask to ask, yes.

hollow pagoda Apr 11, 2025, 1:26 AM

#

im not asking to ask im beginning a discussion

#

didnt even ask for help yet

serene scaffold Apr 11, 2025, 1:27 AM

#

hollow pagoda im not asking to ask im beginning a discussion

this is lawyering. please stop.

hollow pagoda Apr 11, 2025, 1:27 AM

#

bro what 😭 you are mall copping

serene scaffold Apr 11, 2025, 1:29 AM

#

I mean, I'm one of the directors of this community.
But I'm genuinely trying to maximize the number of questions that get answered and minimize the amount of volunteer time that gets spent not answering questions (by, for example, coercing people into asking their question).

hollow pagoda Apr 11, 2025, 1:30 AM

#

tell volunteers to ignore asking-to-ask questions the same way normal questions get ignored

serene scaffold Apr 11, 2025, 1:30 AM

#

hollow pagoda tell volunteers to ignore asking-to-ask questions the same way normal questions ...

We don't ignore asking to ask because we genuinely care that questions get answered, and we want you to maximize your chance of getting one.

glacial root Apr 11, 2025, 1:45 AM

#

hollow pagoda when i do ask them they still do whats the point

well then what's the point of asking what you asked if no one's gonna answer that either

#

it happens to all of us sometimes, people are busy or maybe no one who knows the answer sees the question

#

and if we go by your logic which is that either one will get ignored, at least if you ask the full question there is a greater chance of it being answered

hollow pagoda Apr 11, 2025, 1:46 AM

#

glacial root well then what's the point of asking what you asked if no one's gonna answer tha...

helps not give the brain the micro relaxation of expecting help after sending something ur working on in the chat

glacial root Apr 11, 2025, 1:47 AM

#

?

#

that makes no sense

hollow pagoda Apr 11, 2025, 1:47 AM

#

it does

#

if u cant figure something out and u send it in chat asking for help its like 200% easier to start working on something else and waiting for a response

glacial root Apr 11, 2025, 1:48 AM

#

i agree, but what extra are you doing by asking the full question

hollow pagoda Apr 11, 2025, 1:49 AM

#

wasting ur own time

glacial root Apr 11, 2025, 1:49 AM

#

i do that too a lot

glacial root Apr 11, 2025, 1:49 AM

#

hollow pagoda wasting ur own time

?

#

you're spending an extra 5-10 seconds typing out the entire question

hollow pagoda Apr 11, 2025, 1:49 AM

#

if u gonna ask a question u should provide as much context as u can

#

otherwise the person will be complaining about not seeing the examples/data/code, etc

glacial root Apr 11, 2025, 1:50 AM

#

i mean do you want help or not

#

so if asking for help is a waste of time for you, why ask in the first place

hollow pagoda Apr 11, 2025, 1:50 AM

#

its the type of question

glacial root Apr 11, 2025, 1:51 AM

#

i somewhat get what you mean, but if you want help then you gotta ask properly

hollow pagoda Apr 11, 2025, 1:51 AM

#

im assuming nobody active in chat uses the prophet library so if i ask why its forecasting strange results itll waste my time showing the results and data

glacial root Apr 11, 2025, 1:51 AM

#

it happens

#

sometimes there aren't people who can answer

hollow pagoda Apr 11, 2025, 1:51 AM

#

i know

glacial root Apr 11, 2025, 1:52 AM

#

but if you are asking your colleagues for help, how would you ask

hollow pagoda Apr 11, 2025, 1:54 AM

#

"have you done time forecasting im getting strange results from this model, expecting x but getting y and not sure if its just the model or the parameters"

#

a simple y or n then u show them

#

yall acting like i started pinging people bothering them

glacial root Apr 11, 2025, 2:09 AM

#

hollow pagoda yall acting like i started pinging people bothering them

i'm not gonna lie i don't care much about it either, in another server i had this same issue with the people there

#

but in all the programming servers i'm in, it seems to be a common norm and so i just follow it cause it's not really causing any inconvenience for me

#

i suggest you do the same just cause it would probably be more helpful for you

jade sinew Apr 11, 2025, 2:11 AM

#

Anyone want to practice Pytorch with me?

quaint rivet Apr 11, 2025, 3:40 AM

#

I am trying to convert torch.FloatTensor into cuda's type. But it's not converting. What to do?


```py
path = os.path.join("train", "masks/**")
path_lst = glob.glob(path)

y = torch.tensor([], dtype=torch.float32)
out = torch.tensor([], dtype=torch.float32)

for path in path_lst:
    with rio.open(path) as dataset:
        img = dataset.read(1)
        flat = torch.from_numpy(img).float().flatten()
        y = torch.cat((y, flat), dim=0)
        out = y


np_out = out.numpy()
weights = compute_class_weight(class_weight="balanced", classes=np.unique(np_out), y=np_out)

class_weights = torch.from_numpy(weights)
c_weights = class_weights.to(device)
```

viscid urchin Apr 11, 2025, 5:28 AM

#

I think you need to move it back to the cpu to do the numpy stuff?
e.g. np_out = out.cpu().numpy()?

#

And maybe specify device=your_device when initializing the tensors?

stoic hollow Apr 11, 2025, 8:20 AM

#

Should a laplace mechanism be applied to every value in a dataset or just to the result of a query?

odd meteor Apr 11, 2025, 12:32 PM

#

hollow pagoda im assuming nobody active in chat uses the prophet library so if i ask why its f...

Don’t "presume" before actually asking. If you’re looking for commitment and a quick response, avoid asking a question just for the sake of it. Instead, provide clear and detailed context upfront. This allows those who know the answer to immediately understand your problem and offer help without needing to dig for more information before deciding whether they want to assist.

woven tinsel Apr 11, 2025, 2:33 PM

#

does anyone know the best course on Data analytics online ? kindly give some suggestions

agile cobalt Apr 11, 2025, 2:50 PM

#

woven tinsel does anyone know the best course on Data analytics online ? kindly give some sug...

maybe Google or DeepLearning.ai

Data Analytics Certificate & Training - Grow with Google

The Data Analytics Certificate, developed by Google, can help you learn how to use AI to process, analyze, and visualize data.

Data Analytics Professional Certificate

Build a job-ready data analytics skillset using industry-standard tools, including generative AI to extract insights, make decisions, and solve real-world business problems.

glacial root Apr 11, 2025, 2:55 PM

#

serene scaffold I used cython exactly once to make an open source contribution to spaCy. Users o...

how about the python c api, how often is it used

#

not asking as a replacement for gpu usage, i mean how much is it used for performance optimization on top of gpu usage

serene scaffold Apr 11, 2025, 2:57 PM

#

glacial root how about the python c api, how often is it used

many of the tools that DS/AI people use are written in C, but users of those libraries don't need to use C.

#

I've never created a python tool in C.

glacial root Apr 11, 2025, 2:59 PM

#

so the only reason i would need to worry about using c/cpp for anything in ml is if i'm implementing a model to be used for an embedded system

serene scaffold Apr 11, 2025, 3:05 PM

#

glacial root so the only reason i would need to worry about using c/cpp for anything in ml is...

you would only need to worry about that when it comes time to deploy the model. you'd still be implementing it in python code.

glacial root Apr 11, 2025, 3:13 PM

#

serene scaffold you would only need to worry about that when it comes time to deploy the model. ...

oh how would deployment work then

#

and in that case why do many people (i know nowhere near the majority but still a decent number) use cpp for computer vision

serene scaffold Apr 11, 2025, 3:14 PM

#

glacial root oh how would deployment work then

you can export models to something that can be used in cpp code. idk the specifics.

glacial root Apr 11, 2025, 3:15 PM

#

oh i see

#

that's cool

#

so efficiency of c/cpp while not having to go through all the extra complications of the language

calm thicket Apr 11, 2025, 3:58 PM

#

glacial root oh how would deployment work then

the important parts of a model, the weights, can be exported to a common format that different languages can read
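A toy illustration of that idea, using JSON as a stand-in for a real interchange format such as ONNX. The weights and the `predict` helper are hypothetical; the point is only that the consumer needs the numbers, not the training code.

```python
import json

# Hypothetical weights for a tiny linear model: y = w . x + b
weights = {"w": [0.5, -1.25, 2.0], "b": 0.1}

# "Export": serialize to a language-neutral text format
exported = json.dumps(weights)

# "Deploy": any runtime that can parse JSON can rebuild the model
def predict(serialized: str, x: list) -> float:
    params = json.loads(serialized)
    return sum(w * xi for w, xi in zip(params["w"], x)) + params["b"]

y = predict(exported, [1.0, 2.0, 3.0])
```

A C++ program could read the same string with its own JSON parser and do the same arithmetic.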

iron basalt Apr 11, 2025, 7:01 PM

#

glacial root and in that case why do many people (i know nowhere near the majority but still...

If you are doing computer vision on a small device you need to be as efficient as possible. Technically you are also using CPP when doing computer vision in Python with something like OpenCV, just indirectly, and not written by you.

#

However, there will often still be some Python that is just there to basically connect things. Like feed the camera data to your CPP library, read config files, maybe do some networking, etc. Python shows up almost all the time as at least some general scripting tool that ties it all together.

#

This is because Python is simple to use, easy to get packages for, and has a lot of packages for everything imaginable.

#

These packages are usually all each implemented in something like C, with a small Python layer on top (sometimes that Python layer is large).

#

However, there are some cases where you may need to cut out Python entirely, these are not super common (for most ML devs).

worldly wagon Apr 11, 2025, 8:29 PM

#

so i'm working with a fairly large dataset (200gb) so i moved it to parquet, then read it in using polars, however when merging/cleaning my kernel keeps dying I assume due to ram so i went from 32->48gb of ram but similar issues, haven't really been seeing success with the streaming engine

#

Is there any advice on dealing with datasets this large? I'm considering getting WSL to try using rapids with polars but idk the viability of that if it's a ram issue

#

code for context

```python
import polars as pl

lf_review: pl.LazyFrame = pl.scan_parquet("amazon_review_auto.parquet")
lf_meta: pl.LazyFrame = pl.scan_parquet("amazon_meta_auto.parquet")

lf_review: pl.LazyFrame = lf_review.filter(pl.col("rating").is_in([1, 2, 3, 4, 5]))
lf_review = lf_review.filter(pl.col("text").str.strip_chars().str.len_chars() > 0)

lf_review = lf_review.with_columns([
    pl.col("text").str.count_matches(r"\b\w+\b").alias("review_length"),
    (pl.col("timestamp").cast(pl.Datetime("ms")).dt.year()).alias("year")
])

lf: pl.LazyFrame = lf_review.join(lf_meta, on="parent_asin", how="left")
lf = lf.with_columns([extract_brand()])  # extract_brand() is defined elsewhere (not shown)
lf = lf.unique(subset=["user_id", "text", "asin"], keep="first")

df: pl.DataFrame = lf.collect(streaming=True)
```

viscid urchin Apr 11, 2025, 8:36 PM

#

`df: pl.DataFrame = lf.collect(streaming=True)` is the problem.. Yes it's streaming, but you're asking it to collect everything into a single dataframe

worldly wagon Apr 11, 2025, 8:37 PM

#

viscid urchin `df: pl.DataFrame = lf.collect(streaming=True)` is the problem.. Yes it's stream...

so how should i approach it instead? appreciate the advice btw

viscid urchin Apr 11, 2025, 8:38 PM

#

worldly wagon so how should i approach it instead? appreciate the advice btw

I'm a total Polars noob, but looking at their docs, this is maybe what you want? https://docs.pola.rs/api/python/stable/reference/api/polars.LazyFrame.sink_parquet.html

#

The example they give at the bottom is:

```python
lf = pl.scan_csv("/path/to/my_larger_than_ram_file.csv")
lf.sink_parquet("out.parquet")
```

worldly wagon Apr 11, 2025, 8:39 PM

#

viscid urchin I'm a total Polars noob, but looking at their docs, this is maybe what you want?...

yea i was considering that

#

the only thing is what happens when i need to merge multiple parquets (about 34) into 1

#

do i just sink it into a parquet again? (Asking btw not being condescending)

viscid urchin Apr 11, 2025, 8:40 PM

#

Yeah, you should just be able to feed it more

#

It looks like there's also `for batch in lf.collect_streaming_batches():` if you need to operate on the data before writing it to disk
Edit: I might be wrong about the function name, double-check me

worldly wagon Apr 11, 2025, 8:41 PM

#

viscid urchin It looks like there's also `for batch in lf.collect_streaming_batches():` if you...

ohh smart

viscid urchin Apr 11, 2025, 8:41 PM

#

It looks like you can also ask for compression and stuff, that's useful:

```python
lf.sink_parquet(
    "processed_results.parquet",
    compression="zstd",
    maintain_order=False,  # Might improve performance?
    # streaming=True dropped: sink_parquet already runs the query in streaming mode
)
```

worldly wagon Apr 11, 2025, 8:41 PM

#

i'll read into it

#

some quick very very dumb questions, I'm new to ML if i was to train after reading in a large parquet would that affect my ram or mainly cpu?

worldly wagon Apr 11, 2025, 8:42 PM

#

viscid urchin It looks like you can also ask for compression and stuff, that's useful: ```pyth...

also would using rapids improve this?

viscid urchin Apr 11, 2025, 8:42 PM

#

This looks maybe helpful https://www.rhosignal.com/posts/streaming-in-polars/

Rho Signal

Streaming large datasets in Polars

One major advantage of Polars over Pandas is that working with larger-than-memory datasets can be as easy as adding a single argument to a function call. However, streaming doesn’t work in all cases. In this post I introduce how streaming works and how to work around some challenges you may face.

viscid urchin Apr 11, 2025, 8:43 PM

#

worldly wagon also would using rapids improve this?

Rapids I guess would presumably let you split it up across machines more readily, but I think polars can do this on its own if I'm understanding the docs correctly

worldly wagon Apr 11, 2025, 8:44 PM

#

viscid urchin Rapids I guess would presumably let you split it up across machines more readily...

ahh yea just one machine

#

was just wondering if the gpu methods wud help with the ram/performance issues

viscid urchin Apr 11, 2025, 8:45 PM

#

I guess it depends on how intense the operations you want to perform on the data are. If they are 'lightweight' you will be memory limited on the GPU.

#

`pl.Config.set_streaming_chunk_size()` seems to be how you can manually adjust how big the chunks it works on are.

#

I guess it's measured in rows

worldly wagon Apr 11, 2025, 8:46 PM

#

viscid urchin `pl.Config.set_streaming_chunk_size()` seems to be how you can manually adjust h...

ahh will have to check this out later tn hopefully i dont get cooked

viscid urchin Apr 11, 2025, 8:50 PM

#

Oh, aha, my earlier thing should probably be phrased as `for batch in lf.collect(streaming=True, streaming_chunk_size=batch_size)` where you pick the batch size. The function I mentioned earlier is something I found on Google but seems to have been a user wondering about something

worldly wagon Apr 11, 2025, 8:52 PM

#

viscid urchin Oh, aha, my earlier thing should probably be phrased as `for batch in lf.collect...

appreciate it 🙏

hollow pagoda Apr 11, 2025, 9:19 PM

#

anyone know why plotly express boxplot appearing like scatter of data points instead of like the seaborn box

#

tryna use the dash app with it but its only pltexp compatible

agile cobalt Apr 11, 2025, 9:22 PM

#

hollow pagoda anyone know why plotly express boxplot appearing like scatter of data points ins...

check if `total_price` is properly treated as a float, not as string or something else like decimal
(as in, check the dataframe dtypes)

hollow pagoda Apr 11, 2025, 10:58 PM

#

They have decimal places like a float but I'll check if it's decimal type in a bit

worldly wagon Apr 11, 2025, 11:18 PM

#

anyone getting this recently in vscode and aware how to turn it off? (i dont use copilot)

Generate button in notebook cells

#

nvm found it idk if anyone is dumb as me so i'll leave this here

hollow pagoda Apr 12, 2025, 12:25 AM

#

agile cobalt check if `total_price` is properly treated as a float, not as string or somethin...

its float64

hollow pagoda Apr 12, 2025, 12:25 AM

#

worldly wagon anyone getting this recently in vscode and aware how to turn it off? (i dont use...

i nvr seen that before i personally just disabled copilot

hollow pagoda Apr 12, 2025, 12:45 AM

#

agile cobalt check if `total_price` is properly treated as a float, not as string or somethin...

a string wouldnt even be plottable

hollow pagoda Apr 12, 2025, 12:47 AM

#

hollow pagoda anyone know why plotly express boxplot appearing like scatter of data points ins...

its like it gave a box to every value or somethin

glacial root Apr 12, 2025, 2:54 AM

#

if i wanted to implement an rnn for predicting the next word in a sequence, would it be a good idea to implement a graph database with the words for better word embeddings?

serene scaffold Apr 12, 2025, 2:55 AM

#

glacial root if i wanted to implement an rnn for predicting the next word in a sequence, woul...

I don't think so.

hollow pagoda Apr 12, 2025, 2:56 AM

#

hollow pagoda anyone know why plotly express boxplot appearing like scatter of data points ins...

it was because of the colors lolz cant do it with pltexp unless theres a way to lessen the bins

#

oh specifically because i did colors on the y instead of x makes sense

glacial root Apr 12, 2025, 2:59 AM

#

serene scaffold I don't think so.

oh should i just randomly assign an index to each word

#

i thought having organization with respect to semantic relationships was helpful

serene scaffold Apr 12, 2025, 3:01 AM

#

glacial root i thought having organization with respect to semantic relationships was helpful

how would you decide which words are semantically related without making the whole graph by hand?

glacial root Apr 12, 2025, 3:02 AM

#

serene scaffold how would you decide which words are semantically related without making the who...

yeah i would do it by hand

#

though that would probably be an issue for larger datasets

serene scaffold Apr 12, 2025, 3:02 AM

#

glacial root yeah i would do it by hand

would take too long.

glacial root Apr 12, 2025, 3:02 AM

#

how do people typically do it

serene scaffold Apr 12, 2025, 3:02 AM

#

typically do what?

glacial root Apr 12, 2025, 3:03 AM

#

implement graph databases for this task

#

or is it typically just not used

serene scaffold Apr 12, 2025, 3:03 AM

#

I've never heard of anyone doing this.

glacial root Apr 12, 2025, 3:03 AM

#

oh

#

i read somewhere that it was useful

glacial root Apr 12, 2025, 3:04 AM

#

serene scaffold I've never heard of anyone doing this.

so would we just disregard semantic relationships and randomly assign an index to each word

serene scaffold Apr 12, 2025, 3:05 AM

#

glacial root so would we just disregard semantic relationships and randomly assign an index t...

the two parts of this sentence are unrelated

#

you want to create a model that generates text, right?

glacial root Apr 12, 2025, 3:11 AM

#

serene scaffold you want to create a model that generates text, right?

not generates, but predicts what the next word will be based on what's typed by the user, and i'll be doing it using a rnn

serene scaffold Apr 12, 2025, 3:14 AM

#

glacial root not generates, but predicts what the next word will be based on what's typed by ...

that's ultimately the same thing

wide sphinx Apr 12, 2025, 3:25 AM

#

Can I ask, why u r called stelercus papabilissimus?

glacial root Apr 12, 2025, 3:26 AM

#

serene scaffold that's ultimately the same thing

oh i thought they were different tasks

#

so it's essentially not necessary?

glacial root Apr 12, 2025, 3:27 AM

#

wide sphinx Can I ask, why u r called stelercus papabilissimus?

that is the greatest name one can find in a discordian

wide sphinx Apr 12, 2025, 3:27 AM

#

I see…

glacial root Apr 12, 2025, 3:28 AM

#

can i ask why you like monkeys and fried chicken

#

same reasoning here

wide sphinx Apr 12, 2025, 3:28 AM

#

Monkeys

#

R very very cute

#

And they remind me of humans

glacial root Apr 12, 2025, 3:28 AM

#

same reasoning here

#

oh

#

not quite same reasoning then

#

but similar reasoning

wide sphinx Apr 12, 2025, 3:28 AM

#

Similar

glacial root Apr 12, 2025, 3:32 AM

#

serene scaffold that's ultimately the same thing

also with the graph dbms tool, wouldn't it help by allowing us to create more data with less

#

because with the same set of words, we can find new sequences

serene scaffold Apr 12, 2025, 3:43 AM

#

glacial root also with the graph dbms tool, wouldn't it help by allowing us to create more da...

I don't think you'll get anything from the graph database step that you wouldn't have gotten implicitly from other language model training techniques.

glacial root Apr 12, 2025, 3:48 AM

#

serene scaffold I don't think you'll get anything from the graph database step that you wouldn't...

oh what technique should i use

worldly wagon Apr 12, 2025, 5:55 AM

#

hollow pagoda i nvr seen that before i personally just disabled copilot

never enabled it so ig it just kinda did its thing not a fan myself

snow plume Apr 12, 2025, 7:56 AM

#

im making my first CNN project and im just after some advice on what statistics i should have. So far i have:

accuracy vs validation accuracy before data augmentation
loss vs validation loss before data augmentation
accuracy vs validation accuracy after data augmentation
loss vs validation loss after data augmentation
time taken for epochs
multiclass confusion matrix
f1 score

is there anything else people would recommend me adding?

unkempt apex Apr 12, 2025, 8:28 AM

#

snow plume im making my first CNN project and im just after some advice on what statistics ...

you don't need to consider lot of things actually
the main goal should be your validation loss is decreasing over the period of time ( along with training loss )

and then you will test the model on different images ( but of same type ) to check if model has overfitted or not

snow plume Apr 12, 2025, 8:34 AM

#

perfect, thank you!

river cape Apr 12, 2025, 9:16 AM

#

guys could anyone clear out my confusion , what does LSTM(64) mean?
is it like an lstm layer with 64 units?

and could some one clarify between lstm layer,lstm cell ,lstm unit

viral rune Apr 12, 2025, 11:15 AM

#

Yes
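To make "64 units" concrete: in Keras-style APIs the layer is the whole recurrent block run over the sequence, the cell is the computation done at one timestep, and the number of units is the length of the hidden (and cell) state vector. A small sketch of how that number drives the parameter count, assuming a standard LSTM with biases and a made-up input feature size of 10:

```python
def lstm_param_count(units: int, input_dim: int) -> int:
    """Trainable parameters in one standard LSTM layer (with biases)."""
    # 4 gates (input, forget, cell candidate, output), each with:
    #   an input kernel (input_dim x units),
    #   a recurrent kernel (units x units),
    #   and a bias vector (units).
    per_gate = input_dim * units + units * units + units
    return 4 * per_gate

params = lstm_param_count(units=64, input_dim=10)
print(params)  # 19200
```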

charred stag Apr 12, 2025, 12:16 PM

#

```python
# fragment from inside a function; `file`, `object`, `index`, `to_edit`, `value` come from the caller
if object is not None:
    filt = {"_id": index, object: {"$exists": True}}
    update = {"$set": {f"{object}.{to_edit}": value}}
    # check if the field exists
    check_exist = file.find_one(filt)
    if check_exist is None:
        # create new field
        file.update_one({"_id": index}, {"$set": {object: {}}})
else:
    filt = {}
    update = {"$set": {to_edit: value}}
file.update_one(filt, update)
```

guess the library

snow plume Apr 12, 2025, 12:23 PM

#

im a little lost on what i have done wrong with my CNN. Im using EfficientNetB0 with the CIFAR-10 dataset.
i ran the cnn for 50 epochs without data augmentation and 50 epochs with data augmentation

after data augmentation my validation loss is slightly increasing and my validation accuracy is sitting around 0.85.

river cape Apr 12, 2025, 1:05 PM

#

snow plume im a little lost on what i have done wrong with my CNN. Im using EfficientNetB0 ...

tbh that accuracy is good

snow plume Apr 12, 2025, 1:07 PM

#

yeah the accuracy is good but my validation accuracy hasnt really changed which im confused about

river cape Apr 12, 2025, 1:08 PM

#

snow plume yeah the accuracy is good but my validation accuracy hasnt really changed which ...

well if you consider train accuracy is 92 and test accuracy is 85 , there is a slight overfitting issue here

snow plume Apr 12, 2025, 1:09 PM

#

how would that be solved then, saying this is after the data augmentation?

#

because from my understanding, data augmentation is one way to solve an overfitting problem?

river cape Apr 12, 2025, 1:10 PM

#

snow plume because from my understanding, data augmentation is one way to solve an overfitt...

did you use early stopping?

snow plume Apr 12, 2025, 1:11 PM

#

i have not, no. What would that do?

river cape Apr 12, 2025, 1:12 PM

#

basically you stop training the model once the validation loss hasn't improved for a set number of epochs (the patience)

#

check it out
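A framework-free sketch of the early stopping idea, assuming the rule "stop once validation loss has failed to improve for `patience` epochs in a row". The loss values are invented, and a real training loop would compute them per epoch rather than read them from a list.

```python
def train_with_early_stopping(val_losses, patience: int = 3):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses."""
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch  # give up; keep the weights from best_epoch
    return len(val_losses) - 1, best_epoch

# Made-up validation losses: improving, then drifting upward
losses = [0.90, 0.70, 0.60, 0.58, 0.59, 0.61, 0.62, 0.65, 0.70]
stop_epoch, best_epoch = train_with_early_stopping(losses, patience=3)
```

Here training stops at epoch 6 and the model from epoch 3 is the one to keep, which is why most implementations also restore the best weights.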

snow plume Apr 12, 2025, 1:13 PM

#

river cape basically you stop training the model once the validation loss hasn't improved for a set number of epochs (the patience)

makes sense. Anything else you'd recommend as well as the early stopping?

river cape Apr 12, 2025, 1:24 PM

#

snow plume makes sense. Anything else you'd recommend as well as the early stopping?

batch normalization , dropout , using different optimizers , i think regularization also not sure

snow plume Apr 12, 2025, 1:25 PM

#

Sweet, I'll check them out and try again. Thanks for your time

dry raft Apr 12, 2025, 1:29 PM

#

btw, do you know any github repos or kaggle notebooks that do this?

#

most things I see are always confusing

safe agate Apr 12, 2025, 2:51 PM

#

Hosting a data science workshop in a few hours

https://discord.gg/python?event=1350928346422186065

worldly wagon Apr 12, 2025, 3:46 PM

#

any ideas on how one would load a large file/parquet using polars or alternatives? The files range from 15gb-100gb parquets
context: https://huggingface.co/datasets/McAuley-Lab/Amazon-Reviews-2023
my compute 32gb ram 256ssd 2tb hdd doesnt seem enough

#

I'm debating if to try ssh into amazon ec2 and just ripping it once/twice

viscid urchin Apr 12, 2025, 4:27 PM

#

stream it in chunks with `collect()` or similar, operate on it as needed, and use `sink_parquet` to stream it out to disk, seems to be the way with Polars

worldly wagon Apr 12, 2025, 4:39 PM

#

viscid urchin stream it in chunks with `collect()` or similar, operate on it as needed, and us...

where do you see the option to stream it in using chunks with collect?
https://docs.pola.rs/api/python/stable/reference/lazyframe/api/polars.LazyFrame.collect.html

viscid urchin Apr 12, 2025, 4:41 PM

#

Oh damn I thought there was a chunk_size arg on collect, lemme look for where that is

#

aha `polars.Config.set_streaming_chunk_size` on that page

worldly wagon Apr 12, 2025, 4:43 PM

#

viscid urchin aha `polars.Config.set_streaming_chunk_size` on that page

ahh i saw this will try it albeit based on the docs its believed that this may lead to memory errors

viscid urchin Apr 12, 2025, 4:43 PM

#

Aha, here's one approach it seems https://github.com/pola-rs/polars/issues/18820

#

to get all the ids and then scan over them in chunks

worldly wagon Apr 12, 2025, 4:53 PM

#

viscid urchin Aha, here's one approach it seems https://github.com/pola-rs/polars/issues/18820

damm lil cooked

#

i wonder if pyspark could help curb this issue not sure of the config hopefully someone discussed it in chat since imma search

viscid urchin Apr 12, 2025, 4:56 PM

#

It really seems like collect should support a 'how much to collect' arg to me, but I guess I'm not a Polars expert, maybe there's a good reason not to.

worldly wagon Apr 12, 2025, 4:57 PM

#

viscid urchin It really seems like collect should support a 'how much to collect' arg to me, b...

yea really curious how people work with this dataset cant find anyone except the publisher

jaunty helm Apr 12, 2025, 5:08 PM

#

worldly wagon any ideas on how one would load a large file/parquet using polars or alternative...

there's also read_csv_batched https://docs.pola.rs/api/python/stable/reference/api/polars.read_csv_batched.html

worldly wagon Apr 12, 2025, 5:11 PM

#

jaunty helm there's also read_csv_batched https://docs.pola.rs/api/python/stable/reference...

thats really nice however wouldnt i be exponentially increasing my storage requirements on the csv side choosing csv over parquet?

#

not trying to be difficult btw if going parquet->csv means i can actually use the data it might be a common sense trade off

still no idea how this is going into plotly and sklearn lol

jaunty helm Apr 12, 2025, 5:23 PM

#

worldly wagon thats really nice however wouldnt i be exponentiall...

ah right, you have a parquet... well yeah, ig parquet -> csv if you want to use this
the problem with streaming is it's still being actively worked on, so you might see some rough edges
slice to get chunks of a lazyframe if you need ig

jaunty helm Apr 12, 2025, 5:26 PM

#

worldly wagon not trying to be difficult btw if going parquet->csv means i can actually use th...

> how this is going into plotly and sklearn
ngl, I don't think it's going to
you're gonna have to sample it before plotting if you don't want plotly to combust
sklearn... well no way all of that data's fitting in your memory, so look for estimators that can incrementally learn through .partial_fit so you can give it chunks of data and don't need to fit the entire thing in memory
you might also just consider neural nets, so pytorch
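The `.partial_fit` pattern, sketched with a hand-rolled toy model so it runs without scikit-learn (real estimators such as `SGDClassifier` expose the same method name). `RunningCentroidClassifier`, `chunks`, and the sample rows are all hypothetical.

```python
from collections import defaultdict

class RunningCentroidClassifier:
    """Toy 1-D classifier that learns per-class means one chunk at a time."""

    def __init__(self):
        self.sums = defaultdict(float)
        self.counts = defaultdict(int)

    def partial_fit(self, xs, ys):
        # Only running totals are kept, so memory use does not grow with the data.
        for x, y in zip(xs, ys):
            self.sums[y] += x
            self.counts[y] += 1
        return self

    def predict(self, x):
        centroids = {label: self.sums[label] / self.counts[label] for label in self.counts}
        return min(centroids, key=lambda label: abs(x - centroids[label]))

def chunks(rows, size):
    """Yield successive slices, standing in for batches read from disk."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]

# Hypothetical (review_length, label) pairs
rows = [(5, "short"), (8, "short"), (120, "long"), (95, "long"), (7, "short"), (150, "long")]
clf = RunningCentroidClassifier()
for chunk in chunks(rows, size=2):
    xs, ys = zip(*chunk)
    clf.partial_fit(xs, ys)
```

After the loop, `clf.predict(10)` gives `"short"` and `clf.predict(100)` gives `"long"`, even though the model never saw more than two rows at once.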

#

if you're willing to consider other libraries, other than spark, also check dask for dataframes and datashader for plotting (+ hvplot if you want a higher level API)

viscid urchin Apr 12, 2025, 5:36 PM

#

Dask is very good, IMO.

#

Used it at work for a major thing; we ended up mostly rewriting it in Databricks due to management pressure, but Dask worked great.

worldly wagon Apr 12, 2025, 5:54 PM

#

jaunty helm ah right, you have a parquet... well yeah, ig parquet -> csv *if* you want to us...

sorry for the late reply was researching hard lol

worldly wagon Apr 12, 2025, 5:58 PM

#

jaunty helm > how this is going into plotly and sklearn ngl, I don't think it's going to you...

i just need two columns for the sklearn classifiers so hopefully it fits lol

worldly wagon Apr 12, 2025, 5:59 PM

#

jaunty helm if you're willing to consider other libraries, other than spark, also check [`da...

initially a year ago when u guys introduced me to polars (Forever grateful), i had seen dask, would dask be able to load such large datasets though?

worldly wagon Apr 12, 2025, 6:00 PM

#

jaunty helm if you're willing to consider other libraries, other than spark, also check [`da...

will look into these (the plotting libs)

viscid urchin Apr 12, 2025, 6:00 PM

#

With dask the approach is usually to divide the work up into chunks that each get processed by a dask worker, and individually fit in RAM.
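The chunk-then-merge shape of that approach, sketched sequentially with the standard library only. Dask would schedule the per-chunk work across workers; here `rating_counts`, `iter_chunks`, and the generated rows are stand-ins.

```python
from collections import Counter

def rating_counts(chunk):
    """Work done on one chunk: small enough to sit in RAM on its own."""
    return Counter(row["rating"] for row in chunk)

def iter_chunks(rows, size):
    """Yield lists of at most `size` rows from any iterable."""
    chunk = []
    for row in rows:
        chunk.append(row)
        if len(chunk) == size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk

# Hypothetical rows; in practice each chunk would be one partition read from disk
rows = ({"rating": (i % 5) + 1} for i in range(1_000))

total = Counter()
for chunk in iter_chunks(rows, size=128):
    total += rating_counts(chunk)  # merge the partial results
```

This only works because counts from separate chunks can be added together; operations like a global sort or deduplication are harder to split this way.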

#

You could pair it with a streaming thing though if you wanted to do it differently

worldly wagon Apr 12, 2025, 6:02 PM

#

viscid urchin With dask the approach is usually to divide the work up into chunks that each ge...

ohh so with some research i could compute any large dataset i.e 200gb on dask regardless of my PCs capability?

viscid urchin Apr 12, 2025, 6:02 PM

#

Yeah, assuming the task can be 'chunked' in the first place, some things are really hard to break up.

worldly wagon Apr 12, 2025, 6:03 PM

#

viscid urchin Yeah, assuming the task can be 'chunked' in the first place, some things are rea...

ahh i see dask has some cloud compute on their website (cud be wrong) as my next step was making a pipeline, buying an amazon ec2 instance with 512gb ram and running it once
https://www.dask.org/

viscid urchin Apr 12, 2025, 6:07 PM

#

Yeah, "dask cloud provider" is the cloud back-end. It's optional but pretty handy.

swift gale Apr 12, 2025, 9:33 PM

#

Could you recommend a suitable free and open-source model for generating embeddings to populate a vector database?

serene scaffold Apr 12, 2025, 9:38 PM

#

swift gale Could you recommend a suitable free and open-source model for generating embeddi...

BERT

viscid urchin Apr 12, 2025, 10:18 PM

#

worldly wagon yea really curious how people work with this dataset cant find anyone except the ...

Hey wait, what if we're making this way harder than it needs to be.. do you HAVE to call collect() yourself up-front, or will sink_parquet() just do it for you? Hmm.

#

Aha https://www.rhosignal.com/posts/sink-parquet-files/

Rho Signal

Sinking larger-than-memory Parquet files

Polars now allows you to write Parquet files even when the file is too large to fit in memory. It does this by using streaming to process data in batches and then writing these batches to a Parquet file with a method called sink_parquet.

#

Unlike a normal lazy query we evaluate the query and write the output by calling sink_parquet instead of collect.

agile cobalt Apr 12, 2025, 10:23 PM

#

polars also improved their streaming engine a lot recently, try updating to the latest version if you aren't using it yet

limpid dew Apr 12, 2025, 11:31 PM

#

Hello, all,

serene scaffold Apr 12, 2025, 11:33 PM

#

Please be transparent about what this project is by posting a link to the open-source repository.

limpid dew Apr 12, 2025, 11:37 PM

#

serene scaffold Please be transparent about what this project is by posting a link to the open-s...

The project is currently private so I can't post a link to it at this time.

serene scaffold Apr 12, 2025, 11:37 PM

#

limpid dew The project is currently private so I can't post a link to it at this time.

I removed your message, as soliciting contributions to closed-source projects is not allowed.

limpid dew Apr 12, 2025, 11:55 PM

#

serene scaffold I removed your message, as soliciting contributions to closed-source projects is...

My mistake, didn't realize that was against policy. I just made the repo public. You all can find it at https://github.com/gkerr708/D2DraftNet.

glacial root Apr 13, 2025, 4:21 AM

#

what would be the best approach to encoding text data into vectors without using any libraries except numpy

viscid urchin Apr 13, 2025, 4:28 AM

#

I guess you'll need to choose a word encoding and write it from scratch; people typically use a second library to do that and then jam the encoded text into numpy, but you can do it yourself also

#

"Word2Vec" is the/a classic

#

SciKit has a ton of choices that make it popular for this https://scikit-learn.org/stable/auto_examples/text/plot_hashing_vs_dict_vectorizer.html#sphx-glr-auto-examples-text-plot-hashing-vs-dict-vectorizer-py

glacial root Apr 13, 2025, 4:32 AM

#

viscid urchin I guess you'll need to choose a word encoding and write it from scratch; people ...

yeah i usually do that with images, but that's only cause i have no clue how i would manually convert images to array format

#

with text though it's probably worth trying on my own if it's not too much data right

viscid urchin Apr 13, 2025, 4:32 AM

#

Yeah definitely

glacial root Apr 13, 2025, 4:33 AM

#

i'd probably just use python's default file i/o to parse each text file and get them all into an array, and then from there it would be pretty easy to create a vector for each word
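A bare-bones version of that plan, using plain lists that can be handed to `numpy.array(...)` afterwards. This is a simple index and count encoding, not Word2Vec (which learns dense vectors); `build_vocab`, `one_hot`, and `bag_of_words` are illustrative names, and splitting on whitespace is an assumption about the tokenization.

```python
def build_vocab(texts):
    """Map each distinct lowercase word to an integer index, in first-seen order."""
    vocab = {}
    for text in texts:
        for word in text.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab

def one_hot(word, vocab):
    """Vector of zeros with a single 1 at the word's index."""
    vec = [0] * len(vocab)
    vec[vocab[word]] = 1
    return vec

def bag_of_words(text, vocab):
    """Count vector for a whole text; unknown words are skipped."""
    vec = [0] * len(vocab)
    for word in text.lower().split():
        if word in vocab:
            vec[vocab[word]] += 1
    return vec

texts = ["the cat sat", "the dog sat down"]
vocab = build_vocab(texts)
vectors = [bag_of_words(t, vocab) for t in texts]
```

The indices are arbitrary here; any semantic structure has to come from training an embedding on top of them.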

#

also one thing, i don't know if this is some variation of imposter syndrome or something, but i get this feeling that i need to only use numpy unless there's something i really need to use a library for (like using PIL to convert images to arrays), otherwise i'm skipping learning the concepts

viscid urchin Apr 13, 2025, 4:36 AM

#

I mean, you're not wrong.. that's a great way to make sure you learn it.

glacial root Apr 13, 2025, 4:38 AM

#

do most ml classes in college work like this?

#

or does it vary

viscid urchin Apr 13, 2025, 4:40 AM

#

Not sure in modern times.. it probably varies a lot

#

I could also see the "get it working first, then go and explain all the parts" approach being used

glacial root Apr 13, 2025, 5:14 AM

#

viscid urchin I could also see the "get it working first, then go and explain all the parts" a...

as in use any libraries i want and then afterwards learn the theory behind each step?

#

i think part of the reason why i feel a need to stick to numpy is cause i take a very implementation based approach to learning

#

so typically i'll just watch a quick theory video, and then i'll start trying to implement it myself

#

i usually never look at code examples or even pseudo code

charred estuary Apr 13, 2025, 5:57 AM

#

Does anyone know if this is how you are supposed to set the temperature??

vocal cove Apr 13, 2025, 4:07 PM

#

charred estuary Does anyone know if this is how you are supposed to set the temperature??

Temperature iirc controls the reliability of the model. It also controls exploration. Sometimes it can generate sth new. Usually it just increases hallucinations. Play around with it and see.

charred estuary Apr 13, 2025, 4:44 PM

#

vocal cove Temperature iirc controls the reliability of the model. It also controls explora...

I know what it does but I don’t think I changed it correctly

vocal cove Apr 13, 2025, 4:44 PM

#

charred estuary I know what it does but I don’t think I changed it correctly

Lower means more reliable.

#

Higher means less reliable, more hallucinations.
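What the temperature knob does mechanically, sketched on made-up scores: the logits are divided by the temperature before the softmax, so low values sharpen the distribution toward the top token and high values flatten it. How a given API applies this internally is not something this sketch claims to show.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; temperature must be > 0."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
cold = softmax_with_temperature(logits, 0.1)
default = softmax_with_temperature(logits, 1.0)
hot = softmax_with_temperature(logits, 2.0)
```

At `0.1` nearly all the probability sits on the first token, which is why low temperatures give more repeatable answers.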

viscid urchin Apr 13, 2025, 4:44 PM

#

I believe `1.0` is the default for gemini-2.0 so `0.1` is pretty low I guess

charred estuary Apr 13, 2025, 4:44 PM

#

Yes I am aware but is what I did the correct way to change it in python

charred estuary Apr 13, 2025, 4:45 PM

#

viscid urchin I believe `1.0` is the default for gemini-2.0 so `0.1` is pretty low I guess

I know, I need it to analyze data so I want a reliable, accurate response

viscid urchin Apr 13, 2025, 4:45 PM

#

What makes you think the setting isn't working?

#

Aah, the README shows setting it slightly differently https://github.com/googleapis/python-genai?tab=readme-ov-file#system-instructions-and-other-configs

#

with `config=types.GenerateContentConfig(...)`

torpid mirage Apr 13, 2025, 7:43 PM

#

Got this.

📎 cleaned_avian_influenza.csv

arctic wedgeBOT Apr 13, 2025, 7:43 PM

#

torpid mirage Got this.

There was an error uploading your paste.

torpid mirage Apr 13, 2025, 7:43 PM

#

huh.
Was it meant to work?

#

Worth a bug report IG.
Anyway.

viscid urchin Apr 13, 2025, 7:44 PM

#

Cool. Are you already using any relevant libraries, or are you just starting out?

torpid mirage Apr 13, 2025, 7:44 PM

#

I'm just starting out on the cleaning process. Just got pandas

viscid urchin Apr 13, 2025, 7:44 PM

#

OK. One thing people seem to use a lot for this is scikit, because it has https://scikit-learn.org/stable/modules/impute.html

#

but it looks like Pandas has some stuff we could try to use directly https://pandas.pydata.org/docs/user_guide/missing_data.html

torpid mirage Apr 13, 2025, 7:45 PM

#

What about external contextual data which can not be averaged, such as geographical locations/coordinates?

viscid urchin Apr 13, 2025, 7:46 PM

#

https://pandas.pydata.org/docs/user_guide/missing_data.html#interpolation

#

Hmm, it looks like all the available 'scipy' interpolation algorithms are for data that is smoother than yours

torpid mirage Apr 13, 2025, 7:47 PM

#

Yep. It's super rough

viscid urchin Apr 13, 2025, 7:47 PM

#

So what's an example missing attribute in your data that we need to impute?

torpid mirage Apr 13, 2025, 7:48 PM

#

There's a lot.
The main critical are the latitude, longitude, bird species, and the municipality

#

these are missing a lot

#

diagnosis date is another too

#

I asked Copilot to calculate and summarize how much data is missing in each column, and it came back with:

Column    Missing %
focos_de_dnc    94.5%
focos_de_iaap    94.5%
doença    86.3%
número_da_investigação    86.3%
longitude    86.3%
data_do_laudo    86.3%
latitude    86.3%
espécie    86.3%
municipio    86.3%
ocorrência    19.2%

#

some of these will just have to be dropped

#

I'm fine

#

However, I'd like to be able to recover what we can

viscid urchin Apr 13, 2025, 7:50 PM

#

Hmm. I guess those each kinda need a different approach. For example if the municipality is set but not lat/lon, we could just use the lat/lon of the center of that municipality. For the bird species, we might need a classifier that can figure out the most-likely bird for a location?

torpid mirage Apr 13, 2025, 7:50 PM

#

yes

#

we got contextual data about common bird migratory patterns and also possibly domestic birds such as chicken

#

how can we do something with that?

#

I can also mix in environmental and biome data

#

such as wetlands, forests, etc

viscid urchin Apr 13, 2025, 7:51 PM

#

Hmm, isn't latitude/longitude totally missing here, or am I reading the columns wrong?

torpid mirage Apr 13, 2025, 7:51 PM

#

but to be honest, I have no clue how to do that.

torpid mirage Apr 13, 2025, 7:51 PM

#

viscid urchin Hmm, isn't latitude/longitude totally missing here, or am I reading the columns ...

yep 🥴

#

we need to infer those somehow

viscid urchin Apr 13, 2025, 7:52 PM

#

OK that's fine, we just need to calculate it from the municipality. I wonder what we can know about it when THAT isn't set though?

torpid mirage Apr 13, 2025, 7:52 PM

#

I have no clue 😩

#

From the bird type, perhaps?

#

They have some set migration patterns that should narrow down the possible location
Try to mean it

viscid urchin Apr 13, 2025, 7:53 PM

#

I guess let's just work on it one piece at a time, and maybe the rest will fill itself in

#

for geocoding, we can use from geopy.geocoders import Nominatim

#

and then like

geocoder = Nominatim(user_agent="avian_influenza_analysis")
geocoder.geocode(f"{municipio}, {uf}, Brazil")

#

there's a rate-limiter thing built into geopy you might need to wrap that Nominatim() instance with, I guess

#

like maybe

from geopy.extra.rate_limiter import RateLimiter
locator = Nominatim(user_agent="avian_influenza_analysis")
geocoder = RateLimiter(locator.geocode, min_delay_seconds=1)
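A rough sketch of how the lookup could be wired into the rows, assuming they are plain dicts with `municipio`, `uf`, `latitude` and `longitude` keys. The helper name `fill_coordinates` and the stub geocoder with placeholder coordinates are made up for illustration. It looks up each unique municipality only once, which matters when the geocoder is limited to one call per second:

```python
from types import SimpleNamespace


def fill_coordinates(rows, geocode):
    """Fill missing latitude/longitude in place; return how many rows were filled.

    `geocode` is any callable that takes a query string and returns an object
    with .latitude/.longitude, or None (the shape geopy's geocode() returns).
    """
    cache = {}  # one lookup per unique (municipio, uf) pair
    filled = 0
    for row in rows:
        if row.get("latitude") is not None and row.get("longitude") is not None:
            continue  # already has coordinates
        municipio, uf = row.get("municipio"), row.get("uf")
        if not municipio or not uf:
            continue  # nothing to geocode from
        key = (municipio, uf)
        if key not in cache:
            cache[key] = geocode(f"{municipio}, {uf}, Brazil")
        location = cache[key]
        if location is not None:
            row["latitude"] = location.latitude
            row["longitude"] = location.longitude
            filled += 1
    return filled


# Stub geocoder with placeholder coordinates, standing in for the real one.
calls = []


def fake_geocode(query):
    calls.append(query)
    return SimpleNamespace(latitude=1.0, longitude=2.0)


rows = [
    {"municipio": "Town A", "uf": "XX", "latitude": None, "longitude": None},
    {"municipio": "Town A", "uf": "XX", "latitude": None, "longitude": None},
    {"municipio": None, "uf": "XX", "latitude": None, "longitude": None},
]
filled = fill_coordinates(rows, fake_geocode)
print(filled, len(calls))  # 2 rows filled from a single lookup
```

With the real thing you would pass the rate-limited `geocoder` instead of `fake_geocode`.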

torpid mirage Apr 13, 2025, 7:56 PM

#

Do you think we could possibly enrich the data with contextual information before cleaning?

#

Would that make it simpler, perhaps?

#

Since we would have more things to infer from

viscid urchin Apr 13, 2025, 7:57 PM

#

Maybe, yeah. What else do you have to join with? You mentioned bird migratory patterns, I guess that could be cross-linked via the lat/lon you determine...

torpid mirage Apr 13, 2025, 7:58 PM

#

So far, I've thought about;

Weather data
Environmental/geographical data
Bird migratory patterns

#

Just these three

viscid urchin Apr 13, 2025, 7:58 PM

#

Best I can think of I guess for determining municipality when given only a state is to have a list of towns in the state, and go by whichever one has the most of these rows associated with it?
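That majority-vote idea could look something like this. It's only a sketch: rows as plain dicts, a `uf` state column, and the `municipio_imputed` flag are my assumptions, and the town names are placeholders.

```python
from collections import Counter, defaultdict


def impute_municipio_by_state(rows):
    """Fill missing municipio with the most common one seen in the same state."""
    counts = defaultdict(Counter)
    for row in rows:
        if row.get("municipio"):
            counts[row.get("uf")][row["municipio"]] += 1

    filled = 0
    for row in rows:
        if row.get("municipio"):
            continue
        seen = counts.get(row.get("uf"))
        if not seen:
            continue  # no known municipality in this state to copy from
        row["municipio"] = seen.most_common(1)[0][0]
        row["municipio_imputed"] = True  # keep track of guessed values
        filled += 1
    return filled


rows = [
    {"uf": "XX", "municipio": "Town A"},
    {"uf": "XX", "municipio": "Town A"},
    {"uf": "XX", "municipio": "Town B"},
    {"uf": "XX", "municipio": None},
    {"uf": "YY", "municipio": None},
]
filled = impute_municipio_by_state(rows)
print(filled, rows[3]["municipio"], rows[4]["municipio"])
```

Flagging the guessed rows makes it easy to exclude them later, since this approach pushes everything toward whichever town reports the most in each state.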

torpid mirage Apr 13, 2025, 7:58 PM

#

I don't know/haven't studied in depth what else more could I plug in

viscid urchin Apr 13, 2025, 7:59 PM

#

We could train a little classifier on the municipalities in the dataset, but the data is so small, hmm.

torpid mirage Apr 13, 2025, 7:59 PM

#

Would you like the raw data?

#

With no cleaning

viscid urchin Apr 13, 2025, 8:01 PM

#

Sure

#

I think a random forest might make sense to train the municipality-guesser, but we're getting into the limits of my experience now

#

https://scikit-learn.org/stable/modules/ensemble.html#forest

scikit-learn

1.11. Ensembles: Gradient boosting, random forests, bagging, voting...

Ensemble methods combine the predictions of several base estimators built with a given learning algorithm in order to improve generalizability / robustness over a single estimator. Two very famous ...

torpid mirage Apr 13, 2025, 8:05 PM

#

Just a second. I accidentally cooked my notebook...

viscid urchin Apr 13, 2025, 8:10 PM

#

I've got a little implementation I'm working on, let's see if I can get it to successfully make up anything plausible

torpid mirage Apr 13, 2025, 8:10 PM

#

Got the raw CSV
@viscid urchin

#

📎 combined_raw_avian_influenza.csv

arctic wedgeBOT Apr 13, 2025, 8:11 PM

#

torpid mirage

~~Please react with ✅ to upload your file(s) to our paste bin, which is more accessible for some users.~~

torpid mirage Apr 13, 2025, 8:11 PM

#

1309 rows

viscid urchin Apr 13, 2025, 8:11 PM

#

Imputed 1130 missing values in doença
Imputed 72 missing values in situação
Imputed 72 missing values in tipo_de_exploração
Imputed 1130 missing values in espécie
Imputed 72 missing values in espécie_principal

hmm, I guess that's something, let's see what it looks like

#

Successfully imputed coordinates for 179/1309 records

hmm that's way fewer than I expected, I guess I have something to fix.

#

I wonder what the right play is in situations like this, where so much of the key data is missing. Seems really challenging to get right.

#

Oh I guess there are two columns in this for municipality? 'municipio', 'município'

torpid mirage Apr 13, 2025, 8:17 PM

#

Yes

#

I guess it's just a duplicate from the combining.

#

I'm combining three spreadsheets into one here.

#

🥴

#

Yeah.
This one is going to take a while.

#

And worse, I have to turn this into a pipeline.

viscid urchin Apr 13, 2025, 8:34 PM

#

I sorta have that aspect of it working, but the imputations I've got are still far from ideal

hollow pagoda Apr 13, 2025, 10:48 PM

#

@weak oxide

late lichen Apr 14, 2025, 1:22 AM

#

Guys... I'm curious and got a perhaps stupid idea while coding a MLP library on what will happen if you initialize the network with 0 weights and biases or any real values

delicate pivot Apr 14, 2025, 1:24 AM

#

How can I get started with AI/ML? What would you suggest for beginners?

viscid urchin Apr 14, 2025, 1:25 AM

#

late lichen Guys... I'm curious and got a perhaps stupid idea while coding a MLP library on ...

Not a dumb question at all, actually a great way to run into a fundamental property that's worth understanding.

#

If all the neurons in a layer start with identical weights and biases, they will:
A) all calculate the same output
B) receive the same gradient during backpropagation
C) all make the same updates to their weights

So basically instead of a multi-neuron network, you just have one neuron per layer now

#

You specifically are trying to avoid symmetry
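Here's a small pure-Python sketch of that symmetry, under my own toy assumptions (sigmoid hidden layer, one linear output, squared error, plain gradient descent, one training sample). Every parameter starts at the same value, and after training the hidden neurons are still exact copies of each other:

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_step(w_hidden, b_hidden, w_out, b_out, x, target, lr=0.1):
    """One gradient-descent step on 0.5 * (y - target)**2 for a tiny MLP."""
    # Forward pass: sigmoid hidden layer, linear output neuron.
    h = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
         for ws, b in zip(w_hidden, b_hidden)]
    y = sum(w * hi for w, hi in zip(w_out, h)) + b_out

    # Backward pass: gradient at the output, then at each hidden pre-activation.
    d_y = y - target
    d_z = [d_y * w * hi * (1.0 - hi) for w, hi in zip(w_out, h)]

    # Parameter updates.
    new_w_out = [w - lr * d_y * hi for w, hi in zip(w_out, h)]
    new_b_out = b_out - lr * d_y
    new_w_hidden = [[w - lr * dz * xi for w, xi in zip(ws, x)]
                    for ws, dz in zip(w_hidden, d_z)]
    new_b_hidden = [b - lr * dz for b, dz in zip(b_hidden, d_z)]
    return new_w_hidden, new_b_hidden, new_w_out, new_b_out


# Three hidden neurons, every parameter starts at the same value.
w_hidden = [[0.5, 0.5] for _ in range(3)]
b_hidden = [0.5] * 3
w_out = [0.5] * 3
b_out = 0.5

for _ in range(10):
    w_hidden, b_hidden, w_out, b_out = train_step(
        w_hidden, b_hidden, w_out, b_out, x=[1.0, 2.0], target=0.0)

print(w_hidden)  # the weights moved, but all three rows are identical
```

Starting from small random values instead gives each neuron a different gradient, which is what lets them specialize.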

serene scaffold Apr 14, 2025, 1:38 AM

#

late lichen Guys... I'm curious and got a perhaps stupid idea while coding a MLP library on ...

a neural network is a bunch of numbers (the weights and biases) that you compute with in a specified order (the computation graph). if you don't have any weights or biases, you have nothing.

late lichen Apr 14, 2025, 4:11 AM

#

serene scaffold a neural network is a bunch of numbers (the weights and biases) that you compute...

I never said no biases and weights

#

I mean all parameters will be initialized to the same value

#

let's say zeros

#

https://github.com/hitoyaCute/making-new-AI/

GitHub

GitHub - hitoyaCute/making-new-AI

Contribute to hitoyaCute/making-new-AI development by creating an account on GitHub.

#

am I doing it right???

#

look on the first attempt

viscid urchin Apr 14, 2025, 4:16 AM

#

You have `weigth` spelled two different ways between `__call__` and `dif` etc

late lichen Apr 14, 2025, 4:16 AM

#

huh

late lichen Apr 14, 2025, 4:17 AM

#

viscid urchin You have `weigth` spelled two different ways between `__call__` and `dif` etc

where??

viscid urchin Apr 14, 2025, 4:18 AM

#

lines 13, 16, and 18 in 'nuralnet.py'

#

also here https://github.com/hitoyaCute/making-new-AI/blob/main/first_attempt/FUNC.py#L42
that should probably be m.tanh() not just tanh()

arctic wedgeBOT Apr 14, 2025, 4:18 AM

#

first_attempt/FUNC.py line 42

return 1 - (tanh(x) ^ 2)

late lichen Apr 14, 2025, 4:19 AM

#

I saw it thanks

late lichen Apr 14, 2025, 4:20 AM

#

viscid urchin also here https://github.com/hitoyaCute/making-new-AI/blob/main/first_attempt/FU...

I'll change that once I've finished making a simple neural network that can defeat me at tic-tac-toe

#

@viscid urchin in summary, am I doing it right????

#

I'm heading to something???

wintry relic Apr 14, 2025, 4:27 AM

#

late lichen I saw it thanks

^ isn't power btw
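For reference, in Python `^` is bitwise XOR and `**` is exponentiation, so `tanh(x) ^ 2` on a float raises a `TypeError`. A minimal version of what that line is presumably meant to compute (the function name here is mine, not the repo's):

```python
import math


def tanh_derivative(x: float) -> float:
    """Derivative of tanh: 1 - tanh(x)**2."""
    return 1.0 - math.tanh(x) ** 2


print(5 ^ 2)                 # 7, bitwise XOR of 0b101 and 0b010
print(5 ** 2)                # 25, exponentiation
print(tanh_derivative(0.0))  # 1.0
```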

viscid urchin Apr 14, 2025, 4:29 AM

#

late lichen I'm heading to something???

It's not nothing, but you've got a number of things left to conquer as well.

#

The current situation is kinda:
No actual weight initialization
No backpropagation implementation
No loss function
No training loop

#

Usually you put your weights in a matrix instead of having an explicit Edge concept but I'm not enough of an expert to say having what you have is wrong, just less-likely to be fast on modern hardware

#

Also I think your `forward` method is backwards, you raise an error when values are compatible?

#

To summarize before I crash..
a neuron is a weighted sum of inputs + bias, followed by activation
forward propagation is matrix multiplication between inputs and weights
backward propagation is computing gradients and updating weights
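In plain Python, with no libraries, the neuron and forward-pass parts of that summary look roughly like this (the function names are made up, and tanh is just one possible activation):

```python
import math


def neuron(inputs, weights, bias):
    """Weighted sum of inputs plus bias, followed by an activation (tanh here)."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return math.tanh(total)


def layer_forward(inputs, weight_matrix, biases):
    """One row of weights per neuron: a matrix-vector product, then activation."""
    return [neuron(inputs, row, b) for row, b in zip(weight_matrix, biases)]


outputs = layer_forward(
    inputs=[1.0, 2.0],
    weight_matrix=[[0.5, -0.25],   # neuron 0: 0.5*1 - 0.25*2 = 0
                   [1.0, 1.0]],    # neuron 1: 1*1 + 1*2 = 3
    biases=[0.0, 0.0],
)
print(outputs)  # first value is tanh(0) = 0.0, second is tanh(3.0)
```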

#

Any actual-expert feel free to correct anything I've said, glad to learn.

late lichen Apr 14, 2025, 4:38 AM

#

viscid urchin The current situation is kinda: No actual weight initialization No backpropagati...

well obviously it's still work in progress and I haven't done any testing but yeah

late lichen Apr 14, 2025, 4:40 AM

#

viscid urchin Also I think your `forward` method is backwards, you raise an error when values ...

it will take a value from a list EDGEs that will be processed by network.forward then it will make sure that network.forward and layer.forward is heading on same thing

#

and for debugging

viscid urchin Apr 14, 2025, 4:41 AM

#

Sure, but when your modulo test returns 0, that's when things have the same shape and are compatible, right?

#

Surely you want to throw an error when that's non-zero?

late lichen Apr 14, 2025, 4:42 AM

#

it will make sure if we divide the amount of the layer to the amount of value that each neuron will have same amount of input values

late lichen Apr 14, 2025, 4:42 AM

#

viscid urchin Sure, but when your modulo test returns 0, that's when things have the same shap...

YES

#

I wanna make sure the shape of the input is compatible with the layer

viscid urchin Apr 14, 2025, 4:49 AM

#

Right, so I'm saying you have it backwards, but feel free to test it

grand minnow Apr 14, 2025, 4:52 AM

#

https://github.com/hitoyaCute/making-new-AI/blob/dec43aa815510c3ea93721f90e2642edd054d44e/first_attempt/nuralnet.py#L18

arctic wedgeBOT Apr 14, 2025, 4:52 AM

#

first_attempt/nuralnet.py line 18

return self.weight

grand minnow Apr 14, 2025, 4:53 AM

#

oh nvm

late lichen Apr 14, 2025, 4:53 AM

#

grand minnow https://github.com/hitoyaCute/making-new-AI/blob/dec43aa815510c3ea93721f90e2642e...

??

late lichen Apr 14, 2025, 4:53 AM

#

viscid urchin Right, so I'm saying you have it backwards, but feel free to test it

I think I don't get your point...

grand minnow Apr 14, 2025, 4:54 AM

#

late lichen ??

I didn't scroll down enough but according to that repo (I assume that's yours) it shows 2 different spellings

https://github.com/hitoyaCute/making-new-AI/blob/dec43aa815510c3ea93721f90e2642edd054d44e/first_attempt/nuralnet.py#L13-L18

arctic wedgeBOT Apr 14, 2025, 4:54 AM

#

first_attempt/nuralnet.py lines 13 to 18

    self.weigth = weigth
def __call__(self, value:float) -> float:
    """takes a value, apply to the parent node, the multiply that output to the weigth"""
    return value * self.weigth
def dif(self) -> float:
    return self.weight

grand minnow Apr 14, 2025, 4:54 AM

#

Pick one: weight or weigth

late lichen Apr 14, 2025, 4:55 AM

#

alr fixed btw

grand minnow Apr 14, 2025, 4:55 AM

#

I know

viscid urchin Apr 14, 2025, 4:55 AM

#

late lichen I think I don't get your point...

You have `if len(values)%len(self.nodes) == 0:`, which will be True whenever the two things fit together evenly.. but inside the `if` block, you raise an error saying they don't fit together.
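So the condition just needs flipping. A sketch of the intended check, with a made-up function name since I'm only going from that one `if` line:

```python
def inputs_per_node(values, nodes):
    """Return how many input values each node gets, or raise if they don't divide evenly."""
    if len(values) % len(nodes) != 0:  # non-zero remainder means a mismatch
        raise ValueError(
            f"{len(values)} values cannot be split evenly across {len(nodes)} nodes"
        )
    return len(values) // len(nodes)


print(inputs_per_node([1, 2, 3, 4], ["n0", "n1"]))  # 2
```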

late lichen Apr 14, 2025, 4:57 AM

#

wait

#

oof my bad lol

#

thanks

late lichen Apr 14, 2025, 4:58 AM

#

viscid urchin To summarize before I crash.. a neuron is a weighted sum of inputs + bias, follo...

hmmm matrix mul of weights and inputs will work too

#

math lib has it???

#

I know numpy has it but numpy takes so damn long to get imported

viscid urchin Apr 14, 2025, 5:01 AM

#

You can do it, it's just nested loops. I don't think anything in the stdlib implements it for the general case? Maybe I'm wrong https://medium.com/@vtalladin06/matrix-multiplication-in-python-without-libraries-a83b68819477

#

https://www.programiz.com/python-programming/examples/multiply-matrix
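The nested-loop version is short enough to write out. A minimal sketch with matrices as lists of rows (the `matmul` name is just for illustration):

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("columns of a must match rows of b")
    cols = len(b[0])
    result = [[0.0] * cols for _ in range(len(a))]
    for i, row in enumerate(a):
        for j in range(cols):
            for k in range(inner):
                result[i][j] += row[k] * b[k][j]
    return result


print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19.0, 22.0], [43.0, 50.0]]
```

For small networks this is fine; it only gets painfully slow once the layers are large.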

late lichen Apr 14, 2025, 5:14 AM

#

uhhhhh just in case...I wanna learn how to make my code use gpu to process stuff... how to do it???

#

I don't have a tensor-compatible GPU so no TensorFlow

#

I don't have a CUDA-compatible GPU either

#

what I have is Intel dual core graphics

viscid urchin Apr 14, 2025, 5:19 AM

#

I mean, technically you can do it on that platform, but it's all very experimental and complex, not something I can really walk you through. On Intel the right path is to use a thing called SYCL

#

PyTorch has some Intel support now but I think it's only for their datacenter GPUs https://github.com/pytorch/pytorch?tab=readme-ov-file#intel-gpu-support

#

Oh maybe I'm wrong, I see Intel Core Ultra on the page that links to.. but nothing earlier.

#

https://pytorch.org/blog/intel-gpus-pytorch-2-4/

#

Honestly I wouldn't think about it at all until you're comfortable building your neural network on the CPU

late lichen Apr 14, 2025, 5:24 AM

#

that's fair

#

thanks for advice anyways

#

I also wanna learn attention block just in case I want to make transformers._. is there a resource you would suggest?

serene grail Apr 14, 2025, 5:32 AM

#

late lichen I also wanna learn attention block just in case I want to make transformers._. i...

3blue1brown on YouTube has a series on transformers, I don't remember it being super in-depth but it does explain attention blocks at some point and I think the explanation is pretty good (and visual)

viscid urchin Apr 14, 2025, 5:33 AM

#

late lichen I also wanna learn attention block just in case I want to make transformers._. i...

This site has some good stuff https://jalammar.github.io/visualizing-neural-machine-translation-mechanics-of-seq2seq-models-with-attention/

Visualizing A Neural Machine Translation Model (Mechanics of Seq2se...

Translations: Chinese (Simplified), French, Japanese, Korean, Persian, Russian, Turkish, Uzbek

Watch: MIT’s Deep Learning State of the Art lecture referencing this post

May 25th update: New graphics (RNN animation, word embedding graph), color coding, elaborated on the final attention example.

Note: The animations below are videos. Touch or...

#

Linked to from the very-good https://www.databricks.com/blog/llm-inference-performance-engineering-best-practices

#

https://www.datacamp.com/blog/attention-mechanism-in-llms-intuition

What is Attention and Why Do LLMs and Transformers Need It?

In this article, we focus on building an intuitive understanding of attention. The attention mechanism was introduced in the “Attention Is All You Need” paper. It is the key element in the transformers architecture that has revolutionized LLMs.

bronze wyvern Apr 14, 2025, 8:34 AM

#

Hello guys, I want to learn about ML and AI, how LLMs work and stuff like that, can someone recommend any good resources/books that cover AI in its entirety please (please ping me if anyone has something to recommend :c)

grand minnow Apr 14, 2025, 9:45 AM

#

bronze wyvern Hello guys, I want to learn about ML and AI, how LLMs work and stuff like that, ...

https://kaggle.com/learn is a good one.
Or a more in-depth ML with Tensorflow: https://www.tensorflow.org/learn
Or if you wanna go into Deep Learning, Keras is pretty good: https://keras.io/getting_started/

bronze wyvern Apr 14, 2025, 9:46 AM

#

but hmm I wanted to learn a bit of the theoretical parts first, like how things work behind the scenes

grand minnow Apr 14, 2025, 9:50 AM

#

bronze wyvern but hmm I wanted to learn a bit of the theoretical parts first, like how things ...

The intro sections in Kaggle and Tensorflow explain it quite well

bronze wyvern Apr 14, 2025, 9:51 AM

#

Ok, will have a look at them, ty!

grand minnow Apr 14, 2025, 9:51 AM

#

late lichen Apr 14, 2025, 10:46 AM

#

bronze wyvern Hello guys, I want to learn about ML and AI, how LLMs work and stuff like that, ...

search 3blue1brown on YouTube

#

and look for the machine learning list

#

6

#

guys what will happen to a MLP if you straight up initialized it's parameters to 0?

serene scaffold Apr 14, 2025, 1:59 PM

#

late lichen guys what will happen to a MLP if you straight up initialized it's parameters to...

before I tell you, what do you think?

late lichen Apr 14, 2025, 2:08 PM

#

serene scaffold before I tell you, what do you think?

not much

#

I have no idea

serene scaffold Apr 14, 2025, 2:08 PM

#

late lichen I have no idea

what happens when you multiply by zero?

late lichen Apr 14, 2025, 2:09 PM

#

serene scaffold what happens when you multiply by zero?

it will be zero

serene scaffold Apr 14, 2025, 2:09 PM

#

late lichen it will be zero

can you imagine what consequences that will have for the neural network?

late lichen Apr 14, 2025, 2:10 PM

#

serene scaffold can you imagine what consequences that will have for the neural network?

I barely have any experience with machine learning, so I doubt I have any idea

#

also I didn't just state "zero"

#

wait

late lichen Apr 14, 2025, 2:11 PM

#

late lichen it will make sure if we divide the amount of the layer to the amount of value th...

....

serene scaffold Apr 14, 2025, 2:14 PM

#

@late lichen I'm not following. Can you restate what your current question is, from the top?

late lichen Apr 14, 2025, 2:16 PM

#

serene scaffold @late lichen I'm not following. Can you restate what your current qu...

I want to know what will be the behavior of a neural network if you initialized its parameters with zero or any value

serene scaffold Apr 14, 2025, 2:17 PM

#

late lichen I want to know what will be the behavior of a neural network if you initialized ...

"with zero or any value"
do you mean "with zero or no value"?

late lichen Apr 14, 2025, 2:17 PM

#

not sure why you replace "any" with "no"

#

they're clearly different words, aren't they?

serene scaffold Apr 14, 2025, 2:18 PM

#

because "zero or any value" is just "any value", and the "or zero" is meaningless.

late lichen Apr 14, 2025, 2:18 PM

#

in fact it's rather the opposite

late lichen Apr 14, 2025, 2:19 PM

#

serene scaffold because "zero or any value" is just "any value", and the "or zero" is meaningles...

I was emphasizing the situation where it's all zeroes

serene scaffold Apr 14, 2025, 2:19 PM

#

late lichen I was emphasizing the situation where it's all zeroes

if all the weights in the neural network start as 0, they'll never be able to stop being zero.

late lichen Apr 14, 2025, 2:20 PM

#

why

serene scaffold Apr 14, 2025, 2:20 PM

#

the updated values are determined through multiplication, but x * 0 is always 0.

late lichen Apr 14, 2025, 2:21 PM

#

yeah?? then??

serene scaffold Apr 14, 2025, 2:22 PM

#

so if you have a neural network where all the weights are 0, they will stay as 0 no matter how much you try to train it.

#

and the network won't learn anything.

late lichen Apr 14, 2025, 2:23 PM

#

uhh are there some resources that clearly show that??

serene scaffold Apr 14, 2025, 2:23 PM

#

look into gradient descent

late lichen Apr 14, 2025, 2:25 PM

#

okay I found one...

#

but how it will behave if it's all 1??

#

or 2???? or 3??

serene scaffold Apr 14, 2025, 2:29 PM

#

late lichen but how it will behave if it's all 1??

if all the weights are the same, you get a symmetry problem

river cape Apr 14, 2025, 2:55 PM

#

late lichen but how it will behave if it's all 1??

I think it's better if you draw a simple neural network and work through the forward propagation by hand with its weights and biases; it will give you a clear understanding

daring crystal Apr 14, 2025, 4:52 PM

#

I have learned some basics of the ml, what fun projects i should work on to get started??

viscid urchin Apr 14, 2025, 5:08 PM

#

late lichen guys what will happen to a MLP if you straight up initialized it's parameters to...

I also answered this yesterday

river cape Apr 14, 2025, 7:06 PM

#

daring crystal I have learned some basics of the ml, what fun projects i should work on to get ...

classifier? sentiment analysis?

iron basalt Apr 14, 2025, 8:05 PM

#

bronze wyvern Hello guys, I want to learn about ML and AI, how LLMs work and stuff like that, ...

Artificial Intelligence: A Modern Approach (4th edition). For AI in general.

#

Also has stuff on ML, deep learning, language, etc.

#

A bit of everything.

bronze wyvern Apr 14, 2025, 9:42 PM

#

ok, ty !

viscid urchin Apr 14, 2025, 9:56 PM

#

bronze wyvern Hello guys, I want to learn about ML and AI, how LLMs work and stuff like that, ...

Here's a really good video (more stuff from the same person, also worth checking afterward)
https://www.youtube.com/watch?v=SmZmBKc7Lrs

YouTube

Artem Kirsanov

The Most Important Algorithm in Machine Learning

Shortform link:
https://shortform.com/artem

In this video we will talk about backpropagation – an algorithm powering the entire field of machine learning and try to derive it from first principles.

OUTLINE:
00:00 Introduction
01:28 Historical background
02:50 Curve Fitting problem
06:26 Random vs guided adjustments
09:43 Derivatives
14:34 ...


untold bloom Apr 14, 2025, 11:07 PM

#

serene scaffold so if you have a neural network where all the weights are 0, they will stay as 0...

which activation function you are using matters here: so long as it has a nonzero gradient at 0, learning is possible, e.g., with the standard logistic function
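A quick toy check of this, as my own sketch rather than anything from the repo: a tiny MLP with a linear output, squared error, plain gradient descent and one training sample, with every weight and bias starting at 0. In this setup the logistic network's weights do move after a couple of steps, while a tanh network only ever updates its output bias, so the activation's value at 0 seems to matter as well as its gradient:

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)


def tanh_grad(z):
    return 1.0 - math.tanh(z) ** 2


def train_from_zeros(act, act_grad, steps, x=(1.0, 2.0), target=1.0, lr=0.1, hidden=2):
    """Train a tiny MLP starting from all-zero parameters; return the final values."""
    w_h = [[0.0] * len(x) for _ in range(hidden)]
    b_h = [0.0] * hidden
    w_o = [0.0] * hidden
    b_o = 0.0
    for _ in range(steps):
        # Forward pass.
        z = [sum(w * xi for w, xi in zip(ws, x)) + b for ws, b in zip(w_h, b_h)]
        h = [act(zi) for zi in z]
        y = sum(w * hi for w, hi in zip(w_o, h)) + b_o
        # Backward pass (uses the output weights from before this step's update).
        d_y = y - target
        d_z = [d_y * w * act_grad(zi) for w, zi in zip(w_o, z)]
        # Gradient-descent updates.
        w_o = [w - lr * d_y * hi for w, hi in zip(w_o, h)]
        b_o -= lr * d_y
        w_h = [[w - lr * dz * xi for w, xi in zip(ws, x)] for ws, dz in zip(w_h, d_z)]
        b_h = [b - lr * dz for b, dz in zip(b_h, d_z)]
    return w_h, w_o, b_o


sig_w_h, sig_w_o, sig_b_o = train_from_zeros(sigmoid, sigmoid_grad, steps=2)
tanh_w_h, tanh_w_o, tanh_b_o = train_from_zeros(math.tanh, tanh_grad, steps=50)

print("sigmoid:", sig_w_h, sig_w_o)          # all nonzero after two steps
print("tanh:", tanh_w_h, tanh_w_o, tanh_b_o)  # weights still 0, only the output bias moved
```

Even in the logistic case the two hidden neurons stay identical to each other, so the symmetry problem is still there.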

viscid urchin Apr 14, 2025, 11:44 PM

#

Do you have to use "pickle" format? It's not really the fastest way to serialize/deserialize data

#

There are some things that claim to be faster at pickle stuff, but I'm not sure how much they manage to beat dill by

#

There are lots of alternatives, if you're in control of the format of the incoming data https://github.com/jcrist/msgspec etc

#

pyarrow is sweeping the world too https://pypi.org/project/pyarrow/

#

If you can just use JSON also that's way faster

#

OK, so you can't just json.dumps(your_root_object)?

#

Aha, yeah, you may have better luck with PyArrow then.

flint onyx Apr 15, 2025, 12:21 AM

#

can someone pls help me understand why the residual plot D is a problem? I've read the solution and I still can't seem to get it

viscid urchin Apr 15, 2025, 12:26 AM

#

Man those are close, to me, but I guess what it's saying is that you're looking to find a purely random pattern

#

whereas the one on the right has kind of a curve shape to it, where the residuals are positive as you get closer to 0 or 1, and negative as you get closer to the middle values (0.4 to 0.6)
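If it helps, here's a toy version of that pattern with made-up data (not the data from the exercise): fitting a straight line to something that is actually curved leaves residuals that are positive at both ends and negative in the middle, instead of random scatter.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept


xs = [i / 10 for i in range(11)]   # 0.0, 0.1, ..., 1.0
ys = [x ** 2 for x in xs]          # curved relationship, made up for illustration

slope, intercept = fit_line(xs, ys)
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]

for x, r in zip(xs, residuals):
    print(f"x={x:.1f}  residual={r:+.3f}")
```

A systematic shape like that means the model is missing something (here, the curvature), which is why it counts as a problem even when the residuals are small.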

#

but ugh I do have to stare at it to see that

#

I dunno, maybe I'm even reading it wrong, it's so subtle

tired otter Apr 15, 2025, 12:41 AM

#

Guys

#

In a recommendation algorithm, the system analyzes the items that the user has already viewed and tries to predict what they might like. This type of AI is closest to:

A) Supervised learning.
D) Unsupervised learning.

What's the best answer here? He didn't say that the user rated the movies

viscid urchin Apr 15, 2025, 12:54 AM

#

Collaborative filtering (which is what I get out of what you describe) is considered "unsupervised"

#

If you had explicit ratings it would/could be supervised

#

Arguably though there's a spectrum between unsupervised and supervised, and it's not a binary thing

#

Because what happens if you take view-counts into account.. that's suddenly "kinda" supervised...

tired otter Apr 15, 2025, 1:04 AM

#

Got it

viscid urchin Apr 15, 2025, 1:11 AM

#

That's just my take at least; Google search seems to back me up but I guess it's subjective.

viscid urchin Apr 15, 2025, 2:27 AM

#

I might be in over my head tonight, team:

#

It's making sense, but slower than I hoped. Oof.

sand pine Apr 15, 2025, 3:19 AM

#

Hello, very new to all of this after years away from any programming. More of a system setup question about taking advantage of the GPU: I'm running a laptop with a 4060, and when searching for how to do it I found one method that adds Visual Studio Code to the advanced graphics settings and selects high-performance GPU usage, and then there is NVIDIA CUDA … are these doing the same thing, or are they apples and oranges? It's mainly for a class project so it isn't a must, but I'm getting into this so I figured I should learn. Any advice for the rookie would be great

viscid urchin Apr 15, 2025, 3:21 AM

#

Those are apples and oranges, yeah.

#

CUDA is a 'programming toolkit' from nVidia for running code on GPUs, whereas the other thing is just telling Visual Studio Code to use hardware accelerated drawing techniques etc.

#

If you can say more about what kind of projects interest you, we can give better advice about what you should look at next.

jaunty helm Apr 15, 2025, 4:08 AM

#

sand pine Hello, very new to all of this after years away from any programming. More of a...

> advanced graphic settings -> high performance
this is telling windows that it should use GPU to draw that program instead of the cpu; if said program (in your example, VSCode) is graphics intensive, it'll boost the performance, making it look smooth, etc.
one quick example off the top of my head is RPG MV games; by default on my pc it draws using CPU, lagging it a lot; setting it to high performance makes it use the GPU which gets me way higher fps
> nvidia CUDA
this is probably what you're looking for in the context of programming, but honestly you may not even have to worry about it; for example, if you want to use the popular deep learning library pytorch, you can just install the correct version of pytorch and it'll automatically install CUDA for you during the process

limber spear Apr 15, 2025, 7:41 AM

#

Can we build a new stack for CUDA called CUDIE or CUDY 😏

#

Maybe COODY

serene scaffold Apr 15, 2025, 9:39 AM

#

limber spear Can we build a new stack for CUDA called CUDIE or CUDY 😏

Be the change you want to see in the world

grand minnow Apr 15, 2025, 11:08 AM

#

<@&831776746206265384> spam ad

mystic harbor Apr 15, 2025, 11:09 AM

#

!ban 1360871168776867991 giveaway spam

arctic wedgeBOT Apr 15, 2025, 11:09 AM

#

:incoming_envelope: :ok_hand: applied ban to @teal stump permanently.

woeful lodge Apr 15, 2025, 11:31 AM

#

Any good books i can get from amazon on data science?

sand pine Apr 15, 2025, 11:46 AM

#

viscid urchin CUDA is a 'programming toolkit' from nVidia for running code on GPUs, whereas th...

Thanks, later today I’ll throw some more details!

sand pine Apr 15, 2025, 11:46 AM

#

jaunty helm > advanced graphic settings -> high performance this is telling windows that it ...

Thanks, I’ll follow up with a couple questions on this later today.!

ivory root Apr 15, 2025, 1:18 PM

#

How do I make my models usable(integrated in a system). My friend asked me to design a project priority level prediction model. I finished it, but when I tried deploying it using FastAPI so he could access it through an API, I failed miserably. I'm developing in a Colab notebook

agile cobalt Apr 15, 2025, 1:33 PM

#

ivory root How do I make my models usable(integrated in a system). My friend asked me to de...

you cannot really use google colab to host it long term

generally you should export/download the model after training, then host the API on a computer/server/virtual machine you own or rent

ivory root Apr 15, 2025, 1:42 PM

#

Am developing it to be used in a friend's system how would he go about it coz basically I develop them and then github keeps them

ivory root Apr 15, 2025, 1:42 PM

#

agile cobalt you cannot really use google colab to host it long term generally you should ex...

I haven't done this

agile cobalt Apr 15, 2025, 1:51 PM

#

ivory root Am developing it to be used in a friend's system how would he go about it coz b...

what do you mean by "github keeps them"?

#

How and where is your friend planning to run/use that system?

ivory root Apr 15, 2025, 2:03 PM

#

agile cobalt How and where is your friend planning to run/use that system?

I normally deploy them on GitHub. As for my friend, his system is a Java Spring Boot project (a project priority level kind of system). I'm not sure where he plans to use it, but he just wants to use it as part of his system

agile cobalt Apr 15, 2025, 2:06 PM

#

what exactly do you mean by 'deploy them on github'? how are you "deploying" it?

you can upload your code to GitHub, but when doing that it only stores the files, it does not run anything
or do you mean GitHub Pages? It only supports static websites (i.e. you cannot run python in the server side, at most embed in the browser)

#

they will need to host their system in some machine for users to be able to access it
usually you'll want to host your API in the same system, or something connected to it (same cloud provider if you're hosting on the cloud, or in the same network if self-hosting)

ivory root Apr 15, 2025, 2:13 PM

#

In simple terms yeah I was uploading them. I was trying to host using fastapi and ngrok

agile cobalt Apr 15, 2025, 2:15 PM

#

ivory root In simple terms yeah I was uploading them. I was tryin to host using fastapi and...

your api you'll create using fastapi is a computer program

you need to have a computer to run your program in the first place

you could host it on your own computer, but if you do so it'll only be available while you are running it yourself

#

neither Google Colab nor GitHub offers compute for you to host (run) it, are you planning to run it yourself? in some cloud server? in a machine your friend owns?

ivory root Apr 15, 2025, 2:19 PM

#

not really obviously I wanna host it globally not locally

agile cobalt Apr 15, 2025, 2:21 PM

#

you will need to decide where to host it in the first place then

ivory root Apr 15, 2025, 2:26 PM

#

I was using ngrok. Idk if I'm allowed to share a link in here; I wanted to show you where I'm at at the moment, coz you know how we used to host a web page locally and get a responsive page, and you are the only one seeing the page unless you hosted it on something like Heroku.

#

I tried to host it globally using ngrok but I can only access the root endpoint, no other endpoint is accessible

#

root is a get request but I can't post

#

kindly reach back to me plz

austere prawn Apr 15, 2025, 5:14 PM

#

Was there any recording of this marimo presentation?

safe agate Apr 15, 2025, 5:15 PM

#

austere prawn Was there any recording of this marimo presentation?

There is, @spiral peak is currently editing the recording and will share it at a later date.

austere prawn Apr 15, 2025, 5:16 PM

#

ok 🙂

serene scaffold Apr 15, 2025, 5:19 PM

#

@safe agate you were on TalkPython recently, right?

safe agate Apr 15, 2025, 5:34 PM

#

serene scaffold @safe agate you were on TalkPython recently, right?

It was the creator of marimo, Akshay, not myself.
https://talkpython.fm/episodes/show/501/marimo-reactive-notebooks-for-python

sand pine Apr 15, 2025, 8:12 PM

#

viscid urchin If you can say more about what kind of projects interest you, we can give better...

@jaunty helm hitting both of your responses at once, thanks again for the feedback. In short, all I am working on now is class work for a graduate class on data analytics, so basic ML dealing with Logistic Regression, SVM, model comparison ... I'm sure I am not doing the summary justice. However, following this semester I want to begin to take some of what I have learned and slowly see what I can do in my current role within distribution center planning and design (order management, inventory management.....). In the next couple of weeks I will be wrapping up this class, and I noticed some of the datasets are taking longer to run, and being the impatient person I am, I began to look into these things. For example, in some exercises we use for loops over 3-4 kernel values and 3-4 C values; generally speaking "linear" always takes the longest. Like I said, not catastrophic, but I'm looking down the road more than anything.

viscid urchin Apr 15, 2025, 8:19 PM

#

sand pine @jaunty helm hitting both of your responses at once, thanks again for ...

Cool, hopefully you're making good progress with your class. Linear kernels often scale poorly, it's kinda a core problem in ML. If you end up trying scikit-learn, it has some built-in parallelization that might help (n_jobs etc)

For SVM specifically, there's a thing called LinearSVC that is supposedly zippy, but I'm not an expert: https://scikit-learn.org/stable/modules/generated/sklearn.svm.LinearSVC.html
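A rough sketch of that comparison, assuming scikit-learn is installed. The synthetic dataset from `make_classification` and the parameter grid are stand-ins, not the class data. Note that `SVC` itself has no `n_jobs` argument; the parallelism comes from the search object wrapping it.

```python
import time

from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC, LinearSVC

# Synthetic stand-in for the class dataset (an assumption, not the real data).
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)


def fit_seconds(model):
    """Fit the model on (X, y) and return the wall-clock time in seconds."""
    start = time.perf_counter()
    model.fit(X, y)
    return time.perf_counter() - start


# Same linear decision function family, two different solvers.
svc_time = fit_seconds(SVC(kernel="linear", C=1.0))
linear_svc_time = fit_seconds(LinearSVC(C=1.0, max_iter=10_000))
print(f"SVC(kernel='linear'): {svc_time:.3f}s  LinearSVC: {linear_svc_time:.3f}s")

# Replace nested for loops over kernel and C with a parallel grid search.
search = GridSearchCV(
    SVC(),
    param_grid={"kernel": ["linear", "rbf"], "C": [0.1, 1, 10]},
    cv=3,
    n_jobs=2,  # runs the cross-validation fits in parallel worker processes
)
search.fit(X, y)
print(search.best_params_)
```

Timings depend heavily on the data, so it is worth running this on the actual coursework datasets before drawing conclusions.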

Do you have a sample dataset for sales/orders/etc you want to play with? You could make a model to predict the right inventory levels for a given product based on historical data or something?

#

Actually it looks like there's one on Kaggle you could use https://www.kaggle.com/datasets/amirmotefaker/supply-chain-dataset

Supply Chain DataSet

The dataset solve case study on Supply Chain Analysis

#

This is also an interesting list https://www.interviewquery.com/p/supply-chain-datasets

#

Ooh, this one has 308,000 rows https://data.montgomerycountymd.gov/Community-Recreation/Warehouse-and-Retail-Sales/v76h-r7br/about_data

glacial root Apr 15, 2025, 9:58 PM

#

what would be a good way to get into nlp, and what models should i be focusing on to start with

viscid urchin Apr 15, 2025, 10:02 PM

#

glacial root what would be a good way to get into nlp, and what models should i be focusing o...

How comfortable are you with Python?

#

Here's one option that uses a particular PyTorch-based library: https://guide.allennlp.org/your-first-model

#

and there's this, but yikes it covers a lot of topics very briefly, you'll want to follow some links and do some reading https://www.projectpro.io/article/how-to-build-an-nlp-model-step-by-step-using-python/915

#

Also https://course.fast.ai/

Practical Deep Learning for Coders

Practical Deep Learning for Coders - Practical Deep Learning

A free course designed for people with some coding experience, who want to learn how to apply deep learning and machine learning to practical problems.

odd meteor Apr 15, 2025, 10:28 PM

#

glacial root what would be a good way to get into nlp, and what models should i be focusing o...

Prof. Jurafsky's book is really good and beginner friendly. https://web.stanford.edu/~jurafsky/slp3/
🤗 Intro to NLP course

https://huggingface.co/learn/llm-course/en/chapter1/1

Speech and Language Processing

Introduction - Hugging Face LLM Course

glacial root Apr 15, 2025, 10:32 PM

#

viscid urchin How comfortable are you with Python?

i'm not too good at programming and i think i should definitely get better, but i do know the basics and have tried implementing a neural network with numpy

glacial root Apr 15, 2025, 10:34 PM

#

odd meteor 1. Prof. Jurafsky's book is really good and beginner friendly. https://web.stan...

would a good approach be to read through that textbook and implement models/algorithms on my own along the way after reading that model's/algorithm's respective section

torpid mirage Apr 15, 2025, 11:55 PM

#

Soooooooooooooooooooo

#

I cooked a schema.

#

Also why do we have a slowmode here? Anyway,

#

!pastebin

arctic wedgeBOT Apr 15, 2025, 11:56 PM

#

Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.

torpid mirage Apr 15, 2025, 11:57 PM

#

https://paste.pythondiscord.com/JWAQ

#

It took a lot of Github copy, documentation diving, and just asking around.
Came up with this.

#

You remember that CSV file that we were cleaning? So, yeah. I managed to get it... somewhat working and clean

#

(Clean as in; I manually took each column and filled it out just to test the schema)

#

Thoughts and review? This is still a lot of magic as I've never built a schema for this purpose.

viscid urchin Apr 16, 2025, 12:07 AM

#

Cool, let me look

#

Honestly this looks better than I expected it to. Only a few things come to mind.

#

One is that you might want to add a confidence_score column to track how confident the prediction was that filled it in

#

If you have multiple people working on it, it might make sense to add modified_by but that's really about auditing not quality

#

the postgis stuff looks correct to me also, nice

#

ST_SetSRID(ST_MakePoint(NEW.longitude, NEW.latitude), 4326); is nardog but I believe that is a fully correct invocation for WGS84

odd meteor Apr 16, 2025, 12:12 AM

#

glacial root would a good approach be to read through that textbook and implement models/algo...

It totally depends on how you'd wanna learn tbh.

Learning from first principles, that is, implementing stuff from scratch, is what makes you cracked. However, it requires a clear plan, serious dedication, and consistency.

In summary, you know yourself better than I do. Just do what works best for you.

lapis sequoia Apr 16, 2025, 12:41 AM

#

when making a GAN, does the image size have to be pretty large? Should the image be normalized to be larger?

#

no, that would make it too big

viscid urchin Apr 16, 2025, 12:44 AM

#

Actually the early/influential papers used sizes like 32x32, 64x64, so no

#

Some approaches I guess start out super small, like 4x4 and 8x8, and then progressively scale up

#

e.g. https://github.com/NVlabs/stylegan does fancy stuff

lapis sequoia Apr 16, 2025, 12:46 AM

#

So, if it is RGB, then the image size is 32 * 32 * 3?

viscid urchin Apr 16, 2025, 12:47 AM

#

Yeah, I guess width x height x color-channels

lapis sequoia Apr 16, 2025, 12:48 AM

#

what would be a good latent_size if it is 32?

viscid urchin Apr 16, 2025, 12:51 AM

#

Apparently the rule of thumb is that the latent space is 10x to 30x smaller than the output space, so I guess somewhere between 3072 / 30 and 3072 / 10?

#

Split the difference and call it 200 to start with maybe?

#

This paper just uses 100 https://arxiv.org/abs/1511.06434
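The arithmetic behind those numbers, taking the rule of thumb quoted above at face value (it is a heuristic from the chat, not a hard requirement):

```python
# Output space of a 32x32 RGB image: width x height x color channels.
height, width, channels = 32, 32, 3
output_dim = height * width * channels

# Rule of thumb from the discussion: latent space 10x to 30x smaller.
latent_low = output_dim / 30
latent_high = output_dim / 10

print(output_dim, round(latent_low), round(latent_high))  # 3072 102 307
```

So 100 (as in the linked paper) sits at the low end of that range and 200 sits in the middle.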

pearl barn Apr 16, 2025, 12:57 AM

#

I want to ask which order for learning data analysis with python from Jose portilla courses on udemy he has many courses on python data analysis and i feel there are the same libraries in each course I don't know if they complete each other but which order should I take them to master these libraries??

viscid urchin Apr 16, 2025, 12:59 AM

#

pearl barn I want to ask which order for learning data analysis with python from Jose porti...

Never used Udemy, but there's not some 'landing page' for his courses that has some order to it? That surprises me.

viscid urchin Apr 16, 2025, 1:02 AM

#

pearl barn I want to ask which order for learning data analysis with python from Jose porti...

I guess it looks like these two, in this order?
https://www.udemy.com/course/complete-python-bootcamp/
https://www.udemy.com/course/learning-python-for-data-analysis-and-visualization/

#

Bizarrely, Udemy doesn't even let you filter by instructor.

lapis sequoia Apr 16, 2025, 1:14 AM

#

pearl barn I want to ask which order for learning data analysis with python from Jose porti...

there has to be some order to it, I just do not know if the latent size and hidden size depend on whether it is RGB

viscid urchin Apr 16, 2025, 1:16 AM

#

The latent space depends on the size of the 'output space', so RGB matters in that you've got three color channels to care about.

lapis sequoia Apr 16, 2025, 1:27 AM

#

yes

viscid urchin Apr 16, 2025, 4:08 AM

#

Anybody tried to do anything with HiDream yet?

river cape Apr 16, 2025, 7:30 AM

#

hey guys, anyone aware of any open-source embedding models that work just as well as OpenAIEmbeddings?

light cobalt Apr 16, 2025, 9:18 AM

#

Hello, can anybody recommend a Discord channel on the topic of AI development (preferably in Python)?

#

I am new to AI, need to create one for my game. Just researching.

serene scaffold Apr 16, 2025, 9:20 AM

#

light cobalt Hello, can anybody advice me a discord channel with topic of AI development (pre...

This is the one.

light cobalt Apr 16, 2025, 9:26 AM

#

Well, then are there any articles about AI usage in gamedev? I just want to implement smart NPC enemies. I want to understand whether it is feasible in my project.

serene scaffold Apr 16, 2025, 9:27 AM

#

light cobalt Well, then is there any articles about AI usage in gamedev. I just want to imple...

you can have AI-driven NPCs, yes

shadow folio Apr 16, 2025, 9:35 AM

#

Can I get Resources to learn data science

serene scaffold Apr 16, 2025, 10:16 AM

#

shadow folio Can I get Resources to learn data science

!resources data science

arctic wedgeBOT Apr 16, 2025, 10:16 AM

#

Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

shadow folio Apr 16, 2025, 10:16 AM

#

serene scaffold !resources data science

Thank you

pearl barn Apr 16, 2025, 11:05 AM

#

serene scaffold !resources data science

No matching results

serene scaffold Apr 16, 2025, 1:29 PM

#

pearl barn No matching results

what settings did you do?

pearl barn Apr 16, 2025, 1:29 PM

#

Paid course sql

lapis sequoia Apr 16, 2025, 3:42 PM

#

python c++ AI ML?
AR VR XR PYHON C++

pallid badge Apr 16, 2025, 8:16 PM

#

safe agate It was the creator of marimo, Akshay, not myself. <https://talkpython.fm/episode...

I have to listen still to this episode and I am looking forward to test it. Out of curiosity, what alternatives besides marimo and jupyter are still available? Did i understand correctly, that marimo is pure Python, easier to track changes with git? I am also looking forward to this new option, to keep order of cells

safe agate Apr 16, 2025, 8:20 PM

#

pallid badge I have to listen still to this episode and I am looking forward to test it. Out ...

Yeah that's correct, marimo is pure Python and git friendly.

austere prawn Apr 16, 2025, 9:07 PM

#

safe agate Yeah that's correct, marimo is pure Python and git friendly.

however looking at the python code in editor it lights up quite a bit because of warnings from ruff linter 😛

flint onyx Apr 16, 2025, 9:18 PM

#

p1 - low leverage + small residual (good leverage point)
p2 - low leverage + large (?) residual (outlier (?))
p3 - high leverage + large residual (bad leverage point) (influential)
is this correct?
I asked chatgpt and it gave me something completely diff
it said p1 was an outlier but I cant see why
and it said p2 had high leverage

flint onyx Apr 16, 2025, 9:33 PM

#

Also am I understanding this right?

low res + high lev = influential
high res + low lev = outlier
both high = could be influential
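One way to check labels like these numerically is to compute leverage, residuals, and Cook's distance directly. This is a minimal NumPy sketch on made-up data (the plot with p1 to p3 is not part of this log, so it does not settle which label each point gets). Leverage is the diagonal of the hat matrix, and Cook's distance combines residual size and leverage into one influence score.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=30)
y = 2.0 * x + 1.0 + rng.normal(scale=1.0, size=30)

# Append one point far out in x and well off the trend (made-up values).
x = np.append(x, 30.0)
y = np.append(y, 20.0)

X = np.column_stack([np.ones_like(x), x])   # design matrix with intercept
H = X @ np.linalg.inv(X.T @ X) @ X.T        # hat matrix
leverage = np.diag(H)                       # h_ii: how unusual each x is
residuals = y - H @ y                       # observed minus fitted

n, p = X.shape
s2 = residuals @ residuals / (n - p)        # residual variance estimate
cooks_d = residuals**2 / (p * s2) * leverage / (1 - leverage) ** 2

for name, values in [("leverage", leverage), ("residual", residuals), ("cooks_d", cooks_d)]:
    print(name, "of the appended point:", round(float(values[-1]), 3))
```

A point only moves the fit a lot when Cook's distance is large, which needs both a non-trivial residual and non-trivial leverage.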

lapis sequoia Apr 16, 2025, 11:00 PM

#

Are GANS just nonsense hard and unpredictable and just fry your gpu? Like, they are fighting to get what they want. They are in chaos. Does anyone casually just make GANS? At least in RL, there is some sort of sanity. You know?

viscid urchin Apr 16, 2025, 11:09 PM

#

They are hard to train and resource-intensive, yeah.. but arguably with modern frameworks people are out there "casually" making them. It's just not easy to get to that level from scratch.

#

Are you doing your own by hand or are you using a library?

#

I'm feeling super dumb suddenly, what's a use-case for np.sum(array, axis=-1)? I'm trying to think of when I want to work backwards vs forwards.

serene scaffold Apr 16, 2025, 11:24 PM

#

viscid urchin I'm feeling super dumb suddenly, what's a use-case for `np.sum(array, axis=-1)`?...

Try making an array and taking the sum of it with different values for axis

#

See what happens

viscid urchin Apr 16, 2025, 11:28 PM

#

I mean, I understand what it's doing mechanically, I'm just trying to come up with an algorithm where I'm going to want that

#

I feel like I should be able to think of four really easily and it's not happening haha

#

Oh duh, image processing where you want to sum across the three color channels or something, I suppose.

#

I guess axis=0 is like "reduce rows", axis=1 is like "reduce columns", and axis=-1 is like "reduce innermost dimension"
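A small NumPy sketch of that intuition. The array shapes are arbitrary examples.

```python
import numpy as np

# A batch of 2 RGB "images", 4x5 pixels each: (batch, height, width, channels).
images = np.arange(2 * 4 * 5 * 3).reshape(2, 4, 5, 3)

# axis=-1 collapses the innermost (channel) axis, whatever the number of dims.
channel_sum = images.sum(axis=-1)
print(channel_sum.shape)  # (2, 4, 5)

# For an n-dimensional array, axis=-1 is the same as axis=n-1.
assert np.array_equal(channel_sum, images.sum(axis=images.ndim - 1))

# On a plain 2-D matrix:
m = np.array([[1, 2, 3], [4, 5, 6]])
print(m.sum(axis=0))   # [5 7 9]   rows collapsed, one sum per column
print(m.sum(axis=1))   # [ 6 15]   columns collapsed, one sum per row
print(m.sum(axis=-1))  # [ 6 15]   last axis, same as axis=1 here
```

The practical benefit of `axis=-1` is that the same line keeps working if a leading batch dimension is added or removed.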

desert oar Apr 17, 2025, 12:23 AM

#

Note that this isn't just part of np.sum -- it's shared behavior for most vectorized numpy operations ("ufuncs")

desert oar Apr 17, 2025, 12:24 AM

#

viscid urchin I guess axis=0 is like "reduce rows", axis=1 is like "reduce columns", and axis=...

This is precisely the idea

#

It's a very flexible system

viscid urchin Apr 17, 2025, 12:25 AM

#

I'm trying to re-program my brain to think of the 'matrix approach' to things first; it's slightly painful. Thanks for the confirmation.

hearty depot Apr 17, 2025, 12:31 AM

#

viscid urchin I guess axis=0 is like "reduce rows", axis=1 is like "reduce columns", and axis=...

ye -1, is p nice for that

lapis sequoia Apr 17, 2025, 12:35 AM

#

viscid urchin They are hard to train and resource-intensive, yeah.. but arguably with modern f...

generator, discriminator, leaky relu, I do not know what you mean by scratch. Honestly, it is resource intensive no matter what.

jaunty helm Apr 17, 2025, 12:53 AM

#

lapis sequoia Are GANS just nonsense hard and unpredictable and just fry your gpu? Like, they ...

off the top of my head, I think prior to diffusion models GAN was pretty competitive in imagegen
tho I'll admit I don't know the specifics

#

fry your gpu
I mean all neural networks do that once you get big enough

#

llama 8b? casually eats 20gbs of vram (if you do no quantization)
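Rough arithmetic behind a figure like that, counting only the weights. The 8e9 parameter count is an assumed round number; activations, KV cache, and framework overhead come on top, which is plausibly how an unquantized 8B model ends up near 20 GB.

```python
def weight_memory_gb(n_params, bytes_per_param):
    """Memory needed for the weights alone, in decimal gigabytes."""
    return n_params * bytes_per_param / 1e9


n_params = 8e9  # assumed round figure for an "8B" model

for name, nbytes in [("fp32", 4), ("fp16/bf16", 2), ("int8", 1), ("4-bit", 0.5)]:
    print(f"{name:>9}: {weight_memory_gb(n_params, nbytes):.0f} GB")
```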

glacial root Apr 17, 2025, 12:58 AM

#

viscid urchin I'm feeling super dumb suddenly, what's a use-case for `np.sum(array, axis=-1)`?...

i'm guessing it's the same as setting the axis to the final dimension

#

i could be wrong though

hearty depot Apr 17, 2025, 12:58 AM

#

jaunty helm >fry your gpu I mean all neural networks do that once you get big enough

gan isn't too bad tbf in the grand scheme of things, like compared to diffusion models inference and training is p cheap for the most part

hearty depot Apr 17, 2025, 12:58 AM

#

glacial root i'm guessing it's the same as setting the axis to the final dimension

ye it's similar to how do u -i

glacial root Apr 17, 2025, 12:59 AM

#

glacial root i'm guessing it's the same as setting the axis to the final dimension

as in if there's n dimensions, then axis = -1 would be the same as axis = n - 1

hearty depot Apr 17, 2025, 12:59 AM

#

for lists, it selects the ith list starting from the inner dimension

pearl barn Apr 17, 2025, 1:17 AM

#

What do you think of the Maven Data analysis course with Python? Is it good, or will I just repeat the process and get stuck when I find a real-world problem? And what about Alice Zhao, is she good? She made an advanced SQL course but I couldn't download it because it was uploaded to rapidgator.

lapis sequoia Apr 17, 2025, 1:22 AM

#

jaunty helm off the top of my head, I think prior to diffusion models GAN was pretty competi...

I do not know, two NN's just fighting for Nash equilibrium. Bert and stuff, you are pretraining text data on a genius, not two little angry neural nets trying to win a war and come to terms.

jaunty helm Apr 17, 2025, 1:30 AM

#

lapis sequoia I do not know, two NN's just figting for nash equillibrium. Bert and stuff, you ...

I mean, clearly there's merit to it if it works well
probably will have to read a bunch to really understand why it works

lapis sequoia Apr 17, 2025, 1:47 AM

#

jaunty helm I mean, clearly there's merit to it if it works well probably will have to read ...

I am just stressed, it is hard. I get how it works. You are trying to get the discriminator and generator to pretty much come to terms and agree on where they are, like "ok, I can pump fakes and you can pump real data", and they are like "ok, that works". Pretty much. It just takes so long to train and it is never optimal even with insane epochs and hyperparameters. The GAN game.

dusty valve Apr 17, 2025, 4:04 AM

#

lapis sequoia I am just stressed, it is hard. I get how it works. you are trying to get the di...

The history of captchas is the longest running training session

umbral geyser Apr 17, 2025, 4:52 AM

#

Ah soon I will be active In this group as I am taking data analytic, ai, data science/ml path later on

#

So excited to discuss with you guys n learn some cool hacks n tips

viscid urchin Apr 17, 2025, 5:29 AM

#

I wonder if anybody's tried to make a meme model that just uses the mantissa bits of FP16 NaNs, and ignores all real floats

tribal kettle Apr 17, 2025, 5:53 AM

#

Hey guys give me the road map to data base administration

grand minnow Apr 17, 2025, 8:19 AM

#

tribal kettle Hey guys give me the road map to data base administration

https://roadmap.sh/postgresql-dba

roadmap.sh

DBA Roadmap: Learn to become a database administrator with PostgreSQL

Community driven, articles, resources, guides, interview questions, quizzes for DevOps. Learn to become a modern DevOps engineer by following the steps, skills, resources and guides listed in this roadmap.

grand breach Apr 17, 2025, 9:57 AM

#

does huggingface allow to generate embeddings through api without downloading the model locally ?

lapis sequoia Apr 17, 2025, 9:59 AM

#

AI

#

Ml

grand breach Apr 17, 2025, 10:00 AM

#

grand breach does huggingface allow to generate embeddings through api without downloading th...

or there is any other free service ?

tribal kettle Apr 17, 2025, 10:24 AM

#

Thank you so much agentQ

lapis sequoia Apr 17, 2025, 11:11 AM

#

dusty valve The history of captchas is the longest running training session

underrated comment. that was amazing

outer cloak Apr 17, 2025, 11:56 AM

#

Yo! What's up mates I am back after a LOT of work! What's goin' on?

#

I learnt using Pandas and learned how to Clean Data

#

Now what??

pallid badge Apr 17, 2025, 11:59 AM

#

Hi, are there any pages good for checking for data science jobs, scientific software dev in Europe? Linkedin shows me only promoted jobs first, very annoying

outer cloak Apr 17, 2025, 12:43 PM

#

umm HI mate!

#

You can do Freelancing! Its easy and good for DS and ML.

Go to Fiverr or Upwork

#

sign in, create Gig and give out your sample projects. and BAM! you are done!

#

u get it?

lapis sequoia Apr 18, 2025, 12:00 AM

#

bro, like, when most people talk of RL who are not in robotics or optimal control theory EE stuff, are they just talking Q-learning? With Q-tables? Like, what is up?

limber token Apr 18, 2025, 12:04 AM

#

grand breach does huggingface allow to generate embeddings through api without downloading th...

Not that I'm aware of

#

They're more of a hosting thing than a service thing

#

nvm, you absolutely can: https://huggingface.co/docs/inference-providers/en/index

Inference Providers

#

Doesn't seem to support every model type though

#

Here's a snippet on how to generate embeddings:

from huggingface_hub import InferenceClient

client = InferenceClient(
    provider="hf-inference",
    api_key="hf_xxxxxxxxxxxxxxxxxxxxxxxx",
)

result = client.feature_extraction(
    "Today is a sunny day and I will get some ice cream.",
    model="intfloat/multilingual-e5-large-instruct",
)

grand minnow Apr 18, 2025, 12:15 AM

#

grand breach or there is any other free service ?

Here's a bunch of other embedding models: https://python.langchain.com/docs/integrations/text_embedding/

warm iron Apr 18, 2025, 3:17 AM

#

#

Guys what could cause it?

odd meteor Apr 18, 2025, 7:13 AM

#

warm iron

A lot of things can cause this.

From this learning curve we could infer your model is overfitting. Your model keeps fitting better to the training data, but it’s no longer generalizing better to unseen validation data after 70th-80th epoch.

What's the model you're training?
Are you using sufficient amount of data to train the model?
Did you apply any regularization?
BatchNorm, Dropout, Weight Decay... are you using any of those?
Have you done hyperparameter sweep on your learning rate?
Tried using different batch size and it's still not improving?

I'd like to hear what you've tried so far.
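Alongside the regularization options listed above, a common complementary trick for a curve that stops improving around a given epoch is early stopping. This is a framework-agnostic sketch, not something the original poster used; the `EarlyStopping` class, its `patience` parameter, and the loss values are all hypothetical illustrations.

```python
class EarlyStopping:
    """Signal a stop once validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.best_epoch = -1
        self.bad_epochs = 0

    def step(self, epoch, val_loss):
        """Record one epoch; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.best_epoch = epoch
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up validation losses: improving at first, then drifting upward.
val_losses = [1.0, 0.8, 0.7, 0.65, 0.66, 0.67, 0.70, 0.72, 0.75]

stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, loss in enumerate(val_losses):
    if stopper.step(epoch, loss):
        stopped_at = epoch
        break

print(stopper.best_epoch, stopped_at)  # 3 6
```

In a real training loop you would also save the model weights at `best_epoch` and restore them after stopping.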

fallen cliff Apr 18, 2025, 8:03 AM

#

Hello, I've made an hCaptcha solver in Python as a university project, using ML.
I don't need anything but feedback.
Here's the repo -> https://github.com/Irodavlas/HCaptchaSolver .
If anyone wants to give it a try lmk, it should take a few minutes since I've put tests on it.

GitHub

GitHub - Irodavlas/HCaptchaSolver

Contribute to Irodavlas/HCaptchaSolver development by creating an account on GitHub.

river cape Apr 18, 2025, 11:08 AM

#

hey guys, I have just used function calling in OpenAI, and I don't understand where it could be useful. I understand that we define a function description, and if the prompt matches the description, the model extracts the parameters. These params can then be used to call an API function that fetches the real-time info, but to pass that real-time info back we need to send the prompt again to get the final outcome. So it's a two-step process.

weak oxide Apr 18, 2025, 1:12 PM

#

I have a question
What's your opinion on Neural Prophet?

#

Because I saw normal Facebook prophet get completely bashed

narrow tiger Apr 18, 2025, 2:35 PM

#

guys, bge-m3 or OpenAI, which embeddings are better?

inland fractal Apr 18, 2025, 3:31 PM

#

#1362805794004799698 message
can anyone help me?

hollow lake Apr 18, 2025, 4:19 PM

#

Hello guys Anyone here who master using RAG framework with chatbots ?

serene scaffold Apr 18, 2025, 4:35 PM

#

hollow lake Hello guys Anyone here who master using RAG framework with chatbots ?

Hello, remember to never ask to ask or ask for an expert. Always ask your actual question right away.

rose spade Apr 18, 2025, 5:13 PM

#

How do I train my AI model with my own scratch datasets? I'm planning to use pytorch and pandas for this.

viscid urchin Apr 18, 2025, 5:14 PM

#

rose spade How do I train my AI model with my own scratch datasets? I'm planning to use pyt...

This is a big question, you may want a whole course https://course.fast.ai/

rose spade Apr 18, 2025, 5:17 PM

#

In a hypothetical situation, if I have a project that uses AI and Machine Learning, I should probably learn the basics first to understand the better logic and such?

viscid urchin Apr 18, 2025, 5:18 PM

#

Yep. There's a lot to learn, and it's daunting at first, but you should learn it from the ground up.

#

Feel free to have a big goal in mind to motivate you of course.

#

For me that's meant re-learning a bunch of calculus I hadn't paid enough attention to the first time around.

rose spade Apr 18, 2025, 5:24 PM

#

Well damn, that's a tough one. but ig I'll read the needed docs for that. hopefully I can show some progress for my project that involves machine learning and ai.

#

Supposedly, my only focus are on web-dev, but I got shifted to learning ai and such.

viscid urchin Apr 18, 2025, 5:27 PM

#

That's a big shift of scope

#

Like, I don't want to belittle webdev, but ML is a bigger problem to tackle

ionic surge Apr 18, 2025, 6:09 PM

#

i want to learn model training in python using yolo in kaggle, can anyone help

#

pls

viscid urchin Apr 18, 2025, 6:10 PM

#

Check out the 'pinned' stuff at the top of the channel, it seems pretty good

viscid urchin Apr 18, 2025, 6:12 PM

#

ionic surge i want to learn model training in python ising yolo in kaggle any one help

Welcome to the server.
Are you familiar with any basic ML stuff and/or Python, or is this your first outing?
Have you looked at this? https://docs.ultralytics.com/
Do you have a Kaggle notebook going already?
Do you have a dataset to work with?

ionic surge Apr 18, 2025, 6:14 PM

#

yes i have a notebook and i also create my own custom dataset for my project

ionic temple Apr 18, 2025, 8:19 PM

#

Anyone?

viscid urchin Apr 18, 2025, 8:50 PM

#

ionic temple

This might just be your font that it's chosen to use, because I think the MongoDB console is utf-8 by default.

charred estuary Apr 19, 2025, 12:16 AM

#

Has anyone played around with the HALO Hat for the Raspberry Pi 5?

#

It adds 26 TOPS and I was wondering if I should get one or save to build a dedicated rig with a 3060

limpid dew Apr 19, 2025, 1:16 AM

#

Is there a fundamental difference between using an embedding layer and one‑hot encoding into a fully connected layer?

untold fable Apr 19, 2025, 3:24 AM

#

Imagine I want to create a program that will guess your facial expression and, based on your facial expression, play a song on Spotify

#

I don't have any Spotify premium

versed bloom Apr 19, 2025, 7:25 AM

#

Are people with knowledge in Deep Learning here? If yes, please write me a DM

limber spear Apr 19, 2025, 7:43 AM

#

versed bloom Are people with knowledge in Deep Learning here? If yes, please write me a DM

What is it about. We can all learn on the channel here

versed bloom Apr 19, 2025, 8:05 AM

#

limber spear What is it about. We can all learn on the channel here

I am kind of confused by ResNets. I've seen the provided picture and know that the architecture on the right represents them. But I don't get how the input is provided since it is combined with more prior output as of what I've read, so how the "Residuum" really is calculated. And how the Skip Connections work

limber spear Apr 19, 2025, 8:08 AM

#

versed bloom I am kind of confused by ResNets. I've seen the provided picture and know that t...

From my experience, the most simplified explanation of everything in data science and ai is - input goes in, output comes out. It is really that simple.

versed bloom Apr 19, 2025, 8:10 AM

#

limber spear From my experience, the most simplified explanation of everything in data scienc...

I know that one haha but that doesnt help me 😄

limber spear Apr 19, 2025, 8:11 AM

#

This looks like a convolutional neural network for image processing. With a bunch of processing layers

#

@versed bloom did you pull the equations? That is the how. Or what you may be seeking to understand

versed bloom Apr 19, 2025, 8:18 AM

#

The skipping is the dotted lines I guess, ResNet is a translated form of CNNs. I don't need an equation, I don't know how the input of a layer comes about, since it is combined with some other prior output and the initial input?

limber spear Apr 19, 2025, 8:21 AM

#

The ‘skipping lines’ have equations. That is how most of these ‘complex’ models work

#

Just googled it
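For reference, the equation behind a skip connection in the original ResNet formulation is y = F(x) + x: the block's layers compute a residual F(x), and the unchanged input x is added back before the final activation. This is a minimal NumPy sketch that uses two small dense layers for F instead of convolutions (a simplification for readability); the function and weight names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


def relu(z):
    return np.maximum(z, 0.0)


def residual_block(x, w1, w2):
    """Return relu(F(x) + x), where F is two dense layers with a ReLU between."""
    f = relu(x @ w1) @ w2   # the residual F(x) that the block learns
    return relu(f + x)      # skip connection: add the unchanged input back


x = rng.normal(size=(4, 8))           # 4 samples, 8 features
w1 = rng.normal(size=(8, 8)) * 0.1
w2 = rng.normal(size=(8, 8)) * 0.1

y = residual_block(x, w1, w2)
print(y.shape)  # (4, 8)

# With all-zero weights F(x) = 0, so the block reduces to relu(x):
zeros = np.zeros((8, 8))
assert np.allclose(residual_block(x, zeros, zeros), relu(x))
```

So the input to the next block is simply this sum. The dotted lines in typical ResNet diagrams mark shortcuts where x is first projected to a new shape so the addition is still possible.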

severe blade Apr 19, 2025, 3:19 PM

#

hey, i've been struggling with this for 3-4 days now.

#

grand minnow Apr 19, 2025, 3:37 PM

#

severe blade hey, i've been struggling with this for 3-4 days now.

Did you try this? https://stackoverflow.com/a/61034368

Stack Overflow

Why `torch.cuda.is_available()` returns False even after installing...

On a Windows 10 PC with an NVidia GeForce 820M
I installed CUDA 9.2 and cudnn 7.1 successfully,
and then installed PyTorch using the instructions at pytorch.org:
pip install torch==1.4.0+cu92 torch...

strange vault Apr 19, 2025, 4:12 PM

#

I am trying to create a bot that extracts energy prices in the EU, per country, and I want live updates that are relatively up to date. I found one website that is both free and updates their data frequently. But I can't find out how to use their API, as they seem to be transitioning websites.

https://newtransparency.entsoe.eu/market

#

Does anyone know an alternative database for this or how to actually access their api to extract the data live and semi-continuous?

lapis sequoia Apr 19, 2025, 4:25 PM

#

I made a GAN that did not mode collapse, I have goosebumps this is amazing and magical. GANS are magical. They really are.

#

I did not think this was possible. I love this!

spring field Apr 19, 2025, 4:55 PM

#

strange vault I am trying to create a bot that extracts energy prices in de EU, per country an...

wdym "transitioning websites"? I can't find any API docs on their site anyway

limber spear Apr 19, 2025, 5:32 PM

#

strange vault Does anyone know an alternative database for this or how to actually access thei...

Check to see what government agencies have available for this category of data. Leading energy companies. But I wouldn’t think they would freely have that data available for their competitors to gain a competitive advantage

charred estuary Apr 19, 2025, 5:48 PM

#

Hey does anyone know if PyTorch could take advantage of 2 GPUs? Was planning on getting x2 3050’s with 6gb of VRAM each. I know I can’t train a massive model but I want to try training something small from scratch or fine tuning a 500M - 1B model

#

Do I need 1 GPU with 12GB vram or will x2 with 6gb do the trick?

serene scaffold Apr 19, 2025, 5:53 PM

#

charred estuary Hey does anyone know if PyTorch could take advantage of 2 GPUs? Was planning on ...

Yes. You can adjust the device map settings

viscid urchin Apr 19, 2025, 5:53 PM

#

charred estuary Do I need 1 GPU with 12GB vram or will x2 with 6gb do the trick?

Yeah it's got a thing called DataParallel. Here's a slightly old tutorial but I don't immediately see anything wildly out of date https://pytorch.org/tutorials/beginner/blitz/data_parallel_tutorial.html
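A minimal sketch of the `DataParallel` approach, assuming PyTorch is installed; the tiny model and batch are placeholders. One caveat worth knowing: `DataParallel` copies the whole model onto every GPU and splits each batch between them, so two 6 GB cards do not behave like one 12 GB card when it comes to fitting a larger model. The PyTorch docs also recommend `DistributedDataParallel` for serious multi-GPU training.

```python
import torch
from torch import nn

# Placeholder model; swap in the real network.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
if torch.cuda.device_count() > 1:
    # Replicates the model on each visible GPU and scatters the batch across them.
    model = nn.DataParallel(model)
model = model.to(device)

batch = torch.randn(16, 32, device=device)
logits = model(batch)
print(logits.shape)  # torch.Size([16, 10])
```

The same script runs unchanged on a single GPU or on CPU, which makes it easy to test before the second card arrives.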

charred estuary Apr 19, 2025, 6:07 PM

#

serene scaffold Yes. You can adjust the device map settings

Thank you 😄

pseudo lagoon Apr 19, 2025, 6:41 PM

#

Hey everyone, I have basic knowledge of Python. I'm mainly a web dev but want to learn AI since I'm not getting anyone who wants a website :-:
Can anyone guide me on how to start in the AI field and how to improve further?
For context, I already have basic knowledge of Python, can use APIs, and have a foundation in pandas too

charred estuary Apr 19, 2025, 6:55 PM

#

pseudo lagoon Hey everyone I have basic knowledge of python mainly I am a web dev but want to ...

Oh hey I do the same stuff! If you want to get started first learn prompt engineering. I made a flask website and started using the Gemini 2.0 flash API since it’s free and easy to get started with . I recommend learning more about how AI works before you start making your own.

#

My Gemini site is cryptoknightai.com

#

Building it definitely helped me learn the basics and now I’m learning PyTorch

charred estuary Apr 19, 2025, 6:56 PM

#

pseudo lagoon Hey everyone I have basic knowledge of python mainly I am a web dev but want to ...

https://www.codedex.io

Codédex

Codédex | Start Your Coding Adventure ⋆˙⟡

Codédex is a new way to learn to code for kids and adults alike. Journey through the fantasy land of Python, HTML, CSS, or JavaScript, earn experience points (XP) to unlock new regions, and collect all the badges at your own pace. Start your adventure today.

#

Great tool to start learning

reef gazelle Apr 19, 2025, 7:11 PM

#

import numpy as np
from PIL import Image, ImageOps

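# Presumably the pixel mean/std of the training set on the 0-255 scale
const_x_mean = 33.318421449829934
const_x_std = 78.56748998339798
epsilon = 1e-10  # avoids division by zero

# Load the image as 8-bit grayscale
img_path = 'one.png'
img = Image.open(img_path).convert('L')

# If the image is mostly light, invert it so the foreground is bright on dark
pixel_mean = np.mean(img)
if pixel_mean > 127:
    img = ImageOps.invert(img)

# Resize to 28x28 and flatten to a single row of 784 features
img = img.resize((28, 28))
img_array = np.array(img, dtype=np.float32).reshape(1, 784)

# Normalize with the constants above
img_array = (img_array - const_x_mean) / (const_x_std + epsilon)

# `model` is the trained Keras model (defined in the next snippet)
prediction = model.predict(img_array)
predicted_class = np.argmax(prediction)

print("Predicted class:", predicted_class)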

#

I need help, it loads the wrong image data

#

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential(
    [
        Dense(128, activation='relu', input_shape=(784,)),
        Dense(128, activation='relu'),
        Dense(10, activation='softmax'),
    ]
)

#

can anybody help me with how to load the image correctly when I predict?

#

the model works correctly with the test data
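
A common reason a model scores well on the MNIST test set but fails on your own image is that the image isn't laid out like the training data. MNIST digits are light strokes on a dark background, scaled to fit a 20x20 box and centered in the 28x28 frame. A minimal sketch of that preprocessing, assuming the model was trained on MNIST; `preprocess_digit` is a hypothetical helper, and it centers by bounding box rather than by center of mass as MNIST does:

```python
import numpy as np
from PIL import Image, ImageOps


def preprocess_digit(img, x_mean, x_std, eps=1e-10):
    """Turn a PIL image of one digit into a normalized (1, 784) array."""
    img = img.convert('L')
    # MNIST digits are light on dark; invert if the background is light.
    if np.asarray(img).mean() > 127:
        img = ImageOps.invert(img)
    # Crop to the bounding box of the non-black pixels (the digit itself).
    bbox = img.getbbox()
    if bbox is not None:
        img = img.crop(bbox)
    # Scale the longer side to 20 pixels, keeping the aspect ratio.
    w, h = img.size
    scale = 20 / max(w, h)
    img = img.resize((max(1, round(w * scale)), max(1, round(h * scale))))
    # Paste the digit in the middle of a black 28x28 canvas.
    canvas = Image.new('L', (28, 28), 0)
    canvas.paste(img, ((28 - img.width) // 2, (28 - img.height) // 2))
    arr = np.asarray(canvas, dtype=np.float32).reshape(1, 784)
    # Same normalization constants as used for the training data.
    return (arr - x_mean) / (x_std + eps)
```

The result can be passed to `model.predict` in place of `img_array`. Viewing the 28x28 array as an image before predicting is a quick way to see whether it looks like a training sample.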

#

Screenshot_2025-04-20_at_12.43.27_AM.png

reef gazelle Apr 19, 2025, 7:41 PM

#

hey

lapis sequoia Apr 19, 2025, 10:03 PM

#

any of you remember your first GAN?

lapis sequoia Apr 19, 2025, 10:24 PM

#

lapis sequoia any of you remember your first GAN?

I remember mine like yesterday, even tho it was today. What am I going to do? Generate fake celebs? It just sounds so dumb to master GANs. I mean, another tool in the arsenal. Oh, boys (and girls), I am proud.

severe blade Apr 20, 2025, 7:23 AM

#

grand minnow Did you try this? https://stackoverflow.com/a/61034368

tried everything. nothing seems to work.

viscid urchin Apr 20, 2025, 7:26 AM

#

severe blade tried everything. nothing seems to work.

What does torch.zeros(1).cuda() do for you? Do you get an error message?

severe blade Apr 20, 2025, 7:30 AM

#

yes.

grand minnow Apr 20, 2025, 7:36 AM

#

severe blade yes.

Did you install CUDA toolkit?

severe blade Apr 20, 2025, 7:52 AM

#

yes.

#

i literally bought this 4060 for this purpose. and it doesn't seem to work. 😭

grand minnow Apr 20, 2025, 8:13 AM

#

severe blade i literally bought this 4060 for this purpose. and it doesn't seem to work. 😭

How did you install pytorch?

severe blade Apr 20, 2025, 9:24 AM

#

normally. pip install torch.

grand minnow Apr 20, 2025, 9:42 AM

#

severe blade normally. pip install torch.

thats not right.

grand minnow Apr 20, 2025, 9:43 AM

#

severe blade normally. pip install torch.

https://pytorch.org/get-started/locally/

PyTorch

Start Locally

severe blade Apr 20, 2025, 9:43 AM

#

yea, just noticed this.

grand minnow Apr 20, 2025, 9:43 AM

#

reinstall your pytorch

#

Then try again

severe blade Apr 20, 2025, 10:27 AM

#

done.

#

#

device = 'cuda' if torch.cuda.is_available() else 'cpu'
this too returns cpu only.

#

still doesn't work.

grand minnow Apr 20, 2025, 10:33 AM

#

severe blade

#

Try uninstalling pytorch completely and then run your test code. If it throws a "Module Not Found" error, then it's definitely in the right environment. If it still runs, then you've installed it in the wrong environment.

#

You might consider setting up a virtual environment

arctic wedgeBOT Apr 20, 2025, 10:34 AM

#

Virtual environments

Virtual environments are isolated Python environments, which make it easier to keep your system clean and manage dependencies. By default, when activated, only libraries and scripts installed in the virtual environment are accessible, preventing cross-project dependency conflicts, and allowing easy isolation of requirements.

To create a new virtual environment, you can use the standard library venv module: python3 -m venv .venv (replace python3 with python or py on Windows)

Then, to activate the new virtual environment:

Windows (PowerShell): .venv\Scripts\Activate.ps1
or (Command Prompt): .venv\Scripts\activate.bat
MacOS / Linux (Bash): source .venv/bin/activate

Packages can then be installed to the virtual environment using pip, as normal.

For more information, take a read of the documentation. If you run code through your editor, check its documentation on how to make it use your virtual environment. For example, see the VSCode or PyCharm docs.

Tools such as poetry and pipenv can manage the creation of virtual environments as well as project dependencies, making packaging and installing your project easier.

Note: When using PowerShell in Windows, you may need to change the execution policy first. This is only required once per user:

Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser

severe blade Apr 20, 2025, 10:43 AM

#

grand minnow You might consider setting up a virtual environment

i've already done it.

severe blade Apr 20, 2025, 10:44 AM

#

grand minnow Try uninstalling pytorch completely and then run your test code. If it throws "M...

the same thing's happening. how do i uninstall it from the "wrong" environment too?

grand minnow Apr 20, 2025, 10:44 AM

#

severe blade the same thing's happening. how do i uninstall it from the "wrong" environment t...

How do you know its in the "wrong" environment?

grand minnow Apr 20, 2025, 10:45 AM

#

severe blade the same thing's happening. how do i uninstall it from the "wrong" environment t...

How did you uninstall it?

grand minnow Apr 20, 2025, 10:45 AM

#

severe blade i've already done it.

If you have done it, then you would have used it and installed directly in it. Then you won't have this "right" or "wrong" environment

severe blade Apr 20, 2025, 10:46 AM

#

still runs.

rich river Apr 20, 2025, 11:47 AM

#

  "postCreateCommand": "cd detectron2-0.6 && python3 -m pip install -e ."

this is in my devcontainer.json
I got

Running the postCreateCommand from devcontainer.json...

[5223 ms] Start: Run in container: /bin/sh -c cd detectron2-0.6 && python3 -m pip install -e .
Defaulting to user installation because normal site-packages is not writeable
Obtaining file:///workspaces/FFS-main/detectron2-0.6
  Preparing metadata (setup.py) ... error
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> [8 lines of output]
      running egg_info
      creating /tmp/pip-pip-egg-info-qpqnconv/detectron2.egg-info
      writing /tmp/pip-pip-egg-info-qpqnconv/detectron2.egg-info/PKG-INFO
      writing dependency_links to /tmp/pip-pip-egg-info-qpqnconv/detectron2.egg-info/dependency_links.txt
      writing requirements to /tmp/pip-pip-egg-info-qpqnconv/detectron2.egg-info/requires.txt
      writing top-level names to /tmp/pip-pip-egg-info-qpqnconv/detectron2.egg-info/top_level.txt
      writing manifest file '/tmp/pip-pip-egg-info-qpqnconv/detectron2.egg-info/SOURCES.txt'
      error: package directory 'detectron2-0.6/projects/PointRend/point_rend' does not exist
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
[7480 ms] postCreateCommand from devcontainer.json failed with exit code 1. Skipping any further user-provided commands.
Done. Press any key to close the terminal.

However

vscode ➜ /workspaces/FFS-main $ ls -l detectron2-0.6/projects/PointRend/
total 24
-rw-rw-r-- 1 vscode vscode 7467 Oct 26  2021 README.md
drwxrwxr-x 4 vscode vscode 4096 Oct 26  2021 configs
drwxrwxr-x 2 vscode vscode 4096 Oct 26  2021 point_rend
-rwxr-xr-x 1 vscode vscode 5160 Oct 26  2021 train_net.py

I do have this folder, any ideas?

grand minnow Apr 20, 2025, 1:48 PM

#

severe blade still runs.

Strange to see pip3 on Windows. What happens if you do pip instead of pip3?

severe blade Apr 20, 2025, 2:38 PM

#

its the same.

#

im using python version 3.12.6, for the record.

grand minnow Apr 20, 2025, 2:40 PM

#

severe blade its the same.

can you share what you do have installed? pip list I think

severe blade Apr 20, 2025, 2:41 PM

#

Package            Version
------------------ -----------
certifi            2025.1.31
charset-normalizer 3.4.1
filelock           3.18.0
fsspec             2025.3.2
idna               3.10
Jinja2             3.1.6
MarkupSafe         3.0.2
mpmath             1.3.0
networkx           3.4.2
numpy              2.2.4
pandas             2.2.3
pillow             11.2.1
pip                25.0.1
python-dateutil    2.9.0.post0
pytz               2025.2
regex              2024.11.6
requests           2.32.3
setuptools         78.1.0
six                1.17.0
sympy              1.13.1
tiktoken           0.9.0
torchaudio         2.6.0
torchvision        0.21.0
typing_extensions  4.13.2
tzdata             2025.2
urllib3            2.4.0

#

why the hell is there no "torch"

#

but the code still runs?
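
When `pip list` shows no `torch` but `import torch` still works, the notebook kernel is probably running a different interpreter than the one `pip` installs into. A minimal sketch for checking this from inside the notebook; `describe_torch` is a hypothetical helper name:

```python
import importlib.util
import sys


def describe_torch():
    """Report which interpreter is running and which torch build it sees."""
    info = {"python": sys.executable}
    if importlib.util.find_spec("torch") is None:
        info["torch"] = None  # torch is not importable from this interpreter
        return info
    import torch
    info["torch"] = torch.__version__
    info["torch_path"] = torch.__file__
    info["built_with_cuda"] = torch.version.cuda  # None for CPU-only builds
    info["cuda_available"] = torch.cuda.is_available()
    return info


print(describe_torch())
```

If `python` points outside the virtual environment, select the venv as the notebook kernel. If `built_with_cuda` is `None`, the installed wheel is a CPU-only build, and the selector on the PyTorch "Start Locally" page gives the install command for a CUDA build.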

grand minnow Apr 20, 2025, 2:58 PM

#

severe blade its the same.

how did you run your notebook?

#

What does your "kernel" show?

severe blade Apr 20, 2025, 3:07 PM

#

grand minnow What does your "kernel" show?

"Python 3.12.6"

grand minnow Apr 20, 2025, 3:07 PM

#

severe blade "Python 3.12.6"

And how did you run your notebook?

#

Im off to bed

severe blade Apr 20, 2025, 3:12 PM

#

grand minnow Im off to bed

'night.

severe blade Apr 20, 2025, 3:13 PM

#

grand minnow And how did you run your notebook?

"Run All" and yes, i restarted the kernel many times.

fallow coyote Apr 20, 2025, 4:24 PM

#

I use CoPilot and ChatGPT to find resources and to give me an idea on how to go about starting a particular project in mind. Is this an effective way of using these tools? I never use AI to help me complete any of my programming projects; any issues that come up in my code, I either figure out myself, ask the discord here, or search on Google (Stack Overflow, Reddit, etc.)

torpid jungle Apr 20, 2025, 4:59 PM

#

crazy pfp

wild solar Apr 20, 2025, 6:02 PM

#

Is there a preferred Python version to use when it comes to machine learning and AI? I used to work with Python 3.13 but had many issues with PyTorch and had to roll back to an older version, now i use 3.10 mainly and 3.12.6 occasionally.

tardy vessel Apr 20, 2025, 6:08 PM

#

I'm building an ai voice agent in python that streams audio from twilio to AssemblyAI STT for transcription and VAD, but it takes 4 seconds to reply back with the final transcript. I want it to be less than 1 sec.

Can anyone help with this?

viscid urchin Apr 20, 2025, 6:29 PM

#

wild solar Is there a preferred Python version to use when it comes to machine learning and...

3.11 is sort of a sweet spot at the moment for many libraries.

inland fractal Apr 20, 2025, 10:30 PM

#

should I only use the sklearn correlation matrix before I start feature scaling?
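
Pearson correlation doesn't change under standardization or min-max scaling, since both are positive linear transforms applied per column, so the order shouldn't matter for that particular check. A minimal sketch with synthetic data, using NumPy rather than any sklearn function:

```python
import numpy as np

rng = np.random.default_rng(0)
# Three columns on very different scales and offsets.
X = rng.normal(size=(200, 3)) * [1.0, 50.0, 0.01] + [0.0, 10.0, -3.0]
X[:, 2] += 0.002 * X[:, 1]  # make two of the columns correlated

# Standardize each column: subtract the mean, divide by the standard deviation.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

corr_before = np.corrcoef(X, rowvar=False)
corr_after = np.corrcoef(X_scaled, rowvar=False)
print(np.allclose(corr_before, corr_after))
```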

limpid dew Apr 20, 2025, 10:52 PM

#

Anyone have experience with embeddings? Trying to understand how they differ from one-hot encoding at a low level.
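
At the lowest level, an embedding is a row lookup in a table of learned vectors, which gives the same result as multiplying a one-hot vector by that table. A minimal sketch in NumPy, with a random table standing in for learned weights (the sizes are arbitrary):

```python
import numpy as np

vocab_size, embed_dim = 5, 3
rng = np.random.default_rng(0)
# In a real model this table is a trainable parameter; here it is random.
embedding_table = rng.normal(size=(vocab_size, embed_dim))

token_id = 2
one_hot = np.zeros(vocab_size)
one_hot[token_id] = 1.0

via_matmul = one_hot @ embedding_table  # one-hot vector times the table
via_lookup = embedding_table[token_id]  # embedding layer: index one row

print(np.allclose(via_matmul, via_lookup))
```

The one-hot vector is sparse and as long as the vocabulary, with every pair of tokens equally far apart. The embedding is a short dense vector whose values are adjusted during training, so related tokens can end up close together.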

grand minnow Apr 20, 2025, 11:09 PM

#

tardy vessel I'm building an ai voice agent in python that streams audio from twilio to assme...

Take a look at VAPI if you haven't yet.

grand minnow Apr 20, 2025, 11:11 PM

#

fallow coyote I use CoPilot and ChatGPT to find resources and to give me an idea on how to go ...

I use the same. Helps me with research but I combine them together with my Google-fu searches. Once I lay everything into a document and review what I want and how I want, I can then start properly.

grand minnow Apr 20, 2025, 11:12 PM

#

wild solar Is there a preferred Python version to use when it comes to machine learning and...

There's no preference, except to look at whether the libraries you wanna use are supported for that Python version. For instance, spaCy is only supported on versions before 3.13.

shadow cobalt Apr 20, 2025, 11:12 PM

#

When doing PCA, increasing number of components shouldn't affect previous features that are selected, should it? i.e. picking N features should lead to the same list of features as N-1, just with an extra feature added

grand minnow Apr 20, 2025, 11:15 PM

#

shadow cobalt When doing PCA, increasing number of components shouldn't affect previous featur...

Increasing number of components should affect previous features, because you added a new principal component.

shadow cobalt Apr 20, 2025, 11:17 PM

#

maybe i need to rewatch how it works

#

i thought it was like tuning a polynomial fit like a taylor series
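
That intuition can be checked numerically. PCA components are linear combinations of all the original features rather than a selection of them, and each one is fixed by the data's covariance, so asking for one more component normally leaves the earlier ones unchanged (up to sign). A minimal sketch, assuming scikit-learn's `PCA` with the exact `full` SVD solver and synthetic data:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Synthetic data with correlated columns.
X = rng.normal(size=(300, 5)) @ rng.normal(size=(5, 5))

pca2 = PCA(n_components=2, svd_solver="full").fit(X)
pca3 = PCA(n_components=3, svd_solver="full").fit(X)

# Compare absolute values so a sign flip would not count as a difference.
same = np.allclose(np.abs(pca3.components_[:2]), np.abs(pca2.components_))
print(same)
```

Approximate solvers such as the randomized one can give slightly different numbers from run to run, so small differences there don't mean the components really changed.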

glacial root Apr 20, 2025, 11:42 PM

#

hey guys, i just implemented a byte-pair encoding algorithm for a corpus of text, however i'm not exactly sure why this issue is happening
basically the issue is that after a certain number of iterations, rather than creating pairs it adds empty strings to the corpus, which originally started out as an array of each char from the original text used
here is my code and the corpus used, if anyone knows about this and can see the cause of the issue can you please help? thank you

#

import os

command = 'cat text_data/forms/abc/AbcPoems2AbcHkAndChinaV2Cauchy3Poembycheungshunsang.txt'
executable = os.popen(command)
corpus = list(executable.read())
executable.close()

vocabulary = []
for i in corpus:
    if i not in vocabulary:
        vocabulary.append(i)

for i in range(1000):
    pairs = dict()
    for j in range(len(corpus) - 1):
        key_list = list(pairs.keys())
        pair = ''.join(corpus[i:i+2])
        if pair in key_list:
            pairs[pair] += 1
        else:
            pairs.update({pair: 1})
    
    pairs = dict(sorted(pairs.items(), key = lambda item: item[1]))
    vocabulary.append(list(pairs)[-1])
    for j in range(len(corpus) - 1):
        chars = ''.join(corpus[i:i+2])
        if chars == vocabulary[-1]:
            corpus[i:i+2] = [chars]
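
The empty strings most likely come from the inner loops indexing with `i` (the merge counter) instead of `j`: once `i` passes the end of the shrinking corpus, `corpus[i:i+2]` is an empty slice, `''.join` of it is `''`, and the slice assignment appends `['']`. A minimal corrected sketch, assuming the goal is standard greedy BPE; `bpe_merges` is a hypothetical helper, and it rebuilds the corpus on each merge instead of editing it while looping over its old length:

```python
from collections import Counter


def bpe_merges(text, num_merges):
    """Greedy byte-pair encoding over characters.

    Returns (corpus, merges): the tokenized text and the merged tokens in order.
    """
    corpus = list(text)
    merges = []
    for _ in range(num_merges):
        # Count every adjacent pair of tokens in the current corpus.
        pairs = Counter(zip(corpus, corpus[1:]))
        if not pairs:
            break  # fewer than two tokens left, nothing to merge
        best = max(pairs, key=pairs.get)
        merges.append(best[0] + best[1])

        # Rebuild the corpus, replacing each occurrence of the best pair.
        merged = []
        j = 0
        while j < len(corpus):
            if j + 1 < len(corpus) and (corpus[j], corpus[j + 1]) == best:
                merged.append(corpus[j] + corpus[j + 1])
                j += 2
            else:
                merged.append(corpus[j])
                j += 1
        corpus = merged
    return corpus, merges


corpus, merges = bpe_merges("aaabdaaabac", 3)
print(merges)
print(corpus)
```

Pairs are kept as tuples of tokens rather than joined strings, so two different splits that spell the same text are not counted as the same pair.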

#

and here's the corpus i used, which i then converted to an array of chars

#

2 ABC of H.k. and China revised vision.
Barrels tears are wines and salts.
With a whisk on goody tails!
Wiggle maces to fix the heads.
Heads in jack on boxes are ceased.
Cry to paranoid truly bosses.
Bosses are jokers take your boys.
Studs are bogs with fire apples.
True predicates worth cases.’
Descents wash in badly bands.
Wholly sales are smart with cats.
Who got tenth honors in China?
Homage grand to play and plays!
Trim the times of hearts then cry.
Tanks in steels but voice wail.
Bossy dragged by tails that whisked.
Go very timid and love the wise.
Hands are lent but laws are ends.
Cases on courts are borrowed lands.
Length long with treads to retch!
Straps on times and watch here.
Arrays tanks but all are men.
Cross all suctions steal the ends.
Cave on minds are cages on objects.
Rouser rockets powers holes.
Confine curses to stop our wounds.
Whirl your bodies and jump on grounds.
Crouch of soldiers after kicks with flings.
Block one leg and hit the middle.
Cauchy3 know the tricks to kill.
Threaten weak oppressed ill.
Surpass scores are bad in honors.
Wash to think that build the homes.
Angel sins but cauchy3 has funs.
Make ones tools when hats are found.
Worlds are drawers on bottom noses.
Singular ugly piece is rose.
Wily mores are teeth of sharks.
Saw with tooth is laws in arts.
Artful men power with grids.
Bodies stamped and wills are ridden.
Sign in forth with battles conquered.
Triumphs on candles whip the stands.
Soups are soaps and faiths not come.
We are meats in balls and rice to constants.
---Cheung Shun Sang=Cauchy3---

#

i know the poem is a little weird lol, was from a random dataset i found on kaggle

#

here's an example result of what i mean with what's going on with the corpus

📎 message.txt

arctic wedgeBOT Apr 20, 2025, 11:45 PM

#

glacial root here's an example result of what i mean with what's going on with the corpus

Click here to see this code in our pastebin.

shadow cobalt Apr 21, 2025, 12:23 AM

#

shadow cobalt i thought it was like tuning a polynomial fit like a taylor series

I might be thinking of a cosine transform actually

#

I'm pretty sure that works that way

fallow coyote Apr 21, 2025, 12:29 AM

#

grand minnow I use the same. Helps me with research but I combine them together with my Googl...

Exactly what I do. Tbf I think using AI in this regard is so much more useful than just letting it do everything for you

pine heron Apr 21, 2025, 2:11 AM

#

Hello everyone, I implemented some optimizers using TensorFlow. I hope this project can help you.

https://github.com/NoteDance/optimizers

GitHub

GitHub - NoteDance/optimizers: This project implements optimizers f...

This project implements optimizers for TensorFlow and Keras, which can be used in the same way as Keras optimizers. Machine learning, Deep learning - NoteDance/optimizers

charred estuary Apr 21, 2025, 2:30 AM

#

I’m building a new rig and I am getting into AI training and running LLMs locally. Are there any good AMD GPUs for AI devs or is it just really an NVIDIA thing? I’m finding a lot of AMD GPUs with a decent amount of VRAM are much less than NVIDIA.

serene scaffold Apr 21, 2025, 2:37 AM

#

charred estuary I’m building a new rig and I am getting into AI training and running LLMs locall...

I'm not aware of any non-NVIDIA hardware that's anywhere nearly as widely supported as NVIDIA hardware. You can look to see if PyTorch runs on any non-NVIDIA devices, and with what caveats.

You will find that the amount of compute resources needed to fine-tune or deploy LLMs varies by orders of magnitude. You might consider not buying any AI-specific hardware at all, and using the savings to rent cloud compute.

You can't train an LLM from scratch on consumer-grade hardware--you can only maybe fine-tune an existing one.

iron basalt Apr 21, 2025, 2:43 AM

#

serene scaffold I'm not aware of any non-NVIDIA hardware that's anywhere nearly as widely suppor...

Modern AMD works mostly fine on PyTorch now, this is in large part due to AMD directly supporting Pytorch.

charred estuary Apr 21, 2025, 2:44 AM

#

serene scaffold I'm not aware of any non-NVIDIA hardware that's anywhere nearly as widely suppor...

Actually you can from scratch. I know a guy who trained a 2B and a 4B model off data sets he got off hugging face

#

The 4B may have been cloud but I know he did the 2B himself

charred estuary Apr 21, 2025, 2:44 AM

#

iron basalt Modern AMD works mostly fine on PyTorch now, this is in large part due to AMD di...

Yea I was gonna say that too

#

It’s all compatible, just asking whether or not it runs well

iron basalt Apr 21, 2025, 2:44 AM

#

You would need something like this to run it locally though: https://tinygrad.org/#tinybox

#

One GPU is not enough.

#

(15k USD)

charred estuary Apr 21, 2025, 2:45 AM

#

Damn 😭

charred estuary Apr 21, 2025, 2:46 AM

#

iron basalt You would need something like this to run it locally though: https://tinygrad.or...

Not tryna train the next chat gpt, just learning how everything works. I trained a 100M on my M1 MacBook Air

#

It’s more important to learn the skills IMO and then you can judge if a device like that is worth it

iron basalt Apr 21, 2025, 2:47 AM

#

charred estuary Not tryna train the next chat gpt, just learning how everything works. I trained...

This is not to train something like ChatGPT, that costs millions.

charred estuary Apr 21, 2025, 2:47 AM

#

Yk what I mean

#

Not wanting to make the next big thing

iron basalt Apr 21, 2025, 2:47 AM

#

Ok, if you just want to learn some ML, any modern consumer GPU will do.

#

Except Intel or whatever.

charred estuary Apr 21, 2025, 2:48 AM

#

Yea but from what I’m finding most AMD cards come nowhere near NVIDIA

iron basalt Apr 21, 2025, 2:48 AM

#

No idea of the status of Intel, seems like no one cares about it.

charred estuary Apr 21, 2025, 2:48 AM

#

iron basalt Except Intel or whatever.

Imagine trying to fine-tune DeepSeek on an Arc 😭

iron basalt Apr 21, 2025, 2:48 AM

#

charred estuary Yea but from what I’m finding most AMD cards come nowhere near NVIDIA

The most recent is not too far off. And way cheaper.

#

AMD is chosen for price.

charred estuary Apr 21, 2025, 2:49 AM

#

iron basalt No idea of the status of Intel, seems like no one cares about it.

They’re not great for much. They started with laptop GPUs and then tried to do desktop but it didn’t really work out

agile cobalt Apr 21, 2025, 2:49 AM

#

iirc there are some programs that support inference on AMD, but for training you'll really want NVIDIA

charred estuary Apr 21, 2025, 2:50 AM

#

charred estuary They’re not great for much. They started with laptop GPUs and then tried to do des...

Kind of like how Snapdragon is now trying to make non-phone chips

charred estuary Apr 21, 2025, 2:50 AM

#

agile cobalt iirc there are some programs that support inference on AMD, but for training you...

They’re all so expensive tho 😭

iron basalt Apr 21, 2025, 2:50 AM

#

agile cobalt iirc there are some programs that support inference on AMD, but for training you...

You can train on AMD.

charred estuary Apr 21, 2025, 2:50 AM

#

Ima email Jensen and js be like hey you gotta have an extra H100 laying around somewhere right

charred estuary Apr 21, 2025, 2:51 AM

#

iron basalt You can train on AMD.

Can and should are very different

iron basalt Apr 21, 2025, 2:51 AM

#

Nvidia is the typical option, and probably what you want. If you can get one...

agile cobalt Apr 21, 2025, 2:51 AM

#

there is also the option of just renting cloud compute instead of purchasing a GPU though, specially if you want to try training/fine tuning larger models

iron basalt Apr 21, 2025, 2:52 AM

#

charred estuary Can and should are very different

Your goal is not to make some giant model or anything anyhow, so why not.

charred estuary Apr 21, 2025, 2:52 AM

#

agile cobalt there is also the option of just renting cloud compute instead of purchasing a G...

I have a $200 digital ocean credit from the GitHub student program so that’s what I’ve actually been doing

#

It’s $3.39 an hour for an H100 rig

charred estuary Apr 21, 2025, 2:53 AM

#

iron basalt Your goal is not to make some giant model or anything anyhow, so why not.

Because I also don’t wanna make a micro model. Target is like 2B-5B-7B

#

By not huge I mean not like DeepSeek’s full 671B or whatever it is

#

Anyway any AMD cards that you think would be semi fast for training a 5B?

iron basalt Apr 21, 2025, 3:01 AM

#

charred estuary Anyway any AMD cards that you think would be semi fast for training a 5B?

No, nor would any Nvidia I think. Not enough memory even. IIRC you would need like 60-80GB of VRAM.

charred estuary Apr 21, 2025, 3:02 AM

#

iron basalt No, nor would any Nvidia I think. Not enough memory even. IIRC you would need li...

3090 24gb actually could and would only take a few days-a week

iron basalt Apr 21, 2025, 3:06 AM

#

charred estuary 3090 24gb actually could and would only take a few days-a week

Have you done this?

charred estuary Apr 21, 2025, 3:07 AM

#

iron basalt Have you done this?

Same friend that I was talking about before. He trained a 2B on a 3090

iron basalt Apr 21, 2025, 3:08 AM

#

Well 2B is a lot smaller.

#

Also fine tuning or from scratch?

charred estuary Apr 21, 2025, 3:08 AM

#

From scratch with torch

charred estuary Apr 21, 2025, 3:08 AM

#

iron basalt Also fine tuning or from scratch?

And I meant it’s the guy I was talking about that did the 2 and 5B

#

I asked if he did both locally or the 5 in the cloud and he said 3090 did both

iron basalt Apr 21, 2025, 3:16 AM

#

charred estuary I asked if he did both locally or the 5 in the cloud and he said 3090 did both

Well it seems like you have your answer then already.

charred estuary Apr 21, 2025, 3:18 AM

#

iron basalt Well it seems like you have your answer then already.

Nah not rly, I’m asking about AMD cards that have similar performance

iron basalt Apr 21, 2025, 3:20 AM

#

charred estuary Nah not rly, I’m asking about AMD cards that have similar performance

The 3090 has 24 GB of memory and about 36 TFLOPS (rounded up) at half precision.

#

It goes for about $1,700-1,800.

#

The Radeon RX 7900 XT has 20 GB, and about 103 TFLOPS at half precision. It goes for about $1,000-1,300.

#

The Radeon RX 6950 XT has 16 GB, and about 47 TFLOPS at half precision. It goes for about $500.

#

So the 3090 is clearly optimized around memory, likely to be able to hold a lot of texture data.

#

So games can load once and hold it all in there.

#

The conclusion here is the Nvidia prices are absurd, especially for a GPU that old.

#

Nvidia 5090 and such are way faster in terms of half precision FLOPs, but nobody can get a hold of them.

#

(And also have 32 GB)

iron basalt Apr 21, 2025, 3:42 AM

#

charred estuary From scratch with torch

Doing a bit of math to check this. It seems like with 24GB and some tricks you can just barely fit the 5B in 24 GB (during training).

#

So, important to keep that in mind if you want to go AMD, since it has less VRAM (unless you are willing to increase the price, then you can get 32 or 48 GB).

#

But on the other hand more FLOPs. So if you go smaller, you can go faster than the 3090 (e.g. 3-4B).

#

Note that the tricks used also degrade the quality, but since this is just for learning / messing around, that does not really matter.
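
The rough arithmetic behind numbers like these, as a minimal sketch: training memory scales with the parameter count times the bytes of state kept per parameter. The per-parameter figures below are assumptions for illustration (the thread doesn't say which tricks were meant), and activations come on top:

```python
def training_memory_gb(n_params, bytes_per_param):
    """Memory for weights, gradients and optimizer state, in GB (1e9 bytes).

    Activations are not included.
    """
    return n_params * bytes_per_param / 1e9


# Assumed per-parameter costs (illustrative, not measured):
#   16 = fp16 weights (2) + fp16 grads (2) + fp32 master weights (4)
#        + two fp32 Adam moments (4 + 4)
#    4 = a heavily reduced setup, e.g. 16-bit weights and grads
#        with no extra optimizer state
for label, bytes_per_param in [("mixed-precision Adam", 16), ("reduced-state setup", 4)]:
    print(label, training_memory_gb(5e9, bytes_per_param), "GB")
```

Under these assumptions a 5B model needs about 80 GB in the standard setup and about 20 GB in the reduced one, which is why it only just fits in 24 GB.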

river cape Apr 21, 2025, 7:59 AM

#

severe blade its the same.

hi, I saw this just now. First off, check which CUDA toolkit you have and if it's compatible with your existing torch version

#

and also are you on windows or linux?

severe blade Apr 21, 2025, 11:29 AM

#

river cape and also are you on windows or linux?

windows.

severe blade Apr 21, 2025, 11:30 AM

#

river cape hi i saw this right one, first of check which cuda toolkit you have and if its c...

ok, ill get back to you in a while.

river cape Apr 21, 2025, 12:15 PM

#

severe blade ok, ill get back to you in a while.

sometimes mismatched versions of CUDA and cuDNN might cause such problems

rich river Apr 21, 2025, 1:23 PM

#

This is using the detectron2 framework

from detectron2.engine import launch
...
def main2(args):
  ...
  print(outputs)
  return outputs

def launch_main():
    # Create arg parser
    arg_parser = setup_arg_parser()
    # args = arg_parser.parse_args()
    args = arg_parser.parse_args(["--dataset-dir", "/workspaces/FFS-main/data",\
                                  "--test-dataset","E2E_Robotics_ood_val",\
                                  "--num-gpus", "1",\
                                  "--config-file", "/workspaces/FFS-main/Flow_Feature_Synthesis/detection/configs/AD-Detection/regnetx.yaml",\
                                  "--inference-config","/workspaces/FFS-main/Flow_Feature_Synthesis/detection/configs/Inference/standard_nms.yaml",\
                                  "--random-seed", "8",\
                                  "--image-corruption-level","0",\
                                  "--visualize","1"
                                  ])
    # Support single gpu inference only.
    args.num_gpus = 1
    # args.num_machines = 8

    print("Command Line Args:", args)

    outputs = launch(
        main2,
        args.num_gpus,
        num_machines=args.num_machines,
        machine_rank=args.machine_rank,
        dist_url=args.dist_url,
        args=(args,),
    )
    print("outputs in launch main are:")
    print(outputs)

the outputs printed inside main2 are correct
but when I try to get the result in launch_main, it shows

outputs in launch main are:
None

any ideas?

viscid urchin Apr 21, 2025, 2:31 PM

#

You’re not returning anything from “launch”
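
If `launch` calls `main2` but never returns its result, which is what the `None` suggests, the value has to get out some other way. A minimal sketch of the pattern with a stand-in launcher (`fake_launch` is hypothetical and is not detectron2's code). It assumes everything runs in one process; multi-process runs would need to write results to disk instead:

```python
def fake_launch(main_func, args=()):
    """Hypothetical stand-in for a launcher that runs main_func and drops its result."""
    main_func(*args)  # note: no `return` here


def main2(args, results):
    # Placeholder for the real inference outputs.
    outputs = {"random_seed": args["random_seed"]}
    results.append(outputs)  # hand the result back through a shared container
    return outputs


results = []
returned = fake_launch(main2, args=({"random_seed": 8}, results))
print(returned)
print(results)
```

For a single-GPU, single-machine run, calling `main2(args)` directly instead of going through the launcher should also give the return value.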

upbeat prism Apr 21, 2025, 2:33 PM

#

does the autograd/backprop engine in PyTorch first build a topologically sorted graph and then just run backprop, or do they somehow "merge" the two?

viscid urchin Apr 21, 2025, 2:56 PM

#

Sounds like it does a topological sort first https://pytorch.org/blog/how-computational-graphs-are-executed-in-pytorch/

#

https://cismography.medium.com/understanding-autograd-from-scratch-66c2d209c61f
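
The "sort first, then run" structure is easy to see in a toy scalar autograd. A minimal sketch in the style of small educational engines; the `Value` class is hypothetical and this is not PyTorch's actual engine:

```python
class Value:
    """A scalar that records the operations that produced it."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # leaf nodes have nothing to propagate

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def _backward():
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def _backward():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Step 1: depth-first search to get a topological order of the graph.
        order, seen = [], set()

        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        # Step 2: walk the order in reverse, applying the chain rule.
        self.grad = 1.0
        for node in reversed(order):
            node._backward()


x, y = Value(2.0), Value(3.0)
z = x * y + x
z.backward()
print(x.grad, y.grad)
```

Here `z = x*y + x`, so the gradient with respect to `x` is `y + 1` and with respect to `y` is `x`. The reverse topological order guarantees a node's gradient is complete before it is passed on to its parents.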

gilded sundial Apr 21, 2025, 4:39 PM

#

What type of projects can I make with CNN classification?

jaunty helm Apr 21, 2025, 4:44 PM

#

gilded sundial What type of projects can I make with CNN classification?

mnist is a classic
cnn works wonders in image recognition

gilded sundial Apr 21, 2025, 4:46 PM

#

jaunty helm mnist is a classic cnn works wonders in image recognition

Our teacher gave us a dataset to classify cotton leaf diseases and the images are about 100 KB on avg. Would I be able to train a model with a good F1 score using these low-resolution images?

river cape Apr 21, 2025, 4:50 PM

#

gilded sundial Our teacher gave us a dataset to classify cotton leaf diseases and the images ar...

use a pretrained model

gilded sundial Apr 21, 2025, 4:53 PM

#

Ok

umbral hatch Apr 21, 2025, 5:04 PM

#

hey guys I'm interested in data science is there any specific website in the pythondiscord.com/resources for data science? or should i just learn python for now?

viscid urchin Apr 21, 2025, 5:10 PM

#

Do you have any budget for courses/websites? Some of the nicer-seeming options cost a little something.

#

(Plenty of free stuff too, but there are some nicely-structured paid things)

umbral hatch Apr 21, 2025, 5:13 PM

#

nah unfortunately

#

student mainly with data science studying on the side

viscid urchin Apr 21, 2025, 5:14 PM

#

Are you past the basics of Python? If not this is a pretty good course https://pll.harvard.edu/course/cs50s-introduction-programming-python

Harvard University

CS50's Introduction to Programming with Python | Harvard University

An introduction to programming using Python, a popular language for general-purpose programming, data science, web programming, and more.

umbral hatch Apr 21, 2025, 5:15 PM

#

mooc.fi part 3 so far