past meteor Sep 20, 2025, 8:01 PM

#

I just misunderstood what point you were trying to make 🙂

next panther Sep 20, 2025, 8:01 PM

#

past meteor I just misunderstood what point you were trying to make 🙂

might u be talking to me?

errant lake Sep 20, 2025, 8:03 PM

#

past meteor I just misunderstood what point you were trying to make 🙂

non-native english speaker just trying to have a conversation haha

#

sorry if unclear.

worldly dawn Sep 20, 2025, 8:03 PM

#

I am a big fan of doing some RICE scoring

#

It helps frame problems with a specific goal and can be done as a team

#

(https://www.intercom.com/blog/rice-simple-prioritization-for-product-managers/)
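A minimal sketch of the RICE arithmetic, assuming the usual definition from the linked Intercom post (reach × impact × confidence ÷ effort). The idea names and numbers are made up for illustration, and `Idea` is a hypothetical helper class.

```python
from dataclasses import dataclass


@dataclass
class Idea:
    name: str
    reach: float       # people affected per time period
    impact: float      # e.g. 0.25 (minimal) up to 3 (massive)
    confidence: float  # 0.0 to 1.0
    effort: float      # person-months

    @property
    def rice(self) -> float:
        # RICE score: higher means more value per unit of effort.
        return self.reach * self.impact * self.confidence / self.effort


# Made-up backlog items, only to show the ranking step.
ideas = [
    Idea("export to CSV", reach=500, impact=1, confidence=0.8, effort=2),
    Idea("new onboarding", reach=2000, impact=2, confidence=0.5, effort=5),
]
ranked = sorted(ideas, key=lambda idea: idea.rice, reverse=True)
for idea in ranked:
    print(f"{idea.name}: {idea.rice:.0f}")
```

Scoring as a team mostly means agreeing on the four inputs; the ranking itself is just this division.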

past meteor Sep 20, 2025, 8:04 PM

#

I didn’t know of this, I’ll check it out thanks a lot

worldly dawn Sep 20, 2025, 8:05 PM

#

And in terms of orgs/groups/teams, I do like giving problems with KPIs, so it's about converging towards solving a problem that is crisp to everyone

#

It's as important to know what to work on as it is important to know what to not work on. It avoids a lot of issues with respect to engineers being annoyed or wondering about why we ain't working on that shiny thing

past meteor Sep 20, 2025, 8:05 PM

#

We rarely formulate this stuff because most people on my team have great “instinct”

worldly dawn Sep 20, 2025, 8:06 PM

#

Sure and that's great! But I find that having great instinct going in the same direction has its benefits

past meteor Sep 20, 2025, 8:07 PM

#

Yup, it’s not an excuse

#

Lastly, when it comes to data and Python one thing I notice a lot is that classic DS/DA profiles are highly specialised to the point where it becomes annoying

errant lake Sep 20, 2025, 8:09 PM

#

past meteor Lastly, when it comes to data and Python one thing I notice a lot is that classi...

can you please elaborate?

past meteor Sep 20, 2025, 8:11 PM

#

Sometimes the solution isn’t building a new data thing or model but just rethinking the business process and maybe putting a small app in the middle

#

Most of the people that use Python in my company are fully siloed to pandas, spark stuff

errant lake Sep 20, 2025, 8:13 PM

#

past meteor Sometimes the solution isn’t building a new data thing or model but just rethink...

Well as DAs you should have a say in those processes/apps or additions that would solve that problem

next panther Sep 20, 2025, 8:14 PM

#

i think bro loves vtube

#

btw gn guys

little bobcat Sep 21, 2025, 12:18 AM

#

Hi

cedar fox Sep 21, 2025, 4:37 AM

#

I am trying to train a model with tensorflow/keras and get this error:

UserWarning: Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least steps_per_epoch * epochs batches. You may need to use the .repeat() function when building your dataset.
self._interrupted_warning()
2025-09-20 22:35:46.493184: I tensorflow/core/framework/local_rendezvous.cc:407] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
[[{{node IteratorGetNext}}]]
[[StatefulPartitionedCall/ArgMax/_6]]
2025-09-20 22:35:46.493207: I tensorflow/core/framework/local_rendezvous.cc:426] Local rendezvous recv item cancelled. Key hash: 1198440015494271145

I preprocessed my training data and saved it to .npz files. When I try to just loop infinitely over the .npz files, training never advances from "1/15 epochs". How do I resolve this? What am I missing and how do I troubleshoot it?

I'm fitting the model like this:

# Imports below are assumed for this snippet; `batches` is a helper from the linked code (not shown here).
from collections.abc import Iterable
from pathlib import Path

import keras
import numpy as np
import typer
from keras.callbacks import BackupAndRestore


def train(
    model: keras.models.Model,
    training_data: Iterable[tuple[np.ndarray, np.ndarray]],
    steps_count: int,
    validation_data: Iterable[tuple[np.ndarray, np.ndarray]],
    batch_size: int,
    output_directory: Path,
) -> keras.Model:
    typer.echo("Training model")

    model.compile(
        loss="categorical_crossentropy", optimizer="sgd", metrics=["accuracy"]
    )
    model.fit(
        batches(training_data, batch_size),
        validation_data=batches(validation_data, batch_size),
        epochs=15,
        steps_per_epoch=steps_count,
        verbose=2,
        callbacks=[BackupAndRestore(output_directory, delete_checkpoint=False)],
    )
    return model

My code is here: https://github.com/codeguru42/gobot/blob/steps_count/src/train.py

GitHub

gobot/src/train.py at steps_count · codeguru42/gobot

AI for playing Go. Contribute to codeguru42/gobot development by creating an account on GitHub.
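One way to reason about the warning above: Keras pulls `steps_per_epoch` batches per epoch, so the data source either has to hold `steps_per_epoch * epochs` batches or repeat forever. The sketch below uses plain NumPy and assumes the `.npz` contents fit in memory as two arrays; `steps_for` and `batches_forever` are hypothetical helpers, not functions from the linked repo. If the validation iterable is also infinite, `fit` would presumably need `validation_steps` too, since otherwise the validation pass has no natural end, which may be why training appears stuck at epoch 1.

```python
import itertools
import math
from typing import Iterator

import numpy as np


def steps_for(sample_count: int, batch_size: int) -> int:
    """Number of batches needed to see every sample once."""
    return math.ceil(sample_count / batch_size)


def batches_forever(
    features: np.ndarray, labels: np.ndarray, batch_size: int
) -> Iterator[tuple[np.ndarray, np.ndarray]]:
    """Yield (features, labels) batches, restarting at the end of the data."""
    while True:
        for start in range(0, len(features), batch_size):
            stop = start + batch_size
            yield features[start:stop], labels[start:stop]


# Made-up data: 5 samples with 3 features each, 2 one-hot classes.
features = np.arange(15, dtype=np.float32).reshape(5, 3)
labels = np.eye(2, dtype=np.float32)[[0, 1, 0, 1, 0]]

steps = steps_for(len(features), batch_size=2)
first_two_epochs = list(
    itertools.islice(batches_forever(features, labels, 2), steps * 2)
)
print(steps, len(first_two_epochs))

# With real Keras code both step counts would be passed explicitly, e.g.
# model.fit(train_gen, steps_per_epoch=steps,
#           validation_data=val_gen, validation_steps=val_steps, epochs=15)
```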

gritty vessel Sep 21, 2025, 5:47 AM

#

Hey, a rookie question, but is there a way I can see how a CNN is extracting features from an image?

#

Because it's not able to capture patterns on my dataset. It memorizes the samples when I train it on just 2-10 samples for 500 epochs, but when I train on around 12000 samples it's not able to capture patterns.
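A common way to inspect what a CNN extracts is to plot its feature maps, meaning the outputs of individual convolutional filters for a given input image. The sketch below only illustrates what one such feature map is, using a hand-written 3×3 Sobel filter in NumPy rather than a trained network. In a real model the filter weights are learned; the image and the `feature_map` helper here are made up.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def feature_map(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide `kernel` over `image` (no padding, stride 1), as a conv layer does."""
    windows = sliding_window_view(image, kernel.shape)
    # Multiply each window by the kernel element-wise and sum the result.
    return np.einsum("ijkl,kl->ij", windows, kernel)


# Made-up 5x5 image: dark on the left, bright on the right.
image = np.array([[0, 0, 1, 1, 1]] * 5, dtype=float)

# Sobel filter that responds to left-to-right brightness changes.
sobel_x = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)

edges = feature_map(image, sobel_x)
print(edges)
```

The map is large where the filter's pattern (here a vertical edge) is present and zero where the image is flat, which is the kind of picture feature-map plots of a trained network give.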

waxen kindle Sep 21, 2025, 5:49 AM

#

Your CNN might not be complex enough

#

Or your hyperparameters are badly tuned

#

Do you have a plot of epoch x error or epoch x loss?

#

That you would have during training

mossy blaze Sep 21, 2025, 11:01 AM

#

Hi there! I want to share a GitHub project about artificial intelligence: https://github.com/Julien-Livet/ai. I am currently thinking about natural language learning step by step, after composing numbers, expressions and dealing relationships with Python standard types (bool, int, float, numpy.ndarray, Sympy and OpenCV functions). I am open to any constructive feedback about my work 🙂 .

abstract loom Sep 21, 2025, 1:32 PM

#

Hi

hollow oasis Sep 21, 2025, 2:31 PM

#

sup guys

mystic heron Sep 21, 2025, 3:14 PM

#

Anyone know

#

Anything about my script

#

That I can improve for better results?

serene scaffold Sep 21, 2025, 3:29 PM

#

mystic heron Anything about my script

no, none of us have seen your script or know what it's supposed to do or what the current results are. you have to say all of that for us to be able to help you.

mystic heron Sep 21, 2025, 3:29 PM

#

serene scaffold no, none of us have seen your script or know what it's supposed to do or what th...

I know I just wanted to see who would respond first

#

📎 message.txt

arctic wedgeBOT Sep 21, 2025, 3:29 PM

#

mystic heron

Click here to see this code in our pastebin.

serene scaffold Sep 21, 2025, 3:29 PM

#

mystic heron I know I just wanted to see who would respond first

please never do that. always give all the information people would need to start helping you right away.

mystic heron Sep 21, 2025, 3:29 PM

#

Sure

#

can you see it?

#

https://paste.pythondiscord.com/KELBSCU7W4S2PQIUXO4IIXZLRE

serene scaffold Sep 21, 2025, 3:30 PM

#

yes. you also have to say what the current results are.
I'm actually heading out, but hopefully someone will take a look.

mystic heron Sep 21, 2025, 3:30 PM

#

Alr

#

The current results are... amazing

#

I got a 300% return within a 3 month period on BTCUSD

#

simulated on previous data

#

but it's the same thing as it would do if it were live, other than latency

#

that could be an issue but the model is pretrained

cedar fox Sep 21, 2025, 4:44 PM

#

mystic heron I know I just wanted to see who would respond first

This only wastes time. Just ask your question.

mystic heron Sep 21, 2025, 4:45 PM

#

cedar fox This only wastes time. Just ask your question.

I DID

#

are you listening???

#

https://paste.pythondiscord.com/KELBSCU7W4S2PQIUXO4IIXZLRE

I NEED FEEDBACK

cedar fox Sep 21, 2025, 4:45 PM

#

that's not a question

mystic heron Sep 21, 2025, 4:45 PM

#

Can you please give me feedback?

cedar fox Sep 21, 2025, 4:48 PM

#

mystic heron Can you please give me feedback?

Can you be more specific? Maybe take some time to explain the purpose of your code. Then describe what the current results are. Are there any problems with the code that you need help with? Or are you just looking for a general code review?

mystic heron Sep 21, 2025, 4:51 PM

#

I have a pdf

#

but it wont let me send. ill dm it to you

cedar fox Sep 21, 2025, 4:55 PM

#

what does a pdf have to do with anything?

#

it's probably malware

serene scaffold Sep 21, 2025, 5:55 PM

#

mystic heron I have a pdf

Can you say what the PDF says? PDFs aren't safe to share.

mossy blaze Sep 21, 2025, 6:43 PM

#

mossy blaze Hi there! I want to share a GitHub project about artificial intelligence: https:...

I added a test with the Syracuse (Collatz) sequence 🙂 . Here is the associated graph. Enjoy!

stark frigate Sep 21, 2025, 8:59 PM

#

Yoooo anyone codes in manim

#

Or uses manim

shy sonnet Sep 21, 2025, 10:00 PM

#

Hi, I’m Francis 👋

Aspiring Data Engineer learning Python & SQL, currently building my first projects.

Excited to learn & connect 🚀

meager gate Sep 22, 2025, 7:06 AM

#

shy sonnet Hi, I’m Francis 👋 Aspiring Data Engineer learning Python & SQL, currently bu...

Hello Francis, I am Ivan. 🙂

#

I am open to contact too

fervent badger Sep 22, 2025, 8:52 AM

#

stark frigate Yoooo anyone codes in manim

what is manim used for ?

toxic vault Sep 22, 2025, 11:35 AM

#

fervent badger what is manim used for ?

math animations

calm cipher Sep 22, 2025, 3:17 PM

#

fervent badger what is manim used for ?

it's written and used by the 3blue1brown YouTube channel

hot otter Sep 22, 2025, 3:30 PM

#

hi i am sparkling
i am excited to connect with you guys.🙂

fervent badger Sep 22, 2025, 3:33 PM

#

calm cipher it's written and used by the 3blue1brown YouTube channel

ok thanks

marsh iron Sep 22, 2025, 4:47 PM

#

hot otter hi i am sparkling i am excited to connect with you guys.🙂

hello there, sparkling!

crude escarp Sep 22, 2025, 5:04 PM

#

hot otter hi i am sparkling i am excited to connect with you guys.🙂

sup

warm flame Sep 22, 2025, 9:25 PM

#

heeeeeeeeeey guys

mystic heron Sep 22, 2025, 9:51 PM

#

yo im gay

serene scaffold Sep 22, 2025, 10:15 PM

#

This channel is for talking about data science and AI. You're welcome to participate, but don't just say "hi" or anything like that. Say something about the topic that can contribute to meaningful conversation. @warm flame @crude escarp @marsh iron @hot otter @mystic heron

hexed maple Sep 22, 2025, 10:34 PM

#

do we know of any Time-Series adjusted Random Forests or Neural Networks?

serene scaffold Sep 22, 2025, 10:59 PM

#

hexed maple do we know of any Time-Series adjusted Random Forests or Neural Networks?

What's your actual question?

hexed maple Sep 22, 2025, 11:00 PM

#

i need to estimate some nuisance functions in the DML framework, but i need time series adjusted methods

earnest light Sep 23, 2025, 12:53 AM

#

hello i'm sheiza,nice to see you guys

serene scaffold Sep 23, 2025, 1:50 AM

#

earnest light hello i'm sheiza,nice to see you guys

Hello! Please read this: #data-science-and-ml message

hot otter Sep 23, 2025, 8:58 AM

#

earnest light hello i'm sheiza,nice to see you guys

hello sheiza

serene scaffold Sep 23, 2025, 1:41 PM

#

Please stop just writing greetings without saying anything about data science or AI. These messages will be treated as intentional spam!

frosty mountain Sep 23, 2025, 2:15 PM

#

in pandas, how do you set a negative number to NaN

agile cobalt Sep 23, 2025, 2:17 PM

#

usually you'd just use numpy for that, `np.where(series < 0, np.nan, series)`
(series being a pandas series)

#

`series[series < 0] = float('nan')` also works but I'd recommend against using in-place operations if you can avoid it

frosty mountain Sep 23, 2025, 2:19 PM

#

Cheers

versed pilot Sep 23, 2025, 5:18 PM

#

agile cobalt usually you'd just use numpy for that, `np.where(series < 0, np.nan, series)` (s...

You can use .where in pandas https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.where.html
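A short sketch of the options mentioned above, on a made-up Series. `Series.where` keeps values where the condition is true and fills the rest with NaN by default, while `Series.mask` is its inverse. Note that `np.where` returns a plain NumPy array, not a Series.

```python
import numpy as np
import pandas as pd

# Made-up values for illustration.
series = pd.Series([3.0, -1.0, 0.0, -7.5, 2.0])

kept = series.where(series >= 0)   # keep non-negatives, NaN elsewhere
masked = series.mask(series < 0)   # same result, condition inverted
as_array = np.where(series < 0, np.nan, series)  # NumPy array, index is lost

print(kept.tolist())
print(type(as_array).__name__)
```

None of the three lines modifies `series` itself, which fits the advice about avoiding in-place operations.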

trail mist Sep 23, 2025, 8:42 PM

#

hey guys please let me out from this error

latent heath Sep 23, 2025, 9:09 PM

#

I'm trying to get jupyter notebooks to work in Pycharm, I've run the `pip install notebook` command, but I can't find the way to create a jupyter project the way the website shows it.
https://www.jetbrains.com/help/pycharm/editing-jupyter-notebook-files.html
I don't even see the sidebar on the right in the first image shown.

PyCharm Help

Create and edit Jupyter notebooks | PyCharm

serene scaffold Sep 23, 2025, 9:31 PM

#

trail mist hey guys please let me out from this error

hello, please always show the code and the whole entire error message as text. it's difficult to read all this, and some of the error message is cut off.

#

!code

arctic wedgeBOT Sep 23, 2025, 9:31 PM

#

Formatting code on Discord

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

For long code samples, you can use our pastebin.

serene scaffold Sep 23, 2025, 9:34 PM

#

latent heath I'm trying to get jupyter notebooks to work in Pycharm, I've run the `pip instal...

what's the specific issue that you're having?

latent heath Sep 23, 2025, 9:34 PM

#

I cannot confirm that jupyter is working as intended, and if it is how to work with it in pycharm.

serene scaffold Sep 23, 2025, 9:36 PM

#

latent heath I cannot confirm that jupyter is working as intended, and if it is how to work w...

can you show a screenshot of what you're currently seeing in pycharm?

#

are you still there, @latent heath?

latent heath Sep 23, 2025, 9:41 PM

#

I am, sorry.

#

I was walking, I need a moment to get set up.

#

That's the new project screen, and the options for environments I have.

agile cobalt Sep 23, 2025, 9:48 PM

#

latent heath That's the new project screen, and the options for environments I have.

the tutorial is using Conda, which you do have in the list?

any of them should work though, it's just that conda contains binaries for some annoying to build packages

latent heath Sep 23, 2025, 9:49 PM

#

I normally just use the first. I'm in an ai course and the prof has given us a .ipynb to work with, so I'm just going through setting it up.

agile cobalt Sep 23, 2025, 9:52 PM

#

~~oh you meant the templates?~~

serene scaffold Sep 23, 2025, 9:52 PM

#

latent heath I normally just use the first. I'm in an ai course and the prof has given us a ....

I don't recommend using conda, unless you know that you have a dependency that requires it (this is getting increasingly rare) or your professor says you must do it.

latent heath Sep 23, 2025, 9:53 PM

#

Conda isn't required, but the rest of the instruction is on that file.

serene scaffold Sep 23, 2025, 9:54 PM

#

what do you mean "that file"?

agile cobalt Sep 23, 2025, 9:54 PM

#

does it show anything if you just try to open the jupyter notebook in it?

latent heath Sep 23, 2025, 9:55 PM

#

serene scaffold what do you mean "that file"?

The .ipynb file he's given us to work on.

serene scaffold Sep 23, 2025, 9:55 PM

#

latent heath That's the new project screen, and the options for environments I have.

just clicking "Create" with this menu as-is should be sufficient.

latent heath Sep 23, 2025, 9:55 PM

#

agile cobalt does it show anything if you just try to open the jupyter notebook in it?

Welp. When I tried to at the start, it gave me some apps to try, but I was still under the impression at the time that jupyter was an app and not a package, so I haven't retried it since correcting that idea and installing it.

urban vine Sep 23, 2025, 9:56 PM

#

latent heath Welp. When I tried to at the start, it gave me some apps to try, but I was still...

I remembered having jupyter lab. Now that I have a new laptop, I just have Visual Studio Code.

serene scaffold Sep 23, 2025, 9:56 PM

#

latent heath Welp. When I tried to at the start, it gave me some apps to try, but I was still...

jupyter is a python package that can be used to run the jupyter notebook browser app, which is a way of editing jupyter notebooks.

urban vine Sep 23, 2025, 9:58 PM

#

serene scaffold jupyter is a python package that can be used to run the jupyter notebook browser...

I remembered taking a class based on Jupyter Lab. If I one day decide to complete the course, I suppose I could perhaps do so. Afterwards, given that I had experience in cv2, I could then apply for a job with that udemy certificate

latent heath Sep 23, 2025, 9:58 PM

#

And it does just open as it should. Welp. One of the dumber mistakes I've made. It says community edition only supports read only, but I should just be able to do it in any browser?

serene scaffold Sep 23, 2025, 9:59 PM

#

urban vine I remembered taking a class based on Jupyter Lab. If I one day decide to complet...

if your only credential is a udemy certificate, you will not be able to out-compete degree holders.

urban vine Sep 23, 2025, 9:59 PM

#

serene scaffold if your only credential is a udemy certificate, you will not be able to out-comp...

I thought udemy certificates plus experience I gain from that Computer Vision certificate program could guarantee a job

serene scaffold Sep 23, 2025, 9:59 PM

#

latent heath And it does just open as it should. Welp. One of the dumber mistakes I've made. ...

if you have a python environment with jupyter installed, doing `python -m jupyter notebook` in a terminal will start the jupyter notebook browser app
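To confirm from Python itself that the notebook app is installed in the active environment, a small check like this can help. It is only a sketch: it assumes the classic notebook app is provided by the `notebook` package (the one installed earlier with pip), and `is_installed` is a hypothetical helper name.

```python
import importlib.util
import sys


def is_installed(package: str) -> bool:
    """Return True if `package` can be imported by this interpreter."""
    return importlib.util.find_spec(package) is not None


# Shows which interpreter (and therefore which environment) is being checked.
print(sys.executable)
print("notebook installed:", is_installed("notebook"))
```

Running it with the same interpreter that PyCharm uses for the project shows whether the package landed in that environment or in a different one.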

urban vine Sep 23, 2025, 10:00 PM

#

Like I believe it isn't just the certificate itself that guarantees the job, but also the knowledge I gained/retained while working toward that certificate

latent heath Sep 23, 2025, 10:01 PM

#

serene scaffold if you have a python environment with jupyter installed, doing `python -m jupyte...

Neat. Thanks. Now hopefully this is straightforward enough.

serene scaffold Sep 23, 2025, 10:01 PM

#

urban vine I thought udemy certificates plus experience I gain from that Computer Vision ce...

No, if there's a job listing for an AI/ML job, and more people with relevant degrees apply to that position than they can interview (which will happen), they won't bother interviewing anyone who doesn't have one.

urban vine Sep 23, 2025, 10:01 PM

#

So a degree is still needed, huh?

#

So like a bachelors degree or masters I suppose

serene scaffold Sep 23, 2025, 10:02 PM

#

urban vine So a degree is still needed, huh?

yes. this won't stop being the case in the foreseeable future.
a masters is usually required for these positions.

urban vine Sep 23, 2025, 10:02 PM

#

Well, if I were to get a masters, I must not have a social life if possible

#

Well, I should try not to socialize with anyone outside of my career interest

serene scaffold Sep 23, 2025, 10:03 PM

#

uh, what?

latent heath Sep 23, 2025, 10:04 PM

#

Why's that? I'm only in my undergrad rn, but I'm considering a masters in some area of discrete math, but generally wouldn't it be better to have colleagues with varied backgrounds?

urban vine Sep 23, 2025, 10:04 PM

#

latent heath Why's that? I'm only in my undergrad rn, but I'm considering a masters in some a...

So people with different career interests?

latent heath Sep 23, 2025, 10:05 PM

#

I mean, if you're doing data science and ml, who are you doing it for? Like somewhere along the way you're gonna encounter people in different fields and have to work with them.

urban vine Sep 23, 2025, 10:06 PM

#

I see

latent heath Sep 23, 2025, 10:06 PM

#

Also, just make friends with cool people? I can't speak on the purely utilitarian aspect of how you pick your friends, but I don't see a reason to just aim for people who want the exact same career as you. You'll find those anyway.

urban vine Sep 23, 2025, 10:06 PM

#

I see. So social skills are important?

latent heath Sep 23, 2025, 10:06 PM

#

Correct.

#

If you want it for utilitarian purposes, modern science and professions are rarely solo or single-discipline endeavours. You are a social being. Be social.

urban vine Sep 23, 2025, 10:09 PM

#

I will try

latent heath Sep 23, 2025, 10:09 PM

#

👍

serene scaffold Sep 23, 2025, 10:11 PM

#

there's more to life than your career. it's a good thing to have varied interests and to have friends who share those interests

urban vine Sep 23, 2025, 10:12 PM

#

So I shouldn't try and graduate as fast as possible?

serene scaffold Sep 23, 2025, 10:12 PM

#

urban vine So I shouldn't try and graduate as fast as possible?

this is a non sequitur.

urban vine Sep 23, 2025, 10:13 PM

#

serene scaffold this is a non sequitur.

Like, should I take my time in getting the degree I need for my career?

serene scaffold Sep 23, 2025, 10:13 PM

#

urban vine Like, should I take my time in getting the degree I need for my career?

what country is this? in the US, a bachelors degree usually takes four years. so do it in four.

urban vine Sep 23, 2025, 10:14 PM

#

Alright. Once I get the finances needed for my degree, I will go the four years

serene scaffold Sep 23, 2025, 10:14 PM

#

urban vine Alright. Once I get the finances needed for my degree, I will go the four years

what country are you in?

urban vine Sep 23, 2025, 10:14 PM

#

serene scaffold what country are you in?

The US.

serene scaffold Sep 23, 2025, 10:16 PM

#

urban vine The US.

so it's pretty much impossible to pay for a degree up-front. when you say "get the finances you need", what are you talking about?

latent heath Sep 23, 2025, 10:16 PM

#

urban vine So I shouldn't try and graduate as fast as possible?

I mean, I'm not. Between work and not being able to confirm that there won't be scheduling conflicts, I dropped my course load and have seen my grades go up for it. Something to think about if you want a masters.

latent heath Sep 23, 2025, 10:17 PM

#

serene scaffold so it's pretty much impossible to pay for a degree up-front. when you say "get t...

This is a good point. I lived with my parents for the first 3 years of my degree, and only paying tuition and textbooks I'm still over 24k.

fallow yacht Sep 24, 2025, 3:33 AM

#

has anyone got access to WRDS CRSP data via an institution subscription and would be willing to share the AAPL series?

#

I've been trying to reproduce the "Tidy Finance with Python" beta calculations, and my attempts are close but not quite the same.

Tidy Finance

Beta Estimation with Python

An opinionated approach on empirical research in financial economics

#

My colab notebook is here: https://colab.research.google.com/drive/1UIIBMfx-BHro_MAX2ZwZ7tA1Zd4EZNbG?usp=sharing

with yf data i get Intercept 0.009941 and beta 1.376236 , however the article is quoting Intercept 0.010093 and beta 1.387103 , which is very close but not quite. I am interested to know whether CRSP is doing something additional when making adjustments to prices, or whether I missed something

Google Colab
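For reference, the beta in question is the slope of a regression of the stock's excess returns on the market's excess returns, and the intercept is the alpha. A minimal NumPy sketch with made-up monthly numbers (not CRSP or Yahoo data); `estimate_beta` is a hypothetical helper, and the Tidy Finance chapter may set up its regression differently.

```python
import numpy as np


def estimate_beta(
    stock_excess: np.ndarray, market_excess: np.ndarray
) -> tuple[float, float]:
    """OLS fit of stock_excess = alpha + beta * market_excess."""
    design = np.column_stack([np.ones_like(market_excess), market_excess])
    (alpha, beta), *_ = np.linalg.lstsq(design, stock_excess, rcond=None)
    return float(alpha), float(beta)


# Made-up excess returns, constructed as 0.01 + 1.4 * market.
market = np.array([0.02, -0.01, 0.03, 0.00, -0.02, 0.015])
stock = 0.01 + 1.4 * market

alpha, beta = estimate_beta(stock, market)
print(f"alpha={alpha:.4f} beta={beta:.4f}")
```

Small differences in the inputs (return definition, risk-free rate, sample window) shift both coefficients, so comparing the two return series month by month is a reasonable next step.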

fallow yacht Sep 24, 2025, 4:54 AM

#

I'd like to compare the output of this:

crsp_monthly_query = (
  "SELECT msf.permno, date_trunc('month', msf.mthcaldt)::date AS date, "
         "msf.mthret AS ret, msf.shrout, msf.mthprc AS altprc, "
         "ssih.primaryexch, ssih.siccd "
    "FROM crsp.msf_v2 AS msf "
    "INNER JOIN crsp.stksecurityinfohist AS ssih "
    "ON msf.permno = ssih.permno AND "
       "ssih.secinfostartdt <= msf.mthcaldt AND "
       "msf.mthcaldt <= ssih.secinfoenddt "
   f"WHERE msf.mthcaldt BETWEEN '{start_date}' AND '{end_date}' "
          "AND ssih.sharetype = 'NS' "
          "AND ssih.securitytype = 'EQTY' "
          "AND ssih.securitysubtype = 'COM' "
          "AND ssih.usincflg = 'Y' "
          "AND ssih.issuertype in ('ACOR', 'CORP') "
          "AND ssih.primaryexch in ('N', 'A', 'Q') "
          "AND ssih.conditionaltype in ('RW', 'NW') "
          "AND ssih.tradingstatusflg = 'A'"
)

to the yahoo data for AAPL, to see where the discrepancy arises

opaque condor Sep 24, 2025, 2:12 PM

#

How can I train a multi-model?
Will it suffer from catastrophic forgetting even if it has a large dataset?

bright crypt Sep 24, 2025, 3:56 PM

#

I am trying to build a movie recommendation system, and I don't have much knowledge about RecSys apart from the basics of SVD. I came across criticker; it looks like a good interface and close to what I want to do. Are there any specific resources that will come in handy? Any tips to start with the project will be highly appreciated.
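As a starting point for the SVD basics mentioned above, here is a minimal NumPy sketch of a low-rank approximation of a tiny, made-up rating matrix. It treats every entry as observed, which a real recommender would not do (missing ratings need special handling), and `low_rank` is a hypothetical helper name.

```python
import numpy as np


def low_rank(ratings: np.ndarray, rank: int) -> np.ndarray:
    """Rebuild `ratings` from its `rank` largest singular values."""
    u, s, vt = np.linalg.svd(ratings, full_matrices=False)
    return u[:, :rank] @ np.diag(s[:rank]) @ vt[:rank, :]


# Rows are users, columns are movies (made-up scores from 1 to 5).
ratings = np.array(
    [
        [5.0, 4.0, 1.0, 1.0],
        [4.0, 5.0, 1.0, 2.0],
        [1.0, 1.0, 5.0, 4.0],
        [2.0, 1.0, 4.0, 5.0],
    ]
)

approx = low_rank(ratings, rank=2)
print(np.round(approx, 2))
```

With two components the matrix is summarised by two taste directions, which is the intuition behind using SVD-style factorisation for recommendations.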

viscid urchin Sep 25, 2025, 3:33 PM

#

Hey folks, is anybody willing to do a neutral evaluation of a Data Science B.S. degree program I am looking at? I have some personal biases here that I would like to calibrate out.

If so: https://datascience.fsu.edu/students/combined-pathways
Specifically, what you get when you click on BS in Computer Science (BS-CS to MS-IDS)

#

The program director is an old friend of mine, and I can't really expect myself to not have some rose tint when I review his choices etc.

#

The base-level Comp. Sci. B.S. flow this uses is here: https://www.cs.fsu.edu/files/Course_Flowcharts_2024/2020_CS_BS_Updated_2024.pdf

mossy blaze Sep 25, 2025, 4:17 PM

#

mossy blaze I added a test with the Syracuse (Collatz) sequence 🙂 . Here is the associated graph. Enjoy!

Has anyone looked at my work on GitHub? I'd love to hear some feedback on it 🙂 .

north thistle Sep 25, 2025, 5:03 PM

#

Im here to learn data science and AI. I'm a biomedical engineering student

viscid urchin Sep 25, 2025, 5:28 PM

#

(re: the above, feel free to @ me if you end up taking a look, many thanks.)

agile cobalt Sep 25, 2025, 5:34 PM

#

viscid urchin Hey folks, is anybody willing to do a neutral evaluation of a Data Science B.S. d...

Computer Security Fundamentals for Data Science sounds a bit weird?.. shouldn't that be part of the "base-level Comp. Sci. B.S."?

viscid urchin Sep 25, 2025, 5:46 PM

#

agile cobalt Computer Security Fundamentals for Data Science sounds a bit weird?.. shouldn't ...

You’re right, that is weird. Conway’s Law suggests that means there is some org chart weirdness

vague heron Sep 25, 2025, 7:27 PM

#

I'm working on a project where I integrate all the standard stuff I think should be in any PyTorch project: MLflow, Optuna, separation between settings and logic using config files, cross validation, and making the core training script as generic as possible while supporting multiple model repos like Hugging Face, Ollama, MONAI. Are there any other projects that attempt writing a similar unified "template" code?

teal gate Sep 26, 2025, 12:49 AM

#

Hello I want to start to do Machine learning and AI can anyone tell me how i do it im kinda a begginer in python

hazy compass Sep 26, 2025, 4:41 AM

#

teal gate Hello I want to start to do Machine learning and AI can anyone tell me how i do ...

Me too , I'm a begginer

teal gate Sep 26, 2025, 4:42 AM

#

Nice

spring field Sep 26, 2025, 8:16 AM

#

beginner* 🙂

main citrus Sep 26, 2025, 10:13 AM

#

teal gate Hello I want to start to do Machine learning and AI can anyone tell me how i do ...

Hi, in my opinion start with learning Python basics (loops, arrays and functions).
After that move to the analysis (EDA and data engineering): master pandas and seaborn.
After you finish that you can move to machine learning and start learning the basic models (such as kNN and linear regression) and use them on your data with sklearn.
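To make the last step concrete, here is a from-scratch k-nearest-neighbours sketch in NumPy on made-up points. It is only meant to show the idea; in practice a library implementation such as scikit-learn's `KNeighborsClassifier` would be the usual choice, and `knn_predict` is a hypothetical helper.

```python
from collections import Counter

import numpy as np


def knn_predict(
    train_x: np.ndarray, train_y: list[str], query: np.ndarray, k: int = 3
) -> str:
    """Label `query` by majority vote among its k closest training points."""
    distances = np.linalg.norm(train_x - query, axis=1)
    nearest = np.argsort(distances)[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Made-up 2D points: one cluster near (0, 0), one near (5, 5).
train_x = np.array(
    [[0.0, 0.0], [0.5, 0.2], [0.1, 0.6], [5.0, 5.0], [5.5, 4.8], [4.9, 5.3]]
)
train_y = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train_x, train_y, np.array([0.3, 0.3])))
print(knn_predict(train_x, train_y, np.array([5.1, 5.0])))
```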

waxen kindle Sep 26, 2025, 1:12 PM

#

teal gate Hello I want to start to do Machine learning and AI can anyone tell me how i do ...

focus on strong python basics, then array manipulation, linear algebra and statistics, with numpy and pandas

delicate trench Sep 26, 2025, 1:12 PM

#

i need to learn, a lot of my 12th marks depend on it and i hope to make a career in AI engineering which requires at least basic python knowledge

waxen kindle Sep 26, 2025, 1:13 PM

#

delicate trench i need to learn, alot of my 12th marks depends on it and i hope to make a career...

ok? do you need help with anything ?

delicate trench Sep 26, 2025, 1:13 PM

#

waxen kindle ok? do you need help with anything ?

i need python teacher

waxen kindle Sep 26, 2025, 1:14 PM

#

yeah we don't do that here

#

we have tons of good resources however

delicate trench Sep 26, 2025, 1:14 PM

#

waxen kindle yeah we don't do that here

oh why? is there a rule against it?

waxen kindle Sep 26, 2025, 1:14 PM

#

!res

arctic wedgeBOT Sep 26, 2025, 1:14 PM

#

Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

delicate trench Sep 26, 2025, 1:14 PM

#

waxen kindle we have tons of good resources however

so you think, that i am playing the games?

waxen kindle Sep 26, 2025, 1:14 PM

#

what ?

delicate trench Sep 26, 2025, 1:15 PM

#

i am serious about this python stuff dude.

waxen kindle Sep 26, 2025, 1:15 PM

#

if you wanna learn, check the resources and ask when you have specific questions

#

but we don't do teaching/tutoring here

delicate trench Sep 26, 2025, 1:15 PM

#

waxen kindle but we don't do teaching/tutoring here

wheres the rule against it

waxen kindle Sep 26, 2025, 1:17 PM

#

it kinda falls into the rule 9

#

!rule 9

arctic wedgeBOT Sep 26, 2025, 1:17 PM

#

Rules

9. Do not offer or ask for paid work of any kind.

waxen kindle Sep 26, 2025, 1:17 PM

#

(and if you plan to have someone doing it for free, you simply won't find anyone)

delicate trench Sep 26, 2025, 1:19 PM

#

arctic wedge

I never pay?

#

wdym? when did i offer a payment?

agile cobalt Sep 26, 2025, 1:19 PM

#

delicate trench wheres the rule against it

it is not against the rules, but nobody has time to teach you personally

you can ask questions and whoever's available may reply, but we are not home tutors

delicate trench Sep 26, 2025, 1:19 PM

#

do you think, i am son of jeff bezos?

waxen kindle Sep 26, 2025, 1:21 PM

#

No one will spend hours for free to teach you

worldly kelp Sep 26, 2025, 1:21 PM

#

@delicate trench i have a tutor for you... https://www.youtube.com

YouTube

Enjoy the videos and music you love, upload original content, and share it all with friends, family, and the world on YouTube.

agile cobalt Sep 26, 2025, 1:21 PM

#

delicate trench do you think, i am son of jeff bezos?

do you think, we have nothing else to do with our time?..

again, you can ask questions here, but we don't do 1:1 tutoring

waxen kindle Sep 26, 2025, 1:21 PM

#

Why would I spend hours doing it for free when I could be paid by someone to do so

delicate trench Sep 26, 2025, 1:23 PM

#

agile cobalt do you think, we have nothing else to do with our time?.. again, you can ask qu...

See, it seems you are entitled

You dont speak for 400k people in this server sir

delicate trench Sep 26, 2025, 1:23 PM

#

waxen kindle Why would I spend hours doing it for free when I could be paid by someone to do ...

Thats your choice, i respect whatever decision you make

#

Thats your decision and not mine to make

#

but dont try to force others

worldly kelp Sep 26, 2025, 1:24 PM

#

delicate trench Thats your choice, i respect whatever decision you make

been here for 5 years mate my friend, good luck finding anyone who will help you regularly and reliably for free, you will be much better off researching on stack, github, youtube etc.

delicate trench Sep 26, 2025, 1:25 PM

#

worldly kelp been here for 5 years mate my friend, good luck finding anyone who will help you...

dude, dont speak for the 400k people here. Dont be entitled. i respect your choice if u dont wanna do it

waxen kindle Sep 26, 2025, 1:25 PM

#

Yeah really, I never saw someone accept such a thing here, everyone always shares the resource page

#

bc that's how developers learn

delicate trench Sep 26, 2025, 1:25 PM

#

Yeah dude, but like chill

#

do ur own thing

worldly kelp Sep 26, 2025, 1:25 PM

#

you're entitled expecting people to give you their time and effort for free to teach you things you can very easily teach yourself

waxen kindle Sep 26, 2025, 1:25 PM

#

tbh if you wanna go into dev and Data science, you'll need to learn to use resources

#

better starting now

delicate trench Sep 26, 2025, 1:27 PM

#

worldly kelp you're entitled expecting people to give you their time and effort for free to t...

Dude like i said. I wont listen to you but I RESPECT your decision

#

I only listen to parents, God, teachers, and then whoever i want to listen to

#

ok?

#

And surely, i wont be paying a money to anyone. So no server rules are being broken

#

so chill out, and dont play the games with me

waxen kindle Sep 26, 2025, 1:29 PM

#

With your attitude, I would be very unlucky to have you as a student

delicate trench Sep 26, 2025, 1:30 PM

#

waxen kindle With your attitude, I would be very unlucky to have you as a student

because i know my rights and dont bend to your will>?

#

the ego is insane

waxen kindle Sep 26, 2025, 1:32 PM

#

It's not about rights and will, it's about people telling you it's gonna happen

worldly kelp Sep 26, 2025, 1:32 PM

#

<@&831776746206265384> can we perhaps get someone to tone this guys attitude down a bit, fresh addition to the server and already being combative/rude

waxen kindle Sep 26, 2025, 1:33 PM

#

I mean, we can simply stop talking and wait that hopefully someone come and accept, but you'll better start using the resources we gave you or you'll never learn anything

#

bc noone is coming to teach

#

realistically, noone will

delicate trench Sep 26, 2025, 1:34 PM

#

worldly kelp <@&831776746206265384> can we perhaps get someone to tone this guys attitude dow...

i just want to be left alone

delicate trench Sep 26, 2025, 1:34 PM

#

waxen kindle I mean, we can simply stop talking and wait that hopefully someone come and acce...

Yes, lets do the first one. We can stop talking and hopefully someone come and accept.

#

Thats the best option

serene scaffold Sep 26, 2025, 1:34 PM

#

!shh

arctic wedgeBOT Sep 26, 2025, 1:34 PM

#

✅ silenced current channel for 4 minute(s).

serene scaffold Sep 26, 2025, 1:34 PM

#

I need a few minutes to get caught up

#

@delicate trench in all my years here, I've never seen anyone commit to an ongoing mentor-student relationship with another user. if someone wants to do that (for free), they absolutely can, but that's so unlikely to happen that the best way to learn and get help is to use self-guided resources and ask specific questions in this server when you have them. there are lots of people here who are excited to answer one-off questions.

arctic wedgeBOT Sep 26, 2025, 1:38 PM

#

✅ unsilenced current channel.

serene scaffold Sep 26, 2025, 1:38 PM

#

we can now put that to bed

worldly kelp Sep 26, 2025, 1:39 PM

#

thankyou good sir 🙂

delicate trench Sep 26, 2025, 1:41 PM

#

serene scaffold @delicate trench in all my years here, I've never seen anyone commit to an...

can you please ask the others to stay out of my business tho?

#

They think they are slick man.

serene scaffold Sep 26, 2025, 1:41 PM

#

yeah, I said we're done talking about that, so they will.

delicate trench Sep 26, 2025, 1:42 PM

#

alright thanks man!

#

i will use resource, but i still will continue search the master

serene scaffold Sep 26, 2025, 1:42 PM

#

there is no master

worldly kelp Sep 26, 2025, 1:43 PM

#

facepalm

delicate trench Sep 26, 2025, 1:43 PM

#

serene scaffold there is no master

dude i just said chill out lets mind our own business

serene scaffold Sep 26, 2025, 1:43 PM

#

especially in AI. everyone is running around acting like they know what they're doing, but everyone is trying to figure out what's going on

delicate trench Sep 26, 2025, 1:43 PM

#

!res

arctic wedgeBOT Sep 26, 2025, 1:43 PM

#

Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

tulip drift Sep 26, 2025, 9:49 PM

#

assalamualaikum guys
I'm learning AI but the problem is machine compatibility. Can someone share options for it other than paid cloud computing or virtual machines, or any other method?

serene scaffold Sep 26, 2025, 9:53 PM

#

tulip drift assalamualaikum guys I'm learning AI but the problem is machine compatibility ca...

What OS do you have

coral hollow Sep 26, 2025, 11:12 PM

#

How hard would it be to train a model to convert speech to text?

jagged jasper Sep 26, 2025, 11:25 PM

#

coral hollow How hard would it be to train a model to convert speech to text?

Not very difficult, considering you could just download one from huggingface.

coral hollow Sep 26, 2025, 11:26 PM

#

jagged jasper Not very difficult, considering you could just download one from huggingface.

How much would it cost to run it locally, and is it free for commercial use?

jagged jasper Sep 26, 2025, 11:26 PM

#

Good question, let me look.

coral hollow Sep 26, 2025, 11:26 PM

#

Sorry, I am new to the space and I am asking as a complete beginner

#

I need to find out if this would make sense for me to do

jagged jasper Sep 26, 2025, 11:28 PM

#

I would look at whisper-large-v3

openai/whisper-large-v3 · Hugging Face

#

It only has 1.54B params

coral hollow Sep 26, 2025, 11:29 PM

#

whisper should not be free for commercial use IIRC

jagged jasper Sep 26, 2025, 11:29 PM

#

Ohh that's right...

coral hollow Sep 26, 2025, 11:30 PM

#

Wait could be free for commercial use

jagged jasper Sep 26, 2025, 11:30 PM

#

No, it is not.

#

But this version is! https://huggingface.co/openai/whisper-large-v3-turbo

openai/whisper-large-v3-turbo · Hugging Face

#

MIT license!

coral hollow Sep 26, 2025, 11:31 PM

#

Nice

jagged jasper Sep 26, 2025, 11:31 PM

#

And it only has 800M params

coral hollow Sep 26, 2025, 11:31 PM

#

However does it make sense to train the model?

#

I don't want it to think I am saying the wrong words

jagged jasper Sep 26, 2025, 11:32 PM

#

coral hollow However does it make sense to train the model?

Probably not.

#

You could if you wanted, but it would be a lot of work.

coral hollow Sep 26, 2025, 11:33 PM

#

Hm..

#

It does not even have to understand a lot, it is just supposed to convert speech to the correct letters. If it sounds right its already enough for me

jagged jasper Sep 26, 2025, 11:35 PM

#

coral hollow It does not even have to understand a lot, it is just supposed to convert speech...

That's the hard part. 'Converting speech to the correct letters'

agile cobalt Sep 26, 2025, 11:35 PM

#

coral hollow It does not even have to understand a lot, it is just supposed to convert speech...

That """just""" is a giant hurdle

coral hollow Sep 26, 2025, 11:35 PM

#

it would be fine if it thinks "apple" is "abble", but not fine if it thinks it's "train"

jagged jasper Sep 26, 2025, 11:35 PM

#

@coral hollow Do you want me to write a script for whisper v3 large for you?

coral hollow Sep 26, 2025, 11:35 PM

#

jagged jasper @coral hollow Do you want me to write a script for whisper v3 large for...

no no

#

just wondering about the expected accuracy

jagged jasper Sep 26, 2025, 11:37 PM

#

I'm not sure, but probably pretty good.

#

I'm testing it now

agile cobalt Sep 26, 2025, 11:38 PM

#

coral hollow just wondering about the expected accuracy

assuming that the input is clear, fluent and loud enough, it is pretty good (comparable to assistants like Siri or Alexa)
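
One common way to put a number on "accuracy" for speech-to-text is word error rate (WER): the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words. Below is a minimal pure-Python sketch. The function name `word_error_rate` and the sample sentences are made up for illustration, and real evaluations usually normalise punctuation and casing more carefully than this.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # (none yet) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning i reference words into an empty hypothesis = i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one wrong word out of four.
wer = word_error_rate("i ate an apple", "i ate an abble")
print(f"WER: {wer:.2f}")
```

Note that WER counts "abble" and "train" as equally wrong; if near-misses should be tolerated, a character-level variant of the same distance would be closer to what is wanted.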

coral hollow Sep 26, 2025, 11:40 PM

#

Essentially what I need is whatever google is using to convert spoken words to text, like the small microphone button to talk

#

which then just converts whatever language is spoken to letters

jagged jasper Sep 26, 2025, 11:40 PM

#

coral hollow which then just converts whatever language is spoken to letters

What do you mean 'letters'?

coral hollow Sep 26, 2025, 11:40 PM

#

i wonder how they are doing it

coral hollow Sep 26, 2025, 11:40 PM

#

jagged jasper What do you mean 'letters'?

words

#

text

jagged jasper Sep 26, 2025, 11:41 PM

#

Ok, I've made a small script:

import torch
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor, pipeline
from datasets import load_dataset


device = "cuda:0" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32

model_id = "openai/whisper-small"

model = AutoModelForSpeechSeq2Seq.from_pretrained(
    model_id, dtype=torch_dtype, low_cpu_mem_usage=True, use_safetensors=True
)
model.to(device)

processor = AutoProcessor.from_pretrained(model_id)

pipe = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    torch_dtype=torch_dtype,
    device=device,
)

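# Pass a single path (not a list) so the pipeline returns one dict with a "text" key;
# with a list it returns a list of dicts and result["text"] would fail.
result = pipe("audio_1.mp3")
print(result["text"])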

#

Make sure there is an MP3 file named 'audio_1.mp3' in the same directory.

coral hollow Sep 26, 2025, 11:42 PM

#

very cool

#

how well does it perform?

jagged jasper Sep 26, 2025, 11:42 PM

#

coral hollow very cool

It's from the website 😜

jagged jasper Sep 26, 2025, 11:42 PM

#

coral hollow how well does it perform?

You can see I've chosen the small version so it can download faster. It's almost downloaded and I can see how it works.

#

@coral hollow It looks like it has an accuracy of 'up to 99% in some cases'!

coral hollow Sep 26, 2025, 11:44 PM

#

jagged jasper @coral hollow It looks like it has an accuracy of 'up to 99% in some ca...

that sounds hard to believe but yea lets see

jagged jasper Sep 26, 2025, 11:44 PM

#

yeah ik

#

'99%'

#

ugh now I need ffmpeg, one minute.

coral hollow Sep 26, 2025, 11:47 PM

#

I'd test it with music in the background, speaking quietly and like someone who dropped out of school

#

Then we'll see how good it really is

jagged jasper Sep 26, 2025, 11:48 PM

#

coral hollow I'd test it with music in the background, speaking quietly and like someone who dr...

Good idea. Remember to replace whisper-small with whisper-large-v3-turbo

coral hollow Sep 26, 2025, 11:49 PM

#

sure

#

Try talking while eating

#

At this point its trolling the ai though

#

I am not at home, so I can't test it myself rn

#

@jagged jasper did you test it?

jagged jasper Sep 27, 2025, 12:00 AM

#

@coral hollow I'm having a problem with ffmpeg right now; I'm not on my main computer. You can try it yourself; it's not a big download.

#

You'll need to get a particular version of ffmpeg, ffmpeg 7 I think.

coral hollow Sep 27, 2025, 12:01 AM

#

ok

jagged jasper Sep 27, 2025, 12:33 AM

#

@coral hollow I've tested it, and it seems really good!

#

I recorded a few clips with a poor mic and it transcribed them perfectly.

#

One small mistake I made, you need to make this change to the definition of model:

model = AutoModelForSpeechSeq2Seq.from_pretrained(
    model_id, torch_dtype=torch_dtype, low_cpu_mem_usage=True, use_safetensors=True
)

#

Anyway, this definitely seems like a good idea for your project.

tulip drift Sep 27, 2025, 5:55 AM

#

serene scaffold What OS do you have

Windows 10

woven prairie Sep 27, 2025, 6:30 AM

#

I want to learn RAG. I have seen a few tutorials on YouTube, but they mostly use LangChain. Are there any resources where they build RAG from scratch, especially the retriever part?

west wing Sep 27, 2025, 9:00 AM

#

Any good documentation for statistics used in data science

meager iron Sep 27, 2025, 9:03 AM

#

Where can i find MIMIC-III and MIMIC-IV datasets?

#

Need these datasets for NLP model.

gusty iris Sep 27, 2025, 10:52 AM

#

Hey im looking for recommendations on the best LLM for generating ML code, specifically for a computer vision task. My goal is to train a facial expression recognition model that beats an old paper's accuracy by at least 1%.

I'm a novice and initially used DeepSeek Coder R1, which performed well but didn't meet the target accuracy. Are there any other powerful LLMs you guys can suggest? Im currently torn between Claude $20 a month, Expanse Ai or Open router.

woven prairie Sep 27, 2025, 1:14 PM

#

gusty iris Hey im looking for recommendations on the best LLM for generating ML code, speci...

Claude writes better code. Try generating the code with Claude, then use Perplexity to refactor it according to your requirements.

mystic heron Sep 27, 2025, 2:00 PM

#

yo guys

#

I need help with RL stuff.

serene scaffold Sep 27, 2025, 2:01 PM

#

mystic heron I need help with RL stuff.

You have to ask your actual question to get help

mystic heron Sep 27, 2025, 2:01 PM

#

My bad bro

dawn wyvern Sep 27, 2025, 2:22 PM

#

hloo

steel jetty Sep 28, 2025, 12:13 AM

#

does anyone know a good website where i can download datasets? Im working on a homework where i need to find a real-world data set and create a plot to display it

gusty iris Sep 28, 2025, 12:23 AM

#

steel jetty does anyone know a good website where i can download datasets? Im working on a h...

Kaggle?

steel jetty Sep 28, 2025, 12:24 AM

#

gusty iris Kaggle?

i was thinking of that, but my professor recommends not to use kaggle

agile cobalt Sep 28, 2025, 12:56 AM

#

steel jetty i was thinking of that, but my professor recommends not to use kaggle

why?

#

you can also find a bunch in Hugging Face, https://datasetsearch.research.google.com/, government websites, and random places around the web though

edit; also public data in Google BigQuery though that is a bit of a pain to work with

#

some examples of government websites where you can find data:

steel jetty Sep 28, 2025, 1:15 AM

#

agile cobalt why?

I honestly don’t know

steel jetty Sep 28, 2025, 1:15 AM

#

agile cobalt you can also find a bunch in Hugging Face, <https://datasetsearch.research.googl...

Thank youu

agile cobalt Sep 28, 2025, 1:19 AM

#

overall just remember to check the size, scope and license of any dataset before you download it

you don't want to try to download something larger than your computer's available storage space by mistake,
you probably don't want something that only covers things you are unfamiliar with,
and some datasets require attribution (and while not applicable for this, may also restrict commercial usage and redistribution)

steel jetty Sep 28, 2025, 1:20 AM

#

agile cobalt overall just remember to check the size, scope and license of any dataset before...

got it, thank you so much

pine prism Sep 28, 2025, 2:38 AM

#

@small wedge so how would i add the weights first so i have a better understanding on how they can influence my ants

#

i know that the weights multiplied do something

small wedge Sep 28, 2025, 2:40 AM

#

that's up to you as the person designing the sim, if you want to do it with a neural network like you're describing then you need to do two things, first is decide how to turn the inputs the ants will get into a vector of numbers, and then decide how you want to interpret the output of the nn, which will also be a vector of numbers

#

for example one of my projects taught some ai's to aim at a moving target, the numbers they were given as input were the position and velocity of the target, their outputs were the x,y coordinate to aim their shots at
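
To make the "vector in, vector out" idea concrete, here is a minimal sketch of a single-layer network in plain Python, showing where the weights get multiplied. The input features (distance to food, distance to nest, energy), the two actions, and every weight value are hypothetical; in a genetic sim the weights would start random and be evolved rather than hand-written.

```python
import math


def forward(inputs, weights, biases):
    """One dense layer: each output is tanh(weighted sum of the inputs + bias)."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(math.tanh(total))
    return outputs


# Hypothetical ant inputs: [distance_to_food, distance_to_nest, energy], scaled to 0..1.
ant_inputs = [0.2, 0.9, 0.5]

# One row of weights per output neuron (made-up values).
weights = [
    [-2.0, 0.0, 1.0],   # output 0: "seek food" score
    [0.0, -1.0, -2.0],  # output 1: "return to nest" score
]
biases = [0.5, 0.5]

scores = forward(ant_inputs, weights, biases)
actions = ["seek_food", "return_to_nest"]
# Interpret the output vector: pick the action with the highest score.
chosen = actions[scores.index(max(scores))]
print(scores, chosen)
```

Changing a single weight changes how strongly one input pushes one output, which is the sense in which the weights "influence" the ants.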

pine prism Sep 28, 2025, 2:41 AM

#

small wedge that's up to you as the person designing the sim, if you want to do it with a ne...

like use the output of the numbers as a sort of way to determine what the program wants to do next based on what happens?

small wedge Sep 28, 2025, 2:41 AM

#

yes

#

then you score the different ants (assuming you want them all to be agents and do a genetic sim), the best ones survive and cross over then their children mutate a bit

#

neat part about this is you can kind of avoid all the math of gradient descent etc that you would need in a gradient-based RL method like deep q learning or a policy gradient method like ppo

pine prism Sep 28, 2025, 2:43 AM

#

should i store their decisions somewhere in the program so that the ants remember previous decisions so they know how to work next

small wedge Sep 28, 2025, 2:45 AM

#

nah, there are algorithms that do that like q-learning where you score actions and then your agent basically picks the best actions based on their q-scores, but that would be separate from this
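
For contrast, the Q-learning idea mentioned here (score actions, then pick the best-scoring one) can be sketched in tabular form. This is only an illustration: the states, actions, rewards, and the `alpha` and `gamma` values are made up, and a real agent would also need some exploration strategy.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update for one observed transition."""
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])


def best_action(q, state, actions):
    """Greedy choice: the action with the highest Q-score in this state."""
    return max(actions, key=lambda a: q[(state, a)])


q = defaultdict(float)  # unseen (state, action) pairs start at 0.0
actions = ["left", "right"]

# Hypothetical experience: going right from "start" finds food, going left finds nothing.
for _ in range(50):
    q_update(q, "start", "right", 1.0, "food", actions)
    q_update(q, "start", "left", 0.0, "empty", actions)

print(best_action(q, "start", actions), round(q[("start", "right")], 3))
```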

#

another simple way of doing this that doesn't require a neural network at all is like a string genome, you could represent actions as just plain letters like L for left and R for right, you can run the same process here with choosing the best and crossing over their brains without any sort of actual weights
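
A minimal sketch of that string-genome idea, assuming a toy fitness function (how far right the ant ends up) purely for illustration. `fitness`, `crossover`, `mutate`, `evolve`, and the population settings are hypothetical names and values, not something from a specific library.

```python
import random

ACTIONS = "LR"  # L = step left, R = step right


def fitness(genome: str) -> int:
    """Toy score: final x position after following the genome (higher is better)."""
    return genome.count("R") - genome.count("L")


def crossover(parent_a: str, parent_b: str, rng: random.Random) -> str:
    """Single-point crossover: prefix from one parent, suffix from the other."""
    cut = rng.randrange(1, len(parent_a))
    return parent_a[:cut] + parent_b[cut:]


def mutate(genome: str, rate: float, rng: random.Random) -> str:
    """Replace each letter with a random action with probability `rate`."""
    return "".join(rng.choice(ACTIONS) if rng.random() < rate else g for g in genome)


def evolve(generations=30, pop_size=20, genome_len=12, seed=0):
    rng = random.Random(seed)
    population = ["".join(rng.choice(ACTIONS) for _ in range(genome_len))
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Score everyone and keep the best half as survivors.
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        # Refill the population with mutated children of random survivor pairs.
        children = [
            mutate(crossover(rng.choice(survivors), rng.choice(survivors), rng), 0.05, rng)
            for _ in range(pop_size - len(survivors))
        ]
        population = survivors + children
    return max(population, key=fitness)


best = evolve()
print(best, fitness(best))
```

Because the survivors are carried over unchanged, the best score never gets worse from one generation to the next; swapping in a real fitness function (for example, food collected in the sim) is the only part that is specific to the ants.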

#

there are an infinite number of ways to do it really, you could add as many extra things as you want

pine prism Sep 28, 2025, 2:47 AM

#

small wedge there are an infinite number of ways to do it really, you could add as many extr...

alright that makes more sense, ive made a kind of AI program a while ago that had a memory factor, i might try and use it in a different way than i used it before so it can apply to this project

#

all the program did was store each response or prompt i gave it to a list named memory and it output responses based on what was in its memory

gusty iris Sep 28, 2025, 5:16 AM

#

coral hollow How hard would it be to train a model to convert speech to text?

As another beginner i can let you know its a complete pain in the ass

#

Can anyone recommend me between claude opus and gpt-5 in improving competent ML code

coral hollow Sep 28, 2025, 7:05 AM

#

gusty iris As another beginner i can let you know its a complete pain in the ass

y?

bronze wyvern Sep 28, 2025, 12:40 PM

#

Hello, can someone explain what moving average is, how is it calculated and how it differs from "normal" average pls. I know it's maths stuff but I don't have that knowledge, would really appreciate if someone can explain.

I did google but all I'm seeing is applications of it, like for forecasting etc, but they aren't explaining why it's used there, what its benefit is and why not just use the normal average

In real-world analysis, what can a moving average show us that a normal average can't pls

Also, if we need to plot the moving average on a graph, this average is for multiple years, so which year do we choose? I read that it's the middle year which is chosen, would really appreciate if someone can explain why

left tartan Sep 28, 2025, 1:05 PM

#

bronze wyvern Hello, can someone explain what moving average is, how is it calculated and how ...

An average is great at representing data where the distribution doesn't change over time... but, how many things in the real world are stable?

#

But, take global warming, for instance. Is the average temperature of the earth over past 1000 years useful?

#

You could chop the period into fixed-interval chunks, and compute 100 year averages, sure

#

That would produce a discontinuous graph, almost appearing like the average jumps around every 100 years

#

A rolling average would produce a more intuitive view of the changing temperature: showing how the average is changing over time

left tartan Sep 28, 2025, 1:09 PM

#

bronze wyvern Hello, can someone explain what moving average is, how is it calculated and how ...

If you're plotting rolling average, you'd plot it for each year, with the average over that year and the preceding N values

#

There are also ways to 'weight' the more recent values higher, so a weighted average but one where the older events are given less significance (Google 'EWMA')
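
As a rough illustration of the EWMA idea, here is a minimal sketch using the simple recursive form, where each new smoothed value is a blend of the newest observation and the previous smoothed value. The function name `ewma`, the `alpha` value, and the temperature numbers are made up; library implementations may initialise or normalise the weights differently.

```python
def ewma(values, alpha):
    """Exponentially weighted moving average (recursive form).

    alpha is in (0, 1]; a higher alpha reacts faster to recent values.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    for x in values:
        if not smoothed:
            smoothed.append(x)  # seed with the first observation
        else:
            smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical readings with a jump from 10 to 20.
temps = [10.0, 10.0, 10.0, 20.0, 20.0]
print(ewma(temps, alpha=0.5))
```

After the jump the smoothed series moves towards 20 step by step, with each older reading counting half as much as the one after it.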

bronze wyvern Sep 28, 2025, 1:11 PM

#

yepp, I see, question though, when we plot the graph of moving average, since we are using multiple years, how do we choose which year correspond to the computed average?

#

I mean, we just take the middle year?

left tartan Sep 28, 2025, 1:17 PM

#

bronze wyvern I mean, we just take the middle year?

No, the moving average is for (as of) the last date.

#

For example, a 3 day moving average for past three days would be wed-fri for Friday, thur to sat for sat, and fri-sun for Sunday
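
The labelling convention in this example (each average is plotted at the last date of its window) can be sketched as below. Some texts instead use a centered window and label the average with the middle point, which is probably the "middle year" convention asked about; this sketch only covers the trailing version. The function name and the daily values are hypothetical.

```python
def trailing_moving_average(points, window):
    """points is a list of (label, value) pairs in time order.

    Returns (label, mean) pairs where the label is the LAST point of each window.
    """
    result = []
    for end in range(window - 1, len(points)):
        chunk = points[end - window + 1 : end + 1]
        mean = sum(value for _, value in chunk) / window
        result.append((points[end][0], mean))
    return result


# Made-up daily values.
daily = [("Wed", 3.0), ("Thu", 6.0), ("Fri", 9.0), ("Sat", 12.0), ("Sun", 15.0)]
for day, avg in trailing_moving_average(daily, window=3):
    print(day, avg)
```

The first two days get no value because a full 3-day window is not available yet, which is why a moving-average line on a chart usually starts a little later than the raw data.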

bronze wyvern Sep 28, 2025, 1:20 PM

#

yepp I see

#

Thanks !

frosty mountain Sep 28, 2025, 1:29 PM

#

how do you mask out noise before calculating the silhouette score for DBSCAN?

vague heron Sep 28, 2025, 3:10 PM

#

steel jetty does anyone know a good website where i can download datasets? Im working on a h...

https://www.openml.org/search?type=data&sort=runs&status=active

calm cipher Sep 28, 2025, 3:25 PM

#

steel jetty I honestly don’t know

I don't know about your professor but I've noticed Kaggle has had more and more low-quality synthetic datasets that don't make for good analysis projects recently

#

some of them disclose they're synthetic, but a lot of them don't, and you won't really know anything is wrong until you try to do anything useful with the data

calm cipher Sep 28, 2025, 3:32 PM

#

steel jetty does anyone know a good website where i can download datasets? Im working on a h...

also I didn't see anyone post the UCI Machine Learning Repository, it's a little older but I think they do more vetting of the datasets https://archive.ics.uci.edu/

UCI Machine Learning Repository

Discover datasets around the world!

cursive chasm Sep 28, 2025, 4:56 PM

#

#

is he on the right path?

agile cobalt Sep 28, 2025, 5:37 PM

#

cursive chasm

did you open that course to check its description before asking?
It has that on the linked page

#

and yes, I'd also recommend that course

autumn perch Sep 28, 2025, 6:48 PM

#

I built Data-Cent because I often need to explore CSVs quickly without firing up pandas or writing custom code. It’s a Streamlit-based web app where you can:
• Upload CSV files (no setup)
• Auto-filter and explore the data
• Create interactive charts (line, bar, scatter, etc.)
• Run quick stats (mean, median, std)
• Download a PDF/HTML report of your analysis
Live demo: https://data-cent.streamlit.app/
Source: https://github.com/data-centt/Data-Analytics

Would love feedback on performance and UI/UX — especially from folks who explore data often or build Streamlit apps.

If you find it interesting please help me star the repo. TY

Streamlit

data cent

Data-Cent is an interactive data visualization and management app built using Streamlit. It allow...

GitHub

GitHub - data-centt/Data-Analytics

Contribute to data-centt/Data-Analytics development by creating an account on GitHub.

crude hedge Sep 28, 2025, 8:58 PM

#

bronze wyvern Hello, can someone explain what moving average is, how is it calculated and how ...

it just means you always calculate average of x time periods (e.g. 5 minutes) while you always get new points

#

Normal average is a calculation which is not getting updated

pine prism Sep 28, 2025, 10:02 PM

#

why did no one tell me how hard ML actually is i thought i could do this without other libraries like pytorch or numpy

#

im starting to get it now but what was i thinking last night where i could make a machine learning project by myself

#

also using cursor code editor

serene scaffold Sep 28, 2025, 10:14 PM

#

pine prism why did no one tell me how hard ML actually is i thought i could do this without...

ML jobs pay well precisely because they're difficult to train for

iron basalt Sep 28, 2025, 10:56 PM

#

pine prism im starting to get it now but what was i thinking last night where i could make ...

You can, but you need to be pretty comfortable with programming new ideas from scratch in general to do that. And that is a skill most acquire over a decade or so.

#

That is in addition to the mathematical knowledge needed and then specific ML knowledge.

#

Cursor can't do that for you, it will only accelerate you if you already know what you are doing (almost all time is spent debugging, and you can't do that without understanding it all).

pine prism Sep 29, 2025, 12:42 AM

#

serene scaffold ML jobs pay well precisely because they're difficult to train for

yeah, i hope they do because so far most of my time has been spent making prototypes simulations without RL/ML not to mention the notetaking and document reading but i'm slowly understanding this more because the formulas are surprisingly easy to read
the hard part so far which is what i didnt expect to be hard is make the program make decisions on its own first before adding the machine learning aspect to it but other than that im making decently good progress in numpy its just a matter of can i understand pytorch libraries and documents

#

my first project is training a ant colony to maintain a good healthy state over time by making good decisions

serene scaffold Sep 29, 2025, 12:47 AM

#

pine prism my first project is training a ant colony to maintain a good healthy state over ...

let me know how that goes

pine prism Sep 29, 2025, 12:47 AM

#

serene scaffold let me know how that goes

i'll let you know how it goes by the end of the week because i still have to go to school unfortunately

gusty iris Sep 29, 2025, 1:14 AM

#

i just asked Claude Opus to generate ML code and bro costed $2.5 for the single prompt

serene scaffold Sep 29, 2025, 1:26 AM

#

gusty iris i just asked Claude Opus to generate ML code and bro costed $2.5 for the single ...

how many tokens were the input and the output?

gusty iris Sep 29, 2025, 1:40 AM

#

500 input and around 6000 output

spice tartan Sep 29, 2025, 2:10 AM

#

Anyone knows a way to make f strings format as normal string in newer versions of jupyterlab?

#

I don't like that colour

agile cobalt Sep 29, 2025, 1:26 PM

#

gusty iris 500 input and around 6000 output

!e unless you meant R$2.5 or some other currency that sounds off
```py
from decimal import Decimal
input_cost = Decimal("15") / Decimal("1_000_000")  # USD per input token ($15 per million tokens)
output_cost = Decimal("75") / Decimal("1_000_000")  # USD per output token ($75 per million tokens)
cost = 500 * input_cost + 6000 * output_cost
print(cost)
```

arctic wedgeBOT Sep 29, 2025, 1:26 PM

#

agile cobalt !e unless you meant R$2.5 or some other currency that sounds off```py from decim...

:white_check_mark: Your 3.13 eval job has completed with return code 0.

0.457500

agile cobalt Sep 29, 2025, 1:26 PM

#

but yeah Claude is ridiculously expensive

agile cobalt Sep 29, 2025, 1:28 PM

#

spice tartan Anyone knows a way to make f strings format as normal string in newer versions o...

you should be able to change the Theme under Settings, not sure if you can change that specifically or if you would need to create a new theme and modify it though

pallid badge Sep 29, 2025, 4:12 PM

#

Hi
Could I ask for some input, please?
How could one develop an AI tool that shows me gaps or trends, for example with cooking recipes? Let's assume there are databases with a public API and some with no API (this would mean web scraping).
Now I would like to aggregate data in a structured way; I could query the databases (or maybe later use web scraping).
But then what is next? Maybe I want to find a trend in pasta recipes: are some ingredients currently more popular than others?
My question is: if I have the data, would I first need to define rules for when something is popular, missing, or trending? Am I right?

agile cobalt Sep 29, 2025, 4:27 PM

#

you'd need some structured way of determining what each recipe covers, then you can create some simple models to identify what "normal" looks like for each ingredient and look for outliers (values significantly above or under the normal)
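
One very simple version of "model what normal looks like, then look for outliers" is a z-score per ingredient: compare the latest count against the mean and standard deviation of past counts. The sketch below assumes the recipes have already been turned into weekly counts per ingredient; all the numbers, the ingredient names, and the threshold of 3 are invented for illustration.

```python
from statistics import mean, stdev


def trend_scores(history, latest):
    """Return a z-score per ingredient: how unusual the latest count is vs. its past."""
    scores = {}
    for ingredient, past_counts in history.items():
        mu = mean(past_counts)
        sigma = stdev(past_counts)
        if sigma == 0:
            continue  # no variation in the past, so a z-score is undefined
        scores[ingredient] = (latest[ingredient] - mu) / sigma
    return scores


# Hypothetical number of new pasta recipes per week that mention each ingredient.
history = {
    "basil": [40, 42, 38, 41, 39],
    "pistachio": [5, 6, 4, 5, 5],
    "cream": [30, 29, 31, 30, 30],
}
latest = {"basil": 41, "pistachio": 15, "cream": 22}

scores = trend_scores(history, latest)
trending = [name for name, z in scores.items() if z > 3]
fading = [name for name, z in scores.items() if z < -3]
print("trending:", trending, "fading:", fading)
```

No label is needed for this: the "rule" for what counts as trending is just the threshold on how far a value sits from its own history.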

pallid badge Sep 29, 2025, 4:43 PM

#

Hi etrotta, thank you for the reply. For example, cooking utensils, number of ingredients, preparation time, type of ingredients maybe?
When I thought about it, I arrived at the conclusion that I would need to structure my data.

#

My introductions to ML often showed me the Iris dataset: several properties, and finally a label for y. Based on those properties it was possible to classify the flowers.
But with the recipes, the story is different? I don't have this "y" parameter.

agile cobalt Sep 29, 2025, 5:00 PM

#

there are a lot of different 'tasks', I'd guess that most of what you have seen falls under supervised learning like regression and classification, but there are also a lot of techniques for unsupervised learning, in which you don't have clear labels

take a look at https://scikit-learn.org/stable/unsupervised_learning.html - especially https://scikit-learn.org/stable/modules/outlier_detection.html

scikit-learn

2. Unsupervised learning

Gaussian mixture models- Gaussian Mixture, Variational Bayesian Gaussian Mixture., Manifold learning- Introduction, Isomap, Locally Linear Embedding, Modified Locally Linear Embedding, Hessian Eige...

scikit-learn

2.7. Novelty and Outlier Detection

Many applications require being able to decide whether a new observation belongs to the same distribution as existing observations (it is an inlier), or should be considered as different (it is an ...

rich walrus Sep 29, 2025, 7:08 PM

#

a clause is a database query that is a command to get something out of a database?

eager lance Sep 29, 2025, 7:18 PM

#

is brocode's pandas 1h video solid?

slender crown Sep 29, 2025, 7:33 PM

#

Guys i got a question, i'm currently 15 and interested in ML. I know the math behind ML algos, Neural Networks and more. And working on personal projects. And i'm using Python for that, but got a question. On university, are they only going to teach math behind this? Or also teach libraries like Pytorch? Also if i'm graduated from university, is it easy to find a job in this field?

waxen kindle Sep 29, 2025, 7:34 PM

#

It depends what courses you follow, but you will probably learn how to use the libraries too

#

Any decent DL course will explain how to use pytorch or tensorflow

#

also bc practicing things is part of the learning process

slender crown Sep 29, 2025, 7:36 PM

#

waxen kindle also bc practicing things is part of the learning process

I'm doing this like 2 years

#

Wanted to start 4 years ago but my math couldn't handle it.

waxen kindle Sep 29, 2025, 7:37 PM

#

For the job, it's hard to answer, as it's hard to predict what the job market will be once you graduate

#

in like, 6-7 years....

#

Today I would not call it easy, because you need to have good grades and show a strong interest, but the job market is (for now) quite open in this field, at least it's what I feel, where I live. That will depend on where you live too

slender crown Sep 29, 2025, 7:41 PM

#

waxen kindle Today I would not call it easy, because you need to have good grades and show a ...

I'm interested in math and programming but the university exam in my country is a bit hard. I can speak english like a usual person does. Also thinking to go abroad. (Sorry if took a bit long to write)

waxen kindle Sep 29, 2025, 7:43 PM

#

If you think you can't pass the exams in your country, idk what to suggest. Believe in yourself, if you are interested enough, and know how to study right, you'll get it !

#

If it doesn't work for you, studying abroad is also a great opportunity

#

there is a lot of pros and cons for all decisions, at the end it's for you to make them

#

If the exams are hard and you succeed where most people fail, you won't have any issue finding a job

agile cobalt Sep 29, 2025, 8:00 PM

#

eager lance is brocode's pandas 1h video solid?

I'd strongly recommend reading the official User Guide above anything else

eager lance Sep 29, 2025, 8:01 PM

#

is brocode's pandas 1h video solid?

agile cobalt Sep 29, 2025, 8:07 PM

#

seems mid
see https://pandas.pydata.org/docs/user_guide/index.html instead

bronze wyvern Sep 29, 2025, 8:17 PM

#

Hello, can someone explain how image processing works in general pls.

I need to answer these questions using Pillow in python:

b.    Swap Red and Blue → how does the image change?
c.    Extract the Green channel and compute its average value.
d.    Convert image to grayscale by averaging R, G, B.
e.    Image cropping – cut out the center 100×100 region.
f.    Blurring – apply Gaussian blur.

But I first wanted to understand the theoretical aspect of how images are processed. I know that images are sequences of bits and are made up using multi-dimensional matrix/vectors.

I know we need to use libraries like numpy so that we can upload the image to be processed.

First question, when we upload the image into that array, do we have pixels to work with?

I know images are made of 3 colors, RGB, how do they work?
Like if I need to swap red with blue, what's the idea behind that, convey all bits holding blue into red?

The colors have an average value, what does that mean pls

spring field Sep 29, 2025, 8:34 PM

#

bronze wyvern Hello, can someone explain how image processing works in general pls. I need to...

> First question, when we upload the image into that array, do we have pixels to work with?

Basically yes, you get either a 2D (mapped/palettized values or grayscale) or a 3D (RGB(A) values) array where the innermost dimension typically represents a particular pixel's color

> I know images are made of 3 colors, RGB, how do they work?

You can think of them as color components: you have a bit of red, a bit of green, a bit of blue, and when you mix them together you get a new color (the value of a component tells you how much it contributes to the resulting color)

> Like if I need to swap red with blue, what's the idea behind that, convey all bits holding blue into red?

with swapping you'd essentially write the original values of the red channel into the blue channel and the original values of the blue channel into the red channel, as in, overwrite each channel with the original values of the channel you're swapping with
if you work with an array interface, you'd essentially just extract all values of a particular color channel and then inject them into the other color channel, though there might be a method with pillow that already abstracts this away from you

> The colors have an average value, what does that mean pls

In the context of grayscale, you take any single pixel and calculate the average value of its 3 components (RGB), just an arithmetic mean. For example, if the pixel's value is [128, 64, 120], you get (128 + 64 + 120) / 3 = 104, so you just replace the pixel's value with [104, 104, 104]
in the context of blurring, you take the average of each color channel over all pixels in a certain area around your center pixel and then replace the center pixel's value for that channel with that average, repeating this for every pixel in the image (and this is a weighted average in the case of something like a gaussian blur)
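To make the ideas above concrete, here is a minimal sketch that spells out the assignment's operations on plain nested lists, so nothing is hidden behind a library. The 4x4 image is made up for illustration, the crop is 2x2 instead of 100x100, and the blur is a plain box blur rather than a true Gaussian (a Gaussian would weight the center pixel more heavily).

```python
# A tiny 4x4 "image": a list of rows, each row a list of [R, G, B] pixels (0-255).
# The pixel values are invented purely for illustration.
image = [[[(x * 60) % 256, (y * 60) % 256, 128] for x in range(4)] for y in range(4)]

# b. Swap red and blue: every pixel [r, g, b] becomes [b, g, r].
swapped = [[[b, g, r] for r, g, b in row] for row in image]

# c. Green channel (a 2D grid of single values) and its average.
green = [[g for _, g, _ in row] for row in image]
green_values = [g for row in green for g in row]
green_mean = sum(green_values) / len(green_values)

# d. Grayscale: replace each pixel by the mean of its three components.
gray = [[(r + g + b) // 3 for r, g, b in row] for row in image]

# e. Center crop: keep the middle 2x2 block of the 4x4 image.
size = 2
top = (len(image) - size) // 2
left = (len(image[0]) - size) // 2
center = [row[left:left + size] for row in image[top:top + size]]


# f. Blur (box blur for simplicity): each output value is the mean of the
#    3x3 neighbourhood around it, clipped at the image borders.
def box_blur(channel):
    h, w = len(channel), len(channel[0])
    out = []
    for y in range(h):
        out_row = []
        for x in range(w):
            neighbours = [
                channel[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
            ]
            out_row.append(sum(neighbours) / len(neighbours))
        out.append(out_row)
    return out


blurred_gray = box_blur(gray)

print(swapped[0][1], green_mean, gray[0][0], center[0][0])
```

With Pillow the same steps would normally go through `Image.split()` / `Image.merge()`, `Image.crop()` and `ImageFilter.GaussianBlur`. Note that `Image.convert("L")` uses a weighted formula rather than the plain average, so it will not match step d exactly.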

spice tartan Sep 29, 2025, 9:02 PM

#

eager lance is brocode's pandas 1h video solid?

Could be

#

Enough to get the ball rolling

#

There's Udemy courses for more in-depth

#

Or more on youtube

eager lance Sep 29, 2025, 11:04 PM

#

spice tartan Or more on youtube

what do u think he's missing?

eager lance Sep 29, 2025, 11:04 PM

#

agile cobalt seems mid see https://pandas.pydata.org/docs/user_guide/index.html instead

ty

spice tartan Sep 29, 2025, 11:07 PM

#

eager lance what do u think he's missing?

There's more in pandas than this

#

And just watching a couple of minutes of these won't help u know about all the other cool functions that exist in pandas

wheat umbra Sep 30, 2025, 8:06 AM

#

What is the common practice for pushing data-manipulation jupyter notebooks to github? Do you just push it as is or do you convert it to a python script first ? I have had some weird problems when pulling an .ipynb from a github repo.

grand minnow Sep 30, 2025, 8:42 AM

#

wheat umbra What is the common practice for pushing data-manipulation jupyter notebooks to g...

just push them as is

#

what kind of problem are you getting when pulling one?

wheat umbra Sep 30, 2025, 8:44 AM

#

Sometimes im having issues with the cells loading. Some take very long to appear properly. Tried this with multiple IDEs.

grand minnow Sep 30, 2025, 8:50 AM

#

how complex is it? will the cells load if its something as simple as print("hello world")?

wheat umbra Sep 30, 2025, 9:21 AM

#

I'm basically working with local datasets via pandas and NumPy. Also in the github repository the cell outputs are cleared, so it does not automatically load the outputs when I pull the notebook.

left tartan Sep 30, 2025, 11:56 AM

#

wheat umbra I'm basically working with local datasets via pandas an NumPy. Also in the githu...

What is in the cell? 'Long to appear properly' could be any number of things.

wheat umbra Sep 30, 2025, 12:02 PM

#

I'm loading a dataset from a json file, normalizing it, and building a relational schema. In another cell I'm using the featuretools library to extract custom features via dfs from my relational data schema. By "long to appear properly" I mean that in VSCode, for example, the notebook is completely blank after pulling from the repo and the cells only appear one by one, very slowly. Similar to when you're loading a web page with a really bad internet connection. That's why I was wondering if it's even common practice to push jupyter notebooks to github instead of converting them to a python script, which fixes all these issues with a loss of control ofc.

agile cobalt Sep 30, 2025, 12:10 PM

#

wheat umbra im loading a dataset from a json file, normalizing it, and building a relationa...

it varies, some projects would rather keep the outputs so that users can preview them without running anything, others just clear the outputs, others always convert to Python

some alternatives to Jupyter (namely marimo) use .py files with slightly custom syntax (e.g. decorators or comments) instead of json-based files

left tartan Sep 30, 2025, 12:12 PM

#

wheat umbra im loading a dataset from a json file, normalizing it, and building a relationa...

For me, I 'strip' notebooks before committing them to GitHub. My repo is just the notebook code, because I can regenerate what I need fairly cheaply. Anything 'expensive' gets saved separately, ie: to a parquet file or a model file

#

I use a pre-commit hook to do this, so I don't forget
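For anyone wondering what "stripping" amounts to: an .ipynb file is JSON, so a rough sketch of the idea needs only the standard library. The `strip_outputs` helper and the tiny hand-written notebook below are hypothetical and only for illustration; this is not necessarily what the hook mentioned above runs, and dedicated tools such as nbstripout handle more edge cases.

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of a notebook dict with code-cell outputs removed."""
    stripped = json.loads(json.dumps(notebook))  # cheap deep copy via JSON
    for cell in stripped.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return stripped


# Tiny hand-written notebook in the nbformat 4 layout, for illustration only.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {},
            "source": ["print('hello world')"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["hello world\n"]}
            ],
        },
        {"cell_type": "markdown", "metadata": {}, "source": ["# Notes"]},
    ],
}

clean = strip_outputs(notebook)
print(json.dumps(clean["cells"][0], indent=1))
```

In practice you would read the file with `json.load`, strip it, and write it back before committing; anything expensive to recompute still belongs in a separate file, as described above.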

wheat umbra Sep 30, 2025, 12:16 PM

#

yeah in my case the usage of this is more like a script. Im modifying a json file, building a new dataframe with more complex features and then saving the "clean" data to a parquet file. The actual use-case for it would be to just run it once on a raw-data-lake to convert it into a clean dataset. The notebook format is more like a debugging thing to inspect dataframes etc.. so i guess im going to convert this into a python script for the final version of my project.

left tartan Sep 30, 2025, 12:18 PM

#

I'd just be curious what step is slow though, it could be that it's loading js assets for rendering/etc. Are you opening it in Jupyter directly or via vscode? I usually open my notebooks in vscode

wheat umbra Sep 30, 2025, 12:18 PM

#

I open it in vscode

lapis sequoia Sep 30, 2025, 12:22 PM

#

I couldn't find 1 GPU of H100/A100 on AWS, only the 8 GPUs of it so does anyone know an alternative I could use?

serene scaffold Sep 30, 2025, 1:32 PM

#

lapis sequoia I couldn't find 1 GPU of H100/A100 on AWS, only the 8 GPUs of it so does anyone ...

how much VRAM do you need?

eager lance Sep 30, 2025, 3:29 PM

#

spice tartan There's more in pandas than this

what about kaggle's course on pandas?

#

or what about this? https://github.com/Asabeneh/30-Days-Of-Python/blob/master/25_Day_Pandas/25_pandas.md

GitHub

30-Days-Of-Python/25_Day_Pandas/25_pandas.md at master · Asabeneh/...

30 days of Python programming challenge is a step-by-step guide to learn the Python programming language in 30 days. This challenge may take more than100 days, follow your own pace. These videos m...

bronze wyvern Sep 30, 2025, 4:19 PM

#

spring field > First question, when we upload the image into that array, do we have pixels to...

yepp I see, thanks for the explanation !

One thing, how does "blurring" occur? I mean when we see a blurred picture, under the hood, we have multiple "center pixels" and the neighbouring pixels' intensities / values are decreased?

#

also one thing :c, this is more of a general question, why would we represent images using multidimensional arrays, like 2D arrays? What flexibility does it give us? Is it because of the row x col structure? (if so, what is special about that)?

night cove Sep 30, 2025, 5:07 PM

#

Hey everyone, I need some help with running an older ML project called ECINN (Electrochemical-Inspired Neural Network).

I’ve been trying to run the example code (main.py) for Fe ion detection, but I keep running into compatibility issues with TensorFlow, Keras, and Python versions.

Here’s what I’ve tried so far:

Environment: WinPython 3.12.4.1
TensorFlow version: 2.20.0
Keras: the one bundled with TF
The codebase was originally written for TensorFlow 2.3.0 (2020 era).
On Windows, I keep hitting errors like:
- TypeError: unsupported format string passed to list.__format__ (fixed manually)
- ValueError: by_name only supports loading legacy '.h5'
- DLL load failures for TensorFlow on WinPython
- Pandas: "Invalid file path or buffer object type: <class 'list'>"

I even tried Colab, but it doesn’t support TF 2.3.0 anymore (only ≥2.16).

Question: What’s the best way to get ECINN running in 2025? Should I:

Use Docker with an old TF 2.3.0 image?
Patch the code fully for TF 2.20.0 (new Keras saving/loading API, etc.)?
Or is there a smarter way to emulate the old environment?

Ultimately, I just want to run the Fe ion example (ECINN-BV for Fe Ion on GCE) and get the trained weights + plots it should output.

Any advice or working setup instructions would be amazing 🙏

https://zenodo.org/records/10246052

Zenodo

nmerovingian/ECINN: Added comparison with conventional method

Comparisons with brute force finite difference fitting, Tafel region analysis and Randles-Sevcik equation are added.

agile cobalt Sep 30, 2025, 5:17 PM

#

docker with the original version is probably your best bet as far as compatibility goes

night cove Sep 30, 2025, 5:23 PM

#

I have tried everything else except that
i will do that and see if it works
if it doesn't I'll probably have to sit and make the whole thing again

vivid nimbus Sep 30, 2025, 7:55 PM

#

if anybody wants to work on modeling the economy within hypixel skyblock (minecraft), please dm me.

bronze wyvern Sep 30, 2025, 8:32 PM

#

Hi, can someone explain how pre-processing techniques like gaussian blur and grayscale make images reduce "noise" pls

vague heron Sep 30, 2025, 8:38 PM

#

Things like Gaussian blur filter/dampen out large variations (for example neighboring pixels with very different grayscale values)

#

These large variations often relate to noise, but of course some of it is part of the image so it comes out blurry if you use it too aggressively.

iron basalt Sep 30, 2025, 8:40 PM

#

bronze wyvern Hi, can someone explain how pre-processing techniques like gaussian blur and gra...

High frequency details are lost during blur.
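A small sketch of that effect, assuming a single row of a flat grey image with made-up random noise. The 3-tap kernel [1, 2, 1] / 4 stands in for a real Gaussian kernel, and the `smooth` helper is hypothetical.

```python
import random
import statistics

rng = random.Random(0)

# One row of a flat grey image (true value 120) with random noise added.
noisy = [120.0 + rng.gauss(0.0, 20.0) for _ in range(200)]


def smooth(row):
    """Weighted average of each pixel with its two neighbours, weights [1, 2, 1] / 4."""
    last = len(row) - 1
    return [
        (row[max(i - 1, 0)] + 2.0 * row[i] + row[min(i + 1, last)]) / 4.0
        for i in range(len(row))
    ]


smoothed = smooth(noisy)

# The pixel-to-pixel spread shrinks after smoothing: fast variations are dampened.
print(statistics.pstdev(noisy), statistics.pstdev(smoothed))
```

The same averaging that dampens noise also softens real edges, which is why heavy blurring costs detail.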

spice tartan Sep 30, 2025, 9:11 PM

#

wheat umbra I'm basically working with local datasets via pandas an NumPy. Also in the githu...

I think you can change that

#

There's some extension or something I remember in GitHub that make notebooks look cleaner and shows output clearly with diffs too.

lean seal Oct 1, 2025, 4:12 AM

#

yo anyone learned bayesian networks Probabilistic graphical modelling

#

I am about to do research with my professor about it and I started to learn a bit, but I feel like I am not yet comfortable with the math side of it. The probability side of it is just so weird. Any advice?

vague heron Oct 1, 2025, 4:13 AM

#

I hear there is a good course on Coursera about it

lean seal Oct 1, 2025, 4:13 AM

#

yea the stanford one

#

i am a sophomore and it's a graduate course...

vague heron Oct 1, 2025, 4:14 AM

#

https://www.coursera.org/specializations/probabilistic-graphical-models

Coursera

Probabilistic Graphical Models

Offered by Stanford University. Probabilistic Graphical Models. Master a new way of reasoning and learning in complex domains Enroll for free.

lean seal Oct 1, 2025, 4:15 AM

#

yea this is the one i am doing right now

#

when it comes to the tests i fail it

#

i feel like it doesn't help much and it just expects us to already be familiar with it

vague heron Oct 1, 2025, 4:17 AM

#

I see 🙁

wooden hill Oct 1, 2025, 4:32 AM

#

Good evening fellas

viscid urchin Oct 1, 2025, 6:01 AM

#

This looks pretty good at first glance https://mmids-textbook.github.io/

bronze wyvern Oct 1, 2025, 10:03 AM

#

spring field > First question, when we upload the image into that array, do we have pixels to...

hi, quick question, say I loaded a RGB image, I only show the Red channel, when I open the copy of the image, the image is now white/blackish, is there a reason for that pls

spring field Oct 1, 2025, 10:18 AM

#

bronze wyvern hi, quick question, say I loaded a RGB image, I only show the Red channel, when ...

well, apparently you just essentially got rid of the other color channels (green, blue), but you should have set them to 0 instead if what you wanted to see was like a very red image

bronze wyvern Oct 1, 2025, 10:19 AM

#

spring field well, apparently you just essentially got rid of the other color channels (green...

oh ok I see, if I only use the red channel (which I did), why our image becomes kind of grayscale? is there a reason for that pls

spring field Oct 1, 2025, 10:23 AM

#

well, that's similar to taking the average of all channels to convert it to grayscale(ish), but you only used the value of one of the color channels (red in this case)

#

like you went from some pixel value like [128, 64, 120] to [128, 128, 128] instead of [128, 0, 0] (or [128, 255, 255])
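A quick way to see this with Pillow, assuming the channel was obtained with `Image.split()`. The 2x2 test image filled with the example pixel [128, 64, 120] is made up for illustration.

```python
from PIL import Image

# Made-up 2x2 RGB image where every pixel is (128, 64, 120).
img = Image.new("RGB", (2, 2), color=(128, 64, 120))

r, g, b = img.split()

# Each band is a single-channel ("L") image: one number per pixel, so viewers
# draw it in shades of grey.
print(r.mode)
print(r.getpixel((0, 0)))

# To see "only red" in colour, keep the red band and fill the others with 0.
zero = Image.new("L", img.size, 0)
red_only = Image.merge("RGB", (r, zero, zero))
print(red_only.getpixel((0, 0)))
```

So `r` on its own holds just the red intensities, and it is the viewer that renders a one-channel image as grey.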

bronze wyvern Oct 1, 2025, 10:25 AM

#

when I only use r, like consider this:


```py
r,g,b = cat_img.split()
```

Normally, we have 3 instances of an image? Would r, g and b each have 3 channels? Does displaying r show the average intensity across the channels, like [128,128,128]?

vague heron Oct 2, 2025, 3:55 AM

#

You can check this by comparing the first few pixels of the original with your modified 'red' one. Then it will become clear what happens.

mellow vector Oct 2, 2025, 6:44 AM

#

Can someone say how I'd collect column headers from polars? atm I'm using column_list = list(headers_lf.collect_schema().names())

#

collect_schema is throwing a bunch of warnings at me

agile cobalt Oct 2, 2025, 2:53 PM

#

mellow vector collect_schema is throwing a bunch of warnings at me

show which warnings

it's somewhat discouraged overall as it can be expensive depending on your query though, i.e. if you can avoid it just use expressions/selectors instead

pallid badge Oct 2, 2025, 5:09 PM

#

Could I ask you if you heard about SPECTER2? https://huggingface.co/allenai/specter2
To the best of my understanding, this is an encoder for scientific text. Are there maybe better ones out there?
I would like to fine-tune this on a scientific domain.
What I have not yet fully understood: do I couple this with an LLM like Llama so that I can somehow query my embeddings with its help?
And how would I couple this with scikit-learn functionality, e.g. clustering?

allenai/specter2 · Hugging Face

twilit geode Oct 2, 2025, 6:14 PM

#

I know, I know this is a Python-based discussion, but is there a server to discuss how to get into AI? Besides just YouTubing it and being adrift on what the right and wrong approach is? Tutorial hell. T-T, is there a better place to ask this? JUST to start, because idk anywhere to start with it, and I guess learn to use it, not make stuff I guess... yet.

calm cipher Oct 2, 2025, 6:34 PM

#

twilit geode I know, i know this is python based discussioons but is there a server to discus...

lots of resources in the pinned messages for this channel, in general things go better here if you come with a specific question about a problem you're having

spice mason Oct 2, 2025, 7:27 PM

#

hi

fallen basin Oct 3, 2025, 12:18 AM

#

spice mason hi

hi

warm flame Oct 3, 2025, 1:32 AM

#

heeey

lost beacon Oct 3, 2025, 6:14 AM

#

Hi Hugging face transformers mutex.cc lock error . Has anyone faced this ?

twilit geode Oct 3, 2025, 2:21 PM

#

I’m still absolute beginner & still learning Python, my dad was like learn ai. Which again broad af. But I did like the concept of data analyst when I picked this up years ago, just dunno how I could use ai to help with that.

spring field Oct 3, 2025, 3:24 PM

#

bronze wyvern when I only use r, like consider this: ```py r,g,b = cat_img.split() ``` Norma...

no, RGB are the 3 channels

mellow vector Oct 3, 2025, 4:26 PM

#

agile cobalt show which warnings it's somewhat discouraged overall as it can be expensive de...

I opted to use readline(), though I have no idea how it compares, I read that it's lazy.

agile cobalt Oct 3, 2025, 4:48 PM

#

mellow vector I opted to use `readline()`, though I have no idea how it compares, I read that ...

collect_schema may or may not need to load some data and execute parts of the query depending on the query

for example,
```py
import polars as pl

lz = pl.LazyFrame({'x': [1, 2, 3]})
unknown_schema = pl.col('x').map_elements(print)
informed_schema = pl.col('x').map_elements(print, return_dtype=pl.Int64)

# No need to run any part of the query (the schema can be determined without running the query itself)
lz.select(informed_schema).collect_schema()
# Schema({'x': Int64})

# It needs to run part of the query to know what the final schema will be (unknown return_dtype for map_elements)
lz.select(unknown_schema).collect_schema()
# 1
# 1
# Schema({'x': Int64})
```

(not sure why it is printing 1 twice though)

mellow vector Oct 3, 2025, 5:07 PM

#

agile cobalt `collect_schema` may or may not need to load some data and execute parts of the ...

My aim was to cast everything to float64 by looping over the headers.

#

It worked with collect_schema().names() which now that I think about it, worked when inferring the datatypes failed before, so it must not be terribly expensive (vertically)

bronze wyvern Oct 3, 2025, 5:12 PM

#

spring field no, RGB are the 3 channels

yeah I see, when I display only the image with channel r, does this mean channels g and b have a bunch of 0s? I didn't understand why we get the gray image and not the red one though, what is the maths here pls

crisp edge Oct 3, 2025, 5:26 PM

#

Chat, I'm fed up with web development, it's boring just designing and making it real. The things I have learned, like Python, give me an edge for machine learning and AI ..... So can anyone provide me a roadmap or structured plan on how to become an ML engineer and land a job at MAANG companies????

serene scaffold Oct 3, 2025, 5:52 PM

#

crisp edge Chat, I'm fed up of web development it's boring just designing and making it rea...

jobs in AI development require a lot of specialized training for you to be valuable to a company. You would probably need to go back to university to get a masters degree in CS that's focused on AI.

crisp edge Oct 4, 2025, 3:54 AM

#

serene scaffold jobs in AI development require a lot of specialized training for you to be valua...

I'm still in uni doing bachelor's but focus is on core cs but yeahh I'm making a lot of projects like Netflix recommendation system and AI chatbots.

twilit geode Oct 4, 2025, 8:02 AM

#

Are there some video/course to help with general knowledge how to just get started?

small wedge Oct 4, 2025, 8:13 AM

#

twilit geode Are there some video/course to help with general knowledge how to just get start...

https://see.stanford.edu/Course/CS229
https://developers.google.com/machine-learning/crash-course

#

Assuming you want ml

#

If you're just looking for like the most basic surface level intro

#

I'd recommend the 3Blue1Brown series on neural networks

#

https://youtu.be/aircAruvnKk?si=S1_3aGf0pxhbd3V9

astral sun Oct 4, 2025, 11:30 AM

#

I have a project name NCl or can be called: SSC 🙂 I'm working on its parser/lexer:

📎 SSC.py

arctic wedgeBOT Oct 4, 2025, 11:30 AM

#

astral sun I have a project name NCl or can be called: SSC 🙂 I'm working on its parser/lex...

There was an error uploading your paste.

mellow vector Oct 4, 2025, 1:40 PM

#

So I'm at a fork, altair or plotly? Until now I've used mostly matplotlib and I hate it.

#

Tempted to just flip a coin.

#

1000 times (naturally)

thorny umbra Oct 4, 2025, 2:21 PM

#

Hello everyone, im a bs data science student, i just completed a 12 hours python course and learnt basic stuff and also did some basic projects as well. now i just want to ask what should be the next thing to work on for me. related to data science.

sweet verge Oct 4, 2025, 3:48 PM

#

PyTorch or TensorFlow?

serene scaffold Oct 4, 2025, 4:47 PM

#

sweet verge Pytorch or TenserFlow?

Pytorch.
Tensorflow is only used in outdated tutorials

sweet verge Oct 4, 2025, 4:49 PM

#

serene scaffold Pytorch. Tensorflow is only used in outdated tutorials

Thanks a lot!

#

I needed that..

agile cobalt Oct 4, 2025, 5:36 PM

#

mellow vector So I'm at a fork, altair or plotly? Until now I've used mostly matplotlib and I ...

if plotly express works with minimum configuration use it
otherwise (if it lacks in performance or customizability) consider altair

jovial ravine Oct 4, 2025, 6:28 PM

#

i wanna make a AI chatbot with python using torch library

serene scaffold Oct 4, 2025, 6:47 PM

#

jovial ravine i wanna make a AI chatbot with python using torch library

Don't start with a chatbot. Those are so challenging that you'll give up before making any progress

#

A classifier would be more approachable to start. By orders of magnitude

jovial ravine Oct 4, 2025, 7:35 PM

#

alright

lapis sequoia Oct 4, 2025, 7:45 PM

#

viscid urchin The base-level Comp. Sci. B.S. flow this uses is here: <https://www.cs.fsu.edu/f...

Is that Florida State?

viscid urchin Oct 4, 2025, 7:48 PM

#

lapis sequoia Is that Florida State?

Yep

lapis sequoia Oct 4, 2025, 7:49 PM

#

viscid urchin Yep

I thought they did it with FAU or something.

bronze wyvern Oct 4, 2025, 7:57 PM

#

Hi, can someone suggest where I can get an image data set containing at least six of the following office items pls:

chair, bin, mug, bottle, book, keyboard, mouse, stapler, notebook, phone

viscid urchin Oct 4, 2025, 7:57 PM

#

lapis sequoia I thought the did it with FAU or something.

There are a lot of overall engineering connections to FAMU, but this is the Math department basically so not here

lapis sequoia Oct 4, 2025, 7:58 PM

#

viscid urchin There are a lot of overall engineering connections to FAMU, but this is the Math...

I graduated from UCF, but does FSU still have the Actuary Department? I just remembered that.

viscid urchin Oct 4, 2025, 8:00 PM

#

lapis sequoia I graduated from UCF, but does FSU still have the Actuary Department? I just rem...

Looks like it! https://www.math.fsu.edu/~paris/actmath.math

Actuarial Science

Welcome to the Department of Mathematics. Our mission is to preserve, expand, and disseminate mathematical knowledge. Pursue a degree in the fields of Financial, Pure, Applied, Biomathematics, and Data Science.

#

Never run into it but cool.

viscid urchin Oct 4, 2025, 9:12 PM

#

I’m aiming to do their “Interdisciplinary Data Science”

merry oak Oct 5, 2025, 5:43 PM

#

!rule 6 | We're not a job board. Your message has been removed.

arctic wedgeBOT Oct 5, 2025, 5:43 PM

#

Rules

6. Do not post unapproved advertising.

lucid elbow Oct 5, 2025, 5:44 PM

#

suggest me things I can improve in this programs

📎 accounting.py

arctic wedgeBOT Oct 5, 2025, 5:44 PM

#

lucid elbow suggest me things I can improve in this programs

~~Please react with ✅ to upload your file(s) to our paste bin, which is more accessible for some users.~~

gritty vessel Oct 5, 2025, 6:26 PM

#

Hey I have a doubt: what's the difference between training a model for 200 epochs and training it for 100 epochs and then fine-tuning it with the same data for another 100 epochs?

serene scaffold Oct 5, 2025, 6:31 PM

#

gritty vessel Hey I have a doubt what's the difference between training model for 200 epochs...

those are equivalent

#

"fine tuning" is just "more training, possibly for a different task"

gritty vessel Oct 5, 2025, 6:32 PM

#

Got it

#

So like optimizer momentum, learning rate schedule will be lost right in case of fine tuning

#

As we are starting again ?

#

When compared to going for 200 epochs on one go or saving all this info while saving the best model

#

In these cases both will be same?

serene scaffold Oct 5, 2025, 7:52 PM

#

gritty vessel So like optimizer momentum, learning rate schedule will be lost right in case of...

yeah

serene scaffold Oct 5, 2025, 7:52 PM

#

gritty vessel When compared to going for 200 epochs on one go or saving all this info while sa...

yeah, I guess so

#

what I'm getting at is that "fine tuning" isn't a fundamentally different thing from training

waxen kindle Oct 5, 2025, 8:26 PM

#

Usually you finetune something that has been trained for a different task

#

Or with different data

waxen kindle Oct 5, 2025, 8:27 PM

#

gritty vessel Hey I have a doubt what's the difference between training model for 200 epochs...

This is equivalent, with very little difference if you reinitialize the optimizer and hyperparameters
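A toy sketch of why the optimizer state matters here, using hand-written gradient descent with momentum on f(w) = w². The `train` function and all numbers are made up for illustration; in PyTorch the analogous step would be saving and restoring the optimizer's `state_dict()` along with the model weights.

```python
def train(w, velocity, steps, lr=0.1, momentum=0.9):
    """Minimise f(w) = w**2 with momentum SGD; return the updated (w, velocity)."""
    for _ in range(steps):
        grad = 2.0 * w
        velocity = momentum * velocity - lr * grad
        w = w + velocity
    return w, velocity


# 200 steps in one go.
w_full, _ = train(5.0, 0.0, 200)

# 100 steps, save the weight AND the optimizer state, then resume for 100 more.
w_half, v_half = train(5.0, 0.0, 100)
w_resumed, _ = train(w_half, v_half, 100)

# 100 steps, then "fine-tune" with a fresh optimizer (momentum reset to zero).
w_reset, _ = train(w_half, 0.0, 100)

print(w_full, w_resumed, w_reset)
```

Resuming with the saved state reproduces the single 200-step run exactly, while the reset run lands somewhere slightly different. On this toy problem both end up close to the minimum, which matches the "very little difference" point above.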

opaque condor Oct 6, 2025, 12:20 AM

#

What would the architecture for a multi model look like?

serene scaffold Oct 6, 2025, 12:24 AM

#

opaque condor What would the architecture for a multi model look like?

what do you mean by multi model?

opaque condor Oct 6, 2025, 12:31 AM

#

A multiple model that can generate text and object detection

agile cobalt Oct 6, 2025, 1:12 AM

#

opaque condor A multiple model that can generate text and object detection

it varies, if by text you mean arbitrary LLM-like text messages, at one extreme you could have a ""normal"" multimodal llm trained to do object detection via tool calling, representing the detection as normal text formatted as JSON

another case could be having a shared base model, then one head that predicts the text and another head that predicts the bounding box for the object
(this second case making more sense for classification with fixed text labels)
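A rough sketch of the second option (shared base model, two heads), assuming PyTorch. The class name, layer sizes and number of labels are invented for illustration, and a real detector would use a much stronger backbone.

```python
import torch
from torch import nn


class TwoHeadDetector(nn.Module):
    """Shared backbone with one head for a fixed label set and one for a bounding box."""

    def __init__(self, num_labels: int = 10):
        super().__init__()
        # Tiny stand-in backbone: turns an image into a 16-dimensional feature vector.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.label_head = nn.Linear(16, num_labels)  # class logits
        self.box_head = nn.Linear(16, 4)             # e.g. x, y, width, height

    def forward(self, images):
        features = self.backbone(images)
        return self.label_head(features), self.box_head(features)


model = TwoHeadDetector(num_labels=10)
logits, boxes = model(torch.randn(2, 3, 32, 32))  # batch of 2 random "images"
print(logits.shape, boxes.shape)
```

Training such a model would typically sum a classification loss on `logits` and a regression loss on `boxes`, so both heads shape the shared features.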

autumn perch Oct 6, 2025, 7:03 AM

#

Hi Guys.

I built an automated data analysis tool using Python and it's open-source.

Check it out; https://github.com/data-centt/Data-Analytics

Open to contributions

GitHub

GitHub - data-centt/Data-Analytics

Contribute to data-centt/Data-Analytics development by creating an account on GitHub.

knotty dagger Oct 6, 2025, 12:42 PM

#

Hi, i am new to ML and from non-tech bg. I have a doubt. When working with outliers and resampling , do we work with the entire dataset or just training data

serene scaffold Oct 6, 2025, 12:44 PM

#

knotty dagger Hi, i am new to ML and from non-tech bg. I have a doubt. When working with outli...

there are probably different opinions about this, but I wouldn't remove outliers from the test set. the outliers are still part of the data and we need to be honest about what consequences that will have.
you can remove outliers from the training data to help the model train more easily.
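A minimal sketch of that workflow on made-up 1D data: split first, work out the outlier bounds from the training part only, filter the training part, and leave the test part alone. The 1.5 × IQR rule is just one common choice, assumed here for illustration.

```python
import random
import statistics

rng = random.Random(0)

# Made-up data: mostly values around 50, plus a few extreme ones.
values = [rng.gauss(50.0, 5.0) for _ in range(200)] + [500.0, -300.0, 450.0, 900.0]
rng.shuffle(values)

# Split BEFORE touching outliers.
split = int(0.8 * len(values))
train, test = values[:split], values[split:]

# Outlier bounds are computed from the training data only.
q1, _, q3 = statistics.quantiles(train, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

train_clean = [v for v in train if low <= v <= high]

# The test set stays exactly as it is, outliers included.
print(len(train), len(train_clean), len(test))
```

Computing the bounds on the training split alone also avoids leaking information from the test set into preprocessing.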

knotty dagger Oct 6, 2025, 1:12 PM

#

serene scaffold there are probably different opinions about this, but I wouldn't remove outliers...

Got it. Thank you

void cape Oct 6, 2025, 1:22 PM

#

Im planning to buy ISLR I only know python but should I buy the R or python version?
Some say to buy the R version while you build it in python so you can also pick up R comprehension along the way.

agile cobalt Oct 6, 2025, 1:30 PM

#

void cape Im planning to buy ISLR I only know python but should I buy the R or python vers...

seems like the Python version is more recent so I'd go with it

looks like both are available for free as PDF downloads on the official website though? so you can download both and check before purchasing

sly isle Oct 6, 2025, 2:59 PM

#

Does someone use GitHub Education?

serene scaffold Oct 6, 2025, 3:00 PM

#

sly isle Does someone use GitHub Education?

always ask your actual question. what would you ask someone who does?

sly isle Oct 6, 2025, 3:00 PM

#

serene scaffold always ask your actual question. what would you ask someone who does?

Do I have to use my university email address to gain access to GitHub Education? Is it only part of the verification step?

bronze wyvern Oct 6, 2025, 3:51 PM

#

Hello, quick question, I saw the word epoch quite frequently when we talk about training, what is that?

serene scaffold Oct 6, 2025, 3:54 PM

#

bronze wyvern Hello, quick question, I saw the word `epoch` quite frequently when we talk abou...

a full pass over the training data

#

usually when you train, you let the model train on each instance in the training set once.
every time you do that, that's an epoch

bronze wyvern Oct 6, 2025, 4:00 PM

#

oh ok, so let's imagine I have 1000 images. I need to train my model to classify those images, let's say between cats and dogs.

1 epoch means "looking" at the dataset only once? ML algorithm try to infer some features during that first pass but this 1 epoch might not be sufficient to deduce all underlying features, so we try to increase the number of epochs?

(But if more epochs means better accuracy, does that mean it should be as big as possible?)

serene scaffold Oct 6, 2025, 4:04 PM

#

bronze wyvern oh ok, so let's imagine I have 1000 images. I need to train my model to classify...

more epochs just means more training. that doesn't automatically translate to better performance.
are you familiar with loss?

bronze wyvern Oct 6, 2025, 4:05 PM

#

ah ok, loss, loss function? Yeah heard that term, I know we use backpropagation and gradient descent to minimize the loss

serene scaffold Oct 6, 2025, 4:06 PM

#

bronze wyvern ah ok, loss, loss function? Yeah heard that term, I know we use backpropagation ...

ideally, the average loss will decrease over each epoch. but eventually you'll get a diminishing rate of return, at which point additional epochs won't really make a difference.
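To make "epoch" and "average loss per epoch" concrete, here is a toy sketch: a one-variable linear model fitted with plain SGD on made-up data. Everything in it is invented for illustration.

```python
import random

rng = random.Random(0)

# Made-up training set: 20 points on the line y = 2x + 1.
data = [(x / 10.0, 2.0 * (x / 10.0) + 1.0) for x in range(20)]

w, b, lr = 0.0, 0.0, 0.05
epoch_losses = []

for epoch in range(30):
    rng.shuffle(data)
    total = 0.0
    for x, y in data:  # one epoch = one full pass over the training data
        error = (w * x + b) - y
        total += error ** 2
        # Gradient step on the squared error for this single example.
        w -= lr * 2 * error * x
        b -= lr * 2 * error
    epoch_losses.append(total / len(data))

print(epoch_losses[0], epoch_losses[-1])
```

Printing the full `epoch_losses` list shows the pattern described above: large drops in the first epochs, then smaller and smaller changes.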

bronze wyvern Oct 6, 2025, 4:07 PM

#

yeah exactly, at this point, we don't really need to do more training, we assume it's a compromise and that adding more epoch will just increase accuracy by only a very little amount?

serene scaffold Oct 6, 2025, 4:09 PM

#

bronze wyvern yeah exactly, at this point, we don't really need to do more training, we assume...

it's not a foregone conclusion that a lower loss translates directly to better performance. but in either case, if the loss is decreasing by a very small amount between epochs, that might not make a noticeable difference at all.

#

like, if your test set has 1,000 instances, a loss change of 0.00000001 probably won't influence the model's decision for any of those 1000.

bronze wyvern Oct 6, 2025, 4:10 PM

#

yeah I see

#

> a lower loss translates directly to better performance

like overfitting you mean?

serene scaffold Oct 6, 2025, 4:10 PM

#

bronze wyvern > a lower loss translates directly to better performance like overfitting you me...

you quoted that in a way where it sounds like I'm saying the opposite of what I said

#

is this how politicians feel?

#

anyway, if a model performs poorly despite gradually decreasing loss, that would mean that the model overfit to the training data.

bronze wyvern Oct 6, 2025, 4:13 PM

#

yep I see, thanks !

bronze wyvern Oct 6, 2025, 5:19 PM

#

Hello quick question. Say someone understood the basics of how ML/DL works, like the theoretical concept, but now this person needs to apply it. While the latter knows the concept, he still needs to implement that through code.

So my question is, what is a correct approach here? How does that person decide which library/framework to use?

Say we pick a library/framework. Now, in order to understand, for e.g, how to implement an RNN in tensorflow, we would expect tensorflow documentation to talk about that?

serene scaffold Oct 6, 2025, 5:22 PM

#

bronze wyvern Hello quick question. Say someone understood the basics of how ML/DL works, like...

Always pick pytorch over tensorflow. That part is easy.

You can usually look at code that implements similar architectures and figure it out from there

bronze wyvern Oct 6, 2025, 5:23 PM

#

alright noted, thanks !

#

by the way is there a reason why pytorch is preferred over tensorflow?

serene scaffold Oct 6, 2025, 5:23 PM

#

The community has coalesced around pytorch and no one uses tensorflow except the authors of outdated tutorials.

#

I've never seen a coworker use tensorflow a single time for anything

bronze wyvern Oct 6, 2025, 5:25 PM

#

yep noted, thanks !

bronze wyvern Oct 6, 2025, 7:56 PM

#

Hello, quick question, how do we know that a model we have trained is ready? Like it's not overfitted etc and we can actually use it to do real stuff?

agile cobalt Oct 6, 2025, 7:58 PM

#

depends on the task, for many you'll want to keep track of some metrics like its accuracy in addition to the loss, then stop training a bit after it stops improving

#

for some cases it could never become good enough to do 'real stuff' depending on what it is, or you could need to retrain it a few times using different data & hyperparameters configurations
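One common way to turn "stop training a bit after it stops improving" into code is early stopping with a patience counter. A minimal sketch, where the validation scores are invented and `early_stopping_epoch`, `patience`, and `min_delta` are illustrative names rather than anything from a specific library:

```python
def early_stopping_epoch(val_scores, patience=3, min_delta=0.0):
    """Return (best_epoch, stop_epoch) for a sequence of validation scores.

    Higher scores are better. Training stops once `patience` consecutive
    epochs fail to beat the best score by more than `min_delta`.
    """
    best_score = float("-inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, score in enumerate(val_scores):
        if score > best_score + min_delta:
            best_score, best_epoch = score, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return best_epoch, epoch
    return best_epoch, len(val_scores) - 1

# Hypothetical validation accuracy per epoch: improves, then plateaus.
val_accuracy = [0.61, 0.70, 0.74, 0.77, 0.78, 0.78, 0.77, 0.78, 0.76, 0.75]
best, stopped = early_stopping_epoch(val_accuracy, patience=3)
print(f"best epoch: {best}, stopped at epoch: {stopped}")
```

In a real training loop you would usually also save a checkpoint at the best epoch and restore it after stopping.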

bronze wyvern Oct 6, 2025, 7:59 PM

#

yep I see, question though

#

when I was doing a project for uni, the teacher said that we should split our data into 80% training and 20% testing I think. But I read recently that we have training, validation and testing sets

#

I'm confused, validation and testing set are different things?

agile cobalt Oct 6, 2025, 8:03 PM

#

with the 3 sets, you set aside some data that will only be used after your entire project is over - you never evaluate with it until right before you decide whether or not to put the model into production / publish your results

if you 'retrain it a few times using different data & hyperparameters configurations' too much, some configurations may be better on your test data by chance, similarly to over-fitting to the train data

the separation of test & validation data helps to avoid overoptimistic results which then fail in production
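A minimal sketch of such a three-way split using only the standard library. The 80/10/10 fractions and the name `three_way_split` are assumptions for illustration; the idea is that the validation part is used for tuning and model selection, while the test part is touched once at the very end.

```python
import random

def three_way_split(items, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle once, then cut into train / validation / test lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

samples = list(range(1000))  # stand-in for real examples
train, val, test = three_way_split(samples, val_frac=0.1, test_frac=0.1)
print(len(train), len(val), len(test))
```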

viscid urchin Oct 6, 2025, 8:03 PM

#

I see validation as a way to check on progress, and testing as a way to check outcomes, lemme know if anyone thinks that is crazy.

#

Maybe it means something different in the data science context.

bronze wyvern Oct 6, 2025, 8:10 PM

#

yeah I see, hmm I will read a bit on hyperparameter tuning and come back, but the validation data set is also unseen, no?

#

ah it's used indirectly with hyperparameters tuning?

#

with the test set, we don't do anything with that, no hyperparameters tuning etc?

agile cobalt Oct 6, 2025, 8:17 PM

#

I didn't specify which is which because I sometimes get confused and swap them derp

yeah, you only run the latter a single time after it's done training, no more tuning after you get your score on it, ideally no selecting which model to use based on it, just "this is your expected score with real data" after picking the final model

untold bloom Oct 6, 2025, 8:18 PM

#

course materials are the training set, past year exams are the validation set, to-be-taken exam is the test set

bronze wyvern Oct 6, 2025, 8:20 PM

#

yep I see, thanks !

unkempt wigeon Oct 7, 2025, 3:35 AM

#

https://youtu.be/UYq7KY90i4M?si=-PBWWJRVjjIrsjup

What would the code for this type of simulation look like

YouTube

cozmouz

I Tortured this AI Humanoid in a Simulation for 1000 Years


worldly dawn Oct 7, 2025, 5:43 AM

#

unkempt wigeon https://youtu.be/UYq7KY90i4M?si=-PBWWJRVjjIrsjup What would the code for this t...

they literally explain it in the video?
Is this an ad?

woven prairie Oct 7, 2025, 6:33 AM

#

Has anyone worked with RAG-based memory for an LLM?

#

Instead of maintaining the last 5-6 queries as conversation history, we can use a RAG-based approach for memory.

unkempt wigeon Oct 7, 2025, 12:13 PM

#

worldly dawn they literally explain it in the video? Is this an ad?

No but I have tried to understand and I'm confused

worldly dawn Oct 7, 2025, 12:38 PM

#

unkempt wigeon No but I have tried to understand and I'm confused

which part specifically?

unkempt wigeon Oct 7, 2025, 12:39 PM

#

The reward function and the agent itself

worldly dawn Oct 7, 2025, 12:40 PM

#

what about it?

unkempt wigeon Oct 7, 2025, 1:18 PM

#

The very small reward and how does the AI use the joints of the model

bronze wyvern Oct 7, 2025, 3:20 PM

#

Hello, quick question, why are histograms vital in image processing? For example say we are plotting frequency against pixel values, what can we infer?

If say we have different histograms with R,G,B colors, if we draw 3 bell curve on them, we can try to deduce the tendency which pixel is more dominant?

silk pendant Oct 7, 2025, 3:29 PM

#

import matplotlib.pyplot as plt
import numpy as np

plt.rcParams['text.usetex'] = True

fig = plt.figure()

ax = fig.add_subplot(projection="3d")
ax.view_init(elev=-21, azim=153, roll=-79.5)
ax.set_box_aspect((1, 1, 1), zoom=0.95)

x, y, z = np.array([[-1,0,0],[0,-1,0],[0,0,-1]])
u, v, w = np.array([[1,0,0],[0,1,0],[0,0,1]])
ax.quiver(x,y,z,u,v,w,arrow_length_ratio=0.1, color="black", length=5)

ax.text(3.9, 0.1, 0, '$x$', size='x-large')
ax.text(0, 3.9, 0.1, '$y$', size='x-large')
ax.text(0, 0.1, 3.9, '$z$', size='x-large')

ax.plot([0, 1], [0, 2], [0, 3], marker='o')

ax.set_axis_off()

plt.savefig('Figure-4.svg', bbox_inches='tight')

plt.show()

Why does my code above produce arrows of different lengths?

long locust Oct 7, 2025, 3:31 PM

#

That is what ax.quiver produces

#

Or do you mean the main axes themselves?

#

It is likely to do with the default projection and rotation

silk pendant Oct 7, 2025, 3:32 PM

#

it does seem like the scale of each axis itself changes

#

even if I comment out my code for setting the default view angle, one of the axes is still noticeably longer than the rest

long locust Oct 7, 2025, 3:34 PM

#

Hmmm.

#

Could be the way you are passing parameters to the quiver function

#

From the docs:

quiver([X, Y], U, V, [C], /, **kwargs)

silk pendant Oct 7, 2025, 3:34 PM

#

I think that's for 2d

long locust Oct 7, 2025, 3:35 PM

#

So for 3D I would guess [X, Y, Z]

silk pendant Oct 7, 2025, 3:36 PM

#

long locust Oct 7, 2025, 3:37 PM

#

The plot thickens

silk pendant Oct 7, 2025, 3:37 PM

#

as if it wasn't thick enough already

#

judging by how the text placements relative to the arrows are correct, I'd say the scale of each axis is what's changing

#

so in other words, the actual space is warping

#

🤔

long locust Oct 7, 2025, 3:41 PM

#

If I turn off the rcParams it gets pretty close to what I think you want

#

silk pendant Oct 7, 2025, 3:42 PM

#

import matplotlib.pyplot as plt
import numpy as np

# plt.rcParams['text.usetex'] = True

fig = plt.figure()

ax = fig.add_subplot(projection="3d")
# ax.view_init(elev=-21, azim=153, roll=-79.5)
ax.set_box_aspect((1, 1, 1), zoom=0.95)

x, y, z = np.array([[-1,0,0],[0,-1,0],[0,0,-1]])
u, v, w = np.array([[1,0,0],[0,1,0],[0,0,1]])
ax.quiver(x,y,z,u,v,w,arrow_length_ratio=0.1, color="black", length=5)

ax.text(3.9, 0.1, 0, '$x$', size='x-large')
ax.text(0, 3.9, 0.1, '$y$', size='x-large')
ax.text(0, 0.1, 3.9, '$z$', size='x-large')

ax.plot([0, 1], [0, 2], [0, 3], marker='o')

ax.set_axis_off()

plt.savefig('Figure-4.svg', bbox_inches='tight')

plt.show()

#

?

long locust Oct 7, 2025, 3:42 PM

#

Does the saved figure look different from your shown figure?

silk pendant Oct 7, 2025, 3:42 PM

#

the same

long locust Oct 7, 2025, 3:42 PM

#

long locust

import matplotlib.pyplot as plt
import numpy as np

# plt.rcParams['text.usetex'] = True

fig = plt.figure()

ax = fig.add_subplot(projection="3d")
ax.view_init(elev=-21, azim=153, roll=-79.5)
ax.set_box_aspect((1, 1, 1), zoom=0.95)

x, y, z = np.array([[-1,0,0],[0,-1,0],[0,0,-1]])
u, v, w = np.array([[1,0,0],[0,1,0],[0,0,1]])
ax.quiver(x,y,z,u,v,w,arrow_length_ratio=0.1, color="black", length=5)

ax.text(3.9, 0.1, 0, '$x$', size='x-large')
ax.text(0, 3.9, 0.1, '$y$', size='x-large')
ax.text(0, 0.1, 3.9, '$z$', size='x-large')

ax.plot([0, 1], [0, 2], [0, 3], marker='o')

ax.set_axis_off()

# plt.savefig('Figure-4.svg', bbox_inches='tight')

plt.show()

silk pendant Oct 7, 2025, 3:43 PM

#

ohhhhh

#

from that view angle it looks fine

#

but moving it around you realize the z axis is absurdly longer than the other axes

#

x axis I mean

long locust Oct 7, 2025, 3:45 PM

#

That is odd, but I gotta run

silk pendant Oct 7, 2025, 3:46 PM

#

https://tenor.com/djygn1f5zIp.gif


#

there goes my one ray of hope

#

time to go back down the google/stack overflow rabbit hole

#

ax.set_xlim3d(0, 5)
ax.set_ylim3d(0, 5)
ax.set_zlim3d(0, 5)

adding this seems to work

molten hamlet Oct 7, 2025, 5:14 PM

#

import numpy as np
import matplotlib.pyplot as plt

# Parameter
a = 0.1

# Time array
t = np.linspace(0, 50, 400)

# Compute X and Y
X = np.sin(t * a)
Y = np.cos(t * a)

# Create 2D grid for contour plot
X_grid, Y_grid = np.meshgrid(X, Y)

# Define function F(X, Y)
F = X_grid + Y_grid

# Plot filled contour
plt.figure()
contour = plt.contourf(X_grid, Y_grid, F, levels=8, cmap='plasma')
# plt.colorbar(contour, label='F(X, Y)')
# plt.title('Filled Contour plot of F(X, Y) = X + Y')
# plt.xlabel('X = sin(t*a)')
# plt.ylabel('Y = cos(t*a)')
plt.show()
I think matplotlib has some bugs

#

viscid urchin Oct 7, 2025, 5:39 PM

#

Pretty dank: https://alexiajm.github.io/2025/09/29/tiny_recursive_models.html

Game Generation and Evaluation

Multi-Agent Game 🎮 Generation and Evaluation via Audio-Visual Recordings 📹

spring field Oct 7, 2025, 8:47 PM

#

Indeed, especially love the exploration away from LLMs

serene scaffold Oct 7, 2025, 9:11 PM

#

I wish I could explore not-LLMs

#

but that's not where the money is

opaque condor Oct 7, 2025, 9:17 PM

#

Would 47 images of cats be a good data set of cat images?

waxen kindle Oct 7, 2025, 9:18 PM

#

for which task ?

opaque condor Oct 7, 2025, 9:19 PM

#

Image recognition of animals I still have dogs and gerbils

waxen kindle Oct 7, 2025, 9:20 PM

#

I think you would need at least a few thousand images per class to get decent accuracy

opaque condor Oct 7, 2025, 9:21 PM

#

I could apply transforms to all the images to test the model for robustness, plus this is what I could get from scouring both the internet and some of discord

waxen kindle Oct 7, 2025, 9:23 PM

#

yes but still

opaque condor Oct 7, 2025, 9:23 PM

#

waxen kindle I think you would need a at least few thousand images per class to get decent ac...

Right and get more images for the price of 47

#

But truly it would also train the ai for robustness

spring field Oct 7, 2025, 9:29 PM

#

48, you have 48 images

#

also 48 images is how you overfit the model

#

have you considered getting a dataset from somewhere like huggingface?

waxen kindle Oct 7, 2025, 9:30 PM

#

opaque condor But truly it would also train the ai for robustness

Well, no

#

it will train it to recognize 48 images of cats, and some very similar images

#

remember that cat vs dog vs gerbils are very similar, it may be hard to spot differences even for a human if low-resolution or bad lighting

opaque condor Oct 7, 2025, 9:35 PM

#

spring field have you considered getting a dataset from somewhere like huggingface?

Yes but I'm going to go to college for AI so might as well make a dataset from scratch because it may be a requirement

waxen kindle Oct 7, 2025, 9:35 PM

#

opaque condor Yes but I'm going to go to college for AI so might as well make a dataset from s...

it won't

spring field Oct 7, 2025, 9:36 PM

#

opaque condor Yes but I'm going to go to college for AI so might as well make a dataset from s...

a requirement for what?

waxen kindle Oct 7, 2025, 9:36 PM

#

I don't think you realize how many samples are required to train an ML/DL/AI algorithm

spring field Oct 7, 2025, 9:36 PM

#

and what does "from scratch" mean anyway? are you going around with a camera, to people's houses and taking photos of their cats?

opaque condor Oct 7, 2025, 9:39 PM

#

No, what I mean by from scratch is taking photos that people have shared, putting them in a folder labeled cats, and naming each file

And to answer your statement 2x tanguy
I do realize how many images are needed. I tried to make an image scraper, but (mindful dev) took me off that route because he said it was against the site's policy. I know I can go to hugging face or kaggle, but if I need to understand why it is so hard to train, I might as well learn a little bit of it, right?

waxen kindle Oct 7, 2025, 9:41 PM

#

if you did realise, you wouldn't be saving a few dozen random pics from the internet

opaque condor Oct 7, 2025, 9:45 PM

#

I'm going to add more

I don't exactly sleep, and when I do I don't exactly want to get up, so I might as well use that to my advantage: my aim for the next two days is to scrape enough images that are not AI made to make my own dataset

waxen kindle Oct 7, 2025, 9:46 PM

#

You can 100% use some AI-made images

#

of course not 100% of the dataset should be made of it, but you can have some

opaque condor Oct 7, 2025, 9:48 PM

#

I'm trying to use as much pure data as I can

eager lance Oct 7, 2025, 11:36 PM

#

any good resources for data science?

worldly dawn Oct 8, 2025, 1:39 AM

#

unkempt wigeon The very small reward and how does the AI use the joints of the model

The reward is cumulative. They use two parts: the alignment and matching with the target and then increment the reward each time the robot touches a target.
It's all done in unity and unity provides such capabilities

primal pewter Oct 8, 2025, 1:37 PM

#

Hello, I'm thinking about doing a project that would involve training an AI model. I'm still a beginner, but a CS student, so in any case it will be a good learning experience. Because I'm a beginner, I don't really have an idea where to start, and I was thinking about using GPT, not to code for me, but to point me in the right direction: what I need, what I must do, and generally to set me up to go. I'm not asking out of ethical concerns, but purely whether language models like GPT are in a state sufficient to do that.

serene scaffold Oct 8, 2025, 1:41 PM

#

primal pewter Hello, Im thinking about doing a project that would involve training an ai model...

you can use ChatGPT for that, yes.

primal pewter Oct 8, 2025, 1:42 PM

#

serene scaffold you can use ChatGPT for that, yes.

great to hear! thank you ❤️

mellow vector Oct 8, 2025, 2:48 PM

#

Been going over pl.LazyFrames, working toward a memory cheap pipeline. I have it written from csv (-> to parquet) through the preprocessing operations and am at a point where I need to import it into pytorch. I'm not quite sure where to start, I'm compelled to .collect for everything in torch but I sus there's a cheaper method to load batches.

agile cobalt Oct 8, 2025, 3:00 PM

#

mellow vector Been going over pl.LazyFrames, working toward a memory cheap pipeline. I have i...

polars 1.34 added a collect_batches() method, either use it, consider map_batches(), or really just collect into memory

mellow vector Oct 8, 2025, 3:33 PM

#

agile cobalt polars 1.34 added a `collect_batches()` method, either use it, consider `map_bat...

What about train_test_split? I googled it but everything is eager

agile cobalt Oct 8, 2025, 3:41 PM

#

mellow vector What about train_test_split? I googled it but everything is eager

either take the head and tail, or do it for each batch after collecting into memory

mellow vector Oct 8, 2025, 4:06 PM

#

agile cobalt either take the head and tail, or do it for each batch after collecting into mem...

oh, duh! that's perfect, I just need to work a shuffle in somewhere, thanks!

agile cobalt Oct 8, 2025, 4:07 PM

#

shuffling in lazy mode is also awkward, but if you can do it after collecting each batch that should work
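A library-agnostic sketch of that idea, assuming the batches arrive as plain Python lists of rows (in practice they would be collected polars DataFrames; `split_batches` and its parameters are hypothetical names for illustration): shuffle each batch once it is in memory, then send its head to train and its tail to test.

```python
import random

def split_batches(batches, test_frac=0.2, seed=0):
    """Yield (train_rows, test_rows) for each in-memory batch.

    Rows are shuffled within a batch only, so no batch has to wait for
    the rest of the dataset to be loaded.
    """
    rng = random.Random(seed)
    for batch in batches:
        rows = list(batch)
        rng.shuffle(rows)
        n_test = int(len(rows) * test_frac)
        # Head of the shuffled batch goes to train, tail to test.
        yield rows[:len(rows) - n_test], rows[len(rows) - n_test:]

# Stand-in for batches streamed from disk: 3 batches of 10 row ids each.
batches = [list(range(i, i + 10)) for i in range(0, 30, 10)]
train_rows, test_rows = [], []
for train_part, test_part in split_batches(batches):
    train_rows.extend(train_part)
    test_rows.extend(test_part)
print(len(train_rows), len(test_rows))
```

Note that this only shuffles within each batch, not across the whole dataset, so it is weaker than a global shuffle if the file is sorted by something related to the target.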

runic glacier Oct 8, 2025, 4:46 PM

#

primal pewter Hello, Im thinking about doing a project that would involve training an ai model...

find solution by yourself and never ask here again

viscid urchin Oct 8, 2025, 5:03 PM

#

primal pewter Hello, Im thinking about doing a project that would involve training an ai model...

Check out the "Guided Learning" modes; that's what Gemini calls theirs, can't remember what it's termed in ChatGPT.

#

Instead of giving you the answer, they explain the context and then ask you a question etc.

mellow vector Oct 8, 2025, 5:26 PM

#

runic glacier find solution by yourself and never ask here again

this kind of "humor" isn't really appropriate for this server, if you wouldn't behave that way in a library it's probably not a good idea, someone who wants help might be turned off to the server by your behavior and that is basically the opposite of what we want here. Everyone is trying to be helpful.

long locust Oct 8, 2025, 5:27 PM

#

They won't be able to reply

gray saffron Oct 8, 2025, 6:42 PM

#

anybody help me develop an ai

viscid urchin Oct 8, 2025, 6:44 PM

#

gray saffron anybody help me develop an ai

Check this out, this is the jam IMO https://alexiajm.github.io/2025/09/29/tiny_recursive_models.html

Less is More

Recursive Reasoning with Tiny Networks

gray saffron Oct 8, 2025, 6:45 PM

#

no like a proper medical chatbot ai

viscid urchin Oct 8, 2025, 6:45 PM

#

Yep, that's how I would build that.

high heron Oct 8, 2025, 6:46 PM

#

gray saffron anybody help me develop an ai

as we said in #python-discussion , check out #❓｜how-to-get-help and perhaps open a thread in #1035199133436354600 .

bronze wyvern Oct 8, 2025, 7:43 PM

#

Hello, quick question, what's the purpose of thresholding in image processing? Like I was told to apply "otsu" thresholding, what is its purpose, and how does it benefit image processing techniques pls

viscid urchin Oct 8, 2025, 8:01 PM

#

My understanding is that there are two main reasons: To reduce the amount of data you are processing, and to "converge" similar images into the same result if they differ in ways that just seem "noisy".
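For reference, Otsu's method picks the threshold automatically: it tries every possible cut of the grayscale histogram and keeps the one that maximizes the between-class variance of the two resulting groups. A minimal pure-Python sketch, assuming 8-bit grayscale values in a flat list (the pixel values are invented; libraries such as OpenCV and scikit-image ship their own implementations):

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance.

    Pixels with value <= t form the background class, the rest the
    foreground class.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for t in range(levels):
        weight_bg += hist[t]
        if weight_bg == 0:
            continue  # no background pixels yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break  # every pixel is already in the background
        sum_bg += t * hist[t]
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        between_var = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t

# Invented image: a dark cluster and a bright cluster of pixel values.
pixels = [10, 12, 15, 20, 22, 200, 210, 220, 230, 240]
t = otsu_threshold(pixels)
mask = [p > t for p in pixels]  # True = foreground
print(t, mask)
```

The resulting mask is the binary image: every pixel is reduced to "object" or "background", which is what later steps such as contour detection or counting usually want.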

#

But I'm not an expert, hopefully someone can improve on that.

hybrid shard Oct 9, 2025, 8:47 AM

#

Hey 👋 , I need 1-2 team members for Amazon ML Challenge urgently ( registration closing in 4 hours ) : https://unstop.com/hackathons/amazon-ml-challenge-2025-amazon-1560375

Eligibility: participants should be from India, pursuing PhD/ M.E./M.Tech./ M.S./MS by Research/B.E./B.Tech. full-time degree, with graduation in 2026-27

mossy blaze Oct 9, 2025, 9:57 AM

#

I'm pleased to share with you the final results of my approach on the ARC AGI benchmark, which are as follows:

Total number of tasks solved: 446/1000
Success rate: 44.6%
Total execution time (on a CPU with 12 logical processors): 15 hours
Data size to analyze: 170 MB

hard brook Oct 9, 2025, 11:35 AM

#

someone from Indonesia?

vale umbra Oct 9, 2025, 12:00 PM

#

https://github.com/SusanBhattarai/StringableInference-py

im just a beginner, any suggestions guys?

GitHub

GitHub - SusanBhattarai/StringableInference-py: Free Python client ...

Free Python client for the StringableInference API, supporting chat completions and model retrieval - SusanBhattarai/StringableInference-py

barren fractal Oct 9, 2025, 12:36 PM

#

hey all, is this a good place to ask a question about data science specifically (no relation to Python) or are there better places for it?

#

I like discord or other chatroom-like apps over something like stackoverflow because it's easier to make conversation

grand minnow Oct 9, 2025, 2:19 PM

#

barren fractal hey all, is this a good place to ask a question about data science specifically ...

I'm pretty sure it's fine to talk about data science. It will probably get translated to Python eventually anyway. What's on your mind?

barren fractal Oct 9, 2025, 2:21 PM

#

grand minnow Im pretty sure its fine to talk about data science. It will eventually gets tran...

there's a pretty cool statistics overview about Japanese travel: https://statistics.jnto.go.jp/en/graph (official numbers by a Japanese organisation) and I was wondering if the given data is enough to get a sense of the "average" itinerary of a tourist

Japan Tourism Statistics | 日本の観光統計データ

Data list | 日本の観光統計データ

#

specifically, given the "breakdown by length of stay" and "overnight stays by region/prefecture" would it be possible to make any meaningful inferences about how many regions the average tourist visits over the course of their trip

grand minnow Oct 9, 2025, 2:25 PM

#

barren fractal specifically, given the "breakdown by length of stay" and "overnight stays by re...

probably, but it also depends on the flight, no? I have booked a family trip to Japan for early Jan 2026 and the return flight won't happen for a week. So we're basically forced to fill up our itinerary for a week (I have to admit that is too short to truly enjoy Japan but we'll make the most of it). So return flight availability may influence the stats somewhat

#

interesting stats nevertheless

barren fractal Oct 9, 2025, 2:26 PM

#

yeah and that info isn't available, I imagine it's very challenging to collect

grand minnow Oct 9, 2025, 2:26 PM

#

I bet so too

barren fractal Oct 9, 2025, 2:27 PM

#

a few people I know have gone between 10 days and two weeks, they did get around a bit more but I'm sure that for business purposes it's entirely plausible that some people stay in the same city for months on end

grand minnow Oct 9, 2025, 2:29 PM

#

barren fractal there's a pretty cool statistics overview about Japanese travel: https://statist...

Skimming the data briefly, and to answer your original query, yes, it does look like you can get a very good general sense of average itinerary of a tourist from this dataset. Really nice find

barren fractal Oct 9, 2025, 2:30 PM

#

how would you go about it? I'm not a stats guy myself personally

grand minnow Oct 9, 2025, 2:34 PM

#

barren fractal how would you go about it? I'm not a stats guy myself personally

Me neither. Still a noob at it. But here's my take. Start with a question like "how many tourists would go to Osaka by the end of the year and what would they be doing" or something. Then I would dive into each stat and look for correlations that may help answer that question. Organize and sort. And that should answer that question. Another hypothetical question might be, "I want to go when it's not peak tourist season but there are still events to attend". Then find the relevant data that answers that.

barren fractal Oct 9, 2025, 2:35 PM

#

hmm I see, that sounds like an interesting approach yeah

#

I tried ChatGPT but obviously it's not gonna teach me data science from a few questions

grand minnow Oct 9, 2025, 2:38 PM

#

barren fractal I tried ChatGPT but obviously it's not gonna teach me data science from a few qu...

If you wanted a course on Data Analytics, there are a lot of resources on that. Google's is one option: https://grow.google/certificates/data-analytics/

Data Analytics Certificate & Training - Grow with Google

The Data Analytics Certificate, developed by Google, can help you learn how to use AI to process, analyze, and visualize data.

barren fractal Oct 9, 2025, 3:03 PM

#

grand minnow If you wanted a course on Data Analytics, theres a lot of resource on that. Like...

oh nice, I'll check that out, thanks

trail zodiac Oct 9, 2025, 4:01 PM

#

Greetings. If anyone here is familiar with the img2table library, I'm getting an error that I need to install img2table[paddle] (despite it being installed). I found an issue for it here- https://github.com/xavctn/img2table/issues/243 - but I don't understand the solution. Can anyone provide some direction here?

GitHub

Paddle import issue · Issue #243 · xavctn/img2table

Hello, I got this error: Missing dependencies, please install 'img2table[paddle]' to use this class I created a virtual environment to navigate through the files, and I’m 100% sur I install...

agile cobalt Oct 9, 2025, 4:03 PM

#

trail zodiac Greetings. If anyone here is familiar with the img2table library, I'm getting an...

PaddlePaddle installs itself with setuptools if setuptools is not installed the error will be raised. I think it would make sense to add setuptools to the requirements.txt file
assuming that is right, you can try installing setuptools before you install it

trail zodiac Oct 9, 2025, 4:05 PM

#

I have done that, and I still get the error. Part of why I assume there's something here I'm not understanding.

agile cobalt Oct 9, 2025, 4:06 PM

#

odds are it's just broken then

trail zodiac Oct 9, 2025, 4:06 PM

#

damn. Aight, thanks. o7

mellow vector Oct 9, 2025, 7:57 PM

#

agile cobalt shuffling in lazy mode is also awkward, but if you can do it after collecting ea...

for random sampling for train/test split, the method gpt is suggesting is hashing

test_lazy = (
    lazy_df
    .with_columns((pl.col("id").hash(seed=42) % 10).alias("bucket"))
    .filter(pl.col("bucket") >= 8)
)

just wondering if this is reasonable

#

It clearly works but gpt's advice is about the last place I'd take advice from if I have a choice

agile cobalt Oct 9, 2025, 7:59 PM

#

mellow vector for random sampling for train/test split, the method gpt is suggesting is hashin...

I guess it works? not sure if I'd really recommend it though
I feel like splitting into different files makes more sense

#

it's not unreasonable though
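The hashing idea itself is library-independent. A standard-library sketch of the same bucket trick, assuming each row has a stable `id` (the function names and the 80/20 cut are illustrative; this uses `hashlib` rather than polars' own `hash` expression, so the bucket assignments will not match the polars snippet):

```python
import hashlib

def bucket_of(row_id, n_buckets=10, salt="42"):
    """Map an id to a stable bucket in [0, n_buckets)."""
    digest = hashlib.sha256(f"{salt}:{row_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % n_buckets

def is_test_row(row_id, test_buckets=2, n_buckets=10):
    """Rows whose bucket falls in the top `test_buckets` go to test."""
    return bucket_of(row_id, n_buckets) >= n_buckets - test_buckets

ids = range(10_000)  # stand-in for a real id column
test_ids = [i for i in ids if is_test_row(i)]
train_ids = [i for i in ids if not is_test_row(i)]
print(len(train_ids), len(test_ids))
```

The appeal is that the same id always lands in the same split, across runs and across batches, without any shuffle. The caveat is that the split is only as random as the ids are unrelated to the target.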

mellow vector Oct 9, 2025, 7:59 PM

#

hmmm

#

that's not a bad idea

#

Thinking I might polish this for my portfolio if I can get it streaming straight from parquet into the data loader

agile cobalt Oct 9, 2025, 8:01 PM

#

you could also take the head/tail or every Nth row instead of doing it based on the ID

void stone Oct 9, 2025, 10:18 PM

#

Hi,
I wanted to know what kind of projects I can make in order to secure a summer internship for a data scientist role
By the way I'm a beginner; currently working on a credit card fraud system; but I'm unsure if it would be enough

viscid urchin Oct 9, 2025, 10:21 PM

#

void stone Hi, I wanted to know what kind of projects I can make in order to secure a summe...

Building a classifier like that is a good start IMO

void stone Oct 9, 2025, 10:21 PM

#

viscid urchin Building a classifier like that is a good start IMO

Alright thanks
I've learned a lot
If I have enough time I was thinking building a regressor next

bronze wyvern Oct 10, 2025, 6:54 AM

#

Hi, just wondering, is there any resource that explains how to train multi-class classification models using recommended AI/ML frameworks pls? Like from data cleaning, data split, hyperparameter tuning, metrics, model evaluation etc

odd meteor Oct 10, 2025, 2:28 PM

#

bronze wyvern Hi, just wondering, is there any resource that explains us how to train a multi-...

https://kaggle.com/learn is usually a good place to start. If you're looking into deep learning, then check fast.ai course

Learn Python, Data Viz, Pandas & More | Tutorials | Kaggle

Practical data skills you can apply immediately: that's what you'll learn in these no-cost courses. They're the fastest (and most fun) way to become a data scientist or improve your current skills.

bronze wyvern Oct 10, 2025, 2:55 PM

#

yep, will have a look at fast.ai, ty !

bronze wyvern Oct 11, 2025, 12:53 PM

#

Hi, quick question, when it comes to LLMs, how are they keeping track of latest things? For instance, say there is a new article published or a new technology released, what happens?

I know that LLMs learn from us, from our data, but question. Do they learn from anything we type in the LLM itself? Is there some kind of filtering before storing somewhere like in a vector database? Do the AI engineers actually have time to filter those info? Seem unrealistic, no?
I know there is the concept of RAG, but even, that database used is updated at some point no?

lapis sequoia Oct 11, 2025, 1:14 PM

#

in the offensive language dataset, the one based on twitter tweets, is the point of it that it severely lacks context? Like, the word "yellow" without attention, will be classified as neutral, because it will more than likely be interpreted as a color, when it means something else based on the context (which is ignored). Is that kind of the reasoning for that dataset?

lapis sequoia Oct 11, 2025, 1:19 PM

#

bronze wyvern Hi, quick question, when it comes to LLMs, how are they keeping track of latest ...

Yes.

bronze wyvern Oct 11, 2025, 1:27 PM

#

hmm yes in the sense that filtering does occur?

serene scaffold Oct 11, 2025, 3:10 PM

#

bronze wyvern Hi, quick question, when it comes to LLMs, how are they keeping track of latest ...

an LLM only "knows" things that are stated in its training data, which is always going to be somewhat outdated. If you use ChatGPT and you see "Searching the web..." on the screen, it's doing RAG.

training on dialogues with users would be risky, because users could just enter a bunch of nonsense, and that would mess up the model's understanding of how conversations work.

bronze wyvern Oct 11, 2025, 3:11 PM

#

yeah so basically, each time we see "searching the web", it's using an AI agent behind the scenes to scrape the web and look for the info?

lapis sequoia Oct 11, 2025, 3:17 PM

#

bronze wyvern yeah so basically, each time we see "searching the web", it's using an AI agent ...

no, go ask something to chatgpt like : """ answer the following question based on the following pieces of context without crawling the entire internet: {context} Question: when did Hulk Hogan die? Answer:""" You should get something like this.

#

bronze wyvern Oct 11, 2025, 3:18 PM

#

ohhh ok

lapis sequoia Oct 11, 2025, 3:18 PM

#

its RAG

bronze wyvern Oct 11, 2025, 3:18 PM

#

no but

#

if it's RAG, it should have given us a valid answer, no?

lapis sequoia Oct 11, 2025, 3:21 PM

#

bronze wyvern if it's RAG, it should have give us a valid answer, no?

it depends on the documents being fed. ChatGPT is taking a snapshot of the internet for any data regarding the prompt that it was not trained on or is not common knowledge.
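A toy sketch of the retrieve-then-prompt pattern being discussed, to show where the documents and the `{context}` placeholder fit. Everything here is illustrative: the documents are invented, retrieval is naive word overlap rather than embeddings, and no model is called. The point is only that the answer can be no better than what retrieval puts into the prompt.

```python
def tokenize(text):
    """Lowercase and split on whitespace, stripping basic punctuation."""
    return {w.strip(".,?!:;").lower() for w in text.split()} - {""}

def retrieve(question, documents, k=2):
    """Return the k documents sharing the most words with the question."""
    q_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(q_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]

PROMPT_TEMPLATE = (
    "Answer the following question based only on the context below.\n"
    "Context:\n{context}\n"
    "Question: {question}\n"
    "Answer:"
)

# Invented documents standing in for search results or a vector store.
documents = [
    "The library reopened in March after a two-year renovation.",
    "The cafeteria serves lunch from noon until two.",
    "The renovation of the library cost four million dollars.",
]
question = "When did the library reopen after the renovation?"
context = "\n".join(retrieve(question, documents, k=2))
prompt = PROMPT_TEMPLATE.format(context=context, question=question)
print(prompt)
```

If the retrieved documents do not contain the answer, the model only sees an unhelpful context block, which is one way a RAG setup can still produce a wrong or empty answer.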

bronze wyvern Oct 11, 2025, 3:21 PM

#

yupp I see

#

so LLMs are bad for new/recent things

lapis sequoia Oct 11, 2025, 3:25 PM

#

bronze wyvern so LLMs are **bad** for new/recent things

no, it depends on the quality of the data. if you are using chatgpt on the openai site, it is ok. LLMs in general make stuff up or just go on forever and spew nonsense if they don't know the answer. LLMs with direct access to documents on your PC through an API key are pretty great if you know what you're doing.

serene scaffold Oct 11, 2025, 4:01 PM

#

lapis sequoia no, go ask something to chatgpt like : """ answer the follwoing question based o...

what is the {context} part for?

lapis sequoia Oct 11, 2025, 4:02 PM

#

serene scaffold what is the {context} part for?

I know, just a habit. It had no context to go from.

craggy kettle Oct 11, 2025, 4:46 PM

#

fellow devs. does anyone have experience working with pyspark to resolve deeply nested XML? I have XML files of different schemas which are both nested and deeply nested (struct array struct), and I would be using a mapping csv to resolve the data. But I have not been able to do so so far.

lavish skiff Oct 11, 2025, 4:46 PM

#

I wanted to know if I can use deepface to train an AI

lapis sequoia Oct 11, 2025, 7:54 PM

#

any of you do RL?

frosty mountain Oct 11, 2025, 10:38 PM

#

When working with a dataframe, how do you deal with incorrect data points? For example a column 'age' having values such as -1 and 225, while ordinal columns like 'Thalassemia' have values outside the range of 3, 6, 7?

Like how can you set those out of range values to NaN for each column

carmine ridge Oct 12, 2025, 5:26 AM

#

Hey I am trying to educate myself. Can someone explain to me what are

Gradient descent
loss function
learning rate

I am so confused rn. I just know they are used to optimize an algorithm but how

viscid urchin Oct 12, 2025, 5:30 AM

#

carmine ridge Hey I am trying to educate myself. Can someone explain to me what are Gradient ...

Ok I’m gonna use an analogy that’s so old, it’s probably older than me even. Imagine you are hiking on some mountains, but it’s suddenly really foggy, and you can barely see. You want to find your way down to the lowest point, the valley in between the peaks or whatever…

#

The loss function is the mountain. Your altitude is how “wrong” your current position is. High means you are not close to your goal, low means you might be.

#

Gradient descent is how you find your way downhill. You stop, check out the ground where you are standing, and then figure out which way slopes down and go that direction.

#

Learning rate is how long the steps you take are. A long stride means you travel faster, but you also might step over an edge if you’re not careful. A short stride means you are shuffling forward and it might be slow, and you might not make it down before nightfall, when the ice weasels come out.

carmine ridge Oct 12, 2025, 5:37 AM

#

viscid urchin Learning rate is how long the steps you take are. A long stride means you travel...

Thank you sire can you bless my soul further by enlightening me about their mathematical relation

viscid urchin Oct 12, 2025, 5:39 AM

#

carmine ridge Thank you sire can you bless my soul further by enlightening me about their math...

Have you studied any calculus? In particular differential calculus is really super related, it’s the same kind of “incremental” approach it seems to me.

#

If you think about a curving line drawn on a 2D plot, the process of using calculus to find the lowest point is exactly what is going on with “gradient descent”

#

Gradient descent is just the “extension” of that idea to more dimensions, like you end up with in machine learning

#

The “slope” of the higher dimensional “curve” is called the “gradient vector”

#

So I think the intuition of running your hand over a surface to find the lowest point is pretty ok to use

#

Just remember occasionally that it’s a bunch of dimensions not just three

#

Beyond that it’s just learning what the “update” formula looks like, but all it’s doing is the stuff described above.

carmine ridge Oct 12, 2025, 6:01 AM

#

viscid urchin If you think about a curving line drawn on a 2D plot, the process of using calcu...

The curving line is the loss function right?
Will the loss function always be a declining curve?

viscid urchin Oct 12, 2025, 6:02 AM

#

The Y value of the curving line is the loss function

#

the gradient vector is the slope at any given point

#

the loss function isn't the declining part, that's the gradient vector. the loss function is just a value, in this case 'how high off the ground are you'

#

This paper/book is the best explanation of LLMs I've seen so far, if you want to see how the full catastrophe is currently put together: https://arxiv.org/abs/2501.09223

arXiv.org

Foundations of Large Language Models

This is a book about large language models. As indicated by the title, it primarily focuses on foundational concepts rather than comprehensive coverage of all cutting-edge technologies. The book is structured into five main chapters, each exploring a key area: pre-training, generative models, prompting, alignment, and inference. It is intended f...

#

(To be clear I'm not saying you were asking about LLMs, just that they certainly use these ideas.)

#

Another way to look at this stuff is as an application of this: https://en.wikipedia.org/wiki/Free_energy_principle

Free energy principle

The free energy principle is a mathematical principle of information physics. Its application to fMRI brain imaging data as a theoretical framework suggests that the brain reduces surprise or uncertainty by making predictions based on internal models and uses sensory input to update its models so as to improve the accuracy of its predictions. Th...

#

The "least surprise" idea here is useful etc.

carmine ridge Oct 12, 2025, 6:21 AM

#

viscid urchin The Y value of the curving line is the loss function

Correct me if I am wrong:
So we start with our weights and a fixed choice of loss function (e.g. the sum of squared differences).
We then calculate the gradient (derivative) of the loss function wrt the weights used.
This tells us which way along the slope we should move (increase or decrease our weights), and the learning rate tells us by how much we will change the weights.

And we continue this until we reach the weight that gives us the least value of the loss function

#

And now just have to apply this in a multi dimensional world

viscid urchin Oct 12, 2025, 6:22 AM

#

100% yes

#

Start with the weights, calculate the gradient, update the weights, repeat

#

You got it
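
#

A minimal sketch of that loop, assuming a toy one-weight loss (a parabola with its minimum at w = 3). The function names and the numbers here are made up for illustration, not from any particular library:

```python
def loss(w):
    # Assumed toy loss: a parabola with its minimum at w = 3.
    return (w - 3.0) ** 2

def gradient(w):
    # Derivative of the toy loss with respect to w.
    return 2.0 * (w - 3.0)

def gradient_descent(w, learning_rate=0.1, steps=100):
    for _ in range(steps):
        # Step against the gradient, i.e. downhill on the loss surface.
        w = w - learning_rate * gradient(w)
    return w

w_final = gradient_descent(w=0.0)
print(w_final, loss(w_final))
```

In this toy setup a learning rate above 1.0 makes every step overshoot further than the last, which is the "step over an edge" case from the hiking analogy.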

carmine ridge Oct 12, 2025, 6:23 AM

#

Thanks man ily

#

I would vote for you if you ever ran for the president

viscid urchin Oct 12, 2025, 6:24 AM

#

Thanks! I plan to run unopposed though when the time comes. ABBATH

thick heart Oct 12, 2025, 11:31 AM

#

where to find models? i dont wanna do the ml
i just want the model
im a swe

jaunty helm Oct 12, 2025, 11:33 AM

#

thick heart where to find models? i dont wanna do the ml i just want the model im a swe

you want a model for what exactly

waxen kindle Oct 12, 2025, 11:34 AM

#

hugging face maybe

obtuse acorn Oct 12, 2025, 12:42 PM

#

im trying to find the right type of chart for displaying the proportion of a group's subcategories

#

wow that actually sounds like gibberish

#

its probably easier to just show what i mean

#

i found that a sankey diagram kinda works but only if the subcategories have different names, which doesnt look great

#

so like if i remove the prefixes it joins the categories together

viscid urchin Oct 12, 2025, 12:47 PM

#

Maybe this is a situation for a "Sunburst Chart"? You're right that Sankey isn't great at hierarchies

#

or a Treemap maybe

#

both are designed for nesting

#

plotly can just do px.sunburst(df,...) on a dataframe from pandas or similar.

waxen kindle Oct 12, 2025, 12:49 PM

#

What about a network

viscid urchin Oct 12, 2025, 12:49 PM

#

assuming your data was like:

data = {
    'group': ['Adult', 'Adult', 'Child', 'Child', 'Child', 'Child'],
    'gender': ['Male', 'Male', 'Male', 'Male', 'Female', 'Female'],
    'speed': ['Fast', 'Slow', 'Fast', 'Slow', 'Fast', 'Slow'],
    'value': [4, 2, 1, 3, 3, 3]
}

#

px.sunburst (from plotly.express) would just "eat" that etc

waxen kindle Oct 12, 2025, 12:50 PM

#

With nodes being names and the number of instances from a set to another is written on the edges ?

viscid urchin Oct 12, 2025, 12:50 PM

#

Yeah, the numbers could be weights of the connections etc I guess.

#

But I think a Treemap kinda "just does that"? Not sure if they are totally equivalent.

waxen kindle Oct 12, 2025, 12:51 PM

#

Yeah i think works too

obtuse acorn Oct 12, 2025, 12:53 PM

#

basically im wanting to show the proportions of each subcategory

#

let me try and word this right

viscid urchin Oct 12, 2025, 12:55 PM

#

I think I get what you're saying, and IMO both Sunburst and Treemap do it

#

with the 'pie wedge size' and the 'rectangle size', respectively

obtuse acorn Oct 12, 2025, 12:55 PM

#

i looked at treemaps and i dont think they do

#

i might be looking at bad example tho

viscid urchin Oct 12, 2025, 12:55 PM

#

How not? a treemap view of your hard drive for example makes each box sized to the file

#

(bad-looking example but you get the idea)

#

area of labeled section = size/value/whatever

#

It's old:

obtuse acorn Oct 12, 2025, 12:58 PM

#

i think the sunburst one would work for what im after

#

idk if it would look great

viscid urchin Oct 12, 2025, 12:58 PM

#

Yeah, might have to play with the 'style' a lot

obtuse acorn Oct 12, 2025, 12:58 PM

#

oh yeah it probably helps if i share what this is actually for

viscid urchin Oct 12, 2025, 12:58 PM

#

but I think it can clearly represent what you've got schema-wise

obtuse acorn Oct 12, 2025, 1:00 PM

#

obtuse acorn

im making a crowd crush simulator and i was using this chart as an example of how you might set the statistics of the crowd

#

basically recursively setting how different properties are distributed

#

splitting the categories into smaller subcategories

gilded pebble Oct 12, 2025, 1:36 PM

#

where can i learn mathematics for AI and i have no background

viscid urchin Oct 12, 2025, 1:42 PM

#

obtuse acorn basically recursively setting how different properties are distributed

These words (recursively setting, splitting categories) are like EXACTLY the reason Sunburst was developed, as far as I can tell. I think it's gonna work great, once you pick a 'style' to make it look pretty and readable.

#

The sunburst chart is perfect because it directly maps to the nested properties of a crowd.

#

imagine you have "age group", "temperament", "goal", and "count" (number of people in each sub-category)

#

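import plotly.express as px

fig = px.sunburst(
    your_df,
    path=['age_group', 'temperament', 'goal'],  # The hierarchy, from root to leaves
    values='count',  # Column that sets the size of each wedge
    title='Crowd Distribution Sim'
)
fig.show()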

#

and whammo

#

Put every column except the one that should control the section size into the 'path', and use the column that should map to area as the 'values'.

somber willow Oct 12, 2025, 2:46 PM

#

gilded pebble where can i learn mathematics for AI and i have no background

just do a "college algebra, with python code" by freecodecamp, and practice with "Hall and Knight"

vale field Oct 12, 2025, 3:04 PM

#

Hi, quick question, I started learning n-grams in NLP. I scraped 9 Wikipedia pages, e.g. one on algorithms, one on software engineering etc. I just wanna ask: after I have the 1-grams to 5-grams, do the n-grams need to be ordered by frequency (most commonly appearing first)? Is it important if I need to make visualisations of each extracted gram, e.g. 1-gram, 2-gram etc?

dense lava Oct 12, 2025, 3:05 PM

#

what do you mean by that do they need to be ordered by frequency ?

vale field Oct 12, 2025, 3:06 PM

#

Like the word that appears the most in a page e.g. artificial is on the top
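
#

For what it's worth, the extraction itself doesn't need any ordering; sorting by count mostly matters when you want to show the top few of each n. A rough sketch with the standard library, assuming plain whitespace tokenisation and a made-up sentence standing in for a scraped page (`ngram_counts` is a hypothetical helper name):

```python
from collections import Counter

def ngram_counts(tokens, n):
    # Slide a window of length n over the token list and count each tuple.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

# Assumed toy text standing in for a scraped page.
text = "artificial intelligence is artificial and intelligence is learned"
tokens = text.lower().split()

for n in (1, 2):
    # most_common() returns (ngram, count) pairs sorted by descending count.
    print(n, ngram_counts(tokens, n).most_common(3))
```

The `most_common(k)` output is already in the order a bar chart of the top k would need.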

dense lava Oct 12, 2025, 3:06 PM

#

can anyone help me with model selection for a time series forecasting. I have 20 time series with an upward trend and with a seasonality.

#

I tried using LSTMs , but the error is still too high

#

the evaluation metric is RMSLE

wide carbon Oct 12, 2025, 6:28 PM

#

hello

mellow vector Oct 12, 2025, 8:21 PM

#

Instructor is normalizing the entire dataset in a course I'm following, does this result in a leak from the testing set?

#

He hasn't included a validation set at any point yet, I wonder if he's just loosely combining them for simplicity's sake. I should probably just complete the course before I go crazy writing a pipeline

#

was thinking it might be cool to write a train_test_split suite with marimo ui elements, as lazily as possible, I'd love to hear people's thoughts on that

waxen kindle Oct 12, 2025, 8:35 PM

#

mellow vector Instructor is normalizing the entire dataset in a course I'm following, does thi...

Yes

#

It probably doesn't matter, but practically yes

mellow vector Oct 12, 2025, 8:37 PM

#

That's what I was thinking, was nice to normalize it on the fly but it's not that hard to set some stats aside

waxen kindle Oct 12, 2025, 8:38 PM

#

What is usually done is to normalize both datasets with the stats of the training one only

#

But in practice, if you get both subsets from the same dataset, they should have the same distribution

#

So you would get the same result
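
#

A small sketch of "training stats only", using just the standard library. The split values and the helper names `fit_scaler` / `transform` are assumptions for illustration:

```python
import statistics

def fit_scaler(values):
    # Compute the statistics on the training split only.
    return statistics.mean(values), statistics.stdev(values)

def transform(values, mean, std):
    # Apply previously fitted statistics to any split.
    return [(v - mean) / std for v in values]

# Assumed toy split.
train = [2.0, 4.0, 6.0, 8.0]
test = [5.0, 11.0]

mean, std = fit_scaler(train)
train_scaled = transform(train, mean, std)
test_scaled = transform(test, mean, std)  # reuses the training statistics
print(mean, std, test_scaled)
```

The test split never contributes to `mean` or `std`, so nothing about it leaks into preprocessing. On small sets the two approaches can differ noticeably; on large splits drawn from the same data the difference tends to shrink.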

mellow vector Oct 12, 2025, 8:41 PM

#

the random seeds have a pretty noticeable impact on the toy sets I'm using but I imagine that isn't an issue with larger sets

near rose Oct 12, 2025, 9:50 PM

#

https://github.com/brentleythegreat13694/basic-python-bot rate my basic python bot please

GitHub

GitHub - brentleythegreat13694/basic-python-bot: Hello! My name is ...

Hello! My name is Brentley, I'm 13, and I love computers. I know a bit of C, C++, and Python. My dream college is MIT, and I also plan to make more projects! If you want to contact me, my D...

cloud apex Oct 13, 2025, 1:14 AM

#

I’m looking for someone who knows machine learning and deep learning for a few coaching sessions. I’m currently learning and need help with a few things, as well as someone to review my code. If anyone’s interested, hit me up in the DMs.

serene scaffold Oct 13, 2025, 1:19 AM

#

cloud apex I’m looking for someone who knows machine learning and deep learning for a few c...

it would be great if someone volunteers to do this, but it's pretty unlikely--you're more likely to get help if you ask specific questions or post the code that you want to have reviewed.

cloud apex Oct 13, 2025, 1:26 AM

#

I will pay

cloud apex Oct 13, 2025, 1:26 AM

#

serene scaffold it would be great if someone volunteers to do this, but it's pretty unlikely--yo...

Thanks, where should I post the code?

serene scaffold Oct 13, 2025, 1:29 AM

#

cloud apex I will pay

It is not allowed to offer payment in this server.
You can post it in a paste bin or link to the github

#

!paste

arctic wedgeBOT Oct 13, 2025, 1:29 AM

#

Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.

cloud apex Oct 13, 2025, 1:29 AM

#

serene scaffold It is not allowed to offer payment in this server. You can post it in a paste bi...

Oh im sorry I didnt know that

queen silo Oct 13, 2025, 3:55 AM

#

Hey I was trying to implement a Multilayered Perceptron from scratch using numpy on Iris dataset but I can't implement the back propagation part so need help

twilit geode Oct 13, 2025, 5:08 AM

#

Are there any videos or courses to help with general knowledge of how to just get started with AI? I don't think I mean machine learning, but ya probably have to know the basics of that too. I've also heard the term AI agent thrown around.

tropic edge Oct 13, 2025, 9:22 AM

#

twilit geode Are there some video or course to help with general knowledge how to just get st...

Check out AI tracks on roadmap.sh
Also there are plenty of good courses available for free from Google, Nvidia, and Intel. Although, if you are just looking for high level AI integration in apps, most YouTube courses will fulfill your needs

pine heron Oct 13, 2025, 10:23 AM

#

Hello everyone, I wrote some optimizers for TensorFlow. If you're using TensorFlow, they should be helpful to you.

https://github.com/NoteDance/optimizers

GitHub

GitHub - NoteDance/optimizers: This project implements optimizers f...

This project implements optimizers for TensorFlow and Keras, which can be used in the same way as Keras optimizers. Machine learning, Deep learning - NoteDance/optimizers

bronze wyvern Oct 13, 2025, 1:34 PM

#

Hello, quick question. I need to work on a multi-image classification ML project.

I need to do some preprocessing with my dataset, I wanted to know how should I proceed.

So first, I should perform some data cleaning, like normalizing categories to numbers, removing nan values if any etc...?

Then after that, say I have my images. My question is:

I need to perform image augmentation/preprocessing, how should I proceed?
I should perform image rotation, transformation, grayscale, blur? All of these operations or some specific ones pls, how do I choose which one?

Then after that, I would need to do my data split, say 80 10 10.

Then train, validate and test my model.

At the very end, I would need to calculate some metric for my model? What kind of metrics should we use for such a task, the confusion matrix thing?

serene scaffold Oct 13, 2025, 1:41 PM

#

bronze wyvern Hello, quick question. I need to work on a multi-image classification ML project...

the metrics you'd want to use are precision, recall, and f1
at an aggregate level, you'd want to do the micro and macro averages of all three.

bronze wyvern Oct 13, 2025, 1:42 PM

#

serene scaffold the metrics you'd want to use are precision, recall, and f1 at an aggregate leve...

yeah heard of these, I will need to read on that, will do so and come back

#

concerning the image processing, is there anything I should cater about?

serene scaffold Oct 13, 2025, 1:42 PM

#

bronze wyvern concerning the image processing, is there anything I should cater about?

I've never actually done image processing.

bronze wyvern Oct 13, 2025, 1:43 PM

#

ahh, no problem :c, I was wondering, normally in a dataset, should we have grayscale image/blurred image or these come down to the pre-processing

#

I think it comes to the pre-processing, no?

serene scaffold Oct 13, 2025, 1:43 PM

#

idk

bronze wyvern Oct 13, 2025, 1:44 PM

#

:c, hopefully someone may have an answer

#

I'm looking for resources online for these but it's very limited 🥲

tropic edge Oct 13, 2025, 1:46 PM

#

Look for LangChain courses

bronze wyvern Oct 13, 2025, 2:14 PM

#

Hello, can someone explain why when it comes to model evaluation, we can't only rely on accuracy, what would this imply if we did so?

agile cobalt Oct 13, 2025, 2:24 PM

#

bronze wyvern Hello, can someone explain why when it comes to model evaluation, we can't only ...

it varies a lot based on which kind of model you're working with, but for starters

are some classes over/under represented in the training data & in evaluation?
does it generalise to unseen data?

for example, if a model trained to detect rare problems just says everything is OK 100% of the time it may still get >99% of the results right, but is completely useless
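
#

To put numbers on that, a tiny sketch with made-up labels (10 real problems in 1000 samples, and a "model" that always answers OK):

```python
# Assumed toy data: 1 = rare problem, 0 = OK; 10 problems in 1000 samples.
y_true = [1] * 10 + [0] * 990
y_pred = [0] * 1000  # a "model" that always answers OK

# Fraction of predictions that match the label.
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Recall on the rare class: how many real problems were actually caught.
true_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
recall = true_positives / sum(y_true)

print(accuracy, recall)
```

Accuracy comes out at 99% while recall on the class you actually care about is zero, which is why accuracy alone can hide a useless model.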

bronze wyvern Oct 13, 2025, 2:25 PM

#

yep you are right, just read that

#

ty !

bronze wyvern Oct 13, 2025, 2:25 PM

#

serene scaffold the metrics you'd want to use are precision, recall, and f1 at an aggregate leve...

small question, I just had a look at the evaluation metrics, normally f1 score would be an aggregate of precision and recall, no?

serene scaffold Oct 13, 2025, 2:35 PM

#

bronze wyvern small question, I just had a look at the evaluation metrics, normally f1 score w...

yes

bronze wyvern Oct 13, 2025, 2:37 PM

#

> you'd want to do the micro and macro averages of all three.
What do you mean by the micro and macro averages pls :c

serene scaffold Oct 13, 2025, 2:38 PM

#

bronze wyvern > you'd want to do the micro and macro averages of all three. What do you mean t...

look into it and tell me what you find

bronze wyvern Oct 13, 2025, 2:38 PM

#

yep will do so

bronze wyvern Oct 13, 2025, 3:03 PM

#

from what I've understood, macro average treats all class size equally and so we perform the average on precision/recall and f1 separately.

On the other hand, for micro average, we sum up all the individual fn/tp/fp across classes and then calculate each metric once from those totals. I didn't quite understand where this is used though, is it when we need an overall metric for our model? But we can achieve the same with macro average, no? Why micro?

viscid urchin Oct 13, 2025, 3:31 PM

#

Anybody used "Google Vertex AI Studio" for anything yet? I'm considering doing a thing with it, because it lets you directly get feedback about the "perplexity" of your prompts. The setup is a bit tedious though so I figured I'd ask first before going through the checklist.

serene scaffold Oct 13, 2025, 3:34 PM

#

bronze wyvern from what I've understood, `macro average` treats all class size equally and so ...

looks like you pretty much understand it.
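
#

A rough sketch of where the two diverge, using precision only and invented per-class counts for an imbalanced two-class problem (the class names and numbers are assumptions):

```python
def precision(tp, fp):
    # Guard against division by zero when a class is never predicted.
    return tp / (tp + fp) if (tp + fp) else 0.0

# Assumed per-class counts (true positives, false positives).
counts = {
    "common": {"tp": 90, "fp": 10},
    "rare": {"tp": 1, "fp": 9},
}

# Macro: compute the metric per class, then take the unweighted mean.
macro = sum(precision(c["tp"], c["fp"]) for c in counts.values()) / len(counts)

# Micro: pool the counts over all classes, then compute the metric once.
total_tp = sum(c["tp"] for c in counts.values())
total_fp = sum(c["fp"] for c in counts.values())
micro = precision(total_tp, total_fp)

print(macro, micro)
```

Macro is pulled down hard by the badly handled rare class, while micro is dominated by the common one. So the two answer different questions: macro treats every class as equally important, micro treats every prediction as equally important.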

twilit geode Oct 13, 2025, 5:20 PM

#

tropic edge Look for LangChain courses

From which creator/s??? like idk anything & still learning.

bronze wyvern Oct 13, 2025, 6:17 PM

#

Hello, can someone explain the difference between evaluation and regression metrics and where to use them pls.

From what I've read, evaluation metrics are used once we have built our model entirely, while regression metrics can be used for each epoch? To see how to minimize loss, for example?

mellow vector Oct 13, 2025, 7:19 PM

#

Trying to write some lazyframe code to return mean and standard deviation of the frame, it seems really clunky to produce a 1 row frame with alternating columns for the values.

gritty vessel Oct 13, 2025, 7:23 PM

#

Hey, how are these big models trained? Like GPT models, Stable Diffusion and all?

#

How do they decide which arch is best as it takes so much time to train

#

So trying different combinations will take lots of time and resources

agile cobalt Oct 13, 2025, 8:35 PM

#

mellow vector Trying to write some lazyframe code to return mean and standard deviation of the...

you could either select two frames then collect together

import polars.selectors as cs
import polars as pl  # needed for pl.collect_all below; lz is assumed to be an existing LazyFrame

mean_lz = lz.select(cs.numeric().mean())
std_lz = lz.select(cs.numeric().std())

mean, std = pl.collect_all([mean_lz, std_lz])

or suffix/prefix the columns

stats = lz.select(
    cs.numeric().mean().name.suffix('_mean'),
    cs.numeric().std().name.suffix('_std'),
).collect()

mellow vector Oct 13, 2025, 8:39 PM

#

agile cobalt you could either select two frames then collect together ```py import polars.sel...

oh neat, i haven't looked at selectors. was trying to work out how to make

train_mean, train_std = aggstats_lf.mean(), aggstats_lf.std()

play with my data nicely

#

still wrapping my head around dataframes tbh, everything was a vlookup during my years as an excel spec

runic parcel Oct 13, 2025, 9:49 PM

#

Does anyone over here have the Machine Learning Specialization and Deep Learning Specialization courses by Andrew Ng?

jaunty helm Oct 14, 2025, 1:50 AM

#

gritty vessel How do they decide which arch is best as it takes so much time to train

I honestly don't think they do, it's just try to make your best educated guess and hope that it works
otherwise we wouldn't get massive flops like llama4 400b / 2T (whose release was literally walked back, presumably due to it underperforming)

#

stable diffusion 3 also had issues on release; if you've seen the "woman lying on grass" abominations yeah that's sd3.
3.5 did fix some of those issues, but by then the community has moved on to flux

cedar veldt Oct 14, 2025, 2:06 AM

#

hi guys nice to meet everyone

fringe marsh Oct 14, 2025, 2:49 AM

#

cedar veldt hi guys nice to meet everyone

hello nice to meet you too

cedar veldt Oct 14, 2025, 2:51 AM

#

helows , whats up?

#

I've been doing a soft robot simulator xd but sometimes it's hard to concentrate while working alone in a project

#

that's why I joined this server

fringe marsh Oct 14, 2025, 2:53 AM

#

cedar veldt I've been doing a soft robot simulator xd but sometimes it's hard to concentrate...

what's the project about? seems interesting

fringe marsh Oct 14, 2025, 2:53 AM

#

cedar veldt that's why I joined this server

yeah this is a good server. Glad you joined

#

it has helped me couple of times here and there

cedar veldt Oct 14, 2025, 2:55 AM

#

I have a video ... is it possible to share here videos? , its a little language built with python that lets you prototype and test voxel based robots

#

its very hard to sell something like this so my goal is to make it super fun to work with xd almost like a game

fringe marsh Oct 14, 2025, 2:58 AM

#

cedar veldt I have a video ... is it possible to share here videos? , its a little language ...

i am not sure if its possible to share videos here. You could give it a try 🙂

fringe marsh Oct 14, 2025, 2:58 AM

#

cedar veldt its very hard to sell something like this so my goal is to make it super fun to ...

love it!

cedar veldt Oct 14, 2025, 2:58 AM

#

his name is fernando , he likes to walk but never gets too far : P

#

hahah I have another video with the IDE but I don't want to spam so I will share later

#

its very easy to do shapes because the shapes are defined by scalar fields , so you can do any implicit shape , I wanted to do this one first because it was the easiest

fringe marsh Oct 14, 2025, 3:04 AM

#

cedar veldt

oh wow!! this is so cool

cedar veldt Oct 14, 2025, 3:05 AM

#

thank you!! uwu

honest obsidian Oct 14, 2025, 3:07 AM

#

cedar veldt

so cool

cedar veldt Oct 14, 2025, 3:08 AM

#

the robot behaves like this because of different material properties , and the oscillations I defined owo

#

you can change the frequency for example and that can make it move twice as fast

#

when I get something more stable I will share it so people can write their own robots (its opensource), I wanna make a contest to see who can write the best bots for specific tasks uwu

fringe marsh Oct 14, 2025, 3:27 AM

#

this is a really nice presentation @cedar veldt looks fun. Hopefully it will be open source some day. 😎

cedar veldt Oct 14, 2025, 3:27 AM

#

it is already opensource

#

I haven't shared yet because I'm embarrassed of my messy code 😛

rugged spindle Oct 14, 2025, 4:17 AM

#

tropic edge Look for LangChain courses

Hi, @tropic edge

#

Do you find LangChain developer?
I am senior AI/ML engineer.

grand minnow Oct 14, 2025, 4:34 AM

#

rugged spindle Do you find LangChain developer? I am senior AI/ML engineer.

What have you made?

calm cipher Oct 14, 2025, 5:41 AM

#

oof, I just traced what I thought was a bug in my data preprocessing code to a couple of bad blocks on the drive storing my data

gritty vessel Oct 14, 2025, 7:32 AM

#

jaunty helm I honestly don't think they do, it's just try to make your best educated guess a...

Yeah it always surprises me or they got like insane Resources

#

Using which they can speed up and try different archs

bronze wyvern Oct 14, 2025, 8:06 AM

#

Hello quick question, I know that both standardization and normalization are part of the feature scaling process in data preparation. My question is, why do we use one over the other?

Their main goal is just to convert some values into some other numerical values, like values between 0 and 1.

I read that normalization is preferred when we know that our dataset doesn't follow the gaussian distribution, so maybe when there are lots of outliers/skewness?

But what about standardization, when do we use it and why pls?

agile cobalt Oct 14, 2025, 12:25 PM

#

take a look at https://scikit-learn.org/stable/auto_examples/preprocessing/plot_all_scaling.html

scikit-learn

Compare the effect of different scalers on data with outliers

Feature 0 (median income in a block) and feature 5 (average house occupancy) of the California Housing dataset have very different scales and contain some very large outliers. These two characteris...
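
#

If it helps to see the two side by side before reading that page, here is a hand-rolled sketch (not the scikit-learn API) on a made-up feature with one big outlier. `min_max` and `standardize` are hypothetical helper names:

```python
import statistics

def min_max(values):
    # Normalization: rescale into the fixed range [0, 1].
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    # Standardization: shift to mean 0 and scale to unit standard deviation.
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population standard deviation
    return [(v - mean) / std for v in values]

# Assumed toy feature with one large outlier.
feature = [1.0, 2.0, 3.0, 4.0, 100.0]

print(min_max(feature))
print(standardize(feature))
```

Min-max gives fixed bounds but lets the single outlier squeeze the other values close to 0. Standardization gives mean 0 and unit spread but no fixed bounds. Which one fits depends on what the model downstream expects.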

bronze wyvern Oct 14, 2025, 1:00 PM

#

I want to learn the maths associated with gradient descent and stochastic gradient descent, does anyone know where I can find a reference pls? I know it's just a basic thing like y = mx + c but I don't really know which parameter represents what

#

I'm trying to understand the problem of vanishing and exploding gradient and I wanted to have an overview of the maths related to gradient first

wooden sail Oct 14, 2025, 1:23 PM

#

bronze wyvern I want to learn the maths associated with gradient descent and stochastic gradie...

what part of it troubles you?

#

i'd say wikipedia offers a pretty good introduction, but the notation is already a little technical

#

https://en.wikipedia.org/wiki/Gradient this is a good starting point

bronze wyvern Oct 14, 2025, 1:25 PM

#

The thing is I know the theoretical concept but not really how the maths work

bronze wyvern Oct 14, 2025, 1:25 PM

#

wooden sail <https://en.wikipedia.org/wiki/Gradient> this is a good starting point

yup will give it a go, ty !

wooden sail Oct 14, 2025, 1:26 PM

#

bronze wyvern The thing is I know the theoretical concept but not really how the maths work

https://en.wikipedia.org/wiki/Chain_rule this is also probably going to come in handy, then

#

vanishing and exploding gradients are usually something that pops up in the context of using the "chain rule"
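
#

A deliberately simplified sketch of why the chain rule causes this: the gradient reaching the first layer is a product of one factor per layer. Here each factor is assumed to be the same number, and weights are ignored, just to show the effect of repeated multiplication:

```python
import math

def sigmoid_derivative(x):
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)  # never exceeds 0.25

def chained_gradient(per_layer_factor, depth):
    # Chain rule: multiply one local derivative per layer.
    grad = 1.0
    for _ in range(depth):
        grad *= per_layer_factor
    return grad

vanishing = chained_gradient(sigmoid_derivative(0.0), depth=20)  # 0.25 ** 20
exploding = chained_gradient(1.5, depth=20)                      # 1.5 ** 20
print(vanishing, exploding)
```

Factors below 1 shrink the product towards zero as depth grows, and factors above 1 blow it up.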

bronze wyvern Oct 14, 2025, 1:27 PM

#

yup noted, I have some knowledge of the chain rule I think, I will have a look at how this gives rise to these problems, ty !

wooden sail Oct 14, 2025, 1:28 PM

#

if you want to try this out yourself by hand, something like khan academy should have simple examples with a step-by-step on how it works

bronze wyvern Oct 14, 2025, 2:08 PM

#

Hi, has anyone ever used YOLO for image recognition and classification? I don't understand: under the hood does it use ResNet, or is ResNet a completely different CNN architecture?

I need to train a multi-class classification model both for image recognition and classification, and am confused about which framework/library to use. I was told to use YOLO but don't know the reason, anyone here have experience with it pls

carmine ridge Oct 14, 2025, 2:25 PM

#

Hey, i am new to deep learning and i am confused which library should be the best to start? I started with tensorflow but i also read abt pytorch and now I am confused

gritty vessel Oct 14, 2025, 3:42 PM

#

jaunty helm I honestly don't think they do, it's just try to make your best educated guess a...

https://arxiv.org/abs/2001.08361#openai check this out

arXiv.org

Scaling Laws for Neural Language Models

We study empirical scaling laws for language model performance on the cross-entropy loss. The loss scales as a power-law with model size, dataset size, and the amount of compute used for training, with some trends spanning more than seven orders of magnitude. Other architectural details such as network width or depth have minimal effects within ...

gritty vessel Oct 14, 2025, 3:43 PM

#

carmine ridge Hey, i am new to deep learning and i am confused which library should be the bes...

Start with pytorch

sterile leaf Oct 14, 2025, 4:06 PM

#

yo

#

is this the place to ask for reinforcement learning tutorials ?

wary geode Oct 14, 2025, 4:21 PM

#

Hello everyone..
I have finished a python tutorial video and done some exercises on all topics, and my end aim is either data science or AI engineer.
Now I am planning to do DSA in python but people are saying don't do DSA with python.

What should I do? I would be pleased if someone shares their insight on my problem.

serene scaffold Oct 14, 2025, 4:44 PM

#

wary geode Hello everyone.. I have finished python tutorial video and i have done some exer...

sounds like a question for #algos-and-data-structs, but I disagree with the people saying not to do DSA in python.

viscid urchin Oct 14, 2025, 4:45 PM

#

Yeah, DSA isn't about micro-performance issues like language choice, it's about different orders of efficiency.

#

micro-performance is Programming Language Theory territory, not DSA

wary geode Oct 14, 2025, 4:48 PM

#

What should i do?

serene scaffold Oct 14, 2025, 4:51 PM

#

wary geode What should i do?

whatever you want. we're both telling you that you can do DSA in python if you want.

gritty vessel Oct 14, 2025, 4:54 PM

#

wary geode Hello everyone.. I have finished python tutorial video and i have done some exer...

Dsa is used everywhere

#

You can do in python as well if you are gonna use python in your career ahead

viscid urchin Oct 14, 2025, 4:55 PM

#

wary geode What should i do?

Acquire this book and grind through it, emerge as a true warrior: https://webperso.info.ucl.ac.be/~pvr/book.html

Concepts, Techniques, and Models of Computer Programming

A comprehensive programming textbook that
covers all important programming paradigms in a unified framework
that is both practical and theoretically sound.
Special attention is given to concurrent programming and data abstraction.
The textbook uses the Oz multiparadigm programming language for its examples.

gritty vessel Oct 14, 2025, 4:55 PM

#

And yeah logic stays same between the languages

mighty lake Oct 14, 2025, 5:02 PM

#

are there any websites that can teach me python for free?

#

or DSA?

serene scaffold Oct 14, 2025, 5:09 PM

#

mighty lake are there any websites that can teach me python for free?

this channel is for talking about data science and AI. not general python or DSA.
resources page: https://www.pythondiscord.com/resources/

Python Discord | Resources

We're a large, friendly community focused around the Python programming language. Our community is open to those who wish to learn the language, as well as those looking to help others.

bronze wyvern Oct 14, 2025, 5:36 PM

#

hello quick question, say I have a system that takes as input multiple images but these images all have different resolutions, and I need to standardize them, say to 500 by 500 pixels.

Does the target resolution matter, e.g. if I use 300 by 300 or 400 by 400 instead?

I was also wondering whether we need to keep the aspect ratio the same, no?

Not all images will have the same aspect ratio, so we can't hard-code one such as 16:9. Maybe I would need to find the aspect ratio of each original image in my dataset and then fit it into, say, 500 by 500 pixels while keeping that ratio, no?

agile cobalt Oct 14, 2025, 5:46 PM

#

bronze wyvern hello quick question, say I have a system that takes as input multiple images bu...

it varies

you can just crop or expand (e.g. add black borders) to adjust the aspect ratio, then scale up or down to a fixed resolution like 512x512 or whatever is the most common for your dataset

the new size does matter: you'll want to minimise artefacts caused by scaling if possible, but larger inputs = more operations (although the difference could be small depending on how you model it)
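
A minimal sketch of the "expand with borders, then scale" option, assuming a square target and centred padding. `letterbox_geometry` is a hypothetical helper that only does the arithmetic; the actual resizing and padding would be done by whatever image library you use:

```py
def letterbox_geometry(width, height, target=512):
    """Fit (width, height) inside a target x target square, keeping aspect ratio.

    Returns (new_width, new_height, pad_left, pad_top), where the pads are the
    border sizes needed to centre the scaled image in the square.
    """
    scale = min(target / width, target / height)
    new_w = min(target, max(1, round(width * scale)))
    new_h = min(target, max(1, round(height * scale)))
    pad_left = (target - new_w) // 2
    pad_top = (target - new_h) // 2
    return new_w, new_h, pad_left, pad_top


print(letterbox_geometry(1920, 1080))  # (512, 288, 0, 112)
print(letterbox_geometry(600, 800))    # (384, 512, 64, 0)
```

Because the scale factor is computed per image, nothing about the original aspect ratio has to be hard-coded.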

viscid urchin Oct 14, 2025, 5:49 PM

#

ImageMagick's CLI tools can do this in a one-liner

#

(depending on the settings you want etc)

bronze wyvern Oct 14, 2025, 5:54 PM

#

agile cobalt it varies you can just crop or expand (e.g. add black borders) to adjust the as...

oh ok, will investigate about what you said and come back, ty !

bronze wyvern Oct 14, 2025, 5:54 PM

#

viscid urchin `ImageMagick`'s CLI tools can do this in a one-liner

didn't know about that, will have a look, thanks for the tip !

viscid urchin Oct 14, 2025, 6:07 PM

#

Actually lemme see if I can just craft that example

#

Assuming you've installed the base ImageMagick package for your OS (which has various differently-named CLI entry points)...

```sh
mkdir -p ./conform_output
mogrify -path conform_output -resize '800x800>' -background black -gravity center -extent 800x800 *.png
```
(example uses 800x800 as max size, pick whatever you need.)
I think that's right?

#

(would work for *.jpg also etc)

#

The -resize syntax is advanced, you can do lots of fancier stuff than that; this one is just saying "limit max dimension to 800 pixels"

#

montage is great too, I used this recently to lay out a directory of PNGs in a 4-column posterboard style:

montage *.png -tile 4x -geometry +10+10 posterboard.png

#

(+10+10 says all-around 10px spacing between each)

bronze wyvern Oct 14, 2025, 6:19 PM

#

oh ok, will have a look at that, seem really powerful and useful, ty !

lean oriole Oct 14, 2025, 6:44 PM

#

hello may i ask a small question?

#

i am trying to start learning robotics and automation but i am confused about what content I should follow: which topics to focus on, and what would be a fun way to keep progressing while learning.

viscid urchin Oct 14, 2025, 6:50 PM

#

lean oriole i am trying to start learning robotics and automation but i am confused about wh...

I'm not in robotics (yet?) but I have friends who are (massively, at NASA, etc) and it seems SO BROAD, there's SO MUCH to learn.. so maybe just pick some part that seems interesting to you and start diving into it?

#

my buddy Trey's title at NASA before his recent promotion was:

The Solver-in-Residence (SiR) program is a one-year detail position with the chief technologist in NASA’s Office of Technology Policy and Strategy. The program enables a NASA civil servant to propose a one-year investigation on a specific technology challenge and then work to identify solutions to address those challenges.

#

"AI and Autonomy Solver-in-Residence"

#

crazy smart kid

#

My main responsibility is conducting a study I formulated on how Modular Open Systems Approaches could be used at NASA, both broadly and with a focus on how autonomy and robotics software interoperability could be improved using the Space Robot Operating System (Space ROS) framework. Conducting the study involves meeting with a broad range of experts across government, industry, and academia, organizing workshops, managing technical investigations, and briefing findings to senior NASA leadership.

#

Learning Robot Operating System stuff might be a good place to start actually.

lean oriole Oct 14, 2025, 7:05 PM

#

wow

#

i'll look into it

mellow vector Oct 14, 2025, 9:55 PM

#

this warning about column names is bugging me. I recall now that `collect_schema().names()` appears to do what I need, but this code is really verbose and it feels wrong
```py
lf = lf.select([(pl.col(c) - mean_df[c][0]) / std_df[c][0] for c in lf.collect_schema().names()])
```

#

something like

(lf - mean_df) / std_df

would be so much prettier

opaque condor Oct 14, 2025, 10:41 PM

#

paste.pythondiscord.com/F5BQ

Do I need more imports

serene scaffold Oct 14, 2025, 11:17 PM

#

opaque condor paste.pythondiscord.com/F5BQ Do I need more imports

Looks like that url is incomplete. But you don't just "need imports".

mellow vector Oct 14, 2025, 11:19 PM

#

Import * from *

opaque condor Oct 14, 2025, 11:20 PM

#

https://paste.pythondiscord.com/F5BQ

#

Got it working, and I need to know what imports I might need to add so I can train an AI to do all of the abilities above

covert granite Oct 15, 2025, 1:54 AM

#

are you a pytz enjoyer or a zoneinfo embracer?

opaque condor Oct 15, 2025, 2:47 AM

#

https://paste.pythondiscord.com/C7HA

bronze wyvern Oct 15, 2025, 6:12 AM

#

hello, quick question

#

say I have trained an image classification model. During the standardization process, I converted my images to 512 x 512 pixels.

Now say I build some interface that requires us to upload the image we want to process. Behind the scenes, we must first convert this image to 512 x 512 pixels and then process it, right?

viscid urchin Oct 15, 2025, 6:17 AM

#

Check out what I said above re: imagemagick and its “mogrify” command.

bronze wyvern Oct 15, 2025, 6:48 AM

#

yep, so basically I can apply a vast range of transformations using that command, but the thing is I would use ImageMagick on images that are already on disk, right?

Say I have a website and a user decides to upload their own image; I would still need to do that processing... hmm, do you think there is some sort of API that would let me write code to interact with ImageMagick? (I should investigate.) For example, the idea is:

User uploads a picture.
Picture goes into an /images folder or something like that.
Before classifying the image, run the ImageMagick commands on it.
Overwrite that image and classify the new image based on what was trained.
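
The steps above could be sketched with the standard-library `subprocess` module calling the ImageMagick CLI. This is an assumption-heavy sketch: `build_resize_command` and `preprocess_upload` are hypothetical names, 512 is the training size mentioned above, and it writes to a separate output path rather than overwriting the upload. ImageMagick 7 ships a `magick` binary while version 6 uses `convert`, so the code looks for either.

```py
import shutil
import subprocess
from pathlib import Path


def build_resize_command(src, dst, size=512, binary="magick"):
    """Build the ImageMagick argument list: fit inside size x size, pad with black."""
    box = f"{size}x{size}"
    # Without the trailing '>' flag, smaller images are enlarged as well,
    # so every output ends up with the same content scale and canvas size.
    return [
        binary, str(src),
        "-resize", box,
        "-background", "black",
        "-gravity", "center",
        "-extent", box,
        str(dst),
    ]


def preprocess_upload(src, dst, size=512):
    """Run ImageMagick on one uploaded file and return the output path."""
    binary = shutil.which("magick") or shutil.which("convert")
    if binary is None:
        raise RuntimeError("ImageMagick was not found on PATH")
    subprocess.run(build_resize_command(src, dst, size, binary), check=True)
    return Path(dst)


print(build_resize_command("images/upload.png", "processed/upload.png"))
```

Keeping the command in one function means the training set and the uploaded images can go through exactly the same transformation.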

gritty vessel Oct 15, 2025, 8:07 AM

#

You can apply it before training, no need to overwrite the image

bronze wyvern Oct 15, 2025, 8:48 AM

#

yes, I will apply them to my dataset, but my system will be like a website where we can upload images, and those uploaded images are not preprocessed

agile cobalt Oct 15, 2025, 12:11 PM

#

yeah you must pre-process it identically to how you process training images