#data-science-and-ml | Python | Page 156

tidal bough Jan 5, 2025, 11:02 PM

#

Looks like you shuffle the dataset (should use shuffle=False for time series). Also I'm a bit confused where the dataset is split into smaller timeseries - or does your LSTM take sequences of length 1?

gray slate Jan 5, 2025, 11:02 PM

#

what goes in here? like, is the data set [0, 1, 2, 3, 4, 5], [1, 2, 3, 4, 5, 6], [2, 3, 4, 5, 6, 7] and when split you have overlapping segments?

iron basalt Jan 5, 2025, 11:03 PM

#

This goes back to the exposure I mentioned earlier. You just need to see a lot.

#

Same with math.

gray slate Jan 5, 2025, 11:04 PM

#

iron basalt Same with math.

Yeah I've seen a lot of procedural things I guess, so have a kind of process-based outlook on things. Too old to recalibrate myself for functional now :/

gray slate Jan 5, 2025, 11:05 PM

#

gray slate what goes in here? like, is the data set [0, 1, 2, 3, 4, 5], [1, 2, 3, 4, 5, 6],...

Is that what you mean by shuffling messing it up?

tidal bough Jan 5, 2025, 11:05 PM

#

tidal bough Looks like you shuffle the dataset (should use `shuffle=False` for time series)....

(I think what's happening is that you're giving your model 80% of randomly chosen points along a curve, from which it's not hard to learn to predict the (few) gaps - the model just needs to learn to linearly interpolate between points it has seen. Whereas for timeseries prediction you need to, nonrandomly, use the first 80% of dataset for training and then the last 20% for prediction - then the model actually needs to learn how to extrapolate far into the future.)
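
A minimal sketch of the non-random split described above, assuming the rows are already in time order. The helper name `chronological_split` and the toy series are made up for illustration; in scikit-learn the same effect should come from `train_test_split(..., shuffle=False)`.

```python
def chronological_split(rows, train_fraction=0.8):
    """Split time-ordered rows into an earlier train part and a later test part."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# Toy series standing in for the real minute bars (hypothetical data).
series = list(range(10))
train, test = chronological_split(series)

print(train)  # the first 80% of the timeline
print(test)   # the last 20%, strictly later than anything in train
```

Every test point is later than every training point, so the model cannot score well just by interpolating between neighbours it has already seen.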

rancid sorrel Jan 5, 2025, 11:07 PM

#

```
| 202401311456 | 0.686809 | 0.685685 | 0.693007 | 0.684981 | 0.00159292 |
| 202401311457 | 0.68398  | 0.685171 | 0.693259 | 0.685494 | 0.00108766 |
```

#

is the input data; the first column is being dropped as it's the date column

#

crap

gray slate Jan 5, 2025, 11:10 PM

#

columns are "start, end, high, low, ??"

rancid sorrel Jan 5, 2025, 11:11 PM

#

open,high,low,close,volume

#

aiming for close

#

```
created_by: parquet-cpp-arrow version 18.1.0
num_columns: 7
num_rows: 8601
num_row_groups: 1
format_version: 2.6
serialized_size: 4036

############ Columns ############
Date
Open
High
Low
Close
Volume
__index_level_0__
```

#

if you want the full description

#

and yeah this is normalized data using min/max scaling
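
A small sketch of min/max scaling for a time-ordered split. The thread doesn't say how the scaler was fitted, so this only illustrates one common precaution: take the min and max from the training part and reuse them on the test part. The function names and the numbers are hypothetical.

```python
def fit_min_max(values):
    """Return the (min, max) pair that defines the scaling."""
    return min(values), max(values)


def apply_min_max(values, lo, hi):
    """Scale values with a previously fitted (lo, hi) pair."""
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


closes = [10.0, 12.0, 11.0, 14.0, 18.0]  # hypothetical close prices
train, test = closes[:4], closes[4:]

lo, hi = fit_min_max(train)              # fitted on the training part only
train_scaled = apply_min_max(train, lo, hi)
test_scaled = apply_min_max(test, lo, hi)
print(train_scaled, test_scaled)
```

Test values can land outside [0, 1] this way. That is expected: it means the test period contains prices the training period never saw.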

gray slate Jan 5, 2025, 11:13 PM

#

are you predicting current close? or close of next minute?

rancid sorrel Jan 5, 2025, 11:14 PM

#

honestly not sure, it's probably predicting the next one

gray slate Jan 5, 2025, 11:15 PM

#

because it seems to me like "current low" and "current high" are gonna flail around until the end, you'll only be able to predict it if it stays within the window?

rancid sorrel Jan 5, 2025, 11:16 PM

#

using non-shuffled data i got scarily the same results

gray slate Jan 5, 2025, 11:17 PM

#

from what I understand, trading is largely based on psychology and superstition. people believe in technical analysis so they become self-fulfilling prophecies. traders use specific time windows like 1, 5, 15 minutes, 1 hr, 4hr etc, and look for patterns in the graphs

#

so you'd need to take as much data as fits on a trader's screen, at multiple scales, and use that as your input data

rancid sorrel Jan 5, 2025, 11:19 PM

#

```
"Train Precision": 49.99,
"Test Precision": 100.00,
"Train Recall": 50.00,
"Test Recall": 100.00,
"Train F1 Score": 49.99,
"Test F1 Score": 100.00,
"Train Accuracy": 99.99,
"Test Accuracy": 100.00,
"Train MSE": 0.0473,
"Test MSE": 0.2405,
"Train RMSE": 2.17,
"Test RMSE": 4.90,
"Train MAE": 1.81,
"Test MAE": 4.89,
"Train MAPE": 1895412317.84,
"Test MAPE": 6.56,
"Train R^2": 98.90,
"Test R^2": -78.00
```

#

this is also with dropout in tensorflow to eliminate overfitting

#

those are %

#

i dont trust data that good

tidal bough Jan 5, 2025, 11:21 PM

#

I'd look very carefully at what data the model actually gets in fit_model

#

like, what shapes are X_train and y_train and their first few values and where they came from

gray slate Jan 5, 2025, 11:33 PM

#

gray slate from what I understand, trading is largely based on psychology and superstition....

I think expanding on this, if you looked at technical analysis and drew boxes for support and resistance, Elliott waves, retracements, types of patterns and predicted those from the candle data, then used the outputs of that to predict price, I reckon you'd have a chance of predicting price movement direction

#

you'd need a ton of data for that I guess. you could maybe scrape it from tradingview or youtube though 🤔

rancid sorrel Jan 5, 2025, 11:36 PM

#

(8601, 6)
total data shape of input
date is being dropped completely as the LSTM doesn't need it
train_x (ie open,high,low,volume)
(6880, 1, 4)
test_x
(1721, 1, 4)
train_y (close)
(6880,)
test_y
(1721,)

#

looks fine tbh

gray slate Jan 5, 2025, 11:38 PM

#

what's `len(set(", ".join(map(str, row)) for row in data))`

#

also drop the first column and check that. see if you have duplicate data

tidal bough Jan 5, 2025, 11:39 PM

#

rancid sorrel (8601, 6) total data shape of input date is being dropped completely as the LSTM doesn't ...

so your model is predicting the close price for an interval (of 1 minute), given the open and high and low prices for that same interval? isn't that basically trivial, since prices don't move all that much in a minute?

#

as in, I think just guessing close=open would have a low loss here.
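
A rough way to check that claim is to score the "close = open" guess as a baseline before trusting any model. The bars below are made-up normalized values, not rows from the real dataset, and `mean_absolute_error` is a hand-rolled helper.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# (open, close) pairs for a few one-minute bars (hypothetical values).
bars = [(0.686, 0.685), (0.684, 0.685), (0.685, 0.687), (0.687, 0.686)]

opens = [o for o, _ in bars]
closes = [c for _, c in bars]

# Baseline "model": predict that the close equals the open of the same bar.
baseline_mae = mean_absolute_error(closes, opens)
print(baseline_mae)
```

If a trained network can't clearly beat this number on held-out data, it probably hasn't learned anything beyond "prices barely move within a minute".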

rancid sorrel Jan 5, 2025, 11:40 PM

#

most likely yes

gray slate Jan 5, 2025, 11:40 PM

#

tidal bough so your model is predicting the close price for an interval (of 1 minute), given...

it's also not useful. since if it's going up then it's likely to be = high, and down it's = to low

#

and the data is after close, and during the period you don't know high or low anyway, they're in a state of flux

rancid sorrel Jan 5, 2025, 11:41 PM

#

see i knew it was too good to be true

#

ty guys

tidal bough Jan 5, 2025, 11:41 PM

#

that probably explains your results, then. This isn't timeseries prediction. You need to transform your dataset into timeseries, so that each input to your model consists only of previous datapoints and the model has to predict the current one
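
One way to do that transformation is a sliding window: each sample holds the previous `window` values and the target is the value that comes next. This is only a sketch on a toy series; `make_windows` is a hypothetical helper, and the window length of 6 is arbitrary.

```python
def make_windows(series, window):
    """Build (inputs, targets): each input is `window` past values, the target is the next value."""
    inputs, targets = [], []
    for end in range(window, len(series)):
        inputs.append(series[end - window:end])  # only values before position `end`
        targets.append(series[end])              # the value the model must predict
    return inputs, targets


series = [0, 1, 2, 3, 4, 5, 6, 7]
X, y = make_windows(series, window=6)
print(X)  # overlapping segments of past values
print(y)  # the value right after each segment
```

For a Keras LSTM the inputs would then be reshaped to (samples, timesteps, features), with timesteps equal to the window length rather than 1. Splitting into train and test before windowing keeps windows from straddling the boundary.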

gray slate Jan 5, 2025, 11:43 PM

#

yeah, you need "an average trader's screen width of data", and you want to predict a trend line from the last data point

#

if you're hoping to do high frequency trading, you're gonna get front-run by your broker's mates anyway - they're selling your data to the highest bidder who can make trades before you do

#

if you're not then you need a longer period of time

#

I think you want to predict this sort of thing:
https://youtu.be/QjSxhK-ycGA?t=379
edit: change time

#

Idea being that you're predicting the behaviour of people who are doing technical analysis, or taking advice from people who are doing it, or are running bots that are acting that way. Then you add in sentiment analysis based on news + social media, or companies that provide feeds of those things (ideally ones that other traders are using)

tidal bough Jan 5, 2025, 11:57 PM

#

Do humans actually... do technical analysis nowadays? I thought it was mostly a 1900s thing.

gray slate Jan 5, 2025, 11:58 PM

#

it's the only legal way to conspire isnt it?

#

if you have a handbook of plays that are called "technical analysis" rather than "illegally conspiring with other traders using price history as a communication channel" then you can all conspire and get fat together

#

obviously when it comes to shares sentiment analysis is a channel (we get our mates in the financial press, as a covert signal), and quarterly reports are too (dunno if it's all hard data, the conspiritard in me would expect there to be secret handshake language in them), and global political news for things like currency and commodities, they send messages for what nation states want, and those who go against them won't get free printed money

#

I'd imagine a lot of that is decodable via machine learning techniques

#

there's a nonzero chance i'm talking out of my arse though 😂

unkempt wigeon Jan 6, 2025, 12:08 AM

#

Has anyone ever thought of layer probability to add into
W*X+B

gray slate Jan 6, 2025, 12:09 AM

#

unkempt wigeon Has anyone ever thought of layer probability to add into ```W*X+B```

whaddyamean by layer probability?

unkempt wigeon Jan 6, 2025, 12:09 AM

#

gray slate whaddyamean by layer probability?

Sorry typed that wrong

gray slate Jan 6, 2025, 12:09 AM

#

lol I assumed I read it wrong, saw your edit!

#

you mean like a loss function for a specific layer?

unkempt wigeon Jan 6, 2025, 12:10 AM

#

What I mean is taking the sum over the neurons within that specific layer, or the average of what the layer is, and then passing that through to tell the network which layer might need to be tweaked a bit to get a more accurate answer

gray slate Jan 6, 2025, 12:11 AM

#

do you know what the layer represents? like, did you train some of it one way and now you're locking layers during fine tuning or something?

#

or added a bunch of layers after training

unkempt wigeon Jan 6, 2025, 12:12 AM

#

gray slate do you know what the layer represents? like, did you train some of it one way an...

It locks the best probability. Think of it as a genetic algorithm almost, but it tries to find better numbers that can lock in to get a higher probability of that being the case

gray slate Jan 6, 2025, 12:15 AM

#

I think there's "elastic weight consolidation" (EWC) that you use to stop catastrophic forgetting when you're tuning or continually training

#

that's kinda "find the most important weights and make sure they don't change too much because they have the strongest effect on the chance of a decent outcome"

#

but it's not per-layer, it has to be per-weight because in general you don't really know how the network is gonna capture the transformations. well, unless you are carefully designing the architecture to have certain properties, like encoder/decoder pairs
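
The technique described here sounds like elastic weight consolidation (EWC). A minimal sketch of its per-weight penalty, assuming you already have an importance score for each weight (EWC uses the diagonal of the Fisher information). The numbers are invented for illustration.

```python
def ewc_penalty(weights, anchor_weights, importance, strength=1.0):
    """Quadratic penalty that pulls important weights back toward their old values."""
    return 0.5 * strength * sum(
        f * (w - w_old) ** 2
        for w, w_old, f in zip(weights, anchor_weights, importance)
    )


old = [1.0, -2.0, 0.5]          # weights after the first task
importance = [10.0, 0.1, 0.0]   # hypothetical per-weight importance scores
new = [1.5, -1.0, 3.0]          # weights while training on the second task

penalty = ewc_penalty(new, old, importance)
print(penalty)
```

The penalty is added to the new task's loss. The third weight moved the most but costs nothing because its importance is zero, which is the per-weight (not per-layer) behaviour mentioned above.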

unkempt wigeon Jan 6, 2025, 12:18 AM

#

In a sense you could make a sub-neuron group, like a class for each neuron: there's eight layers of neurons in each one that give you a binary-coded pair, ones and zeros, similar to a human brain. But that would require a powerful system just to render maybe one neuron, because it needs a lot of power and processing to go through each sub-layer of the neuron to find out the probability or number that should come out

gray slate Jan 6, 2025, 12:20 AM

#

might be cool to have a single bit that says "if this bit is set, then don't change the power of this float. you can change its mantissa but not its power"

#

you'd get elastic weight consolidation for 1 bit of RAM. But you'd also need to do some pretty low-level hacking in CUDA

#

dunno if its even possible. thought about it the other day but have no experience working at that level in CUDA

#

(I'm not a machine learning expert btw, just a graphics / performance nerd who is dipping his toe into this)

unkempt wigeon Jan 6, 2025, 12:28 AM

#

Here's what I thought as a possible way sorry for my messy handwriting

unkempt wigeon Jan 6, 2025, 12:30 AM

#

gray slate (I'm not a machine learning expert btw, just a graphics / performance nerd who i...

I've been trying to look for the right book sorry

gray slate Jan 6, 2025, 12:31 AM

#

unkempt wigeon Here's what I thought as a possible way sorry for my messy handwriting

doesn't backprop kinda do this for you anyway? I'm kinda a noob myself

unkempt wigeon Jan 6, 2025, 12:32 AM

#

gray slate doesn't backprop kinda do this for you anyway? I'm kinda a noob myself

Yes, but it's more of a passing gate: the higher the probability as the network is learning, the further it's allowed to go, but the lower the probability, it stops it in its tracks from learning until it gets the right probability from each layer, so that it's mostly on the same page in a sense

gray slate Jan 6, 2025, 12:34 AM

#

ah okay. what's the intended outcome here?

unkempt wigeon Jan 6, 2025, 12:37 AM

#

gray slate ah okay. what's the intended outcome here?

To have the network get closer to a winnable probability of it saying this is correct, by locking the next layer till the probability output is close enough. Sure it seems stupid, but heck, it might be the most probable answer that can be obtained. The network might be slow, but after it gets the right probability on each layer it learns more about each probability. It does the back tracing as it's working on going forward, so that I can tweak it to get an outcome that can be quickly obtained without going through the entire network finding out which layer needs to be changed and then completely reworked. By doing the process in parallel you could probably cut down on all the processes

gray slate Jan 6, 2025, 12:37 AM

#

from what I understand, as the network learns, the first layers tend to settle down first and learn higher order patterns. then later on the deeper layers flap about more and learn more nuanced ones. You can use that to your advantage with curriculum learning - you give it easier data to start with then ramp up the complexity and reduce the learning rate as you progress

gray slate Jan 6, 2025, 12:40 AM

#

unkempt wigeon To have the network get closer to winnable probability of it saying this is corr...

so you're kinda doing it the other way round, like forward instead of backwards? is that like Hinton's "forward-forward" idea, and the way that human brains work? because a ML guru once told me that it's massively inefficient that way, and backprop puts the human brain to shame by being orders of magnitude better at learning

#

though per layer sounds interesting. like if you found parts of your dataset that caused more variance in later layers, you could perhaps run a few tests and automatically build a curriculum?

unkempt wigeon Jan 6, 2025, 12:49 AM

#

gray slate though per layer sounds interesting. like if you found parts of your dataset tha...

W•X+B(P/L of P)

P = current probability
L of P = last layers probability

I know it's most unlikely to work

gray slate Jan 6, 2025, 12:51 AM

#

Is P the loss at the current weight, and L of P some aggregate of the loss of its inputs?

unkempt wigeon Jan 6, 2025, 12:58 AM

#

gray slate Is P the loss at the current weight, and L of P some aggregate of the loss of it...

P is the current probability, and the loss would be L of P: basically the current probability minus the last layer's probability

unkempt wigeon Jan 6, 2025, 1:00 AM

#

gray slate Is P the loss at the current weight, and L of P some aggregate of the loss of it...

Sorry

gray slate Jan 6, 2025, 1:02 AM

#

I may be the wrong person to be asking this tbh! I kinda get the general principles but lack the mathemagic and practical experience!

unkempt wigeon Jan 6, 2025, 1:14 AM

#

gray slate I may be the wrong person to be asking this tbh! I kinda get the general princip...

Do you think it might be impractical?

gray slate Jan 6, 2025, 1:15 AM

#

unkempt wigeon Do you think it might be impractical?

Dunno to be honest, I think the whole of machine learning seems impractical. It's all a matter of whether it gets results or not

#

seems like the whole field is blundering through hacks that kinda work via magic, and using empiricism to prove it

#

Do you have a specific task in mind?

unkempt wigeon Jan 6, 2025, 1:21 AM

#

gray slate Do you have a specific task in mind?

No, I was trying to condense the process into one task that can be done by the computer all at once. I know it seems stupid, but it allows the best data with the highest probability to go forward to the next one, and then it would lock the best data strings and then take the data from that string and pass it on to the next one if it's already been previously unlocked via being above the threshold

gray slate Jan 6, 2025, 1:23 AM

#

there's a lot of tasks out there, y'know

unkempt wigeon Jan 6, 2025, 1:28 AM

#

gray slate there's a lot of tasks out there, y'know

I know. I want it to be able to learn anything really; it depends on what I want to set it on. The idea is that the network gets the highest probability through training. This is more of a genetic algorithm, but it can adapt: if the probability is high for something specific within that neuron arrangement or layer, it allows it to pass on to the next one, and so on and so forth. If it gets to the end, it learns that data, because it took multiple tries for the data to be fit enough to go through and be learned

rich moth Jan 6, 2025, 2:03 AM

#

So try to imagine a single tool that can quantify the complexity of any dataset. Doesn't matter whether it's images, text, or numbers. Using a unified metric, Phi(x), it quantifies the complexity of the data based on things like density, entropy, phase, and uncertainty. It doesn't stop at just measuring complexity. Phi(x) helps uncover hidden patterns and relationships, like identifying chaotic points in a time series, spotting high density regions in images, or finding real relationships in tabular data. It's like an x-ray for datasets, a way to make sense of the abstract, hidden stuff in data.

rancid sorrel Jan 6, 2025, 2:16 AM

#

you should also measure noise

rich moth Jan 6, 2025, 2:17 AM

#

rancid sorrel you should also measure noise

thats a fantastic idea!

rancid sorrel Jan 6, 2025, 2:17 AM

#

the biggest generic one is to find the data distribution

gray slate Jan 6, 2025, 2:17 AM

#

compress it for starters I guess?

rancid sorrel Jan 6, 2025, 2:18 AM

#

and the clustering of data

gray slate Jan 6, 2025, 2:18 AM

#

I mean, decompress it first so it's raw. then throw it into PAQ

rancid sorrel Jan 6, 2025, 2:18 AM

#

do you wanna know the "biggest" test for data?

rich moth Jan 6, 2025, 2:18 AM

#

rancid sorrel do you wanna know the "biggest" test for data?

yes i do

rancid sorrel Jan 6, 2025, 2:18 AM

#

second order differential of the dataset

gray slate Jan 6, 2025, 2:19 AM

#

rate of change of rate of change?

rancid sorrel Jan 6, 2025, 2:19 AM

#

yup

#

d^2x/dy^2 or something
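
For sampled data, the "rate of change of rate of change" is usually taken as a second-order finite difference. A small sketch, assuming evenly spaced samples; the two toy series are only there to show what the result looks like for a straight line and for a curve.

```python
def diff(values):
    """First difference: change between consecutive samples."""
    return [b - a for a, b in zip(values, values[1:])]


def second_difference(values):
    """Second difference: change of the change (discrete second derivative)."""
    return diff(diff(values))


line = [2 * t + 1 for t in range(6)]   # perfectly linear trend
parabola = [t * t for t in range(6)]   # constant curvature

print(second_difference(line))      # all zeros: no curvature at all
print(second_difference(parabola))  # constant non-zero curvature
```

Real measurements would normally give a noisy second difference, so a suspiciously clean one is at least worth a second look.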

gray slate Jan 6, 2025, 2:19 AM

#

you need to know dimensionality too though for that dont you?

rancid sorrel Jan 6, 2025, 2:19 AM

#

not really

#

cause you want the trend line to be junk

#

if you get an actual trend line, your data has been faked

#

its how they spot fake data in publications when someone with brains looks at it

gray slate Jan 6, 2025, 2:20 AM

#

well a bitmap, for example, is [y][x][channels]

rancid sorrel Jan 6, 2025, 2:20 AM

#

and it should return junk

#

if it returns an r2 of say 1 you're fucked

#

cause the data has been manipulated

gray slate Jan 6, 2025, 2:22 AM

#

not sure I understand the reasoning behind that. I could make it junk by multiplying in bytes from /dev/urandom couldn't i?

rancid sorrel Jan 6, 2025, 2:23 AM

#

yeah but the ppl who submit their papers dont know that

gray slate Jan 6, 2025, 2:23 AM

#

ah ok lol

#

they fake their papers and don't understand that they should find the normal distribution of the thing they want to fake and introduce some variance from it?

#

and, people actually do that? makes me very skeptical of any science that doesn't come with full datasets and runnable code; docker container / gtfo

rancid sorrel Jan 6, 2025, 2:25 AM

#

sorry trying to find the video on it

#

can't right now

gray slate Jan 6, 2025, 2:27 AM

#

I've a theory that anything that backs up something we already believe culturally is not science, it's ethics in a lab coat, and that the truth value of something is inversely proportional to how hard it's pushed.

rich moth Jan 6, 2025, 2:28 AM

#

gray slate Jan 6, 2025, 2:29 AM

#

clustering looks pretty cool, what's it clustering?

rich moth Jan 6, 2025, 2:30 AM

#

gray slate clustering looks pretty cool, what's it clustering?

Thanks! It's different data points grouped by their features, using PCA for dim reduction. Each cluster represents images, text, or time series.

gray slate Jan 6, 2025, 2:31 AM

#

PCA = principal component analysis? what does that do exactly?

rancid sorrel Jan 6, 2025, 2:32 AM

#

gray slate clustering looks pretty cool, what's it clustering?

given point A how close is point B

#

is point B related to A

gray slate Jan 6, 2025, 2:35 AM

#

oh so that's the colour space of the image? like luminance and hue or something?

rich moth Jan 6, 2025, 2:37 AM

#

gray slate oh so that's the colour space of the image? like luminance and hue or something?

not just some features, all features; then it reduces the dim size while still keeping the most critical parts intact.

#

https://builtin.com/machine-learning/pca-in-python

Built In

PCA Using Python: A Tutorial | Built In

Principal component analysis (PCA) in Python can be used to speed up model training or for data visualization. Here's how to carry out both using scikit-learn.

rancid sorrel Jan 6, 2025, 2:40 AM

#

gray slate oh so that's the colour space of the image? like luminance and hue or something?

its for general clustering

#

it's a classification thing tbh

#

https://media.geeksforgeeks.org/wp-content/uploads/20230320171738/download-(25).png

gray slate Jan 6, 2025, 2:41 AM

#

but does it operate on tensor array of pixels?

rancid sorrel Jan 6, 2025, 2:41 AM

#

this data is clearly clustered for example

gray slate Jan 6, 2025, 2:41 AM

#

or tensor of pixels I guess. array of pixels?

rancid sorrel Jan 6, 2025, 2:42 AM

#

i mean if you've got a large group of one greyscale colour in an area, kmeans would probably pick it up

rich moth Jan 6, 2025, 2:42 AM

#

not exactly classification. clustering is an unsupervised learning technique.

rancid sorrel Jan 6, 2025, 2:43 AM

#

i mean it tells you if say your data is a member of A,B,C

#

so yeah you can use it for classification

#

like for example
whether you'd like drama on netflix if you like anime

#

thats one of the things kmeans can do

gray slate Jan 6, 2025, 2:43 AM

#

yeah I get k-means. it kinda draws lines through the space cutting it into bubble type things that pop and stick together until you have N left?

rich moth Jan 6, 2025, 2:44 AM

#

its results can inform or be used in conjunction with classification, but they're two separate beasts.

gray slate Jan 6, 2025, 2:44 AM

#

with 3d it'd be planes, and with higher dimensions some n-dimensional surface line that's cropped by intersections

rancid sorrel Jan 6, 2025, 2:44 AM

#

https://www.baeldung.com/cs/k-means-for-classification

Baeldung on Computer Science

K-Means for Classification | Baeldung on Computer Science

Learn how to use K-means for classification.

gray slate Jan 6, 2025, 2:46 AM

#

k-means seems extremely computationally expensive from a gfx hacker point of view! but I guess it's generic

rancid sorrel Jan 6, 2025, 2:46 AM

#

oh absolutely, it's an insane computation cost

#

its like what o^e time or someshit

gray slate Jan 6, 2025, 2:47 AM

#

data science has all of the brute force and none of the ignorance lol

rancid sorrel Jan 6, 2025, 2:47 AM

#

its been a while since ive consulted the chart

#

yeah i have a chart somewhere of a lot of the algorithms' complexity

rich moth Jan 6, 2025, 2:49 AM

#

isnt k-means a general algo?

rancid sorrel Jan 6, 2025, 2:49 AM

#

yes, but like a nuke is a long understood weapon you dont wanna pull it out

#

on big ass data ;)

rich moth Jan 6, 2025, 2:54 AM

#

rancid sorrel yes, but like a nuke is a long understood weapon you dont wanna pull it out

I think we're both right its clustering by k-means essentially

#

🙂

#

I mean there are several things going on but thats half of it

#

PCA is the other key

gray slate Jan 6, 2025, 2:58 AM

#

hmm looking at it, k-means doesn't seem all that bad for iterations, it's splitting the space that'll hurt I guess

#

if you don't know how many clusters you have then I guess you need to have everything be a centroid then you'll get worst-case performance?
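
For reference, a bare-bones sketch of the usual k-means procedure (Lloyd's algorithm) on 2-D points, with the starting centroids passed in by hand so the run is deterministic. Each iteration compares every point with every centroid, so the cost per pass is roughly points × clusters × dimensions. The points and starting centroids are made up.

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm on 2-D points, starting from the given centroids."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its points
        # (an empty cluster keeps its previous centroid).
        centroids = [
            (sum(p[0] for p in cl) / len(cl), sum(p[1] for p in cl) / len(cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    return centroids, clusters


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (9.0, 9.0), (9.0, 10.0), (10.0, 9.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 9.0)])
print(centroids)
print(clusters)
```

The number of clusters has to be chosen up front. When it isn't known, the usual approach is to run it for several values of k and compare the results (elbow plot, silhouette score) rather than starting with every point as its own centroid.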

rich moth Jan 6, 2025, 3:00 AM

#

That was the last of the visuals

rancid sorrel Jan 6, 2025, 3:05 AM

#

there are entire subjects on image processing tbh

#

it's one of the more developed fields, that and signal processing

wheat merlin Jan 6, 2025, 3:05 AM

#

gray slate PCA = principal component analysis? what does that do exactly?

They use PCA to create indexes for stuff. For instance you could dump in 100 features and it picks out a smaller set of the most important directions that stays closest to the actual feature space. Used to reduce overfitting too. It's similar to autoencoders in neural nets
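
To make that concrete, here is a dependency-free sketch that finds only the first principal component: center the data, build the covariance matrix, and use power iteration to find the direction of largest variance. In practice you'd more likely reach for scikit-learn's `PCA`; the function name and the toy rows are assumptions for illustration, and the sketch assumes the data actually has some variance.

```python
def first_principal_component(rows, iterations=100):
    """Return (direction, scores) for the top principal component of `rows`."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeated multiplication converges to the eigenvector
    # with the largest eigenvalue, i.e. the direction of most variance.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Project each centered row onto that direction: 1 number per row.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


rows = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]  # second feature = 2 x first
direction, scores = first_principal_component(rows)
print(direction)
print(scores)
```

Two correlated features collapse into one score per row without losing any information here, which is the "smaller part closest to the actual feature space" idea.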

rich moth Jan 6, 2025, 3:16 AM

#

gray slate k-means seems extremely computationally expensive from a gfx hacker point of vie...

I did some things to address this. Creating memory efficient versions of large datasets is important.

#

scaling and PCA

#

But since all datasets will be memory efficient it solves that.

#

well all is a big word lol i cant say that yet

gray slate Jan 6, 2025, 3:17 AM

#

The guy who invented YOLO put an entire course on computer vision on YouTube

rancid sorrel Jan 6, 2025, 3:18 AM

#

best option you have is parquet and avro

#

parquet if you want columns, avro if you want rows

gray slate Jan 6, 2025, 3:19 AM

#

https://www.youtube.com/watch?v=8jXIAWg_yHU

YouTube

Joseph Redmon

The Ancient Secrets of Computer Vision - 01 - Introduction

The Ancient Secrets of Computer Vision

https://pjreddie.com/courses/computer-vision/

An introductory course on computer vision originally held Spring 2018 at the University of Washington.

rich moth Jan 6, 2025, 3:19 AM

#

!paste

arctic wedgeBOT Jan 6, 2025, 3:19 AM

#

Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.

gray slate Jan 6, 2025, 3:20 AM

#

Well worth a watch, since, well, y'know, he actually invented YOLO. Knew a fair bit going in but there was a lot of good stuff in there too

rancid sorrel Jan 6, 2025, 3:20 AM

#

whats pretty funny is disney cracked "green screen" decades ago

#

then lost it for 50-60 years

#

then some youtubers worked it out again

gray slate Jan 6, 2025, 3:21 AM

#

what do you mean?

rancid sorrel Jan 6, 2025, 3:21 AM

#

so pretty much all VFX now gonna be recorded in video streams

#

https://www.youtube.com/watch?v=UQuIVsNzqDk

YouTube

Corridor Crew

This Invention Made Disney MILLIONS, but Then They LOST It!

Four years ago, we learned about Disney's magic prism that created the best t...

rich moth Jan 6, 2025, 3:22 AM

#

incase you like to check out the metrics https://paste.pythondiscord.com/XEPQ

gray slate Jan 6, 2025, 3:23 AM

#

rancid sorrel https://www.youtube.com/watch?v=UQuIVsNzqDk

oh that's a cool hack I like that

rancid sorrel Jan 6, 2025, 3:23 AM

#

you should also use the python profiler

rich moth Jan 6, 2025, 3:23 AM

#

rancid sorrel you should also use the python profiler

you mean me?

rancid sorrel Jan 6, 2025, 3:23 AM

#

yeah, it will tell you what's going on with how many times stuff has been called etc

rich moth Jan 6, 2025, 3:24 AM

#

ok let me check that out thanks

rancid sorrel Jan 6, 2025, 3:24 AM

#

https://docs.python.org/3/library/profile.html

Python documentation

The Python Profilers

Source code: Lib/profile.py and Lib/pstats.py Introduction to the profilers: cProfile and profile provide deterministic profiling of Python programs. A profile is a set of statistics that describes...

gray slate Jan 6, 2025, 3:24 AM

#

snakevis

rancid sorrel Jan 6, 2025, 3:24 AM

#

cProfile

gray slate Jan 6, 2025, 3:24 AM

#

snakeviz? can't remember. it's good for profile outputs though
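
A short sketch of what that looks like with the standard library's `cProfile` and `pstats`. The function `slow_sum` is just a stand-in for whatever code you actually want to measure.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Toy workload to profile (hypothetical stand-in for real code)."""
    return sum(i * i for i in range(n))


profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(100_000)
profiler.disable()

# Collect the report in a string: call counts and time per function,
# sorted by cumulative time, limited to the top 5 entries.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

For a whole script, `python -m cProfile -o out.prof script.py` writes the stats to a file, and `snakeviz out.prof` should open it as an interactive chart.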

rancid sorrel Jan 6, 2025, 3:24 AM

#

i used it to test dataclasses

#

and holy fuck they are efficient

#

gray slate Jan 6, 2025, 3:35 AM

#

reckon I can train Whisper or similar to keylog using this? https://paste.pythondiscord.com/HZCA

#

if I had enough data

rich moth Jan 6, 2025, 3:39 AM

#

gray slate reckon I can train Whisper or similar to keylog using this? https://paste.python...

For sure. It looks good and Whisper is a great pick. Hmm, I would focus on your data, like lots of it, and good feature engineering.

gray slate Jan 6, 2025, 3:40 AM

#

might be better as a web app actually, and have people type data in and post it to something that uploads it to archive.org

#

pretty sure milint and the likes already have this data. would be cool to let everyone else have access to it

rich moth Jan 6, 2025, 3:46 AM

#

gray slate might be better as a web app actually, and have people type data in and post it ...

it's a really good idea actually. What are you thinking for the backend, like Node.js or Python? Serverless or cloud functions? And what have you considered about the quality of data coming in on the web? Maybe like client-side checks to validate the audio recording lengths and format before it's uploaded.

gray slate Jan 6, 2025, 3:48 AM

#

Well, I'll run it for a bit locally and see if I can get something working I guess. If I can then javascript -> fastapi -> internetarchive -> tag the uploads so they can be found by anyone who wants to train it. release as a docker container

rich moth Jan 6, 2025, 3:49 AM

#

gray slate Well, I'll run it for a bit locally and see if I can get something working I gue...

I like it dude, you got a sound game plan

gray slate Jan 6, 2025, 3:50 AM

#

I doubt HuggingFace would allow it lol

rich moth Jan 6, 2025, 3:51 AM

#

What about streamlit?

#

I guess thats not the best option

gray slate Jan 6, 2025, 3:53 AM

#

Dunno, decentralized is probably the way to go for something that might upset the safety crowd

rich moth Jan 6, 2025, 3:59 AM

#

How bout FastAPI for the backend, React.js for the frontend, and leverage services like Elastic Cloud or AWS Elasticsearch and Grafana?

gritty vessel Jan 6, 2025, 3:59 AM

#

Hey guys I wanted to ask I am working with conv3d

#

So if data is of height 300 width 300 depth let's say I stacked all images so it's 150

#

H=300,w=300,d=150

#

So If I keep kernel size x,y,10 so it will be capturing temporal features right? Till 10 time steps
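
A quick back-of-the-envelope check for that setup, using the standard output-size formula for a convolution with no dilation. The volume and the depth-10 kernel come from the question; the 3x3 spatial kernel, stride 1 and zero padding are assumptions picked only to have concrete numbers.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Output length along one axis for a standard (non-dilated) convolution."""
    return (size + 2 * padding - kernel) // stride + 1


volume = (300, 300, 150)  # height, width, depth (stacked frames)
kernel = (3, 3, 10)       # 3x3 spatial size is hypothetical; depth 10 is from the question

output = tuple(conv_output_size(s, k) for s, k in zip(volume, kernel))
print(output)  # spatial size shrinks by 2, depth shrinks by 9
```

So yes, with a depth of 10 each output value in the first layer mixes 10 consecutive frames, and stacking more conv layers widens that temporal window. Axis order depends on the framework; PyTorch's `Conv3d`, for example, expects input as (N, C, D, H, W).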

gray slate Jan 6, 2025, 4:03 AM

#

rich moth How bout FASTAPI for the backend React.js for the frontend and leverage services...

Well, with Internet Archive I don't need to pay anyone. I don't run it on anyone else's machine, I'm not responsible for it as a whole. It just runs when people decide to run it, they need an archive.org account - which is free. It collects and tags data, dropping it into a folder that can be downloaded by anyone, and Brewster's bunch aren't gonna delete data collected like that either. And some people will upload good data, others bad, and you can filter them by their archive.org username

gray slate Jan 6, 2025, 4:04 AM

#

gritty vessel So If I keep kernel size x,y,10 so it will be capturing temporal features right?...

haven't used it personally

rich moth Jan 6, 2025, 4:05 AM

#

gray slate Well, with Internet Archive I don't need to pay anyone. I don't run it on anyone...

i was honestly unaware. it sounds awesome, i'll check it out. well don't let me misguide you! I like that idea

#

wow this is awesome thanks for sharing i didnt know something like this existed.

rich moth Jan 6, 2025, 4:08 AM

#

gritty vessel Hey guys I wanted to ask I am working with conv3d

Are you building a CNN?

gritty vessel Jan 6, 2025, 4:08 AM

#

rich moth Are you building a CNN?

Yeah

#

Encoder-decoder arch

rich moth Jan 6, 2025, 4:11 AM

#

gritty vessel Yeah

Do you plan on maintaining the dims throughout the network or will you collapse them at some point?

#

I mean what kind of output are you trying to generate with the encoder-decoder?

gritty vessel Jan 6, 2025, 4:21 AM

#

It's unsupervised classification

#

So what I am trying to do is see if I can reconstruct an image properly

#

If I can reconstruct the images properly I will apply clustering on the latent space

gritty vessel Jan 6, 2025, 4:22 AM

#

rich moth Do you plan on maintaininn the dims throughout the nnetwork or will you collapse...

Yes

rich moth Jan 6, 2025, 4:50 AM

#

So that complexity tool is cool and all, but the real magic is how it adapts to a dynamic adaptive transformer. The cool part is that the transformer can automatically adjust its dim size based on the complexity of the data, so bigger for more complex stuff and smaller for simpler things. It's like the automatic transmission of a car: it talks to a sensor to shift gears dynamically.

rancid sorrel Jan 6, 2025, 4:50 AM

#

you know what i just thought of that would be fucking awesome

rich moth Jan 6, 2025, 4:51 AM

#

rancid sorrel you kno what i just thought of that would befucking awsome

tell me!

rancid sorrel Jan 6, 2025, 4:51 AM

#

do you know what a continuously variable transmission is?

rich moth Jan 6, 2025, 4:51 AM

#

I own one, a 2014 Nissan Altima.

rancid sorrel Jan 6, 2025, 4:51 AM

#

make one for transformers

rich moth Jan 6, 2025, 4:52 AM

#

it's a pretty awesome idea.

rancid sorrel Jan 6, 2025, 4:54 AM

#

i am trying to find the math for this
https://www.youtube.com/watch?v=mWJHI7UHuys&t=1068s

YouTube

driving 4 answers

This Is The World's First Geared CVT and It Will Blow Your Mind - ...


rich moth Jan 6, 2025, 5:28 AM

#

rancid sorrel i am trying to find the math for this https://www.youtube.com/watch?v=mWJHI7UHuy...

it's interesting. I'll check out the video tomorrow. bed time

plucky iron Jan 6, 2025, 12:30 PM

#

I can't route to the different apps within my HF Space.
Neither my root app nor the app files in the other folders seem to work.
https://huggingface.co/spaces/QBit069/inq
my main app.py is:

```py
# folder structure
# root/
# ├── demo/
# │   └── app.py
# ├── resnet18/
# │   └── app.py
# ├── resnet34/
# │   └── app.py
# └── app.py  # Main app to route traffic


from fastapi import FastAPI
from fastapi.middleware.wsgi import WSGIMiddleware
import os

# Create a FastAPI instance
app = FastAPI()

# Mount the apps
def load_subapp(folder_name):
    folder_path = os.path.join(os.getcwd(), folder_name)
    exec(open(os.path.join(folder_path, "app.py")).read(), globals())
    return globals().get("app")

app.mount("/demo", WSGIMiddleware(load_subapp("demo")))
app.mount("/resnet18", WSGIMiddleware(load_subapp("resnet18")))
app.mount("/resnet34", WSGIMiddleware(load_subapp("resnet34")))

@app.get("/")
def read_root():
    return {"message": "Welcome! Use /demo, /resnet18, or /resnet34 to access specific models."}
```

How can I route?
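
One thing that stands out in the snippet above: `exec(..., globals())` runs each sub-file in the root module's namespace, so every sub-file's `app = ...` overwrites the root `app`. A minimal sketch of loading each `app.py` as its own module instead, using only the standard library. The `base_dir` parameter and the module naming are illustrative choices, and the mounting lines are left as comments because they assume each sub-app is an ASGI app such as FastAPI (`WSGIMiddleware` is only meant for WSGI apps like Flask).

```python
import importlib.util
from pathlib import Path

def load_subapp(folder_name, base_dir="."):
    """Import <base_dir>/<folder_name>/app.py as a separate module and return its `app`."""
    path = Path(base_dir) / folder_name / "app.py"
    spec = importlib.util.spec_from_file_location(f"{folder_name}_app", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the file in its own namespace
    return module.app

# Hypothetical usage in the root app.py, assuming ASGI sub-apps:
#   app.mount("/demo", load_subapp("demo"))
#   app.mount("/resnet18", load_subapp("resnet18"))
#   app.mount("/resnet34", load_subapp("resnet34"))
```

Whether this is enough depends on what the sub-apps actually are, which the message doesn't say.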

fickle shale Jan 6, 2025, 2:26 PM

#

just ask

gritty vessel Jan 6, 2025, 2:58 PM

#

Is there any way to capture variable time steps in a data set? Like I have many events

#

Rain events*

#

Some events are 9 hours long and some are like half hour

#

So when I create batches

#

One event has like 50 time steps that's the total duration of the event

#

And some had like 2 time steps that's around half hour

fickle shale Jan 6, 2025, 3:05 PM

#

gritty vessel Is there any way to to capture variable time steps in data set like I have many ...

creating another column?

gritty vessel Jan 6, 2025, 3:05 PM

#

fickle shale creating another column?

It's 2d data

#

Let's say event 1 ,timestep 1 will have a image

#

All the time steps in that event will have an image

#

But count of image differs in each event

fickle shale Jan 6, 2025, 3:07 PM

#

gritty vessel It's 2d data

like event 1 has one image and in this image we have 5image?

gritty vessel Jan 6, 2025, 3:07 PM

#

No what?

#

How can have 5 image in 1 image

fickle shale Jan 6, 2025, 3:08 PM

#

gritty vessel How can have 5 image in 1 image

a image of multiple image collage type

#

can u elaborate ur question i am not able to understand!

gritty vessel Jan 6, 2025, 3:08 PM

#

gritty vessel Let's say event 1 ,timestep 1 will have a image

Here

#

Wait I will explain again

#

See it's raining for 9 hours so we will have 56 images for those 9 hours

#

This all 9 hours in 1 event

#

After few hours it rains again for 2 hours

#

So it's event 2 but now it has 12 images

#

Do you understand it now?

fickle shale Jan 6, 2025, 3:14 PM

#

I don't know may be others can help u

fickle shale Jan 6, 2025, 3:14 PM

#

gritty vessel Is there any way to to capture variable time steps in data set like I have many ...

u want to capture timesteps?

#

and ur data look likes?

gritty vessel Jan 6, 2025, 3:15 PM

#

My bad

#

It's sequence of images

#

But for each event sequence is different

#

And size of sequence is different as well
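
One common way to batch sequences of different lengths is to pad them to the longest one and carry a mask marking which frames are real. A minimal NumPy sketch, assuming every image has the same height and width; the 4x4 frames and the `pad_events` name are made up for the demo.

```python
import numpy as np

def pad_events(events, pad_value=0.0):
    """Stack variable-length image sequences into one array plus a validity mask."""
    max_len = max(len(e) for e in events)
    frame_shape = events[0].shape[1:]
    batch = np.full((len(events), max_len, *frame_shape), pad_value, dtype=np.float32)
    mask = np.zeros((len(events), max_len), dtype=bool)
    for i, event in enumerate(events):
        batch[i, : len(event)] = event   # copy the real frames
        mask[i, : len(event)] = True     # True = real frame, False = padding
    return batch, mask

# Two toy rain events: 56 frames and 12 frames.
events = [
    np.ones((56, 4, 4), dtype=np.float32),
    np.ones((12, 4, 4), dtype=np.float32),
]
batch, mask = pad_events(events)
print(batch.shape, mask.sum(axis=1))
```

The mask can then be used to ignore padded frames in the loss. Cutting long events into fixed-size windows is another option if padding 2 steps up to 50 wastes too much.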

hybrid locust Jan 6, 2025, 6:35 PM

#

Hey folks, hope you're doing well. I'm trying to run this project for humanizing a midi clip
https://github.com/erwald/midihum

GitHub

GitHub - erwald/midihum: MIDI humanisation with machine learning

MIDI humanisation with machine learning. Contribute to erwald/midihum development by creating an account on GitHub.

#

Thing is, when running `Main.py`, it throws the following error. Do you know how to fix it?

```
midihum_model loading model from model_cache\midihum.json and model_cache\midihum_scaler.json
C:\Users\Guido\AppData\Local\Programs\Python\Python313\Lib\site-packages\sklearn\utils\_tags.py:354: FutureWarning: The MyXGBRegressor or classes from which it inherits use `_get_tags` and `_more_tags`. Please define the `__sklearn_tags__` method, or inherit from `sklearn.base.BaseEstimator` and/or other appropriate mixins such as `sklearn.base.TransformerMixin`, `sklearn.base.ClassifierMixin`, `sklearn.base.RegressorMixin`, and `sklearn.base.OutlierMixin`. From scikit-learn 1.7, not defining `__sklearn_tags__` will raise an error.
  warnings.warn(
midihum could not humanize the given file: 'super' object has no attribute '__sklearn_tags__'
```

steep bough Jan 6, 2025, 7:22 PM

#

Hello hello

#

I'm here to ask about what would it take to work on the more theoretical part of Data Science? The more "science" part of it rather than the more applied/industry part of it

#

I'm currently in a CS major, but I'm thinking about changing to a more math reliant degree at my college (which is math focused on Data Science and AI)

serene scaffold Jan 6, 2025, 7:44 PM

#

@steep bough the kind of jobs you're describing are to be had in academia more so than in industry. so you'd need to get a PhD. And when you're getting a PhD, you get to blaze your own trail and be interdisciplinary.

whatever direction you end up wanting to go with theoretical data science, there are PhDs doing it in the context of CS, statistics, and probably a few others.

wheat merlin Jan 6, 2025, 7:51 PM

#

steep bough I'm currently in a CS major, but I'm thinking about changing to a more math reli...

Stelercus is right. Also, if you end up applying to Ph.D./Masters programs, i'd highly recommend you try and be part of some research projects in undergrad so that when you apply you have that on your resume

#

Also when you apply, it can be a good practice to contact professors and suggest professors you would like to work with on your application

#

Showing that you know what research is and that you are prepared for it is like 80% of the selection process

steep bough Jan 6, 2025, 7:56 PM

#

Yeah. I had a feeling that that would lead me to academia, which is what I want to do (or at least that's where my current interests lie)

#

And, what kind of projects does one do in theoretical data science?

steep bough Jan 6, 2025, 8:00 PM

#

wheat merlin Stelercus is right. Also, if you end up applying to Ph.D./Masters programs, i'd ...

I'm planning to do that. My school lets me have a semester where I can do pretty much anything, and one of those things is a research project. I'm just still figuring out what area

serene scaffold Jan 6, 2025, 8:06 PM

#

steep bough And, what kind of projects does one do in theoretical data science?

talk to the research faculty in your department and ask them about what they're working on.

steep bough Jan 6, 2025, 8:29 PM

#

Got it
Thanks :3

hybrid locust Jan 6, 2025, 9:32 PM

#

hybrid locust Thing is, when running `Main.py`, it throws the following error. Do you know how...

solved! Had to downgrade scikit-learn to a lower version → https://stackoverflow.com/questions/79290968/super-object-has-no-attribute-sklearn-tags

rancid sorrel Jan 6, 2025, 10:06 PM

#

steep bough Yeah. I had a feeling that that would lead me to academia, which is what I want ...

i hope you're good at degree level math

mighty widget Jan 6, 2025, 10:44 PM

#

i have an image it has bounding boxes which are txt files how do I make this in yolov8 format

#

the image looks like this

#

bounding boxes looks like this

#

in yolo format

#

how can I export this for use
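
For reference, an Ultralytics YOLO label file has one line per box: `class x_center y_center width height`, with all four values normalized to 0..1 by the image size. A minimal conversion sketch, assuming the existing boxes are pixel corners (`xmin, ymin, xmax, ymax`); the function name and the sample numbers are hypothetical, since the actual txt layout isn't shown here.

```python
def to_yolo_line(class_id, xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert one pixel-corner box to a normalized YOLO label line."""
    x_center = (xmin + xmax) / 2 / img_w
    y_center = (ymin + ymax) / 2 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"

# Example: a 200x200 px box in a 640x480 image, class 0.
line = to_yolo_line(0, 100, 50, 300, 250, img_w=640, img_h=480)
print(line)  # 0 0.312500 0.312500 0.312500 0.416667
```

Each image then gets a `.txt` file with the same base name, holding one such line per box.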

steep bough Jan 6, 2025, 11:39 PM

#

rancid sorrel i hope your good at degree level math

I think I am, although I haven't had much beyond some discrete math and calc 3

#

Oh, and a bit of linear algebra

earnest widget Jan 7, 2025, 1:00 AM

#

mighty widget in yolo format

Are you using yolov5 ultralytics?

rich moth Jan 7, 2025, 3:44 AM

#

The time series are really interesting.

fickle shale Jan 7, 2025, 11:52 AM

#

rich moth The time series are really interesting.

Damn! What r u doing!

cursive oriole Jan 7, 2025, 12:42 PM

#

Hello, I want to get into AI-ML, I can't afford any of the paid courses available online which is why I'm watching a lot of statquest, but I haven't done anything on the coding part, I have a few project ideas I wanna try, one of them is to create an AI model that can be trained to speed run videogames and do things like find glitches or come up with tactics

#

I want some advice on what I should do next after watching StatQuest to work towards that project, if it is possible

bright rain Jan 7, 2025, 12:46 PM

#

Why is the TensorFlow library not working on Python 3.13?

wheat merlin Jan 7, 2025, 1:18 PM

#

cursive oriole I want some advise on what I should do next after watching StatQuest to like wor...

You could train a model for PyGame snake

fickle shale Jan 7, 2025, 1:37 PM

#

cursive oriole Hello, I want to get into AI-ML, I can't afford any of the paid courses availabl...


Donald Knuth

rich moth Jan 7, 2025, 3:02 PM

#

fickle shale Damn! What r u doing!

The cool thing about analyzing the second derivative like this is that it pulls out patterns in how the data is accelerating or decelerating over time
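
For evenly spaced samples, the discrete version of this is just differencing twice. A tiny NumPy sketch with made-up data (a series that grows like t squared), not the dataset discussed here:

```python
import numpy as np

series = np.array([0.0, 1.0, 4.0, 9.0, 16.0])  # toy data: t**2 at t = 0..4

velocity = np.diff(series)            # first difference: change per step
acceleration = np.diff(series, n=2)   # second difference: change of the change

print(velocity)      # [1. 3. 5. 7.]
print(acceleration)  # [2. 2. 2.]
```

On noisy real data the second difference amplifies noise a lot, so smoothing first usually helps.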

odd meteor Jan 7, 2025, 4:31 PM

#

bright rain Why is tensorflow library not working on python version 13?

Perhaps the maintainers of TensorFlow are yet to update the library to support the latest version of Python.

You might wanna downgrade to Python 3.12 if you really can't wait for them to make the update.

odd meteor Jan 7, 2025, 4:34 PM

#

cursive oriole Hello, I want to get into AI-ML, I can't afford any of the paid courses availabl...

Are you familiar with Sklearn yet? If no, I think you can start with

https://kaggle.com/learn
Andrew Ng's machine learning course on Coursera (it's free)

Learn Python, Data Viz, Pandas & More | Tutorials | Kaggle

Practical data skills you can apply immediately: that's what you'll learn in these no-cost courses. They're the fastest (and most fun) way to become a data scientist or improve your current skills.

plush kettle Jan 7, 2025, 4:44 PM

#

Do news classifiers classify news bias based on the title or the content? This is for NLP

serene scaffold Jan 7, 2025, 4:50 PM

#

plush kettle Do news classifier classify news bias based on title or content, this is for NLP

I would expect them to use both, but there's no requirement.

calm thicket Jan 7, 2025, 4:51 PM

#

why not both

weak oxide Jan 7, 2025, 4:58 PM

#

Some help with my learning of machine learning. I managed to master Prophet which I found surprisingly easy to implement but for some reason I cant implement a simple linear regression machine learning model

```py
import pandas as pd
import numpy as np
import seaborn as sns
import sklearn
from matplotlib import pyplot as plt
from sklearn import linear_model
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

print(plt.style.available)
plt.style.use('classic')

GIS = pd.read_csv('GIS Prices.csv')
GIS.head()
df1 = pd.DataFrame(data=GIS, columns=['y'])
df2 = pd.DataFrame(data=GIS, columns=['Close'])
df = pd.merge(df1, df2, left_index=True, right_index=True)
df
X = df['y']
y = df['Close']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
lr = linear_model.LinearRegression()
lr.fit(X_train, y_train)
```
I keep getting this error : ValueError: Expected a 2-dimensional container but got <class 'pandas.core.series.Series'> instead. Pass a DataFrame containing a single row (i.e. single sample) or a single column (i.e. single feature) instead.
How would I fix this problem?

#

I get the error after I press lr.fit(X_train,y_train)

#

I would send the csv but dont know how

serene scaffold Jan 7, 2025, 5:03 PM

#

weak oxide Some help with my learning of machine learning. I managed to master Prophet whic...

`GIS` is already a dataframe. it's unclear why you take it apart and then put it back together.
Is "Close" really the only feature you want the model to use?

weak oxide Jan 7, 2025, 5:04 PM

#

serene scaffold `GIS` is already a dataframe. it's unclear why you take it apart and then put it...

I admit it was my attempt to fix the error, pandas thinks its a series for some reason

serene scaffold Jan 7, 2025, 5:05 PM

#

weak oxide I admit it was my attempt to fix the error, pandas thinks its a series for some ...

it's not that Pandas "thinks" it's a series "for some reason"--it is a series.

weak oxide Jan 7, 2025, 5:05 PM

#

ok so Im unsure what to do then

serene scaffold Jan 7, 2025, 5:05 PM

#

try changing `X = df['y']` to `X = df[['y']]`

weak oxide Jan 7, 2025, 5:06 PM

#

ok

#

It did work nice

#

didnt realize the importance of two brackets instead of one

serene scaffold Jan 7, 2025, 5:08 PM

#

using two gives you a DataFrame instead of a Series

#

you can select more than one column that way. or you can select just one and get a DataFrame with only that column.
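
A quick way to see the difference, with a toy DataFrame standing in for the CSV (the values are made up). scikit-learn estimators expect `X` to be 2-D, which is why the double-bracket form works:

```python
import pandas as pd

df = pd.DataFrame({"y": [1.0, 2.0, 3.0], "Close": [10.0, 20.0, 30.0]})

as_series = df["y"]    # single brackets: a 1-D Series
as_frame = df[["y"]]   # list of column names: a 2-D DataFrame with one column

print(type(as_series).__name__, as_series.shape)  # Series (3,)
print(type(as_frame).__name__, as_frame.shape)    # DataFrame (3, 1)
```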

weak oxide Jan 7, 2025, 5:08 PM

#

somebody needs to change Stackflow

serene scaffold Jan 7, 2025, 5:09 PM

#

weak oxide somebody needs to change Stackflow

Don't deflect responsibility. It's okay if you didn't know or understand something, but it's not someone else's fault.

weak oxide Jan 7, 2025, 5:09 PM

#

serene scaffold Don't deflect responsibility. It's okay if you didn't know or understand somethi...

Fine I respect that

#

I admit I didnt know

#

Ill see if I can get the linear to work anyway

serene scaffold Jan 7, 2025, 5:10 PM

#

GIS = pd.read_csv('GIS Prices.csv')
train, test = train_test_split(GIS, test_size=0.2)
X_train = train[['y']]
y_train = train['Close']

this should be all that's required.

#

though it's pretty sus that the X data comes from the y column

weak oxide Jan 7, 2025, 5:12 PM

#

I was doing prophet earlier with the same data

#

so it was ds, y, and then close (wheat prices)

#

just for familiarity with the forecasting model

weak oxide Jan 7, 2025, 5:14 PM

#

serene scaffold though it's pretty sus that the X data comes from the y column

You used Prophet?

serene scaffold Jan 7, 2025, 5:14 PM

#

weak oxide You used Prophet?

No

weak oxide Jan 7, 2025, 5:14 PM

#

ok

#

anyways thanks

#

https://facebook.github.io/prophet/

Prophet

Prophet is a forecasting procedure implemented in R and Python. It is fast and provides completely automated forecasts that can be tuned by hand by data scientists and analysts.

#

thanks @serene scaffold

spiral willow Jan 7, 2025, 5:39 PM

#

Hey guys, i am new to this channel. Actually I wanted to build some data science projects with your help and support. Can you guys please help me out? Have a great day ahead

serene scaffold Jan 7, 2025, 5:47 PM

#

spiral willow Hey guys, i am new to this channel. Actually I wanted to build some data science...

You can ask specific questions in this channel as you have them, yes

spiral willow Jan 7, 2025, 5:49 PM

#

Okay sure, thanks

bleak dew Jan 7, 2025, 6:09 PM

#

Are there any ways to get pandas to warn when using set items? `df["new_column"] = ...` ? Just spent an hour tracking down an unexpected edit of a dataframe like this.

serene scaffold Jan 7, 2025, 6:13 PM

#

bleak dew Are there any ways to get pandas to warn when using set items? `df["new_column"]...

not that I know of, but you can avoid shared mutable state by passing copies of dataframes to functions, etc.
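
A small sketch of the copy-first pattern, with hypothetical column names. The function works on its own copy, so the caller's DataFrame never gains the new column:

```python
import pandas as pd

def add_total(df):
    """Return a new DataFrame with a `total` column; the input is left untouched."""
    out = df.copy()
    out["total"] = out["price"] * out["qty"]
    return out

orders = pd.DataFrame({"price": [2.0, 3.0], "qty": [5, 1]})
with_total = add_total(orders)

print(list(orders.columns))      # ['price', 'qty']
print(list(with_total.columns))  # ['price', 'qty', 'total']
```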

bleak dew Jan 7, 2025, 6:13 PM

#

serene scaffold not that I know of, but you can avoid shared mutable state by passing copies of ...

Yep, I'm aware, and usually do this. Yet this pesky thing lurked in the code 😄 Would be nice to get a visible warning though

ornate iris Jan 7, 2025, 6:38 PM

#

hello all, Im looking for other Data Scientists to review a tool Im trying to make, this one is mostly just about dealing with missing values and trying to automate work in our field where possible

https://paste.pythondiscord.com/FM6A

untold bloom Jan 7, 2025, 7:10 PM

#

```py
In [1]: import warnings

In [2]: def pd_setitem_that_warns(*args, original=pd.DataFrame.__setitem__):
   ...:     warnings.warn(f"setting some df with {args[1:]}")
   ...:     return original(*args)
   ...:

In [3]: pd.DataFrame.__setitem__ = pd_setitem_that_warns

In [4]: df["new"] = 7
/path/to/ipython:2: UserWarning: setting some df with ('new', 7)

In [5]: df
Out[5]:
  item  month  sales  new
0    A      1    100    7
1    A      2    200    7
2    B      3    300    7
3    A      2    100    7
4    D      1    300    7
5    Z      3    200    7
6    Z      4      0    7
7    B      2    500    7
```
you can wrap

serene scaffold Jan 7, 2025, 7:20 PM

#

untold bloom ```py In [1]: import warnings In [2]: def pd_setitem_that_warns(*args, original...

good idea--I figured that pd.DataFrame.__setitem__ was read-only

#

@bleak dew look at it ^

ornate iris Jan 7, 2025, 10:48 PM

#

I decided to start again with the earlier file I shared as the foundation to a new program. Here's generation 1! Interface built in PyQt5.

I'd love to hear your feedback!

distant quail Jan 7, 2025, 10:59 PM

#

whats the main difference between pytorch and tensorflow? Which should I be using more?

ornate iris Jan 7, 2025, 11:10 PM

#

distant quail whats the main difference between pytorch and tensorflow? Which should I be usin...

PyTorch and TensorFlow are both powerful frameworks for machine learning and deep learning, but they cater to slightly different preferences and workflows in the data science and ML community.

PyTorch is often favored for its dynamic computation graph, which allows you to define and modify the model graph on the fly. This makes it particularly intuitive and Pythonic, offering flexibility for experimentation, research, and debugging. PyTorch’s ecosystem integrates seamlessly with Python libraries like NumPy and pandas, which is useful for data preprocessing in data science workflows. Moreover, its focus on usability makes it a great choice for quickly prototyping models and working on smaller, more customized projects.

TensorFlow, on the other hand, is built with scalability and production in mind. TensorFlow 1.x was built around static computation graphs; TensorFlow 2 runs eagerly by default but can still compile functions into graphs (via `tf.function`), which can be optimized for efficient deployment on various platforms, including servers, mobile devices, and even the web (via TensorFlow.js). TensorFlow offers tools like TensorFlow Extended (TFX) for end-to-end ML workflows and TensorFlow Serving for deploying models in production. If your work requires seamless transition from development to production or emphasizes large-scale, distributed training, TensorFlow might be the better fit.

In a data science context, if your focus is on exploratory analysis, rapid experimentation, and custom model development, PyTorch might feel more natural. However, if you're working on a project that needs to scale to production or integrate with a robust deployment pipeline, TensorFlow’s ecosystem offers more comprehensive support. Many data scientists choose to familiarize themselves with both frameworks, as each has unique strengths that can be leveraged based on the project’s requirements.

#

In the Python ecosystem, several libraries form the foundation of data science and machine learning work. NumPy serves as the cornerstone for numerical computing, providing efficient array operations and mathematical functions, while Pandas offers powerful data manipulation and analysis through its DataFrame structure. For machine learning, scikit-learn remains the go-to library for traditional algorithms, preprocessing, and model evaluation, while TensorFlow and PyTorch dominate deep learning applications, with PyTorch gaining particular popularity in research settings. XGBoost and LightGBM are essential for gradient boosting implementations.

For visualization, Matplotlib provides the basic plotting capabilities that many other libraries build upon. Seaborn extends Matplotlib with statistical visualizations and a more modern aesthetic, while Plotly offers interactive plots that work well in web applications and notebooks. For specialized visualizations, Bokeh excels at creating interactive dashboards, and Altair provides a declarative approach to creating statistical charts. In the R ecosystem, ggplot2 remains the gold standard for static visualizations, while libraries like Shiny enable interactive web applications.

distant quail Jan 7, 2025, 11:11 PM

#

👍 thank you :)

serene scaffold Jan 7, 2025, 11:32 PM

#

ornate iris PyTorch and TensorFlow are both powerful frameworks for machine learning and dee...

This sounds like a copy/paste from ChatGPT

ornate iris Jan 7, 2025, 11:37 PM

#

serene scaffold This sounds like a copy/paste from ChatGPT

Mix of GPT and Claude, saves me time in typing while I'm cooking.

#

My apologies, I just reread the rules won't happen again

digital lark Jan 8, 2025, 12:08 AM

#

HELP, i have darts tft that went well in the tests but idk y it started giving out train_loss=nan.0, i even converted the covs to dataframes, dropped na and converted them back to series.

How can I stop getting training loss nan, what can cause this ?

serene scaffold Jan 8, 2025, 12:10 AM

#

digital lark HELP, i have darts tft that went well in the tests but idk y it started giving ...

This probably means that somewhere in the pipeline, you did something invalid, like divide by zero. Try following a tensor from start to finish and see where the nans appear

hexed plume Jan 8, 2025, 12:57 AM

#

Hey, less intelligent question here. I am VERY new to Python, don't know what I'm doing, and don't know how I managed to code a basic AI. I am using Stable Baselines 3 and gym for my AI and I'm wondering where I add additional inputs to the neural network. Please help me or give me good sources to learn how to do this 🙂

digital lark Jan 8, 2025, 1:09 AM

#

serene scaffold This probably means that somewhere in the pipeline, you did something invalid, l...

Thanks, turns out some of the data was too low, practically 0. A lot of my data is close to 0 and the scaling turns it into minesweeper out there.
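
A sketch of how this kind of NaN can appear and be caught, assuming the scaling divides by a spread that is zero or nearly zero. This is plain NumPy, not the darts scaler, and the helper names are made up:

```python
import numpy as np

def safe_standardize(x, eps=1e-8):
    """Standardize columns, flooring the std so constant columns don't divide by zero."""
    mean = x.mean(axis=0)
    std = x.std(axis=0)
    return (x - mean) / np.maximum(std, eps)

def report_non_finite(name, x):
    """Summarize how many NaN/inf values an array holds at some pipeline step."""
    bad = int((~np.isfinite(x)).sum())
    return f"{name}: {bad} non-finite values"

data = np.array([[1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])  # second column is constant

with np.errstate(invalid="ignore", divide="ignore"):
    naive = (data - data.mean(axis=0)) / data.std(axis=0)  # 0 / 0 gives NaN

safe = safe_standardize(data)

print(report_non_finite("naive", naive))
print(report_non_finite("safe", safe))
```

Calling a check like `report_non_finite` after each preprocessing step is one way to follow the data from start to finish and find where the NaNs first show up.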

echo lance Jan 8, 2025, 2:05 AM

#

I want to work in machine learning.. but recently all the work I get is related to gen AI and RAG projects only... I want to go more towards the ML side.. but I don't know why I automatically get diverted towards Gen AI.. is it good for the future? Or should I take some action?

weak oxide Jan 8, 2025, 2:17 AM

#

echo lance I want to do work in machine learning.. but recently every work i got is related...

What type of work?

shrewd spindle Jan 8, 2025, 2:21 AM

#

Does anyone know about data parallelism like splitting the dataset to finetune a model using multiple GPUs. I have a project but I am a complete beginner in this field. Can anyone help? DM please

ornate iris Jan 8, 2025, 2:21 AM

#

echo lance I want to do work in machine learning.. but recently every work i got is related...

With the new advent of Large Concept Models, Gen AI has a promising next few years. Recommendation systems building more and more into slightly better chat bots and management systems. Especially with more and more research into multimodal, better versions of Test time adaptation and surprise minimization. But IMO, core ML isn't going anywhere and it's only getting stronger as Gen AI depends on developments in core ML skills. Really you have to ask what you're most passionate about.

plush kettle Jan 8, 2025, 3:00 AM

#

serene scaffold I would expect them to use both, but there's no requirement.

Alright, thanks

serene scaffold Jan 8, 2025, 4:05 AM

#

echo lance I want to do work in machine learning.. but recently every work i got is related...

RAG is the hottest thing in AI right now. it's not going to stay that way forever.
And RAG can still involve ML. I recently fine-tuned an LLM to improve its performance for a RAG task (and it worked).

serene scaffold Jan 8, 2025, 4:05 AM

#

shrewd spindle Does anyone know about data parallelism like splitting the dataset to finetune a...

GPUs compute things in a way that's massively parallel. when you use more GPUs, it isn't to make things more parallel per se.

fickle shale Jan 8, 2025, 4:12 AM

#

rich moth The cool thing about analyzing the second derivative like this is that it pulls ...

Great share the dataset and code!

echo lance Jan 8, 2025, 6:24 AM

#

serene scaffold RAG is the hottest thing in AI right now. it's not going to stay that way foreve...

I havent done the finetuning yet .. and so it feels like I am just writing backend code, data processing and prompt engineering. And I feel that this is not data science. So i feel the experience I am gaining is not some high demand experience.. this work can be done by a good developer with some documentation. 😕

So am I overthinking it? Or should I get more skills in ML or data engineering to get into a more difficult domain... with less crowd.. I don't know if it is correct to think like this

echo lance Jan 8, 2025, 6:27 AM

#

weak oxide What type of work?

I get freelancing projects from upwork and also from a consultancy.

odd meteor Jan 8, 2025, 8:38 AM

#

shrewd spindle Does anyone know about data parallelism like splitting the dataset to finetune a...

Hi glitch, don't ask question to ask question (if you know what I mean)

Not sure anyone would wanna commit to send you a DM if they can simply answer your questions here. Well, except the person is very free and feeling nice.

Well, what exactly do you need help with in distributed training? Did you try something and you got an error message or ?

Meanwhile, data parallelism (DP) isn't a recommended strategy in practise lately because there's a much better option.

Distributed Data Parallelism (DDP) is now the most preferred (recommended) strategy.

There are several variants of ddp strategy you could explore.

- Regular one (DDP)
- ddp_spawn
- ddp_notebook (if you're training on jupyter lab instead of python script. I don't advise this though but, yeah, this strategy works if you're using Jupyter notebook/lab)
- DDP Sharded
- Bagua, DeepSpeed
- Fully Sharded data parallel (fsdp)
- etc.

I might be able to comment further if you provide a full picture on what exactly you need help with. Is it in fixing an error, or actually setting up your code for ddp, or??
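
To make the "splitting the dataset" part concrete, here is a pure-Python sketch of how a dataset gets divided between processes under data parallelism. It only mimics the strided split that distributed samplers use; real samplers also shuffle and even out shard sizes, and DDP additionally averages gradients across processes. The function name is hypothetical.

```python
def shard_indices(num_samples, world_size, rank):
    """Indices that one process (rank) would train on in a strided split."""
    return list(range(rank, num_samples, world_size))

world_size = 4  # e.g. 4 GPUs, one process each
shards = [shard_indices(10, world_size, rank) for rank in range(world_size)]

for rank, shard in enumerate(shards):
    print(rank, shard)
```

Every process holds a full copy of the model and sees a different slice of the data each epoch.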

fringe adder Jan 8, 2025, 12:18 PM

#

I'm trying to write code for gan, and something is wrong with my discriminator but I can't figure out what. Can anyone help?

willow sequoia Jan 8, 2025, 2:17 PM

#

guys do you think coding a neural network from scratch was a mistake ?😭

#

in python it was easy so i had to do it from "scratch" 😭

lapis sequoia Jan 8, 2025, 3:01 PM

#

thanks to python and openai api i made a tool to cheat in school

willow sequoia Jan 8, 2025, 3:05 PM

#

smart

mighty needle Jan 8, 2025, 3:26 PM

#

I need a little help

Basically i have what i think is a tuple with data, the first is a tensor with the image data, and the other is an array which tells the class from which it came from

How do I use it to train a model? If i use something like data[0], i remove the other array, which i want as the classes also work as its classification

rancid sorrel Jan 8, 2025, 3:38 PM

#

see data preprocessing

#

specifically image preprocessing

mighty needle Jan 8, 2025, 4:07 PM

#

the image is in the form of a numpy array

serene scaffold Jan 8, 2025, 4:09 PM

#

rancid sorrel specificaly image preprosing

you're telling them to go to #media-processing? this is the correct channel for their question.

rancid sorrel Jan 8, 2025, 4:10 PM

#

i mean that's a dedicated channel for it, but it's a part of data science and AI

#

there is just a specific subset of AI data preprocessing thats been around for a long time when it comes to Image/video

#

he can stay here if he wants, but... that was more the term for what he needs to google

#

https://www.kaggle.com/code/shahrish99/imagenet100-resnet18-pgdattacks

ImageNet100-Resnet18-PGDattacks

Explore and run machine learning code with Kaggle Notebooks | Using data from ImageNet100

#

there are a lot of existing libraries in tensorflow/pytorch so you don't have to actually mess with the image

hasty grail Jan 8, 2025, 4:13 PM

#

I tried locally using gpt4all

```
  File "C:\Users\hp\Desktop\runnyyy.py", line 2, in <module>
    model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf")
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "d:\jarvis\models\gpt4all\gpt4all-bindings\python\gpt4all\gpt4all.py", line 263, in __init__
    self.model = LLModel(self.config["path"], n_ctx, ngl, backend)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "d:\jarvis\models\gpt4all\gpt4all-bindings\python\gpt4all\_pyllmodel.py", line 291, in __init__
    raise RuntimeError(f"Unable to instantiate model: {errmsg}")
RuntimeError: Unable to instantiate model: Could not find any implementations for backend: kompute
```


this is the error I faced

```py
from gpt4all import GPT4All
model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf")
model_path="D:\JARVIS\Models\gpt4all"
response = model.chat("Hello! How are you?")
print(response)
```

this is the code I am trying to run

and I am pretty sure I followed all the instructions to locally download it

#

this is the picture of how my folder looks from inside of the gpt4all installation

rancid sorrel Jan 8, 2025, 4:24 PM

#

sounds like you got an import error

#

this is why i dont use python on windows outside WSL2

mighty needle Jan 8, 2025, 4:32 PM

#

rancid sorrel he can stay here if he wants, but... that was more the term for what he needs to...

i was asking something more along the lines of
"How do I train a model with a tuple"

I don't think data preprocessing will help, as i don't know what to do with the cleaned data

rancid sorrel Jan 8, 2025, 4:32 PM

#

ok what you actually trying to do?

#

like high level

mighty needle Jan 8, 2025, 4:35 PM

#

mighty needle I need a little help Basically i have what i think is a tuple with data, the fi...

i used an image from dataset function (keras) to get a dataset, and turned it into a batch

that batch is what i will use to train the model, but the batch is in the form of a tuple, first one being image data and the second one being directory data

i want to use both, as the directory data seems to contain the classification i want to use

#

the reason i have
batch_size=None
is because the batch size seems to turn the image data from a (256, 256, 3) to a (batch size, 256, 256, 3) which is a tensor

lapis sequoia Jan 8, 2025, 8:21 PM

#

How do apis work?

serene scaffold Jan 8, 2025, 8:22 PM

#

lapis sequoia How do apis work?

this isn't strictly a data science question; try #python-discussion

rich moth Jan 9, 2025, 6:22 AM

#

lapis sequoia thanks to python and openai api i made a tool to cheat in school

I envy the newer generations but also fear what you'll face. AI is as much your friend as your future adversary.

#

I feel like future generations will be able to cheat easier, but the threshold for acceptance will change. It's funny yet strange how AI and humans agree on one thing, the easiest/fastest path is through exploitation.

rich moth Jan 9, 2025, 6:56 AM

#

The real mind bender here is: did AI pick up exploitation from hard math or from studying human behavior in its training data? Maybe a synergy of both? What if looking for shortcuts is just a sign of intelligence?

small wedge Jan 9, 2025, 7:02 AM

#

rich moth The real mind bender here is: did AI pick up exploitation from hard math or f...

wdym by exploitation?

fickle shale Jan 9, 2025, 7:03 AM

#

Can anyone explain to me why, in multi-head attention, we make our data more complex? Couldn't we instead take the second-highest softmax probabilities (say we use 2 self-attention heads here)? Isn't that valid, and wouldn't it save computation?

small wedge Jan 9, 2025, 7:03 AM

#

if you mean finding the easiest path that's just a result of how they are trained, the easiest path often results in a disproportionately large reward in rl for example (hence being an exploit)

rich moth Jan 9, 2025, 7:04 AM

#

small wedge wdym by exploitation?

I mean that both AI and humans naturally look for the easiest way to get a task done.

#

Imagine this: both AI and humans are wired to find the path of least resistance (well, most of us). In AI training, especially RL, if there's a shortcut that gives a big reward, the AI will zero in on it.

#

I'm just seeing a pattern in how AI and humans fundamentally approach optimization problems, both in math and in training on human data.

#

It's no different than learning to play catch with your dad. I mean the training data.

small wedge Jan 9, 2025, 7:15 AM

#

I guess that leads the analogy to an interesting place if you consider the exploitation/exploration balance that we manage in rl as well

#

if an agent only exploits to the best of its knowledge, there might be better exploits it never finds; it must occasionally take perceived suboptimal actions to thoroughly search the space of possible exploits
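
The exploration/exploitation balance described above can be sketched with an epsilon-greedy agent on a multi-armed bandit. This is only an illustration: the arm means, the noise level, and `epsilon` are made-up values, and `epsilon_greedy_bandit` is a hypothetical helper, not something from the conversation.

```python
import random

def epsilon_greedy_bandit(true_means, epsilon=0.1, steps=5000, seed=0):
    """Estimate arm values while mostly exploiting the best-known arm."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # how often each arm was pulled
    estimates = [0.0] * n_arms   # running mean reward per arm

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: take a possibly suboptimal action on purpose.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: take the best action according to current knowledge.
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = rng.gauss(true_means[arm], 1.0)  # noisy reward (assumed)
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts

# Hypothetical arm means; the last arm is the best "exploit".
estimates, counts = epsilon_greedy_bandit([0.0, 0.5, 2.0])
best_arm = max(range(len(estimates)), key=lambda a: estimates[a])
print(best_arm, counts)
```

With `epsilon=0` the agent would keep pulling whichever arm looked best first and might never try the last arm; the small random fraction is what lets it find the better option.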

rich moth Jan 9, 2025, 7:18 AM

#

small wedge I guess that leads the analogy to an interesting place if you consider the explo...

It makes me think of how professional gamers play. They zero in on exploits.

rancid sorrel Jan 9, 2025, 7:56 AM

#

i broke MS print to PDF

#

printing 979 pages to pdf is apparently the limit

bright comet Jan 9, 2025, 11:20 AM

#

what are the best libraries that i can use to create an AI?

mighty needle Jan 9, 2025, 12:07 PM

#

bright comet what are the best libraries that i can use to create an AI?

scikitlearn or keras

#

can anyone help me out with this error?

pine hawk Jan 9, 2025, 12:27 PM

#

i wanna create a accident detection system how shall i do

fickle shale Jan 9, 2025, 1:00 PM

#

pine hawk i wanna create a accident detection system how shall i do

may be using yolo

unkempt apex Jan 9, 2025, 1:21 PM

#

mighty needle can anyone help me out with this error?

None values, means check your variable

#

what it contains

unkempt apex Jan 9, 2025, 1:23 PM

#

pine hawk i wanna create a accident detection system how shall i do

pickup any CCTV/other data for the same , ( can easily find on kaggle )
then train Yolov5 / v8 model on it

look into roboflow and ultralytics hub

mighty needle Jan 9, 2025, 1:42 PM

#

unkempt apex None values, means check your variable

i checked them and there are no none values whatsoever

#

i used .shape() to check

#

both inputs used are tensors/arrays

cursive oriole Jan 9, 2025, 1:46 PM

#

odd meteor Are you familiar with Sklearn yet? If no, I think you can start with 1. https:...

I actually tried accessing Andrew Ng's machine learning course but for some reason it asks me to pay to access the notebooks and actual code

brave sand Jan 9, 2025, 2:59 PM

#

does anyone know how to make sp.tocsc faster?

unkempt apex Jan 9, 2025, 5:23 PM

#

mighty needle both inputs used are tensors/arrays

always share with some code!

pine hawk Jan 9, 2025, 6:09 PM

#

fickle shale may be using yolo

isnt it just object detection

#

how about crash detection T^T

pine hawk Jan 9, 2025, 6:09 PM

#

unkempt apex pickup any CCTV/other data for the same , ( can easily find on kaggle ) then tra...

ohh idk shit i am so new to it

unkempt apex Jan 9, 2025, 6:11 PM

#

you can use any vision model, if you want to check performance!
but always go with recommendations

if you want to check more results
just search with "github" as suffix to your google search and you will find repos regarding to that

upbeat prism Jan 9, 2025, 6:19 PM

#

Hello

cunning pond Jan 9, 2025, 6:34 PM

#

Can someone help? I'm starting to learn Machine Learning and finding it really hard to understand linear regression why is it

serene scaffold Jan 9, 2025, 6:54 PM

#

cunning pond Can someone help? I'm starting to learn Machine Learning and finding it really hard...

hello, why is it what?

cunning pond Jan 9, 2025, 6:56 PM

#

i mean why is it that its hard for me to understand even the first model (is it normal for beginner?)

merry ridge Jan 9, 2025, 7:09 PM

#

It helps if you have some background in calculus, the derivation is a pretty typical early application of derivatives.
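
As a sketch of that derivation: for a line `y = slope * x + intercept`, setting the derivatives of the summed squared error with respect to `slope` and `intercept` to zero gives `slope = Sxy / Sxx` and `intercept = mean_y - slope * mean_x`. The function name `fit_line` and the sample data below are made up for illustration.

```python
def fit_line(xs, ys):
    """Closed-form simple linear regression (least squares)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sxy and Sxx come from setting d(error)/d(slope) = 0.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # Setting d(error)/d(intercept) = 0 forces the line through the means.
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up points that lie exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```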

cunning pond Jan 9, 2025, 7:11 PM

#

merry ridge It helps if you have some background in calculus, the derivation is a pretty typ...

i'm finding it a bit easier in another video

distant quail Jan 9, 2025, 8:30 PM

#

if i want to train an AI to learn to play a game using Q Learning, do I have to design the game as well or can i do it on an actual game like subway surfers?

serene scaffold Jan 9, 2025, 8:32 PM

#

distant quail if i want to train an AI to learn to play a game using Q Learning, do I have to ...

You need a way to expose the game state to the AI

distant quail Jan 9, 2025, 8:33 PM

#

serene scaffold You need a way to expose the game state to the AI

can i do it by letting it constantly take pictures of the laptop screen? Every x ms it takes a screenshot of the emulator

serene scaffold Jan 9, 2025, 8:34 PM

#

distant quail can i do it by letting it constantly take pictures of the laptop screen? Every x...

You would need to extract from that image the relevant information

#

But if all the information is knowable from the screen, then yes
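
A minimal sketch of what "exposing the game state" means for tabular Q-learning, under heavy assumptions: `Corridor` is a made-up toy game whose whole state is one integer, and its `reset`/`step` methods are simplified stand-ins, not the Gymnasium API. A real game read from screenshots would need that state extracted from the image first.

```python
import random

class Corridor:
    """Toy game: start at cell 0, reach the last cell to win."""
    def __init__(self, length=5):
        self.length = length
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos  # the exposed game state

    def step(self, action):
        # action 0 = move left, action 1 = move right
        move = 1 if action == 1 else -1
        self.pos = min(max(self.pos + move, 0), self.length - 1)
        done = self.pos == self.length - 1
        reward = 1.0 if done else -0.01  # small cost per step
        return self.pos, reward, done

def train(env, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(env.length)]  # q[state][action]
    for _ in range(episodes):
        state = env.reset()
        for _ in range(100):  # cap on steps per episode
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = env.step(action)
            # Q-learning update: bootstrap from the best next action.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q

q_table = train(Corridor())
# Greedy action per non-terminal cell (1 means "move right").
greedy_policy = [0 if left > right else 1 for left, right in q_table[:-1]]
print(greedy_policy)
```

The agent never sees a script. It only sees the state, picks an action, and gets a reward back.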

distant quail Jan 9, 2025, 8:35 PM

#

I see, and for things like people teaching their character in unity to walk, how would i go about doing something like that?

serene scaffold Jan 9, 2025, 8:36 PM

#

I've never done reinforcement learning. But the AI has to be able to output the same keystrokes, etc as a player

distant quail Jan 9, 2025, 8:38 PM

#

So it isn't feeding it a script, but rather letting it enter keystrokes and then telling it what was good and what wasn't? By that I mean you don't give the character a script, you just let the script run on your laptop as if you're playing

#

or am i misunderstanding the idea

iron basalt Jan 9, 2025, 8:42 PM

#

distant quail if i want to train an AI to learn to play a game using Q Learning, do I have to ...

https://gymnasium.farama.org/

Gymnasium Documentation

A standard API for reinforcement learning and a diverse set of reference environments (formerly Gym)

distant quail Jan 9, 2025, 8:42 PM

#

iron basalt https://gymnasium.farama.org/

🙏 thank you

iron basalt Jan 9, 2025, 8:42 PM

#

You can make custom environments, including hooking into real games.

#

There are extra environments online.

#

Start with these simple ones.

#

https://gymnasium.farama.org/introduction/basic_usage/

Gymnasium Documentation

A standard API for reinforcement learning and a diverse set of reference environments (formerly Gym)

odd meteor Jan 9, 2025, 9:32 PM

#

cunning pond Can someone help? I'm starting to learn Machine Learning and finding it really hard...

Are you using Top-down or Bottom-up approach to learn?

Maybe stop whatever strategy you're using currently for the time being and try the opposite.

Experiment to figure out the learning strategy that works best for you.

Sometimes it could be your learning approach, learning resources, information fatigue, deficiency in course prerequisite, stress etc

warm copper Jan 9, 2025, 10:00 PM

#

serene scaffold I've never done reinforcement learning. But the AI has to be able to output the ...

I am truly sorry that I just saw your DM from July 17

remote pewter Jan 9, 2025, 10:27 PM

#

Hello guys. I just recently started learning Python and Pandas. I understand that becoming a data scientist or a data engineer is way out of reach for a beginner like me. So I would like to first break into data analysis. What do you guys think I should study next?

serene scaffold Jan 9, 2025, 10:52 PM

#

remote pewter Hello guys. I just recently started learning Python and Pandas. I understand tha...

is there a reason that you can't get a CS degree?

#

because companies probably aren't going to hire entirely self-taught people to help them make expensive business decisions

remote pewter Jan 9, 2025, 10:57 PM

#

serene scaffold is there a reason that you can't get a CS degree?

I was actually always interested in the field of programming but for some reason I didn't directly hop into computer science right after high school. I chose to pursue business instead because I was a very shy and introverted person and I believed studying business would somehow make me more outspoken (which is dumb but it actually worked out for me)

serene scaffold Jan 9, 2025, 10:58 PM

#

remote pewter I was actually always interested in the field of programming but for some reason...

so your degree was specifically in business?

remote pewter Jan 9, 2025, 10:58 PM

#

Yes.

#

But i don't want to break into programming, rather into data science.

serene scaffold Jan 9, 2025, 10:58 PM

#

sorry, I meant that it was in the business school, but what was the specific degree?

remote pewter Jan 9, 2025, 10:58 PM

#

Business Administration

#

but it had a very diverse set of subjects ranging from statistics to business intelligence and data analysis

#

But right after i finished my undergrad, i started learning python. And i also took a 3 months data science with python class but they only taught me pandas.

serene scaffold Jan 9, 2025, 11:39 PM

#

@remote pewter I would ask in #career-advice how people with your background have gotten jobs. be as detailed as you can, so that no one has to interview you to start giving you useful information.

cunning pond Jan 10, 2025, 2:02 AM

#

odd meteor Are you using Top-down or Bottom-up approach to learn? Maybe stop whatever st...

i was using a udemy course, but afterwards i tried a youtube video in my language

remote pewter Jan 10, 2025, 2:23 AM

#

serene scaffold @remote pewter I would ask in #career-advice how people with your...

Thanks man 🙂 And sorry about it.

weary timber Jan 10, 2025, 9:27 AM

#

can anybody provide me a source to learn about residual networks

rich river Jan 10, 2025, 10:18 AM

#

# Import necessary libraries
import torch
from PIL import Image
import torchvision.transforms as transforms

# Read a PIL image
image = Image.open('iceland.jpg')

# Define a transform to convert PIL 
# image to a Torch tensor
transform = transforms.Compose([
    transforms.PILToTensor()
])

# transform = transforms.PILToTensor()
# Convert the PIL image to Torch tensor
img_tensor = transform(image)

# print the converted Torch tensor
print(img_tensor)

#

Can transforms.PILToTensor() and other transforms only be used inside transforms.Compose? In every code example I see online they are used inside transforms.Compose.
I tried to use tmp = transforms.ToImage(tmp) and it says: Transform.__init__() takes 1 positional argument but 2 were given

upbeat prism Jan 10, 2025, 11:34 AM

#

Hello

A model basically “encodes” the learned knowledge, resp. its features, as weights. The "feature representation” of the knowledge is smaller than the knowledge itself. So I think there’s a lower bound to the size of the knowledge which is given by the “feature size” and an upper bound which is given by the “knowledge size”. Goal is to be on the lower bound so our model doesn’t just memorise.

The question now is: How do we figure out this size?

Note that I work with tiny networks: I solve XOR, MNIST. So we have a very small number of params.

My initial thought is: I just run the experiment for varying model size and varying data size. The idea: if the model's accuracy doesn't change for higher amounts of data, we actually captured the features; if accuracy is proportional to it, then we memorize.

But there must be some more involved metric no?

odd meteor Jan 10, 2025, 12:20 PM

#

rich river Can `transforms.PILToTensor()` and other transforms only be used inside `transf...

No, that's not always the case. It can be applied independently.

There are many predefined transformations in the torchvision.transforms package, and you can also combine / chain many of them together in a single unit by using the Compose transform. Check out the PyTorch documentation for details.
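
A torchvision-free sketch of the pattern, assuming transforms are objects that are configured in the constructor and then called on the data. `Scale`, `Shift`, and this `Compose` are hypothetical stand-ins written for illustration, not the torchvision classes.

```python
class Scale:
    """Hypothetical transform: multiply every value by a factor."""
    def __init__(self, factor):
        self.factor = factor  # configuration goes to the constructor

    def __call__(self, values):
        return [v * self.factor for v in values]  # data goes to the call

class Shift:
    """Hypothetical transform: add an offset to every value."""
    def __init__(self, offset):
        self.offset = offset

    def __call__(self, values):
        return [v + self.offset for v in values]

class Compose:
    """Chain transforms, mirroring the idea behind transforms.Compose."""
    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, values):
        for transform in self.transforms:
            values = transform(values)
        return values

data = [1, 2, 3]
standalone = Scale(2)(data)                    # a transform used on its own
chained = Compose([Scale(2), Shift(1)])(data)  # the same transform in a chain
print(standalone, chained)
```

The error quoted in the question is consistent with passing the image to the constructor. If that is the cause, creating the transform first and then calling it, as in `transforms.ToImage()(tmp)`, would be the likely fix.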

torpid latch Jan 10, 2025, 2:33 PM

#

hi folks, i want to learn to make ai. do you have any recommendations of book/video tutorials?

devout cloak Jan 10, 2025, 9:26 PM

#

This video tutorial from 3Blue1Brown is a good tutorial/explanation of Artificial Neural Networks:

https://www.youtube.com/watch?v=aircAruvnKk&list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi

YouTube

3Blue1Brown

But what is a neural network? | Deep learning chapter 1

What are the neurons, why are there layers, and what is the math underlying it?
Help fund future projects: https://www.patreon.com/3blue1brown
Written/interactive form of this series: https://www.3blue1brown.com/topics/neural-networks

Additional funding for this project was provided by Amplify Partners

Typo correction: At 14 minutes 45 seconds...


iron basalt Jan 10, 2025, 10:18 PM

#

torpid latch hi folks, i want to learn to make ai. do you have any recommendations of book/vi...

What do you think "AI" is? And what do you want to make?

unkempt wigeon Jan 10, 2025, 10:51 PM

#

Does torch have an audio transform?

serene scaffold Jan 10, 2025, 11:55 PM

#

unkempt wigeon Does torch have an audio transform?

https://pytorch.org/audio/main/transforms.html

torpid latch Jan 11, 2025, 3:14 AM

#

iron basalt What do you think "AI" is? And what do you want to make?

i want to create a RAG

serene scaffold Jan 11, 2025, 3:41 AM

#

torpid latch i want to create a RAG

RAG for what?

torpid latch Jan 11, 2025, 3:53 AM

#

serene scaffold RAG for what?

i have no idea HAHA but i want to create a RAG hehe

serene scaffold Jan 11, 2025, 3:54 AM

#

torpid latch i have no idea HAHA but i want to create a RAG hehe

oh okay LOL. can you explain how RAG works, in your own words, so I know the current extent of your understanding? this isn't a test hehe

torpid latch Jan 11, 2025, 3:59 AM

#

serene scaffold oh okay LOL. can you explain how RAG works, in your own words, so I know the cur...

RAG is where you ingest data then use a model to retrieve the correct data and return it to user

serene scaffold Jan 11, 2025, 3:59 AM

#

torpid latch RAG is where you ingest data then use a model to retrieve the correct data and r...

Sorry, but that isn't really a good definition of RAG.

#

do you know what RAG stands for?

torpid latch Jan 11, 2025, 4:00 AM

#

retrieval augmented generation?

#

im new to AI

serene scaffold Jan 11, 2025, 4:35 AM

#

@torpid latch the steps are basically this:

- the user asks a question
- the system looks up information that's relevant to the question from a knowledge store; for example, it might pick out key words from the question and look up their Wikipedia articles
- the system puts the user's question and the retrieved information into a prompt for the LLM
- the LLM produces an answer to the original question
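
The steps above can be sketched end to end with a toy knowledge store. Everything here is an assumption for illustration: `KNOWLEDGE_STORE` is a made-up dictionary, retrieval is plain keyword lookup, and `fake_llm` is a placeholder where a real LLM call would go.

```python
import re

# Made-up knowledge store: key word -> passage.
KNOWLEDGE_STORE = {
    "python": "Python is a programming language created by Guido van Rossum.",
    "pandas": "pandas is a Python library for tabular data analysis.",
    "pytorch": "PyTorch is a deep learning framework.",
}

def retrieve(question, store):
    """Step 2: look up entries whose key appears as a word in the question."""
    words = set(re.findall(r"[a-z0-9]+", question.lower()))
    return [text for key, text in store.items() if key in words]

def build_prompt(question, passages):
    """Step 3: put the question and the retrieved text into one prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Use only this context:\n{context}\n\nQuestion: {question}\nAnswer:"

def fake_llm(prompt):
    """Step 4 placeholder: a real system would send the prompt to an LLM."""
    return f"(model output for a prompt of {len(prompt)} characters)"

question = "What is pandas used for?"          # step 1
passages = retrieve(question, KNOWLEDGE_STORE)
prompt = build_prompt(question, passages)
answer = fake_llm(prompt)
print(prompt)
print(answer)
```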

torpid latch Jan 11, 2025, 4:43 AM

#

serene scaffold @torpid latch the steps are basically this: - the user asks a question -...

do you have any recommendations of books/videos for starting learning AI?

opaque condor Jan 11, 2025, 5:43 AM

#

torpid latch do you have any recommendations of books/videos for starting learning AI?

https://youtu.be/Z_ikDlimN6A?si=dPPPBkIqbE9Ed-XL

YouTube

Daniel Bourke

Learn PyTorch for deep learning in a day. Literally.

Welcome to the most beginner-friendly place on the internet to learn PyTorch for deep learning.

All code on GitHub - https://dbourke.link/pt-github
Ask a question - https://dbourke.link/pt-github-discussions
Read the course materials online - https://learnpytorch.io
Sign up for the full course on Zero to Mastery (20+ hours more video) - https:/...


wide bane Jan 11, 2025, 11:34 AM

#

can someone help with the deeplabv3 architecture, I want to learn about it and also if it is possible that i can use segmentation_model in training model?

upbeat prism Jan 11, 2025, 12:02 PM

#

I do basic classification experiments on XOR, MNIST and Fashion-MNIST. Does anyone know a good reference for good model sizes for those problems?

white socket Jan 11, 2025, 12:21 PM

#

hello im trying to train a yolo model using my gpu so i did this

from ultralytics import YOLO
model = YOLO("yolov9m.pt")
model.to('cuda')

results = model.train(data="config.yaml", epochs=3)

When training the model i get this error

NotImplementedError: Could not run 'torchvision::nms' with arguments from the 'CUDA' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process (if using custom build). If you are a Facebook employee using PyTorch on mobile, please visit https://fburl.com/ptmfixes for possible resolutions. 'torchvision::nms' is only available for these backends: [CPU, Meta, QuantizedCPU, BackendSelect, Python, FuncTorchDynamicLayerBackMode, Functionalize, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradMPS, AutogradXPU, AutogradHPU, AutogradLazy, AutogradMeta, Tracer, AutocastCPU, AutocastXPU, AutocastCUDA, FuncTorchBatched, BatchedNestedTensor, FuncTorchVmapMode, Batched, VmapMode, FuncTorchGradWrapper, PythonTLSSnapshot, FuncTorchDynamicLayerFrontMode, PreDispatch, PythonDispatcher].

#

i dont understand what it means 😅

upbeat prism Jan 11, 2025, 3:30 PM

#

white socket i dont understand what it means 😅

YOLO, whatever it is, probably uses torchvision, in particular torchvision.ops.nms. If you use that to do computation, you run it on e.g. CPU or, in your case, GPU. Each case needs a backend: code that actually runs it on the CPU, GPU etc.

GPU = cuda, but there is no cuda backend implemented for it in your install.

In short: with this install you can't do it on GPU

#

there might be something else going on tho

#

from the docs. So I guess you dont have to do it manually.

#

https://docs.ultralytics.com/modes/train/#usage-examples

Train

Learn how to efficiently train object detection models using YOLO11 with comprehensive instructions on settings, augmentation, and hardware utilization.

teal reef Jan 11, 2025, 4:33 PM

#

yo guys

#

i wanted to ask how you would fix data where users typed in their college name, so you now have multiple variations of every college in the data

sinful relic Jan 11, 2025, 5:34 PM

#

did anyone try creating a text translation model, like from english to some unique language!
Let's say we want to create a model that translate English to LangX.
Any ideas?

serene scaffold Jan 11, 2025, 5:51 PM

#

sinful relic did anyone try creating a text translation model, like from english to some uniq...

look into neural machine translation (NMT)

rich moth Jan 11, 2025, 10:36 PM

#

Heres the complexity symphony i created for time series

rich moth Jan 11, 2025, 10:52 PM

#

It shows the measurements of complexity over time in the time series in a way that's not just about numbers, but about the evolution of the data.

#

Like it's alive or something lol. But it makes sense: time series data is always concurrent, with a history of the past. Like time, it's always here.

rich moth Jan 12, 2025, 12:18 AM

#

pearl imp Jan 12, 2025, 1:15 AM

#

Hey I'm working on a chatbot that uses information from multiple datasets and I was wondering how I should handle its training. Specifically, how would you handle preprocessing, merging, and ensuring consistency across datasets, as well as choosing the right model and framework for fine-tuning?

serene scaffold Jan 12, 2025, 1:18 AM

#

pearl imp Hey I'm working on a chatbot that uses information from multiple datasets and I ...

What is the chat bot intended to do? And what are the datasets?

pearl imp Jan 12, 2025, 1:22 AM

#

serene scaffold What is the chat bot intended to do? And what are the datasets?

It's a school project on election info in my country. I'm planning on using gpt 4o to chat

serene scaffold Jan 12, 2025, 1:22 AM

#

pearl imp It's a school project on election info in my country. I'm planning on using gpt ...

And you're sure you want to fine tune it? Because if it needs to answer questions about local elections, you should use RAG

#

Sorry, you said country. I thought you said county.

But what I said still applies.

pearl imp Jan 12, 2025, 1:26 AM

#

What's wrong with fine tuning and what is RAG? The data is stuff like the last election result by district, previous polling stations, list of candidates by party, etc.

serene scaffold Jan 12, 2025, 1:28 AM

#

pearl imp What's wrong with fine tuning and what is RAG? The data is stuff like the last e...

If you try to fine tune it to remember specific facts, it will not remember them. Especially not if they're numeric

hexed plume Jan 12, 2025, 1:28 AM

#

Hey does anyone know what's wrong with

"model = PPO("MlpPolicy", env, verbose=55)"

It's giving me an error for not being a list?

serene scaffold Jan 12, 2025, 1:29 AM

#

hexed plume Hey does anyone know what's wrong with "model = PPO("MlpPolicy", env, verbose=5...

Remember to always always show the whole error message

#

I'm in a car now so I'll elaborate when I get home hopefully

hexed plume Jan 12, 2025, 1:32 AM

#

Traceback (most recent call last):
File "D:\Python.AI.Games\AI.8.venv\Scripts\Main.py", line 354, in <module>
train_model()
File "D:\Python.AI.Games\AI.8.venv\Scripts\Main.py", line 321, in train_model
model.learn(total_timesteps=30000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000)
File "D:\Python.AI.Games\AI.8.venv\Lib\site-packages\stable_baselines3\ppo\ppo.py", line 311, in learn
return super().learn(
^^^^^^^^^^^^^^
File "D:\Python.AI.Games\AI.8.venv\Lib\site-packages\stable_baselines3\common\on_policy_algorithm.py", line 323, in learn
continue_training = self.collect_rollouts(self.env, callback, self.rollout_buffer, n_rollout_steps=self.n_steps)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Python.AI.Games\AI.8.venv\Lib\site-packages\stable_baselines3\common\vec_env\base_vec_env.py", line 207, in step
return self.step_wait()
^^^^^^^^^^^^^^^^
File "D:\Python.AI.Games\AI.8.venv\Lib\site-packages\stable_baselines3\common\vec_env\dummy_vec_env.py", line 59, in step_wait
obs, self.buf_rews[env_idx], terminated, truncated, self.buf_infos[env_idx] = self.envs[env_idx].step(
^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Python.AI.Games\AI.8.venv\Lib\site-packages\stable_baselines3\common\monitor.py", line 94, in step
observation, reward, terminated, truncated, info = self.env.step(action)
^^^^^^^^^^^^^^^^^^^^^
File "D:\Python.AI.Games\AI.8.venv\Lib\site-packages\shimmy\openai_gym_compatibility.py", line 250, in step
obs, reward, done, info = self.gym_env.step(action)
^^^^^^^^^^^^^^^^^^^^^^^
TypeError: cannot unpack non-iterable NoneType object

#

code:

#

https://paste.pythondiscord.com/SA4Q
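
The last frame of the traceback shows `self.gym_env.step(action)` being unpacked into four values and failing because it returned `None`. One common way to get that, sketched below, is a `step` method with a code path that never reaches a `return`. The pasted code is not visible here, so this is an assumption about the cause, and `BrokenEnv` / `FixedEnv` are made-up classes.

```python
class BrokenEnv:
    def step(self, action):
        observation, reward, done, info = 0, 0.0, False, {}
        # Bug: no return statement, so Python returns None.

class FixedEnv:
    def step(self, action):
        observation, reward, done, info = 0, 0.0, False, {}
        return observation, reward, done, info  # old Gym-style 4-tuple

def unpack_step(env):
    """Unpack the step result the same way the wrapper in the traceback does."""
    try:
        obs, reward, done, info = env.step(0)
        return "ok"
    except TypeError as exc:
        return str(exc)

broken_result = unpack_step(BrokenEnv())
fixed_result = unpack_step(FixedEnv())
print(broken_result)
print(fixed_result)
```

Checking that every branch of the environment's `step` ends in a `return` with all four values is a reasonable first thing to look at.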

pearl imp Jan 12, 2025, 1:40 AM

#

serene scaffold If you try to fine tune it to remember specific facts, it will not remember them...

Are there any resources you can recommend I take a look at? This is my first time doing a project like this

serene scaffold Jan 12, 2025, 1:51 AM

#

pearl imp Are there any resources you can recommend I take a look at? This is my first tim...

I don't have a specific one in mind; just look into retrieval augmented generation

white socket Jan 12, 2025, 1:55 AM

#

upbeat prism https://docs.ultralytics.com/modes/train/#usage-examples

thank you so much!!!

marsh hedge Jan 12, 2025, 2:49 AM

#

greetings everyone, i want to start learning ai, can anyone suggest me some of the good resources available online? Thankyou

rich moth Jan 12, 2025, 3:14 AM

#

I'm still fine-tuning the visuals. I think I'll split them into two gifs

iron basalt Jan 12, 2025, 3:26 AM

#

marsh hedge greetings everyone, i want to start learning ai, can anyone suggest me some of t...

What do you think "AI" is? And what do you want to make?

rich moth Jan 12, 2025, 3:28 AM

#

i got some work todo still with the visuals (obviously) but

serene scaffold Jan 12, 2025, 3:29 AM

#

iron basalt What do you think "AI" is? And what do you want to make?

you had me worried that I accidentally deleted every message since the last time you said this

marsh hedge Jan 12, 2025, 4:36 AM

#

iron basalt What do you think "AI" is? And what do you want to make?

i think 'AI' is a better version of the previous-gen supercomputers, one that can do some tasks which require human intelligence. It still cannot perform tasks like humans, but in the near future it will. what do i want to make? my aim is to predict packet loss in networking and automate the manual part, because it sucks to get high latency in fps games and i am tired of it. i am in my last year before graduation and i think i have the basics down, but i want to know what is happening in the global market. i also have some research papers to my name, already published, which i can show you personally.

marsh hedge Jan 12, 2025, 4:37 AM

#

serene scaffold you had me worried that I accidentally deleted every message since the last time...

do share your ideas if you want to.....i am interested in reading your ideas

fickle shale Jan 12, 2025, 5:16 AM

#

rich moth i got some work todo still with the visuals (obviously) but

Damn, why does it look so good!

past meteor Jan 12, 2025, 8:55 AM

#

pearl imp Hey I'm working on a chatbot that uses information from multiple datasets and I ...

a rag would work best for something like this, depending on the questions asked

#

Basically, you embed your documents (turn them into a dense vector) and when a question gets asked you embed that question, find the most similar documents and pass them on to the bot when it's answering
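
A toy sketch of that embed-and-compare step. Real systems use a learned embedding model that produces dense vectors. Here `embed` is a simple word-count stand-in so the example runs with the standard library only, and the documents are invented to resemble the election data described earlier.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding: word counts (a sparse stand-in for a dense vector)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_k(question, documents, k=1):
    """Embed the question and return the k most similar documents."""
    q_vec = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q_vec, embed(d)), reverse=True)
    return ranked[:k]

# Made-up documents for illustration.
documents = [
    "District 4 polling stations open at 8 am",
    "Candidate list by party for the national election",
    "Previous election results by district",
]
best = top_k("Which candidates are on the party list?", documents)
print(best)
```

The returned documents are what would be passed to the bot together with the question.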

dusty jetty Jan 12, 2025, 10:09 AM

#

Hiee, could anyone suggest how I could get started with Elasticsearch? I want to use it from python and I don't know anything about Elasticsearch

supple grove Jan 12, 2025, 2:58 PM

#

if i have tabular data, is there a way i can gauge whether a random forest, xgb or NN will be the best performer without training all of them, or do I need to test all the models? Is there a way to gauge the threshold of data samples at which a NN becomes viable?

serene scaffold Jan 12, 2025, 3:00 PM

#

supple grove if i have tabular data, is there a way i can gauge whether a random forest, xgb or NN ...

You'd have to train them all. And there are infinitely many possible neural networks

#

I heard xgboost is the best for tabular data

supple grove Jan 12, 2025, 3:00 PM

#

serene scaffold I heard xgboost is the best for tabular data

then why would i train all types to check?

serene scaffold Jan 12, 2025, 3:00 PM

#

supple grove then why would i train all types to check?

You shouldn't

supple grove Jan 12, 2025, 3:01 PM

#

Didnt u just say "youd have to train them all"

serene scaffold Jan 12, 2025, 3:02 PM

#

supple grove Didnt u just say "youd have to train them all"

If you want to know with certainty which of three options is best, you have to try all three. I'm not saying you should want to, that's just how it would be.

supple grove Jan 12, 2025, 3:03 PM

#

I see the point about infinite node settings in a NN, but how does one go about it without missing the best setting? Training infinitely many is not possible

serene scaffold Jan 12, 2025, 3:04 PM

#

You decide how many nodes the network will have. The network can't add or remove nodes from itself

supple grove Jan 12, 2025, 3:04 PM

#

serene scaffold If you want to know with certainty which of three options is best, you have to t...

on the other hand you do want to have the best performing model, is hearing that xgb usually performs best sufficient (I heard the same)

serene scaffold Jan 12, 2025, 3:04 PM

#

supple grove on the other hand you do want to have the best performing model, is hearing that...

Sufficient for what? Is there someone who's going to penalize you if you don't try at least n options?

supple grove Jan 12, 2025, 3:04 PM

#

serene scaffold You decide how many nodes the network will have. The network can't add or remove...

i was thinking experts don't set nodes and layers blindly. Can one estimate what number of nodes and layers to try?

supple grove Jan 12, 2025, 3:05 PM

#

serene scaffold Sufficient for what? Is there someone who's going to penalize you if you don't t...

having a less accurate model is somewhat of a penalty in a game of competition.

serene scaffold Jan 12, 2025, 3:07 PM

#

So you're submitting your model to a competition?

supple grove Jan 12, 2025, 3:07 PM

#

No.

#

You dont want to be outperformed by competitors

#

Do you find it odd that I want to optimize the numbers?

serene scaffold Jan 12, 2025, 3:13 PM

#

I think you've fallen into the trap of premature optimization

white socket Jan 12, 2025, 3:18 PM

#

upbeat prism there might be somethign else going on tho

I have solved it!! I just swapped my game ready drivers with studio ready drivers and now everything works T-T

supple grove Jan 12, 2025, 3:35 PM

#

serene scaffold I think you've fallen into the trap of premature optimization

Not really.

#

E.g. one of my questions was whether anybody had suggestions on how to gauge the number of data samples needed to outperform xgb with a NN. Giving that a fair thought, instead of training arbitrarily without contemplating the possibilities and the better grounds for the end result, is hardly premature optimization.

rich moth Jan 12, 2025, 3:41 PM

#

dusty jetty Hiee is there anyone who could suggest me how could i start up with elasticSearc...

Download it from GitHub and install it, then edit the config/elasticsearch.yml file to your liking. Start the server. You can add an initialization block to your python script to connect to it. You can also index data right into it, like datasets, and embed it. It's really easy to make a custom knowledge base for agents to access. Elasticsearch and Haystack work well together, maybe check that out.

dusty jetty Jan 12, 2025, 3:49 PM

#

Is there any resource because I have been struggling for the entire day so it would help me a lot

upbeat prism Jan 12, 2025, 4:43 PM

#

I have a torch dataset that has a plot function. I use subsets to split. Can I somehow make functions from the dataset available in my subset?

serene scaffold Jan 12, 2025, 4:52 PM

#

upbeat prism I have a torch dataset that has a plot function. I use subsets to split. Can I ...

are subsets of the dataset not also instances of the same class?

rich moth Jan 12, 2025, 5:15 PM

#

dusty jetty Is there any resource because I have been struggling for the entire day so it wo...

Maybe try this? Just search online. https://github.com/ImadSaddik/ElasticSearch_Python_Tutorial?tab=readme-ov-file

GitHub

GitHub - ImadSaddik/ElasticSearch_Python_Tutorial: This repository ...

This repository is part of a course on ElasticSearch in Python. It includes notebooks that demonstrate its usage, along with a YouTube series to guide you through the material. - ImadSaddik/Elastic...

upbeat prism Jan 12, 2025, 5:22 PM

#

serene scaffold are subsets of the dataset not also instances of the same class?

well the "interface" doesn't include self-defined functions, so to speak.

sterile belfry Jan 12, 2025, 5:29 PM

#

im trying to build a forecasting prediction model for sales but having some issues. would any one be able to give me a hand if I provide the code and data set

serene scaffold Jan 12, 2025, 5:31 PM

#

sterile belfry im trying to build a forecasting prediction model for sales but having some issu...

don't wait for a commitment before you provide the code, etc.
people want to know what they're getting into when they offer to help.

serene scaffold Jan 12, 2025, 5:32 PM

#

upbeat prism well the "interface" doesn't include self-defined functions, so to speak.

can you explain? if you have a class that represents datasets, you should be able to use the same class to represent subsets of that same dataset.

rich moth Jan 12, 2025, 10:17 PM

#

Wanted to share my current version with you guys hope you like it. I could use some feedback too, thanks

#

So I built this adaptive transformer that uses a new method to measure data complexity. It's supposed to let the model automatically adjust its size and complexity to better fit the data it's processing, making it much more efficient and able to capture subtle details.

#

It also reduces the chances of the model overfitting by a significant amount, if not completely mitigating the issue

rich moth Jan 12, 2025, 11:21 PM

#

I feel like AI and all of ML it encompasses is just pattern optimizers. I mean at a fundamental level.

serene scaffold Jan 12, 2025, 11:30 PM

#

rich moth I feel like AI and all of ML it encompasses is just pattern optimizers. I mean ...

it's about optimizing functions that recognize patterns, yes.

#

that's the whole game.

rich moth Jan 12, 2025, 11:39 PM

#

serene scaffold it's about optimizing functions that recognize patterns, yes.

Right, but it goes further than just recognition. I'm talking about a model that actively adapts to specific patterns it encounters.

serene scaffold Jan 12, 2025, 11:40 PM

#

rich moth Right, but it goes further than just recognition. I'm talking about a model that...

that's what I mean by "recognition". the model detects/recognizes/adapts itself to the pattern.

rich moth Jan 12, 2025, 11:40 PM

#

serene scaffold that's what I mean by "recognition". the model detects/recognizes/adapts itself ...

Oh, ok

iron basalt Jan 12, 2025, 11:49 PM

#

rich moth I feel like AI and all of ML it encompasses is just pattern optimizers. I mean ...

AI is about making rational agents. At a fundamental level, approximating AIXI. ML is about memoization and generalization. It can also be considered a programming paradigm which is driven by massive data and statistics rather than explicitly programmed by hand.

iron basalt Jan 12, 2025, 11:55 PM

#

iron basalt AI is about making rational agents. At a fundamental level, approximating AIXI. ...

The distinction between recognizing a pattern, and learning to recognize a pattern is important here for ML. One is where you just describe the pattern in full in code, the other is "here are a bunch of examples, organize them internally such that you can recognize them and also ones not shown here hopefully."

rich moth Jan 12, 2025, 11:56 PM

#

iron basalt AI is about making rational agents. At a fundamental level, approximating AIXI. ...

Hmm, thats interesting. I wonder if you can also view adaptive complexity as move towards a more generalized or flexible type of intelligence.

iron basalt Jan 13, 2025, 12:05 AM

#

serene scaffold you had me worried that I accidentally deleted every message since the last time...

This will be my goto response to these kinds of questions.

iron basalt Jan 13, 2025, 12:10 AM

#

iron basalt AI is about making rational agents. At a fundamental level, approximating AIXI. ...

You may see some things under both AI and ML, such as reinforcement learning, since the terms are not super clear as they are typically used. But if you go by the more original meanings, it's what I wrote.

#

ML is also slowly becoming an everything term in the same way AI is being used.

#

Some use ML in place of AI just so they can avoid a bunch of philosophical discussion, as everyone seems to have an opinion on it.

#

Or to de-hype what they are working on.

serene scaffold Jan 13, 2025, 12:36 AM

#

iron basalt This will be my goto response to these kinds of questions.

it's a good response-question

rich moth Jan 13, 2025, 2:44 AM

#

So I make an autotransformer with the DeepChem data. Playing around with the ideas I made.

vast yacht Jan 13, 2025, 5:10 AM

#

hi guys, i'm currently working on a data engineering project that involves incremental load. i'm trying to replicate (not entirely) dbt's incremental model but dont know the backend logic they use to handle old/unchanged rows without (explicitly) running a full load again, maybe they use a log or cache or metadata or sth idk. if you're familiar with dbt, i need your help. let's dm ❤️ thankssssssssss

fervent canopy Jan 13, 2025, 8:08 AM

#

https://github.com/SanshruthR/mock-hls-server Fake a live HLS stream from a MP4 source for testing purposes.

GitHub

GitHub - SanshruthR/mock-hls-server: Fake a live HLS stream from a ...

Fake a live HLS stream from a MP4 source for testing purposes. Sample mp4 URL: https://videos.pexels.com/video-files/6274203/6274203-sd_426_226_30fps.mp4 Online HLS Player: https://livepush.io...

amber sequoia Jan 13, 2025, 11:32 AM

#

does anybody have experience maybe with outputting matplotlib to the browser?

supple grove Jan 13, 2025, 12:03 PM

#

The first section is an 80-iteration cross validation. The second part is the predict_proba() results. Does this smell like overfitting, or how can I check it? Noob at play

mild dirge Jan 13, 2025, 12:13 PM

#

supple grove The first section is a 80iteration cross validation. Second part is predict_prob...

Yes, the performance on training is much better than on the testing set, so there is likely overfitting at play here

supple grove Jan 13, 2025, 12:47 PM

#

When you reduce overfitting, is the desired outcome that the test and train scores move towards a mean of the two, in other words that the test score rises while the training score drops?

upbeat prism Jan 13, 2025, 2:38 PM

#

serene scaffold can you explain? if you have a class that represents datasets, you should be abl...

I think the Subset just basically stores a bunch of indices of your dataset and implements the usual magic methods like __getitem__ etc. So if you draw an item from it, it uses the actual Dataset below it.

There is no plot function on Subset. Subset doesn't know about it, so if you call plot, it simply doesn't know wtf you want from it.
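
One way around this, as a minimal sketch rather than a built-in PyTorch feature: subclass `Subset` and forward unknown attributes to the wrapped dataset. `CurveDataset`, `describe`, and `DelegatingSubset` are hypothetical names made up for the example; `describe` stands in for the custom plot function.

```python
import torch
from torch.utils.data import Dataset, Subset


class CurveDataset(Dataset):
    """Toy dataset with one custom helper, standing in for a plot function."""

    def __init__(self, n: int):
        self.x = torch.arange(n, dtype=torch.float32)

    def __len__(self) -> int:
        return len(self.x)

    def __getitem__(self, idx: int) -> torch.Tensor:
        return self.x[idx]

    def describe(self, idx: int) -> str:
        # Hypothetical stand-in for a custom plot(idx) method.
        return f"sample {idx}: value {self.x[idx].item():.1f}"


class DelegatingSubset(Subset):
    """Subset that forwards unknown attributes to the wrapped dataset."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails. Guard the attributes
        # set in Subset.__init__ to avoid infinite recursion when copying.
        if name in ("dataset", "indices"):
            raise AttributeError(name)
        return getattr(self.dataset, name)


full = CurveDataset(10)
train = DelegatingSubset(full, indices=[7, 2, 5])

# Indexing goes through the subset's indices; forwarded methods run on the
# parent dataset and therefore expect indices of the *full* dataset.
first_item = train[0]
text = train.describe(train.indices[0])
print(first_item, text)
```

Note that a forwarded method sees the whole dataset, so anything that should be restricted to the split has to be given `train.indices` explicitly.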

mild dirge Jan 13, 2025, 2:38 PM

#

supple grove Is the desired optimization by avoiding overfit that the test and train score go...

Sorry for the late reply. The goal is to make the model generalize better. If you simplify the model it will generalize better, which generally means the testing result will get better and training maybe a bit lower. But there are other ways to generalize, such as data augmentation.

#

In which case training may not be affected as much, and the model will perform better on new data as well.
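
A rough way to see the train/test gap described above, assuming scikit-learn is available. The synthetic data and the decision trees are only illustrative stand-ins, not the model from this conversation.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with 20% label noise, so a perfect training fit is memorization.
X, y = make_classification(n_samples=600, n_features=20, flip_y=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

scores = {}
for name, depth in [("unconstrained", None), ("max_depth=3", 3)]:
    model = DecisionTreeClassifier(max_depth=depth, random_state=0)
    model.fit(X_train, y_train)
    # Store (train accuracy, test accuracy) for each model complexity.
    scores[name] = (model.score(X_train, y_train), model.score(X_test, y_test))

for name, (train_acc, test_acc) in scores.items():
    gap = train_acc - test_acc
    print(f"{name}: train={train_acc:.3f} test={test_acc:.3f} gap={gap:.3f}")
```

The unconstrained tree fits the training split perfectly, which is the overfitting signature; limiting the depth usually shrinks the gap, though how much the test score moves depends on the data.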

supple grove Jan 13, 2025, 3:55 PM

#

All good thanks.

#

I read that data augmentation with tabular data could be difficult, not saying its off the table.
I tried to lower learningrate and colsamplebytree

remote stream Jan 13, 2025, 3:58 PM

#

is there a way to utilise gpu in windows

supple grove Jan 13, 2025, 3:59 PM

#

I got this instead, which seems to be more "generalized" and not overfitting?
I do wonder why the test auc mean is 67% but when I look at the last line (which is precision recall f1 support scores) they are very high, between 92% and 99%. I don't understand how they are so high when the test auc score is low.

jaunty helm Jan 13, 2025, 4:02 PM

#

supple grove I got this instead, which seems to be more "generalized" and not overfitting? I...

they are very high
I'm assuming you're formatting it like (train_precision, test_precision), (train_recall, test_recall), ... in which case... no? the test scores aren't great

supple grove Jan 13, 2025, 4:04 PM

#

no in case of the overfit?

#

Im not formatting it, just printing the result of print(precision_recall_fscore_support(y_test, y_pred))

jaunty helm Jan 13, 2025, 4:07 PM

#

supple grove Im not formatting it, just printing the result of print(precision_recall_fscore_...

oh I see
so it should be (class_1_precision, class_2_precision), (class_1_recall, class_2_recall), ... yeah?

#

(I've never used that function in my life tbh)

supple grove Jan 13, 2025, 4:08 PM

#

I'm just getting started, I was trying to understand the results im looking at.

supple grove Jan 13, 2025, 4:08 PM

#

jaunty helm oh I see so it should be `(class_1_precision, class_2_precision), (class_1_recal...

that would make sense

#

I dont understand when the testaucmean is around 67% shouldnt there be some of the results below that is closer to 67%?

jaunty helm Jan 13, 2025, 4:11 PM

#

supple grove that would make sense

then it looks like your model does very poorly when it comes to predicting class 2

#

my guess is that you have a heavy imbalance of data, i.e. class_1 appears way more than class_2

supple grove Jan 13, 2025, 4:12 PM

#

whats the testauc number at 67%?

jaunty helm Jan 13, 2025, 4:13 PM

#

supple grove whats the testauc number at 67%?

wdym
are you asking what auc is?

supple grove Jan 13, 2025, 4:17 PM

#

Idk why I thought it would correlate with performance

#

No, i had looked it up.

jaunty helm Jan 13, 2025, 4:21 PM

#

supple grove Idk why I thought it would correlate with performance

I mean... I wouldn't say the model is performing great unless I'm reading this incorrectly
would you mind doing a sklearn.metrics.classification_report(y_true, y_pred) and showing the results?

supple grove Jan 13, 2025, 4:21 PM

#

based on the auc value or the bottom line?

jaunty helm Jan 13, 2025, 4:22 PM

#

supple grove based on the auc value or the bottom line?

like put in the actual test values and the predicted test values?

supple grove Jan 13, 2025, 4:24 PM

#

I meant "wouldnt say its performing great" based on auc or the line at the bottom (prec rec etc)?

#

Im running the program again, ill let u know

#

jaunty helm Jan 13, 2025, 4:26 PM

#

supple grove I meant "wouldnt say its performing great" based on auc the the line in the bott...

I usually look at prec/rec/f1, and the .00-somethings on class_2 doesn't seem that good

supple grove Jan 13, 2025, 4:26 PM

#

ah, better formatting than the func I used lol.

jaunty helm Jan 13, 2025, 4:27 PM

#

supple grove

yeah so the "problem" is you have way more Falses than Trues, and your model does very poorly at predicting Trues overall
*this may or may not be a problem depending on your needs

supple grove Jan 13, 2025, 4:28 PM

#

yea, only about 8% true

#

in the set

jaunty helm Jan 13, 2025, 4:30 PM

#

supple grove yea, only about 8% true

basically you can imagine that what your model's learned is to pretty much always predict False because that'll almost always be correct simply due to your training data having mostly Falses

#

and your data is called "imbalanced" in this case
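
To make that concrete, here is a small sketch with made-up labels (8% positives, matching the ratio mentioned above) and a "model" that always predicts False. It also shows that `precision_recall_fscore_support` returns one array per metric, with one entry per class.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Made-up labels: 92 negatives (False) and 8 positives (True).
y_true = np.array([0] * 92 + [1] * 8)
# A degenerate model that always predicts the majority class.
y_pred = np.zeros_like(y_true)

accuracy = accuracy_score(y_true, y_pred)
# Each returned array has one entry per class, in label order [0, 1].
precision, recall, f1, support = precision_recall_fscore_support(
    y_true, y_pred, zero_division=0
)

print("accuracy :", accuracy)
print("precision:", precision)
print("recall   :", recall)
print("support  :", support)
```

Accuracy and the majority-class scores look great, while recall on the minority class is zero, which is why the per-class numbers matter more here than any single aggregate.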

supple grove Jan 13, 2025, 4:32 PM

#

hmm, I thought i was using smotetomek

jaunty helm Jan 13, 2025, 4:34 PM

#

supple grove hmm, I thought i was using smotetomek

well, ig I'd double check the training dataframe to make sure that the synthetic data was in there
and oversampling isn't always gonna work

supple grove Jan 13, 2025, 4:36 PM

#

I have forgotten to balance it after i switched to xgb, so need to repipe it

#

it wasnt balanced

#

hmm well i hope it works lol

calm thicket Jan 13, 2025, 4:40 PM

#

remote stream is there a way to utilise gpu in windows

yes. what are you trying to use it for

supple grove Jan 13, 2025, 4:42 PM

#

if I'm normalizing the data. It is wrong to normalize the data and then split it? should it be split and then normalized after the split?

jaunty helm Jan 13, 2025, 4:48 PM

#

supple grove if I'm normalizing the data. It is wrong to normalize the data and then split it...

normalize after split, otherwise you're leaking information from the test set

supple grove Jan 13, 2025, 4:49 PM

#

tried with smotetomek. went from useless to trash ^^

jaunty helm Jan 13, 2025, 4:49 PM

#

which will make the metrics less reflective of actual performance
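
A minimal sketch of the split-then-normalize order, assuming scikit-learn and random stand-in data: the scaler's statistics come from the training split only and are then reused on the test split.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Random stand-in data; only the order of operations matters here.
rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, scale=2.0, size=(200, 3))
y = rng.integers(0, 2, size=200)

# 1) Split first.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# 2) Fit the scaler on the training split only.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)

# 3) Apply the *training* mean and std to the test split (no refit).
X_test_scaled = scaler.transform(X_test)
```

When resampling or cross-validation is involved, putting the scaler inside a pipeline keeps the same guarantee for every fold.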

supple grove Jan 13, 2025, 4:50 PM

#

Right, I can see I actually did normalize after the split, as I thought I had when I began.

jaunty helm Jan 13, 2025, 4:50 PM

#

supple grove tried with smotetomek. went from useless to trash ^^

well it's doing a bit better on the minority class
I think it's p much always like this with imbalanced datasets, you have to decide when it's a worthwhile tradeoff (sacrificing performance on the majority class to improve performance on the minority)

supple grove Jan 13, 2025, 4:52 PM

#

the minority is much more important to predict

jaunty helm Jan 13, 2025, 4:52 PM

#

that's usually the case

#

depending on what model you use there also might be parameters that helps w/ imbalanced datasets

supple grove Jan 13, 2025, 4:54 PM

#

xgboost

jaunty helm Jan 13, 2025, 4:54 PM

#

jaunty helm depending on what model you use there also might be parameters that helps w/ imb...

specifically something something class weights should help

supple grove Jan 13, 2025, 4:54 PM

#

read it was best for tabdata

#

should i remove the oversamplying if trying class weights

jaunty helm Jan 13, 2025, 4:56 PM

#

supple grove should i remove the oversamplying if trying class weights

probably
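
For XGBoost specifically, the usual knob for a binary imbalance is `scale_pos_weight`, and the ratio of negatives to positives is the commonly suggested starting value. A small sketch with made-up labels; the `XGBClassifier` lines are left as comments so the snippet runs without the xgboost package installed.

```python
import numpy as np

# Made-up training labels with 8% positives.
y_train = np.array([0] * 920 + [1] * 80)

n_neg = int((y_train == 0).sum())
n_pos = int((y_train == 1).sum())

# Weight applied to positive samples so both classes contribute comparably.
scale_pos_weight = n_neg / n_pos
print("scale_pos_weight:", scale_pos_weight)

# Assumed usage (requires the xgboost package):
# from xgboost import XGBClassifier
# model = XGBClassifier(scale_pos_weight=scale_pos_weight)
# model.fit(X_train, y_train)
```

This is a starting point rather than a tuned value; it is worth checking per-class precision and recall after changing it.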

upbeat prism Jan 13, 2025, 6:11 PM

#

I wanna do a continual learning experiment on the fashion-MNIST dataset. For that I want a paper or reference that tells me a good model size.

Here's their arxiv doc https://arxiv.org/pdf/1708.07747 but I'm not sure which one is actually a neural network in the classical sense. Maybe it's the SGDClassifier? ^^

#

ah maybe the MLP no

#

but wtf is a sgdclassifier then

#

Linear classifiers (SVM, logistic regression, etc.) with SGD training.

ah hmm

supple grove Jan 13, 2025, 6:13 PM

#

i tried to do a gridsearch. that just lowered the scores when I applied the best "params". I feel the scores are really low and not good at predicting. Can I do something other than balancing and hyperparameter tuning? and could I hit a "jackpot", for example by doing something that would improve the accuracy 2x?

remote stream Jan 13, 2025, 7:23 PM

#

calm thicket yes. what are you trying to use it for

Ml and dl

calm thicket Jan 13, 2025, 7:25 PM

#

there is a way

agile cobalt Jan 13, 2025, 8:05 PM

#

generally speaking WSL is better supported, but some common tools do support native Windows just fine

throw windows subsystem for linux on google if you never heard about it before

calm thicket Jan 13, 2025, 8:11 PM

#

it's WSL

agile cobalt Jan 13, 2025, 8:24 PM

#

oops derp
edited, I knew something felt off but always get it mixed up

errant bison Jan 13, 2025, 9:53 PM

#

Can someone suggest a good project idea or any project definitions which is a requirement or good to learn

ornate iris Jan 14, 2025, 12:10 AM

#

So I had this strange Idea to put together a hybridized Latent Context Model and a Tiny Concept Model ( based on a Large concept model). Giving them a shared space, Then wrap them in an AI team infrastructure. Then apply Monte Carlo Tree Systems, Test time training and Surprise minimization.

Does anyone have any suggestions or guidance on this kind of thing?

lapis sequoia Jan 14, 2025, 12:35 AM

#

I guess I did most of what I could do, but I still got a CUDA OOM. Is this part of the VRAM unable to be allocated due to fragmentation or is it due to some other reason? I read the following document and still don't understand it.

#

#

weak oxide Jan 14, 2025, 3:34 AM

#

Is there a good video you guys recommend for clean learning and understanding of LSTMs like any favorite YouTuber?

#

Because LSTMs I kinda want to utilize for long term prediction of some prices

#

If anyone has a favorite YouTuber that's good on the subject that's all

rich moth Jan 14, 2025, 3:49 AM

#

lapis sequoia I guess I did most of what I could do, but I still got an CUDA OOM.Is this part ...

Try changing your batch size. I mean its probably not ideal, but you can try 4.

rich moth Jan 14, 2025, 3:57 AM

#

weak oxide Is there a good video you guys recommend for clean learning and understanding of...

Maybe this. https://www.youtube.com/watch?v=8HyCNIVRbSU

YouTube

The AI Hacker

Illustrated Guide to LSTM's and GRU's: A step by step explanation

LSTM's and GRU's are widely used in state of the art deep learning models. For those just getting into machine learning and deep learning, this is a guide in plain English with helpful visuals to help you grok LSTM's and GRU's.

Subscribe to receive video updates on practical Artificial Intelligence and it's applications.

Also, comment below an...

▶ Play video

serene scaffold Jan 14, 2025, 4:41 AM

#

lapis sequoia I guess I did most of what I could do, but I still got an CUDA OOM.Is this part ...

hello, it's easier for people to help you when you give all your code/errors as text, not as screenshots

#

!code

arctic wedgeBOT Jan 14, 2025, 4:41 AM

#

Formatting code on Discord

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

For long code samples, you can use our pastebin.

tawdry sundial Jan 14, 2025, 4:49 AM

#

whats better, llamaindex or langchain?

lapis sequoia Jan 14, 2025, 4:49 AM

#

hello

#

is starting with scikit learn a good idea? 🤔

#

https://scikit-learn.org/stable/

#

looks very solid

serene scaffold Jan 14, 2025, 4:56 AM

#

tawdry sundial whats better, llamaindex or langchain?

I don't know that either is "better", but I feel like whenever a new NLP technology comes out, there's a race to build the most popular library around that thing

#

and they tend to come up with obscure abstractions

timid jasper Jan 14, 2025, 4:57 AM

#

Hey, what’s the best way I can implement a generative ai ?
I need a sort of unlimited amount of requests, but I prefer my AI to be locally hosted than using an API
I have over 1 million specific texts I want to train my AI on

serene scaffold Jan 14, 2025, 4:57 AM

#

timid jasper Hey, what’s the best way I can implement a generative ai ? I need a sort of unli...

if you need to host it locally, your options are bounded by the GPU that you have. What GPU do you have?

tawdry sundial Jan 14, 2025, 4:58 AM

#

serene scaffold I don't know that either is "better", but I feel like whenever a new NLP technol...

which one do you prefer?

serene scaffold Jan 14, 2025, 4:58 AM

#

tawdry sundial which one do you prefer?

I don't use either.

timid jasper Jan 14, 2025, 4:58 AM

#

serene scaffold if you need to host it locally, your options are bounded by the GPU that you hav...

Nvidia 710 and using windows 10

tawdry sundial Jan 14, 2025, 4:58 AM

#

serene scaffold I don't use either.

its pretty popular rn

#

how should i decide?

serene scaffold Jan 14, 2025, 4:58 AM

#

timid jasper Nvidia 710 and using windows 10

You won't be able to use generative AI.

#

at least, not in a form that you would recognize.

timid jasper Jan 14, 2025, 5:00 AM

#

serene scaffold You won't be able to use generative AI.

What’s the minimum spec requirement?

serene scaffold Jan 14, 2025, 5:00 AM

#

timid jasper What’s the minimum spec requirement?

you'd be looking at GPUs that cost thousands of dollars each.

timid jasper Jan 14, 2025, 5:00 AM

#

Ah

#

So I’ll have to go for an api solution, any recommendations for a free to priced api with good amount of requests per day?

serene scaffold Jan 14, 2025, 5:01 AM

#

the generative AI boom has only been possible because of innovations in GPU technology since your GPU was designed.
and the cost of these new GPUs reflect their capabilities.

timid jasper Jan 14, 2025, 5:01 AM

#

Makes sense!

serene scaffold Jan 14, 2025, 5:03 AM

#

timid jasper So I’ll have to go for an api solution, any recommendations for a free to priced...

I don't know, unfortunately. I've been in industry since the generative AI boom started, and we just buy hardware and API credits.

lapis sequoia Jan 14, 2025, 5:14 AM

#

what differentiates an A100 and 5090 🤔 in terms of training models

serene scaffold Jan 14, 2025, 5:16 AM

#

lapis sequoia what differentiates an A100 and 5090 🤔 in terms of training models

the amount of VRAM and FLOPS

#

(so pretty much, the size and speed)

lapis sequoia Jan 14, 2025, 5:17 AM

#

serene scaffold the amount of VRAM and FLOPS

if it can do more tflops, shouldn't it be better at gaming though

serene scaffold Jan 14, 2025, 5:17 AM

#

lapis sequoia if it can do more tflops, shouldn't it be better at gaming though

better at gaming, than what

lapis sequoia Jan 14, 2025, 5:18 AM

#

a 4090 for example

serene scaffold Jan 14, 2025, 5:18 AM

#

I'm not saying which is better for either; I'm just saying that if you're comparing two GPUs, those are the main things

lapis sequoia Jan 14, 2025, 5:18 AM

#

yes

#

i just had another question 🫡

#

because it can do more calculations, shouldn't it technically render more polygons

#

so why dont they use a100 in those expensive gaming setups

#

hm

serene scaffold Jan 14, 2025, 5:21 AM

#

if the most graphically intense games are designed for at most 5090s, then there won't be added benefit from using a "better" GPU

lapis sequoia Jan 14, 2025, 5:22 AM

#

how is 4090 faster than a100 in terms of processing power if a100 has 300tflops and 4090 has 90tflops

#

am i missing a variable

lapis sequoia Jan 14, 2025, 5:22 AM

#

serene scaffold if the most graphically intense games are designed for at most 5090s, then there...

i see

serene scaffold Jan 14, 2025, 5:23 AM

#

like, if you need to carry 1L of water, and you have a 2L bucket, getting a 3L bucket wouldn't be an upgrade.

lapis sequoia Jan 14, 2025, 5:23 AM

#

serene scaffold like, if you need to carry 1L of water, and you have a 2L bucket, getting a 3L b...

my understanding is that, more tflops allow you to carry the bucket faster

#

vram is the size of the bucket

serene scaffold Jan 14, 2025, 5:24 AM

#

right; flops is speed. my metaphor is only for VRAM

lapis sequoia Jan 14, 2025, 5:26 AM

#

so confusing

#

apparently, the a100 is slower

#

but it has more vram so it can train larger models

#

and more efficient than a 4090 since it uses less power

serene scaffold Jan 14, 2025, 5:30 AM

#

whenever you say "more efficient", you have to say for what

lapis sequoia Jan 14, 2025, 5:30 AM

#

like power efficiency

#

so its used in data centers

serene scaffold Jan 14, 2025, 5:30 AM

#

less power per floating point operation?

lapis sequoia Jan 14, 2025, 5:30 AM

#

for personal use, multiple 4090 is preferred over a100

#

yes

jaunty helm Jan 14, 2025, 8:36 AM

#

timid jasper So I’ll have to go for an api solution, any recommendations for a free to priced...

mistral has a free tier with pretty generous limits
1 request/sec, 500k tokens/min, 1b tokens/month

jaunty helm Jan 14, 2025, 8:44 AM

#

lapis sequoia so why dont they use a100 in those expensive gaming setups

because they're massive and don't fit in a pc case
also good luck trying to cool them down

#

and also, you need software (drivers) that allows the hardware to render anything at all onto your screen
so e.g. if you only have an A100 linked up to your pc, it can't even render your desktop, unlike a 4090

#

the datacenter cards are pure CUDA compute

tight sphinx Jan 14, 2025, 11:31 AM

#

Why when I request to Llama API for the llama3.1-70b model it takes forever and then just stops working

#

I'm using it for my discord bot

#

Also, I wanna train an AI on some datasets and pdfs too (texts), however I only have GTX 1050 Ti.. Is it sufficient?

#

If so, what's the suitable model to choose?

jaunty helm Jan 14, 2025, 11:44 AM

#

tight sphinx Also, I wanna train an AI on some datasets and pdfs too (texts), however I only ...

assuming you mean LLMs, mostly no

tight sphinx Jan 14, 2025, 11:45 AM

#

jaunty helm assuming you mean LLMs, mostly no

Alright

tight sphinx Jan 14, 2025, 11:46 AM

#

tight sphinx Why when I request to Llama API for the llama3.1-70b model it takes forever and ...

And about this?

jaunty helm Jan 14, 2025, 11:47 AM

#

tight sphinx And about this?

too vague for me to say anything about it

#

and I don't use llama api in the first place

tight sphinx Jan 14, 2025, 11:48 AM

#

I should have used the official API 💀

rancid sorrel Jan 14, 2025, 2:18 PM

#

lapis sequoia so why dont they use a100 in those expensive gaming setups

the a100 doesn't have a gpu output

#

well video out

#

https://www.hardwarezone.com.sg/thumbs/683676/og.jpg

#

see lack of video out

brave sand Jan 14, 2025, 3:34 PM

#

does anyone know faster alternatives to scipy tocsc?

serene scaffold Jan 14, 2025, 4:18 PM

#

brave sand does anyone know faster alternatives to scipy tocsc?

what does tocsc do

brave sand Jan 14, 2025, 4:18 PM

#

serene scaffold what does tocsc do

converts it to the CSC matrix format (compressed sparse column)

serene scaffold Jan 14, 2025, 4:19 PM

#

if it's converting between data formats, rather than doing a calculation, then probably not

brave sand Jan 14, 2025, 4:19 PM

#

serene scaffold if it's converting between data formats, rather than doing a calculation, then p...

gotcha, well i implemented cupy instead but cupy is slower than scipy?

serene scaffold Jan 14, 2025, 4:20 PM

#

brave sand gotcha, well i implememented cupy instead but cupy is slower than scipy?

these are two orthogonal things. they can't be compared.

brave sand Jan 14, 2025, 4:20 PM

#

serene scaffold these are two orthogonal things. they can't be compared.

how so?

#

i replaced the sp (scipy) calls with cp and timed it

#

cupy is always slower

serene scaffold Jan 14, 2025, 4:21 PM

#

cupy is just "numpy on cuda", so it implements array data types themselves
are the cupy arrays that you're creating using the GPU?

brave sand Jan 14, 2025, 4:22 PM

#

serene scaffold cupy is just "numpy on cuda", so it implements array data types themselves are t...

how do i check if it's using the gpu?

#

shouldn't it default to using the gpu?

serene scaffold Jan 14, 2025, 4:23 PM

#

brave sand how do i check if it's using the gpu?

you can make an array and print arr.device

brave sand Jan 14, 2025, 4:24 PM

#

serene scaffold you can make an array and print `arr.device`

i did this:

```py
print("Available GPUs:", cp.cuda.runtime.getDeviceCount())
print("Current GPU:", cp.cuda.Device().id)
```

and got:
```
Available GPUs: 1
Current GPU: 0
```

serene scaffold Jan 14, 2025, 4:25 PM

#

serene scaffold these are two orthogonal things. they can't be compared.

but scipy is a library of additional array functions that aren't in numpy. whereas cupy is sort of a "copy" of numpy.

brave sand Jan 14, 2025, 4:26 PM

#

serene scaffold but scipy is a library of additional array functions that aren't in numpy. where...

ohhhh interesting

#

so i am using the gpu but it's still slower than scipy?

serene scaffold Jan 14, 2025, 4:26 PM

#

you keep saying "slower than scipy", but that's not how it works

#

you can compare cupy and numpy. you can't compare either of them to scipy

#

can you show the code you're trying to run that's too slow?

brave sand Jan 14, 2025, 4:28 PM

#

serene scaffold can you show the code you're trying to run that's too slow?

sure

brave sand Jan 14, 2025, 4:29 PM

#

serene scaffold can you show the code you're trying to run that's too slow?

https://hastebin.com/share/ayuzitopis.python

Hastebin

Hastebin is a free web-based pastebin service for storing and sharing text and code snippets with anyone. Get started now.

#

so context:
construct_reward_matrix is from a larger codebase that i ran a profiler on, it's saying .tocsc is taking the most time so i wanted to optimize it

serene scaffold Jan 14, 2025, 4:32 PM

#

brave sand https://hastebin.com/share/ayuzitopis.python

@brave sand is the main point of all this to compare the speed differences between the different array types?

brave sand Jan 14, 2025, 4:33 PM

#

serene scaffold @brave sand is the main point of all this to compare the speed differe...

i wanted to extract that function which was slow to see if implementing it in cupy is worth the performance boost vs implementing it in the larger codebase which might take longer and more effort

serene scaffold Jan 14, 2025, 4:34 PM

#

@brave sand and you're saying that the line that's too slow is which of these?

sp.coo_array((datas, (rows, cols)), shape=(self.n_actions, self.n_states))
cps.coo_matrix((datas, (rows, cols)), shape=(self.n_actions, self.n_states))

brave sand Jan 14, 2025, 4:35 PM

#

serene scaffold @brave sand and you're saying that the line that's too slow is which o...

yes, and the .tocsc() call

serene scaffold Jan 14, 2025, 4:35 PM

#

brave sand yes, and the .tocsc() call

which of those two lines is that?

#

because neither of them contain "tocsc"

brave sand Jan 14, 2025, 4:36 PM

#

        return R.tocsc()

#

this one

#

right below it

serene scaffold Jan 14, 2025, 4:36 PM

#

that same line appears after both; in which instance is it too slow?

brave sand Jan 14, 2025, 4:37 PM

#

serene scaffold that same line appears after both; in which instance is it too slow?

oh shoot, are they not equivalent in the sense that cupy and scipy have their own .tocsc?

#

like, does .tocsc work for scipy and cupy?

#

so it's not a valid comparison?

serene scaffold Jan 14, 2025, 4:38 PM

#

you have to look at what types sp.coo_array and cps.coo_matrix return. different types can have different methods with the same name, and that guarantees nothing.

brave sand Jan 14, 2025, 4:39 PM

#

serene scaffold you have to look at what types `sp.coo_array` and `cps.coo_matrix` return. diffe...

ok i will look into that. do you think this is even worth pursuing to make my code run faster?

this is the function from the codebase:

    def construct_reward_matrix(
        self, t: int, market: Market = None, **kwargs
    ) -> sp.csc_array:
        """
        Construct a sparse reward matrix

        Args:
            t (int): timestep of the market price data ([-1, horizon])
            market (Market): market of prices.

        Returns:
            sp._csc.csc_matrix |A|x|S| matrix
        """
        if market is None:
            market = self.market

        if self.verbose:
            print(
                f"Constructing {self.n_actions:,}x{self.n_states:,} reward matrix."
            )

        harvest_items = self.get_harvest_items(market=market, t=t)
        penalty_constraint_sat_items = self.get_penalty_constraint_sat_items(
            t=t
        )  
        entries = harvest_items + penalty_constraint_sat_items

        locations = {(i[0], i[1]) for i in entries}
        assert len(entries) == len(locations)

        rows = []
        cols = []
        datas = []

        for action_id, state_id, data in entries:
            rows.append(action_id) 
            cols.append(state_id)
            datas.append(data)

        R = sp.coo_array(
            (datas, (rows, cols)), shape=(self.n_actions, self.n_states)
        )
        return R.tocsc()

#

is the problem that converting to csc format is inherently slow no matter what library does it?

#

i thought using cupy would make it a lot faster because it's on the gpu

serene scaffold Jan 14, 2025, 4:43 PM

#

it looks like CSCs are for representing sparse 2d arrays with less memory, with performance costs. and it also takes time to convert to that format.

brave sand Jan 14, 2025, 4:45 PM

#

serene scaffold it looks like CSCs are for representing sparse 2d arrays with less memory, with ...

shld i try to not convert them to CSC?

serene scaffold Jan 14, 2025, 4:45 PM

#

brave sand shld i try to not convert them to CSC?

are you running out of memory?

brave sand Jan 14, 2025, 4:46 PM

#

serene scaffold are you running out of memory?

no?

serene scaffold Jan 14, 2025, 4:47 PM

#

brave sand no?

the point of CSC is to use less memory.

brave sand Jan 14, 2025, 4:47 PM

#

serene scaffold the point of CSC is to use less memory.

ok let me try it without converting to CSC. do you think gpu computing is still worth it?

serene scaffold Jan 14, 2025, 4:48 PM

#

brave sand ok let me try it without converting to CSC. do you think gpu computing is still ...

idk enough about your use case to say.

brave sand Jan 14, 2025, 4:48 PM

#

serene scaffold idk enough about your use case to say.

ok thanks!

#

Raw duration: 0.000000 seconds
Scipy duration: 0.001004 seconds
@serene scaffold

serene scaffold Jan 14, 2025, 4:54 PM

#

brave sand ``` Raw duration: 0.000000 seconds Scipy duration: 0.001004 seconds``` <@2536963...

is this with cupy?

brave sand Jan 14, 2025, 4:54 PM

#

serene scaffold is this with cupy?

raw python lists vs scipy

#

    def construct_reward_matrix(self, t, market=None, **kwargs):
        if market is None:
            market = self.market

        if self.verbose:
            print(f"Constructing {self.n_actions:,}x{self.n_states:,} reward matrix.")

        harvest_items = self.get_harvest_items(market=market, t=t)
        penalty_constraint_sat_items = self.get_penalty_constraint_sat_items(t=t)
        entries = harvest_items + penalty_constraint_sat_items

        locations = {(i[0], i[1]) for i in entries}
        assert len(entries) == len(locations)

        rows, cols, datas = zip(*entries)

        return rows, cols, datas

#

*no lists sorry
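
A reading of exactly 0.000000 seconds often says more about the timer than about the code: one fast call can fall below the clock's resolution. A minimal sketch of a steadier measurement, assuming a made-up `entries` list of (action_id, state_id, reward) triples in place of the real one; `best_duration` and `unzip_entries` are hypothetical helper names, not part of the codebase:

```python
import time

def best_duration(fn, repeats=5):
    """Return the fastest wall-clock time (seconds) over several runs of fn()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()  # high-resolution, monotonic clock
        fn()
        best = min(best, time.perf_counter() - start)
    return best

# Hypothetical stand-in for the real entries list.
entries = [(i % 100, i, float(i)) for i in range(50_000)]

def unzip_entries():
    # Same unpacking step as in the function above.
    rows, cols, datas = zip(*entries)
    return rows, cols, datas

elapsed = best_duration(unzip_entries)
print(f"Raw duration: {elapsed:.6f} seconds")
```

Taking the best of several runs reduces noise from other processes; the standard library's `timeit` module does similar bookkeeping.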

#

i need a datatype to store them though no?

serene scaffold Jan 14, 2025, 4:55 PM

#

what are "them"?

brave sand Jan 14, 2025, 4:56 PM

#

serene scaffold what are "them"?

rows cols datas

#

bc i do this:

    R = sp.coo_array((datas, (rows, cols)), shape=(self.n_actions, self.n_states))

#

shld i use a dictionary?

serene scaffold Jan 14, 2025, 4:57 PM

#

brave sand bc i do this: R = sp.coo_array((datas, (rows, cols)), shape=(self.n_act...

you're trying to store tabular data in python lists? you should use pandas or numpy for that.

brave sand Jan 14, 2025, 4:57 PM

#

serene scaffold you're trying to store tabular data in python lists? you should use pandas or nu...

it is slow though no?

serene scaffold Jan 14, 2025, 5:00 PM

#

brave sand it is slow though no?

don't fall into the trap of premature optimization. but numpy and pandas are "fast"
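
For reference, one way the triples could be moved into NumPy arrays before building the sparse matrix. This is only a sketch: `entries`, `n_actions`, and `n_states` are invented stand-ins for the values in the codebase, and whether it is actually faster would need to be measured.

```python
import numpy as np
import scipy.sparse as sp

# Hypothetical stand-ins for self.n_actions, self.n_states, and entries.
n_actions, n_states = 4, 6
entries = [(0, 1, 2.5), (3, 5, -1.0), (2, 0, 4.0)]

# One (n, 3) float array, then slice out each column.
arr = np.asarray(entries, dtype=np.float64)
rows = arr[:, 0].astype(np.int64)
cols = arr[:, 1].astype(np.int64)
datas = arr[:, 2]

# Same duplicate check as the original assert, done on the index pairs.
pairs = np.stack([rows, cols], axis=1)
assert len(np.unique(pairs, axis=0)) == len(pairs)

R = sp.coo_array((datas, (rows, cols)), shape=(n_actions, n_states)).tocsc()
print(type(R).__name__, R.shape, R.nnz)
```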

violet gull Jan 14, 2025, 5:56 PM

#

has anyone tried using an LLM for automated parsing in industry applications? I know LLMs can occasionally generate false information, but recently I've seen no false information when it comes to numbers and stuff. I'm wondering if anyone has found a way to successfully use one, and what kind of cross-checking ensures accuracy?

#

Is there a way to increase the accuracy to the point it is equivalent to a human-written parser on complex input, and would it be cost-effective, trading compute cost for development cost?

serene scaffold Jan 14, 2025, 6:14 PM

#

violet gull has anyone tried using LLM for automated parsing in industry applications? I kno...

Can you give an example of what you want to "parse"?

violet gull Jan 14, 2025, 6:17 PM

#

serene scaffold Can you give an example of what you want to "parse"?

Chemical Sensor data which comes in big excel files

serene scaffold Jan 14, 2025, 6:20 PM

#

violet gull Chemical Sensor data which comes in big excel files

suppose that you have that excel file as a dataframe, and you pick the column that has the values you want to extract as col.
you can then ask something like "The following is a list of ... from a spreadsheet about ... . Please extract all the ... values that appear in the list. Examples of this include ... . Please return the extracted values as a JSON array of strings; please do not include any other explanations or text in your response. {col.tolist()}"

you will need to use an LLM that's instruction-tuned (they typically have "instruct" in the name).

#

you can modify the prompt and the inference parameters (like temperature), but don't try to fine-tune.

#

as far as evaluating the system: you need a test set with which you can calculate the precision and recall.
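
A rough sketch of the prompt-plus-evaluation loop described above. No real model is called: `fake_response` stands in for whatever the LLM returns, and the column values, gold labels, and helper names (`build_prompt`, `parse_response`, `precision_recall`) are all made up for illustration.

```python
import json

def build_prompt(values, entity, topic):
    """Fill the prompt template with the column values, serialized as JSON."""
    return (
        f"The following is a list of cells from a spreadsheet about {topic}. "
        f"Please extract all the {entity} values that appear in the list. "
        "Please return the extracted values as a JSON array of strings; "
        "please do not include any other explanations or text in your response. "
        f"{json.dumps(values)}"
    )

def parse_response(text):
    """Parse the reply, rejecting anything that is not a JSON array of strings."""
    parsed = json.loads(text)
    if not isinstance(parsed, list) or not all(isinstance(v, str) for v in parsed):
        raise ValueError("expected a JSON array of strings")
    return parsed

def precision_recall(predicted, gold):
    """Set-based precision and recall of the extracted values."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall

# Hypothetical column and hand-labelled answers for a tiny test set.
col = ["NO2 42 ppb", "sensor offline", "CO 1.2 ppm", "O3 30 ppb"]
gold = ["NO2", "CO", "O3"]
fake_response = '["NO2", "CO", "SO2"]'  # placeholder for a real model reply

prompt = build_prompt(col, entity="gas name", topic="chemical sensor readings")
predicted = parse_response(fake_response)
precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Rejecting replies that are not valid JSON arrays is one cheap cross-check; the precision and recall numbers only mean something if the test set resembles the real spreadsheets.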

violet gull Jan 14, 2025, 6:26 PM

#

serene scaffold suppose that you have that excel file as a dataframe, and you pick the column th...

Oh I know how to do it, I just wonder if anyone actually does it and how accurate it is

#

On large scale

serene scaffold Jan 14, 2025, 6:26 PM

#

violet gull Oh I know how to do it, I just wonder if anyone actually does it and how accurat...

do you know why there isn't a universal answer for that question?

violet gull Jan 14, 2025, 6:26 PM

#

I didn’t ask for that

serene scaffold Jan 14, 2025, 6:27 PM

#

you're asking how accurate LLM-based entity extraction is in general/universally

#

are you not?

violet gull Jan 14, 2025, 6:28 PM

#

I’m asking has anyone done it, how well did it work

serene scaffold Jan 14, 2025, 6:29 PM

#

it worked well when I did it.

#

at least for certain types of entities.

violet gull Jan 14, 2025, 6:30 PM

#

serene scaffold it worked well when I did it.

Do you remember how accurate it was?

#

Failure rate? implementation complexity? what kind of data?

iron basalt Jan 14, 2025, 8:26 PM

#

brave sand shld i try to not convert them to CSC?

For sparse matrix formats in general: if your matrix is very sparse, it's worth it, else not.

#

It improves both memory usage and performance. It improves performance because when doing multiplication, you can skip all zero entries, resulting in fewer multiply-adds (fewer loop iterations in total; the zero entries are never even considered).

#

However, accessing those non-zero elements is slower than in a dense matrix (you jump around in memory to wherever the non-zeros are), so this must be outweighed by the matrix being sparse enough.

#

If you are not constructing this sparse matrix repeatedly, but only once, or only every once in a while, then formats such as CSR/CSC are optimal.

#

They take more time to construct, but run just as fast as dense matrices, while also skipping all zero entries and using less memory.

#

(Use COO if you need to construct the matrix repeatedly and need construction to be fast, i.e. when the ratio of the number of times you construct the matrix to the number of times you multiply with it is not small.)
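
To make the "skip all zero entries" point concrete, here is a pure-Python sketch of a matrix-vector product over CSC storage. It is illustrative only: `csc_matvec` is a hypothetical helper, the three arrays follow the usual CSC layout (column pointers, row indices, values), and real libraries do this in compiled code.

```python
def csc_matvec(n_rows, indptr, indices, values, x):
    """Compute y = A @ x for A in CSC form, touching only stored entries."""
    y = [0.0] * n_rows
    multiply_adds = 0
    for j in range(len(indptr) - 1):           # one pass per column
        for k in range(indptr[j], indptr[j + 1]):
            y[indices[k]] += values[k] * x[j]  # scattered write into y
            multiply_adds += 1
    return y, multiply_adds

# 3x3 example matrix: [[0, 1, 5], [0, 0, 7], [3, 0, 0]]
indptr = [0, 1, 2, 4]          # column j owns slots indptr[j]:indptr[j + 1]
indices = [2, 0, 0, 1]         # row index of each stored value
values = [3.0, 1.0, 5.0, 7.0]  # the non-zero values, packed contiguously

y, multiply_adds = csc_matvec(3, indptr, indices, values, [1.0, 2.0, 3.0])
print(y, multiply_adds)
```

The loop performs one multiply-add per stored value (4 here) instead of one per matrix cell (9 here), and the writes into `y` land at scattered positions, which is the memory-jumping cost mentioned above.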

iron basalt Jan 14, 2025, 8:37 PM

#

iron basalt If you are not constructing this sparse matrix repeatedly, but only once, or onl...

The reason CSR/CSC construction is slow is because it needs to basically make a packed format where all the non-zero entries are in one contiguous array (to be looped over during multiplication).

#

Converting a dense matrix to such a sparse matrix requires looping over all the matrix entries, creating this contiguous array from the non-zero entries. So if you then only use that to multiply once, you are not gaining anything unless the matrices are giant and very sparse (due to O(n^3) multiplication).
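
The packing step can be sketched in pure Python, starting from COO-style triples rather than a dense matrix. This is a simplified illustration of the idea, not SciPy's actual implementation; `coo_to_csc` is a hypothetical name.

```python
def coo_to_csc(rows, cols, datas, n_cols):
    """Pack (row, col, value) triples into CSC arrays with a counting sort."""
    # Pass 1: count how many entries land in each column.
    counts = [0] * n_cols
    for c in cols:
        counts[c] += 1

    # Prefix sums give each column's start offset in the packed arrays.
    indptr = [0] * (n_cols + 1)
    for j in range(n_cols):
        indptr[j + 1] = indptr[j] + counts[j]

    # Pass 2: drop every entry into the next free slot of its column.
    indices = [0] * len(datas)
    values = [0.0] * len(datas)
    next_slot = indptr[:-1].copy()
    for r, c, v in zip(rows, cols, datas):
        k = next_slot[c]
        indices[k] = r
        values[k] = v
        next_slot[c] += 1
    return indptr, indices, values

indptr, indices, values = coo_to_csc(
    rows=[0, 2, 1, 0], cols=[2, 0, 2, 1], datas=[5.0, 3.0, 7.0, 1.0], n_cols=3
)
print(indptr, indices, values)
```

Every entry is visited twice and copied once, so the conversion is linear in the number of stored entries, but it is extra work that only pays off if the packed matrix is then reused.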

rancid sorrel Jan 14, 2025, 8:39 PM

#

brave sand does anyone know faster alternatives to scipy tocsc?

dataclass

serene scaffold Jan 14, 2025, 8:48 PM

#

rancid sorrel dataclass

that's not an alternative to what they're using.

rancid sorrel Jan 14, 2025, 8:49 PM

#

you wanna take A and make it B thats a sparse matrix?

hollow carbon Jan 14, 2025, 8:50 PM

#

anyone know some free real time data api that i can use for data analysis maybe sports or something

rancid sorrel Jan 14, 2025, 8:50 PM

#

hollow carbon anyone know some free real time data api that i can use for data analysis maybe ...

each sport or betting shop will put up a lot of stats

#

depends on sport

hollow carbon Jan 14, 2025, 8:51 PM

#

wdym

rancid sorrel Jan 14, 2025, 8:51 PM

#

major league baseball and american football tend to put out a lot of stats

hollow carbon Jan 14, 2025, 8:51 PM

#

yeah basketball or football

rancid sorrel Jan 14, 2025, 8:51 PM

#

like you can see the players' injuries and criminal convictions if you want

hollow carbon Jan 14, 2025, 8:51 PM

#

football meaning soccer

rancid sorrel Jan 14, 2025, 8:52 PM

#

no "football" the real one 😉 doesn't put out as many stats

hollow carbon Jan 14, 2025, 8:52 PM

#

ahh wb basketball

rancid sorrel Jan 14, 2025, 8:52 PM

#

https://www.mlb.com/stats/

MLB.com

2024 MLB Player Hitting Stat Leaders

The official source for player hitting stats, MLB home run leaders, batting average, OPS and stat leaders

thick rapids Jan 14, 2025, 8:52 PM

#

Guys one question, do you think an intelligent person with an economic background can be a good data scientist

rancid sorrel Jan 14, 2025, 8:53 PM

#

thick rapids Guys one question, do you think an intelligent person with an economic backgroun...

sure, probably better than a lot of data scientists

hollow carbon Jan 14, 2025, 8:53 PM

#

rancid sorrel https://www.mlb.com/stats/

is there an api i could use

thick rapids Jan 14, 2025, 8:53 PM

#

Because I’m doing a master's in data science and I have a lot of friends with a computer science background who look down on me

hollow carbon Jan 14, 2025, 8:53 PM

#

idk pretty new to this api stuff

rancid sorrel Jan 14, 2025, 8:54 PM

#

hollow carbon ahh wb basketball

plus if you really wanna blow the minds of your professors, you can mention the leagues from when merica was more racist 😉

#

but generally the discipline you're looking at @hollow carbon is data scraping, there are 2 major types of this
webscraping (http is an api)
actual api's like REST, gRPC, this usually has a more defined model

#

if you want an api to play with, i recommend Guild Wars 2 or Eve Online, both have a market API you can mess with, documentation, and a community that knows what's going on

#

there is also this https://www.oauth.com/playground/

#

but this is more "software developer"

#

fyi this is what i am like when i write papers 😉

serene scaffold Jan 14, 2025, 9:04 PM

#

thick rapids Guys one question, do you think an intelligent person with an economic backgroun...

masters degrees in data science tend to have a lower level of rigor than masters in computer science that are about data science

rancid sorrel Jan 14, 2025, 9:04 PM

#

the economist probably has more math

#

computer scientists tend to be software engineers with decent discrete math

thick rapids Jan 14, 2025, 9:05 PM

#

serene scaffold masters degrees in data science tend to have a lower level of rigor than masters...

Yea it’s just the fact that they are so full of themselves I can’t stand em

serene scaffold Jan 14, 2025, 9:05 PM

#

thick rapids Yea it’s just the fact that they are so full of themselves I can’t stand em

then stop being friends with them and start planning your revenge

rancid sorrel Jan 14, 2025, 9:05 PM

#

or just earn so much money you can buy them

thick rapids Jan 14, 2025, 9:05 PM

#

serene scaffold then stop being friends with them and start planning your revenge

My revenge will be me getting the best grades I guess

rancid sorrel Jan 14, 2025, 9:06 PM

#

do you know R and Python?

#

on top of your economics mathematics?

thick rapids Jan 14, 2025, 9:06 PM

#

I have to do a project about spatial data in sql using pgadmin4 and postgis

thick rapids Jan 14, 2025, 9:06 PM

#

rancid sorrel do you know R and Python?

Yes I know em both

rancid sorrel Jan 14, 2025, 9:06 PM

#

congrats you're a data scientist

#

you just care about the answer

thick rapids Jan 14, 2025, 9:07 PM

#

rancid sorrel congrats you're a data scientist

Loooll thanks for the encouragement
I’m dealing with sql right now

thick rapids Jan 14, 2025, 9:07 PM

#

rancid sorrel you just care about the answer

What

rancid sorrel Jan 14, 2025, 9:07 PM

#

a lot of the time data scientists/engineers don't care about the answer

#

your next recommended netflix watch doesn't really matter that much

#

well not since everyone is using the same algorithm

#

btw anyone know a good book for programming/finance intro?

thick rapids Jan 14, 2025, 9:09 PM

#

Check books about econometrics

#

I don’t know why you need the book tho

rancid sorrel Jan 14, 2025, 9:10 PM

#

i need to improve my time series programming, but i need domain knowledge about finance to make code that is, well, useful

thick rapids Jan 14, 2025, 9:14 PM

#

Study stocks

#

You know finance is a bit difficult to start with

rancid sorrel Jan 14, 2025, 9:17 PM

#

yeah that's more or less what i am doing
can i calculate a return and a monte carlo simulation, yes. do i know what they are for, no

#

can i do those in 3-4 different methods in 2-3 programming languages, yes. this is where i am: most of the skill and none of the reason

brave sand Jan 14, 2025, 9:44 PM

#

iron basalt For sparse matrix formats in general. If your matrix is very sparse, worth it, e...

it's a reward matrix, it is pretty sparse

thick rapids Jan 14, 2025, 10:03 PM

#

rancid sorrel can i do those in 3-4 diffrnet methods in 2-3 programming laungess yes, this is...

Try simpler business cases. For example, categorising clients or studying time series of sales. Don’t jump into finance; it's like jumping into Java without knowing some Python or some basic programming

rancid sorrel Jan 14, 2025, 10:22 PM

#

I can already do that 🙂

thick rapids Jan 14, 2025, 10:57 PM

#

rancid sorrel I can already do that 🙂

Ah ok I didn’t know
Then you’re ready for finance, congrats

rancid sorrel Jan 14, 2025, 11:03 PM

#

I am going back to learn how to do it in power bi as well 🙂

thick rapids Jan 14, 2025, 11:27 PM

#

You may also try Tableau

wicked pine Jan 15, 2025, 6:01 AM

#

guys

#

where can i paste code?

So i can get a link that leads to the code

rancid sorrel Jan 15, 2025, 7:32 AM

#

!paste

arctic wedgeBOT Jan 15, 2025, 7:32 AM

#

Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.

pearl blaze Jan 15, 2025, 10:42 AM

#

i started the khan academy linear algebra course of 144 videos some days ago for ml, are 144 vids worth it?

#

i want to build a strong foundation, so i will complete it anyway, but i just wanna ask: will this help me in practical use cases too?

#

And i don't wanna become a research scientist sort of thing, my ultimate aim is to start my startup and grow.

#

i know i can call an api and integrate it into my web project, as i also know a good amount of web dev, but i also wanna make something unique and do things like helping local / remote businesses at a very large scale, that's why i started ml from scratch. And i am doing this along with software dev.

thick rapids Jan 15, 2025, 11:45 AM

#

You need time and practice on real world problems

#

I guess even 1444 videos aren’t worth as much as a project

supple grove Jan 15, 2025, 2:49 PM

#

ornate iris So I had this strange Idea to put together a hybridized Latent Context Model and...

Where did this idea come from?

ornate iris Jan 15, 2025, 2:51 PM

#

supple grove Where did this idea come from?

I read a paper on Latent Consistency Models and realized there was potential to enhance their capabilities, as well as a paper on Large Context Models.

#

This was right after reading about some performance metrics Microsoft was getting from a small language model and MCTS compared to O1

#

But I realized that TTT, MCTS and Surprise Minimization are each applied in slightly different aspects. So while they're each showing effectiveness independently, there's a lot of potential if they're used in tandem and used properly.

lilac island Jan 15, 2025, 4:43 PM

#

Is there a Python course related to data analytics?

real smelt Jan 15, 2025, 6:17 PM

#

I want to learn Data science, where do i begin

errant bison Jan 15, 2025, 6:26 PM

#

I want to create project using llms, nlp/ rag. Can anyone give idea how to go about it

gloomy river Jan 15, 2025, 6:37 PM

#

errant bison I want to create project using llms, nlp/ rag. Can anyone give idea how to go ab...

Using "llms" as in importing an already fine tuned model? You can get a lot of API keys with limited access from the web, if your device can handle it you can also run an LLM locally

past meteor Jan 15, 2025, 8:07 PM

#

real smelt I want to learn Data science, where do i begin

Have a look at the pinned posts, some of the resources I tend to recommend are in there

errant bison Jan 15, 2025, 8:43 PM

#

gloomy river Using "llms" as in importing an already fine tuned model? You can get a lot of A...

No, fine-tuning the model as well. Also, do we need to pay to use the API? I just want to build a project to learn

green pilot Jan 15, 2025, 9:35 PM

#

For people looking into data analytics with Python, I recommend just getting on YouTube and looking things up. There are free data sets out there; use pandas and other commonly used libraries. Right now I've found a free data set of customers and I'm working on making graphs and heat maps with it

gentle sage Jan 16, 2025, 4:30 AM

#

What separates a resume worthy project from a pet project? I’ve been practicing using RNN and LSTM on time series data however I don’t know what makes it resume worthy.

serene scaffold Jan 16, 2025, 4:32 AM

#

gentle sage What separates a resume worthy project from a pet project? I’ve been practicing ...

keep in mind that some employers won't even consider your personal projects. but if they were to look at your personal projects, they'd want to see explanations for how the system works and what decisions you made. why did you pick certain hyperparameters? what analysis have you done on the performance of the system?

gentle sage Jan 16, 2025, 4:38 AM

#

I have it annotated in a Jupyter notebook with comments on functionality inline

serene scaffold Jan 16, 2025, 4:39 AM

#

gentle sage I have it annotated in a Jupyter notebook with comments on functionality inline

it would be even better if you used markdown cells. and if you used math latex for anything mathematical.

gentle sage Jan 16, 2025, 4:42 AM

#

I’m sorry im not the best with formal names. I did use markdown cells with latex but lowkey I just c+v and filled it out

#

Kinda just copied format from other notebooks I’ve seen

final cobalt Jan 16, 2025, 5:12 AM

#

Hey people

#

Halp Q.Q

#

https://hastebin.com/share/ovopetumeq.py

#

Somewhere in this code is a bug. Yesterday this code (a messier version of it) worked. Today it isn't working. Specifically, yesterday it continued to learn much further before converging. Now it isn't getting far at all before getting caught in some kind of minimum

unkempt apex Jan 16, 2025, 7:29 AM

#

final cobalt https://hastebin.com/share/ovopetumeq.py

Use the pastebin

pearl blaze Jan 16, 2025, 12:13 PM

#

thick rapids I guess even 1444 videos aren’t worth as much as a project

i know projects are the main thing but i'm building a foundation now, and i will also learn numpy (already learned python) to implement at least what i am learning for now; i think linear algebra can be implemented using numpy, that's why.

thick rapids Jan 16, 2025, 12:13 PM

#

I hate linear algebra and I dont think you need it as much as statistics

#

start with stats

pearl blaze Jan 16, 2025, 12:15 PM

#

205 videos of statistics

#

khan academy

#

worth it ?

#

and i am starting with linear algebra because its required in statistics too

#

as it provides essential tools for managing and analyzing data, particularly when dealing with multiple variables

#

It simplifies the handling of large datasets and is crucial for understanding and applying statistical concepts effectively

thick rapids Jan 16, 2025, 12:44 PM

#

bruh yeah, but don't spend too much of your energy on math theory, go on with stats and python

fallow coyote Jan 16, 2025, 12:45 PM

#

How do you balance learning the mathematics for ML and programming? I like learning the maths behind it, but I don't want to spend so much time learning it and then not spend enough time programming.

pearl blaze Jan 16, 2025, 1:31 PM

#

fallow coyote How do you balance learning the mathematics for ML and programming? I like learn...

I am balancing ! i think time management

fickle shale Jan 16, 2025, 1:43 PM

#

fallow coyote How do you balance learning the mathematics for ML and programming? I like learn...

it mostly depends on your desired outcome. if u really want to learn the math/stats behind ml algos u need to give time to theory!

#

Learn and implement!

scarlet quest Jan 16, 2025, 1:43 PM

#

Hey Guys

#

What's Going On

fallow coyote Jan 16, 2025, 1:46 PM

#

fickle shale it's mostly depend on your final outcome. if u really want to learn math/stats b...

I want to learn enough of the maths to sufficiently utilise the ML libraries and be able to create a ML program where I understand what its doing, if that makes sense. Currently in my foundation year at uni so, all the ML shit I'll be learning in my 2nd year of uni.

fickle shale Jan 16, 2025, 1:47 PM

#

fallow coyote I want to learn enough of the maths to sufficiently utilise the ML libraries and...

How much is enough is very difficult to answer!

#

The basics of linear algebra, probability, stats, and calculus take time!

pearl blaze Jan 16, 2025, 2:27 PM

#

thick rapids bruh yeah but for dont spend much of your energy in mat teory go on with stats ...

most of my energy i am trying to give to connecting theory with practicality.

#

if we learn 2% every day, and implement at least 30% of it, it's more than enough.

thick rapids Jan 16, 2025, 2:29 PM

#

yeah

pearl blaze Jan 16, 2025, 2:29 PM

#

i would like to know, is there any library

#

that connects

#

linear algebra with practicality, basically a library that helps me apply my daily linear algebra dose

thick rapids Jan 16, 2025, 2:37 PM

#

there are no shortcuts my friend, unfortunately you have to connect things yourself

untold bloom Jan 16, 2025, 2:55 PM

#

scipy.linalg, numpy.linalg
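
As a small taste of what those modules offer, a sketch using `numpy.linalg` on a made-up 2x2 system (the matrix and right-hand side are arbitrary examples picked for illustration):

```python
import numpy as np

# Solve A x = b for a small system: 2x + y = 3, x + 3y = 5.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])
x = np.linalg.solve(A, b)

# Eigen-decomposition of the same matrix (eigh is for symmetric matrices).
eigenvalues, eigenvectors = np.linalg.eigh(A)

print(x)
print(eigenvalues)
```

Checking results by hand, for example that `A @ x` reproduces `b`, is a simple way to tie each day's theory to something runnable.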

fallow coyote Jan 16, 2025, 3:19 PM

#

I think I'm better off learning ML/AI when I've reached a sufficient understanding of mathematics whilst I'm at uni. Its such a complex topic that a uni environment would be much better for me to fully understand it

serene grail Jan 16, 2025, 3:53 PM

#

fallow coyote I think I'm better off learning ML/AI when I've reached a sufficient understandi...

You can also watch some 3blue1brown videos on YouTube, he has a video series on calculus, a video series on linear algebra and some videos on how transformers work
He starts from the basics and tries to make it visual and intuitive
So when you start learning that stuff in uni you will have an advantage

fallow coyote Jan 16, 2025, 3:56 PM

#

serene grail You can also watch some 3blue1brown videos on YouTube, he has a video series on ...

I could but, at this point, I just want to make stuff. I want to learn and then create stuff. I'm in no rush as I'll be on the AI part in the 2nd year of my course (currently in foundation year, not even in the 1st year yet) so, I've got another a year and a half to prepare. I enjoy learning the maths but can't be asked to learn it further. I'll move onto something else for the meantime

midnight rain Jan 16, 2025, 4:18 PM

#

fallow coyote I could but, at this point, I just want to make stuff. I want to learn and then ...

but u better learn math before moving on

#

if u dont then u wont understand what you are doing

final cobalt Jan 16, 2025, 5:56 PM

#

unkempt apex Use the pastebin

?

unkempt apex Jan 16, 2025, 5:56 PM

#

!pastebin

arctic wedgeBOT Jan 16, 2025, 5:56 PM

#

Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.

final cobalt Jan 16, 2025, 6:46 PM

#

https://paste.pythondiscord.com/KINA

#

Just in case anyone can see any obvious issues. I'm pretty sure whatever bug I'm looking for is in here. Sin switched with cos or something.

dense needle Jan 16, 2025, 6:50 PM

#

fallow coyote How do you balance learning the mathematics for ML and programming? I like learn...

there's no substitute for putting the time into both and you should just take classes for both if you are able. that said imo there are better and more resources available for independently studying programming than there are for math/statistics.

disclaimer: i'm not an ML expert. i am a person with a math/stats background who is transitioning into data science/programming.

final cobalt Jan 16, 2025, 6:51 PM

#

fallow coyote How do you balance learning the mathematics for ML and programming? I like learn...

In theory you can get away with very little math and still make AI

#

In theory

#

But it's extremely helpful. Sorta the difference between using both arms or having one tied behind your back

dense needle Jan 16, 2025, 6:54 PM

#

i am actually planning to work through Intro to Statistical Learning later in the spring/summer if anyone is interested in forming a group

final cobalt Jan 16, 2025, 6:54 PM

#

As someone who taught myself programming, I know that no amount of studying in the world is a substitute for actually getting your hands on the thing, but also that the foundational knowledge will fill in gaps you can't really learn on your own

dense needle Jan 16, 2025, 6:54 PM

#

or to start working through it. it's a big book lol

final cobalt Jan 16, 2025, 6:55 PM

#

My advice is to simply push along both fronts, and you'll learn what you need to (or at least, come to know what you don't know) in due course

#

And in any case, it's a moot point. You're going to need Calculus period, full stop. It doesn't matter what you're going to end up doing, calculus is a must have. That alongside linear algebra is the basis of ML

#

So don't sweat it. Just focus on learning Calculus and when you're done, assuming you've dicked around with ML in the mean time, you'll have a pretty good idea of what's left to study once that's over

#

On that front, I do have a little advice

#

Professor ~~Sexy~~ Leonard is your friend. https://www.youtube.com/professorleonard

YouTube

Professor Leonard

#

#1 best calculus teacher ever. You'll be fine

#

Second, do your calculus in a single straight shot. Do calc I, Calc II in the second term, Calc III in the summer (online if you have to) and then Calc IV in the second fall

#

Third: https://www.mathway.com/Algebra

This is what got me through Calculus. It'll break down the problems for you step by step. That said, that was before ChatGPT

Mathway | Algebra Problem Solver

Free math problem solver answers your algebra homework questions with step-by-step explanations.

#

And fourth: Subscribe to ChatGPT. It's $20 (USD) per month, but it'll be the best money you'll ever spend on an educational expense. You can give it a calculus problem and it'll break it down for you step by step with full English explanations. It'll also help with everything else in your educational life

#

And lastly, be a nuisance here. Ask every question you need to (after doing the reading of course, we aren't here to do your studying)

fallow coyote Jan 16, 2025, 7:09 PM

#

dense needle there's no substitute for putting the time into both and you should just take cl...

I know that but like I said, I want to make stuff and not just spend copious amounts of time learning the background info. Tbf I have no fucking idea what I want to do. I just want to do something in my spare time where I feel like I'm doing something whilst I get through the shit that is uni

dense needle Jan 16, 2025, 7:11 PM

#

Maybe just try making stuff then until you bump into walls based on math limitations

#

Which is maybe something like “I can make this model but it performs poorly”?

#

Or “idk which model is appropriate here”

fallow coyote Jan 16, 2025, 7:19 PM

#

Idk man. I've gone back to square one, as I usually do. It's not like the maths is hard to learn. My mathematical ability has always been my strongest part. Even with complicated concepts I can power through and learn. I just want to do something interesting where I get stuck in and not do menial BS

final cobalt Jan 16, 2025, 7:22 PM

#

fallow coyote Idk man. Ive gone back to square one, as I usually do. Its not like the maths is...

https://tenor.com/view/calculus-cat-cat-math-studying-cat-gif-23164559

Tenor

#

Learn

#

Calculus

dense needle Jan 16, 2025, 7:26 PM

#

fallow coyote Idk man. Ive gone back to square one, as I usually do. Its not like the maths is...

what's the most recent math class you took

fallow coyote Jan 16, 2025, 7:29 PM

#

In uni. Literally all my classes are maths (mechanics, electronics and your pure maths). Ill send you a link with my uni course: https://www.shu.ac.uk/courses/computing/bsc-honours-artificial-intelligence-and-robotics-with-foundation-year/full-time#course-modules

BSc (Honours) Artificial Intelligence and Robotics with Foundation ...

Develop the skills and knowledge to create robots that are increasingly intelligent – combining artificial intelligence with principles of electrical engineering to create innovative, autonomous devices.

#

Its utter dogshit atm. I dropped out of uni over a year ago after doing biomed for three years. Was essentially forced to go back to uni by my parents. Dont want to do uni anymore or at least do uni the conventional way

dense needle Jan 16, 2025, 7:30 PM

#

Seems like if you want to make some stuff just try to make it then and see where you land

fallow coyote Jan 16, 2025, 7:36 PM

#

Yeah but with ML stuff, I'm very limited in what I can do if I don't understand what the code and values are telling me. I'm just going to switch to a different area of computing for now

#

Over a year of programming and I'm still barely a beginner.

analog gust Jan 16, 2025, 8:36 PM

#

if anyone with knowledge in sklearn could read this i would be really thankful
also if anyone is for some reason super interested in the topic please feel free to DM me since i don't think there's gonna be a simple answer for this ...

i'm not sure how to even start but, i was gonna program an AI for an NPC boss for my bachelor thesis, and my professor kind of pushed me in the direction of using scikit-learn and he really wants me to use machine learning.
now, maybe my understanding of machine learning just isn't that great but, until now i've set up a python server which receives all data entries for the player's movement, actions, distance to the boss, attacks, dash directions, etc. and the idea was to use sklearn to evaluate the data and send its results back to the game, to change the boss's stats for the next round.
but i really don't have a lot of data that i'm sending, at least nothing compared to the sample sizes in the sklearn examples. and how is a plot of a DBSCAN clustering gonna help me evaluate the results? it seems i still need to evaluate them myself in the end? i've never worked with sklearn before, so maybe i just don't know its full potential.
I guess my question is basically, if your professor told you to use machine learning for this specific example, what would you do with the received data and how would that be helpful in the end?
ty in advance y'all are lovely

serene scaffold Jan 16, 2025, 9:07 PM

#

analog gust if anyone with knowledge in sklearn could read this i would be really thankful <...

has your professor used scikit-learn for this? because I've never heard of anyone using scikit-learn for reinforcement learning, and I don't think it's possible.

#

what is your professor's research domain?

analog gust Jan 16, 2025, 9:09 PM

#

serene scaffold has your professor used scikit learn for this? because I've never heard of anyon...

all I know is that another student did a similar thesis but about a top-down bullet-hell game, and he also encouraged her to use scikit-learn but kinda complained that she didn't really use the machine learning part all that much...

serene scaffold Jan 16, 2025, 9:11 PM

#

https://scikit-learn.org/stable/faq.html#why-is-there-no-support-for-deep-or-reinforcement-learning-will-there-be-such-support-in-the-future

analog gust Jan 16, 2025, 9:11 PM

#

I don't think he works a lot with scikit himself, he only teaches advanced game development and not really ML

serene scaffold Jan 16, 2025, 9:11 PM

#

scikit-learn decidedly doesn't support reinforcement learning

#

it says so in their FAQ

#

do not try to do this with sklearn, because it can't be done. show them the link I just sent you. do not let them convince you to continue this exercise in futility.

analog gust Jan 16, 2025, 9:13 PM

#

so reinforcement learning would be the only option that would even make sense in this scenario?

serene scaffold Jan 16, 2025, 9:13 PM

#

yes

#

if the other student got in trouble for not using the ML tools in sklearn for that assignment, it's because there are no ML tools in sklearn for that.

analog gust Jan 16, 2025, 9:14 PM

#

thank you, that's kind of bad news but, at least I know not to keep looking for solutions now

serene scaffold Jan 16, 2025, 9:14 PM

#

you asked about this a few weeks ago, right?

analog gust Jan 16, 2025, 9:16 PM

#

I think I opened a thread for it but at that time my issue was mostly still even connecting my project to a python environment to even access sklearn

serene scaffold Jan 16, 2025, 9:16 PM

#

https://pytorch.org/rl/stable/index.html

#

try this.

analog gust Jan 16, 2025, 9:17 PM

#

I will look into it

iron basalt Jan 16, 2025, 9:34 PM

#

analog gust so reinforcement learning would be the only option that would even make sense in...

There are a few other options too, such as behavioral cloning (copying a player playing through it) or genetic algorithms (evolution). Genetic algorithms take the most compute, but given enough training time they can end up far above human performance in a lot of video games. They are really good when applicable and when you can afford them (they can also be combined with other methods, such as reinforcement learning).

#

Reinforcement learning will work depending on the specific game / reward signal.

#

On some problems it will just never get there in any reasonable amount of training time (unless you help it out a lot, which is basically cheating).

#

For animals IRL, they have a ton of evolved stuff that basically bootstraps them, so their reinforcement learning only needs to tweak things, it does not need to start from scratch (where the search space is so massive you often don't get anywhere).

#

(e.g. putting everything in your mouth when young is an evolved data collection behavior which helps the reinforcement learning, it gives it a bunch of data it would not have gotten otherwise through completely random actions (starting from random init))

#

(behavioral cloning is another way of getting a starting point)
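
(For anyone reading along: a minimal sketch of the genetic algorithm idea mentioned above, in plain Python. Everything here is an assumption for illustration: the `evolve` helper, the three "genes" standing in for boss parameters scaled to [0, 1], and especially `toy_fitness`, which just measures closeness to a made-up target. In a real game the fitness would have to come from actually playing rounds, which is where the compute cost comes from.)

```python
import random


def evolve(fitness, n_genes, pop_size=30, generations=40,
           mutation_scale=0.1, seed=0):
    """Tiny genetic algorithm: keep the best half, refill with mutated children."""
    rng = random.Random(seed)
    # Each individual is a list of genes in [0, 1].
    population = [[rng.uniform(0.0, 1.0) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: rank by fitness (higher is better) and keep the top half.
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: each gene comes from one of the two parents.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Mutation: small Gaussian noise, clamped back into [0, 1].
            child = [min(1.0, max(0.0, g + rng.gauss(0.0, mutation_scale)))
                     for g in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)


# Hypothetical stand-in for "how well did this boss configuration do".
TARGET = [0.7, 0.3, 0.5]


def toy_fitness(genes):
    # Negative squared distance to the made-up target; 0.0 is a perfect score.
    return -sum((g - t) ** 2 for g, t in zip(genes, TARGET))


best = evolve(toy_fitness, n_genes=3)
print(best, toy_fitness(best))
```

(Since the parents are carried over unchanged each generation, the best score never gets worse between generations; the expensive part in practice is that every `fitness` call means simulating or playing a round.)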

analog gust Jan 16, 2025, 9:47 PM

#

iron basalt You can also do a few other options. Such as behavioral cloning (copying a playe...

I see yea... well right now my game is kind of made to be probably less than 3 minutes of playtime in which the player either kills the boss, gets killed, or the timer runs out. after that, the next round starts obviously. in theory, the boss should change according to not only the play style (i.e. aggressive/defensive etc.) but of course also the skill level. I guess it would make sense that I try to feed it test samples of how aggressive/defensive behavior would show up in the data? I'd hate for the test player to have to play for hours to see any changes. which method makes sense for this?
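
(A rough sketch of the "feed it labelled samples of aggressive/defensive play" idea, as plain supervised classification with a nearest-centroid rule. The feature names, the numbers, and the `SCALES` values are all made up for illustration and are not from the actual game. scikit-learn's `sklearn.neighbors.NearestCentroid` implements the same basic idea; the plain-Python version below is only here so the logic is visible.)

```python
from statistics import mean

# Hypothetical per-round features:
# (attacks per minute, mean distance to boss, fraction of dashes toward boss)
labelled_rounds = {
    "aggressive": [(42.0, 2.1, 0.8), (38.0, 2.8, 0.7), (45.0, 1.9, 0.9)],
    "defensive": [(12.0, 7.5, 0.2), (15.0, 6.9, 0.3), (10.0, 8.2, 0.1)],
}

# Assumed rough range of each feature, so no single feature dominates the distance.
SCALES = (50.0, 10.0, 1.0)


def fit_centroids(data):
    """Average the feature vectors of each label into one centroid per label."""
    return {label: tuple(mean(col) for col in zip(*rows))
            for label, rows in data.items()}


def predict(centroids, features, scales=SCALES):
    """Return the label whose centroid is closest in scaled squared distance."""
    def dist(centroid):
        return sum(((f - c) / s) ** 2
                   for f, c, s in zip(features, centroid, scales))
    return min(centroids, key=lambda label: dist(centroids[label]))


centroids = fit_centroids(labelled_rounds)
print(predict(centroids, (40.0, 2.5, 0.75)))  # a round with many close-range attacks
print(predict(centroids, (11.0, 8.0, 0.15)))  # a round spent mostly at a distance
```

(With a handful of labelled rounds per style this needs no long play sessions, but it only labels the play style; deciding how the boss should change in response is still a separate design step.)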