#data-science-and-ml | Python | Page 112

final kiln Mar 20, 2024, 12:25 PM

#

I think it's just adding a dimension of size 1 to the end of it, but not sure

river cape Mar 20, 2024, 12:25 PM

#

final kiln I think it's just adding a dimension of size 1 to the end of it, but not sure

If I dont add that , i get an error

final kiln Mar 20, 2024, 12:26 PM

#

it's probably there because of broadcasting rules

river cape Mar 20, 2024, 12:34 PM

#

One more thing have you heard of cross validation score?

#

In that we divide the training set into train-test folds and then compute the accuracies right?

#

What does a fold mean?

final kiln Mar 20, 2024, 12:38 PM

#

river cape What does a fold mean?

I think it just means split
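For reference, a fold is one of the k roughly equal chunks the training data is split into; each fold takes one turn as the held-out part while the other k-1 folds are used for fitting. Below is a minimal NumPy sketch of that idea. The helper name `kfold_indices` is hypothetical, and scikit-learn's own splitters may shuffle or stratify differently.

```python
import numpy as np

def kfold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs, one per fold (hypothetical helper)."""
    indices = np.arange(n_samples)
    folds = np.array_split(indices, k)  # k chunks of nearly equal size
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train_idx, test_idx

splits = list(kfold_indices(10, 5))
for train_idx, test_idx in splits:
    print("train:", train_idx, "test:", test_idx)
```

Every sample ends up in the test part exactly once, which is why the per-fold scores can be averaged into a single cross-validation score.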

untold bloom Mar 20, 2024, 12:47 PM

#

scalers (like StandardScaler() or MinMaxScaler()) expect their input to be 2D

#

single output regressors (which are most of them, like LinearRegression() or RandomForestRegressor()) yield their .predict output as 1D

#

so one must reshape the 1D prediction output to be 2D to be inverse-transformable via those scalers

#

so you reshape a 1D input of shape (N,) to be (N, 1)

#

.reshape(-1, 1) is one of the ways, another is [:, None] or [:, np.newaxis]

#

in your code, I presume you'll find a similar sort of reshaping to fit sc2 to your training target values in the first place

#

because again, they expect a 2D input, your target is 1D

untold bloom Mar 20, 2024, 12:52 PM

#

untold bloom .reshape(-1, 1) is one of the ways, another is `[:, None]` or `[:, np.newaxis]`

can also do .reshape(N, 1) but why be error prone when you can put -1 to make it infer
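The three spellings mentioned above can be compared directly. This is only a small NumPy sketch with made-up numbers; `pred` stands in for the 1D output of a regressor's `.predict`.

```python
import numpy as np

pred = np.array([1.5, 2.0, 3.5])      # shape (3,), like a single-output .predict result

a = pred.reshape(-1, 1)               # -1 lets NumPy infer the row count
b = pred[:, None]                     # None is an alias for np.newaxis
c = pred[:, np.newaxis]

print(pred.shape, a.shape, b.shape, c.shape)
print(np.array([[6.5]]).shape)        # double brackets already give a 2D array
```

All three produce the same (N, 1) array, which is the 2D layout the scalers expect.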

final kiln Mar 20, 2024, 1:01 PM

#

I usually use unsqueeze

river cape Mar 20, 2024, 1:36 PM

#

untold bloom scalers (like StandardScaler() or MinMaxScaler()) expect their input to be 2D

Isnt the input already in 2D?

#

by the use of double square brackets

#

[[6.5]] is in 2D right

desert oar Mar 20, 2024, 1:58 PM

#

river cape If I dont add that , i get an error

what's the error? error messages are there to inform you what the problem is. you are supposed to read them, not treat them as an opaque blob of red text.

#

sometimes error messages are unhelpful, but even knowing where the error came from (the "traceback" part) is useful

#

and it's especially useful when asking for help, because otherwise you're forcing other people to guess at what the problem might be

mellow vector Mar 20, 2024, 2:01 PM

#

is there a meaningful difference between a) df.drop(index = 50) and b) df = df.loc[~(df.index == 50)] also, should i focus less on bracket notation?

river cape Mar 20, 2024, 2:03 PM

#

desert oar what's the error? error messages are there to inform you what the problem is. yo...

This is the error I am getting

desert oar Mar 20, 2024, 2:05 PM

#

mellow vector is there a meaningful difference between a) df.drop(index = 50) and b) df = df.l...

yes. the latter is very wasteful. it constructs an entire boolean array

#

it doesn't help here, but don't forget that ~(x == y) is just x != y

mellow vector Mar 20, 2024, 2:05 PM

#

ya i thought about that after the fact

desert oar Mar 20, 2024, 2:06 PM

#

river cape This is the error I am getting

the error message looks pretty clear to me. what's the issue?

mellow vector Mar 20, 2024, 2:06 PM

#

regarding 50 though, im not working with unique indexes

#

is generating a series still overkill?

desert oar Mar 20, 2024, 2:07 PM

#

mellow vector regarding 50 though, im not working with unique indexes

for starters, don't use non-unique indexes

desert oar Mar 20, 2024, 2:07 PM

#

mellow vector is generating a series still overkill?

in the case of a non-unique index then you might just get an error with .drop, you'd have to see what happens

mellow vector Mar 20, 2024, 2:07 PM

#

i don't have a choice in that, it's the coursework

desert oar Mar 20, 2024, 2:07 PM

#

oh?

#

is this a multiindex? that's different

mellow vector Mar 20, 2024, 2:08 PM

#

no it's just the instructors preference i guess but thats kind of besides the point

desert oar Mar 20, 2024, 2:08 PM

#

!e
```python
import pandas as pd
df = pd.DataFrame({"i": [1, 2, 2], "y": [4, 5, 6]}).set_index("i")
df = df.drop(index=1)
print(df)
```

arctic wedgeBOT Mar 20, 2024, 2:08 PM

#

@desert oar :white_check_mark: Your 3.12 eval job has completed with return code 0.

desert oar Mar 20, 2024, 2:09 PM

#

!e
```python
import pandas as pd
df = pd.DataFrame({"i": [1, 2, 2], "y": [4, 5, 6]}).set_index("i")
df = df.drop(index=2)
print(df)
```

arctic wedgeBOT Mar 20, 2024, 2:09 PM

#

@desert oar :white_check_mark: Your 3.12 eval job has completed with return code 0.

001 |    y
002 | i   
003 | 1  4

desert oar Mar 20, 2024, 2:09 PM

#

looks like it works

#

so yes, use .drop when available
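To compare the two approaches side by side, here is a small sketch reusing the same toy frame as the eval above. The variable names `dropped` and `masked` are just for illustration.

```python
import pandas as pd

df = pd.DataFrame({"i": [1, 2, 2], "y": [4, 5, 6]}).set_index("i")

dropped = df.drop(index=2)          # removes every row labelled 2
masked = df.loc[df.index != 2]      # same rows, via an explicit boolean mask

print(dropped)
print(dropped.equals(masked))
```

With a non-unique index, both versions remove all rows carrying the label; `.drop` simply states the intent more directly.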

mellow vector Mar 20, 2024, 2:16 PM

#

hmm think i will, just out of curiosity do you happen to know if pandas handles different index types differently? if it were a rangeindex it could obviously jump straight to it

#

eh nvm, i should practice reading docs

untold bloom Mar 20, 2024, 2:35 PM

#

river cape Isnt the input already in 2D?

the input to .inverse_transform is the output of .predict right?

final kiln Mar 20, 2024, 2:35 PM

#

no way I actually need that sqrt right ? there's no floor nor anything, so there's an assumption that the result is always an integer, which means that I can simplify it from wtv properties guarantee that assumption

untold bloom Mar 20, 2024, 2:35 PM

#

that's the 1D array I meant

final kiln Mar 20, 2024, 2:36 PM

#

the result is not only an integer, it is also odd

#

otherwise the -1 and division by 2 wouldnt be an integer

#

(odd number) = sqrt(x), what can I say about x?

ancient drift Mar 20, 2024, 2:43 PM

#

anyone know why my acc is constant for all epochs (ive just shown 2 but it stays constant after that) when i use binary cross entropy as a loss function? seems to calculate just fine when i switch to normal non-binary cross entropy

lapis sequoia Mar 20, 2024, 2:47 PM

#

has anyone already experienced the training loss and inference results becoming strange after a tensorflow version upgrade?

#

it's very weird, there is no warning or deprecation notice

#

is pytorch more safe?

final kiln Mar 20, 2024, 2:49 PM

#

final kiln no way I actually need that sqrt right ? there's no floor nor anything, so there...

actually, im gonna do the reasonable thing and just keep two extra arrays to use as a lookup table

#

it's allocated once when the model is instantiated and then reused across all layers

river cape Mar 20, 2024, 2:52 PM

#

untold bloom the input to `.inverse_transform` is the output of `.predict` right?

Yes

final kiln Mar 20, 2024, 2:53 PM

#

ancient drift anyone know why my acc is constant for all epochs (ive just shown 2 but it stays...

how many classes does your model output have ?

final kiln Mar 20, 2024, 2:54 PM

#

lapis sequoia anyone already experimented the training loss and inference result begain strang...

good idea to double check your code and then see if someone has opened an issue on their github repo

#

most times double checking your code works, in my experience at least

ancient drift Mar 20, 2024, 2:54 PM

#

final kiln how many classes does your model output have ?

2 classes, flawless and flawed, 1 output neuron

final kiln Mar 20, 2024, 2:54 PM

#

ancient drift 2 classes, flawless and flawed, 1 output neuron

shouldn't you have 2 output neurons ?

#

one per class

ancient drift Mar 20, 2024, 2:56 PM

#

final kiln shouldn't you have 2 output neurons ?

i read its better to have one output neuron for binary classifiers but idk

final kiln Mar 20, 2024, 2:58 PM

#

ancient drift i read its better to have one output neuron for binary classifiers but idk

yeah im reading it too

#

so the loss goes down, but acc remains stable with repeated values

#

that is kinda odd

#

maybe the API for the binary cross entropy has something different to it

#

it's optimizing for something other than acc

#

that is, you may be using binary wrong

ancient drift Mar 20, 2024, 3:01 PM

#

maybe, ill check the docs

#

funny thing is when i run this model on my test data it performs the best in acc

final kiln Mar 20, 2024, 3:02 PM

#

sounds like a deep debugging is required

#

never had this problem at least

untold bloom Mar 20, 2024, 3:14 PM

#

river cape Yes

and it is 1D, hence the error

river cape Mar 20, 2024, 3:14 PM

#

untold bloom and it is 1D, hence the error

SO for inverse_transform we need a 2d array

untold bloom Mar 20, 2024, 3:17 PM

#

indeed

river cape Mar 20, 2024, 3:34 PM

#

For polynomial regression, how do we decide the degree of the polynomial?

final kiln Mar 20, 2024, 3:36 PM

#

river cape For polynomial regression, how do we decide the degree of the polynomial?

heavily depends on what your plot looks like: if it's a line you go for linear, if it's a parabola you go for x**2, if it does an S kinda thing you go for cubed, if the shape is complicated you go for a high degree. you're fitting a Taylor approximation

#

dont go too high tho, cuz after a certain power floating point stops working at certain ranges

river cape Mar 20, 2024, 3:39 PM

#

final kiln heavily depends on how your plot looks like, if it's a line you go for linear, i...

Okay so I did with degree 4 and here is the results

final kiln Mar 20, 2024, 3:39 PM

#

that looks like an x**2 or x**3

river cape Mar 20, 2024, 3:39 PM

#

final kiln dont go too high tho, cuz after a certain power floating point stops working at ...

Oh so this would look like this?

final kiln Mar 20, 2024, 3:40 PM

#

it works the same as when you're fitting a network

#

the degree is a hyperparameter

#

and you gotta check for overfitting

river cape Mar 20, 2024, 3:40 PM

#

final kiln and you gotta check for overfitting

#

Is this overfitting?

final kiln Mar 20, 2024, 3:41 PM

#

I don't know, which points were used for fitting and which are being used for validation/test ?

river cape Mar 20, 2024, 3:42 PM

#

X_grid = np.arange(min(X),max(X),0.01)
X_grid = X_grid.reshape(len(X_grid),1)
plt.scatter(X,Y,color = 'red')
plt.plot(X_grid,lin_reg2.predict(poly.fit_transform(X_grid)),color='Blue')
plt.title("Polynomial Regression Results")
plt.xlabel("Positon Level")
plt.ylabel("Salaries")
plt.show()

#

Only difference is the degree

#

One is degree 4 and the other is degree 24

final kiln Mar 20, 2024, 3:43 PM

#

you gotta select a certain percentage of points at random

#

and not use them during fitting

#

then calculate the error on those points after the fit

#

if that's high, you got an overfit
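The hold-out procedure described above can be sketched with plain NumPy. The data here is synthetic (a noisy quadratic), and the 25% split and the list of degrees are arbitrary choices for illustration, not values from the thread.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 40)
y = 1.0 + 2.0 * x - 3.0 * x**2 + rng.normal(scale=0.1, size=x.size)  # synthetic data

# Hold out 25% of the points at random; they are never used for fitting.
perm = rng.permutation(x.size)
val_idx, train_idx = perm[:10], perm[10:]

def mse(coeffs, xs, ys):
    """Mean squared error of the polynomial with the given coefficients."""
    return float(np.mean((np.polyval(coeffs, xs) - ys) ** 2))

errors = {}
for degree in (1, 2, 4, 8):
    coeffs = np.polyfit(x[train_idx], y[train_idx], degree)
    errors[degree] = (mse(coeffs, x[train_idx], y[train_idx]),
                      mse(coeffs, x[val_idx], y[val_idx]))

for degree, (train_err, val_err) in errors.items():
    print(f"degree {degree}: train MSE {train_err:.4f}, validation MSE {val_err:.4f}")
```

A training error that keeps shrinking while the validation error stops improving, or gets worse, is the usual sign that the degree is too high.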

river cape Mar 20, 2024, 3:44 PM

#

final kiln if that's high, you got an overfit

Whats the problem if it is overfitting?

#

Or if a model is overfitting?

final kiln Mar 20, 2024, 3:45 PM

#

river cape Whats the problem if it is overfitting?

it doesn't predict unseen data with accuracy

#

there's like, an infinite number of curves you can pass through those points

#

you want the one that'll be most useful to you

final kiln Mar 20, 2024, 3:47 PM

#

river cape

im surprised you could fit a 24 degree tho

#

I think it's doing that cuz you didn't give it enough time to fit

#

24 degrees is a lot, most of physics happens in the 1st 2nd and 3rd

#

oh, one way to improve it within the same number of iterations would be to do an x -> x - 5 substitution, so that the center of the approx is at x = 5

river cape Mar 20, 2024, 3:50 PM

#

Thing is when i visualize the results normally, the graph shows me a straight line to each point

#

plt.scatter(X,Y,color = 'red')
plt.plot(X,lin_reg2.predict(X_poly),color = 'blue')
plt.title("Polynomial Regression Results")
plt.xlabel("Positon Level")
plt.ylabel("Salaries")
plt.show()

#

This is for normal visualization

#

final kiln Mar 20, 2024, 3:52 PM

#

makes sense

#

24 is too high

#

you gotta do the train/val split too

#

try like, 5 or 6 degrees

river cape Mar 20, 2024, 3:52 PM

#

river cape Okay so I did with degree 4 and here is the results

but when I divide the x -area into more specific values it becomes this.

#

river cape Mar 20, 2024, 3:58 PM

#

final kiln 24 is too high

Is it necessary for a validation set?

final kiln Mar 20, 2024, 4:01 PM

#

I mean you can gauge it by looking at the graph, but the correct way is using a split yeah

#

It's also a good idea to normalize the data

#

Divide the x axis by 10 and the y axis by 1e6

#

It helps prevent overflow
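One way to see the effect of centring and scaling x is to look at the condition number of the polynomial design matrix. This sketch assumes position levels 1 to 10 (a guess based on the plot labels) and degree 4; the scaling to roughly [-1, 1] is one common choice, not the only one.

```python
import numpy as np

x = np.arange(1, 11, dtype=float)          # assumed position levels 1..10
degree = 4

raw = np.vander(x, degree + 1)             # columns x^4, x^3, ..., x^0
x_scaled = (x - x.mean()) / (x.max() - x.min()) * 2   # centred, in [-1, 1]
scaled = np.vander(x_scaled, degree + 1)

cond_raw = np.linalg.cond(raw)
cond_scaled = np.linalg.cond(scaled)
print(f"condition number, raw x:    {cond_raw:.3g}")
print(f"condition number, scaled x: {cond_scaled:.3g}")
```

A smaller condition number means the least-squares fit is less sensitive to rounding, and the gap grows quickly as the degree goes up.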

mild grotto Mar 20, 2024, 4:11 PM

#

#

@wooden sail a little animation I made, thanks

final kiln Mar 20, 2024, 4:19 PM

#

chat gpt almost nailed this transcription, I think if I'm more careful with my scribbles I can get it to do all my latex

wooden sail Mar 20, 2024, 4:21 PM

#

mild grotto

awesome

final kiln Mar 20, 2024, 4:53 PM

#

this is the math for the cuda kernel

#

I already caught an error when writing this

#

chat gpt was also not very good, might as well just code it right away

#

dirac should be lower lower on the third eqn

#

4 to 5 is actually wrong, first term is running over too many l's

#

i gotta limit it to those where k = k'

#

there's also one more symmetry consideration besides Mkk' = Mk'k, which is talking about the coordinates of the two vectors being dotted, but also, there's symmetry with cc', since the dot product itself is commutative, that is, qncc' = qnc'c

dusky abyss Mar 20, 2024, 5:32 PM

#

i have 3 classes represented by 0, 1 and 8, when i use tf's to_categorical on them all i get are 1 0 and 0 1 which is 2 classes, i checked the documentation and from what i understand it should work with multiple classes, i even changed my classes to 0 1 and 2 but in that case i only get a single class, 1

#

nvm its working now, had to explicitly tell it i had 3 classes
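For labels that are not contiguous, such as 0, 1 and 8, one option is to map them to indices 0..K-1 before one-hot encoding. The sketch below does this in plain NumPy rather than with `to_categorical`; as far as I know, `tf.keras.utils.to_categorical` also accepts a `num_classes` argument, which is what fixed it above.

```python
import numpy as np

labels = np.array([0, 1, 8, 1, 0, 8])

# Map arbitrary label values to contiguous indices 0..K-1 first.
classes, idx = np.unique(labels, return_inverse=True)
one_hot = np.eye(len(classes), dtype=int)[idx]

print(classes)        # the original label values, sorted
print(one_hot.shape)  # (n_samples, n_classes)
```

Keeping `classes` around makes it possible to translate a predicted column index back to the original label.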

final kiln Mar 20, 2024, 5:46 PM

#

this is gonna require some faith

#

or maybe a drawing to convince myself it makes sense

#

but again, the idea is that the metric tensor is symmetric so I save half the operations on a given calculation of the dot product, and the dot product is commutative, so I save half the operations cuz the resulting scores matrix is then also symmetric

#

half at each step
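The symmetry claim is easy to check numerically. This is a small NumPy sketch with random values; `M` and `X` are illustrative stand-ins, not the actual tensors from the derivation.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4))
M = A + A.T                       # a symmetric "metric" matrix (illustrative values)
X = rng.normal(size=(5, 4))       # 5 vectors with 4 coordinates each

scores = X @ M @ X.T              # scores[c, c2] = x_c . M . x_c2
print(np.allclose(scores, scores.T))
```

Because `scores` equals its own transpose whenever `M` is symmetric, only the upper triangle needs to be computed and stored.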

clear dove Mar 20, 2024, 5:52 PM

#

Hey, everyone

#

I am currently in my 2nd year in the AI & DS field and I need a mini project that I can present at my University and also use for my portfolio. any ideas??

long canopy Mar 20, 2024, 6:06 PM

#

@final kiln what times have you been getting on your instance+container startups?

#

am trying to see if I want to go instance -> container -> task or just instance -> task

#

am gonna run a couple of tests later

final kiln Mar 20, 2024, 6:07 PM

#

if you have the image locally it should be like a couple seconds

#

usually better after you've initted it at least once before

long canopy Mar 20, 2024, 6:08 PM

#

final kiln if you have the image locally it should be like a couple seconds

yeah i plan to upload the image as an artifact and spin up the instance from an already-made-for-the-task image

#

a custom minimal arch image or something

#

but it feels like it might be less efficient on a large scale

final kiln Mar 20, 2024, 6:11 PM

#

long canopy yeah i plan to upload the image as an artifact and spin up the instance from an ...

I pre-prepare an AMI that contains the image, it does work, but it still takes 2min to startup

#

it's a lot worse if I don't do it

#

haven't found a good solution for this

long canopy Mar 20, 2024, 6:16 PM

#

final kiln I pre-prepare an AMI that contains the image, it does work, but it still takes 2...

what's the image?

long canopy Mar 20, 2024, 6:16 PM

#

final kiln haven't found a good solution for this

i'll keep you up to date on my times

final kiln Mar 20, 2024, 6:17 PM

#

long canopy what's the image?

just a py image with pip install torch >.>

#

the nvidia stuff is heavy

#

I almost wonder if I should yield and just use the base AMI

#

but I don't want vendor lockin

long canopy Mar 20, 2024, 6:18 PM

#

final kiln the nvidia stuff is heavy

did you compare the nvidia build with the non-nvidia build?

#

because virtualization can cause fuckups here

final kiln Mar 20, 2024, 6:18 PM

#

long canopy did you compare the nvidia build with the non-nvidia build?

yeah torch is huge with nvidia

#

like really big in comparison with the cpu version

long canopy Mar 20, 2024, 6:18 PM

#

i mean in terms of speed

final kiln Mar 20, 2024, 6:18 PM

#

haven't compared really

#

but ought to be faster, tho might not be

long canopy Mar 20, 2024, 6:19 PM

#

hm i'll be running these benchmarks in the coming week

long canopy Mar 20, 2024, 6:19 PM

#

final kiln but ought to be faster, tho might not be

yeah the virtualization can mess things up

final kiln Mar 20, 2024, 6:20 PM

#

long canopy yeah the virtualization can mess things up

there's no virtualization tho

#

unless you got a mismatch between the image arch and the machine arch

long canopy Mar 20, 2024, 6:20 PM

#

you're not on cloud?

final kiln Mar 20, 2024, 6:21 PM

#

long canopy you're not on cloud?

I am, you mean they virtualize the underlying machine?

long canopy Mar 20, 2024, 6:21 PM

#

yeah the instance is a virtual machine, it's not bare metal

final kiln Mar 20, 2024, 6:22 PM

#

idk if they virtualize the gpu

#

I recall doing something like having multiple processes use the gpu at the same time, and there were no guard rails anywhere

#

might as well just do a 1:1 mounting of the gpu into the VM's right

sturdy thistle Mar 20, 2024, 7:16 PM

#

“If you want to go fast, go alone. If you want to go far, go together”

Hey there! I'm currently self-studying statistics as a prerequisite for artificial intelligence. So, I'm looking to join a community of like-minded individuals. If you're also starting to learn prerequisites for AI, I'd love to connect. We could share knowledge, update each other on the topics we're covering each day, and discuss our plans for tomorrow or the week ahead. Let me know if you're interested in teaming up to support each other's learning journey!

umbral delta Mar 20, 2024, 8:22 PM

#

Can someone help with part b?

final kiln Mar 20, 2024, 8:23 PM

#

mixing Taylor and Fourier features I see

umbral delta Mar 20, 2024, 8:26 PM

#

yes

umbral delta Mar 20, 2024, 8:26 PM

#

final kiln mixing Taylor and Fourier features I see

i did part a) and it was massively overfitted
enough that the dots fully overlapped

final kiln Mar 20, 2024, 8:27 PM

#

so isn't it the same problem but with L1 instead of L2

umbral delta Mar 20, 2024, 8:36 PM

#

final kiln so isn't it the same problem but with L1 instead of L2

yes, but idk how to do the penalty term in sklearn

open raven Mar 20, 2024, 9:08 PM

#

I'm encoding 2 categorical variables stored in one pandas data frame made up of 2 columns. Each column label is a string of alphabetic characters, with no whitespace.
A one-hot encoder is used, instantiated with the arguments handle_unknown='ignore', sparse_output=False.

The encoder delivers a data frame whose column labels are a RangeIndex, i.e. numeric.

I expected the labels of the new features to be the concatenation of the original feature name and the encoded category. But the encoder delivers data frame columns labeled numerically. Instantiating the encoder with the argument feature_name_combiner='concat' doesn't help.

What am I missing?

If I understand the sklearn.preprocessing.OneHotEncoder API documentation correctly (constructor parameter feature_name_combiner), the encoder should in these circumstances deliver concatenated labels for the encoded features: old feature name plus encoded category.

open raven Mar 20, 2024, 10:01 PM

#

Well, get_feature_names_out() helps. Called on the fitted encoder, its result can be used as the column labels of the data frame with the encoded categories.

desert oar Mar 20, 2024, 10:23 PM

#

umbral delta Can someone help with part b?

it sounds like they just want you to swap out L1 distance in place of L2, no?

#

that is, sum( | y_predicted - y_actual | ) instead of sqrt( sum( ( y_predicted - y_actual )^2 ) )
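In NumPy terms the two distances look like this. The numbers are made up, and this only shows the L1 and L2 norms of a residual vector, not the assignment's actual optimization code.

```python
import numpy as np

y_actual = np.array([3.0, -1.0, 2.0, 7.0])
y_predicted = np.array([2.5, 0.0, 2.0, 9.0])
residual = y_predicted - y_actual

l1 = np.sum(np.abs(residual))            # sum of absolute errors
l2 = np.sqrt(np.sum(residual ** 2))      # square root of the sum of squared errors

print(l1, l2)
print(np.isclose(l1, np.linalg.norm(residual, 1)),
      np.isclose(l2, np.linalg.norm(residual, 2)))
```

The L1 version grows linearly with each error, so a single large outlier pulls the fit less than it does under L2.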

desert oar Mar 20, 2024, 10:24 PM

#

umbral delta yes, but idk how to do the penalty term in sklearn

can you share the code that they provided in the python notebook?

desert oar Mar 20, 2024, 10:25 PM

#

open raven Encoding 2 categorial variables stored in one pandas data frame comprised from 2...

what are you trying to do?

umbral delta Mar 20, 2024, 11:35 PM

#

desert oar can you share the code that they provided in the python notebook?

https://drive.google.com/file/d/1DrjPPs6I4u8ZF2HLI_-Qnn-LwDyHj6Rz/view?usp=sharing

Google Docs

LPNormOptimizationProblems.ipynb

Colaboratory notebook

kindred blade Mar 21, 2024, 1:03 AM

#

#

LMAO

gritty vessel Mar 21, 2024, 1:16 AM

#

Is this the correct way to feed data in cnn?

#

I have data where each row represents an image

#

I want to capture spatial features as well as temporal features

#

Each array is of 1000*1000

serene scaffold Mar 21, 2024, 2:11 AM

#

gritty vessel Is this the correct way to feed data in cnn?

this is not readable.

#

if you're going to post a screenshot, make sure it's only exactly what you need to share.

gritty vessel Mar 21, 2024, 2:11 AM

#

Oh I'm sorry for the overlook

gritty vessel Mar 21, 2024, 2:12 AM

#

serene scaffold if you're going to post a screenshot, make sure it's only exactly what you need ...

I only wanted to share the screenshot. I took it from my PC and sent it to my home email from my college email

gritty vessel Mar 21, 2024, 2:12 AM

#

serene scaffold if you're going to post a screenshot, make sure it's only exactly what you need ...

I will explain it

#

First two columns are date and time and rest other columns are latitude longitude Imgtir1 imgtir2 imgvis imgswir all these are an array of 1000*1000 each in each row

serene scaffold Mar 21, 2024, 2:13 AM

#

what are you using for the neural network

gritty vessel Mar 21, 2024, 2:14 AM

#

For extracting spatial features from my data

serene scaffold Mar 21, 2024, 2:14 AM

#

"Imgtir1 imgtir2 imgvis imgswir" -- what do these mean?

serene scaffold Mar 21, 2024, 2:14 AM

#

gritty vessel For extracting spatial features from my data

I'm asking what library you're using to create the neural network. not why you're making it.

gritty vessel Mar 21, 2024, 2:14 AM

#

Oh sorry

#

I'm using tensorflow

serene scaffold Mar 21, 2024, 2:15 AM

#

"Imgtir1 imgtir2 imgvis imgswir" -- what do these mean?

gritty vessel Mar 21, 2024, 2:17 AM

#

They are short wave infrared electromagnetic light and visible light

#

I am working on satellite data

serene scaffold Mar 21, 2024, 2:18 AM

#

so you probably need your data as a tensor with this shape
(num_rows, 1000, 1000, 4)

#

so basically, each image is a 3d array, two dimensions for height and width, and one dimension for those four... parts?

#

spectra, maybe?

#

dataframes are strictly two-dimensional. so you don't want it as that.
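Building that tensor from separate per-band arrays can be sketched with `np.stack`. The sizes here are tiny stand-ins for the real 1000x1000 arrays, and the random values only serve to make the example runnable.

```python
import numpy as np

num_rows, height, width = 3, 8, 8     # small stand-ins for the real 1000x1000 arrays
rng = np.random.default_rng(0)

# One (num_rows, H, W) array per band; names follow the column names in the chat.
imgtir1, imgtir2, imgvis, imgswir = (rng.random((num_rows, height, width)) for _ in range(4))

# Stack along a new last axis so each sample is (H, W, channels).
batch = np.stack([imgtir1, imgtir2, imgvis, imgswir], axis=-1)
print(batch.shape)
```

The channels-last layout matches what TensorFlow's 2D convolution layers expect by default.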

gritty vessel Mar 21, 2024, 2:26 AM

#

I used

#

From_records

#

So it stored the array as it is in it

#

Just to get an idea of how my data is

#

I then stacked lat and long in one array, and then stacked the other 4 as channels

#

So it was like this 26,1000,1000,6

#

I was confused about whether I should pass lat lon as features or not, as coordinates are important for catching spatial features, right?

#

Afterwards I also wanted to capture temporal features, so what I did is first pass it through the cnn without dates; afterwards I will set date and time as the index and pass it through an lstm

gritty vessel Mar 21, 2024, 3:16 AM

#

As labels I have two arrays each of 1000*1000

#

They are named as flash and count

#

So in flash there will be a 1 wherever we observe a flash, and in count there is the number of flashes at that particular area

final kiln Mar 21, 2024, 8:42 AM

#

Ngl, never a good sign when you're solving eqns in latex and a tilde just floats off to another letter

#

now I have to redo everything

#

deltas on u should be up up too

#

and there you go, the cuda kernel

#

looks complicated but it's not, the deltas are if statements, and only one term is computed for a given (u, l)

#

and the super and subscripts are just indexes, so they're like M^n_l = matrix_array[n][l]

#

so each term will be computed in parallel and used to fill a matrix shaped (n, u, l) which I can then reduce sum along l, no space is wasted because every position in this matrix is filled. The f and g mappings are easily constructed using one of numpy's or pytorch's triu functions
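The index mappings can be sketched with `np.triu_indices`. The names `f` and `g` are borrowed from the message above, but their exact definition in the derivation is not shown, so treating them as the row and column of the l-th upper-triangle entry is an assumption.

```python
import numpy as np

n = 4
f, g = np.triu_indices(n)          # flat index l -> (f[l], g[l]), with f[l] <= g[l]

M = np.arange(n * n, dtype=float).reshape(n, n)
M = (M + M.T) / 2                   # make it symmetric

packed = M[f, g]                    # n*(n+1)/2 stored values instead of n*n

# Rebuild the full matrix from the packed values to check nothing was lost.
restored = np.zeros((n, n))
restored[f, g] = packed
restored[g, f] = packed
print(len(packed), np.allclose(restored, M))
```

PyTorch offers a similar `torch.triu_indices`, so the same mapping could be built once and reused on the GPU.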

#

I gotta do the gradient, but it's quite easy to calculate since it's all just simple multiplication

final kiln Mar 21, 2024, 10:01 AM

#

here's the draft for the full treatment

grizzled sail Mar 21, 2024, 11:03 AM

#

(excuse the phrasing, i'm still learning) when someone is writing a neural network or other similar kind of ai, and it comes to the programming of the model itself, i seem to see a lot of people writing nn's comprised of only dense layers like in the image attached? can someone explain to me what the purpose or point is, how they compare with writing a neural network comprised of conv2d or LSTM layers (i do understand these kinds of nn's are used for different purposes but thats beside my point)

wooden sail Mar 21, 2024, 11:16 AM

#

grizzled sail (excuse the phrasing, i'm still learning) when someone is writing a neural netwo...

a lot of it is trial and error mixed with heuristics, but sometimes you have good motivation for the choice

#

if you have shift invariance, convolutions make sense

#

if you have slow variance along one axis, then LSTMs make sense

grizzled sail Mar 21, 2024, 11:17 AM

#

so there is actually logic behind it? cause ive only ever written a nn totally alone and with zero guidance so i assumed it was supposed to be a component of a bigger whole, not it's own standalone layer

wooden sail Mar 21, 2024, 11:18 AM

#

dense layers are just general affine transformations. the network can learn to make them shift invariant, but also not. you can use these if you know nothing at all about the problem

grizzled sail Mar 21, 2024, 11:18 AM

#

(as in yes it is A standalone layer, but that it was supposed to work in conjunction with the "actual" nn)

#

also, do different nn layers require specific inputs? cause i assumed i could just add a different kind of layer (for example adding an LSTM layer) but i started to get an error (this was a few days ago and i just undid it for the time being)

wooden sail Mar 21, 2024, 11:19 AM

#

wdym by "specific inputs"

#

as long as the shape is correct, a layer won't care what the input is and will enforce its special conditions

grizzled sail Mar 21, 2024, 11:20 AM

#

i guess i meant "specifically shaped inputs" in that case

wooden sail Mar 21, 2024, 11:20 AM

#

same as a sine function doesn't care what number you give as an input, it'll always treat it as an angle in radians and return a number between -1 and 1

#

then yes, each layer requires a specifically shaped input
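Both points, a dense layer being an affine map and a layer insisting on a particular input shape, show up in a few lines of NumPy. This is a bare sketch without any framework; the sizes and the `dense` helper are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 2))     # weights: 3 input features -> 2 output units
b = rng.normal(size=2)          # bias

def dense(x):
    """Affine map x @ W + b, i.e. a dense layer without an activation."""
    return x @ W + b

batch = rng.normal(size=(5, 3))         # 5 samples, 3 features each
print(dense(batch).shape)

try:
    dense(rng.normal(size=(5, 4)))      # wrong feature count
except ValueError as err:
    print("shape error:", err)
```

Recurrent layers such as an LSTM typically want an extra time axis, e.g. (batch, timesteps, features), which is a common source of the kind of error mentioned above.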

grizzled sail Mar 21, 2024, 11:22 AM

#

i'm going to go and try running it again and see if i can get the shape bit right 👀

crisp raptor Mar 21, 2024, 11:29 AM

#

Petition to use Q-learning with self driving cars

final kiln Mar 21, 2024, 11:35 AM

#

crisp raptor Petition to use Q-learning with self driving cars

Y do you need a petition ?

crisp raptor Mar 21, 2024, 11:35 AM

#

It's a joke dude

final kiln Mar 21, 2024, 11:35 AM

#

Uhm, I didn't get it

crisp raptor Mar 21, 2024, 11:36 AM

#

Reinforcement learning

final kiln Mar 21, 2024, 11:38 AM

#

I'm not very knowledgeable about reinforcement learning. Gotta get on it eventually

trim needle Mar 21, 2024, 11:45 AM

#

Is it possible to train a T5 model WITHOUT teacher forcing?

#

I don't understand any of this. I want my model to NEVER generate a specific token at the beginning. I'm so desperate that I've given it an extra penalty... and yes, the model doesn't use Teacher Forcing.

mellow vector Mar 21, 2024, 11:59 AM

#

Morning DS/ML

hollow sentinel Mar 21, 2024, 12:40 PM

#

I have a data science portfolio project idea. I got this tortilla price dataset from Mexico from Kaggle. My hypothesis is that supermarkets offer lower prices for tortillas compared to convenience stores and traditional markets in Mexico. Is this a good portfolio project to have?

#

I was also thinking of creating a website to showcase all my data science projects

#

I just don’t know what impacts the project itself is going to have. Like if it’s impressive enough for an employer.

#

I’ll do it anyways I guess?

final kiln Mar 21, 2024, 1:14 PM

#

honestly just do it, once you get into the meat of the problem you'll know how to layer it more and more cuz you'll start having endless ideas

#

I need less ideas rn

#

have too many and there's not enough time for all

final kiln Mar 21, 2024, 1:15 PM

#

hollow sentinel I have a data science portfolio project idea. I got this tortilla price dataset ...

it's also a very reasonable hypothesis

#

more expensive, and significantly lower quality cuz they just buy it from the same supplier at lower quantities but closer to expire, or they just buy it from the supermarket and resell it

#

ig stuff specific to the place, like here they produce rice, so ig their rice is gonna be cheaper and super natural

#

whereas super market will be processed

hollow sentinel Mar 21, 2024, 1:17 PM

#

final kiln it's also a very reasonable hypothesis

Gotcha, thanks for the advice!!

subtle imp Mar 21, 2024, 1:21 PM

#

I need to plot two graphs into one plot, but each having their own axes, e.g. like the following 2 gaussians, with one being rotated 90 degrees cw. Is there an easy way to do this? I've tried for some time and didn't manage it. The only thing that I could imagine working is that I plot the second one on the y axis and scale all the values to fit in the range of the first?

wooden sail Mar 21, 2024, 1:24 PM

#

you'd just need to swap the axes

#

!e

import numpy as np
import matplotlib.pyplot as plt
N = 50
x = np.arange(N)
y = np.exp(-N/1000*(x - N//2)**2)*50
plt.plot(x, y)
plt.plot(y, x)
plt.show()
plt.savefig("biggest_oof.png")

arctic wedgeBOT Mar 21, 2024, 1:32 PM

#

@wooden sail :white_check_mark: Your 3.12 eval job has completed with return code 0.

wooden sail Mar 21, 2024, 1:32 PM

#

subtle imp I need to plot to graphs into one plot, but each having their own axes, e.g. lik...

like so? if you swap the x and y axis you get a 90° rotation of a plot

subtle imp Mar 21, 2024, 1:35 PM

#

Hmm, could work if I scale the data to fit the ranges, I just would have thought that there might be an option to maybe have 2 subplots overlapping. Thanks for the help!

wooden sail Mar 21, 2024, 1:35 PM

#

subtle imp Hmm, could work if I scale the data to fit the ranges, I just would have thought...

this is also possible https://matplotlib.org/stable/gallery/subplots_axes_and_figures/two_scales.html

subtle imp Mar 21, 2024, 1:37 PM

#

also tried this before, but they always have a shared axis, and if I turn the plots then x and y are swapped

#

Maybe just normalize the whole data to [0,1] and then use the approach from before, just removing the ticks, and that should work
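Another option, without rescaling the data, is to lay a second, frameless axes over the first so that each curve keeps its own x and y ranges. This is only a sketch with made-up gaussians; the label "overlay" and the choice of putting the second set of ticks on the top and right are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-5, 5, 200)
g1 = np.exp(-x**2 / 2)                # first gaussian, drawn the usual way
g2 = 40 * np.exp(-(x - 1)**2 / 0.5)   # second gaussian on a very different scale

fig, ax1 = plt.subplots()
ax1.plot(x, g1, color="tab:blue")

# Second axes on top of the first, with its own independent x and y ranges.
ax2 = fig.add_axes(ax1.get_position().bounds, frameon=False, label="overlay")
ax2.plot(g2, x, color="tab:red")      # swapped arguments rotate the curve by 90 degrees
ax2.xaxis.tick_top()                  # keep the second set of ticks away from the first
ax2.yaxis.tick_right()

fig.canvas.draw()                     # render in memory; use fig.savefig(...) to write a file
```

Since the two axes share nothing, neither curve has to be normalized to fit the other's range.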

open raven Mar 21, 2024, 3:08 PM

#

OneHot Encoder acting on pandas data frame. How to prevent fit_transform method from placing old index labels in new column?

serene scaffold Mar 21, 2024, 3:20 PM

#

open raven OneHot Encoder acting on pandas data frame. How to prevent fit_transform method ...

any time you fit an encoder, you're basically resetting it. so fit_transform shouldn't do that.

spark nimbus Mar 21, 2024, 3:34 PM

#

Are there any guides on converting pandas-on-pyspark code to pyspark SQL? I have to convert a few thousand LoC next week 😓

open raven Mar 21, 2024, 3:42 PM

#

serene scaffold any time you fit an encoder, you're basically resetting it. so fit_transform sho...

Thanks for input from you - it’s good to know that.

void crescent Mar 21, 2024, 3:58 PM

#

can someone please explain why my model gives the exact same prediction every time. it works for single predictions but when i mass predict it gives the same output for all predictions.

def process_image(img):
  resized = np.zeros((50,50,3))
  resized[:, :, :3] = read_img

  img_tensor = tf.convert_to_tensor(resized)
  img_tensor = tf.expand_dims(img_tensor, axis=0)

  return img_tensor

img_label_arr = random.choices(combined_data, k=4)
print(img_label_arr)

for il in img_label_arr:
  label = il[1]
  read_image = il[0]
  plt.imshow(read_image)
  plt.title(f"Label: {label}")
  plt.axis("off")
  plt.show()

  image_tensor = process_image(read_image)

  predictions = model.predict(image_tensor)

  print("Predictions:", predictions)


  prediction = benign_or_malignant(predictions[0][0])
  print("Prediction:", prediction)
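
One possible cause, offered as a guess since the rest of the code isn't shown: `process_image` takes `img` but fills the array from `read_img`, a name the snippet never defines, so it would read whatever global has that name and never the argument. Below is a minimal sketch of the function using its own parameter. It uses NumPy only (the `tf.convert_to_tensor` step could be applied to the result), and it assumes the input is already 50x50x3, as the snippet above implies.

```python
import numpy as np

def process_image(img):
    # Fill from the argument, not from a global, so each call sees its own image.
    resized = np.zeros((50, 50, 3), dtype=np.float32)
    resized[:, :, :3] = img                    # assumes img is already 50x50x3
    return np.expand_dims(resized, axis=0)     # add a batch axis: (1, 50, 50, 3)

# Two different inputs should now give two different batches.
a = process_image(np.zeros((50, 50, 3)))
b = process_image(np.ones((50, 50, 3)))
print(a.shape, np.array_equal(a, b))
```

If the predictions are still identical after this, the next things to check would be the input scaling and whether the model itself has collapsed to one class.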

final kiln Mar 21, 2024, 4:00 PM

#

almost there

final kiln Mar 21, 2024, 4:29 PM

#

refined this a bit more

wooden sail Mar 21, 2024, 4:36 PM

#

may i ask what your motivation behind doing this is?

final kiln Mar 21, 2024, 4:36 PM

#

it halves the number of floating point calculations, and also halves the amount of memory

wooden sail Mar 21, 2024, 4:37 PM

#

not that, that i get

#

just the idea of learning metric tensors is common, so i would expect this to already exist in several flavors

final kiln Mar 21, 2024, 4:38 PM

#

couldn't even find anyone using quadratic forms

#

like, xMx.T

#

the transformer does it implicitly

wooden sail Mar 21, 2024, 4:38 PM

#

you might've looked with the wrong key words

final kiln Mar 21, 2024, 4:38 PM

#

uhm, it is possible

wooden sail Mar 21, 2024, 4:38 PM

#

every single time you read mahalanobis distance, this is what they're doing

final kiln Mar 21, 2024, 4:38 PM

#

let me see

wooden sail Mar 21, 2024, 4:39 PM

#

.wa s mahalanobis distance

strange elbowBOT Mar 21, 2024, 4:39 PM

#

Wolfram Alpha

Failed to get response.

wooden sail Mar 21, 2024, 4:39 PM

#

awesome

final kiln Mar 21, 2024, 4:39 PM

#

right, but I haven't seen this concept used in ML

wooden sail Mar 21, 2024, 4:39 PM

#

it's used everywhere

final kiln Mar 21, 2024, 4:39 PM

#

where ?

wooden sail Mar 21, 2024, 4:40 PM

#

anywhere you read "maximum likelihood" or under the name "mahalanobis distance" just as above

#

this is how optimization has been done for the past 100 years

final kiln Mar 21, 2024, 4:40 PM

#

I'm not sure I follow

wooden sail Mar 21, 2024, 4:40 PM

#

but maybe you needed to look under statistical methods

final kiln Mar 21, 2024, 4:40 PM

#

this is an attention mechanism

wooden sail Mar 21, 2024, 4:40 PM

#

machine learning people call this by the statistical name

#

this is why many problems in optimization have several names: the different communities don't talk with each other

final kiln Mar 21, 2024, 4:41 PM

#

I have not seen quadratic forms be used explicitly in deeplearning NLP

wooden sail Mar 21, 2024, 4:41 PM

#

the statistical mahalanobis distance squared is the same as using a metric tensor on a point of a manifold
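
To make the equivalence concrete, here is a small NumPy sketch with made-up data: the squared Mahalanobis distance is the quadratic form `d @ M @ d` with `M` the inverse covariance, and the same number comes out as a plain squared Euclidean norm after a linear change of coordinates. The sample data and the point `x` are arbitrary illustrations.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(500, 3))       # toy samples, purely illustrative
mu = data.mean(axis=0)
cov = np.cov(data, rowvar=False)
M = np.linalg.inv(cov)                 # inverse covariance plays the role of the metric

x = np.array([0.5, -1.0, 2.0])
d = x - mu
d2_quadratic = d @ M @ d               # quadratic form d M d^T

# With M = L L^T (Cholesky), d M d^T equals ||L^T d||^2,
# i.e. an ordinary squared length in transformed coordinates.
L = np.linalg.cholesky(M)
d2_whitened = np.sum((L.T @ d) ** 2)

print(d2_quadratic, d2_whitened)
```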

#

you don't have to call it by name for it to be equivalent though

final kiln Mar 21, 2024, 4:42 PM

#

I would've at least seen the equation I imagine

#

I do recall seeing something in computer vision

wooden sail Mar 21, 2024, 4:43 PM

#

any term that involves any sort of mean squared error or maximum likelihood or maximum a posteriori or bayesian methods

#

anything that builds up a covariance matrix of some sort or a hessian matrix

#

they're all doing this, with a different name

final kiln Mar 21, 2024, 4:44 PM

#

I'm confused because the stuff you mention seems to be related to loss right, this is a layer

wooden sail Mar 21, 2024, 4:44 PM

#

no, not necessarily only loss

#

if you look up deep unfolding algorithms, they treat iterations of older algorithms as layers of a neural network

final kiln Mar 21, 2024, 4:45 PM

#

I think it's very telling that they dont mention quadratic form on the 2017 paper

wooden sail Mar 21, 2024, 4:45 PM

#

any unfolding of a 2nd order or higher method (e.g. quasi newton), or even of linear methods, involves products with gramian matrices (another name they go under)

final kiln Mar 21, 2024, 4:45 PM

#

they're using a quadratic form as a layer and don't mention it

wooden sail Mar 21, 2024, 4:45 PM

#

i would never take "they're not using my preferred terminology" as a sign

final kiln Mar 21, 2024, 4:46 PM

#

it's not a matter of preferred terminology

wooden sail Mar 21, 2024, 4:46 PM

#

quadratic forms are super standard and everyone expects the rest to take them for granted

final kiln Mar 21, 2024, 4:46 PM

#

there's no mention of it at all

#

they're using this whole CS Inspired terminology to talk about something that is just xMy.T

wooden sail Mar 21, 2024, 4:49 PM

#

all newton methods do that too and they won't call it metric tensor nor quadratic form

final kiln Mar 21, 2024, 4:49 PM

#

I would imagine you'd use the simple eqn tho

wooden sail Mar 21, 2024, 4:50 PM

#

you'd be surprised

final kiln Mar 21, 2024, 4:50 PM

#

I was

wooden sail Mar 21, 2024, 4:50 PM

#

final kiln I would imagine you'd use the simple eqn tho

you also just wrote like 2 or 3 pages on the same product 😛 as you can see

final kiln Mar 21, 2024, 4:50 PM

#

wooden sail you also just wrote like 2 or 3 pages on the same product 😛 as you can see

wdym ?

#

it's less than a page and it's not just doing one product, I'm calculating a bunch of stuff at the same time

#

like, im not explaining a layer

#

im placing it in a form to feed it to the gpu

wooden sail Mar 21, 2024, 4:53 PM

#

at any rate, the area you're looking for under statistical optimization is called "information geometry" and it's all about learning manifolds and metric tensors to do parallel transport

#

all right

final kiln Mar 21, 2024, 5:00 PM

#

the actual motivation is that it is intuitive, in fact, even people who are not math savvy find it to be an interesting concept. our brains like geometry so I see it as the way to make networks interpretable

#

ig the layer can exist somewhere under some other name, but it still wouldn't clash with my objective

past meteor Mar 21, 2024, 5:32 PM

#

Anyone have specific tips to debug the location of tensors and/or memory related issues with torch?

#

I have a 70k parameter neural network that should be on my GPU (I literally call .to('cuda') on it) and when I call .to('cuda') on it closer to where I do inference it goes OOM trying to allocate 60GB... For a 70k parameter network

final kiln Mar 21, 2024, 6:27 PM

#

can only be the data right

#

are you doing backprop ? the more batches, the more gradients it stores, I've also found that loss functions tend to be inefficient with their allocations

abstract rune Mar 21, 2024, 6:33 PM

#

I wrote a determinant calculator, for dimension 200, it takes around 10-11 secs
written in Go

meager ridge Mar 21, 2024, 6:37 PM

#

final kiln have too many and there's not enough time for all

the most relateable thing

abstract rune Mar 21, 2024, 6:42 PM

#

i compared numpy solution with mine
numpy takes 1.2 secs while mine takes 6 secs

#

numpy literally takes less than a second ughh

wooden sail Mar 21, 2024, 6:54 PM

#

how does your algorithm compute the determinant?

abstract rune Mar 21, 2024, 6:55 PM

#

i find the row-echelon form and then use this

#

I wrote the code for REF in Go

func RowEchelonForm(A [][]float64) ([][]float64, int) {
    matrix := Copy(A)
    var determinantFactor = 1
    rows := NumberOfRows(A)

    for i := 0; i < rows; i++ {
        r_idx, c_idx := LeftMostColumnWithNonZeroEntry(matrix, i)
        // r_idx = represents the row_index of the entry which is non-zero; it needs to be same as "i"; or else swap it.
        // c_idx = represents the col_index of the entry which is non-zero; on that

        if r_idx == -1 || c_idx == -1 {
            break
        }
        if r_idx != i {
            matrix, _ = RowSwitch(r_idx, i, matrix)
            determinantFactor *= -1
        }
        column, _ := GetColumnAt(c_idx, matrix)

        for j := r_idx + 1; j < rows; j++ {
            scalar := -1 * (float64(column[j]) / float64(column[i]))
            matrix, _ = RowAddition(scalar, j, i, matrix)
        }

    }
    return matrix, determinantFactor
}

func LeftMostColumnWithNonZeroEntry(A [][]float64, currentRow int) (int, int) {
    for i := 0; i < NumberOfCols(A); i++ {
        for j := currentRow; j < NumberOfRows(A); j++ {
            if A[j][i] != 0 {
                return j, i
            }
        }
    }
    return -1, -1
}

#

The code is a mess I know 😅

wooden sail Mar 21, 2024, 6:58 PM

#

looks about right, i think LU decomposition is the most common, which is the same as row reducing
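
For comparison, a rough Python sketch of the same idea: reduce to upper triangular form, track the sign of row swaps, and multiply the diagonal. It is not a port of the Go code above. One deliberate difference, labeled here as my choice: it picks the largest entry in the column as the pivot (partial pivoting) instead of the first nonzero one, which is the usual way to keep the elimination numerically stable. It also mutates a single working copy rather than building new matrices per row operation.

```python
import numpy as np

def det_by_elimination(a):
    m = np.array(a, dtype=float)        # one working copy, mutated in place
    n = m.shape[0]
    sign = 1.0
    for i in range(n):
        # Partial pivoting: largest |entry| in column i, at or below row i.
        p = i + int(np.argmax(np.abs(m[i:, i])))
        if m[p, i] == 0.0:
            return 0.0                  # whole column is zero: singular matrix
        if p != i:
            m[[i, p]] = m[[p, i]]       # a row swap flips the determinant's sign
            sign = -sign
        # Subtract multiples of the pivot row from every row below it.
        factors = m[i + 1:, i] / m[i, i]
        m[i + 1:, i:] -= np.outer(factors, m[i, i:])
    return sign * float(np.prod(np.diag(m)))

A = [[2, 1, 3], [0, -1, 4], [1, 2, 5]]
B = np.random.default_rng(0).normal(size=(6, 6))
print(det_by_elimination(A), np.linalg.det(np.array(A, dtype=float)))
```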

#

google does seem to say C is a little faster than go, which would explain the difference

abstract rune Mar 21, 2024, 7:01 PM

#

1:5 ?

wooden sail Mar 21, 2024, 7:02 PM

#

depends on the actual code, but people in google searches claim 3 to 20x factor

abstract rune Mar 21, 2024, 7:02 PM

#

C is a factor, but my code is creating a lot of variables, and I am not mutating the original matrix (tryna do functional programming, tho using for loops lmao)

wooden sail Mar 21, 2024, 7:02 PM

#

that certainly makes it slower

#

see how it performs over several matrix sizes

#

but usually BLAS/LAPACK is hard to beat

abstract rune Mar 21, 2024, 7:06 PM

#

whats BLAS, LAPACK ??

wooden sail Mar 21, 2024, 7:07 PM

#

the libraries (usually written in c or fortran) that numpy wraps. they're libraries optimized for special linear algebra operations, also optimized for specific processor architectures

#

https://numpy.org/devdocs/building/blas_lapack.html

#

quite vexing because it automatically implements SIMD and parallelization

#

you can usually only beat it in cases of special composite matrices where you can split the total action of a matrix into smaller ones... for which you use BLAS/LAPACK 😛 instead of doing it naively for the whole matrix

past meteor Mar 21, 2024, 7:12 PM

#

final kiln are you doing backprop ? the more batches, the more gradients it stores, I've a...

I'm using lightning and training on GPU. The interesting thing is that it doesn't go OOM while training 🤷

#

I'll look again tomorrow

#

Maybe I just needed sleep

abstract rune Mar 21, 2024, 7:19 PM

#

damn it

#

it is quite complex stuff ugh

desert oar Mar 21, 2024, 7:39 PM

#

wooden sail every single time you read mahalanobis distance, this is what they're doing

personally i liked the interpretation angle it provided

#

i definitely encouraged him to pursue it 😆

#

you know about the manifold learning stuff and i don't, so maybe this has already been investigated and i didn't realize

desert oar Mar 21, 2024, 7:41 PM

#

final kiln couldn't even find anyone using quadratic forms

i'm with Edd though, this does show up everywhere all the time. when i said nobody has done it before, i specifically meant that i wasn't aware of anyone looking into this particular restatement of the attention mechanism. sorry if i wasn't clearer about that before.

wooden sail Mar 21, 2024, 7:42 PM

#

it could be that no one has done it for the attention mechanism in particular, i have never read an NLP paper. the idea is overall standard and you find it in any book on optimization though

final kiln Mar 21, 2024, 7:42 PM

#

desert oar i'm with Edd though, this does show up everywhere all the time. when i said nobo...

You're both missing the point tho, I'm iterating on an existing system, totally irrelevant if the underlying math is used everywhere else

#

It's like arguing against matrix mul

desert oar Mar 21, 2024, 7:43 PM

#

wooden sail it could be that no one has done it for the attention mechanism in particular, i...

"the idea" meaning quadratic forms and their usefulness/interpretation in general, right?

wooden sail Mar 21, 2024, 7:43 PM

#

yeah

#

from what little i recall of what is done in attention, the matrix is neither symmetric nor square

desert oar Mar 21, 2024, 7:44 PM

#

wooden sail from what little i recall of what is done in attention, the matrix is neither sy...

i think that was the point of the investigation: if you rearrange terms, you get something that looks like it could arise from the decomposition of a square symmetric matrix, so let's see what happens if you actually impose that constraint

wooden sail Mar 21, 2024, 7:45 PM

#

that's interesting in its own right

#

a reasonable mapping where this even makes sense is nice to think about

desert oar Mar 21, 2024, 7:45 PM

#

wdym by a reasonable mapping?

wooden sail Mar 21, 2024, 7:46 PM

#

how to make it so that the vectors participating in the bilinear form are in the same vector space

#

one where this metric makes sense

final kiln Mar 21, 2024, 7:47 PM

#

They all go through the same projection

desert oar Mar 21, 2024, 7:47 PM

#

this i think was the derivation:

Q = X @ Wq
K = X @ Wk
V = X @ Wv

Q @ K.T == (X @ Wq) @ (X @ Wk).T
        == (X @ Wq) @ (Wk.T @ X.T)
        == X @ (Wq @ Wk.T) @ X.T
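
The derivation above can be checked numerically with a quick NumPy sketch. The sizes are arbitrary toy values. Note that with both projections starting from the same embedding space, `Wq @ Wk.T` comes out square (`d_model` by `d_model`), but in general it is not symmetric and its rank is at most `d_k`.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d_model, d_k = 5, 8, 3               # toy sizes, chosen arbitrarily
X = rng.normal(size=(n, d_model))
Wq = rng.normal(size=(d_model, d_k))
Wk = rng.normal(size=(d_model, d_k))

scores = (X @ Wq) @ (X @ Wk).T          # Q @ K.T
M = Wq @ Wk.T                           # (d_model, d_model), rank <= d_k
scores_via_M = X @ M @ X.T

print(np.allclose(scores, scores_via_M), M.shape, np.allclose(M, M.T))
```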

final kiln Mar 21, 2024, 7:47 PM

#

Before being dotted

wooden sail Mar 21, 2024, 7:47 PM

#

what's what here

desert oar Mar 21, 2024, 7:47 PM

#

and then the idea was to set M = (Wq @ Wk.T) and impose that M is symmetric and square, right?

wooden sail Mar 21, 2024, 7:47 PM

#

that's a pretty strong constraint

final kiln Mar 21, 2024, 7:48 PM

#

Yeah, like, I think the ideal thing to do is quadratic, but I wanted to explore metric just for the sake of it, in early experiments I found it was actually way easier to interpret what it was doing

desert oar Mar 21, 2024, 7:48 PM

#

wooden sail what's what here

X is the sequence, and Wq Wk Wv are the attention projection matrices. then Q K and V are the query key and value matrices. following the attention-is-all-you-need notation

wooden sail Mar 21, 2024, 7:48 PM

#

without any special justification, this completely changes the data manifold

#

there's no reason why this should be better. it's gonna be faster due to structure and have nice geometric properties

#

not necessarily useful ones

#

nice to investigate though

desert oar Mar 21, 2024, 7:49 PM

#

right. i was very curious what that would do to the model 😆

final kiln Mar 21, 2024, 7:49 PM

#

It can also have a regularization effect maybe

wooden sail Mar 21, 2024, 7:49 PM

#

yeah, that'd be the case

final kiln Mar 21, 2024, 7:50 PM

#

The early experiment was an array sorter

#

So it would take a sequence 2, 3, 1 and output 1, 2, 3

#

With metric I could look at the distances between embeddings and see that they were actually being sorted along an axis

#

I have been posting everything here

#

But it was way back idk if I can retrieve it

desert oar Mar 21, 2024, 7:52 PM

#

umbral delta https://drive.google.com/file/d/1DrjPPs6I4u8ZF2HLI_-Qnn-LwDyHj6Rz/view?usp=shari...

it looks like they are expecting you to implement the optimization routine yourself using cvxopt, not use scikit-learn. the example even uses L1 in computeL1LeastSquares. so i'd start there instead of fishing around in a library that you aren't expected to be using.

wooden sail Mar 21, 2024, 7:52 PM

#

i thought the rectangularness of the matrices in attention stuff is what lets you deal with sequences of irregular lengths

#

i wouldn't know though

#

maybe that just comes from the choice of tokenization and embedding

#

i've barely touched this application

desert oar Mar 21, 2024, 7:53 PM

#

wooden sail i thought the rectangularness of the matrices in attention stuff is what lets yo...

which, the projection matrices? i thought they were rectangular in order to get a kind of bottleneck effect, so if your word vectors are 50 or 100 dimensional then the attention mechanism is only operating on 10 or 20 dimensional vectors

#

i think you're right about the tokenization thing. stelercus is the NLP expert though

final kiln Mar 21, 2024, 7:56 PM

#

The same projection is applied to every embedding and then every embedding "dots" with each other embedding, the sequence length is accounted for by the cross dotting

wooden sail Mar 21, 2024, 7:58 PM

#

and from what i gathered, you have some pair of sets of input vectors that you elementwise inner product using this symmetric (positive definite???) matrix to get a new vector whose entries are the products?

#

or you have a different one per pair of vectors in the set

final kiln Mar 21, 2024, 8:01 PM

#

If you're asking about 2017 scaled dot product, you have 3 projections. The results from two of them are used to produce the matrix whose entries are the "cross dots". Then you use that matrix as a transformation on the third projection

#

So it's like the matrix of dot products is an MLP layer constructed on the fly

#

There's softmax applied to that matrix before using it as a transformation, and the values are scaled by 1/sqrt(dim of proj space), so they called it scaled dot product
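
A minimal single-head NumPy sketch of the mechanism as described in the messages above, with no masking, no multiple heads, and no output projection. Function names and sizes are illustrative only.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(X, Wq, Wk, Wv):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv          # the three projections
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # "cross dots", scaled by 1/sqrt(d_k)
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ V, weights               # weights act as an on-the-fly mixing matrix

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 6))                   # 4 tokens, 6-dim embeddings (toy sizes)
Wq, Wk, Wv = (rng.normal(size=(6, 3)) for _ in range(3))
out, weights = scaled_dot_product_attention(X, Wq, Wk, Wv)
print(out.shape, weights.sum(axis=-1))
```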

#

In my case I threw away the three projections and use only one, followed by a quadratic form

wooden sail Mar 21, 2024, 8:04 PM

#

i'm just trying to figure out how much efficiency one can squeeze out

#

compared to, say, letting M for one pair of vectors be L + L^T for a lower triangular L, which makes it easier to guarantee symmetry

#

though that's not a problem if you compute the gradients by hand considering the symmetry explicitly
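
A small NumPy sketch of the `L + L^T` parametrization mentioned above: only the lower triangle holds free parameters, so there are d(d+1)/2 of them instead of d^2, and symmetry holds by construction. Note this gives a symmetric matrix but not necessarily a positive definite one; `L @ L.T` would be the usual way to get positive semidefiniteness. Sizes and values are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
n_params = d * (d + 1) // 2             # free parameters: lower triangle incl. diagonal
theta = rng.normal(size=n_params)

L = np.zeros((d, d))
L[np.tril_indices(d)] = theta           # scatter the parameters into the lower triangle
M = L + L.T                             # symmetric by construction

x = rng.normal(size=d)
y = rng.normal(size=d)
print(n_params, np.allclose(M, M.T), np.isclose(x @ M @ y, y @ M @ x))
```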

final kiln Mar 21, 2024, 8:06 PM

#

Yeah I went the custom kernel route, that's what I was calculating earlier

#

But there's another layer to this

#

Which is that the particulars of the attention mechanism might not even be important

#

There's a study where they substituted the attention for an avg pooling and it still worked out fine

#

Caveat is

#

It was for vision and it's a paper on arxiv

#

So I'm reproducing their results but for NLP, and also exploring this other side with the metric tensor condition thing

#

I got a bunch of layers done, scaled dot product, quadratic, avg pooling, etc etc. Now I'm finishing up this one

#

But early results

#

For sentiment analysis it doesn't care one bit what you use for attention

wooden sail Mar 21, 2024, 8:09 PM

#

if you have a deep enough network and enough data, the architecture doesn't matter much 😛 idk

#

fun times implementing the layers, but why don't you test it on what's already there? other than the street cred

final kiln Mar 21, 2024, 8:10 PM

#

Well it's stranger than that cuz the rest of the network doesn't really make the embeddings interact

desert oar Mar 21, 2024, 8:10 PM

#

final kiln For sentiment analysis it doesn't care one bit what you use for attention

fwiw i suspect that this is because sentiment analysis is mostly a matter of finding useful n-grams, you don't have a lot of sentiment encoded in long-range high-order relationships in text that isn't also apparent from word choice

final kiln Mar 21, 2024, 8:10 PM

#

So like, I even used identity and it worked out

past meteor Mar 21, 2024, 8:10 PM

#

wooden sail if you have a deep enough network and enough data, the architecture doesn't matt...

Especially if the architecture has little inductive bias

final kiln Mar 21, 2024, 8:10 PM

#

desert oar fwiw i suspect that this is because sentiment analysis is mostly a matter of fin...

Yes I agree with your take

desert oar Mar 21, 2024, 8:10 PM

#

my intuition for vision is that you have much less of a requirement for long-range relationships between video frames because so much more information is available already in each frame. which is why the "token mixing" mechanism matters a lot less for video

final kiln Mar 21, 2024, 8:11 PM

#

wooden sail fun times implementing the layers, but why don't you test it on what's already t...

Wdym by what's already there ?

wooden sail Mar 21, 2024, 8:11 PM

#

final kiln Wdym by what's already there ?

that you can do all of this with pytorch or smth

desert oar Mar 21, 2024, 8:12 PM

#

past meteor Especially if the architecture has little inductive bias

i'm looking forward to the first 1T parameter fully-connected MLP that's competitive with GPT 2

final kiln Mar 21, 2024, 8:12 PM

#

wooden sail that you can do all of this with pytorch or smth

I am doing it on torch. Just not pytorch cuz I needed to brush up on my low level programming

wooden sail Mar 21, 2024, 8:12 PM

#

ok, that answers the question

wooden sail Mar 21, 2024, 8:13 PM

#

desert oar i'm looking forward to the first 1T parameter fully-connected MLP that's competi...

new paper coming up: money is all you need

#

"we replaced everything with dense layers and threw as many A100's as we could at it. it just works."

past meteor Mar 21, 2024, 8:14 PM

#

"Hyperparameters? We have graduate students exploring different sets as their master's thesis"

wooden sail Mar 21, 2024, 8:14 PM

#

the reality of that hurts

past meteor Mar 21, 2024, 8:15 PM

#

The famed grad student descent

final kiln Mar 21, 2024, 8:15 PM

#

I hope you don't mind that I screenshot this hilarious exchange

past meteor Mar 21, 2024, 8:16 PM

#

I had a bit of an evil thought today

#

The results of my modelling are beating those of the prev batch by 10-20 %

#

The previous ones weren't bad either. Maybe an idea would be to present $client with middling results the first time and then show better ones the second 😩

#

that way, you always end the project on a positive note (I wouldn't actually do this)

wooden sail Mar 21, 2024, 8:19 PM

#

you kinda rediscovered academia tbh

past meteor Mar 21, 2024, 8:19 PM

#

Isn't this salami publishing and frowned upon

wooden sail Mar 21, 2024, 8:19 PM

#

yes

#

also everyone does it

quasi sparrow Mar 21, 2024, 10:05 PM

#

Hi everyone

I just started working on a regression task that uses data from IoT devices, each device has a different sampling rate.

The question is how can I create samples for my dataset if all the sensor readings (my features) have a different time stamp.

#

And also, when it comes to putting the model in production. How can I present the data for inference if all the data points are not available at the same time. Is there some kind of buffering technique that I could use?

#

I know there are tools in the cloud to stream data, but I am working on a problem where I won’t have access to the cloud, except for training, so I need to rely on open source.

#

Any feedback helps, thank you!

wooden sail Mar 21, 2024, 10:16 PM

#

do you know anything about the measurement data? if you want to use the latest data and you know the data varies slowly, you can think of extrapolation methods

#

if you're ok with introducing a delay, storing some n previous samples makes sense, and then you could interpolate the data of all sensors to the timestamp of the sensor that was updated last
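
A sketch of the buffering-plus-interpolation idea using `np.interp`. The sensor names and readings are hypothetical. One interpretation choice, labeled as mine: the common timestamp is the earliest of the sensors' latest readings, so every sensor is interpolated and none is extrapolated. `np.interp` does not extrapolate anyway; outside the buffered range it returns the end value.

```python
import numpy as np

# Hypothetical buffers: (timestamps in seconds, values) per sensor.
buffers = {
    "temperature": (np.array([0.0, 8.0, 16.0]),
                    np.array([70.0, 72.0, 74.0])),
    "motor_speed": (np.array([1.0, 3.0, 5.0, 7.0, 9.0, 11.0, 13.0]),
                    np.array([100.0, 102.0, 104.0, 106.0, 108.0, 110.0, 112.0])),
}

def align(buffers, t):
    # Linearly interpolate every sensor's buffered readings to time t.
    return {name: float(np.interp(t, ts, vals))
            for name, (ts, vals) in buffers.items()}

# Latest time for which every sensor already has a reading at or after it.
t_common = min(ts[-1] for ts, _ in buffers.values())
features = align(buffers, t_common)
print(t_common, features)
```

The same `align` call can build training rows (one per chosen timestamp) and the inference input, which keeps the two paths consistent.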

quasi sparrow Mar 21, 2024, 10:32 PM

#

Yes, the data comes from an industrial machine. The data points are: temperatures, motor speeds, pressure, etc. The sample rates are between 1 to 8 seconds.

#

I think I could try both methods and see which one gets the most accurate model! Thank you for the advice.

primal pelican Mar 22, 2024, 12:51 AM

#

Greetings Community:

I am a JavaScript developer and I have a task of getting text out of PDFs and images.
I searched Google and found out that pytesseract and paddleocr are very good OCR tools.

Any suggestions on which Python libraries I can use to fulfill my task and get at least 80% accuracy?

serene scaffold Mar 22, 2024, 1:37 AM

#

primal pelican Greetings Community: I am Javascript developer and I have a task of getting tex...

https://pypi.org/project/pdfplumber/

PyPI

pdfplumber

Plumb a PDF for detailed information about each char, rectangle, and line.

#

extracting text from PDFs isn't bad if it's just plain text. but PDFs with lots of images and stuff are hell on earth

abstract rune Mar 22, 2024, 6:09 AM

#

tabula is also a great tool!!

mild grotto Mar 22, 2024, 8:39 AM

#

Now my particles will follow the gradient as they drop heat. This causes them to have pseudo collisions and bounce off of other particles

hallow radish Mar 22, 2024, 8:50 AM

#

Hello guys, when merging two dataframes and returning selected columns, if the key values are NaN it matches with all the NaNs and returns duplicates instead of returning NaN. What would be the ideal way to resolve this?
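
This is documented pandas behavior: unlike SQL, `merge` treats null keys as equal to each other. One way around it, sketched below with made-up frames and column names, is to drop the NaN-key rows from the right-hand frame before a left merge, so NaN keys on the left find no match and get NaN in the merged columns.

```python
import numpy as np
import pandas as pd

# Toy frames; "key", "x", "y" are placeholder column names.
left = pd.DataFrame({"key": ["a", np.nan, np.nan], "x": [1, 2, 3]})
right = pd.DataFrame({"key": ["a", np.nan, np.nan], "y": [10, 20, 30]})

# NaN keys match each other: 1 row for "a" plus 2 x 2 rows for the NaNs.
naive = left.merge(right, on="key", how="left")

# Remove NaN keys on the right so they cannot match anything.
clean_right = right.dropna(subset=["key"])
fixed = left.merge(clean_right, on="key", how="left")

print(len(naive), len(fixed))
print(fixed)
```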

final kiln Mar 22, 2024, 9:49 AM

#

mild grotto Now my particles will follow the gradient as they drop heat. This causes them to...

That looks awesome.

jovial tinsel Mar 22, 2024, 9:58 AM

#

is there a possible way to run a python script in ios?

final kiln Mar 22, 2024, 9:59 AM

#

with apple, anything could be impossible

small wedge Mar 22, 2024, 10:08 AM

#

mild grotto Now my particles will follow the gradient as they drop heat. This causes them to...

woah hutaowow

final kiln Mar 22, 2024, 1:19 PM

#

wooden sail if you have a deep enough network and enough data, the architecture doesn't matt...

But btw, it was only one layer, don't recall the embedding dimension but it was likely not very large either. I used the IMDb dataset.

#

I'll get the full results once I'm done with this and prep a series of datasets

wooden sail Mar 22, 2024, 1:21 PM

#

final kiln But btw, it was only one layer, don't recall the embedding dimension but I likel...

the network as a whole isn't just the one layer though, is it?

#

(i'm asking, i don't know)

final kiln Mar 22, 2024, 1:22 PM

#

wooden sail the network as a whole isn't just the one layer though, is it?

well by one layer, I mean one transformer block, which is equal to the attention module followed by the MLP, and the MLP doesn't let information flow from one embedding to the other

wooden sail Mar 22, 2024, 1:24 PM

#

if you have enough layers after the transformer block, you can do whatever

final kiln Mar 22, 2024, 1:24 PM

#

I didn't see much sense in having an avg pooling attention split into several heads, so there's also nothing additional there like projections

final kiln Mar 22, 2024, 1:25 PM

#

wooden sail if you have enough layers after the transformer block, you can do whatever

after that, I average the embeddings to get one embedding, and then project that to the number of classes

#

simple average too if im not mistaken, so no weights there

#

my reasoning for why it works is that it's just counting good words and bad words and using that to decide if it's a positive or negative review

#

the interesting part is gonna be when I get it to do next token prediction, which I suspect won't work, would make no sense if it did

#

no wait it's the other way around, I first project them and then avg

#

there's only one projection matrix for all embeddings tho so the point still stands
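
Why the order doesn't change the point: with a single shared linear projection and a plain mean, averaging then projecting gives the same result as projecting then averaging. A quick NumPy check with toy sizes, assuming there is no nonlinearity between the two steps:

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model, n_classes = 7, 16, 2   # toy sizes
E = rng.normal(size=(seq_len, d_model))  # one embedding per token
W = rng.normal(size=(d_model, n_classes))
b = rng.normal(size=n_classes)

avg_then_project = E.mean(axis=0) @ W + b
project_then_avg = (E @ W + b).mean(axis=0)

print(np.allclose(avg_then_project, project_then_avg))
```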

#

it's not the capacity of the output layer that is doing it

#

that explanation would also have been the first thing I would've thought

molten elk Mar 22, 2024, 2:01 PM

#

Does anyone know if what can be done in R can be done in Python easily?

#

I'm not sure if it's worth investing time to learn R

final kiln Mar 22, 2024, 2:02 PM

#

Never seen it be used outside of school

past meteor Mar 22, 2024, 2:05 PM

#

molten elk Does anyone know if what can be done in R can be done in Python easily?

There are many methods that aren't as accessible in Python as in R

#

Whether you should care about them is a different question 😄

molten elk Mar 22, 2024, 2:10 PM

#

What if we use numpy?

boreal gale Mar 22, 2024, 2:10 PM

#

molten elk Does anyone know if what can be done in R can be done in Python easily?

for the most part, what can be done in R can also be accomplished in python quite easily.

but it's worth noting that R has first-class time series support and there exists implementations of some niche models like multinomial logistic regression, where python might not offer as much out-of-the-box support (though statsforecast in python has been a thing for some time, maybe R's time series capabilities are already matched in python?)

past meteor Mar 22, 2024, 2:12 PM

#

They haven't been matched yet imo

#

For niche things at least

#

Also models like GAMs aren't as great in python. I think the valid question is, should you care about GAMs, maybe not

#

I don't think most people should learn it though

#

R has two great niches, it's easier to do descriptive statistics, plotting, ... etc. in it because dplyr, ggplot2 and co have a way more intuitive API than Pandas and matplotlib. The first niche is let's say social science researchers that don't want to dive deep into coding but want to do data analysis and ML

#

The second niche is cutting edge statistics. Some implementations are only in R land. I think the same applies for MATLAB.

The vast majority of people aren't in both scenarios so I'd just say: learn python 😉

final kiln Mar 22, 2024, 2:33 PM

#

my take is that you cant go wrong with learning more things and languages in particular always have some new ways of thinking about things, tho I personally dont like R, py covers all my bases

#

I think that if you learn enough languages with similar paradigms, learning an additional one within that paradigm is not very hard, but learning languages that work very differently take time to get used to

past meteor Mar 22, 2024, 2:34 PM

#

Most people don't learn R tho, I never learnt it. I learnt how to do things in it

final kiln Mar 22, 2024, 2:35 PM

#

yeah I used it for a stats class

#

I liked matlab

past meteor Mar 22, 2024, 2:35 PM

#

I haven't touched it in a year but I'm probably still faster at certain things with it than python

final kiln Mar 22, 2024, 2:36 PM

#

there's also different phases right, most modern languages will be optimized so you can learn the basics quickly and be able to use it, but it might take a long time to master them

buoyant vine Mar 22, 2024, 2:37 PM

#

Tbh i'm not sure really how true that is anymore

#

I think outside of some super specific industry thing which maybe is bundled up with a bunch of legacy stuff

#

It seems in general Python can go beyond what R can, especially with the number crunching related tasks, whether that be with Numba + Numpy or even Torch and tensors

final kiln Mar 22, 2024, 2:39 PM

#

sticking with py is a good strategic choice career wise, cuz it is used in a variety of industry scenarios, not just ML and data science

past meteor Mar 22, 2024, 2:39 PM

#

For my thesis I wanted to do a statistical test that had no credible python implementations

#

For R the implementation was from the author

#

Again, it's a question of whether or not you care about the advantages R can give you, because they exist but my entire point is that they're niche enough the vast majority of people shouldn't

final kiln Mar 22, 2024, 2:42 PM

#

forward kernel is done (probably), i should throw a party

past meteor Mar 22, 2024, 2:42 PM

#

I'm also hoping Polars becomes better integrated in the ecosystem because what I never got over was how incoherent pandas was

#

And matplotlib

final kiln Mar 22, 2024, 2:43 PM

#

I wonder if the spreadsheet+python synergy could substitute pandas

past meteor Mar 22, 2024, 2:43 PM

#

The whole idea of an index in pandas is just ... idk

#

It makes the library deeply strange

final kiln Mar 22, 2024, 2:44 PM

#

I really like excel, and I really like py, so both of them together, could work

#

I think there's a company that used spreadsheets as their production db for an unreasonably long time

past meteor Mar 22, 2024, 2:44 PM

#

That's just already Pandas/Polars?

final kiln Mar 22, 2024, 2:45 PM

#

I suppose the difference would be that I like using one over the other

#

if I do like it

past meteor Mar 22, 2024, 2:47 PM

#

Speaking of legacy, I don't know how it is abroad but here Java and C# dominated so most Python projects are Greenfields and data projects

#

I'm curious what it'll look like 5 and 10 years down the line 😅

final kiln Mar 22, 2024, 2:48 PM

#

i've seen takes that in 5 to 10 years legacy will be much bigger problem because of tools like copilot

#

well, in excel they're just using the dataframes API, so no improvement

final kiln Mar 22, 2024, 3:27 PM

#

the amount of ways I can shoot myself in the foot in cpp

desert oar Mar 22, 2024, 3:40 PM

#

past meteor The whole idea of an index in pandas is just ... idk

it's kind of a radical design approach, compared to most "data frame" libraries and frameworks that actively eschew row labels

past meteor Mar 22, 2024, 3:40 PM

#

desert oar it's kind of a radical design approach, compared to most "data frame" libraries ...

Do you like it?

desert oar Mar 22, 2024, 3:41 PM

#

past meteor Do you like it?

i think so. i've gotten used to it, at any rate. the API for multi-indexes is still very much lacking though

#

i think the ideal scenario would be to not have separated indexes and data, but to have built-in "indexes" in the sense of a database index that can be attached to a data frame

#

data.table has that, but it's a bit auto-magical, compared to a proper database where you have a relatively wide range of control over what kind of index to use

#

the polars devs have expressed that they don't want to add that to polars, because they felt like it wasn't necessary (which they are totally wrong about, but it's their library and their choice), but they also said it should be easy enough to build a sidecar index thing that works with polars (not totally wrong)

#

for example in data.table you get to specify exactly one column as the primary index, which it uses to sort the rows and uses binary search for lookup and join operations. and you get to specify more columns as secondary indexes, but i don't remember offhand what kind of data structure it maintains for those and what kinds of optimizations it provides.

#

but unlike in pandas the index column does not become separated from the data, it's just the sorting key

#

at the opposite end of the spectrum is xarray, which is basically pandas but multi-dimensional and fully embracing the separation of "coordinates" (the xarray equivalent of a pandas index) and "values" (data)

#

so it's kind of use-case-dependent

past meteor Mar 22, 2024, 3:47 PM

#

desert oar the polars devs have expressed that they don't want to add that to polars, becau...

And I love them for this tbh

desert oar Mar 22, 2024, 3:47 PM

#

i think pandas indexes are great when you have clear obvious choices. it enables you to get much better performance than you otherwise might be able to do with a "dumb" data frame library that lacks a query execution engine

past meteor Mar 22, 2024, 3:47 PM

#

They won performance in different places

desert oar Mar 22, 2024, 3:47 PM

#

and it does help keep things organized by separating "id" columns from everything else, which i find appealing and convenient when "exporting" data to numpy or torch for use in ML. it also makes equi-joins on keys and as-of joins on timestamps very convenient.

#

so i like it in its own time and place. but i'm not exactly clamoring for the index/data separation in other data frame libraries either. i really just want the option to designate certain columns as "coordinate" columns (without physically separating them from the data) and to define database-style indexes for performance as needed. but at that point maybe i just want to start embedding duckdb.

past meteor Mar 22, 2024, 3:49 PM

#

I also don't like .loc and .iloc and the general error message that goes something like "setting a copy on a slice on a dataframe..."

desert oar Mar 22, 2024, 3:50 PM

#

i think the distinction between loc and iloc is important if you are going to be using the indexes

#

the error messages and UX/API... yeah

past meteor Mar 22, 2024, 3:50 PM

#

We've been stockholmed into believing these make sense

#

But they don't

left tartan Mar 22, 2024, 3:50 PM

#

(My obligatory: I just do everything in sql -because- of pandas's ridiculous api)

desert oar Mar 22, 2024, 3:50 PM

#

i think it makes perfect sense if you believe that it makes sense to separate coordinates from data

past meteor Mar 22, 2024, 3:50 PM

#

Just have .filter or .where be how you operate with data

desert oar Mar 22, 2024, 3:50 PM

#

it's not a ridiculous API if you buy into the index-data separation

#

but if you don't like the index-data separation then frankly i think pandas itself is just not the tool for you. it's a core part of the pandas design, like it or not

#

that, or you ignore indexes and suffer with slow linear scans for every filtering operation. but you probably should just use polars then.

past meteor Mar 22, 2024, 3:51 PM

#

That's the thing - it's tightly integrated with most of the DS stack you can't get around it

#

I use polars wherever I can

desert oar Mar 22, 2024, 3:51 PM

#

is it? you could probably just skip pandas entirely except with seaborn and statsmodels

#

it's very integrated into the community though

#

so it will be hard to find a DS team that doesn't use it at some point

#

and it wouldn't be fair to prohibit colleagues from using it

past meteor Mar 22, 2024, 3:52 PM

#

I have my interns using Pandas of course

left tartan Mar 22, 2024, 3:53 PM

#

desert oar that, or you ignore indexes and suffer with slow linear scans for every filterin...

Pandas is just at the tail end of many of our pipelines for sure, but duckdb is most of the rest of it. If it weren't for duckdb, we'd use polars.

desert oar Mar 22, 2024, 3:54 PM

#

tldr: it makes sense if you don't like the index/coordinate-data separation, but i maintain that the loc/iloc distinction makes sense and is necessary when that separation exists.

#

UX/API design for interacting with that distinction is another matter

past meteor Mar 22, 2024, 3:54 PM

#

Oh, I didn't argue loc/iloc being different is bad

desert oar Mar 22, 2024, 3:54 PM

#

oh, maybe that was just billy 😆

past meteor Mar 22, 2024, 3:54 PM

#

One is for labels and the other for positions, that much is clear
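A minimal sketch of the label/position distinction, assuming a toy frame whose index labels are deliberately not in 0..n-1 order (the frame and column name are made up for illustration):

```python
import pandas as pd

# Hypothetical toy frame; the index labels are shuffled on purpose
df = pd.DataFrame({"price": [10.0, 20.0, 30.0]}, index=[2, 0, 1])

by_label = df.loc[0, "price"]   # the row whose index *label* is 0
by_position = df.iloc[0, 0]     # the row at *position* 0, whatever its label

print(by_label, by_position)
```

The two lookups return different rows here, which is why both accessors are needed once an index exists.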

#

I'm arguing that the idea of having indexes on a df is bad yes

#

If you have them, then yeah, you need both

#

Indexes in Pandas' way

desert oar Mar 22, 2024, 3:55 PM

#

yeah, fair. it's like how maybe R didn't need to also be a radical lisp to be a useful stats language

left tartan Mar 22, 2024, 3:55 PM

#

I think my problem is that there are many analytical cases where there's no clear index/data separation

past meteor Mar 22, 2024, 3:56 PM

#

I think being able to add your index ad-hoc / when reading data would've made more sense. Kind of like how data.table does it

left tartan Mar 22, 2024, 3:56 PM

#

Or, more particularly, that the indices vary depending on the query

past meteor Mar 22, 2024, 3:56 PM

#

group_by().reset_index() 😩

#

(Aside from Pandas I have them doing dbt + duckdb)

#

My judgement call when selecting their stack was that SQL, dbt and Pandas will go a longer way for them than Polars early career wise (even though it's no secret that's my fav)

desert oar Mar 22, 2024, 4:11 PM

#

past meteor I think being able to add your index ad-hoc / when reading data would've made mo...

wait you can definitely do this

desert oar Mar 22, 2024, 4:13 PM

#

past meteor (Aside from Pandas I have them doing dbt + duckdb)

are you doing this for "local" work in projects? i've been thinking about trying it

#

normally i do DVC + ad-hoc scripts

past meteor Mar 22, 2024, 4:14 PM

#

desert oar wait you can definitely do this

I mean, polars style API where you could add a DB style index on top if you needed it

past meteor Mar 22, 2024, 4:15 PM

#

desert oar are you doing this for "local" work in projects? i've been thinking about trying...

local as in not in the cloud/on a VM?

final kiln Mar 22, 2024, 4:23 PM

#

I've refined this a bit more. I'm not sure if I can call x_i the thing itself or the coordinates of the thing in a given basis. I also should preface the 4th paragraph with why I'm defining so much new stuff; the motivation is that I'm defining the memory layout for the tensors

pastel gate Mar 22, 2024, 4:30 PM

#

I'm learning data science and my teacher sent me this code,

But the accuracy seems to be at 100% which I find impossible. Is there anything wrong with this code?

#

Oh I can't send a file

#

import pandas as pd
import numpy as np
import sys
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score
from sklearn.metrics import confusion_matrix
from sklearn.metrics import roc_curve, roc_auc_score
from sklearn.model_selection import learning_curve
from sklearn import metrics
import scikitplot as skplt
from sklearn.model_selection import LearningCurveDisplay, ShuffleSplit
from sklearn.model_selection import train_test_split
df = pd.read_csv('data.csv',sep=';')
X = df.drop(['k'], axis=1)
y = df['k']
X_train, X_other, Y_train, Y_other = train_test_split(X, y, test_size=0.2, random_state=42)

X_test, X_val, Y_test, Y_val = train_test_split(X_other, Y_other, test_size=0.5, random_state=42)

#

x = X_train['w']
y = X_train['l1']
k = Y_train
for i,row in X_train.iterrows():
    if k[i]==1:
        plt.plot(x[i],y[i],'rx')
    else:
        plt.plot(x[i],y[i],'gx')
plt.xlabel("Weight")
plt.ylabel("First")
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
ac=pd.DataFrame({'k':[],'Accuracy':[]},index = [])
for k in [1,3,5]:
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train,Y_train)
    y_pred=knn.predict(X_val)
    accuracy = accuracy_score(Y_val, y_pred)
    row=pd.DataFrame({'k':[k],'Accuracy':[accuracy]})
    ac = pd.concat([ac, row],ignore_index=True,axis=0)
data_p = scaler.transform([[70,16,15]])
knn.predict(data_p)
data_p = scaler.transform([[60,16,15]])
knn.predict(data_p)
ac['Accuracy'].max()
ac[ac['Accuracy']==ac['Accuracy'].max()]

k=int(ac[ac['Accuracy']==ac['Accuracy'].max()]['k'].values[0])
knn_super = KNeighborsClassifier(n_neighbors=k)
knn_super.fit(X_train,Y_train)
y_calculated_class=knn_super.predict(X_test)
y_probability_classs=knn_super.predict_proba(X_test)
knn_super.fit(X_train,Y_train)
y_calculated_class=knn_super.predict(X_train)
y_probability_classs=knn_super.predict_proba(X_train)
mac_pom=confusion_matrix(Y_train,y_calculated_class)
mac_pom
#PPV
for i in range(len(mac_pom)):
    print('PPV',i, mac_pom[i,i]/mac_pom[:,i].sum())
    print('TPR',i,mac_pom[i,i]/(mac_pom[i].sum()))
    print('TNR',i,(mac_pom.sum()-mac_pom[i].sum()-mac_pom[:,i].sum()+mac_pom[i,i])/(mac_pom.sum()-mac_pom[i].sum()))
print(metrics.classification_report(Y_train, y_calculated_class))
fig, ax = plt.subplots(figsize=(2, 2))
cmd = metrics.ConfusionMatrixDisplay.from_predictions(Y_train, y_calculated_class,ax=ax)
plt.savefig('Knn_macierz_pomylek.pdf')

final kiln Mar 22, 2024, 4:35 PM

#

pastel gate import pandas as pd import numpy as np import sys import seaborn as sns import m...

what debugging steps have you tried so far ?

pastel gate Mar 22, 2024, 4:37 PM

#

final kiln what debugging steps have you tried so far ?

I checked if the test group, validation group and training group have different data and they do. I added more data to 'data.csv'. Even extremely weird and out of order data, and accuracy was still 100%

#

Honestly I had like one theoretical lesson of data science and no practice and I don't know if the code is correct or not

final kiln Mar 22, 2024, 4:38 PM

#

Have you tried printing out some results ?

#

Like, take the model and feed it some X and see if it is the correct Y

#

And also, print the Y

#

Usually in these cases theres something off with the data and the model converges to outputting some constant

pastel gate Mar 22, 2024, 4:40 PM

#

I used a different data set, much larger, and it still gave me a 100% accuracy

#

honestly I'm still trying to figure out how this exactly works because like I said I had only one theoretical lesson of data science

#

And I think this was a good data set because I got it from kaggle and it seemed ok

final kiln Mar 22, 2024, 4:43 PM

#

Try printing out the output, it will be clearer from there

pastel gate Mar 22, 2024, 4:43 PM

#

I don't want to tell my teacher that his code is wrong unless I'm absolutely sure it is 😅

final kiln Mar 22, 2024, 4:43 PM

#

And 100% accuracy is possible if the dataset is artificially constructed for it

pastel gate Mar 22, 2024, 4:44 PM

#

I used two different data sets so it's not likely

final kiln Mar 22, 2024, 4:45 PM

#

What does the output of the model print out ?

pastel gate Mar 22, 2024, 4:53 PM

#

While I was looking through it I found that the accuracy for every k is the same as well. My teacher said that it was because of the data given, but I changed and expanded the data. Can the problem lie here?
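Two things in the posted script look worth checking, as far as can be told from the text alone: X_val is passed to knn.predict without going through scaler.transform, and the final confusion matrix and classification report are computed on X_train rather than X_test. A 1-nearest-neighbour model matches every training point to itself, so 100% accuracy on the training set is expected in that case. A minimal sketch of the intended train/validation/test flow, using synthetic data from make_classification as a stand-in for data.csv (the data and all variable names are assumptions):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for data.csv: 3 numeric features, binary target
X, y = make_classification(
    n_samples=300, n_features=3, n_informative=3, n_redundant=0, random_state=42
)

X_train, X_other, y_train, y_other = train_test_split(
    X, y, test_size=0.4, random_state=42
)
X_val, X_test, y_val, y_test = train_test_split(
    X_other, y_other, test_size=0.5, random_state=42
)

# Fit the scaler on the training split only, then apply it to all three splits
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_val_s = scaler.transform(X_val)
X_test_s = scaler.transform(X_test)

# Choose k on the validation split
val_scores = {}
for k in (1, 3, 5):
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_train_s, y_train)
    val_scores[k] = knn.score(X_val_s, y_val)
best_k = max(val_scores, key=val_scores.get)

# Report the final number on the test split, not on the training split
final = KNeighborsClassifier(n_neighbors=best_k).fit(X_train_s, y_train)
test_acc = final.score(X_test_s, y_test)

# For comparison: k=1 scored on its own training data
train_acc_k1 = (
    KNeighborsClassifier(n_neighbors=1).fit(X_train_s, y_train).score(X_train_s, y_train)
)
print(val_scores, best_k, test_acc, train_acc_k1)
```

If the numbers on the validation and test splits still look too good after these changes, the next thing to inspect would be the data itself, for example a feature that encodes the label.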

desert oar Mar 22, 2024, 5:18 PM

#

past meteor local as in not in the cloud/on a VM?

either/or. i mean "ad-hoc adding things as you build up a project" rather than in some kind of production pipeline

final kiln Mar 22, 2024, 5:44 PM

#

My resume be like "I use Ricci btw"

muted hollow Mar 22, 2024, 7:12 PM

#

Guys, what to do with the dataset that have the number of positive sample way more than the other. Should I downsample it or ignore

final kiln Mar 22, 2024, 7:14 PM

#

muted hollow Guys, what to do with the dataset that have the number of positive sample way mo...

Weighted cross entropy loss

#

Assuming you're using cross entropy loss

past meteor Mar 22, 2024, 8:03 PM

#

Or ... do nothing and look at your metrics to decide (ROC, DET, ...) an operating point

#

that's my preferred choice over downsampling, upsampling and the like

desert oar Mar 22, 2024, 8:19 PM

#

that. you already have 2000 samples in the smallest case. that's probably enough.

#

start with not worrying about it

#

imbalance is less bad than outright not having enough data to make a good decision about one particular class

odd meteor Mar 22, 2024, 8:55 PM

#

muted hollow Guys, what to do with the dataset that have the number of positive sample way mo...

Whatever you decide to do, avoid using SMOTE. Usually, I prefer tuning the class_weight or scale_pos_weight parameter.
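A minimal sketch of what class weighting amounts to, assuming a made-up 9:1 label vector. The formula is the one scikit-learn documents for class_weight="balanced", namely n_samples / (n_classes * count per class):

```python
import numpy as np

# Hypothetical labels: 900 positives, 100 negatives
y = np.array([1] * 900 + [0] * 100)

counts = np.bincount(y)                    # samples per class: [100, 900]
weights = len(y) / (len(counts) * counts)  # rarer class gets the larger weight
class_weight = {cls: float(w) for cls, w in enumerate(weights)}

print(class_weight)
```

A dict like this can be passed as class_weight to many scikit-learn classifiers. For scale_pos_weight in gradient-boosting libraries the usual starting point is the ratio of negative to positive counts, which would then be tuned like any other hyperparameter.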

odd meteor Mar 22, 2024, 9:01 PM

#

muted hollow Guys, what to do with the dataset that have the number of positive sample way mo...

From the image, it appears you're likely working on sentiment analysis.

If your explanatory variable(s) are text data, you can simply perform data augmentation using TextAttack (the same library used for performing adversarial text attack in NLP)

https://pypi.org/project/textattack/

PyPI

textattack

A library for generating text adversarial examples

supple inlet Mar 22, 2024, 11:00 PM

#

Im running a mistral 7B instruct model on my tesla p40 (24gb vram) and im getting a CUDA out of memory error for my gpu?

orchid lintel Mar 23, 2024, 12:56 AM

#

Anyone got an efficient way of turning a Sparse Polars df into the short arrays that can define a CSR array? I tried turning every column into a 1-element list of Structs (each containing a non-zero value and its corresponding Row) but I kept getting weird errors around duplicate columns (apparently this is a known bug with Structs atm)

serene scaffold Mar 23, 2024, 2:27 AM

#

supple inlet Im running a mistral 7B instruct model on my tesla p40 (24gb vram) and im gettin...

In the future, please give text as actual text. Not as screenshots.
The error message tells you what the problem is. Something else appears to be using up a bunch of GPU memory.

mild grotto Mar 23, 2024, 3:20 AM

#

Fixed a bug. Turns out int(1.9-2)==0 not -1

#

I was wondering why my stuff was sticking to the 0 line
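The behaviour comes from int() truncating toward zero, while flooring rounds toward negative infinity. A small sketch of the difference:

```python
import math

value = 1.9 - 2               # roughly -0.1

truncated = int(value)        # drops the fractional part, rounding toward zero
floored = math.floor(value)   # rounds down, toward negative infinity

print(truncated, floored)
```

For non-negative values the two agree, so the difference only shows up once a quantity crosses below zero.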

tidal bough Mar 23, 2024, 4:04 AM

#

mild grotto Fixed a bug. Turns out int(1.9-2)==0 not -1

What is happening under the hood here? Is this the solution of a DE with fancy boundary conditions?

versed pilot Mar 23, 2024, 10:23 AM

#

past meteor My judgement call when selecting their stack was that SQL, dbt and Pandas will g...

SQL and pandas are fundamental skills for all data professions, and dbt helps make SQL development more manageable. But I guess this means you're preparing people either to be "full stack" jacks of all trades, or even to move away from data science itself towards analytic engineering, data engineering etc.

manic latch Mar 23, 2024, 1:06 PM

#

import matplotlib.pyplot as plt
import numpy as np
from io import BytesIO
plt.clf()
uber = 8
forge_frag = 1600
nitro_value = 175
for uber in range(8, 12):
    i = 0
    x = 1000
    x_axis = []
    y_axis = []
    lol = False
    lol2 = False
    for z in range(100):
        nitro = (nitro_value * (z + 1))
        i += x
        i += 1500 / (uber - 7) - forge_frag * (10 * (uber - 7) - 10)
        f_p_n = i / nitro
        x_axis.append(f_p_n)
        y_axis.append(nitro)
        x += 2000
    plt.plot(y_axis, x_axis, label=f"Uber {uber}")
plt.axhline(y = 0, color = 'b', linestyle = 'dashed', label = "0 ea") 
plt.axhline(y = 160, color = 'r', linestyle = 'dashed', label = "160 ea")
plt.legend()
plt.title("Flux cost per nitro (Vaults)")
plt.xlabel('Nitro')
plt.ylabel('Flux per nitro')
data = BytesIO()
plt.savefig(data, format="png")
data.seek(0)

#

I have this bit of code, however it's missing another X axis which is supposed to show 0 to 100 in intervals of 5

#

I wanted that axis to be on top...and grid the whole graph based on it

#

been looking on stack overflow with no real results, it ends up swallowing the whole graph and creating a new one over it, it seems
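One option that avoids creating a second plot is Axes.secondary_xaxis, which adds an axis tied to the existing one through a pair of conversion functions. The sketch below assumes the top axis should count steps (nitro divided by nitro_value); the curve is a placeholder rather than the real flux formula, and the grid is drawn with plain vertical lines at the top-axis tick positions:

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

nitro_value = 175                   # same constant as the script above
steps = np.arange(1, 101)           # assumed meaning of the top axis: step 1..100
nitro = nitro_value * steps
flux_per_nitro = 10 * steps - 300   # placeholder curve, not the real formula

fig, ax = plt.subplots()
ax.plot(nitro, flux_per_nitro, label="placeholder")
ax.set_xlabel("Nitro")
ax.set_ylabel("Flux per nitro")

# Second x axis on top, linked to the bottom one by forward/inverse conversions
top = ax.secondary_xaxis(
    "top",
    functions=(lambda n: n / nitro_value, lambda s: s * nitro_value),
)
top_ticks = np.arange(0, 101, 5)
top.set_ticks(top_ticks)
top.set_xlabel("Step")

# Vertical grid lines at the top-axis ticks, expressed in bottom-axis units
for s in top_ticks:
    ax.axvline(s * nitro_value, color="0.85", linewidth=0.8, zorder=0)
ax.grid(True, axis="y")
ax.legend()

buffer = io.BytesIO()
fig.savefig(buffer, format="png")
buffer.seek(0)
```

Because the secondary axis shares the same data space, nothing is drawn twice and the original curves stay in place.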

final kiln Mar 23, 2024, 1:37 PM

#

the full scaled dot product attention from 2017, I'm gonna use these to show the equivalence to the quadratic form, and then argue in favor of just using the quadratic form and then try to make the case for the further restriction of it being a metric

#

N_k is not a matrix, probly not the best convention now that I think about it but N acts on an index to produce its maximum range

open raven Mar 23, 2024, 2:32 PM

#

pandas DataFrame.to_numpy( )

Did this method never go through a deprecation stage? In pandas 2.2 it seems to be no longer available: the code runs into an "object has no attribute" error, and the API reference no longer lists it. In pandas 2.1.2 the method is still present, with no deprecation warning.

tidal bough Mar 23, 2024, 2:40 PM

#

open raven pandas DataFrame.to_numpy( ) Did this method never run through deprecation stag...

Huh, I see what you mean. Actually even weirder, I only see that in the docs - it's in the code still: https://github.com/pandas-dev/pandas/blob/main/pandas/core/frame.py#L1857

arctic wedgeBOT Mar 23, 2024, 2:40 PM

#

pandas/core/frame.py line 1857

def to_numpy(

tidal bough Mar 23, 2024, 2:40 PM

#

ah, there: https://github.com/pandas-dev/pandas/issues/57931

GitHub

DOC: Doc page for DataFrame.to_numpy missing only for 2.2 · Issue #...

Pandas version checks I have checked that the issue still exists on the latest versions of the docs on main here Location of the documentation https://pandas.pydata.org/docs/reference/api/pandas.Da...

#

are you sure you're actually getting an error accessing it?

open raven Mar 23, 2024, 3:18 PM

#

tidal bough are you sure you're actually getting an error accessing it?

You're right, it seems I need to check that my own code is correct first. Thanks for the hints.

final kiln Mar 23, 2024, 6:04 PM

#

tomorrow is gonna be intense, I was supposed to have finished this stuff by yesterday, and this deadline was already a pushback from mid last week

final kiln Mar 23, 2024, 6:45 PM

#

desert oar this i think was the derivation: ``` Q = X @ Wq K = X @ Wk V = X @ Wv Q @ K.T =...

there's actually a good reason for doing it the way they did it: it reduces the number of parameters. when you do Wq@Wk.T you're getting back a dxd matrix, so with their way, as long as you choose a k such that k < d/2, you're using fewer parameters (2dk instead of d^2) to make a mathematically equivalent layer

it is possible however, to cook a better and still mathematically equivalent layer with even less parameters if you were to stick with the quadratic form, Im gonna include a small proof of this on my report thing
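A small numerical check of the parameter-count argument, assuming made-up sizes with k < d/2 and random weights. It only illustrates the unscaled score matrix, not the full attention layer:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 5, 16, 4  # hypothetical sizes: n tokens, model width d, head width k

X = rng.normal(size=(n, d))
Wq = rng.normal(size=(d, k))
Wk = rng.normal(size=(d, k))

# Two projections, as in the usual formulation
scores_two_proj = (X @ Wq) @ (X @ Wk).T

# Single d x d matrix (the quadratic form); its rank is at most k
M = Wq @ Wk.T
scores_quadratic = X @ M @ X.T

params_factored = Wq.size + Wk.size  # 2 * d * k
params_full = M.size                 # d * d

print(np.allclose(scores_two_proj, scores_quadratic), params_factored, params_full)
```

The factored form stores 2dk numbers against d^2 for an unconstrained M, which is where the k < d/2 condition comes from.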

harsh scarab Mar 23, 2024, 7:39 PM

#

Guys I need some help. I'm working on a Cars Dataset to predict the price based on cars infos, my problem is i donno how to get insight from the Car's Model Feature since there are 2736 unique values and my dataset has 27k rows

final kiln Mar 23, 2024, 7:45 PM

#

harsh scarab Guys I need some help. I'm working on a **Cars Dataset** to predict the price ba...

what do you mean by car's model feature ?

harsh scarab Mar 23, 2024, 7:51 PM

#

#

this how my dataset looks

#

I think that there will some relation between the Model and the Price

#

but the problem is there 2736 unique values and as you can see the dataset is very large

#

@final kiln

final kiln Mar 23, 2024, 7:55 PM

#

harsh scarab but the problem is there 2736 unique values and as you can see the dataset is ve...

that doesn't sound that large I think, isn't it possible to encode 2.7k classes efficiently ?

#

maybe using embeddings: https://pytorch.org/docs/stable/generated/torch.nn.Embedding.html

harsh scarab Mar 23, 2024, 8:00 PM

#

final kiln that doesn't sound that large I think, isn't it possible to encode 2.7k classes ...

maybe but i want to visualize to spot any pattern

final kiln Mar 23, 2024, 8:00 PM

#

ah you are doing EDA

harsh scarab Mar 23, 2024, 8:00 PM

#

yeah

final kiln Mar 23, 2024, 8:01 PM

#

maybe PCA can help somehow

harsh scarab Mar 23, 2024, 8:01 PM

#

for the make, for example, i spotted that even though some cars occur less frequently, they affect the price a lot

#

like Ferrari, Rolls-Royce

#

which is logical

#

but there are only 58 Make

#

so it was easy

final kiln Mar 23, 2024, 8:02 PM

#

you can likely show this very well with a color coded histogram

#

x axis would be price bins, y axis would be count

#

and then each bar would have an assortment of colors showing the percentage that goes to each class

harsh scarab Mar 23, 2024, 8:03 PM

#

ducky_concerned

final kiln Mar 23, 2024, 8:03 PM

#

no

final kiln Mar 23, 2024, 8:03 PM

#

final kiln x axis would be price bins, y axis would be count

this

harsh scarab Mar 23, 2024, 8:04 PM

#

hmmm

final kiln Mar 23, 2024, 8:04 PM

#

2k classes might still look cluttered even in my proposed plot tho

#

maybe get the statistics for each class and plot that

#

like avg, std, etc

#

then a scatter plot with some error bars

#

labeled with the car model
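A minimal sketch of the per-class statistics idea, assuming columns named "Model" and "Price" (the real column names are not visible in the text) and a handful of made-up rows:

```python
import pandas as pd

# Hypothetical rows standing in for the cars dataset
cars = pd.DataFrame({
    "Model": ["A", "A", "A", "B", "B", "C"],
    "Price": [10_000, 12_000, 11_000, 250_000, 270_000, 30_000],
})

# One row per model: how many listings, average price, spread
stats = (
    cars.groupby("Model")["Price"]
    .agg(["count", "mean", "std"])
    .sort_values("mean", ascending=False)
)

print(stats)
```

With thousands of models it may help to keep only those with a reasonable count before plotting; the mean and std columns can then go into matplotlib's errorbar as the y values and yerr.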

harsh scarab Mar 23, 2024, 8:06 PM

#

that sounds good

#

imma try that

#

thank you

toxic mortar Mar 23, 2024, 11:56 PM

#

I want to build an unsupervised-learning, semantic-based clustering of key information in uploaded files. I am deciding between ChromaDB and Qdrant for the vector db, and between density-based and centroid-based clustering algorithms. For example, I have 10 uploaded hedge fund files, and I want to add them to the vector db and pull out information such as: two advisers said stock X will go up, one said it will go down, etc. Have you solved a similar problem? If yes, or if you have any suggestions for starting out and choosing an optimal tech stack, please lmk. ty

buoyant vine Mar 24, 2024, 12:27 AM

#

How much data is there?

#

Can you not fit this into a system locally? If so, I would use neither, and use PyNNDescent with scikit-learn to create your clustering pipelines

#

A) It is faster to search B) less hassle and C) faster construction

warm copper Mar 24, 2024, 1:47 AM

#

Is there anyone online here?

hybrid prism Mar 24, 2024, 1:47 AM

#

Hey

warm copper Mar 24, 2024, 1:47 AM

#

😄

#

I have a question.

hybrid prism Mar 24, 2024, 1:47 AM

#

Yes

warm copper Mar 24, 2024, 1:47 AM

#

I think my teacher is crazy

hybrid prism Mar 24, 2024, 1:48 AM

#

I'm not that good with Python but sure

warm copper Mar 24, 2024, 1:48 AM

#

Im working on a project

hybrid prism Mar 24, 2024, 1:48 AM

#

warm copper I think my teacher is crazy

Who doesn't?

hybrid prism Mar 24, 2024, 1:48 AM

#

warm copper Im working on a project

Say

warm copper Mar 24, 2024, 1:48 AM

#

and my teacher asked us to use k-means clustering

#

and then find the accuracy

hybrid prism Mar 24, 2024, 1:48 AM

#

For what

#

Oh k

warm copper Mar 24, 2024, 1:48 AM

#

but k means clustering doesnt have accuracy for predictions

hybrid prism Mar 24, 2024, 1:48 AM

#

L

#

Skill issue

#

Nah I'm jk

#

What do you need help in

warm copper Mar 24, 2024, 1:49 AM

#

I mean that question must be wrong

tropic mirage Mar 24, 2024, 1:50 AM

#

hi guys, is full stack development required to become an ML/AI engineer?

hybrid prism Mar 24, 2024, 1:50 AM

#

Nah

clever owl Mar 24, 2024, 2:38 AM

#

Hey guys I have a list of similar words, they are similar by typos i.e. INDONAKANO COMPANY, INDOKSANO COMPNY..., I would like to group these similar typo words together. What is the common practice for doing this?

muted hollow Mar 24, 2024, 6:34 AM

#

Guys, is it bad to clean the samples for bert-base-uncased for a classification problem like sentiment analysis?

hot bridge Mar 24, 2024, 7:47 AM

#

clever owl Hey guys I have a list of similar words, they are similar by typos i.e. INDONAKA...

I think Levenshtein distance will do the work just fine here - this will give the measure of similarity
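A pure-Python sketch of the idea: a standard dynamic-programming Levenshtein distance plus a simple greedy grouping. The grouping rule, the threshold and the extra example names are assumptions for illustration; the greedy pass depends on input order, and dedicated libraries such as rapidfuzz would be faster on large lists.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def group_by_typos(names, max_distance):
    """Greedy grouping: attach each name to the first group whose first member is close enough."""
    groups = []
    for name in names:
        for group in groups:
            if levenshtein(name, group[0]) <= max_distance:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


names = ["INDONAKANO COMPANY", "INDOKSANO COMPNY", "ACME LTD", "ACME LDT"]
groups = group_by_typos(names, max_distance=4)
print(groups)
```

The threshold is the part that needs tuning on real data: too small and typos split into separate groups, too large and distinct companies merge.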

rain nymph Mar 24, 2024, 9:25 AM

#

Hey guys! I am building a NLP virtual assistant. Currently so far I have build till semantic analysis where the machine can understand if my given text is positive or negative. I am trying on how machine can understand the entity and open the applications that I give the query to open. Example = “can you open calculator please?”
By NER I have such output: (S Can/MD you/PRP open/VB calculator/NN please/NN ?/.)
I'm using the NLTK libraries but now idk how I would make a function that will make the machine understand that it has to open calculator. I was thinking of pattern matching but again it gets very tedious. Am I going about this correctly, or is there anything I should consider for entity recognition that I'm currently lacking? Thank you for your help :D

#

also Im using datasets as nltk corpora
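One lightweight way to go from tagged tokens to an action is a small lookup over the tags, sketched below in plain Python. The tagged list mirrors the output quoted above; the ACTIONS and APPS tables and the command strings are hypothetical placeholders, and this is a rule-based shortcut rather than real entity recognition.

```python
# (word, tag) pairs matching the tagged output quoted above
tagged = [("Can", "MD"), ("you", "PRP"), ("open", "VB"),
          ("calculator", "NN"), ("please", "NN"), ("?", ".")]

# Hypothetical lookup tables; the command strings are placeholders
ACTIONS = {"open", "launch", "start"}
APPS = {"calculator": "calc", "notepad": "notepad"}


def extract_command(tagged_tokens):
    """Return (action, app_command) for the first action verb followed by a known app noun."""
    action = None
    for word, tag in tagged_tokens:
        lowered = word.lower()
        if tag.startswith("VB") and lowered in ACTIONS:
            action = lowered
        elif action and tag.startswith("NN") and lowered in APPS:
            return action, APPS[lowered]
    return None


print(extract_command(tagged))
```

Keeping the verbs and application names in tables means new commands are added as data rather than as new matching rules, which is usually what makes hand-written pattern matching feel less tedious.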

hollow sentinel Mar 24, 2024, 1:50 PM

#

import pandas as pd

#

Import "pandas" could not be resolved from source

#

what does that mean 😿

past meteor Mar 24, 2024, 1:52 PM

#

hollow sentinel Import "pandas" could not be resolved from source

Are you sure pandas is installed? Are you in the right virtual environment?

#

You can always double check by opening the terminal and running pip list | grep pandas

hollow sentinel Mar 24, 2024, 1:56 PM

#

past meteor Are you sure pandas is installed? Are you in the right virtual environment?

i just fixed it, i did pip install in the terminal and then closed vscode and reopened it.

pastel gate Mar 24, 2024, 2:17 PM

#

when I use knn algorithm do I have to have a validation group?

serene scaffold Mar 24, 2024, 2:18 PM

#

pastel gate when I use knn algorithm do I have to have a validation group?

what is "the validation group" according to your understanding?

pastel gate Mar 24, 2024, 2:19 PM

#

Basically my teacher said that when you're using the knn algorithm you should have:

training group
Validation group where you check which k is best
Test group - to test final accuracy of the model

#

And the validation group messed up my model, so I was wondering if I could check which k is best just on the test group

serene scaffold Mar 24, 2024, 2:20 PM

#

how did the validation group "mess up your model"?

pastel gate Mar 24, 2024, 2:20 PM

#

I'm new to this so I basically have zero experience

pastel gate Mar 24, 2024, 2:21 PM

#

serene scaffold how did the validation group "mess up your model"?

Accuracy dropped from 97% to 60%

serene scaffold Mar 24, 2024, 2:21 PM

#

that doesn't mean that the validation set "messed up your model". the only set that has any influence on the model's behavior is the training set.

#

if your instructor told you that you need to have a validation set, then you do

#

but in general, to train a model, you only need a training set. (but if you don't have a test set, then you'll have no way of knowing how well it performs.)

hollow sentinel Mar 24, 2024, 2:41 PM

#

so stel i read this medium article and it said to not use seaborn for "default visualizations"

#

bc it doesn't generate the most impressive ones. what should i use instead?

serene scaffold Mar 24, 2024, 2:54 PM

#

hollow sentinel bc it doesn't generate the most impressive ones. what should i use instead?

idk. I hate making data visualizations
matplotlib sucks

hollow sentinel Mar 24, 2024, 2:54 PM

#

serene scaffold idk. I hate making data visualizations matplotlib sucks

matplotlib looks so ass

serene scaffold Mar 24, 2024, 2:54 PM

#

but don't take anything on medium for granted. most of the content on there is written by wannabe influencers.

hollow sentinel Mar 24, 2024, 2:54 PM

#

isn't seaborn built on top of mpl?

serene scaffold Mar 24, 2024, 2:54 PM

#

they all are.

hollow sentinel Mar 24, 2024, 2:55 PM

#

idk i wanna create some more impressive data visualizations for my portfolio

#

maybe tableau is the answer?

hollow sentinel Mar 24, 2024, 2:56 PM

#

serene scaffold but don't take anything on medium for granted. most of the content on there is w...

no ofc not, but i think the source i found is onto something

#

should i put the link here? this guy put it exclusive to medium ppl only

serene scaffold Mar 24, 2024, 2:58 PM

#

If you want

hollow sentinel Mar 24, 2024, 2:59 PM

#

serene scaffold If you want

https://towardsdatascience.com/how-to-find-unique-data-science-project-ideas-that-make-your-portfolio-stand-out-1c2ddfdbefa6

Medium

How to Find Unique Data Science Project Ideas That Make Your Portfo...

Forget Titanic and MNIST: Pick a unique project that builds your skills and helps you stand out from the crowd

#

i think he's right tbh

lofty thorn Mar 24, 2024, 3:00 PM

#

hi..
can anyone please teach me how to calculate the inter quartile range..

serene scaffold Mar 24, 2024, 3:00 PM

#

lofty thorn hi.. can anyone please teach me how to calculate the inter quartile range..

show the code for what you've tried so far.

lofty thorn Mar 24, 2024, 3:01 PM

#

i was reading book and there came this..i didn't write any code for it

#

is this didn't find fair to you?
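For reference, the interquartile range is the third quartile minus the first quartile. A minimal sketch with NumPy on a made-up sample; note that textbooks use several quartile conventions, so a hand calculation can differ slightly from NumPy's default linear interpolation:

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])  # made-up sample

q1, q3 = np.percentile(data, [25, 75])  # first and third quartiles
iqr = q3 - q1

# Common rule of thumb for flagging outliers
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(q1, q3, iqr, lower_fence, upper_fence)
```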

past meteor Mar 24, 2024, 3:05 PM

#

hollow sentinel bc it doesn't generate the most impressive ones. what should i use instead?

seaborn looks better than mpl by default

lofty thorn Mar 24, 2024, 3:05 PM

#

past meteor Mar 24, 2024, 3:05 PM

#

Another option is plotly

hollow sentinel Mar 24, 2024, 3:08 PM

#

past meteor seaborn looks better than mpl by default

there are some weird characters here when i open the file in excel, but when i put it in a pandas dataframe it's fine

#

also there's missing values... do i impute the values e.g fill in the missing ones with an average? or do i just drop the missing entirely?

#

https://youtu.be/f9AQy7p0QEo

YouTube

Underfitted

Don't Replace Missing Values In Your Dataset.

Everyone knows they must replace missing values in their dataset before training a machine learning model.

Most people, however, miss one critical step.

This video will show you what you are missing and how to do it better.

🔔 Subscribe for more stories: https://www.youtube.com/@underfitted?sub_confirmation=1

📚 My 3 favorite Machine Learning...

▶ Play video

#

this isn't a school project or anything if anyone is concerned about helping me

#

just my own curiosity

#

if you always fill in your missing values with imputed data, wouldn't your analysis be skewed?
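One common safeguard, which may or may not be the step the linked video has in mind, is to record which rows were missing before filling them, so the imputed values can be told apart later. A minimal pandas sketch with made-up prices, shown next to the drop-the-rows alternative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"price": [10.0, np.nan, 14.0, np.nan, 12.0]})  # made-up values

# Keep a flag for the rows that were missing, then fill with the median
df["price_missing"] = df["price"].isna()
df["price_filled"] = df["price"].fillna(df["price"].median())

# Alternative: drop the incomplete rows instead of imputing
dropped = df.dropna(subset=["price"])

print(df)
print(len(dropped))
```

Filling with a single value does shrink the spread of the column, so comparing summaries of price_filled against the dropped version is a quick way to see how much the choice matters for a given dataset.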

final kiln Mar 24, 2024, 3:15 PM

#

Math is hard

hollow sentinel Mar 24, 2024, 3:15 PM

#

the old kanye would've imputed everything with the mean

#

i miss the old kanye

#

but now i fr don't know what to do

#

if i drop the values too, there's a problem

final kiln Mar 24, 2024, 3:16 PM

#

I'm out of context here, who's Kanye

hollow sentinel Mar 24, 2024, 3:17 PM

#

final kiln I'm out of context here, who's Kanye

i was trying to be funny, i meant me

past meteor Mar 24, 2024, 3:17 PM

#

final kiln I'm out of context here, who's Kanye

it's a song, look it up

past meteor Mar 24, 2024, 3:18 PM

#

hollow sentinel there are some weird characters here when i open the file in excel, but when i p...

It could be that the character encoding of Excel isn't what it's supposed to be
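
A sketch of one common workaround, assuming the file is UTF-8 and Excel is guessing a different encoding: re-save the CSV with a byte order mark (`utf-8-sig`), which Excel uses as a hint. The rows and the output file name here are made up.

```python
import os
import tempfile

import pandas as pd

# Made-up rows with accented characters, standing in for the real dataset
df = pd.DataFrame({
    "State": ["Nuevo León", "Michoacán"],
    "Price per kilogram": [12.5, 11.0],
})

# Hypothetical output path; utf-8-sig writes a BOM at the start of the file
path = os.path.join(tempfile.mkdtemp(), "tortilla_prices_excel.csv")
df.to_csv(path, index=False, encoding="utf-8-sig")

with open(path, "rb") as f:
    starts_with_bom = f.read(3) == b"\xef\xbb\xbf"
```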

final kiln Mar 24, 2024, 3:18 PM

#

Ah Kanye west

hollow sentinel Mar 24, 2024, 3:18 PM

#

past meteor It could be that the character encoding of Excel isn't what it's supposed to be

yea

final kiln Mar 24, 2024, 3:18 PM

#

I have a symmetric matrix, and for some reason my brain can't think of a way to optimize the matrix mul

past meteor Mar 24, 2024, 3:18 PM

#

final kiln I have a symmetric matrix, and for some reason my brain can't think of a way to ...

@wooden sail ?

final kiln Mar 24, 2024, 3:18 PM

#

The matrix is laid out as a 1d array

#

this is the context

#

it worked out great when it was M_kk' cuz the result was a number and I could partition the sum, but now Im stuck

hollow sentinel Mar 24, 2024, 3:21 PM

#

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.read_csv("/Users/rahuldas/Desktop/Tortilla Dataset/tortilla_prices.csv")
print(df.head())  # head is a method, so it needs to be called
df.info()  # info() prints its own summary and returns None
print(df.shape)
print(df.columns)
print(df.dtypes)
sns.distplot(df["Price per kilogram"])
plt.show()
price_per_kilogram_missing = df["Price per kilogram"].isna().sum()
print(price_per_kilogram_missing)
print("hello world")

final kiln Mar 24, 2024, 3:21 PM

#

s is symmetric in cc', and u=F(c,c'), F is the way Im flattening it
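
A rough CPU-side sketch of the idea, not the actual kernel: it assumes F is the row-major packed upper triangle (a guess, the real flattening isn't shown) and computes a matrix-vector product while reading each stored entry once and using it for both (c, c') and (c', c).

```python
import numpy as np

def packed_index(i, j, n):
    """Position of element (i, j) in the row-major packed upper triangle."""
    if i > j:
        i, j = j, i  # symmetry: (i, j) and (j, i) share one slot
    return i * n - i * (i - 1) // 2 + (j - i)

def packed_matvec(packed, x):
    """Compute y = S @ x with the symmetric S stored as its packed upper triangle."""
    n = x.shape[0]
    y = np.zeros(n, dtype=float)
    for i in range(n):
        for j in range(i, n):
            s = packed[packed_index(i, j, n)]
            y[i] += s * x[j]
            if i != j:
                y[j] += s * x[i]  # reuse the same entry for the mirrored position
    return y

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4))
S = A + A.T                       # a symmetric test matrix
packed = S[np.triu_indices(4)]    # 10 stored values instead of 16
x = rng.normal(size=4)
y = packed_matvec(packed, x)
```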

hollow sentinel Mar 24, 2024, 3:21 PM

#

!pastein

arctic wedgeBOT Mar 24, 2024, 3:21 PM

#

Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.

hollow sentinel Mar 24, 2024, 3:21 PM

#

https://paste.pythondiscord.com/KRSQ

#

does anyone know why nothing is being outputted?

#

not even the hello world works

final kiln Mar 24, 2024, 3:21 PM

#

what's the error that it throws

hollow sentinel Mar 24, 2024, 3:22 PM

#

no error either

final kiln Mar 24, 2024, 3:22 PM

#

not even seg fault

hollow sentinel Mar 24, 2024, 3:22 PM

#

nada

final kiln Mar 24, 2024, 3:22 PM

#

that's suss

hollow sentinel Mar 24, 2024, 3:22 PM

#

final kiln that's suss

i posted it in the pastebin up top

#

i changed distplot to displot

final kiln Mar 24, 2024, 3:24 PM

#

you gonna have to move the print line by line to try to figure out the guilty line

#

something is crashing the program silently

#

and wtv it is kill it with fire cuz programs shouldnt do that

hollow sentinel Mar 24, 2024, 3:24 PM

#

final kiln and wtv it is kill it with fire cuz programs shouldnt do that

yea my thoughts exactly

hollow sentinel Mar 24, 2024, 3:25 PM

#

final kiln you gonna have to move the print line by line to try to figure out the guilty li...

https://www.reddit.com/r/learnpython/comments/18ddvcb/python1983120908209_warning_secure_coding_is_not/


#

2024-03-24 11:24:54.258 Python[61800:5006159] WARNING: Secure coding is automatically enabled for restorable state! However, not on all supported macOS versions of this application. Opt-in to secure coding explicitly by implementing NSApplicationDelegate.applicationSupportsSecureRestorableState:.

final kiln Mar 24, 2024, 3:25 PM

#

final kiln this is the context

im just gonna take the L on this one, primary mission objective has been achieved anyway

hollow sentinel Mar 24, 2024, 3:25 PM

#

my guy what in the world of fuck is that

final kiln Mar 24, 2024, 3:26 PM

#

hollow sentinel https://www.reddit.com/r/learnpython/comments/18ddvcb/python1983120908209_warnin...

looks like a warning, shouldn't be stopping the program

final kiln Mar 24, 2024, 3:27 PM

#

hollow sentinel 2024-03-24 11:24:54.258 Python[61800:5006159] WARNING: Secure coding is automati...

but also, mac strikes again

hollow sentinel Mar 24, 2024, 3:27 PM

#

final kiln but also, mac strikes again

fuck macs.

#

i can't w it

final kiln Mar 24, 2024, 3:27 PM

#

I think I might prefer remaining unemployed than being forced to use one again

hollow sentinel Mar 24, 2024, 3:27 PM

#

nah

#

ok i think i got the number. 6390.

#

there are 6390 rows of data missing for that specific column of Price Per Kilogram

#

based on that, is there any way i can conclude i should be imputing?

#

fuck it we ball, i'll impute anyways

lofty thorn Mar 24, 2024, 3:39 PM

#

i don't get it

hollow sentinel Mar 24, 2024, 3:40 PM

#

lofty thorn i don't get it

how strong is your statistical background?

lofty thorn Mar 24, 2024, 3:40 PM

#

very basic

hollow sentinel Mar 24, 2024, 3:40 PM

#

yea i would recommend augmenting that with some extra work w a course or two

#

that book really isn't beginner friendly

wooden sail Mar 24, 2024, 3:49 PM

#

past meteor @wooden sail ?

if you know the matrix ahead of time and it's manageably small, you can think of diagonalizing it. aside from that, i think computing either y^T(Mx) or (y^TM)x is much more efficient than looping for every single element as naive einstein notation would suggest. if you're coding it yourself, you might consider using strassen's algorithm for whichever product you associate with the matrix

#

i pinged the wrong person

#

@final kiln

final kiln Mar 24, 2024, 3:53 PM

#

wooden sail if you know the matrix ahead of time and it's manageably small, you can think of...

no but im getting these through the gpu, im coding my own kernels so im not actually looping

#

im taking the L on it tho, I've already got what I wanted to do done

#

actually, that does give me the idea, I could just like, not care about memory and tank the repeated calculations, I'd still be squeezing out performance cuz I'd be doing two matrix mul in one operation, I'd just take the same amount of memory instead of half

hollow sentinel Mar 24, 2024, 4:01 PM

#

#

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.read_csv("/Users/rahuldas/Desktop/Tortilla Dataset/tortilla_prices.csv")
sns.set(rc={
    "figure.figsize" : (11.7, 8.27)
})
print(df.head())  # head is a method, so it needs to be called
df.info()  # info() prints its own summary and returns None
print(df.shape)
print(df.columns)
print(df.dtypes)
print("hello world")
price_per_kilogram_missing = df["Price per kilogram"].isna().sum()
print(price_per_kilogram_missing)


price_per_kilogram_missing_mean = df["Price per kilogram"].mean()
print(price_per_kilogram_missing_mean)
df["Price per kilogram"] = df["Price per kilogram"].fillna(price_per_kilogram_missing_mean)
print(df["Price per kilogram"].isna().sum())
sns.kdeplot(df["Price per kilogram"], fill=True)  # fill= replaces the deprecated shade=
#plt.show()
sns.barplot(df, x = "Price per kilogram", y = "State", hue = "State")
plt.show()

#

anyone have any ideas on how to make this visualization more eye-friendly?

#

i might remove the error bars

#

but when i did i still have one

#

aguascalientes

#

weird ash

final kiln Mar 24, 2024, 4:29 PM

#

i have no idea if any of this makes sense >.>

#

ig if backprop dont work ill know why

hollow sentinel Mar 24, 2024, 4:35 PM

#

final kiln i have no idea if any of this makes sense >.>

what would you say i should do?

#

just curious

#

i mean there's 32 cities here

#

maybe show the top 5 values?

final kiln Mar 24, 2024, 4:35 PM

#

i think im making this more complicated than it needs to be, I can take the derivative with respect to one of the symbols already present, cuz it's symmetric with respect to the choice of either p

final kiln Mar 24, 2024, 4:35 PM

#

hollow sentinel maybe show the top 5 values?

let me see

hollow sentinel Mar 24, 2024, 4:36 PM

#

final kiln let me see

here's the OG plot

final kiln Mar 24, 2024, 4:36 PM

#

hollow sentinel

the bars def occupy too much space and the rainbow doesn't look useful

hollow sentinel Mar 24, 2024, 4:36 PM

#

final kiln the bars def ocupy too much space and the rainbow doesn't look useful

that's what i'm thinking too

#

i always run into this problem with data visualizations

final kiln Mar 24, 2024, 4:37 PM

#

potentially normalize the height with respect to the largest bar

hollow sentinel Mar 24, 2024, 4:38 PM

#

final kiln potentially normalize the height with respect to the largest bar

#

whipped up a quick graph of what it could look like instead

#

the names are the problem tho

final kiln Mar 24, 2024, 4:44 PM

#

remove the gradient, just a solid color is standard

toxic mortar Mar 24, 2024, 4:44 PM

#

@final kiln GIGACHAD NAME bro

final kiln Mar 24, 2024, 4:44 PM

#

like, if it has no information it is not useful

final kiln Mar 24, 2024, 4:48 PM

#

toxic mortar @final kiln GIGACHAD NAME bro

is gigachad good? im getting old I cant keep up with the memeology no more

toxic mortar Mar 24, 2024, 4:49 PM

#

"cool name"

final kiln Mar 24, 2024, 4:55 PM

#

toxic mortar "cool name"

ty

final kiln Mar 24, 2024, 4:56 PM

#

final kiln i think im making this more complicated than it needs to be, I can take the deri...

this was not the case, I can't just pick one of the symbols cuz then I lose the case where they are equal

#

this is likely correct, when c = c' it becomes the derivative of x**2, which is 2x

#

kind of a crazy way to express it tho

hollow sentinel Mar 24, 2024, 5:09 PM

#

final kiln remove the gradient, just a solid color is standard

#

better?

#

weird

#

i can't get it to have the highest value at the top
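
One way to control bar order, sketched on made-up data: work out the per-state value first, sort it, and hand the resulting list to the plot. `sns.barplot` accepts an `order=` argument for the categorical axis.

```python
import pandas as pd

df = pd.DataFrame({
    "State": ["A", "A", "B", "B", "C", "C"],  # made-up states
    "Price per kilogram": [10.0, 12.0, 18.0, 20.0, 14.0, 15.0],
})

# States sorted by median price, highest first
order = (
    df.groupby("State")["Price per kilogram"]
    .median()
    .sort_values(ascending=False)
    .index
    .tolist()
)

# Then, for example:
# sns.barplot(data=df, x="Price per kilogram", y="State", order=order)
print(order)
```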

final kiln Mar 24, 2024, 5:16 PM

#

hollow sentinel

they should have the same color, like blue or black

#

and try dividing by the height of the largest bar

hollow sentinel Mar 24, 2024, 5:17 PM

#

final kiln and try dividing by the height of the largest bar

not sure what that means

final kiln Mar 24, 2024, 5:17 PM

#

price per kg = (price per kg) / max(price per kg )

#

im gonna assume my crazy equations are correct and move on, cuz I gotta get stuff done

#

I'll have the opportunity to refine them once I have a unit test on the entire layer

hollow sentinel Mar 24, 2024, 5:22 PM

#

final kiln price per kg = (price per kg) / max(price per kg )

that did something weird

hollow sentinel Mar 24, 2024, 5:22 PM

#

final kiln price per kg = (price per kg) / max(price per kg )

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import colorcet as cc
df = pd.read_csv("/Users/rahuldas/Desktop/Tortilla Dataset/tortilla_prices.csv")
sns.set(rc={
    "figure.figsize" : (11.7, 8.27)
})
print(df.head())  # head is a method, so it needs to be called
df.info()  # info() prints its own summary and returns None
print(df.shape)
print(df.columns)
print(df.dtypes)
print("hello world")
price_per_kilogram_missing = df["Price per kilogram"].isna().sum()
print(price_per_kilogram_missing)


price_per_kilogram_missing_mean = df["Price per kilogram"].mean()
print(price_per_kilogram_missing_mean)

df = df.sort_values("Price per kilogram", ascending = False)  # sort_values returns a new frame, so keep the result

df["Price per kilogram"] = df["Price per kilogram"].fillna(price_per_kilogram_missing_mean)
print(df["Price per kilogram"].isna().sum())
df["Price per kilogram"] = (df["Price per kilogram"])/(df["Price per kilogram"].max())
sns.set_style("whitegrid")
sns.kdeplot(df["Price per kilogram"], fill=True);  # fill= replaces the deprecated shade=
#plt.show()
sns.barplot(df,estimator=np.median, x = "Price per kilogram", y = "State", color = "blue");
sns.despine(left = True
            );
plt.show()

#

#

what in the world is that

#

at the top part of the graph

#

very odd behavior

#

i don't think it likes it

final kiln Mar 24, 2024, 5:28 PM

#

hollow sentinel

if none of the bars hit the value of 1, you did something wrong
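
A sketch of why no bar reaches 1 in the code above, on made-up data: the division is done on individual rows, and the plot then takes a per-state median, so even the top state's median sits below the single largest row. Aggregating first and dividing afterwards puts the largest bar at exactly 1.

```python
import pandas as pd

df = pd.DataFrame({
    "State": ["A", "A", "B", "B", "C", "C"],  # made-up states
    "Price per kilogram": [10.0, 12.0, 18.0, 20.0, 14.0, 15.0],
})
col = "Price per kilogram"

# Row-level normalization, then median per state: the top bar stays below 1
row_level = (df[col] / df[col].max()).groupby(df["State"]).median()

# Median per state first, then normalize: the top bar is exactly 1
per_state = df.groupby("State")[col].median()
normalized = per_state / per_state.max()

print(row_level.max(), normalized.max())
```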

#

draft

#

#

and now im gonna curl up and cry cuz I still didnt get everything done and that means im gonna be working on this til nite

final kiln Mar 24, 2024, 5:38 PM

#

final kiln

there's something that doesn't sit quite right with me tho, even if they are equivalent, isn't the one that performs a projection to a lower dimensional space still a bottleneck? like, if the embedding dimension is 1000000, and the space where it's projected to has 10 dimensions, like, there should be a higher risk of loss of information

#

even tho I can multiply both weights and get a 1000000x1000000 matrix

#

I'm willing to accept that in the ideal math world I can funnel and recover any information regardless of how tight the bottleneck is, but I think there is gonna be some sort of limit IRL, if nothing else, just coming out of the fact that IRL we dont operate on the reals, we operate on the lattice of floating point numbers

#

im sure someone has already figured out this stuff

lapis sequoia Mar 24, 2024, 6:18 PM

#

i thought scipy optimize minimize is some insane math but all its actually doing is changing every single parameter and seeing how it affects the loss and that is what all solvers do apparently

#

it just changes them one by one

tidal bough Mar 24, 2024, 6:28 PM

#

i mean, it depends on the algorithm

neon crystal Mar 24, 2024, 6:33 PM

#

anyone have good recommendation for resources on time series forecasting in python? I have a course by jm portilla on udemy but that one is kinda outdated now.

lapis sequoia Mar 24, 2024, 6:34 PM

#

tidal bough i mean, it depends on the algorithm

i tried all of ones that dont require gradient, i made it recreate an image, and then train a pytorch model, so i tried "Nelder-Mead", "Powell", "CG","BFGS","TNC","COBYLA" and "SLSQP", and i logged the image it outputs after each time it evaluates it, it changes each pixel one at a time and then puts it back to original value and does the next pixel
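
What's described here looks like finite-difference gradient estimation: as far as I know, when no gradient function is passed, the gradient-based methods (CG, BFGS, and so on) estimate it by nudging one parameter at a time and resetting it. A minimal NumPy sketch of that idea, with a made-up loss; it is not SciPy's actual implementation.

```python
import numpy as np

def forward_difference_gradient(f, x, eps=1e-6):
    """Estimate the gradient of f at x by perturbing one parameter at a time."""
    x = np.asarray(x, dtype=float)
    f0 = f(x)
    grad = np.zeros_like(x)
    for i in range(x.size):
        bumped = x.copy()       # start again from the original values
        bumped[i] += eps        # change a single parameter
        grad[i] = (f(bumped) - f0) / eps
    return grad

def loss(p):
    # Made-up quadratic loss with its minimum at (1, -2, 3)
    return np.sum((p - np.array([1.0, -2.0, 3.0])) ** 2)

g = forward_difference_gradient(loss, [0.0, 0.0, 0.0])
print(g)
```

The estimate costs one extra loss evaluation per parameter, which is why it gets slow for images or network weights; the solver then uses that gradient to pick a step, so the one-by-one probing is only part of what happens.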

gritty vessel Mar 24, 2024, 6:38 PM

#

hey how can i handle memory error it says unable to allocate 2.01 gigs

#

but i have 16 gigs

#

and i was monitoring on task manager enough memory was there

kindred isle Mar 24, 2024, 7:19 PM

#

Can anyone help with conda installation? I have two different ones installed and idk which one to keep

#

final kiln Mar 24, 2024, 7:49 PM

#

kindred isle Can anyone help with conda installation? I have two different ones installed and...

I think any will do

#

Unless you've installed stuff that took you a long time to install or anything like that

final kiln Mar 24, 2024, 7:51 PM

#

lapis sequoia i thought scipy optimize minimize is some insane math but all its actually doing...

I have found that everything can be put into words that make the thing sound simple, but when you get into the meat of it, you realize it's actually pretty complicated

honest reef Mar 24, 2024, 8:15 PM

#

How are you guys doing

supple inlet Mar 24, 2024, 8:17 PM

#

Im getting a Out of Memory error, im trying to run Mistral-7B-Instruct-v0.2 model and i have a tesla p40 (24gb vram):

OutOfMemoryError                          Traceback (most recent call last)
Cell In[7], line 2
      1 model_inputs = encodeds.to(device)
----> 2 model.to(device)

OutOfMemoryError: CUDA out of memory. Tried to allocate 224.00 MiB. GPU 0 has a total capacity of 23.87 GiB of which 47.00 MiB is free. Including non-PyTorch memory, this process has 23.82 GiB memory in use. Of the allocated memory 23.68 GiB is allocated by PyTorch, and 1.14 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

final kiln Mar 24, 2024, 8:18 PM

#

supple inlet Im getting a Out of Memory error, im trying to run Mistral-7B-Instruct-v0.2 mode...

Run nvidia-smi, see if any other process is eating away at the memory

supple inlet Mar 24, 2024, 8:19 PM

#

Its only the current task im doing

pale thunder Mar 24, 2024, 8:32 PM

#

When dealing with the gini index for the purposes of deciding a split in a decision tree, you compute the gini index of samples on either side of the split, then take a weighted average of the indexes. However, a gini index is supposed to be the probability of a sample being misclassified - that is, the probability of random.choice(samples).class_ != random.choice(samples).class_ - the correct way to compute this for the two splits would be a different formula entirely. Why is the weighted average used?
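
One common reading, sketched below on made-up labels: the weighted average is the probability that a random sample and a second sample drawn from the same child node have different classes, since a sample lands in each child with probability proportional to that child's size. Drawing both samples from the pooled set would just give the parent's impurity again, whatever the split.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: probability that two draws (with replacement) differ in class."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def weighted_gini(left, right):
    """Impurity after a split: child impurities weighted by child size."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

left = ["a", "a", "a", "b"]             # made-up labels on one side of the split
right = ["b", "b", "a", "b", "b", "b"]  # made-up labels on the other side

parent_impurity = gini(left + right)
split_impurity = weighted_gini(left, right)
print(parent_impurity, split_impurity)
```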

lapis sequoia Mar 24, 2024, 8:43 PM

#

wait, how do you deal with the gini index

import numpy as np

features = ["SEX","AGE","YEAR","EDUC","INCWAGE","WKSWORK2"]
sampling_weight = 'ASECWT'

df = sampled_df[features + [sampling_weight]]

df.isna().sum()

# .copy() so the column assignments below work on an independent frame
df = df[
(df['AGE'].between(21, 64)) &
(df['INCWAGE'].between(0, 99999998)) &
(df['WKSWORK2'].between(1, 6))
].copy()

df = df.drop('ASECWT',axis=1)

df.info()
df.isna().sum()
a = 0.10
df['EDUC'] = df['EDUC'] * a

df['LOG_INCWAGE'] = np.log1p(df['INCWAGE'])

# keep the mean in a plain variable so the lambda below can refer to it
mean_hours_worked = df['WKSWORK2'].mean()
df['mean_hours_worked'] = mean_hours_worked

df['WKSWORK2'] = df['WKSWORK2'].apply(lambda x: mean_hours_worked if 40 <= x <= 46 else x)

print(df)

df['LOG_INCWAGE_PER_WEEK'] = df['LOG_INCWAGE'] - np.log1p(df['WKSWORK2'])

df['MARKET_EXP'] = df['AGE'] - df['EDUC'] - 6

print(df)

print(df.dtypes)

df['YEAR_OF_EXP_SQUAREd'] = df['MARKET_EXP']**2

final kiln Mar 24, 2024, 8:50 PM

#

supple inlet Its only the current task im doing

Weird, I think I've been able to run that model even on my CPU

#

With 8gb of memory that also is occupied with the rest of the system

#

Tho I did make a larger swap file

supple inlet Mar 24, 2024, 8:51 PM

#

final kiln Tho I did make a larger a Swap file

should i try this?

final kiln Mar 24, 2024, 8:51 PM

#

supple inlet Im getting a Out of Memory error, im trying to run Mistral-7B-Instruct-v0.2 mode...

Do you have more code before model.to(device)?

supple inlet Mar 24, 2024, 8:51 PM

#

final kiln Do you have more code before model.to(device)?

yh one second

final kiln Mar 24, 2024, 8:52 PM

#

supple inlet should i try this?

No, the swap file was for normal ram to dump memory

supple inlet Mar 24, 2024, 8:53 PM

#

heres all the code:

from transformers import AutoTokenizer, AutoModelForCausalLM
device = 'cuda'
model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
messages = [
    {"role": "user", "content": "What is your favourite condiment?"},
    {"role": "assistant", "content": "Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!"},
    {"role": "user", "content": "Do you have mayonnaise recipes?"}
]
encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")
model_inputs = encodeds.to(device)
model.to(device)

final kiln Mar 24, 2024, 8:54 PM

#

Ah Ive never used the transformers lib. But, aren't you potentially loading the same model twice ?

supple inlet Mar 24, 2024, 8:55 PM

#

final kiln Ah Ive never used the transformers lib. But, aren't you potentially loading the ...

which library do you use?

final kiln Mar 24, 2024, 8:55 PM

#

supple inlet which libary do you use?

I've been coding transformers from scratch, but also, I've used Ollama to experiment with the open source models

#

Not sure if you can use Ollama to finetune

supple inlet Mar 24, 2024, 8:57 PM

#

Ive heard a lot about Ollama, i havent tried it yet.

supple inlet Mar 24, 2024, 8:58 PM

#

final kiln Ah Ive never used the transformers lib. But, aren't you potentially loading the ...

by loading twice do you mean the .to(device) lines?

final kiln Mar 24, 2024, 8:59 PM

#

supple inlet by loading twice do you mean the .to(device) lines?

I mean the from_pretrained calls, don't they load models directly to the GPU?

#

Try to debug this line by line, start with loading only one model to the GPU

#

And see the effect of that on the memory

#

There's gonna be a line where it gets to 20gb, which I think it shouldn't right

supple inlet Mar 24, 2024, 9:01 PM

#

Thanks ill give it a go

final kiln Mar 24, 2024, 9:01 PM

#

Yeah Mistral 7b should be like 4gb on the gpu

supple inlet Mar 24, 2024, 9:03 PM

#

Just running the tokenizer line it takes up 150mb lol. And it successfully encodes my message to a tensor

supple inlet Mar 24, 2024, 9:04 PM

#

final kiln Yeah Mistral 7b should be like 4gb on the gpu

is this with any quantisation? im probably mistaken but i thought the formula was 7b parameters would mean (7 * 2 bytes) 14gb memory
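
A back-of-the-envelope sketch of that formula. The parameter count of roughly 7.24 billion is an assumption for this model, and the estimate covers weights only (no activations or KV cache). As far as I know, `from_pretrained` loads weights in float32 unless a `torch_dtype` is passed, which would put the weights alone above the 23.87 GiB the card reports.

```python
N_PARAMS = 7.24e9  # approximate parameter count for Mistral-7B (assumption)

# Bytes needed per parameter at different precisions
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1, "4-bit": 0.5}

def weights_gib(n_params, bytes_per_param):
    """Memory for the weights alone, in GiB."""
    return n_params * bytes_per_param / 2**30

estimates = {name: weights_gib(N_PARAMS, b) for name, b in BYTES_PER_PARAM.items()}
for name, gib in estimates.items():
    print(f"{name:>8}: {gib:5.1f} GiB")
```

On these numbers, half precision (for example `torch_dtype=torch.float16` in the `from_pretrained` call) should fit on a 24 GB card, and a figure around 4 GB corresponds to 4-bit quantization.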

final kiln Mar 24, 2024, 9:06 PM

#

supple inlet is this with any quantisation? im probably mistaken but i taught the formula was...

dont know, im just following this table

lapis sequoia Mar 24, 2024, 9:16 PM

#

how do you add weights to variables in python
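
If this is about survey sampling weights like the ASECWT column mentioned earlier, one simple reading is a weighted statistic. A minimal sketch with made-up numbers, using `np.average`:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "INCWAGE": [30000.0, 50000.0, 90000.0],  # made-up incomes
    "ASECWT": [1.0, 2.0, 1.0],               # made-up sampling weights
})

# Each row counts in proportion to its weight
weighted_mean = np.average(df["INCWAGE"], weights=df["ASECWT"])
unweighted_mean = df["INCWAGE"].mean()
print(weighted_mean, unweighted_mean)
```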

wooden sail Mar 24, 2024, 9:17 PM

#

final kiln I'm willing to accept that in the ideal math world I can funnel and recover any ...

this is true neither in math nor irl

#

especially in the linear case, the recoverability conditions are well known

final kiln Mar 24, 2024, 9:19 PM

#

wooden sail this is not true neither in math nor irl

Must be the case when you can decompose a matrix into two, making a bottleneck

wooden sail Mar 24, 2024, 9:19 PM

#

the link between unique recoverability of high dimensional vectors from low dimensional ones is through so-called "sparse recovery", where the constraint is that the projection matrix needs to satisfy special identifiability conditions and the vectors you're looking for in high dimensions are sparse or have a sparse linear representation

#

the property can be thought of as approximately preserving distances between the vectors in the original vector space even after projecting them to a lower dimensional one

#

a popular formulation is via the "restricted isometry property" using the johnson-lindenstrauss lemma

hollow sentinel Mar 24, 2024, 9:22 PM

#

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
df = pd.read_csv("/Users/rahuldas/Desktop/Tortilla Dataset/tortilla_prices.csv")

print(df.head())  # head is a method, so it needs to be called
df.info()  # info() prints its own summary and returns None
print(df.shape)
print(df.columns)
print(df.dtypes)
print("hello world")
price_per_kilogram_missing = df["Price per kilogram"].isna().sum()
print(price_per_kilogram_missing)


price_per_kilogram_missing_mean = df["Price per kilogram"].mean()
print(price_per_kilogram_missing_mean)
df["Price per kilogram"] = df["Price per kilogram"].fillna(price_per_kilogram_missing_mean)
print(df["Price per kilogram"].isna().sum())
sns.set_style("whitegrid")
sns.kdeplot(df["Price per kilogram"], fill=True);  # fill= replaces the deprecated shade=
#plt.show()
fig, ax = plt.subplots(figsize=(6, 6))
# drawing the plot
sns.boxplot(data=df, x = "Store type", y = "Price per kilogram", color = "lightblue", ax=ax);
plt.xticks(rotation=90)
sns.despine(left=True, right=True, top=True, bottom=True)
#plt.show()

df["Date"] = pd.to_datetime(df[["Year", "Month", "Day"]])
print(df.columns)
print(df.head())
sns.lineplot(x = "Date", y = "Price per kilogram", data=df)
plt.show()

#

Traceback (most recent call last):
  File "/Users/rahuldas/Desktop/Tortilla Dataset/Tortilla Data Analysis.py", line 34, in <module>
    sns.lineplot(x = "Date", y = "Price per kilogram", data=df)
  File "/Users/rahuldas/Library/Python/3.9/lib/python/site-packages/seaborn/relational.py", line 508, in lineplot
    p._attach(ax)
  File "/Users/rahuldas/Library/Python/3.9/lib/python/site-packages/seaborn/_base.py", line 1135, in _attach
    converter.update_units(seed_data)
  File "/Library/Python/3.9/site-packages/matplotlib/axis.py", line 1717, in update_units
    self._update_axisinfo()
  File "/Library/Python/3.9/site-packages/matplotlib/axis.py", line 1729, in _update_axisinfo
    info = self.converter.axisinfo(self.units, self)
  File "/Library/Python/3.9/site-packages/matplotlib/dates.py", line 1882, in axisinfo
    return self._get_converter().axisinfo(*args, **kwargs)
  File "/Library/Python/3.9/site-packages/matplotlib/dates.py", line 1799, in axisinfo
    majloc = AutoDateLocator(tz=tz,
  File "/Library/Python/3.9/site-packages/matplotlib/dates.py", line 1333, in __init__
    super().__init__(tz=tz)
  File "/Library/Python/3.9/site-packages/matplotlib/dates.py", line 1132, in __init__
    self.tz = _get_tzinfo(tz)
  File "/Library/Python/3.9/site-packages/matplotlib/dates.py", line 236, in _get_tzinfo
    raise TypeError(f"tz must be string or tzinfo subclass, not {tz!r}.")
TypeError: tz must be string or tzinfo subclass, not <matplotlib.category.UnitData object at 0x1291503a0>.
(base) rahuldas@Das ~ %

#

what does this error mean?

#

some kind of type mismatch
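
A likely cause, going by the traceback: the Date column itself is fine, but with the earlier plt.show() commented out, the line plot lands on the same axes as the boxplot, whose x axis already carries categorical units (the UnitData object in the message). A sketch of the usual fix with plain matplotlib and made-up data, where a bar chart stands in for the boxplot: give the date plot its own axes. With seaborn the equivalent would be passing `ax=ax_line` to `sns.lineplot`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "Store type": ["Small", "Big", "Small", "Big"],  # made-up rows
    "Date": pd.to_datetime(["2024-01-01", "2024-01-01", "2024-02-01", "2024-02-01"]),
    "Price per kilogram": [12.0, 10.0, 12.5, 10.5],
})

# First figure: categorical x axis
fig_box, ax_box = plt.subplots(figsize=(6, 6))
by_store = df.groupby("Store type")["Price per kilogram"].mean()
ax_box.bar(by_store.index, by_store.values)

# Second figure: separate axes, so the date axis never meets the categorical units
fig_line, ax_line = plt.subplots(figsize=(8, 4))
by_date = df.groupby("Date")["Price per kilogram"].mean()
ax_line.plot(by_date.index, by_date.values)
```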

final kiln Mar 24, 2024, 9:23 PM

#

wooden sail the property can be thought of as approximately preserving distances between the...

I thought there was an isometry between any Rn to any other Rm

#

Not isometry, wait

#

Ah idk, I thought you could say they have the same cardinality

wooden sail Mar 24, 2024, 9:24 PM

#

even with n = m, general matrices are not invertible

#

with n != m, they cannot be invertible

hollow sentinel Mar 24, 2024, 9:25 PM

#

 #   Column              Non-Null Count   Dtype         
---  ------              --------------   -----         
 0   State               278886 non-null  object        
 1   City                278886 non-null  object        
 2   Year                278886 non-null  int64         
 3   Month               278886 non-null  int64         
 4   Day                 278886 non-null  int64         
 5   Store type          278886 non-null  object        
 6   Price per kilogram  278886 non-null  float64       
 7   Date                278886 non-null  datetime64[ns]

wooden sail Mar 24, 2024, 9:25 PM

#

one looks for special conditions under which left invertibility is possible

final kiln Mar 24, 2024, 9:25 PM

#

I might've not expressed what I meant correctly

hollow sentinel Mar 24, 2024, 9:25 PM

#

i used df.info()

#

i thought it accepted that type

final kiln Mar 24, 2024, 9:26 PM

#

If I have a linear map from Rn to Rm

hollow sentinel Mar 24, 2024, 9:26 PM

#

seemingly it doesn't

final kiln Mar 24, 2024, 9:26 PM

#

That's a matrix right, and I can decompose it into two matrix that multiply into the original

wooden sail Mar 24, 2024, 9:26 PM

#

the cardinality of a vector space is card(field)^dimension; over R that comes out the same for every finite dimension, but equal cardinality only gives you a bijection of sets, not a linear one

final kiln Mar 24, 2024, 9:26 PM

#

So that means I have a map like

Rn -> Rk -> Rm

wooden sail Mar 24, 2024, 9:27 PM

#

yeah, and neither of the two will be invertible

final kiln Mar 24, 2024, 9:27 PM

#

If k is very small, I don't find it intuitive that this composition could represent the original one

#

Because there's information being compressed, whereas in the original, there was not

wooden sail Mar 24, 2024, 9:27 PM

#

that's what i'm telling you, the operation is not invertible in general

final kiln Mar 24, 2024, 9:28 PM

#

I'm saying I didn't express what I meant correctly

wooden sail Mar 24, 2024, 9:28 PM

#

what do you mean to say, then?

#

you had already said that at the beginning

final kiln Mar 24, 2024, 9:28 PM

#

final kiln If k is very small, I don't find it intuitive that this composition could repres...

This

#

There's compression going on right, so the second transformation must be recovering something from the first

wooden sail Mar 24, 2024, 9:29 PM

#

for one, it cannot be done linearly

final kiln Mar 24, 2024, 9:30 PM

#

Not sure I follow

wooden sail Mar 24, 2024, 9:30 PM

#

the dimension of the intermediate R^k also has to satisfy special properties

#

how's your linear algebra?

final kiln Mar 24, 2024, 9:30 PM

#

No like, n -> k and then k-> m

wooden sail Mar 24, 2024, 9:31 PM

#

if the dimension of the vector space the data is in originally is larger than k, then you irreversibly lose data and can't do anything about it

final kiln Mar 24, 2024, 9:31 PM

#

final kiln No like, n -> k and then k-> m

The second dimension of the first matrix matches the first dimension of the second matrix

wooden sail Mar 24, 2024, 9:31 PM

#

final kiln No like, n -> k and then k-> m

yes, that's what i'm telling you

#

what you're asking is exactly what the johnson-lindenstrauss lemma discusses

final kiln Mar 24, 2024, 9:32 PM

#

So again, I don't find it intuitive that the composition can fully represent the n -> m

#

But it's possible cuz it's a matrix mul

wooden sail Mar 24, 2024, 9:32 PM

#

you'll need to review your linear algebra

final kiln Mar 24, 2024, 9:33 PM

#

I disagree, understanding the math is different from having an intuitive picture of it

wooden sail Mar 24, 2024, 9:34 PM

#

i gave you an intuitive explanation in terms of isometry too but ok

#

at any rate, if you look up sparse recovery you'll find any level of abstraction and detail you like about the topic

final kiln Mar 24, 2024, 9:36 PM

#

wooden sail i gave you an intuitive explanation in terms of isometry too but ok

Different people will have different thresholds for what is or what is not intuitive

final kiln Mar 24, 2024, 10:01 PM

#

final kiln No like, n -> k and then k-> m

I think the explanation is gonna be that the set of maps that you can build, which come from a composition of linear maps like this one, and where k is much smaller than both m and n, are not gonna be very complicated from the get go, so there's not a lot of information that needs to flow from one side to the other
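
A quick NumPy check of that intuition, with arbitrary sizes: the product of an m×k and a k×n matrix has rank at most k, so the composition can only ever represent the low-rank maps from R^n to R^m, while a generic m×n matrix has full rank.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, m = 50, 3, 40  # arbitrary sizes with k much smaller than n and m

W1 = rng.normal(size=(k, n))  # R^n -> R^k
W2 = rng.normal(size=(m, k))  # R^k -> R^m
W = W2 @ W1                   # composed map R^n -> R^m, an m x n matrix

rank_composed = np.linalg.matrix_rank(W)
rank_generic = np.linalg.matrix_rank(rng.normal(size=(m, n)))
print(rank_composed, rank_generic)
```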

toxic mortar Mar 24, 2024, 10:59 PM

#

toxic mortar I want to build unsupervided learning semantic-based cluster grouping of key inf...

Regarding this question, I built a pipeline that is not giving me the results I was hoping for. Pipeline is following:

Input:
    List of file IDs for document body extraction.
Steps:
    1) get_document_sentences()
       - Aggregates sentences within documents by semantic similarity.
       - Outputs a list of lists, where the inner list contains semantically grouped sentences, and the outer list aggregates these groups across documents.
       - Sentences are initially split using '. ' (dot_split).
       - Semantically similar sentences within a document are grouped using vectorization and clustering (semantic_split).
    2) cluster_sentences()
       - Further clusters the semantically grouped sentences across all documents to identify broader themes or contexts.
       - Takes as input a list of lists, where each list represents a context, and clusters these contexts across documents using specified clustering techniques (e.g., agglomerative, DBSCAN, kMeans).
       - Outputs a list of lists, with each inner list containing sentences from various documents that share a similar context.
Output:
    List of semantically grouped lists.

#

Problem is that I get one dense "centroid" cluster and the rest are very sparse, and I dont have optimal number of clusters. I fine-tune it for one example group, but I overfit it and cant generalize

#

These are hyperparams I use:

 methods = [
        ('agglomerative', {'distance_threshold': 1.2, 'linkage': 'ward'}),
        ('dbscan', {'eps': 4.0, 'min_samples': 2, 'n_neighbors': 50}),
        ('kmeans', {'n_clusters': 30})
    ]
def cluster_sentences(sentences, cluster_method='agglomerative', **kwargs):
    sentence_vectors = vectorize_text(sentences)
    if cluster_method == 'agglomerative':
        model = AgglomerativeClustering(n_clusters=None, **kwargs)
    elif cluster_method == 'dbscan':
        n_neighbors = min(len(sentences), kwargs.pop('n_neighbors'))
        nn_descent = NNDescent(sentence_vectors, n_neighbors=n_neighbors, metric='euclidean')
        indices, distances = nn_descent.neighbor_graph  # pynndescent returns (indices, distances), in that order
        n_samples = sentence_vectors.shape[0]
        indptr = np.arange(0, n_samples * n_neighbors + 1, n_neighbors)
        precomputed_distance_matrix = csr_matrix((distances.ravel(), indices.ravel(), indptr), shape=(n_samples, n_samples))
        precomputed_distance_matrix = sort_graph_by_row_values(precomputed_distance_matrix)
        model = DBSCAN(metric='precomputed', **kwargs)
    elif cluster_method == 'kmeans':
        n_clusters = min(len(sentences), kwargs.get('n_clusters'))
        kwargs.pop('n_clusters', None)
        model = KMeans(n_clusters=n_clusters, **kwargs)
    else:
        raise ValueError("Unsupported clustering method.")

    if cluster_method != 'dbscan':
        model.fit(sentence_vectors)
        labels = model.labels_
    else:
        labels = model.fit_predict(precomputed_distance_matrix)

    return labels

#

I mean, I can experiment with hyperparameter tuning (grid search, random search, k-fold, etc.), but I am not sure how to establish validation metrics for unsupervised learning.
If anyone has done something similar to what I am trying to build, please let me know if I messed up the pipeline logic.

#

How do I find an optimal number of clusters?
How do I evaluate clustering?
Everything I found about evaluation and hyperparameter optimization is about supervised learning.
Do I start labeling data?

lapis sequoia Mar 24, 2024, 11:40 PM

#

toxic mortar How to find an optimal number of clusters How to evaluate clustering? Everything...

elbow method
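
A minimal sketch of what the elbow method looks like in practice, with the silhouette score added as a label-free quality metric. It assumes scikit-learn is available and uses synthetic blobs as a stand-in for the sentence embeddings; `scan_k` is a hypothetical helper name, not part of the pipeline above.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in for sentence embeddings (assumption: 4 well-separated groups).
X, _ = make_blobs(
    n_samples=300,
    centers=[[0, 0], [10, 0], [0, 10], [10, 10]],
    cluster_std=0.6,
    random_state=0,
)

def scan_k(X, k_values, random_state=0):
    """Fit KMeans for each k and collect (k, inertia, silhouette)."""
    rows = []
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state).fit(X)
        rows.append((k, model.inertia_, silhouette_score(X, model.labels_)))
    return rows

results = scan_k(X, range(2, 9))
for k, inertia, sil in results:
    # Elbow: look for the k after which inertia stops dropping sharply.
    print(f"k={k}  inertia={inertia:10.1f}  silhouette={sil:.3f}")

# Silhouette: higher is better, so the argmax is a candidate for k.
best_k = max(results, key=lambda row: row[2])[0]
print("k with the highest silhouette:", best_k)
```

Neither metric needs labels, so the same scan could be run on the real sentence vectors. On real text embeddings the elbow is often much less sharp than on toy data, so treat the result as a hint rather than an answer.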

lucid wadi Mar 25, 2024, 5:18 AM

#

`from analysis.inception import InceptionV3` isn't working in my Python script:
PS C:\users\zayga\VATr-pp-main> python .\generate.py text --text hello
Traceback (most recent call last):
  File "C:\users\zayga\VATr-pp-main\generate.py", line 2, in <module>
    from generate import generate_text, generate_authors, generate_fid, generate_page, generate_ocr, generate_ocr_msgpack
  File "C:\users\zayga\VATr-pp-main\generate\__init__.py", line 1, in <module>
    from generate.text import generate_text
  File "C:\users\zayga\VATr-pp-main\generate\text.py", line 5, in <module>
    from generate.writer import Writer
  File "C:\users\zayga\VATr-pp-main\generate\writer.py", line 15, in <module>
    from models.model import VATr
  File "C:\users\zayga\VATr-pp-main\models\model.py", line 7, in <module>
    from analysis.inception import InceptionV3
ModuleNotFoundError: No module named 'analysis.inception'
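
The error names 'analysis.inception' rather than 'analysis', which suggests Python did find a package called `analysis` but no `inception` module inside it. A small diagnostic sketch for checking where a name resolves from; `locate` is a hypothetical helper and uses only the standard library.

```python
import importlib.util

def locate(name):
    """Return where Python would load `name` from, or None if it cannot be found."""
    try:
        spec = importlib.util.find_spec(name)
    except ModuleNotFoundError:
        return None  # a parent package of `name` is missing
    if spec is None:
        return None  # parent exists, but it has no such submodule
    # Regular modules have an origin file; namespace packages only have search paths.
    return spec.origin or list(spec.submodule_search_locations or [])

# Run from the project root, e.g.:
#   print(locate("analysis"))            # which `analysis` is being picked up?
#   print(locate("analysis.inception"))  # None means inception.py is not in it
```

If `locate("analysis")` points somewhere outside the project folder, another package with the same name may be shadowing the project's own; if it points inside the project, the `inception.py` file may simply be missing from that directory.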

#

can somebody help?

#

DM me if you can because I'm exhausted, thanks so much if you can

feral wind Mar 25, 2024, 6:26 AM

#

hi guys, I'm working on a chatbot project and this is the error that I got:


```
You can run `openai migrate` to automatically upgrade your codebase to use the 1.0.0 interface.

Alternatively, you can pin your installation to the old version, e.g. `pip install openai==0.28`

A detailed migration guide is available here: https://github.com/openai/openai-python/discussions/742
```

I tried running `pip install openai==0.28` but it didn't work, and the migration guide on GitHub is hard to understand. Can someone explain it to me?
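
A hedged sketch of what the 1.0.0 interface looks like for a chat call. The first line of the error is cut off above, so this assumes the old code called `openai.ChatCompletion.create`; the `ask` helper and the model name are placeholders, not taken from the original project.

```python
def ask(client, prompt, model="gpt-3.5-turbo"):
    """Send one user message through the openai>=1.0 client interface."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    # In 1.x the response is an object, so attribute access replaces dict indexing.
    return response.choices[0].message.content

# Usage with the real client (needs openai>=1.0 installed and OPENAI_API_KEY set):
#   from openai import OpenAI
#   client = OpenAI()  # reads the API key from the environment
#   print(ask(client, "hello"))
```

The main change is that calls go through a client object instead of module-level functions. If the `openai==0.28` pin seemed to have no effect, `python -m pip show openai` shows which version the interpreter running the script actually sees.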

wooden sail Mar 25, 2024, 7:42 AM

#

final kiln I think the explanation is gonna be that the set of maps that you can build, whi...

right, with n and m > k, a low dimensional projection is performed. that also means the overall matrix is rank-deficient and non-invertible, even if square

#

it would also mean that you have to work with a positive semidefinite metric tensor instead of a positive definite one. otherwise you lose the low dimensional projection, which arguably is what the authors intended to have
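
A quick numerical check of the two claims above, as a sketch with NumPy. The sizes `n = 6` and `k = 2` and the random matrices are illustrative assumptions; the point is only that a map factored through k < n dimensions has rank at most k, and that a "metric" of the form AᵀA is positive semidefinite rather than positive definite.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 6, 2  # assumption: a square n x n map factored through k < n dimensions

A = rng.standard_normal((k, n))  # projects n -> k
B = rng.standard_normal((n, k))  # lifts k -> n
M = B @ A                        # n x n, but rank at most k, hence singular

print("rank of M:", np.linalg.matrix_rank(M))

# G = A^T A is symmetric with n - k zero eigenvalues: semidefinite, not definite.
G = A.T @ A
eigenvalues = np.linalg.eigvalsh(G)
print("eigenvalues of G:", np.round(eigenvalues, 6))
```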

#

you had mentioned an approach where the matrices are identities and then they use max pooling. that's the same as a low dimensional projection

carmine wharf Mar 25, 2024, 9:39 AM

#

Hi everybody, I have a question. I have to buy a new laptop. I want to do some deep learning and some CNNs for computer vision. Do you think it makes sense to buy one with an Nvidia GPU such as an RTX 4050? Would that be enough to train models? Or is it better to have a dedicated server or Google Colab with a GPU to do that?

final kiln Mar 25, 2024, 9:45 AM

#

wooden sail it would also mean that you have to work with a positive semidefinite metric ten...

I'm gonna re-read their paper, but from what I recall they don't motivate their choices, though their code and choice of hyperparameters are very telling: they always use the k that makes the projection more efficient than using a quadratic directly. I even have a suspicion they started with the quadratic and came up with this, but my impression is that they were just building on top of previous approaches and "accidentally" stumbled upon this

final kiln Mar 25, 2024, 9:46 AM

#

carmine wharf Hi everybody, I have a question. I have to buy a new laptop. I want to do some d...

A low-end Nvidia GPU with 8 GB of VRAM is very handy to have around for smaller models and general proof-of-concept work

#

but there's no reason to go overboard, and you can get by without one. I don't have one and my setup is super efficient: I must spend on average less than 5 cents per day on GPU. some days I use more than others ofc, but averaged out, if you use it mindfully with a good setup, it's much cheaper

#

furthermore, and this is my personal take, the trend is gonna be towards decentralized ML training

#

if nothing else, I'll personally make it happen since I've had the idea in the back of my mind for a while, but there are quite a lot of smart people pushing for it already

wooden sail Mar 25, 2024, 9:51 AM

#

final kiln I'm gonna re read their paper, but from what I recall they don't motivate their ...

do they say anything about "embedding" or "subspace" or "low dimension"? in any case, you achieve the same effect by making your tensor low rank

carmine wharf Mar 25, 2024, 9:51 AM

#

final kiln but no reason to go overboard and you can get by with not having it, I don't and...

thanks for your reply. That's what I was thinking. Also, GPUs come with gaming laptops, which are huge and heavy. I agree on the decentralized ML. That was my first thought

final kiln Mar 25, 2024, 9:52 AM

#

wooden sail do they say anything about "embedding" or "subspace" or "low dimension"? in any ...

im gonna check

#

embedding they mention for sure ofc

#

the word space appears once when talking about the 1/sqrt()

#

"Due to the reduced dimension of each head, the total computational cost is similar to that of single-head attention with full dimensionality" they do mention it
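
The arithmetic behind that quote can be sketched in a few lines. This counts only the multiply-adds of the Q·Kᵀ score matrices and assumes each head gets `d_model // num_heads` dimensions; the sequence length, `d_model = 512` and 8 heads are illustrative values, and `attention_score_flops` is a hypothetical helper.

```python
def attention_score_flops(seq_len, d_model, num_heads):
    """Multiply-adds for the Q K^T score matrices, summed over all heads."""
    d_head = d_model // num_heads  # reduced dimension of each head
    return num_heads * seq_len * seq_len * d_head

seq_len, d_model = 128, 512
single = attention_score_flops(seq_len, d_model, num_heads=1)
multi = attention_score_flops(seq_len, d_model, num_heads=8)
print(single, multi)
```

Splitting into heads divides the per-head dimension by the number of heads, so the factor cancels and the total stays the same under this count.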

#

yeah in fact I think the "multi-headed" part was also central to the transformer innovation, which makes sense

#

I gotta re read the entire thing, they actually do motivate some of these things quite well

final kiln Mar 25, 2024, 10:00 AM

#

carmine wharf thanks for your reply. That's what I was thinking. Also, GPU comes in with gamin...

how much VRAM do you get with a gaming laptop?

carmine wharf Mar 25, 2024, 10:07 AM

#

final kiln how much vRAM do you get with a gaming laptop ?

6 to 8 GB from what I saw

final kiln Mar 25, 2024, 10:11 AM

#

carmine wharf 6 to 8 GB from what I saw

that's pretty small in the context of ML, but still useful, and there are also cool tricks like gradient accumulation that let you train with a larger effective batch size than what fits in GPU memory
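
A minimal sketch of gradient accumulation in PyTorch, assuming a toy linear model and random data. The idea is to run several small batches that each fit in memory, let their gradients add up, and only then take one optimizer step; `accum_steps` and the batch sizes are illustrative choices.

```python
import torch
from torch import nn

torch.manual_seed(0)
model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

# Toy data: 8 micro-batches of 4 samples each (stand-ins for batches that fit in VRAM).
micro_batches = [(torch.randn(4, 10), torch.randn(4, 1)) for _ in range(8)]
accum_steps = 4  # effective batch size = 4 micro-batches * 4 samples = 16

optimizer_steps = 0
optimizer.zero_grad()
for step, (x, y) in enumerate(micro_batches, start=1):
    loss = loss_fn(model(x), y) / accum_steps  # scale so the summed gradient is an average
    loss.backward()                            # gradients add up in .grad across calls
    if step % accum_steps == 0:
        optimizer.step()        # one update per accum_steps micro-batches
        optimizer.zero_grad()   # start the next accumulation window
        optimizer_steps += 1

print("optimizer steps taken:", optimizer_steps)
```

This trades time for memory: the model and one micro-batch still have to fit on the GPU, so it does not help if the model itself is too large.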

carmine wharf Mar 25, 2024, 10:13 AM

#

ok, did not know that

#

I will keep investigating, see if I can make up my mind

#

thanks for your help 😄

pure pond Mar 25, 2024, 12:05 PM

#

hello people, I'm trying to get more practical experience with PyTorch and I have a question about the conventional way to select a row of a tensor. I have a StandardScaler which has been fitted on a dataset of shape (x, y), i.e. x rows and y columns. I also have a Dataset with a `__getitem__`. It gets a row from my torch tensor, but in order to scale it, I need to transform the selected "row" from shape (y,) into (1, y). (If I'm doing something weird here, let me know.)

My question then is, what's the more standard way to do it?
x = X_train[i : i+1], or
x = X_train[i].reshape(1, -1)

tidal bough Mar 25, 2024, 12:10 PM

#

the latter is IMO more readable.
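
For comparison, a small sketch showing that the options give the same result. The toy `X_train` is an assumption standing in for the real data, and `unsqueeze(0)` is added as a third spelling that is common in PyTorch code.

```python
import torch

X_train = torch.arange(12, dtype=torch.float32).reshape(4, 3)  # toy (rows, cols) tensor
i = 2

a = X_train[i : i + 1]         # slicing keeps the row dimension
b = X_train[i].reshape(1, -1)  # index, then add the dimension back
c = X_train[i].unsqueeze(0)    # same idea, spelled the PyTorch way

print(a.shape, b.shape, c.shape)
```

Since all three produce the same (1, y) tensor, the choice is mostly about readability. One possible alternative, if the scaler is only used for preprocessing, is to scale the whole array once before building the Dataset, which avoids reshaping inside `__getitem__` at all.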

winter nacelle Mar 25, 2024, 12:13 PM

#

I need to know how they got these calculations.
Can somebody help me?