#data-science-and-ml | Python | Page 89

wooden sail Nov 21, 2023, 10:44 PM

#

data driven only refers to how the optimization is done

#

i.e. analytically vs. stochastically from measurements

past meteor Nov 21, 2023, 10:45 PM

#

why does every subdomain use the same basic terminology to mean different things 😩

wooden sail Nov 21, 2023, 10:45 PM

#

any form of stochastic grad desc is data driven, regardless of what you're optimizing
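
A minimal sketch of one reading of that distinction: the same linear model and squared-error objective, with the parameters obtained once analytically and once by stochastic gradient descent on sampled measurements. The data, learning rate and batch size are made up for illustration.

```py
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
w_true = np.array([2.0, -1.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=500)

# Analytic route: solve the least-squares problem in one shot.
w_analytic, *_ = np.linalg.lstsq(X, y, rcond=None)

# Stochastic route: minibatch gradient steps on the same objective.
w_sgd = np.zeros(3)
lr, batch = 0.05, 32  # illustrative values, not tuned
for _ in range(3000):
    idx = rng.integers(0, len(X), size=batch)
    grad = 2.0 / batch * X[idx].T @ (X[idx] @ w_sgd - y[idx])
    w_sgd -= lr * grad

print(w_analytic, w_sgd)
```

Both routes fit the same architecture; only the way the parameters are found differs.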

wooden sail Nov 21, 2023, 10:45 PM

#

past meteor why does every subdomain use the same basic terminology to mean different things...

yeah it's dumb shit

#

i kinda trust this paper cuz this yonina eldar is a well known mathematician

past meteor Nov 21, 2023, 10:45 PM

#

iron basalt Nov 21, 2023, 10:46 PM

#

"Perceptron" is a specific thing though, you can't change its activation function. #data-science-and-ml message

past meteor Nov 21, 2023, 10:46 PM

#

They have model-based on a spectrum with data-driven on opposite ends

wooden sail Nov 21, 2023, 10:47 PM

#

past meteor

i'm not sure i fully agree with that

past meteor Nov 21, 2023, 10:47 PM

#

I think they use "model" in the same way as reinforcement learning uses "model" (the MPC-kind of "model")

past meteor Nov 21, 2023, 10:48 PM

#

wooden sail i'm not sure i fully agree with that

Do you mean you disagree with yonina eldar 👀

#

(I don't know who they are)

wooden sail Nov 21, 2023, 10:48 PM

#

lemme go check 😩 smh

past meteor Nov 21, 2023, 10:48 PM

#

I lifted this from the paper

wooden sail Nov 21, 2023, 10:48 PM

#

iron basalt "Perceptron" is a specific thing though, you can't change its activation functio...

oh i totally missed it said perceptron

#

i've been bamboozled

wooden sail Nov 21, 2023, 10:50 PM

#

past meteor I lifted this from the paper

i mean, in the text it says "purely data-driven" which is different from how it's shown in the figure

iron basalt Nov 21, 2023, 10:50 PM

#

wooden sail oh i totally missed it said perceptron

You might find "gaussian perceptron" but that is abusing the naming.

wooden sail Nov 21, 2023, 10:50 PM

#

cuz you can have model-based and data-driven at the same time

iron basalt Nov 21, 2023, 10:51 PM

#

That's like calling my layer with a non-linear activation function a "linear nonlinear layer."

wooden sail Nov 21, 2023, 10:51 PM

#

hybrid model-based/data-driven systems...

#

the "model-based" part refers to architecture, while "data-driven" refers to how the parameters of the architecture are learned

#

they're letting people off easy by not calling it black-box + data-driven

past meteor Nov 21, 2023, 10:53 PM

#

wooden sail the "model-based" part refers to architecture, while "data-driven" refers to how...

Yes 🤔

#

I agree with this. The "fully" model-driven approach (whatever that is) still needs to estimate a handful of parameters

#

But what they can do is way more constrained

wooden sail Nov 21, 2023, 10:54 PM

#

yeah. and you can either do that from data stochastically or analytically

#

right, the one model will only work for a specific type of problem, usually

#

it can't adapt by just changing the parameters

past meteor Nov 21, 2023, 10:55 PM

#

The predictions are as good as the chosen model

#

So if the model used in the domain is already a massive oversimplification of reality

#

Like the ones that try to model human stuff in silico

wooden sail Nov 21, 2023, 10:55 PM

#

that's one take on what model means though, not the only one

#

the other one is to take an optimizer which has convergence guarantees but involves expensive steps

#

then replace those expensive steps with a black-box network

#

this is independent of whether what you're "modelling" is modelled with a network or not
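
A rough sketch of the "optimizer turned into a network" idea, using ISTA as a stand-in (the chat mentions ADMM; ISTA is swapped in here only because it is shorter). Each iteration is treated as one layer with its own step size and threshold. Here they are fixed to the classical values; in a learned variant those arrays, or a small network replacing part of the update, would be trained from data. All names and sizes are assumptions for illustration.

```py
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of the L1 norm.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def unrolled_ista(A, y, steps, thresholds):
    """Run one ISTA iteration per (step, threshold) pair; each pair is a 'layer'."""
    x = np.zeros(A.shape[1])
    for step, thr in zip(steps, thresholds):
        x = soft_threshold(x - step * A.T @ (A @ x - y), thr)
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(40, 100))
x_true = np.zeros(100)
x_true[[5, 37, 80]] = [1.5, -2.0, 1.0]
y = A @ x_true

lam = 0.1
L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the gradient
K = 200                               # number of unrolled layers
steps = np.full(K, 1.0 / L)           # would be trainable in a learned variant
thresholds = np.full(K, lam / L)      # likewise

def objective(x):
    return 0.5 * np.sum((A @ x - y) ** 2) + lam * np.sum(np.abs(x))

x_hat = unrolled_ista(A, y, steps, thresholds)
print(objective(np.zeros(100)), objective(x_hat))
```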

past meteor Nov 21, 2023, 10:57 PM

#

Ah yes, okay this makes sense

#

So you're not using it as an emulator - you're basically using it to converge your expensive model

#

Correct?

wooden sail Nov 21, 2023, 10:57 PM

#

you'll see a lot of ADMM on crack done this way

#

kinda, yeah

#

a good combo of these things is to take a network that learns a model for something complicated, and its parameters are learned by grabbing a well established optimization routine and intertwining it with the network

#

these usually end up somewhat like autoencoders

#

like input -> modelling network -> output -> optimization routine turned into network -> the parameters we care about

#

and you learn the parameters for the forward and inverse networks together using some fancy cost function, e.g. possibly enforcing the model network to solve a differential equation as part of its cost instead of only fitting data
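
A toy sketch of such a cost: a data-fit term plus a penalty on how badly a candidate violates a differential equation. The ODE u' = -k u, the grid, and the candidate functions are all hypothetical, and plain arrays stand in for the output of a network.

```py
import numpy as np

k = 2.0
t = np.linspace(0.0, 1.0, 101)
t_obs = t[::20]                      # a few "measurement" times
u_obs = np.exp(-k * t_obs)           # noiseless observations of the true solution

def physics_informed_loss(u, weight=1.0):
    # Data term: mismatch at the observed times only.
    data_term = np.mean((u[::20] - u_obs) ** 2)
    # Physics term: residual of u' + k u = 0 via finite differences.
    residual = np.gradient(u, t) + k * u
    physics_term = np.mean(residual ** 2)
    return data_term + weight * physics_term

good = np.exp(-k * t)                          # satisfies the ODE
bad = 1.0 - t * (1.0 - np.exp(-k))             # straight line through the end points

print(physics_informed_loss(good), physics_informed_loss(bad))
```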

past meteor Nov 21, 2023, 11:03 PM

#

I think this makes sense to me yes 🤔

#

Do you have a paper on this as well?

wooden sail Nov 21, 2023, 11:09 PM

#

not off the top of my head, but many recent papers solving inverse problems with physics-informed neural networks should be doing something similar

rugged comet Nov 22, 2023, 12:24 AM

#

Yeah, columns like zip code make that tough.

rugged comet Nov 22, 2023, 3:41 AM

#

I am writing a DecisionTreeClassifier from scratch for fun and to help me understand the algorithm better.
How are model weights/decision nodes saved within a DecisionTreeClassifier object? I'm thinking about perhaps creating a new class for the nodes similar to the Tree data structure.
I think that's how they might do it.
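
A minimal sketch of the node-class approach described here. The class and field names are made up for illustration. For comparison, scikit-learn keeps a fitted tree as parallel arrays on `tree_` (`children_left`, `children_right`, `feature`, `threshold`, `value`) rather than as linked node objects.

```py
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    feature: Optional[int] = None        # index of the feature to split on
    threshold: Optional[float] = None    # go left if x[feature] <= threshold
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    prediction: Optional[int] = None     # set only on leaves

    def is_leaf(self) -> bool:
        return self.prediction is not None

def predict_one(node: Node, x) -> int:
    # Walk from the root until a leaf is reached.
    while not node.is_leaf():
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.prediction

# A hand-built tree standing in for what fit() would produce.
tree = Node(feature=0, threshold=2.5,
            left=Node(prediction=0),
            right=Node(feature=1, threshold=1.0,
                       left=Node(prediction=1),
                       right=Node(prediction=2)))

print(predict_one(tree, [3.0, 0.5]))
```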

loud plaza Nov 22, 2023, 8:09 AM

#

rugged comet I am writing a DecisionTreeClassifier from scratch for fun and to help me underst...

print("Hi")

placid cedar Nov 22, 2023, 8:11 AM

#

#

guys, how do i solve this error

storm smelt Nov 22, 2023, 10:49 AM

#

Actually, I'm just confused about how to ask the question

wild wadi Nov 22, 2023, 1:17 PM

#

Hello everyone! Is there any experienced python webscraper around who's got 5 minutes for a few questions from a noob??? :$$$

serene scaffold Nov 22, 2023, 1:22 PM

#

wild wadi Hello everyone! Is there any experienced python webscraper around whos got 5 min...

This channel isn't about web scraping. So please open a thread in #1035199133436354600. And give enough information that people can start answering your question

serene scaffold Nov 22, 2023, 1:22 PM

#

storm smelt Actually, I'm just confused about how to ask the question

Just say as much as you can about what you're trying to do

wild wadi Nov 22, 2023, 1:23 PM

#

Done, thank you!

storm smelt Nov 22, 2023, 1:25 PM

#

serene scaffold Just say as much as you can about what you're trying to do

I want to make modeling using the KNN method but I can't understand the material yet. Is there anyone who wants to give me material or explain the material to me?

serene scaffold Nov 22, 2023, 1:29 PM

#

storm smelt I want to make modeling using the KNN method but I can't understand the material...

Try reading a different guide or watching a different video, and use this channel if you have a specific question. If someone writes up an explanation for you, it won't be fundamentally different from explanations you can already find online

#

But if you already have a more specific question than that that you can put words to, please go ahead.

pure palm Nov 22, 2023, 2:39 PM

#

I am actually looking for a team to participate in ML hackathons
I have worked on LLMs, autogen, diffusion models, VAEs
If anyone is interested pls let me know.

mild dirge Nov 22, 2023, 3:08 PM

#

@iron basalt I am now reading conscious MIND Resonant BRAIN, but I am coming across a lot of tiny mistakes (referencing the wrong images, mixing up rows/columns, saying someone lived from 1869 until 1854 etc.). Am I reading an old version or something, or is this something you found as well? (asking you because you recommended this book earlier, the book is really interesting so far)

west cloak Nov 22, 2023, 3:22 PM

#

I have a question on classifiers and the bagging method. What classifiers gain from this method?

scarlet helm Nov 22, 2023, 3:24 PM

#

Hello, I am creating a sentiment analyzer with Python and a vanilla Keras RNN. The dataset consists of two columns, sentence and label (positive and negative). I tokenize the sentences, remove stop words and convert the tokens to numbers. Currently the accuracy of my model is 52%. How can I improve this?

#

Here is my model definition

serene scaffold Nov 22, 2023, 3:43 PM

#

scarlet helm Here is my model definition

!code

arctic wedgeBOT Nov 22, 2023, 3:43 PM

#

Formatting code on discord

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

For long code samples, you can use our pastebin.

scarlet helm Nov 22, 2023, 4:00 PM

#

from keras import optimizers
from keras.layers import Dense, Embedding, SimpleRNN
from keras.models import Sequential

def vainilla_rnn():
    # vocab_size, maxlen and num_classes are defined elsewhere in the script
    model = Sequential()
    model.add(Embedding(vocab_size, 200, input_length=maxlen))
    model.add(SimpleRNN(200, return_sequences=False))
    # softmax pairs with sparse_categorical_crossentropy over num_classes outputs
    model.add(Dense(num_classes, activation='softmax'))
    model.summary()

    # pass the configured optimizer object so the learning rate actually applies
    adam = optimizers.Adam(learning_rate=0.001)
    model.compile(loss='sparse_categorical_crossentropy', optimizer=adam, metrics=['accuracy'])
    return model

#

Here is the model definition

serene scaffold Nov 22, 2023, 4:29 PM

#

scarlet helm ```py def vainilla_rnn(): model = Sequential() model.add(Embedding(vocab...

how many epochs did you run?

scarlet helm Nov 22, 2023, 4:32 PM

#

10, but if I increase the number of epochs I don't get a big improvement in performance

#

For example I tried with 30 and the result was similar

desert oar Nov 22, 2023, 4:37 PM

#

scarlet helm Hello, I am creating a sentiment analyzer with Python and keras vanilla RNN, the...

unless this is a known data set with known benchmark accuracy with various model types, there's always the possibility that the data are not clearly separated by sentiment. where are you getting the sentiment labels from?

#

and as a baseline, did you try something like logistic regression with tfidf features, or hashed features, or a simpler word embedding model like cbow/skipgram ?
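
A minimal sketch of that kind of baseline, assuming scikit-learn. The sentences and labels below are invented toy data, not the dataset under discussion; the point is only the shape of the pipeline.

```py
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-in data (1 = positive, 0 = negative).
sentences = [
    "great phone, works perfectly",
    "the food was delicious",
    "loved this movie",
    "excellent service and friendly staff",
    "terrible battery life",
    "the food was awful",
    "boring movie, waste of time",
    "rude staff and slow service",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# TF-IDF features feeding a linear classifier.
baseline = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
baseline.fit(sentences, labels)

preds = baseline.predict(["great movie", "awful service"])
print(preds)
```

If a baseline like this clearly beats the RNN on a held-out split, the problem is more likely in the network setup than in the data.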

iron basalt Nov 22, 2023, 4:38 PM

#

mild dirge @iron basalt I am now reading conscious MIND Resonant BRAIN, but I am c...

It has several mistakes and could use a new edition. It also tends to go all over the place at times.

desert oar Nov 22, 2023, 4:39 PM

#

also how big is the dataset? maybe you don't have enough data to learn a useful embedding space, maybe you want to consider pre-trained word vectors from a bigger model & data set

mild dirge Nov 22, 2023, 4:39 PM

#

iron basalt It has several mistakes and could use a new edition. It also tends to go all ove...

Exactly haha. Really interesting book, but kinda hard to follow at certain points.

iron basalt Nov 22, 2023, 4:39 PM

#

mild dirge Exactly haha. Really interesting book, but kinda hard to follow at certain point...

Grossberg is known for having confusing presentations too.

mild dirge Nov 22, 2023, 4:40 PM

#

He talks about a few presentations on yt, probably going to watch one to get an idea

desert oar Nov 22, 2023, 4:40 PM

#

west cloak I have a question on classifiers and the bagging method. What classifiers gain f...

you're asking what's the benefit of using bagging? are you familiar with the more general concept of bootstrapping in statistics? bagging is more or less just bootstrapping for a predictive model

#

"bagging" is just a cute abbreviation of "bootstrap aggregating"

#

https://scikit-learn.org/stable/modules/ensemble.html#bagging

https://scikit-learn.org/stable/auto_examples/ensemble/plot_bias_variance.html#sphx-glr-auto-examples-ensemble-plot-bias-variance-py

scikit-learn

Single estimator versus bagging: bias-variance decomposition

This example illustrates and compares the bias-variance decomposition of the expected mean squared error of a single estimator against a bagging ensemble. In regression, the expected mean squared e...

scikit-learn

1.11. Ensembles: Gradient boosting, random forests, bagging, voting...

Ensemble methods combine the predictions of several base estimators built with a given learning algorithm in order to improve generalizability / robustness over a single estimator. Two very famous ...
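
As a rough illustration of the question above: bagging tends to help most with high-variance learners such as unpruned decision trees. The sketch below compares a single tree with a bagged ensemble of trees on synthetic data; the dataset and settings are arbitrary assumptions, and the size of any gain will vary.

```py
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20,
                           n_informative=5, random_state=0)

single_tree = DecisionTreeClassifier(random_state=0)
# Each of the 50 trees is fit on a bootstrap resample; predictions are aggregated.
bagged_trees = BaggingClassifier(DecisionTreeClassifier(random_state=0),
                                 n_estimators=50, random_state=0)

tree_acc = cross_val_score(single_tree, X, y, cv=5).mean()
bag_acc = cross_val_score(bagged_trees, X, y, cv=5).mean()
print(f"single tree: {tree_acc:.3f}, bagged trees: {bag_acc:.3f}")
```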

scarlet helm Nov 22, 2023, 4:44 PM

#

desert oar unless this is a known data set with known benchmark accuracy with various model...

The dataset used is "Sentiment Labelled Sentences Dataset", from the UC Irvine Machine Learning Repository.
The sentences come from three different websites/fields:
amazon.com
imdb.com
yelp.com
https://archive.ics.uci.edu/dataset/331/sentiment+labelled+sentences

UCI Machine Learning Repository

Discover datasets around the world!

scarlet helm Nov 22, 2023, 4:45 PM

#

desert oar and as a baseline, did you try something like logistic regression with tfidf fea...

I also implemented a Dummy Classifier and it got an accuracy of 51%

agile owl Nov 22, 2023, 4:53 PM

#

what's the right way to find a model that fits a time series best on average across all segments if you split it up into, say 20ths

#

if I fit it overall there are certain periods that dominate and lead it to performing quite poorly in others and I want it to perform more consistently over different time periods

#

I started by just leaving out 10% of the data as test but nothing fits both train and test that well doing it the way I'm doing it

#

I wasn't just going to do average but average/std because just average would lead to the same result as fitting it the way I'm doing now I believe

#

I want a model to consistently perform well across all the segmentations without being necessarily optimal for any one of them

peak thorn Nov 22, 2023, 5:02 PM

#

hi guys, i'm new to ml. can you just tell me the difference between weights, parameters and hyperp... in simple words? the internet is making me more confused...

agile owl Nov 22, 2023, 5:02 PM

#

weights are endogenously fitted by the model

#

hyperparameters are exogenous to the model

#

parameters I believe can refer to both

past meteor Nov 22, 2023, 5:03 PM

#

peak thorn hi guy i m new in ml can you just me difference between weights,parameter and hy...

weights = parameters = coefficients

hyperparameters are the "settings" of your model.

agile owl Nov 22, 2023, 5:06 PM

#

if my question isn't clear happy to clarify btw

serene scaffold Nov 22, 2023, 5:08 PM

#

past meteor weights = parameters = coefficients hyperparameters are the "settings" of your ...

you know wannabe linkedin influencers who post bullshit programming suggestions?
I saw one who referred to keyword arguments of a pandas method as hyperparameters Pepega

past meteor Nov 22, 2023, 5:10 PM

#

serene scaffold you know wannabe linkedin influencers who post bullshit programming suggestions?...

that's just ... I have no words

agile owl Nov 22, 2023, 5:10 PM

#

I like the ones who post poll questions about Python or ML and their answers are WRONG

left tartan Nov 22, 2023, 5:10 PM

#

past meteor that's just ... I have no words

I was trying to find an emoji that expressed my feelings.

past meteor Nov 22, 2023, 5:10 PM

#

agile owl what's the right way to find a model that fits a time series best on average acr...

Time series cross validation exists

agile owl Nov 22, 2023, 5:10 PM

#

I have a lot of ppl in my linkedin feeds posting "Python" or "ML" quizzes and their questions are either totally impractical or the answers are flat out wrong

past meteor Nov 22, 2023, 5:11 PM

#

agile owl I want a model to consistently perform well across all the segmentations without...

https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.TimeSeriesSplit.html

scikit-learn

sklearn.model_selection.TimeSeriesSplit

Examples using sklearn.model_selection.TimeSeriesSplit: Time-related feature engineering L1-based models for Sparse Signals Visualizing cross-validation behavior in scikit-learn

agile owl Nov 22, 2023, 5:11 PM

#

@past meteor right, so this is making a bunch of train and test sets

#

where the test set for each train set comes after the train set

peak thorn Nov 22, 2023, 5:12 PM

#

i still don't get it sorry help me init

agile owl Nov 22, 2023, 5:12 PM

#

I guess it's better than before but I was worried that you might get different results just depending on how you split it up

#

so I wanted the training data to contain everything before a certain date

#

but control for performance on different splits of the train data

#

I guess I'll try this approach

#

and see how it works

past meteor Nov 22, 2023, 5:15 PM

#

agile owl so I wanted the training data to contain everything before a certain date

yes, time series split does this

agile owl Nov 22, 2023, 5:16 PM

#

thx will look into it more

past meteor Nov 22, 2023, 5:16 PM

#

In reality what you do is this:

Decide how large your test set is in %
Find the date that aligns with this %
Split everything before this date into the training set. Everything after into the validation set.
Use time series validation to make N training sets that only contain data before the test sets it's evaluating on.
Do 1 final evaluation on the validation set
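
The steps above can be sketched roughly as follows, assuming scikit-learn and rows already sorted by time. The synthetic data, the 80/20 cut and the Ridge model are placeholders. The mean and spread of the fold scores give a sense of how consistent the model is across periods.

```py
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import TimeSeriesSplit

rng = np.random.default_rng(0)
n = 500
X = rng.normal(size=(n, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=n)

# Steps 1-3: keep the most recent 20% aside for one final evaluation.
cut = int(n * 0.8)
X_dev, y_dev = X[:cut], y[:cut]
X_hold, y_hold = X[cut:], y[cut:]

# Step 4: every fold trains only on rows that precede its test rows.
tscv = TimeSeriesSplit(n_splits=5)
fold_scores = []
for train_idx, test_idx in tscv.split(X_dev):
    model = Ridge(alpha=1.0).fit(X_dev[train_idx], y_dev[train_idx])
    fold_scores.append(model.score(X_dev[test_idx], y_dev[test_idx]))
print("per-fold R^2:", np.round(fold_scores, 3),
      "mean:", np.mean(fold_scores), "std:", np.std(fold_scores))

# Step 5: refit on all earlier data, evaluate once on the held-out block.
final_model = Ridge(alpha=1.0).fit(X_dev, y_dev)
holdout_score = final_model.score(X_hold, y_hold)
print("hold-out R^2:", holdout_score)
```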

agile owl Nov 22, 2023, 5:17 PM

#

gotcha

past meteor Nov 22, 2023, 5:17 PM

#

peak thorn i still don't get it sorry help me init

parameters = "the model", hyper parameters = settings used to create the model

agile owl Nov 22, 2023, 5:19 PM

#

there are some numerical values or boolean values you might need to provide to the model

#

so it performs a certain way when doing its fitting to the data

#

that's another way to think about it

#

the model can't generate it itself because it's an assumption
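
A small sketch of the distinction with a polynomial fit: the degree is a hyperparameter chosen before fitting, while the coefficients are parameters estimated from the data. The data here is synthetic and the names are illustrative.

```py
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 50)
y = 1.0 + 2.0 * x - 3.0 * x**2 + 0.05 * rng.normal(size=x.size)

degree = 2                          # hyperparameter: an assumption you supply
coeffs = np.polyfit(x, y, degree)   # parameters: learned from the data

print(coeffs)  # highest power first
```

Changing `degree` changes what kind of model gets fit; the fitting procedure never picks it on its own.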

serene scaffold Nov 22, 2023, 5:19 PM

#

you can also think of hyperparameters as the parameters of the parameters

#

should have been called metaparameters

agile owl Nov 22, 2023, 5:20 PM

#

yes

serene scaffold Nov 22, 2023, 5:20 PM

#

then facebook will say that only they can do ml

agile owl Nov 22, 2023, 5:20 PM

#

hah

peak thorn Nov 22, 2023, 5:20 PM

#

@past meteor thanks

placid cedar Nov 22, 2023, 5:21 PM

#

hi guys, are there any great ways to improve my CNN model's performance?

#

i have already done data augmentation, regularisation

agile owl Nov 22, 2023, 5:21 PM

#

one common example is like, if you want to regularize the coefficients of a model (push them closer to zero to avoid overfitting) the penalty you apply to the sum of the weights is a hyperparameter

placid cedar Nov 22, 2023, 5:21 PM

#

but my validation accuracy just can't seem to increase

#

always around the 0.72 - 0.77 range

agile owl Nov 22, 2023, 5:22 PM

#

technically the sum of the abs of the weights or the square of the weights
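
A short sketch of the two penalties, assuming scikit-learn's `Ridge` (squared weights) and `Lasso` (absolute weights). `alpha` is the penalty-strength hyperparameter; the data and the value 0.1 are arbitrary.

```py
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([3.0, 0.0, -1.0]) + 0.1 * rng.normal(size=200)

alpha = 0.1  # hyperparameter: how strongly large weights are penalised

ridge = Ridge(alpha=alpha).fit(X, y)   # penalty term: alpha * sum(w ** 2)
lasso = Lasso(alpha=alpha).fit(X, y)   # penalty term: alpha * sum(abs(w))

l2_penalty = alpha * np.sum(ridge.coef_ ** 2)
l1_penalty = alpha * np.sum(np.abs(lasso.coef_))
print(ridge.coef_, l2_penalty)
print(lasso.coef_, l1_penalty)
```

The absolute-value penalty can push an irrelevant weight to exactly zero, while the squared penalty only shrinks it.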

serene scaffold Nov 22, 2023, 5:22 PM

#

placid cedar hi guys, are there any great ways to improve my CNN model's performance?

there's no way to answer that question in general. you have to say what the model does, how you're currently going about training it, how it's currently performing, and some high-level properties of your training data. without at least all that information, people can only give random suggestions that might be useless.

#

!paste please never show code as text

arctic wedgeBOT Nov 22, 2023, 5:23 PM

#

Pasting large amounts of code

If your code is too long to fit in a codeblock in Discord, you can paste your code here:
https://paste.pythondiscord.com/

After pasting your code, save it by clicking the Paste! button in the bottom left, or by pressing CTRL + S. After doing that, you will be navigated to the new paste's page. Copy the URL and post it here so others can see it.

placid cedar Nov 22, 2023, 5:23 PM

#

ah alright

past meteor Nov 22, 2023, 5:24 PM

#

placid cedar hi guys, are there any great ways to improve my CNN model's performance?

https://karpathy.github.io/2019/04/25/recipe/

A Recipe for Training Neural Networks

Musings of a Computer Scientist.

placid cedar Nov 22, 2023, 5:27 PM

#

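from tensorflow.keras import layers, models, optimizers, regularizers  # imports assumed; img_size is defined elsewhere

model_6 = models.Sequential()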

model_6.add(layers.Conv2D(58, (3, 3), activation='relu',
                        input_shape=(img_size, img_size, 3))) # 3 colours rgb
model_6.add(layers.MaxPooling2D((2, 2)))

model_6.add(layers.Conv2D(116, (3, 3), kernel_regularizer=regularizers.l2(0.0001), activation='relu'))
model_6.add(layers.MaxPooling2D((2, 2)))

model_6.add(layers.Conv2D(232, (3, 3), kernel_regularizer=regularizers.l2(0.0001), activation='relu'))
model_6.add(layers.MaxPooling2D((2, 2)))

model_6.add(layers.Conv2D(232, (3, 3), kernel_regularizer=regularizers.l2(0.0001), activation='relu'))
model_6.add(layers.MaxPooling2D((2, 2)))
model_6.add(layers.Dropout(0.5))

model_6.add(layers.Flatten())
model_6.add(layers.Dropout(0.5))

model_6.add(layers.Dense(10, activation='softmax'))

model_6.compile(loss='categorical_crossentropy',
              optimizer=optimizers.Adam(learning_rate= 5e-4),
              metrics=['acc'])

model_6.summary()

#

current number of parameters is 903070

#

the current results

past meteor Nov 22, 2023, 5:28 PM

#

The link I sent you has good tips

#

I could tell you what I do but it's more or less what's in the link

placid cedar Nov 22, 2023, 5:29 PM

#

mmm i see

placid cedar Nov 22, 2023, 5:34 PM

#

placid cedar the current results

should i try to make my model more complex till it reaches an accuracy of above 0.9, then start to regularise?

#

it seems the training accuracy is improving more slowly as the number of epochs increases

past meteor Nov 22, 2023, 5:36 PM

#

placid cedar should i try to make my model more complex till it reaches an accuracy of above ...

it's all in the link 😄

#

Start by looking at the mistakes you're making

agile owl Nov 22, 2023, 5:36 PM

#

@past meteor "Note that unlike standard cross-validation methods, successive training sets are supersets of those that come before them." Does this not bias the weights towards fitting the starting period the best since it is included the most times?

past meteor Nov 22, 2023, 5:37 PM

#

when I was doing comp vision in the past I noticed cases where the model was failing and I was like "damn I couldn't get that right either"

past meteor Nov 22, 2023, 5:38 PM

#

agile owl @past meteor "Note that unlike standard cross-validation methods, succe...

Oh that's an interesting remark

#

I don't think there's anything you can do about this

agile owl Nov 22, 2023, 5:38 PM

#

I guess part of my problem is also that the overfitting is being done by DEAP so I'm not sure how to reincorporate the validation results into the parameters explicitly

#

but I can work on that

#

https://github.com/DEAP/deap

GitHub

GitHub - DEAP/deap: Distributed Evolutionary Algorithms in Python

Distributed Evolutionary Algorithms in Python. Contribute to DEAP/deap development by creating an account on GitHub.

#

I wanted to see if you can use this for online training

#

but wasn't able to find anything

past meteor Nov 22, 2023, 5:40 PM

#

Why DEAP?

agile owl Nov 22, 2023, 5:40 PM

#

because it's for trading strategies and I didn't want to make any assumptions that are necessary for likelihood based methods

#

to be fair part of it was just that I learned about it and thought it was exciting and wanted to explore this approach too

#

not necessarily extremely deliberate

#

but I know other people do this too so there's probably some basis

past meteor Nov 22, 2023, 5:43 PM

#

Evolutionary algorithms are typically a waste of compute imho 😄

agile owl Nov 22, 2023, 5:44 PM

#

why do you say that

past meteor Nov 22, 2023, 5:44 PM

#

Because they are black-box optimization methods that use little to no information of the problem being solved

agile owl Nov 22, 2023, 5:45 PM

#

it also avoids false precision from false assumptions though

past meteor Nov 22, 2023, 5:45 PM

#

Yesn't

#

You still need a fitness function

agile owl Nov 22, 2023, 5:45 PM

#

my thought process behind it is that a lot of people are making unjustified assumptions

#

the fitness function is much easier to think about though

past meteor Nov 22, 2023, 5:46 PM

#

If you have a fitness function that is say the MSE you've devolved into just maximum likelihood
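
A quick numerical check of that point, under the usual assumption of Gaussian noise with a fixed scale: the slope that minimises the MSE is the same one that maximises the Gaussian log-likelihood. The data and the grid of candidate slopes are made up.

```py
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 1.7 * x + 0.3 * rng.normal(size=200)

slopes = np.linspace(0.0, 3.0, 301)   # candidate parameter values
sigma = 0.3                           # assumed fixed noise scale

# One row of residuals per candidate slope.
residuals = y[None, :] - slopes[:, None] * x[None, :]
mse = np.mean(residuals**2, axis=1)
log_lik = np.sum(-0.5 * np.log(2 * np.pi * sigma**2)
                 - residuals**2 / (2 * sigma**2), axis=1)

best_by_mse = slopes[np.argmin(mse)]
best_by_likelihood = slopes[np.argmax(log_lik)]
print(best_by_mse, best_by_likelihood)
```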

agile owl Nov 22, 2023, 5:46 PM

#

just do something like total profit * (total_profit/max_drawdown) or something like that you know what you want to get out of it
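
A minimal sketch of a fitness function along those lines, taking a sequence of per-trade profits. The function names and the no-drawdown convention are assumptions. As written, the product is also positive for a losing strategy, so a sign guard may be worth adding in practice.

```py
import numpy as np

def max_drawdown(equity):
    """Largest peak-to-trough drop of a cumulative profit curve."""
    running_peak = np.maximum.accumulate(equity)
    return float(np.max(running_peak - equity))

def fitness(trade_pnl):
    # Equity curve starts at 0 before the first trade.
    equity = np.concatenate([[0.0], np.cumsum(trade_pnl)])
    total_profit = float(equity[-1])
    dd = max_drawdown(equity)
    if dd == 0:
        return total_profit  # hypothetical convention when there is no drawdown
    return total_profit * (total_profit / dd)

pnl = np.array([5.0, -2.0, 4.0, -6.0, 8.0])
print(fitness(pnl))  # total profit 9, max drawdown 6 -> 13.5
```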

agile cobalt Nov 22, 2023, 5:46 PM

#

agile owl my thought process behind it is that a lot of people are making unjustified assu...

the AI may as well still learn a bunch of unjustified assumptions that only apply to the training data and will not transfer at all to the real world

agile owl Nov 22, 2023, 5:46 PM

#

right

#

that's why I'm trying to avoid overfitting

#

I've realized it's a problem

past meteor Nov 22, 2023, 5:47 PM

#

I did a full course on EA's in uni. Fun stuff, really enjoyed it. Waste of compute for most problems.

agile owl Nov 22, 2023, 5:47 PM

#

where would you say they are best suited then

past meteor Nov 22, 2023, 5:48 PM

#

Multi objective optimization
Combinatorial optimization

#

So actually I'd say: use it

#

It's fun to play around with because then you'll see the limitations more clearly after a while 😄

agile cobalt Nov 22, 2023, 5:50 PM

#

agile owl because it's for trading strategies and I didn't want to make any assumptions th...

one way or the other, just remember that you'll be competing with a few billion dollar companies if you are thinking about algorithmic trading.
Be fully prepared to lose any and all money you throw into it.

echo mesa Nov 22, 2023, 5:52 PM

#

Guys, do you guys know a good paper that explains linear regression from scratch both mathematically and having examples in python with it?

desert oar Nov 22, 2023, 6:01 PM

#

echo mesa Guys, do you guys know a good paper that explains linear regression from scratch...

It sounds like you want a textbook, not a paper. Look up Introduction to Statistical Learning with Python, it should be freely available on the author's site as PDF (although I personally enjoy having hardcopy books when I can)

echo mesa Nov 22, 2023, 6:02 PM

#

desert oar It sounds like you want a textbook, not a paper. Look up Introduction to Statist...

gotcha, thanks

agile owl Nov 22, 2023, 6:07 PM

#

agile cobalt one way or the other, just remember that you'll be competing with a few billion ...

I've worked for billion dollar companies and I see the opportunity for disruption

#

lots of what they do isn't actually that advanced at all

buoyant vine Nov 22, 2023, 6:07 PM

#

no need :P Just need to be faster than you

agile owl Nov 22, 2023, 6:07 PM

#

well exactly

#

I'm not playing that game I'm not trying to be an algo market maker

#

I'm going for superior signals

buoyant vine Nov 22, 2023, 6:08 PM

#

Often they simply can afford what you cannot, and since most of the time they're doing HFT and are as close to the exchange as possible they just snuff out the little guys

agile owl Nov 22, 2023, 6:08 PM

#

it's not HFT to be clear

#

I know that is not a space you want to compete with bigger players

#

momentum and value exist on different timescales though

#

I've been doing strategies based on analyzing the degree of mean reversion via hurst exponents

#

and using deap to fit parameters around how these things are calculated, windows, etc.

#

cutoff levels for different behaviors

#

like if it's trending then go with the trend unless the trend coefficient goes above a certain number then bet against the trend continuing

#

etc.
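
For readers unfamiliar with the Hurst exponent, here is one common rough estimator (not necessarily the one used above): fit the slope of log spread against log lag for lagged differences. Values near 0.5 suggest a random walk, below 0.5 mean reversion, above 0.5 trending. The function name and `max_lag` default are assumptions.

```py
import numpy as np

def hurst_exponent(series, max_lag=50):
    """Estimate H from how the std of lagged differences scales with the lag."""
    lags = np.arange(2, max_lag)
    spread = [np.std(series[lag:] - series[:-lag]) for lag in lags]
    slope, _ = np.polyfit(np.log(lags), np.log(spread), 1)
    return slope

rng = np.random.default_rng(0)
random_walk = np.cumsum(rng.normal(size=5000))
h = hurst_exponent(random_walk)
print(h)  # expected to land near 0.5 for a random walk
```

Window length and `max_lag` are exactly the kind of settings that end up being tuned, which is where overfitting can creep in.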

#

It wasn't clear to me that there's a framework that is meant for fitting parameters in this way, which is why I went for DEAP, because it requires hardly any assumptions

#

you just need to give it the fitness function

jagged hedge Nov 22, 2023, 6:12 PM

#

i have a macbook pro 13 inch M2 with 8GB memory. how can i run LLMs with something like 66B or 100B parameters on it? is there a way to do so in the cloud, like aws or colab pro, and are those really worth it and able to run such big models??

desert oar Nov 22, 2023, 6:12 PM

#

just keep in mind you're not the only person doing this, certainly not the first to have this idea

agile owl Nov 22, 2023, 6:13 PM

#

the reason why trading strategies remain profitable over time is this: constrained behavior

#

lots of the biggest players have constrained or dumb behavior due to operational constraints

#

so you are siphoning money off of mutual funds and pensions and insurance companies that are not as concerned about short term profits

#

and do suboptimal behaviors

#

here's an example

#

the Goldman Sachs Commodities Index

desert oar Nov 22, 2023, 6:14 PM

#

right. if you have a model that seems to work, go for it. just remember that backtesting only gets you so far. it sounds like you have more domain knowledge than the usual person who wanders in here asking about algo trading bots, so maybe you know what you're doing

agile owl Nov 22, 2023, 6:15 PM

#

I have decent knowledge of markets I could get better at algos though

desert oar Nov 22, 2023, 6:15 PM

#

but then i'm sure you know that you're not the only person attempting to profit off of well-known behavior

agile owl Nov 22, 2023, 6:15 PM

#

yeah but it's about money-weighted views not people-weighted views

#

if the people with the most money act like whales then you can be the barnacle

#

or the fish that eat their dead skin

#

if the big companies do things on a scale that such things don't matter to them

#

then you can profit from the inefficiency it creates

#

every monthend they rebalance their portfolios etc.

#

just the fact that they wait until the end of the month and do it at the end of the month every time is an exploitable inefficiency

agile cobalt Nov 22, 2023, 6:18 PM

#

jagged hedge i have a macbook pro 13 inch M2 with 8GB memory how can i run LLMs like 66B para...

locally: I don't think so
(also, remember that RAM memory and GPU memory are two different things)
on the cloud: Yes, but it can get pretty expensive
"are they worth it": depends. If you need to ask, probably not.

desert oar Nov 22, 2023, 6:19 PM

#

agile owl just the fact that they wait until the end of the month and do it at the end of ...

for sure. like i said you seem to know what you're doing, so best of luck

agile owl Nov 22, 2023, 6:19 PM

#

I have some idea, some vision

#

I think saying I know what I'm doing is a stretch 😄

#

I do have an idea of where inefficiencies exist

#

and why they exist

#

but how to exploit them most effectively is what I'm trying to discover

#

if there's a better way to fit parameters for trading strategies than DEAP I'd like to find it too

jagged hedge Nov 22, 2023, 6:21 PM

#

agile cobalt locally: I don't think so (also, remember that RAM memory and GPU memory are two...

yup, i am asking if they are worth it because i am a student trying to test their capabilities and cannot bear monthly costs of 200-300 USD merely to run a model for learning
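
A back-of-the-envelope sketch of why 8 GB is not enough: memory for the weights alone is roughly parameter count times bytes per parameter. Activations and caches add more on top. On Apple silicon the 8 GB is unified memory shared by CPU and GPU. The helper name is made up.

```py
def weight_memory_gb(n_params, bytes_per_param):
    # Weights only; runtime overhead is not included.
    return n_params * bytes_per_param / 1e9

for name, n_params in [("66B", 66e9), ("100B", 100e9)]:
    for fmt, nbytes in [("fp16", 2), ("int8", 1), ("4-bit", 0.5)]:
        print(f"{name} @ {fmt}: ~{weight_memory_gb(n_params, nbytes):.0f} GB")
```

Even at 4 bits per weight, a 66B model needs about 33 GB for the weights, so smaller models are the more realistic option for learning on that machine.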

agile owl Nov 22, 2023, 6:22 PM

#

I just assumed the cost function's topology was too ill-defined to use other techniques

#

but that might not be true

#

things like what combination of weights on moving averages, length of moving average windows, number of lags in hurst exponent calculations, etc.

#

there is no "y"-value

#

in a traditional sense

#

I think what I would use if I weren't using DEAP is reinforcement learning

#

but I still need to learn more about that and it takes more setup

desert oar Nov 22, 2023, 6:25 PM

#

agile owl I just assumed the cost function's topology was too ill-defined to use other tec...

is there a tldr for how your algorithm actually works?

agile owl Nov 22, 2023, 6:25 PM

#

it looks at the alignment of different periodicities of hurst exponents and moving averages

desert oar Nov 22, 2023, 6:25 PM

#

maybe you can adjust or constrain it somehow, or reparameterize it to be something that can be optimized more easily

agile owl Nov 22, 2023, 6:25 PM

#

and decides to go with or against trends

#

and scales bracket orders to conditional volatility

#

and the objective function itself is an accumulation of profit

#

that arises from its actions

#

overfitting is a serious problem though because what will generate the most profit changes over time periods

desert oar Nov 22, 2023, 6:27 PM

#

i don't know what a hurst exponent is, i'm not a finance person myself. but it sounds like you have some kind of iterative thing, where the procedure collects data for some fixed period of time, then takes an action, then repeats?

agile owl Nov 22, 2023, 6:28 PM

#

yes it's iterative

#

it takes an action given a signal that exists at one time

#

and we don't know what the value of that action is until some indeterminate time in the future

#

when it meets an exit criterion

#

whether that is a stop out or a take profit

desert oar Nov 22, 2023, 6:28 PM

#

i see, yeah that definitely sounds like it could be a problem. it sounds like you were on the "high variance" side of the bias-variance tradeoff

agile owl Nov 22, 2023, 6:29 PM

#

yes

desert oar Nov 22, 2023, 6:29 PM

#

first things first, i'm not familiar with the technicalities of these models, but i know there is quite an extensive literature of reinforcement learning for exactly this kind of iterative agent scenario

#

so you first might want to just check to see what already exists to avoid reinventing the wheel

agile owl Nov 22, 2023, 6:30 PM

#

I was avoiding the investment into RL because you have to design the game

#

which is a bit more involved than setting up DEAP

#

but a bullet I will have to bite eventually

desert oar Nov 22, 2023, 6:30 PM

#

but in general, there are two broad approaches in machine learning and statistics to avoid overfitting: reduce the amount of information the model can obtain from the training data set, or try to generate a large collection of realistic synthetic data sets, and average across many model fits on those data sets

agile owl Nov 22, 2023, 6:31 PM

#

first approach seems more feasible here

desert oar Nov 22, 2023, 6:31 PM

#

agile owl I was avoiding the investment into RL because you have to design the game

i believe RL is a broader literature than the stuff we've been seeing in the last few years with "AI plays Mario" type of thing. there are traditional algorithms like the "multi armed bandit" that as far as I know fall under the category of RL
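
For reference, a minimal epsilon-greedy sketch of the multi-armed bandit idea mentioned here. The payout probabilities, `epsilon`, and the function name are made up for illustration; this is a toy, not anything tied to a trading setup.

```python
import random

def epsilon_greedy_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Estimate each arm's payout rate while mostly pulling the best-looking arm."""
    rng = random.Random(seed)
    counts = [0] * len(true_means)
    estimates = [0.0] * len(true_means)
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(len(true_means))  # explore: pick a random arm
        else:
            arm = max(range(len(true_means)), key=lambda i: estimates[i])  # exploit
        # Hypothetical Bernoulli reward: 1 with probability true_means[arm].
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running mean for the pulled arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return counts, estimates

counts, estimates = epsilon_greedy_bandit([0.2, 0.5, 0.8])
best_arm = max(range(len(estimates)), key=lambda i: estimates[i])
print(best_arm, counts)
```

The point is that the estimates keep updating as rewards arrive, which is the "adaptive" flavour of RL that does not require designing a full game.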

agile owl Nov 22, 2023, 6:31 PM

#

one thing I was thinking about doing is actually inverting the relationship between test and train

#

I train on the smaller, later dataset

#

and then see if it performed adequately on the larger one

#

in case we revert to a different historical regime

#

afaik I have no theoretical basis for doing that though

desert oar Nov 22, 2023, 6:32 PM

#

so what are the actual parameters in this model? what is being learned/fitted here?

#

fitting on a smaller data set might just make the overfitting worse, hard to say

agile owl Nov 22, 2023, 6:33 PM

#

trigger criteria, window lengths, and weights that sum to one

#

and a couple scaling values

#

toolbox.attr_fast_period,
toolbox.attr_med_period,
toolbox.attr_slow_period,
toolbox.attr_ma_signal_period,
toolbox.attr_hurst_signal_period,
toolbox.attr_hurst_lags,
toolbox.attr_trend_fast_weight,
toolbox.attr_trend_med_weight,
toolbox.attr_trend_slow_weight,
toolbox.attr_reversion_fast_weight,
toolbox.attr_reversion_med_weight,
toolbox.attr_reversion_slow_weight,
toolbox.attr_reversion_sigma_open,
toolbox.attr_reversion_sigma_close,
toolbox.attr_trend_sigma_open,
toolbox.attr_trend_sigma_close,
toolbox.attr_trend_sigma_stop,
toolbox.attr_reversion_sigma_stop,
toolbox.attr_hurst_kill_period_reversion,
toolbox.attr_hurst_kill_period_trend,
toolbox.attr_hurst_trending_trigger,
toolbox.attr_hurst_reversion_trigger,
toolbox.attr_trend_fundamental_scaler,
toolbox.attr_reversion_fundamental_scaler,
toolbox.attr_fast_fundamental_period,
toolbox.attr_med_fundamental_period,
toolbox.attr_slow_fundamental_period,
toolbox.attr_reversion_fast_fundamental_weight,
toolbox.attr_reversion_med_fundamental_weight,
toolbox.attr_reversion_slow_fundamental_weight,
toolbox.attr_trend_fast_fundamental_weight,
toolbox.attr_trend_med_fundamental_weight,
toolbox.attr_trend_slow_fundamental_weight,
toolbox.attr_hurst_bottomout_trigger,
toolbox.attr_hurst_topout_trigger

#

things like this

desert oar Nov 22, 2023, 6:33 PM

#

i see

agile owl Nov 22, 2023, 6:34 PM

#

it behaves differently in trending and reverting environments so they have independent parameters

desert oar Nov 22, 2023, 6:34 PM

#

are these all or mostly continuous numerical values bounded in some known range? you might want to try something in the category of bayesian blackbox optimization instead of evolutionary algorithm

agile owl Nov 22, 2023, 6:34 PM

#

some of them are ints

#

some of them are real numbers

desert oar Nov 22, 2023, 6:34 PM

#

I don't have much experience with the latter, but I have modest experience with the former for hyperparameter tuning of machine learning models

past meteor Nov 22, 2023, 6:34 PM

#

both of them can be used for the same set of problems

#

Bayesian optimization is sequential by nature

#

EA are made for parallelism

agile owl Nov 22, 2023, 6:35 PM

#

parallelism is strong here

#

the time complexity is high

desert oar Nov 22, 2023, 6:35 PM

#

in bayes opt you can still use parallelism to get better exploration of the parameter space at each step

past meteor Nov 22, 2023, 6:36 PM

#

Bayes opt also tries to be really efficient in the amount of iterations it needs to find the optimum

#

I think bayes opt is closer to exploitation on the exploration - exploitation scale

desert oar Nov 22, 2023, 6:37 PM

#

my thinking is that you might get better regularization out of it, fitting something closer to a smooth curve over the parameter space

#

not sure if that intuition is off

agile owl Nov 22, 2023, 6:38 PM

#

I'll look into it

#

is there a lib you can recommend

desert oar Nov 22, 2023, 6:38 PM

#

optuna and hyperopt in python. i've had good results with the latter specifically

#

i also wonder if you can simplify this somewhat by learning individual sub models instead of trying to optimize the entire thing all at once

agile owl Nov 22, 2023, 6:39 PM

#

I think all of the parameters interact with each other though

desert oar Nov 22, 2023, 6:39 PM

#

for example, if you need to forecast something in order to make a decision, you can fit a separate probabilistic model for that specific thing, and use the distribution of predicted forecasts as input to some decision component

past meteor Nov 22, 2023, 6:39 PM

#

I'm still thinking of the EA vs bayesian opt

agile owl Nov 22, 2023, 6:39 PM

#

like if I add a set of genes to the EA to do something

#

they will change the optimal values of the other genes

past meteor Nov 22, 2023, 6:39 PM

#

If evaluating your f isn't expensive I'd always take EA's

desert oar Nov 22, 2023, 6:39 PM

#

yeah, i'm just thinking of heuristics that might get you closer to something that works well without overfitting, and might be faster/easier to iterate on

desert oar Nov 22, 2023, 6:40 PM

#

past meteor If evaluating your `f` isn't expensive I'd always take EA's

that would be like training your model and evaluating its performance?

#

which in this case doesn't seem to require intensive numerical computation, so maybe it's cheap to evaluate, in which case you can take advantage of the high exploration potential of EA?

#

i'm curious in that case if there are just some parameters you can tune to reduce sensitivity to data

past meteor Nov 22, 2023, 6:41 PM

#

desert oar that would be like training your model and evaluating its performance?

Yes

agile owl Nov 22, 2023, 6:41 PM

#

yeah I have one that scales fundamental signals based on a separate model

#

I tried making it a bool at first

#

and the version where it was nonzero always won

past meteor Nov 22, 2023, 6:41 PM

#

You can turn it into a multi-objective optimization problem and add an objective tangentially related to overfitting if regularization isn't enough

agile owl Nov 22, 2023, 6:42 PM

#

so I changed it to scale freely

#

i see

#

I tried to do that to some extent by making it scale to the overall number/size of transactions in the fitness function

#

but it was insufficient

#

I think the problem is the market switches dynamics

#

over time

#

so either have to figure out a way to segment the time series beforehand using something like an HMM state prediction

#

(but that presumes your HMM has the data that explains the switch in regimes)

#

but that will also produce its own form of overfitting I think

#

where it assumes everything is too much like the other datapoints that are assigned that state

#

i'll look into bayes though thanks everyone

#

btw if you work with time series that exhibit heteroskedasticity or other types of heterogeneous behavior over time I think hurst exponents can be a valuable piece of information for signal processing

#

might be valuable in other domains too

#

it's just a measure of how much something is like brownian noise

#

vs trending against the mean vs reverting to a mean
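
One common way to estimate it, sketched here under the assumption that the spread of lagged differences scales like `lag**H`. The function name and `max_lag` default are illustrative, not taken from any particular library.

```python
import numpy as np

def hurst_exponent(series, max_lag=20):
    """Estimate H from how the spread of lagged differences grows with the lag."""
    series = np.asarray(series, dtype=float)
    lags = np.arange(2, max_lag)
    # For a self-similar series, std(x[t+lag] - x[t]) scales roughly like lag**H.
    spreads = np.array([np.std(series[lag:] - series[:-lag]) for lag in lags])
    # The slope of the log-log fit is the estimate of H.
    slope, _intercept = np.polyfit(np.log(lags), np.log(spreads), 1)
    return slope

rng = np.random.default_rng(0)
random_walk = np.cumsum(rng.standard_normal(10_000))  # brownian-like path
white_noise = rng.standard_normal(10_000)             # strongly mean-reverting

h_walk = hurst_exponent(random_walk)
h_noise = hurst_exponent(white_noise)
print(round(h_walk, 2), round(h_noise, 2))
```

Values near 0.5 look like a random walk, values well below 0.5 suggest mean reversion, and values above 0.5 suggest trending.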

#

https://research.macrosynergy.com/detecting-trends-and-mean-reversion-with-the-hurst-exponent/#:~:text=The Hurst exponent is a single scalar value that indicates,the rate of diffusive behavior.

Macrosynergy Research

Editor

Detecting trends and mean reversion with the Hurst exponent | Macro...

The Hurst exponent is a statistical measure of long-term memory of time series. The existence and form of such memory are of great interest in financial markets, as financial returns are not generally governed by random walks. The Hurst exponent is a single scalar value that indicates if a time series is purely random, trending, […]

#

probably has applications to fraud detection, etc.

#

when something begins to trend that wasn't before

#

like is the number of users registering for your app from north macedonia trending hard

#

although you could probably catch something like that more easily

#

…[The concept] was originally developed in hydrology for the practical matter of determining the optimum size of the dam for the Nile river by Harold Edwin Hurst [but] can help us classify the pattern of time series of prices under a certain time horizon.” [Bui and Ślepaczuk]

desert oar Nov 22, 2023, 6:59 PM

#

in my usual understanding of RL, part of what the algorithm does is adapt dynamically to the environment, for example the multi arm bandit. part of why i suggested looking into that literature is because maybe you could get ideas for how to make your model adaptive, rather than fitting every parameter exactly to whatever happens to be in your historical data, which i suspect will always overfit to some extent

#

or maybe that's what those parameters you showed me are doing, i don't know enough about the models and domain to comment on that

agile owl Nov 22, 2023, 6:59 PM

#

right I wanted to find a way to make this online

#

There is a layer of conditionality that is imposed by my a priori assumptions of what might be good

#

and the DEAP algo results either confirm or reject my hypothesis (at least on the same data)

#

the problem is generalizability which I think online learning would solve

#

at least as well as one could expect to solve it

#

```py
if self.trending_bull:
    self.state_history.append(1)
    fundamental_factor = vol_est * self.fundamental_signal_trend * self.p.trend_fundamental_scaler
    if self.data.close < self.target_trend - self.p.trend_sigma_open * vol_est - fundamental_factor:
        stop_px = self.target_trend - self.p.trend_sigma_stop * vol_est - fundamental_factor
        limit_px = self.target_trend - self.p.trend_sigma_close * vol_est - fundamental_factor
        trade = self.sell_bracket(limitprice=limit_px, stopprice=stop_px)
        self.log(f'TRENDING BULL - ENTRY {trade[0].price} - STOP LOSS {stop_px} - TAKE PROFIT {limit_px}')
```
so for example all of this logic was something I proposed a priori and simply fit the parameters the logic uses using DEAP

#

and I've been iteratively adding to it depending on what improves fitness and what doesn't

#

anything that is under the p attribute is a parameter exposed to DEAP directly

#

an example genome looks something like this

```py
champ = (
    [18, 52, 100, 6, 55, 9, 55, 94, 60, 15, 47, 149, 3.342307816105754, 1.094156334970098, 1.13173368372055, 10,
     -0.7274971304499664, 58, 24, 4, 0.6331356749123428, 0.3652205723791574, 7.9251262018994435, 39, 60, 134, 239,
     3, 38, 57, 123, 97, 9, 0.17026288765418376, 0.7678126560406366]
)
```

#

the ints are either windows or weights I designed to sum to one by taking the sum of multiple parameters as the denominator

#

I figured ints are fine for that since it would be false precision to use floats anyway

#

it's already producing a float at the end

#

where each int is the numerator and the sum is the denominator
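
A small sketch of that numerator/denominator scheme. The function name and the three example integers are illustrative only.

```python
def normalize_weights(int_genes):
    """Turn positive integer genes into float weights that sum to one."""
    total = sum(int_genes)  # the shared denominator
    return [gene / total for gene in int_genes]

# Three hypothetical integer genes, e.g. fast/med/slow weights.
weights = normalize_weights([55, 94, 60])
print(weights, sum(weights))
```

Because only the ratios matter, `[55, 94, 60]` and `[110, 188, 120]` encode the same weights, so the search space has some redundancy the EA has to wade through.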

#

this is one of the fitness functions I played around with to try to reduce overfitting by increasing the number of actions by the agent
fitness = profit * (profit / (max_dd if max_dd else 1)) ** 2 * np.sqrt(no_trades)

#

where no_trades is the number of actions that have been consummated

#

dd = drawdown
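
The same expression wrapped as a function, as a sketch for playing with the numbers. It uses `math.sqrt` instead of `np.sqrt` so it runs without numpy; the argument values are made up.

```python
import math

def fitness(profit, max_dd, no_trades):
    """profit * (profit / max_dd)**2 * sqrt(no_trades); max_dd == 0 is treated as 1."""
    dd = max_dd if max_dd else 1
    return profit * (profit / dd) ** 2 * math.sqrt(no_trades)

print(fitness(profit=10.0, max_dd=5.0, no_trades=4))   # 10 * 2**2 * 2 = 80.0
print(fitness(profit=-10.0, max_dd=5.0, no_trades=4))  # sign of profit is kept: -80.0
```

Note that the squared ratio discards its own sign, so only the leading `profit` factor decides whether a losing genome scores negative.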

#

they get instantiated from these prior distributions:

    toolbox.register("attr_fast_period", random.randint, 10, 51)
    toolbox.register("attr_med_period", random.randint, 15, 101)
    toolbox.register("attr_slow_period", random.randint, 20, 201)
    toolbox.register("attr_hurst_signal_period", random.randint, 1, 101)
    toolbox.register("attr_ma_signal_period", random.randint, 1, 101)
    toolbox.register("attr_hurst_lags", random.randint, 7, 20)
    toolbox.register("attr_trend_fast_weight", random.randint, 1, 100)
    toolbox.register("attr_trend_med_weight", random.randint, 1, 100)
    toolbox.register("attr_trend_slow_weight", random.randint, 1, 100)
    ...
    toolbox.register("attr_fast_fundamental_period", random.randint, 20, 120)
    toolbox.register("attr_med_fundamental_period", random.randint, 60, 240)
    toolbox.register("attr_slow_fundamental_period", random.randint, 180, 360)
    toolbox.register("attr_reversion_fast_fundamental_weight", random.randint, 1, 101)
    toolbox.register("attr_reversion_med_fundamental_weight", random.randint, 1, 101)
    toolbox.register("attr_reversion_slow_fundamental_weight", random.randint, 1, 101)
    toolbox.register("attr_trend_fast_fundamental_weight", random.randint, 1, 101)
    toolbox.register("attr_trend_med_fundamental_weight", random.randint, 1, 101)
    toolbox.register("attr_trend_slow_fundamental_weight", random.randint, 1, 101)
    toolbox.register("attr_hurst_bottomout_trigger", random.uniform, 0.1, 0.3)
    toolbox.register("attr_hurst_topout_trigger", random.uniform, 0.55, 0.8)

#

when I do booleans I use random.choice on a tuple containing 0 and 1

#

the overfitting in practice

#

train

#

test

#

the topline should become more positive and should be large in comparison to the red line beneath it if things are going well

desert oar Nov 22, 2023, 7:48 PM

#

yeah, i think my only suggestion at this point is to look at reinforcement learning to see how they do it

#

it sounds like you have a good conceptual framework, which is probably the most important thing

agile owl Nov 22, 2023, 8:06 PM

#

https://arxiv.org/pdf/1504.08168.pdf I found this on EA overfitting

lapis sequoia Nov 22, 2023, 8:33 PM

#

I am so bored with supervised learning. Need new ideas

#

Do any of you think that putting 3000 hours into python and data stuff is kind of insane for the time span of a single year?

agile cobalt Nov 22, 2023, 8:37 PM

#

3000 hours in a year sounds pretty insane for anything at all

torpid quartz Nov 22, 2023, 8:38 PM

#

Anyone got practical ML/AI resources for beginners? Something that shows the process of making an ML project and doesn’t get too deep into math.

lapis sequoia Nov 22, 2023, 8:38 PM

#

agile cobalt 3000 hours in a year sounds pretty insane for anything at all

Why?

torpid quartz Nov 22, 2023, 8:39 PM

#

My mind kind of blanks when I look at a math formula

agile cobalt Nov 22, 2023, 8:39 PM

#

lapis sequoia Why?

that's more than 8 hours per day every single day?

lapis sequoia Nov 22, 2023, 8:39 PM

#

Yes.

#

There was a time period for 3 months straight where I would program 14 hours a day every single day and go through dataset after dataset.

agile cobalt Nov 22, 2023, 8:41 PM

#

torpid quartz Anyone got practical ML/AI resources for beginners? Something that shows the pro...

machine learning is literally built upon math/statistics
you'll have to get used to math to some extent if you want to build machine learning models

#

is your issue just the notation, or the math itself?

torpid quartz Nov 22, 2023, 8:43 PM

#

agile cobalt is your issue just the notation, or the math itself?

Mostly the notation. A lot of ML math seems to be pretty simple, but the formulas have all sorts of weird symbols and stuff. I guess I’m not used to them

#

I dunno calculus though.

lapis sequoia Nov 22, 2023, 8:43 PM

#

Dude, math is a very vast thing. The most vast thing ever

lapis sequoia Nov 22, 2023, 8:43 PM

#

torpid quartz I dunno calculus though.

learn it

#

it is not that bad

agile cobalt Nov 22, 2023, 8:44 PM

#

you could try keeping a reference sheet with the meaning of common symbols, or just spend some time properly learning it, but you'll have to get used to it one way or the other

torpid quartz Nov 22, 2023, 8:45 PM

#

agile cobalt you could try keeping a reference sheet with the meaning of common symbols, or j...

K ig

lapis sequoia Nov 22, 2023, 8:46 PM

#

@torpid quartz What is the highest math course you have ever taken?

torpid quartz Nov 22, 2023, 8:46 PM

#

lapis sequoia @torpid quartz What is the highest math course you have ever taken?

Geometry

lapis sequoia Nov 22, 2023, 8:46 PM

#

ok, not bad

#

take trig, then calc

#

If you want. I took calc1-3 before I touched stats

torpid quartz Nov 22, 2023, 8:47 PM

#

I don’t think trig is a course where I am

lapis sequoia Nov 22, 2023, 8:47 PM

#

which, I do not know about

lapis sequoia Nov 22, 2023, 8:47 PM

#

torpid quartz I don’t think trig is a course where I am

What kind of ML are you trying to do?

torpid quartz Nov 22, 2023, 8:48 PM

#

lapis sequoia What kind of ML are you trying to do?

Idk, I think computer vision is pretty cool

lapis sequoia Nov 22, 2023, 8:48 PM

#

Like I took insane grad level optimization classes and I barely use scipy.optimize

lapis sequoia Nov 22, 2023, 8:48 PM

#

torpid quartz Idk, I think computer vision is pretty cool

just find a way to do it

torpid quartz Nov 22, 2023, 8:49 PM

#

The tutorials online only seem to touch the surface of opencv and CNNs

lapis sequoia Nov 22, 2023, 8:49 PM

#

Ok, what have you done so far in ML?

torpid quartz Nov 22, 2023, 8:50 PM

#

Uhh… train a decision tree with scikit learn and train on the MNIST dataset

#

Basically nothing

#

Bit of face detection, but not with a NN

lapis sequoia Nov 22, 2023, 8:51 PM

#

what do you use the most for ML?

torpid quartz Nov 22, 2023, 8:52 PM

#

Huh? Like what language?

#

I know a bit of scikit learn I guess

lapis sequoia Nov 22, 2023, 8:53 PM

#

torpid quartz Huh? Like what language?

yes

torpid quartz Nov 22, 2023, 8:53 PM

#

I know python and rust pretty well in terms of general programming

#

Bit of c++, bit of Haskell, bit of ts

lapis sequoia Nov 22, 2023, 8:54 PM

#

just do whatever you want. No one is stopping you

torpid quartz Nov 22, 2023, 8:55 PM

#

Ok I guess

#

I’m thinking of if there’s a way to recognize hand gestures using ML

lapis sequoia Nov 22, 2023, 8:56 PM

#

just do it

#

that is it

#

you just do it

torpid quartz Nov 22, 2023, 8:56 PM

#

With… what?

#

I’ve jumped headfirst into stuff before, but I have no idea how to even start this

lapis sequoia Nov 22, 2023, 8:58 PM

#

Straight up, I would get more comfortable with basic stuff before jumping in

torpid quartz Nov 22, 2023, 9:03 PM

#

lapis sequoia Straight up, I would get more comfortable with basic stuff before jumping in

Basic stuff as in math, or concepts, or simple algorithms that aren’t NNs?

lapis sequoia Nov 22, 2023, 9:05 PM

#

I do not know, you are confusing me.

torpid quartz Nov 22, 2023, 9:07 PM

#

Ok sorry

agile owl Nov 22, 2023, 9:12 PM

#

lapis sequoia I am so bored with supervised learning. Need new ideas

try iterative agents 🙂

#

reinforcement learning

past meteor Nov 22, 2023, 9:12 PM

#

torpid quartz Anyone got practical ML/AI resources for beginners? Something that shows the pro...

kaggle.com is good

agile owl Nov 22, 2023, 9:13 PM

#

reinforcement learning I think addresses a much more interesting class of problems than supervised learning

#

not "what should the next value be" or "what class does this individual belong to" but "how should I behave over time to optimize some metric"

#

I haven't gotten into it as much as I'd like myself

iron basalt Nov 22, 2023, 9:37 PM

#

lapis sequoia Do any of you think that putting 3000 hours into python and data stuff is kind of in...

Not insane, just probably not something most people have the time to do.

lapis sequoia Nov 22, 2023, 9:40 PM

#

agile owl reinforcement learning

No

agile owl Nov 22, 2023, 9:40 PM

#

ok

lapis sequoia Nov 22, 2023, 9:40 PM

#

Not now at least

umbral charm Nov 22, 2023, 9:52 PM

#

import numpy as np
import matplotlib.pyplot as plt
#do not truncate
np.set_printoptions(threshold=np.inf)
x, y, z  = np.loadtxt(fname = 'data.csv', unpack = True, delimiter = ',', skiprows = 1,) #load data
X, Y = np.meshgrid(x, y)
Z = (2/5)*np.e**(-X**2/2) + (2/5)*np.e**(-Y**2/2) - (3/5)
r = (Z - z)**2
print(np.max(r))
print(np.min(r))
print(np.where(r == 1.7784100967330392e-16))
print(r[107, 107])

#

this returns

#

0.2544509833800965
1.7784100967330392e-16
(array([ 4, 36], dtype=int64), array([107, 107], dtype=int64))
0.0015725117183072214

#

how come i cant find what index 1.77....E-16 is

desert oar Nov 22, 2023, 9:54 PM

#

umbral charm ```py import numpy as np import matplotlib.pyplot as plt #do not truncate np.set...

!d numpy.argmin

arctic wedgeBOT Nov 22, 2023, 9:54 PM

#

numpy.argmin

```py
numpy.argmin(a, axis=None, out=None, *, keepdims=<no value>)
```
Returns the indices of the minimum values along an axis.

desert oar Nov 22, 2023, 9:55 PM

#

!e ```python
import numpy as np
x = np.array([0, -1e-16, 2, 4, 6, 8])
i_min = np.argmin(x)
print((i_min, x[i_min]))
```

umbral charm Nov 22, 2023, 9:55 PM

#

desert oar !d numpy.argmin

Would this work

arctic wedgeBOT Nov 22, 2023, 9:55 PM

#

@desert oar :white_check_mark: Your 3.12 eval job has completed with return code 0.

(1, -1e-16)

umbral charm Nov 22, 2023, 9:55 PM

#

on 2d arrays

desert oar Nov 22, 2023, 9:56 PM

#

umbral charm on 2d arrays

yes, but check the docs. the axis= argument controls how it works on arrays of > 1 dimension

umbral charm Nov 22, 2023, 9:56 PM

#

desert oar yes, but check the docs. the `axis=` argument controls how it works on arrays of...

will do

#

So what was wrong with np.where?

desert oar Nov 22, 2023, 9:57 PM

#

umbral charm So what was wrong with np.where?

nothing, but read the docs carefully to see what that output means

umbral charm Nov 22, 2023, 9:57 PM

#

desert oar nothing, but read the docs carefully to see what that output means

Yep

desert oar Nov 22, 2023, 9:57 PM

#

well, the problem is that you're looking for exact floating point equality, which is squirrely

umbral charm Nov 22, 2023, 9:57 PM

#

just realised (array([ 4, 36], dtype=int64), array([107, 107], dtype=int64)) this means it's at index [4, 107]
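
A note on reading that `np.where` output: it returns one index array per axis, and the arrays pair up element by element. A small sketch on made-up data (the array `r` below is a toy stand-in, not the data from the question) suggests the output describes two positions, and shows `np.isclose` as a way to avoid matching on an exact float literal:

```python
import numpy as np

# Toy 2D array with the same minimum planted at two positions.
r = np.ones((40, 110))
r[4, 107] = 0.0
r[36, 107] = 0.0

# np.where returns (row_indices, col_indices); zip them to get coordinates.
rows, cols = np.where(r == r.min())
matches = [(int(i), int(j)) for i, j in zip(rows, cols)]
print(matches)

# Tolerance-based match instead of exact float equality.
close = np.argwhere(np.isclose(r, r.min()))
print(close)
```

Comparing against `r.min()` rather than a pasted literal sidesteps any rounding in the printed value.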

desert oar Nov 22, 2023, 9:57 PM

#

argmin is going to be less fussy

#

also, the axis keyword is a bit funky. it tells you which axis/dimension is "consumed" by the operation. so axis=0 means that it will find the argmin by "consuming" the 0th (outermost) axis, returning a result with the other axes intact

#

!e ```python
import numpy as np
x = np.arange(9).reshape((3, 3))
print(np.argmin(x, axis=0))
```

arctic wedgeBOT Nov 22, 2023, 9:58 PM

#

@desert oar :white_check_mark: Your 3.12 eval job has completed with return code 0.

[0 0 0]

desert oar Nov 22, 2023, 9:59 PM

#

!e ```python
import numpy as np
import numpy.random

x_flat = np.arange(3*4*5)
np.random.shuffle(x_flat)

x = x_flat.reshape((3, 4, 5))

print(np.argmin(x, axis=-1))
```

etc.

arctic wedgeBOT Nov 22, 2023, 9:59 PM

#

@desert oar :white_check_mark: Your 3.12 eval job has completed with return code 0.

[[4 0 4 1]
 [1 4 1 1]
 [3 1 1 0]]

desert oar Nov 22, 2023, 10:00 PM

#

(oops, it doesn't like argmin over multiple axes... TIL)
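
For the position of the overall minimum in a multi-dimensional array, one option is to take the flat `argmin` and convert it back with `np.unravel_index`. A minimal sketch on a made-up array:

```python
import numpy as np

x = np.arange(3 * 4 * 5).reshape((3, 4, 5))
x[1, 2, 3] = -1                      # plant a unique minimum

flat_index = np.argmin(x)            # axis=None searches the flattened array
index = np.unravel_index(flat_index, x.shape)  # back to (i, j, k) coordinates
print(flat_index, index, x[index])
```

The same pattern works for `np.argmax`.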

umbral charm Nov 22, 2023, 10:00 PM

#

Im guessing argmax

#

is also a thing

desert oar Nov 22, 2023, 10:00 PM

#

indeed, but it's better to check the docs than guess 😉

#

https://numpy.org/doc/stable/reference/routines.html

agile owl Nov 22, 2023, 10:55 PM

#

@past meteor what should the number of generations and size of the population be functions of when deciding those hyperparameters for EA? the crossover prob and mutation probs seem like pure shots in the dark but I imagine that you can reason about what you want here. Specifically, I'm wondering if having too many generations leads to overfitting.

#

seems like it should to me

#

seems like NGEN should be picked relative to the size of the dataset and number of codons in an individual but not sure if there's some rule of thumb here

lapis sequoia Nov 22, 2023, 11:53 PM

#

What IDE do you guys use?

agile owl Nov 23, 2023, 12:19 AM

#

pycharm

#

I'm thinking one way to get around this problem is exponential decay of profit so that profits from a long time ago add less to the fitness
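
A sketch of what that decay could look like, assuming one profit number per trade in chronological order. The half-life parameter and function name are hypothetical choices, not part of the existing fitness code.

```python
import numpy as np

def decayed_profit(trade_profits, half_life=50):
    """Sum per-trade profits; the newest trade has weight 1.0, halving every `half_life` trades back."""
    profits = np.asarray(trade_profits, dtype=float)
    age = np.arange(len(profits))[::-1]   # 0 for the newest trade
    weights = 0.5 ** (age / half_life)
    return float(np.sum(weights * profits))

old_win = decayed_profit([100.0] + [0.0] * 100)   # win happened 100 trades ago
new_win = decayed_profit([0.0] * 100 + [100.0])   # win is the latest trade
print(old_win, new_win)
```

With a half-life of 50 trades, a win from 100 trades ago counts a quarter as much as the same win today.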

#

the other idea I had is to use two fitness functions, one that calculates a reduced result over several segments and one that operates just on the final test

#

which isn't ideal but it seems to be a very direct way to get the existing algorithm to do what i want

agile owl Nov 23, 2023, 12:50 AM

#

```py
        inner_fitnesses.append(inner_fitness)
    fitness = np.median(inner_fitnesses)
```
let's see where this black magic leads

#

it just werks

#

that's the other nice thing about EAs

#

I can do whatever I want and it's up to me to decide whether it makes sense or I like it

#

I can make it more conservative just by switching out median for min or some percentile below 50
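
A sketch of that idea with the reducer made swappable. The per-segment fitness here is just a sum of made-up profits as a placeholder; the real per-segment calculation is not shown in this thread.

```python
import numpy as np

def segmented_fitness(profits, n_segments=5, reducer=np.median):
    """Score each time segment separately, then reduce the scores to one number."""
    segments = np.array_split(np.asarray(profits, dtype=float), n_segments)
    inner_fitnesses = [seg.sum() for seg in segments]  # placeholder per-segment fitness
    return float(reducer(inner_fitnesses))

profits = [1, 1, 1, 1, 50, 50, 1, 1, 1, 1]  # one lucky stretch in the middle

median_fit = segmented_fitness(profits)
worst_fit = segmented_fitness(profits, reducer=np.min)
p25_fit = segmented_fitness(profits, reducer=lambda v: np.percentile(v, 25))
print(median_fit, worst_fit, p25_fit, sum(profits))
```

The lucky stretch dominates the plain total but barely moves the median, which is the reason the reported numbers get smaller.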

#

the numbers are so much smaller it makes me sad but that's proof it's working

#

it takes 90 minutes across 16 high end Ryzen mobile cores from 2021

#

idk if that's considered expensive to you or not for EA

lapis sequoia Nov 23, 2023, 3:09 AM

#

why is there like a facebook link under one of my repos?

cold osprey Nov 23, 2023, 3:35 AM

#

lapis sequoia why is there like a facebook link under one of my repos?

wdym lel

lapis sequoia Nov 23, 2023, 3:38 AM

#

cold osprey Nov 23, 2023, 3:42 AM

#

wheres that

lapis sequoia Nov 23, 2023, 4:35 AM

#

Under traffic

agile owl Nov 23, 2023, 4:42 AM

#

I ended up finding a way to reduce the variance by fitting two separate models based on an innate state segmentation of the training data (i.e., whether the central bank is raising interest rates or not as opposed to doing some latent state model)

#

I also did the segmentation into different time periods and calculating the median fitness thing

#

I consider it a workable solution for now but will probably follow up on the other suggestions you guys made here

rugged comet Nov 23, 2023, 5:14 AM

#

loud plaza print("Hi")

?

rugged comet Nov 23, 2023, 5:16 AM

#

rugged comet I am writing a DecisionTreeClassifer from scratch for fun and to help me underst...

Also bumping my question.

agile owl Nov 23, 2023, 5:24 AM

#

is the idea that you want to write one that will be plug and play with the rest of scikit learn

#

or just a decision tree classifier in general

#

because I think that class name is specifically one from sklearn isn't it

#

the decision tree algorithm in general there's a lot of resources

rugged comet Nov 23, 2023, 5:26 AM

#

So sklearn already has a DecisionTreeClassifier. I'm trying to create my own without looking at sklearn's source code.
Just a decision tree classifier in general.

agile owl Nov 23, 2023, 5:26 AM

#

https://towardsdatascience.com/entropy-how-decision-trees-make-decisions-2946b9c18c8

Medium

Entropy: How Decision Trees Make Decisions

You’re a Data Scientist in training. You’ve come a long way from writing your first line of Python or R code. You know your way around…

#

this might be helpful

#

I was just asking because oftentimes people want to create custom versions of library classes and want them to play nicely with the rest of the library

#

which is a lot harder to do than just writing something

rugged comet Nov 23, 2023, 5:27 AM

#

agile owl I was just asking because oftentimes people want to create custom versions of li...

That shouldn't be necessary.

agile owl Nov 23, 2023, 5:30 AM

#

I skimmed it and it seems as good as any other discussion of how DTs work inside

rugged comet Nov 23, 2023, 5:30 AM

#

agile owl https://towardsdatascience.com/entropy-how-decision-trees-make-decisions-2946b9c...

From a quick glance-over, I don't think this goes into how model weights/decisions are stored within the object.

agile owl Nov 23, 2023, 5:31 AM

#

well part of the joy of writing custom classes is you can decide how to do that yourself

rugged comet Nov 23, 2023, 5:31 AM

#

ah

agile owl Nov 23, 2023, 5:32 AM

#

if you want to see how sklearn does it you can try to go into their source code but I'm sure that it's an implementation of a base class and you'll have to jump all over to find the full picture

#

sounded like you had an interesting idea why not go for it and see what works and what doesn't

rugged comet Nov 23, 2023, 5:38 AM

#

I'm thinking about how to store the decision at each node in the tree. Writing something like feature < threshold for a continuous column wouldn't work because it just gets evaluated and doesn't save the condition itself.

#

Perhaps I could store the operator and the operands separately.

iron basalt Nov 23, 2023, 5:40 AM

#

rugged comet I'm thinking about how to store the decision at each node in the tree. Writing s...

You can do something cute here in Python and use operator overloading.

rugged comet Nov 23, 2023, 5:43 AM

#

Wouldn't I have to overload the operator in whatever class the feature is and whatever class the threshold is? Or maybe I could create a new Decision class...

iron basalt Nov 23, 2023, 5:43 AM

#

rugged comet Wouldn't I have to overload the operator in whatever class the feature is and wh...

Yes.

rugged comet Nov 23, 2023, 5:44 AM

#

iron basalt Yes.

Yes to making a new class or yes to overloading in the operands' classes?

iron basalt Nov 23, 2023, 5:44 AM

#

rugged comet Yes to making a new class or yes to overloading in the operands' classes?

Classes and overloads, but you don't really need that, it's just so you can do this fancy thing of writing feature < threshold in Python.
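
A rough sketch of the operator-overloading version, where comparing a `Feature` to a number builds a stored `Decision` instead of a bool. The class names, the `evaluate` method, and the `petal_length` column are all hypothetical.

```python
import operator
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Decision:
    """A stored comparison: which feature, which operator, which threshold."""
    feature: str
    op: Callable[[float, float], bool]
    threshold: float

    def evaluate(self, row):
        # Apply the stored comparison to one sample (a dict of feature values).
        return self.op(row[self.feature], self.threshold)

class Feature:
    def __init__(self, name):
        self.name = name

    # Overloaded comparisons return a Decision rather than evaluating anything.
    def __lt__(self, threshold):
        return Decision(self.name, operator.lt, threshold)

    def __ge__(self, threshold):
        return Decision(self.name, operator.ge, threshold)

rule = Feature("petal_length") < 2.5
print(rule)
print(rule.evaluate({"petal_length": 1.4}))
```

The closure and lambda approaches store the same information; this one just makes the node readable and inspectable.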

rugged comet Nov 23, 2023, 5:45 AM

#

If I don't need that, I can't think of another way to save a conditional statement without it getting evaluated.

iron basalt Nov 23, 2023, 5:50 AM

#

!e ```py
def decision(a, b):
    def foo():
        return a < b
    return foo

d = decision(10, 20)
print(d())
```

arctic wedgeBOT Nov 23, 2023, 5:50 AM

#

@iron basalt :white_check_mark: Your 3.12 eval job has completed with return code 0.

True

rugged comet Nov 23, 2023, 5:52 AM

#

Yes, that's what I was thinking for making a new class, more or less.

iron basalt Nov 23, 2023, 5:56 AM

#

!e ```py
import ast
print(ast.dump(ast.parse("x < y"), indent=2))
```

arctic wedgeBOT Nov 23, 2023, 5:56 AM

#

@iron basalt :white_check_mark: Your 3.12 eval job has completed with return code 0.

001 | Module(
002 |   body=[
003 |     Expr(
004 |       value=Compare(
005 |         left=Name(id='x', ctx=Load()),
006 |         ops=[
007 |           Lt()],
008 |         comparators=[
009 |           Name(id='y', ctx=Load())]))],
010 |   type_ignores=[])

rugged comet Nov 23, 2023, 5:57 AM

#

arctic wedge <@119925597395877889> :white_check_mark: Your 3.12 eval job has completed with r...

Can you explain what's going on here?

iron basalt Nov 23, 2023, 5:57 AM

#

rugged comet Can you explain what's going on here?

Directly using Python's parser to parse the code into an abstract syntax tree.
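
A rough sketch of how that syntax tree could be turned into a stored decision, assuming conditions always have the form `<name> <op> <constant>`. The `parse_condition` helper and the `OPS` table are illustrative names, not part of the `ast` module.

```python
import ast
import operator

# Map AST comparison node types to the functions that perform them.
OPS = {
    ast.Lt: operator.lt,
    ast.LtE: operator.le,
    ast.Gt: operator.gt,
    ast.GtE: operator.ge,
    ast.Eq: operator.eq,
    ast.NotEq: operator.ne,
}


def parse_condition(source: str):
    """Turn 'name < 3.5' into (name, operator function, constant)."""
    node = ast.parse(source, mode="eval").body
    if not (isinstance(node, ast.Compare) and len(node.ops) == 1):
        raise ValueError("expected a single comparison")
    left, right = node.left, node.comparators[0]
    if not (isinstance(left, ast.Name) and isinstance(right, ast.Constant)):
        raise ValueError("expected '<name> <op> <constant>'")
    return left.id, OPS[type(node.ops[0])], right.value


feature, op, threshold = parse_condition("age < 25")
print(feature, threshold, op(19, threshold))
```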

rugged comet Nov 23, 2023, 6:01 AM

#

Something like this perhaps

class Decision():
    def __init__(self, left_operand, right_operand, operator):
        self.left_operand = left_operand
        self.right_operand = right_operand
        self.operator = operator

    def evaluate(self):
        return self.left_operand.operator(self.right_operand)

iron basalt Nov 23, 2023, 6:01 AM

#

rugged comet Something like this perhaps ```py class Decision(): def __init__(self, left_...

Works too, you have a bunch of options.

#

```py
op(left, right)
```

rugged comet Nov 23, 2023, 6:03 AM

#

d = Decision(7, 8, __eq__)

It's saying the __eq__ isn't defined.
Oh, probably because it's a method.

iron basalt Nov 23, 2023, 6:05 AM

#

Can use lambda here too.

rugged comet Nov 23, 2023, 6:05 AM

#

That's a good idea.

iron basalt Nov 23, 2023, 6:07 AM

#

!e ```py
less = lambda x, y: x < y
print(less(10, 20))
```

arctic wedgeBOT Nov 23, 2023, 6:07 AM

#

@iron basalt :white_check_mark: Your 3.12 eval job has completed with return code 0.

True

rugged comet Nov 23, 2023, 6:07 AM

#

class Decision():
    def __init__(self, left_operand, right_operand, operator):
        self.left_operand = left_operand
        self.right_operand = right_operand
        self.operator = operator

    def evaluate(self):
        return self.operator(self.left_operand, self.right_operand)

d = Decision(7, 8, lambda x, y: x == y)
print(d.evaluate())

False

#

I think this will work but it feels kind of weird.

#

Using a function to evaluate an operator.

iron basalt Nov 23, 2023, 6:09 AM

#

Yup, and if you prefer __call__ instead of evaluate.

#

This is functional programming. It's cumbersome in Python since it does not support functional programming well, but it works.
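
A minimal sketch of the `__call__` variant, keeping the `Decision` signature from the earlier snippet. Swapping the lambda for the standard library's `operator` module is an addition here, not something from the chat; `operator.eq` is the plain-function form of `==`, which is roughly what the bare `__eq__` attempt was reaching for.

```python
import operator


class Decision:
    """Stores two operands and a comparison, evaluated when called."""

    def __init__(self, left_operand, right_operand, op):
        self.left_operand = left_operand
        self.right_operand = right_operand
        self.op = op

    def __call__(self):
        # Calling the instance runs the stored comparison.
        return self.op(self.left_operand, self.right_operand)


d = Decision(7, 8, operator.eq)
print(d())
```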

rugged comet Nov 23, 2023, 6:10 AM

#

I didn't know __call__ existed. Thanks.
I guess this is rather new to me.

iron basalt Nov 23, 2023, 6:16 AM

#

rugged comet I didn't know `__call__` existed. Thanks. I guess this is rather new to me.

Haskell example: ```haskell
ghci> add a b = a + b
ghci> add 10 20
30
ghci> add_five x = add x 5
ghci> add_five 10
15
ghci> add_ten_and_five = add_five 10
ghci> add_ten_and_five
15
```

rugged comet Nov 23, 2023, 6:17 AM

#

Neat.

iron basalt Nov 23, 2023, 6:18 AM

#

https://en.wikipedia.org/wiki/Currying

Currying

In mathematics and computer science, currying is the technique of translating the evaluation of a function that takes multiple arguments into evaluating a sequence of functions, each with a single argument. For example, currying a function f that takes three arguments creates a nested ...
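
For comparison, a small Python sketch of the same idea. `functools.partial` fixes some arguments of an existing function, while `curried_add` (a hypothetical name) is a hand-written curried form that takes one argument per call.

```python
from functools import partial


def add(a, b):
    return a + b


# partial() fixes the first argument and returns a one-argument function.
add_five = partial(add, 5)
print(add_five(10))


def curried_add(a):
    # Hand-written curried form: one argument per call.
    def inner(b):
        return a + b
    return inner


print(curried_add(10)(20))
```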

desert oar Nov 23, 2023, 6:37 AM

#

rugged comet If I don't need that, I can't think of another way to save a conditional stateme...

you don't need to save the entire conditional statement in a binary decision tree, you just need to save the split point

rugged comet Nov 23, 2023, 6:38 AM

#

I think I'm making an n-ary decision tree.

desert oar Nov 23, 2023, 6:38 AM

#

with arbitrary n?

rugged comet Nov 23, 2023, 6:38 AM

#

Yeah

desert oar Nov 23, 2023, 6:39 AM

#

i'd say even then you probably can just store the split points in a tuple/list, no?
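
A minimal sketch of what storing only the split points could look like for an n-ary split on a numeric feature. The cut values and the `child_index` helper are made up for illustration; `bisect_right` from the standard library finds which interval a value falls into.

```python
from bisect import bisect_right

# Hypothetical node: splits "age" into 4 children at 3 cut points.
split_points = [18, 35, 60]


def child_index(value, cuts):
    """Return which child (0..len(cuts)) a value falls into."""
    return bisect_right(cuts, value)


for age in (10, 18, 40, 75):
    print(age, "->", child_index(age, split_points))
```

With this layout a value equal to a cut point goes to the child on the right of that cut; `bisect_left` would send it left instead.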

rugged comet Nov 23, 2023, 6:39 AM

#

Can you elaborate on what you mean by "split points"?

desert oar Nov 23, 2023, 6:39 AM

#

which decision tree algorithm are you implementing?

rugged comet Nov 23, 2023, 6:40 AM

#

It's a classification one if that's what you're asking. If not, then I don't know.

desert oar Nov 23, 2023, 6:41 AM

#

respectfully, i recommend clarifying the actual algorithm you want to implement before trying to implement it

#

there are standard algorithms for decision trees in ML

rugged comet Nov 23, 2023, 6:41 AM

#

https://www.youtube.com/watch?v=_L39rN6gz7Y
For the most part, I'm trying to do this. But with multiple classes instead of two classes.

YouTube

StatQuest with Josh Starmer

Decision and Classification Trees, Clearly Explained!!!

Decision trees are part of the foundation for Machine Learning. Although they are quite simple, they are very flexible and pop up in a very wide variety of situations. This StatQuest covers all the basics and shows you how to create a new tree from scratch, one step at a time.

NOTE: This is an updated and revised version of the Decision Tree St...

▶ Play video

desert oar Nov 23, 2023, 6:42 AM

#

i see. let me at least skim the video to see what they're presenting

rugged comet Nov 23, 2023, 6:42 AM

#

Okay.

desert oar Nov 23, 2023, 6:44 AM

#

this looks very much like a binary decision tree

#

i think you might be confused between the arity of the tree and the number of classes being predicted

rugged comet Nov 23, 2023, 6:44 AM

#

Can I have a binary decision tree with 3+ classes?

#

I was thinking about that. For creating the split points, I could do "if it is this class, go to the left. all other classes go to the right."

desert oar Nov 23, 2023, 6:45 AM

#

sure. the class score at each node is just the % of data points with that class in the node
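
As a quick illustration of that class score with three classes, here is a sketch with invented labels. `class_scores` is a hypothetical helper that returns the fraction of a node's rows in each class; the leaf then predicts the class with the largest fraction.

```python
from collections import Counter


def class_scores(labels):
    """Fraction of the node's samples belonging to each class."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: n / total for cls, n in counts.items()}


# Hypothetical labels of the training rows that ended up in one leaf.
leaf_labels = ["cat", "dog", "cat", "bird", "cat"]
scores = class_scores(leaf_labels)
print(scores)
print(max(scores, key=scores.get))  # predicted class for this leaf
```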

#

well you can't use the classes to create the split points... otherwise, how could you make predictions on new data?

rugged comet Nov 23, 2023, 6:47 AM

#

I oversimplified a little bit. You would still determine which class is the best predictor for each split point using the Gini Impurity.

desert oar Nov 23, 2023, 6:48 AM

#

i don't follow, sorry

#

let's reserve the term "class" for the thing we are trying to predict - the output of the model

#

and let's use the term "category" for the inputs to the model

rugged comet Nov 23, 2023, 6:52 AM

#

desert oar i don't follow, sorry

So to build the tree, you determine which category gives the best split, i.e. the lowest Gini impurity. Then that category becomes the root. You send all rows for which the condition is True to the left and the rest to the right. With your new set of rows, for the left, you again determine the category that gives the best split. You do the same for the right. You continue until all leaves are pure, meaning they only contain samples of one class.

desert oar Nov 23, 2023, 7:02 AM

#

rugged comet So to build the tree, you determine which category gives the best split, the low...

sure. you also need to consider that if you have multiple features, you need to choose the best category to split on from among all features

rugged comet Nov 23, 2023, 7:03 AM

#

desert oar sure. you also need to consider that if you have multiple features, you need to ...

How do you differentiate feature and category?

desert oar Nov 23, 2023, 7:03 AM

#

rugged comet How do you differentiate feature and category?

i'm admittedly surprised you're asking this question because i feel like you've been doing machine learning related things for a while

#

a feature is a column in the input data. a category is... a category. e.g. a feature might be "eye color" and some categories of eye color might be "blue" and "brown"

rugged comet Nov 23, 2023, 7:04 AM

#

I was just confirming.

desert oar Nov 23, 2023, 7:05 AM

#

got it. i'm not sure if you're learning this material in english or another language

#

and you don't need to keep splitting until the tree is perfectly pure. scikit-learn for example provides several stopping criteria

#

e.g. you can refuse to split any leaf that's below a certain size. or you can refuse to split any leaf where the purity gain is lower than some threshold.

#

or you can refuse to split beyond a certain maximum depth
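
Those three stopping rules could be sketched as a single check. The function and its parameter names are hypothetical; in scikit-learn the corresponding `DecisionTreeClassifier` options are `min_samples_split` / `min_samples_leaf`, `min_impurity_decrease` and `max_depth`.

```python
def should_stop(n_samples, depth, best_gain,
                min_samples_split=2, max_depth=None, min_gain=0.0):
    """Return True if a node should become a leaf instead of being split."""
    if n_samples < min_samples_split:
        return True  # too few rows to split
    if max_depth is not None and depth >= max_depth:
        return True  # tree is already deep enough
    if best_gain <= min_gain:
        return True  # best split barely improves purity
    return False


print(should_stop(n_samples=1, depth=0, best_gain=0.5))
print(should_stop(n_samples=100, depth=3, best_gain=0.5, max_depth=3))
print(should_stop(n_samples=100, depth=1, best_gain=0.2, max_depth=5))
```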

rugged comet Nov 23, 2023, 7:08 AM

#

So are you saying that instead of finding the feature that gives the best split, you have to find the category that gives the best split among all features? I assume continuous features are also taken into consideration. Like which numeric threshold gives the best split. And then compare that with the rest of the categorical splits?

#

Say I have two features that each have 3+ categories, I would need to find the best category in order to determine the root node or any nodes thereafter?

desert oar Nov 23, 2023, 7:13 AM

#

rugged comet So are you saying that instead of finding the feature that gives the best split,...

yes, that's how a typical decision tree works

rugged comet Nov 23, 2023, 7:14 AM

#

I assumed that the video I linked was implying that each split happens feature-wise, not category-wise.

desert oar Nov 23, 2023, 7:14 AM

#

rugged comet I assumed that the video I linked was implying that each split happens feature-w...

can you clarify what you mean by "feature-wise"?

#

you only split on one feature at a time

#

but how do you choose which feature to split on? and how do you choose where to split?

#

the answer is that you choose the best split for each feature, and then you choose the feature whose best split is the best overall
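
A minimal sketch of that selection for categorical features, assuming one-vs-rest splits scored by weighted Gini impurity. The toy rows, labels and function names are invented for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_impurity(rows, labels, feature, category):
    """Weighted Gini of sending feature == category left, the rest right."""
    left = [y for row, y in zip(rows, labels) if row[feature] == category]
    right = [y for row, y in zip(rows, labels) if row[feature] != category]
    if not left or not right:
        return None  # not a real split
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)


def best_split(rows, labels):
    """Try every category of every feature; keep the lowest impurity."""
    best = None
    for feature in rows[0]:
        for category in sorted({row[feature] for row in rows}):
            score = split_impurity(rows, labels, feature, category)
            if score is not None and (best is None or score < best[0]):
                best = (score, feature, category)
    return best


rows = [
    {"eye": "blue", "hair": "dark"},
    {"eye": "blue", "hair": "fair"},
    {"eye": "brown", "hair": "dark"},
    {"eye": "green", "hair": "fair"},
]
labels = ["a", "a", "b", "c"]
print(best_split(rows, labels))
```

Looping over features and then categories, or over all (feature, category) pairs at once, gives the same winner; the nesting only affects how ties are broken.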

rugged comet Nov 23, 2023, 7:18 AM

#

desert oar can you clarify what you mean by "feature-wise"?

I mean you pick a feature, such as Loves Soda, a binary feature, send all rows for which the condition is true to the left and the rest to the right.
Another example with 3 categories: Eye color. Send the blue-eyed rows to the left and the brown and green eyed rows to the right.
You'd determine which feature to split on by calculating the Gini Impurity for that split.

desert oar Nov 23, 2023, 7:20 AM

#

rugged comet I mean you pick a feature, such as Loves Soda, a binary feature, send all rows f...

> You'd determine which feature to split on by calculating the Gini Impurity for that split.
be more specific. if i choose eye color, how do i decide that blue goes left and brown/green goes right? what if i have eye color, hair color, leg length, and forearm length all in the same dataset? how do i decide which feature to use for splitting each node?

rugged comet Nov 23, 2023, 7:25 AM

#

desert oar > You'd determine which feature to split on by calculating the Gini Impurity for...

> if i choose eye color, how do i decide that blue goes left and brown/green goes right?
Would you not calculate the Gini Impurity for blue-left, green-left, and brown-left with the others going right? Whichever eye color gives the lowest total Gini Impurity, you select that as the Gini Impurity to represent that feature as you go on to compare it with the other features.

desert oar Nov 23, 2023, 7:25 AM

#

rugged comet > if i choose eye color, how do i decide that blue goes left and brown/green goe...

yes, that's right. but then how do i decide to split on eye color, or hair color, or one of the other features?

rugged comet Nov 23, 2023, 7:30 AM

#

desert oar yes, that's right. but then how do i decide to split on eye color, or hair color...

Say blue left was the best split for eye color. The Gini Impurity for that split is X. You then find the hair color that gives the best split. The Gini Impurity for that split is Y. You do this for all features. You now have a value for Gini Impurity that represents each feature. You select the feature that had the lowest impurity. You then split that feature by its best category.

#

That's how I assume it would be done, but I don't have any evidence.

desert oar Nov 23, 2023, 7:36 AM

#

rugged comet That's how I assume it would be done, but I don't have any evidence.

right, it's good to be very clear about what the algorithm is, before you try to implement it!

desert oar Nov 23, 2023, 7:37 AM

#

rugged comet Say blue left was the best split for eye color. The Gini Impurity for that split...

right, exactly. but that's equivalent to just looking at all splits of all categories of all features

past meteor Nov 23, 2023, 7:38 AM

#

agile owl <@260493929047130113> what should the number of generations and size of the popu...

No idea. This was also one of my takeaways about EAs: they're very sensitive to hyperparameters, and the good settings are problem-specific.

agile owl Nov 23, 2023, 7:41 AM

#

they can't make me redundant if I'm the only one who knows how to tune the hyperparameters 😎

#

./s

past meteor Nov 23, 2023, 7:42 AM

#

You don't need to implement decision trees like this, but it's good to remember they're also called recursive partitioning. If you know this, you can conceptually simplify the problem to two questions: "how do I split in one node?" and "when do I stop splitting?"
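
A compact sketch of recursive partitioning built around those two questions, assuming categorical features, one-vs-rest splits and Gini impurity. All names and the toy data are hypothetical, and this is not how any particular library implements it.

```python
from collections import Counter


def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def find_split(rows, labels):
    """How do I split in one node? Lowest weighted Gini over all
    (feature, category) pairs; None if nothing separates the rows."""
    best = None
    for feature in rows[0]:
        for category in sorted({r[feature] for r in rows}):
            mask = [r[feature] == category for r in rows]
            left = [y for y, m in zip(labels, mask) if m]
            right = [y for y, m in zip(labels, mask) if not m]
            if not left or not right:
                continue
            score = (len(left) * gini(left)
                     + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, feature, category)
    return best


def build(rows, labels, depth=0, max_depth=5):
    """When do I stop? Pure node, depth limit, or no usable split."""
    split = None
    if len(set(labels)) > 1 and depth < max_depth:
        split = find_split(rows, labels)
    if split is None:
        return {"leaf": Counter(labels).most_common(1)[0][0]}
    _, feature, category = split
    go_left = [r[feature] == category for r in rows]
    return {
        "feature": feature,
        "category": category,
        "left": build([r for r, m in zip(rows, go_left) if m],
                      [y for y, m in zip(labels, go_left) if m],
                      depth + 1, max_depth),
        "right": build([r for r, m in zip(rows, go_left) if not m],
                       [y for y, m in zip(labels, go_left) if not m],
                       depth + 1, max_depth),
    }


def predict(node, row):
    # Walk down until a leaf is reached.
    while "leaf" not in node:
        matches = row[node["feature"]] == node["category"]
        node = node["left"] if matches else node["right"]
    return node["leaf"]


rows = [
    {"eye": "blue", "hair": "dark"},
    {"eye": "blue", "hair": "fair"},
    {"eye": "brown", "hair": "dark"},
    {"eye": "green", "hair": "fair"},
]
labels = ["a", "a", "b", "c"]
tree = build(rows, labels)
print(tree)
print(predict(tree, {"eye": "green", "hair": "fair"}))
```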

rugged comet Nov 23, 2023, 7:46 AM

#

past meteor You don't need to implement decisions trees like this but it's good to remember ...

I was struggling while attempting to implement recursive node creation. I think I'll be able to do it though. I should be able to set the stop condition to be when a leaf is pure. I can add max depth and min samples later probably.
Have to change my code to split on category instead of feature though.

#

This project is fun so far. I'm glad I started it.

desert oar Nov 23, 2023, 7:49 AM

#

rugged comet I was struggling while attempting to implement recursive node creation. I think ...

you can still write your code to find the best split of each feature, and then choose the feature with the best split

#

the point i'm trying to emphasize here is that "splitting on feature" itself is an ill-defined concept

#

(also, terminology note: usually we think of every feature having its own distinct categories. so the categories of "eye color" are eye colors, the categories of "hair color" are hair colors, etc)

rugged comet Nov 23, 2023, 7:51 AM

#

desert oar (also, terminology note: usually we think of every feature having its own distin...

Every categorical feature or every feature including continuous numeric ones?

desert oar Nov 23, 2023, 7:51 AM

#

rugged comet Every categorical feature or every feature including continuous numeric ones?

in general, only categorical features have categories... it's in the name!

in the case of a decision tree specifically, we artificially create categories by splitting.

rugged comet Nov 23, 2023, 7:53 AM

#

desert oar in general, only categorical features have categories... it's in the name! in t...

> in general, only categorical features have categories... it's in the name!
Just clarifying.
> in the case of a decision tree specifically, we artificially create categories by splitting.
That makes sense. So like saying age > 25 creates categories out of continuous data.
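
A minimal sketch of picking such a threshold for a numeric feature. It assumes candidate thresholds are the midpoints between consecutive distinct values (a common choice, not the only one) and scores each with weighted Gini impurity; the data and names are invented.

```python
from collections import Counter


def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_threshold(values, labels):
    """Try midpoints between consecutive distinct values; return
    (weighted Gini, threshold) for the best value < threshold split."""
    distinct = sorted(set(values))
    best = None
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for v, y in zip(values, labels) if v < t]
        right = [y for v, y in zip(values, labels) if v >= t]
        score = (len(left) * gini(left)
                 + len(right) * gini(right)) / len(labels)
        if best is None or score < best[0]:
            best = (score, t)
    return best


ages = [18, 22, 24, 30, 41, 55]
likes = ["yes", "yes", "yes", "no", "no", "no"]
print(best_threshold(ages, likes))
```

The resulting score can be compared directly against the best categorical split of any other feature.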

past meteor Nov 23, 2023, 7:58 AM

#

In uni we had great slides on DT's I can send them to you @rugged comet

rugged comet Nov 23, 2023, 7:58 AM

#

past meteor In uni we had great slides on DT's I can send them to you <@188467763558350849>

I would appreciate that.

past meteor Nov 23, 2023, 8:04 AM

#

Maybe it goes into a bit too much detail towards the end. I think it's fine to go to slide 57, implement the tree and then go back and do the other 40

wooden sail Nov 23, 2023, 8:08 AM

#

send that to me as well :x

blissful meadow Nov 23, 2023, 9:09 AM

#

I have this AI model I'm using flask to build. Our other backend is in node. The model takes 3 hours to get back with a response. How should I architect a request?

Should the node backend send an async post request with the data and wait for the flask backend to respond? Or should node just post, have flask say ok, do the processing, and post the results to the node backend separately?

cold osprey Nov 23, 2023, 9:23 AM

#

3 hours wut

#

Inference time of 3 hours?

blissful meadow Nov 23, 2023, 9:27 AM

#

Let's just say, it continuously outputs what is needed and the end is a long list of things

#

I even wonder if I should go serverless

zealous tartan Nov 23, 2023, 9:35 AM

#

hey, do you have suggestions for getting started in data science and ai with python? I am only a high school student who will start college next summer. I know the basics of python like defining a function, lists, loops, file functions, conditions, etc. I also know a few basic modules like math, mysql.connector, random. I would only invest in paid courses if they are actually worth it and don't have a prerequisite like uni level maths, because I can't really cram uni level maths in a few months, can I? Or maybe I can, who knows. Your help is appreciated, just ping me or dm me the suggestions. Thanks :D Also I don't understand GitHub even a bit, a tutorial for that would be nice too. I am dumb like a 5 year old excited about magicians (aka cs engineers)

past meteor Nov 23, 2023, 9:57 AM

#

blissful meadow I even wonder if I should go serverless

If it takes three hours, you should absolutely not go serverless; you'll burn money

past meteor Nov 23, 2023, 9:59 AM

#

blissful meadow I have this AI model Im using flask to build. Our other backend is in node. The ...

Can you explain what this model is? Why is it taking 3 hours? Why do you have a backend in flask and one in node?

blissful meadow Nov 23, 2023, 10:04 AM

#

past meteor Can you explain what this model is? Why is it taking 3 hours? Why do you have a ...

the node backend is our normal application backend. Our AI model is in python. While it is possible to run python in node etc, we don't want to do that. So, the other choice was to make another ec2 instance and deploy the rest api using flask.
It takes 3 hours because it does a lot of work. One output is fed to another and in total takes a lot of time. Since the outputs depend on each other, the work is sequential. Honestly, someone else wrote it and I'm sure it could be made faster but it is what it is right now

#

Its a large language model

past meteor Nov 23, 2023, 10:05 AM

#

Is it a neural network?

#

ah okay, that's an important thing to mention

#

CPU or GPU?

cold osprey Nov 23, 2023, 10:05 AM

#

has to be gpu rightt

blissful meadow Nov 23, 2023, 10:05 AM

#

yeah

#

we're calling an already-built LLM. Each call takes a minute. The number of calls ends up taking hours. It's not resource-intensive on our end

past meteor Nov 23, 2023, 10:06 AM

#

cold osprey has to be gpu rightt

Stuff like BERT is still possible on CPU hence the question 😄

blissful meadow Nov 23, 2023, 10:07 AM

#

Hence the serverless consideration

past meteor Nov 23, 2023, 10:09 AM

#

Is this work you can do in-batch "whenever" and then give it to your main app? Does it need to be a web app?

brisk sage Nov 23, 2023, 10:55 AM

#

Hey guys, I’m trying to analyze a relatively small dataset (128 observations with subgroups as small as 20). This data is continuous, but very nonlinear (independent vs dependent variable).
Do you have any suggestions for a good model to analyze this?

OLS and GLM won’t work since it’s nonlinear, I’ve read and implemented a random Forest analysis, but with such a small sample size it might be prone to overfitting. Would a reduction of trees (say to 20/40 instead of 100/1000) work?

past meteor Nov 23, 2023, 11:12 AM

#

brisk sage Hey guys, I’m trying to analyze a relatively small dataset (128 observations wit...

1. polynomial features and transformations with GLMs
2. regularisation on your random forest (tune the cost complexity parameter). Reducing trees on random forest will not help you, it averages the trees. Reducing the number of trees will likely cause more overfitting.
3. Consider using gradient boosting. Usually it has stronger defaults than RF. You can tune it by reducing the number of trees
4. Try RBF-SVMs. They only have 2 or 3 hyperparameters for classification and regression respectively. They're more or less the best model on small datasets but they just scale poorly 🙂

blissful meadow Nov 23, 2023, 11:15 AM

#

past meteor Is this work you can do in-batch "whenever" and then give it to your main app? D...

no need for web app. I was going to create rest apis. Yeah, its essentially python code making requests to llm models and processing the results

past meteor Nov 23, 2023, 11:17 AM

#

blissful meadow no need for web app. I was going to create rest apis. Yeah, its essentially pyth...

If you're able to do things in batch: in real-world settings doing 5 vector-vector multiplications is slower than 1 matrix vector multiplication. Same idea applies for tensors. Basically, if you can pool requests and evaluate a bunch of them at once it'll be faster than doing them one-by-one.

#

You might be already doing that, in that case: I don't know 🤷 .

blissful meadow Nov 23, 2023, 11:47 AM

#

Im going to look into web hooks, pub/sub and the like.

past meteor Nov 23, 2023, 12:01 PM

#

blissful meadow Im going to look into web hooks, pub/sub and the like.

Yeah. I don't know the product you're building but you can always "answer" immediately from node and give them an endpoint to the Flask "worker" in this case which they can poll for the model's results.

#

But that won't work if it's 3 hours ofc 🤣
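
For illustration, the "answer immediately, let the caller poll" pattern could be sketched without any web framework as below. `submit` and `status` stand in for what a POST handler and a status endpoint would do; `slow_model` is a placeholder for the real pipeline, and the in-memory `jobs` dict is an assumption that would not survive a restart of a multi-hour job.

```python
import time
import uuid
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=2)
jobs = {}  # job_id -> Future; a real service would persist this


def slow_model(prompt):
    # Stand-in for the multi-hour pipeline.
    time.sleep(0.1)
    return prompt.upper()


def submit(prompt):
    """What a POST handler would do: start the work, return an id at once."""
    job_id = uuid.uuid4().hex
    jobs[job_id] = executor.submit(slow_model, prompt)
    return job_id


def status(job_id):
    """What a status endpoint would do: report progress or the result."""
    future = jobs[job_id]
    if not future.done():
        return {"state": "running"}
    return {"state": "done", "result": future.result()}


job_id = submit("hello")
while status(job_id)["state"] != "done":
    time.sleep(0.05)  # the caller polls at its own pace
print(status(job_id))
executor.shutdown()
```

The webhook alternative replaces the polling loop with the worker posting the result back to the caller once the job finishes.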

brisk sage Nov 23, 2023, 12:02 PM

#

past meteor 1. polynomial features and transformations with GLMs 2. regularisation on your...

Which of these would you recommend the most?
I have adapted the random Forest and created a SVM model, both of which return more or less the same results. I have just been creating so many models lately and every new one is returning new results that I’m quite confused which is the best to use 😅
A friend who knows a thing or two about statistics recommended GLM, however with nonlinear data, a linear model wouldn’t do well, thus I came up with random forest

past meteor Nov 23, 2023, 12:03 PM

#

brisk sage Which of these would you recommend the most? I have adapted the random Forest an...

Have you tuned them or are you just using the default parameters?

#

GLMs work well if you transform your input

left tartan Nov 23, 2023, 12:12 PM

#

zealous tartan hey, do you have suggestion for getting started in data science and ai with pyth...

This is a better question for #career-advice : it sounds like you probably need to first practice your Python programming a bit and reinforce what you’ve learned, but you can also tackle some basic ML coding projects (see Kaggle.com/learn and CS50 for AI) to learn some of the coding basics. For preparing for college: make sure your math fundamentals are strong. Calculus is the typical first year course, and it’s not hard to prepare for if you put a little time in. I wouldn’t worry about AI/ML math if you haven’t started college yet. Just be ready for calculus.

brisk sage Nov 23, 2023, 12:43 PM

#

past meteor Have you tuned them or are you just using the default parameters?

I have adapted the hyperparameters max_depth, min sample split and min samples at each leaf. Also, I have used the sklearn cross validation for the MSE which at times does result in a rather big difference from the one calculated by the random forest

brisk sage Nov 23, 2023, 12:44 PM

#

past meteor GLMs work well if you transform your input

Since my data is right skewed, you’re referring to a logarithmic transformation of the input?
This might be difficult, since it contains many 0 values.
Is there no other way?
SVM seemed to work, what did you mean by scaling poorly?

weak mortar Nov 23, 2023, 12:59 PM

#

hello i have a quick question about interacting with HuggingFace API. you think here is the place to ask or some other room more appropriate ?

past meteor Nov 23, 2023, 1:07 PM

#

brisk sage Since my data is right skewed, you’re referring to a logarithmic transformation ...

Yeah it doesn't need to be logarithmic. You can also take arbitrary polynomials https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.PolynomialFeatures.html or even splines https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.SplineTransformer.html#sklearn.preprocessing.SplineTransformer

If you scroll down in the docs there are typically examples of how to use them.

past meteor Nov 23, 2023, 1:08 PM

#

brisk sage Since my data is right skewed, you’re referring to a logarithmic transformation ...

SVMs are possibly the most powerful model there is, but they're not spoken of / used a lot because they don't work if you have a lot of data. You don't, so it's perfect here. Be sure to tune its parameters as well though 🙂

serene scaffold Nov 23, 2023, 1:19 PM

#

past meteor SVMs are possibly the most powerful model there is, but they're not spoken of / ...

what about xgboost

past meteor Nov 23, 2023, 1:32 PM

#

serene scaffold what about xgboost

We'd have to run a benchmark. The appeal of SVMs is that they're a universal approximator (contingent on tuning, which needs just 2 hyperparameters). Gradient boosting is a different beast, lots of dials you can turn.

In practice I always use XGBoost, CatBoost or HistGradientBoostingClassifier and I rarely use SVMs because:

1. Gradient boosting performs really strongly with 0 hyperparam tuning, same can't be said for SVMs. I rarely can be bothered to tune them in reality.
2. I rarely have datasets small enough for support vector machines not to OOM.

#

On smallish datasets in the past I've run into situations where I couldn't outperform SVMs with gradient boosting tho

brisk sage Nov 23, 2023, 1:51 PM

#

past meteor We'd have to run a benchmark. The appeal of SVMs are a universal approximator (c...

Xgboosting is a lot easier to implement than data adaption via polynomials or splines.

However, now I have GLM, random forest, SVM, xgboosting, gradient boosting from sklearn and decision trees - how do I figure out which of all those p-values is the correct one?

past meteor Nov 23, 2023, 1:51 PM

#

brisk sage Xgboosting is a lot easier to implement than data adaption via polynomials or sp...

p-values? 🤔 What are you doing?

brisk sage Nov 23, 2023, 1:58 PM

#

I'm trying to perform a multivariate analysis on my data, measuring the impact of something like age/time/temperature on amplitude values (given in percentage, hence the 0s).

Using these systems of course I get MSE values and have calculated a p value using a t test on the residuals.

I’m trying to publish a manuscript in medicine and the readers are rather fond of p values

past meteor Nov 23, 2023, 1:59 PM

#

Okay, that's a very important detail.

#

I'd use a GLM then as well. I'd ask this question to a statistician as well, not people doing ML.

west cloak Nov 23, 2023, 2:06 PM

#

I have a code

#

Can anyone look over it?

#

It is reinforcement learning and project is balls in bins

#

#

I am on b

brisk sage Nov 23, 2023, 2:13 PM

#

past meteor I'd use a GLM then as well. I'd ask this question to a statistician as well, not...

The problem is with the nonlinearity of my data. All attempts of adapting the code have resulted in endless errors, so I was looking for an alternative

#

Do you think a SVM would work here?

desert oar Nov 23, 2023, 2:37 PM

#

brisk sage The problem is with the nonlinearity of my data. All attempts of adapting the co...

do you have some scientific model in mind, or are you just guessing it's nonlinear because the model didn't seem to fit well?

#

p-values are problematic for several reasons, but i agree you are definitely looking for a more "statistical" approach

desert oar Nov 23, 2023, 2:40 PM

#

brisk sage The problem is with the nonlinearity of my data. All attempts of adapting the co...

it sounds like you're having trouble with the code, but it also sounds like you're struggling with the modeling. not a good combination, it can be overwhelming and hard to figure out what your problem is

#

so i suggest figuring out some kind of modeling strategy first, and then getting it to work in code

#

you'll have to do that iteratively. start with a modeling strategy, then get it working in code. if the model is no good, repeat. don't try to do both simultaneously if you aren't confident with the code.

#

in statistical inference it's usually good practice anyway to have a model in mind before trying to fit a model. otherwise you end up just digging around for results, and that's how you get spurious invalid non-replicable results

buoyant vine Nov 23, 2023, 3:10 PM

#

How can I convert a pytorch tensor from 1D to 2D?

Effectively, I have a single record which is a 1D tensor, but the model expects a batch, making it 2D which is shape of (N, 25) so how can I convert my 1D to be effectively a [[1, 2, 3...]] instead of a [1, 2, 3...]

serene scaffold Nov 23, 2023, 3:48 PM

#

buoyant vine How can I convert a pytorch tensor from 1D to 2D? Effectively, I have a single ...

.reshape(-1, 25)

#

Or .reshape(1, -1)

#

Negative one gets solved for whatever integer completes the product
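
To make the -1 rule concrete, here is a pure-Python sketch of how the missing dimension is solved. `resolve_shape` is a hypothetical helper that only mimics the rule; it is not PyTorch's implementation. With a real tensor `x` the call would be `x.reshape(1, -1)`, and `x.unsqueeze(0)` is another way to add a leading batch dimension.

```python
from math import prod


def resolve_shape(n_elements, shape):
    """Replace a single -1 in shape with the size that makes the
    product of all dimensions equal n_elements."""
    if shape.count(-1) > 1:
        raise ValueError("only one dimension can be -1")
    known = prod(d for d in shape if d != -1)
    if -1 not in shape:
        if known != n_elements:
            raise ValueError("shape does not match element count")
        return tuple(shape)
    if known == 0 or n_elements % known:
        raise ValueError("cannot infer the missing dimension")
    return tuple(n_elements // known if d == -1 else d for d in shape)


# A single record with 25 values, as in the question.
print(resolve_shape(25, (1, -1)))
print(resolve_shape(25, (-1, 25)))
print(resolve_shape(25, (1, 1, -1)))
```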

buoyant vine Nov 23, 2023, 3:49 PM

#

ah

#

I see, let me try this

#

I had (1, 25) originally

#

ah, I think I might be dumb and forgetting my input is actually already 2D and it wants 3D

#

FacePalm I forget this is a GloVe model rather than 1D embeddings

serene scaffold Nov 23, 2023, 4:05 PM

#

buoyant vine ah, I think I might be dumb and forgetting my input is actually already 2D and i...

You can also do (1, 1, -1)

#

I'm on mobile or I'd explain better

buoyant vine Nov 23, 2023, 4:13 PM

#

yeah, I got it working, thanks!

last lodge Nov 23, 2023, 4:42 PM

#

Two questions:

How can I add only the highlighted axis lines?
How can I add padding to the axis labels?

#

found how to add padding, just need 1.

umbral charm Nov 23, 2023, 5:17 PM

#

import numpy as np
import matplotlib.pyplot as plt
x, y, z = np.loadtxt(fname = 'data.csv', unpack = True, delimiter = ',', skiprows = 1) #load data
ax = plt.axes(projection='3d')
ax.scatter(x, y, z, cmap = 'turbo', s = 25, c = z, edgecolors = 'black') #creates a 3d scatter plot
plt.title('3d scatter plot of data.csv')
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z')
plt.colorbar()
plt.savefig('data.png', dpi = 300)
plt.show()

#

why won't my colour bar show

#

i get

#

raise RuntimeError('No mappable was found to use for colorbar '
RuntimeError: No mappable was found to use for colorbar creation. First define a mappable such as an image (with imshow) or a contour set (with contourf).
NVM I FIXED IT

brisk sage Nov 23, 2023, 5:20 PM

#

desert oar do you have some scientific model in mind, or are you just guessing it's nonline...

I was told to perform a multivariate analysis on my data and so I tried. According to the scatterplot and histplot it is not linear data (e.g. int grades from 1-6 on the x axis, with amplitudes on the y axis). It’s not possible to fit a valid regression line through it, hence I’m looking for an alternative to GLM

agile owl Nov 23, 2023, 5:31 PM

#

brisk sage I was told to perform a multivariate analysis on my data and so I tried. Accordi...

what do you mean by "valid regression line"?

agile owl Nov 23, 2023, 5:59 PM

#

past meteor We'd have to run a benchmark. The appeal of SVMs are a universal approximator (c...

I've found that even on small datasets LightGBM usually beats SVM hard without the arduous tuning process

#

like n~=500

#

but of course it all depends on the specific use case

#

I don't really have any motivation to use SVM's anymore though

#

they are slow and usually don't work well without tuning which wastes even more time

#

am I wrong for dismissing SVMs?

past meteor Nov 23, 2023, 6:01 PM

#

Tuning SVMs is not hard, it's 2 hyperparameters

agile owl Nov 23, 2023, 6:01 PM

#

it's not hard but it can be tedious

past meteor Nov 23, 2023, 6:01 PM

#

How?

agile owl Nov 23, 2023, 6:01 PM

#

i'm a perfectionist I guess

#

I am never satisfied and keep trying to tune

past meteor Nov 23, 2023, 6:02 PM

#

Then you'd love them. Just make a grid of parameters on a log scale and search
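
A minimal sketch of what a log-scale grid search can look like, assuming scikit-learn and an RBF-kernel SVM, where the two hyperparameters are `C` and `gamma`. The synthetic dataset and the grid bounds are placeholders, not recommendations.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Toy data standing in for a real dataset (assumption).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Both hyperparameters are searched on a log scale.
param_grid = {
    "svc__C": np.logspace(-2, 2, 5),      # 0.01 ... 100
    "svc__gamma": np.logspace(-3, 1, 5),  # 0.001 ... 10
}

# SVMs are sensitive to feature scale, so standardise inside the pipeline.
pipeline = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
search = GridSearchCV(pipeline, param_grid, cv=3)
search.fit(X, y)

print(search.best_params_, round(search.best_score_, 3))
```

A common follow-up is to rerun with a finer grid centred on the best cell of the coarse one.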

agile owl Nov 23, 2023, 6:02 PM

#

XD

past meteor Nov 23, 2023, 6:02 PM

#

Grab coffee while it's running

agile owl Nov 23, 2023, 6:02 PM

#

it's slow to iterate

past meteor Nov 23, 2023, 6:02 PM

#

huh

#

ime they're fast

agile owl Nov 23, 2023, 6:02 PM

#

I had a problem where an SVM took like 50x longer to fit than LightGBM and it performed way worse

#

and then to tune it on top of that?

past meteor Nov 23, 2023, 6:03 PM

#

How much data did you have?

agile owl Nov 23, 2023, 6:03 PM

#

it was wide data but n=~1000 or so

#

m ~= 100

past meteor Nov 23, 2023, 6:03 PM

#

They're pretty much invariant to the amount of columns. At least, a lot more than other methods.

#

That's actually the main appeal of the method

#

As mentioned, the reason why people don't use them is that you need to compute the kernel matrix, which is an N x N data structure

agile owl Nov 23, 2023, 6:05 PM

#

my LightGBM estimates were actually correlated with the test data too and the SVM formed a cross on the scatterplot XD

#

I had never seen anything like that before

brisk sage Nov 23, 2023, 6:05 PM

#

agile owl what do you mean by "valid regression line"?

The datapoints are scattered widely, you just can’t fit a regression line through it. Imagine x axis 1-6 and y axis 0-2.5. The regression line would be horizontal

past meteor Nov 23, 2023, 6:05 PM

#

Using 32-bit floating point you can compute the upper limit of how many data points you can use with SVMs based on your RAM, it's quite low
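
The back-of-the-envelope calculation can be sketched as follows. It assumes a dense N x N kernel matrix of 4-byte floats held fully in memory and ignores everything else that needs RAM; implementations that cache only part of the kernel matrix can go beyond this bound. The function name is made up for illustration.

```python
import math

def max_points_for_kernel_matrix(ram_bytes: int, bytes_per_entry: int = 4) -> int:
    """Largest N such that a dense N x N matrix fits in ram_bytes."""
    return math.isqrt(ram_bytes // bytes_per_entry)

for gib in (8, 16, 64):
    n = max_points_for_kernel_matrix(gib * 2**30)
    print(f"{gib} GiB -> about {n:,} points")
```

Because memory grows with N squared, quadrupling the RAM only doubles the number of points that fit.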

agile owl Nov 23, 2023, 6:06 PM

#

SVM y_hat vs y_test plot be looking like +

#

admittedly without tuning

#

but when I saw that I didn't have any motivation to tune it when the boosting model already was better

past meteor Nov 23, 2023, 6:07 PM

#

Yeah, that's the issue 😩

past meteor Nov 23, 2023, 6:07 PM

#

past meteor We'd have to run a benchmark. The appeal of SVMs are a universal approximator (c...

.

agile owl Nov 23, 2023, 6:08 PM

#

Is there a reason why you don't use LightGBM?

past meteor Nov 23, 2023, 6:09 PM

#

I also use LightGBM 🙂 LightGBM, Xgboost, CatBoost, HistGradientBoosting, ...

agile owl Nov 23, 2023, 6:09 PM

#

I found this new thing called Natural Gradient Boosting out of Stanford

#

I thought it was really interesting but the variance estimates aren't that great compared to dedicated variance models like GARCH

#

one of the appeals is it estimates mean and variance at the same time

#

https://stanfordmlgroup.github.io/projects/ngboost/

NGBoost: Natural Gradient Boosting for Probabilistic Prediction

#

these gifs sold me

#

but then it didn't seem to really be great on large n data anyway vs the others

#

also it performs worse on the scale parameter than machine learning algorithms regressing y^2 on lagged y

#

for time series data at least

#

I tried using it in the same way I would use LightGBM or a RandomForestRegressor and it did comparably but the allure was not having to have a separate model for the variance with this approach and that didn't pan out

#

(it drastically underestimated true variance)

lapis sequoia Nov 23, 2023, 6:33 PM

#

is there a way to run quantization in transformers on an amd gpu

serene scaffold Nov 23, 2023, 6:34 PM

#

lapis sequoia is there a way to run quantization in transformers on an amd gpu

You should be able to do it on any gpu with CUDA.

lapis sequoia Nov 23, 2023, 6:34 PM

#

AMD only has rocm

serene scaffold Nov 23, 2023, 6:35 PM

#

Then no.

lapis sequoia Nov 23, 2023, 6:35 PM

#

oh

#

would there be a way to quantize the model from, e.g google colab then download to my system?

serene scaffold Nov 23, 2023, 6:35 PM

#

Yes

lapis sequoia Nov 23, 2023, 6:36 PM

#

ah

agile owl Nov 23, 2023, 6:36 PM

#

I am very disappointed that people don't demand an alternative to Nvidia

#

are we just going to keep lining Jensen's pockets forever

serene scaffold Nov 23, 2023, 6:37 PM

#

There should definitely be more options, yeah

agile owl Nov 23, 2023, 6:37 PM

#

I think Intel is ironically probably closer to supporting more GPU compute than AMD last I checked

#

idk what AMD is doing

lapis sequoia Nov 23, 2023, 6:37 PM

#

yeah

#

only disadvantage of an amd gpu

#

the price and performance is amazing but no development stuff

agile owl Nov 23, 2023, 6:38 PM

#

I think they overfocus on the budget and gaming markets

#

prob bc they figured there's no way to beat Nvidia at CUDA

lapis sequoia Nov 23, 2023, 6:38 PM

#

yea lol

agile owl Nov 23, 2023, 6:38 PM

#

but they seem to just chronically underinvest in ROCm

lapis sequoia Nov 23, 2023, 6:38 PM

#

i think i might get intel next, oneapi seems cool

#

BROO my google colab environment reset

#

Now i have to upload llama 2 7b AGAIN

#

i hate this

agile owl Nov 23, 2023, 6:39 PM

#

this is why I just bit the bullet and made my own ML workstation although being able to do that makes me a bit privileged

serene scaffold Nov 23, 2023, 6:40 PM

#

lapis sequoia Now i have to upload llama 2 7b AGAIN

Tbh, even after you fine-tune llama2-7b on colab, you probably won't even be able to deploy it locally without a gpu

lapis sequoia Nov 23, 2023, 6:40 PM

#

i have a gpu

#

well it wont recognize it

serene scaffold Nov 23, 2023, 6:40 PM

#

Does it have cuda

lapis sequoia Nov 23, 2023, 6:40 PM

#

bruh

serene scaffold Nov 23, 2023, 6:40 PM

#

Because if you don't have cuda, that's the same as not having a GPU as far as ML is concerned

lapis sequoia Nov 23, 2023, 6:41 PM

#

apparently someone was able to get rocm working with transformers

#

let me see how

agile owl Nov 23, 2023, 6:41 PM

#

there are SOME things that support Rocm

#

you can do SOME ML

serene scaffold Nov 23, 2023, 6:41 PM

#

Hmm, like what?

agile owl Nov 23, 2023, 6:41 PM

#

I used it years ago i don't remember

#

I did get frustrated though

serene scaffold Nov 23, 2023, 6:41 PM

#

I'm not doubting you. This is just the first I've heard

agile owl Nov 23, 2023, 6:41 PM

#

there were lots of caveats

lapis sequoia Nov 23, 2023, 6:41 PM

#

ive used tensorflow with rocm before i think

agile owl Nov 23, 2023, 6:41 PM

#

and only the most popular and basic algorithms had any support

lapis sequoia Nov 23, 2023, 6:42 PM

#

wait but i also need my gpu to be recognized in wsl

agile owl Nov 23, 2023, 6:42 PM

#

I had some surplus AMD gpus

#

and I wanted to see if I could use them in a machine of redundant parts

#

to get some value for ML purposes

#

the project was aborted when I realized how limiting ROCm is

#

but it's not zero

lapis sequoia Nov 23, 2023, 6:43 PM

#

my gpu in wsl is a Microsoft Corporation Device 008e

#

so i need to make it recognize my amd gpu

agile owl Nov 23, 2023, 6:44 PM

#

is using WSL worth it if you have a L1 hypervisor Linux VM or baremetal

#

I still haven't used it

lapis sequoia Nov 23, 2023, 6:57 PM

#

ok well i finally have my llama 2 7b back

#

but i have 1gb left

#

ayyy its loading shards

#

i just hope it doesnt take disk

#

it works guys

agile owl Nov 23, 2023, 7:07 PM

#

glad it worked out for you

lapis sequoia Nov 23, 2023, 7:13 PM

#

thanks!

keen quartz Nov 23, 2023, 8:10 PM

#

hey can anyone help me ?
need a database containing real resumes for a model, can't get any dataset for it. can you all upload some on a form that i will share?

iron basalt Nov 23, 2023, 8:20 PM

#

agile owl I think Intel is ironically probably closer to supporting more GPU compute than ...

This is not really AMD's fault in any way, you can do ML just fine on an AMD GPU. The issue is that everyone has locked into Nvidia by writing everything with CUDA. AMD has been working on conversion layers that let you effectively run CUDA on AMD GPUs. They do work, but do not have everything implemented. If the open source community wanted to, they could implement everything with OpenCL or Vulkan and then we would not be GPU vendor locked.

agile owl Nov 23, 2023, 8:20 PM

#

AMD has NOT done the marketing or outreach

#

most people don't even know ROCm exists

#

it is totally their fault

#

Stelercus didn't even know it did anything at all

#

that's AMD's fault

iron basalt Nov 23, 2023, 8:20 PM

#

agile owl AMD has NOT done the marketing or outreach

They have not actively pursued the market, but it's totally doable.

#

(Until recently)

agile owl Nov 23, 2023, 8:21 PM

#

you are blaming the coding community for not doing the thing but AMD didn't really try to make it happen either

#

they could have invested more into subverting NVIDIA's dominance but they shied away to focus on gaming and budget computers

iron basalt Nov 23, 2023, 8:22 PM

#

agile owl you are blaming the coding community for not doing the thing but AMD didn't real...

I'm not blaming anyone, they just wrote their stuff for CUDA, and now they are locked in because it's a lot of work to rewrite it all.

agile owl Nov 23, 2023, 8:22 PM

#

I think ROCm also made some weird design decision

#

where everything has to be an atomic operation

#

whereas in CUDA that isn't the case

#

at least that was what was going on last time I used it

iron basalt Nov 23, 2023, 8:22 PM

#

Nvidia did give them more support, which was the whole plan. Nvidia knew that they had an opportunity here, and they took it.

iron basalt Nov 23, 2023, 8:23 PM

#

agile owl I think ROCm also made some weird design decision

Does not even need ROCm, I would prefer something cross platform, like OpenCL.

agile owl Nov 23, 2023, 8:23 PM

#

ROCm was very ambitious I think

iron basalt Nov 23, 2023, 8:23 PM

#

(Then we can even do FPGAs)

agile owl Nov 23, 2023, 8:23 PM

#

trying to make it so you can use an array of any kind of different GPUs

#

and to make that work they had to enforce atomicity

#

but CUDA code isn't written like that

#

so not only do they provide less resources but they raised the bar

#

so many problems with that project

iron basalt Nov 23, 2023, 8:24 PM

#

ROCm has many problems, I don't think it should be used, but there are other options that work fine.

agile owl Nov 23, 2023, 8:25 PM

#

to be clear I think ROCm's way is better in theory

#

but in practice it will never get there

#

like GNU Hurd

iron basalt Nov 23, 2023, 8:25 PM

#

Yes, in theory, but it's AMD and AMD flops when it comes to SDKs.

#

(Even in gaming)

#

Nvidia does now have a lock-in monopoly on deep learning, but AMD also did not really care / cared too late. And Intel is just doing their thing, not sure what they are really going for.

#

Many ML libs used to have OpenCL actually for a while.

past meteor Nov 23, 2023, 8:27 PM

#

I know ROCm exists and stel probably as well but the issue remains if you want to deal with that uncertainty of it being runnable or not on AMD

iron basalt Nov 23, 2023, 8:28 PM

#

OpenCL would be really nice to have again, since it's any GPU, CPU, or even other things like FPGAs.

past meteor Nov 23, 2023, 8:28 PM

#

If you get an NVIDIA card you know ahead of time it'll run

#

If you go down the AMD ROCm route sooner or later you'll hit a brick wall. This can be after 1 day or after 1 year.

wooden sail Nov 23, 2023, 8:29 PM

#

intel has something similar to nvidia's chokehold through mkl

#

lots of computing software runs better on intel processors thanks to it

agile owl Nov 23, 2023, 8:30 PM

#

Stel actually said he didn't know ROCm was usable at all

#

which to be fair is a close approximation

#

xD

iron basalt Nov 23, 2023, 8:30 PM

#

The best option are those CUDA-ROCm layers, like the one torch has.

#

I forgot their acronym, it's part of that whole family of HPC stuff.
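
The layer in question is most likely HIP. ROCm builds of PyTorch reuse the `torch.cuda` API on top of it, so code written for CUDA devices can often run unchanged. A small sketch for checking which backend an installed PyTorch build is using; the helper name is made up, and the `getattr` guard is there in case a build does not expose `torch.version.hip`.

```python
import torch

def describe_gpu_backend() -> str:
    """Return 'cuda', 'rocm', or 'cpu' for the current PyTorch install."""
    if not torch.cuda.is_available():
        return "cpu"
    # On ROCm builds torch.version.hip is a version string; otherwise it is None.
    if getattr(torch.version, "hip", None):
        return "rocm"
    return "cuda"

print(describe_gpu_backend())
```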

past meteor Nov 23, 2023, 8:33 PM

#

I went for an AMD CPU with an NVIDIA gpu 🥴

iron basalt Nov 23, 2023, 8:33 PM

#

wooden sail lots of computing software runs better on intel processors thanks to it

At least Intel does not intentionally make the other options run slower... (they, meaning Nvidia, intentionally crippled their OpenCL drivers back when OpenCL was still 1.0 to get everyone onto CUDA; now they don't care and it works OK).

#

(I remember getting an Nvidia GPU that said it supported OpenCL on the box! But it did not)

wooden sail Nov 23, 2023, 8:34 PM

#

iron basalt At least Intel does not intentionally make the other options run slower... (they...

don't they? until recently, anything using mkl and detecting it's not running on intel would immediately turn off avx2 even if it was available

#

that would make things like matlab chug on amd processors

past meteor Nov 23, 2023, 8:34 PM

#

Damn is that why I can't run avx on our servers

wooden sail Nov 23, 2023, 8:34 PM

#

it could be

past meteor Nov 23, 2023, 8:34 PM

#

I'll check tomorrow

wooden sail Nov 23, 2023, 8:34 PM

#

you may need to replace mkl with openblas

agile owl Nov 23, 2023, 8:34 PM

#

what happened to OpenCL

#

did everyone just abandon it to live in Jensen Huang's world

iron basalt Nov 23, 2023, 8:35 PM

#

wooden sail don't they? until recently, anything using mkl and detecting it's not running on...

That is a bit different, basically Intel has paid for features that are unlocked. Nvidia has this too, but it's not the same as saying you support something only to intentionally make it slower so your thing (CUDA in this case) looks better.

iron basalt Nov 23, 2023, 8:35 PM

#

agile owl what happened to OpenCL

It's still there, still works fine, still is heavily used.

#

Just deep learning specifically uses CUDA for everything.

#

Because the popular frameworks are built on it.

agile owl Nov 23, 2023, 8:36 PM

#

Honestly someone should just antitrust them

#

I'm sure they're lobbying against it hard

iron basalt Nov 23, 2023, 8:37 PM

#

agile owl I'm sure they're lobbying against it hard

They almost got ARM, so they are pretty much as mega giant corporation as it gets.

agile owl Nov 23, 2023, 8:37 PM

#

it's not the size it's the lack of competition

#

you can be a huge conglomerate in every market and that's fine

#

it's when you maliciously take control of one market

#

like Intel and NVIDIA try to do

iron basalt Nov 23, 2023, 8:37 PM

#

Yeah, the problem is that computer hardware is really hard to get into.

agile owl Nov 23, 2023, 8:37 PM

#

and MSFT

iron basalt Nov 23, 2023, 8:38 PM

#

To make HPC stuff.

wooden sail Nov 23, 2023, 8:38 PM

#

intel arc to the rescue 😩

agile owl Nov 23, 2023, 8:38 PM

#

I understand why that was the case in the past

iron basalt Nov 23, 2023, 8:38 PM

#

And it does not help that nobody has interest in adopting anything else, the advice is still just "buy an Nvidia GPU."

agile owl Nov 23, 2023, 8:38 PM

#

but we are in a new era

#

that demands more competition

#

this isn't bleeding edge anymore it's mainstream

#

the industrial organization has become a major hindrance

#

Nvidia just captures all the economic surplus

iron basalt Nov 23, 2023, 8:39 PM

#

Computer hardware is very complicated, involves geopolitics.

#

(Since it directly translates to military power)

agile owl Nov 23, 2023, 8:40 PM

#

they should simply force Nvidia to split its software division off from the hardware one

iron basalt Nov 23, 2023, 8:40 PM

#

Probably, and split in other ways.

agile owl Nov 23, 2023, 8:44 PM

#

I think it's amazing they got everyone worried about SkyNet instead of their massive monopoly you gotta wonder if that's a PR campaign

#

"don't look here, look THERE"

#

anyway this is why I've long been a skeptic of intellectual property

#

if everyone could freely use available knowledge we'd all be better off

echo mesa Nov 23, 2023, 9:03 PM

#

Guys I've gone through the introduction to ml andrew ng course and have read the data science first principles with python book. What would the next step be? Should I learn more statistics and then read the statistical learning with python book or what should I do, to deepen my knowledge? Perhaps I should learn more mathematics overall?

agile owl Nov 23, 2023, 9:05 PM

#

try to do projects and try to learn how to improve their results

#

it's always best if you have something to work on

iron basalt Nov 23, 2023, 9:05 PM

#

agile owl I think it's amazing they got everyone worried about SkyNet instead of their mas...

"During a gold rush, sell shovels."

echo mesa Nov 23, 2023, 9:08 PM

#

agile owl try to do projects and try to learn how to improve their results

That's true

past meteor Nov 23, 2023, 9:10 PM

#

echo mesa Guys I've gone through the introduction to ml andrew ng course and have read th...

Alternate reading and doing. I highly recommend getting on www.kaggle.com and doing the "basic" courses there and the basic projects.

echo mesa Nov 23, 2023, 9:11 PM

#

past meteor Alternate reading and doing. I highly recommend getting on www.kaggle.com and do...

Yeah that's true I should keep learning and improving my knowledge while working on things and try to implement my learnt knowledge

agile owl Nov 23, 2023, 9:13 PM

#

having something to work on gives you material to self-direct your education in a way that is most meaningful to you, raises important questions that often won't arise in coursework, and gives you more satisfaction (at least in my experience) and therefore motivation

echo mesa Nov 23, 2023, 9:20 PM

#

agile owl having something to work on gives you material to self-direct your education in ...

exactly

rugged comet Nov 23, 2023, 11:29 PM

#

What are the tradeoffs of using a binary tree vs. an n-ary tree for decision tree classifiers?
Are most decision tree classifier binary trees?

wary vortex Nov 24, 2023, 12:46 AM

#

Hello, how can I create a text to text chatbot using pytorch and a dataset consisting of questions and answers? The chatbot should respond to questions asked(it is going to be a mental help chatbot specifically). I am new to pytorch and I can not figure out how to do it.

agile owl Nov 24, 2023, 1:50 AM

#

is there any problem that an n-ary tree can solve that a binary tree can't?

#

What if you have a problem where only one split actually increases information?

#

you have to be careful with mental health but I'm sure you know that

rugged comet Nov 24, 2023, 2:01 AM

#

agile owl is there any problem that an n-ary tree can solve that a binary tree can't?

I don't know.

agile owl Nov 24, 2023, 3:29 AM

#

It seems to me like the tradeoff depends on whether more than one split per node is worth it

#

and you will only know that for a specific problem

#

I don't think it will end up mattering

#

but I might be wrong

#

my intuition says that it will collapse into the same solution and may perform slightly better or worse

#

the tree will be more shallow

#

so if you would have otherwise run into the depth limit that might be avoided which will lead to more total calculations running

#

that's my best guess

#

I think it's a bit easier to reason about a binary tree
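
On the question of whether an n-ary tree can express anything a binary tree can't: a k-way split can always be rewritten as a chain of at most k-1 binary splits, so the two have the same expressive power and differ mainly in depth. CART-style implementations such as scikit-learn's trees are binary, while ID3/C4.5-style trees allow multiway splits on categorical features. A toy sketch with made-up feature values and labels:

```python
def nary_tree(colour: str) -> str:
    # One node with a three-way split on a categorical feature.
    return {"red": "stop", "amber": "wait", "green": "go"}[colour]

def binary_tree(colour: str) -> str:
    # The same decision as two nested yes/no splits (one level deeper).
    if colour == "red":
        return "stop"
    if colour == "amber":
        return "wait"
    return "go"

colours = ("red", "amber", "green")
same = all(nary_tree(c) == binary_tree(c) for c in colours)
print(same)
```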

edgy frost Nov 24, 2023, 5:12 AM

#

I'm currently fine-tuning/training a sentiment analysis model DistilBERT with a dataset with 60,000+ entries using K-Folds, is it normal for it to take a while? (i tried 36,000+ before it took 1 hr and 30 minutes for 1 epoch alone) Just using my personal computer to run the training code.

agile owl Nov 24, 2023, 5:59 AM

#

does anyone know of a somewhat frequently updated retail commerce prices paid dataset that's available at a reasonable cost or free

#

I imagine that this data is extremely valuable so the answer is probably hell no

boreal nest Nov 24, 2023, 6:54 AM

#

hello everyone, has anyone tried prefect-flow for etl pipelines?

odd meteor Nov 24, 2023, 6:58 AM

#

edgy frost I'm currently fine-tuning/training a sentiment analysis model DistilBERT with a ...

It's quite common for it to take time if you're doing this on a CPU. However, if you'd like to speed up things, try using a GPU if possible.

If you're using PyTorch, it's even easier to push your model and data to GPU, and do your fine-tuning from there.
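
A minimal sketch of the "push model and data to GPU" pattern in PyTorch. A single linear layer and random tensors stand in for DistilBERT and real batches (both are assumptions); the same `.to(device)` calls apply to a real model and its input tensors.

```python
import torch
from torch import nn

# Fall back to CPU so the same script runs on machines without a usable GPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(768, 2).to(device)           # stand-in for the real classifier
batch = torch.randn(8, 768).to(device)         # inputs must be on the same device
labels = torch.randint(0, 2, (8,)).to(device)  # and so must the targets

loss = nn.functional.cross_entropy(model(batch), labels)
loss.backward()
print(device, loss.item())
```

The usual failure mode is moving the model but not the batch (or the other way round), which raises a device mismatch error on the first forward pass.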

edgy frost Nov 24, 2023, 7:01 AM

#

odd meteor It's quite common for it to take time if you're doing this on a CPU. However, if...

i see, i just finished my data set with 60k entries a single epoch took 3 hrs and 11 minutes altho not really in a rush to train this. will look into trying using my gpu.

agile owl Nov 24, 2023, 7:07 AM

#

trying to do BERT without a GPU is a fool's errand

cold osprey Nov 24, 2023, 7:20 AM

#

LOL

#

Just funny haha

#

Can use colab or kaggle free if the data fits in the free gpu

primal ice Nov 24, 2023, 7:50 AM

#

Does anyone know if the ChatGPT DAN prompt still works

past meteor Nov 24, 2023, 8:01 AM

#

agile owl trying to do BERT without a GPU is a fool's errand

This is not actually true

#

The nuance is that you want to train on GPU yes, but doing inference on CPU makes a lot of sense. BERT only needs 400 MB ram or so.

agile owl Nov 24, 2023, 8:02 AM

#

he is training though

#

he's fine tuning

#

that's what I meant

past meteor Nov 24, 2023, 8:03 AM

#

Alright! 😄 it's still a statement I'd try and nuance as much as possible. It's kind of important especially beginners know CPU inference is possible with a decent latency

#

GPU stuff just costs so much more so it's a good one to know. Base Bert takes like sub 150ms but you can get it to sub 10 if you try hard.

cold osprey Nov 24, 2023, 8:16 AM

#

Any example of something that outright won't work on CPU?

odd meteor Nov 24, 2023, 8:58 AM

#

I think anything that works on GPU can as well work on CPU. The major question would be, at what computational cost?

Some tasks are better off done on GPU than CPU (and vice versa)

cold osprey Nov 24, 2023, 9:19 AM

#

Ah ok

zealous tartan Nov 24, 2023, 11:45 AM

#

left tartan This is a better question for <#470889390588035082> : it sounds like you probabl...

thanks. My calculus practice is pretty strong since my last year of hs has it in the syllabus. Also should i look in statistics too as a maths fundamental??

wooden sail Nov 24, 2023, 11:55 AM

#

the short answer is yes. the long answer is yeeeeeeeees. almost everything in ML and data science either straight up IS statistics, or involves it in some way. it's easier to pick up after calc and linalg though

left tartan Nov 24, 2023, 11:55 AM

#

zealous tartan thanks. My calculus practice is pretty strong since my last year of hs has it in...

See the pins for some reading material but I’m just advising on how to succeed as an undergraduate studying CS: the main topics are Calc, Linear Algebra, Statistics, and Discrete Math. Some programs let you do 3-4 semesters of Calc OR 2+Linear; you should choose linear. After undergrad, there’s a separate question of preparing for data science which other folks can answer

left tartan Nov 24, 2023, 12:01 PM

#

zealous tartan thanks. My calculus practice is pretty strong since my last year of hs has it in...

from a ‘preparing for college’ perspective: doesn’t really matter what you study. From a learning something interesting and fun, I’d say stats, no question.

gloomy parrot Nov 24, 2023, 12:07 PM

#

Hello everyone, I'm currently using detectron2 for object detection but I'm having a problem when it comes to predicting: it gives me wrong predictions. How can I solve this?

zealous tartan Nov 24, 2023, 12:16 PM

#

left tartan See the pins for some reading material but I’m just advising on how to succeed a...

Ohh okie

lapis sequoia Nov 24, 2023, 12:16 PM

#

is there a way to download a 4bit converted model from colab

zealous tartan Nov 24, 2023, 12:16 PM

#

left tartan from a ‘preparing for college’ perspective: doesn’t really matter what you study...

thanks

lapis sequoia Nov 24, 2023, 12:16 PM

#

when i try push_to_hub (huggingface) it says "NotImplementedError: You are calling save_pretrained on a 4-bit converted model. This is currently not supported"

dense crane Nov 24, 2023, 3:51 PM

#

what can be other ways to deal with that loss instead of changing the generator architecture?

#

i mean i will improve the generator but is there something else what can i do?

simple shuttle Nov 24, 2023, 4:30 PM

#

Anyone has any suggestions on how to start learning machine learning please? Should I simply go to youtube ?

serene scaffold Nov 24, 2023, 4:50 PM

#

simple shuttle Anyone has any suggestions on how to start learning machine learning please? Sho...

Machine learning is where you optimize a mathematical function based on data. So it's all applied math. How well do you understand differential calculus, probability and statistics, and arithmetic using vectors/arrays/matrices?

simple shuttle Nov 24, 2023, 4:51 PM

#

I think I am capable of the math required for machine learning as I did computational finance for my master degree

serene scaffold Nov 24, 2023, 4:51 PM

#

simple shuttle I think I am capable of the math required for machine learning as I did computat...

does that include at least all of the things I said?

simple shuttle Nov 24, 2023, 4:51 PM

#

serene scaffold Machine learning is where you optimize a mathematical function based on data. So...

thanks for replying btw

simple shuttle Nov 24, 2023, 4:51 PM

#

serene scaffold does that include at least all of the things I said?

Yes, I know all the maths

serene scaffold Nov 24, 2023, 4:52 PM

#

okay, because I wouldn't expect that degree to include linear algebra. but I wouldn't know, either.

there are these three textbooks: #data-science-and-ml message

and then there are more suggestions on our website

#

!resources data science

arctic wedgeBOT Nov 24, 2023, 4:52 PM

#

Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

simple shuttle Nov 24, 2023, 4:53 PM

#

Thanks a lot man

#

I'll get right on that

odd meteor Nov 24, 2023, 8:48 PM

#

simple shuttle Anyone has any suggestions on how to start learning machine learning please? Sho...

Adding to what pope Stelercus said, you can also check https://Kaggle.com/learn

Learn Python, Data Viz, Pandas & More | Tutorials | Kaggle

Practical data skills you can apply immediately: that's what you'll learn in these no-cost courses. They're the fastest (and most fun) way to become a data scientist or improve your current skills.

echo mesa Nov 24, 2023, 11:29 PM

#

Guys, besides gradient descent, what other methods are there to find the best fitting line for a data set? Why is gradient descent so popular, is it the "best" and most effective method?

#

Also is it any helpful that as I go thru different statistics concepts i would try to implement them in python?

lapis sequoia Nov 25, 2023, 12:41 AM

#

C? Data Science? Why not?

echo mesa Nov 25, 2023, 12:43 AM

#

lapis sequoia C? Data Science? Why not?

not sure what you mean, if this was meant for me

lapis sequoia Nov 25, 2023, 12:43 AM

#

No. I just wondered if that would just be complete nonsense.

#

like, can the economy support 40,000 devs? I do not know

lapis sequoia Nov 25, 2023, 12:47 AM

#

echo mesa Also is it any helpful that as I go thru different statistics concepts i would t...

Yes. Honestly, I took calc 1-3 before I took stats and barely paid attention and got an A, and I literally had to relearn everything. Yes you should do that.

echo mesa Nov 25, 2023, 12:50 AM

#

lapis sequoia Yes. Honestly, I took calc 1-3 before I took stats and barely paid attention an...

Alright, thanks- Any good statistics book you've read?

lapis sequoia Nov 25, 2023, 12:53 AM

#

No, just went to school. Uh, actually, I cannot remember the name of the youtube channel, but they wrote a book called something like 'Intro to statistical learning' or something. I read that entire text and watched all of their videos. Also, a couple of other good channels: Data School, StatQuest, zedstatistics

#

That is all you need really

echo mesa Nov 25, 2023, 12:53 AM

#

lapis sequoia No, just went to school. Uh, actually, I cannot remember the name of the youtube...

Yeah thats this book: https://www.statlearning.com/

An Introduction to Statistical Learning

echo mesa Nov 25, 2023, 12:54 AM

#

lapis sequoia No, just went to school. Uh, actually, I cannot remember the name of the youtube...

Gotcha thanks very much

lapis sequoia Nov 25, 2023, 12:54 AM

#

yes. Damn.

#

Mine was in R. Damn, that was my introduction to machine learning years ago

echo mesa Nov 25, 2023, 12:55 AM

#

lapis sequoia Mine was in R. Damn, that was my introduction to machine learning years ago

yeah there are two versions, I think I'll start with the R one

lapis sequoia Nov 25, 2023, 12:55 AM

#

R is just so bad and I am not saying that because this is a python server

echo mesa Nov 25, 2023, 12:55 AM

#

lapis sequoia Mine was in R. Damn, that was my introduction to machine learning years ago

but the other one which is "...applications in python" is literally the same but in python so the content of the book mathematically does not change

lapis sequoia Nov 25, 2023, 12:55 AM

#

user-defined functions in R are just the worst

#

Youll be fine then

echo mesa Nov 25, 2023, 12:56 AM

#

lapis sequoia R is just so bad and I am not saying that because this is a python server

Is it? I havent actually tried it yet, but ive heard people dont really like it

#

I mean its kinda general to say this, but in the book "data science first principles in python" he said that he doesnt like r and he would rather focus on python

lapis sequoia Nov 25, 2023, 12:57 AM

#

That text just changed my life in such a drastic way, That hit me hard.

echo mesa Nov 25, 2023, 12:59 AM

#

lapis sequoia That text just changed my life in such a drastic way, That hit me hard.

Jeez, thats not too positive, you think i shouldnt get into it?

willow sorrel Nov 25, 2023, 1:47 AM

#

hey there guys, i need a little help with computer vision for a hackathon. it's probably a basic thing but i've got no experience with computer vision or such libraries, so i'd be really grateful if someone can help me with it. the task has three stages: first, detect the gun; second, detect the gun and the person holding it and capture the person; third, detect whether the person is holding it in a firing stance or holding it neutrally. i'm done with the first stage but can't figure anything out from there on. i'd be really grateful if anyone can guide me towards the second stage at least

lapis sequoia Nov 25, 2023, 2:27 AM

#

echo mesa Jeez, thats not too positive, you think i shouldnt get into it?

How was that not positive?

#

I meant it in a good way

odd meteor Nov 25, 2023, 3:46 AM

#

echo mesa Guys, besides gradient descent, what other methods are there to find the best fi...

Gradient Descent isn't the only method. You can use OLS as well.

Some people would argue that OLS isn't an optimization algorithm but an estimation technique, though. Either way, OLS and gradient descent find the line of best fit using different approaches: OLS has a closed-form solution, while gradient descent gets there iteratively.

odd meteor Nov 25, 2023, 4:07 AM

#

echo mesa Also is it any helpful that as I go thru different statistics concepts i would t...

Most of those statistical concepts are already implemented in a lot of libraries like SciPy, statsmodels, scikit-learn, PyTorch, TensorFlow etc... So yeah it's cool to implement them in Python.

You might find the latest edition of ISL very useful since its last edition is in Python.

I guess my only let down is that, it doesn't cover conformal prediction yet. Hopefully they'll add a chapter on Conformal Prediction in their subsequent edition.

The free pdf Version of ISL can be downloaded via https://www.statlearning.com/

An Introduction to Statistical Learning

odd meteor Nov 25, 2023, 4:20 AM

#

lapis sequoia R is just so bad and I am not saying that because this is a python server

😅 When it comes to stats and better viz, leave it for R. I was taught R in school but I left it and moved to Python (I don't even have any concrete reason. I guess I find python more 'customer-friendly') 👀

lapis sequoia Nov 25, 2023, 4:22 AM

#

I don’t know, I used to use R a lot but learned more in python because there was a snobby Data Science community on a different discord server that just spit on R

lapis sequoia Nov 25, 2023, 4:43 AM

#

Reading that text and knowing everything they are talking about makes me want to cry. Like, I don’t know. The hardest course I ever took was an optimization grad class which was so hard that I suggest that no one does it. I don’t, lol. I was shooting heroin two years ago for five years. I was sick of that life and literally replaced it with DS when I decided to go to grad school and get my masters. I was so inspired by that R version of that text. It is hard to explain how much that means to me. Really nothing means more to me than that. And the person who introduced me to this. Your mind is powerful and you can do whatever you want if your conviction is true. No one has ever failed when they genuinely tried. Sorry for that long text, just had to say that. You can do whatever you want and your mind is reality.

past meteor Nov 25, 2023, 7:31 AM

#

echo mesa Guys, besides gradient descent, what other methods are there to find the best fi...

The reason why gradient descent works so well is people prioritize loss functions that are convex, basically U shaped. If the optimization surface has this shape you can easily use gradients to iteratively move to the "bottom" of the U, that's where the derivative is 0, that's the idea.

Now, there's equivalences between doing this and maximizing the likelihood (which is the same as minimizing the negative log-likelihood). Maximising the likelihood is basically choosing the weights such that P(y|X) is as high as possible. In words "the probability of observing the target variable given the data is as high as possible". As emyrs said, there is also an equivalence between gradient descent, maximum likelihood and certain matrix decomposition methods that come out of linear algebra. OLS for instance gives you a solution that 1) maximizes the likelihood 2) has a gradient of 0 with a closed-form solution.

Last but not least, there's fun touchpoints with computer science. (Stochastic) Gradient descent is very efficient in that it's optimized for not using a lot of memory, it scales well for large datasets. If your dataset is small you can use second derivatives, conjugate gradient, BFGS and so on. Matrix decomposition also uses way more memory and thus doesn't scale as well to large datasets. Also good to know there's algorithms like SVM whose "problem" is a constrained optimization (still convex, but with inequality constraints), so they use more specialized things like quadratic programming.
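
The OLS / maximum likelihood equivalence can be written out in a few lines. This is a sketch under an assumption the message doesn't state explicitly: a linear model with i.i.d. Gaussian noise of fixed variance. The symbols (w for the weights, X for the design matrix, sigma for the noise scale) are notation introduced here.

```latex
% Assumption: y_i = x_i^\top w + \varepsilon_i, with i.i.d. \varepsilon_i \sim \mathcal{N}(0, \sigma^2)
\begin{aligned}
-\log L(w) &= \frac{n}{2}\log(2\pi\sigma^2)
              + \frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(y_i - x_i^\top w\right)^2 \\
\hat{w} &= \arg\max_w L(w) = \arg\min_w \lVert y - Xw \rVert_2^2 \\
\nabla_w \lVert y - Xw \rVert_2^2 &= -2X^\top (y - Xw) = 0
  \;\Longrightarrow\; X^\top X\,\hat{w} = X^\top y
\end{aligned}
```

The first term of the negative log-likelihood doesn't depend on w, so maximizing the likelihood is the same as minimizing the sum of squares, and the zero-gradient condition is exactly the normal equations that OLS solves in closed form (when X^T X is invertible).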

echo mesa Nov 25, 2023, 9:10 AM

#

odd meteor Most of those statistical concept are already implemented in a lot libraries lik...

Thanks very much. Are the statistics concepts explained in the book, or are they just presented with Python? The reason I'm asking is whether I have to read a statistics book first to get started with it.

echo mesa Nov 25, 2023, 9:10 AM

#

past meteor The reason why gradient descent works so well is people prioritize loss function...

Thanks very much

wooden sail Nov 25, 2023, 9:51 AM

#

i would round this out by mentioning that only linear least squares is this nicely behaved

#

if you formulate the maximum likelihood problem using a neural network, for example, the cost function is not convex. in these cases, the solution you get from gradient descent depends on how close you were to a particular local minimum or saddle point
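
A toy illustration of that dependence on the starting point, using a hand-picked non-convex 1-D function rather than an actual neural network loss (the function, learning rate, and step count are all assumptions of this sketch):

```python
def f(x):
    # Non-convex toy objective with two separate minima.
    return x**4 - 3 * x**2 + x

def grad(x):
    # Derivative of f.
    return 4 * x**3 - 6 * x + 1

def descend(x0, lr=0.01, steps=2000):
    # Plain gradient descent from the starting point x0.
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

left = descend(-2.0)   # started on the left side
right = descend(2.0)   # started on the right side
print(left, f(left))
print(right, f(right))
```

Same function, same algorithm, same step size: the two runs settle in different minima with different loss values, purely because of where they started.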

oblique quarry Nov 25, 2023, 10:21 AM

#

does anyone know why I'm getting negative eigenvalues?
```py
import numpy as np

def sqrtOfMatrix(data):
    eigenValues, eigenVectors = np.linalg.eigh(data)
    assert (eigenValues >= 0).all(), "Matrix should be positive semi-definite"
    # V @ diag(sqrt(w)) @ V.T; the right-hand factor needs the transpose
    return eigenVectors * np.sqrt(eigenValues) @ eigenVectors.T

matrix = np.random.randn(5,5)
matrix = matrix - matrix.mean(axis=0, keepdims=True)
covMatrix = matrix.T.dot(matrix)
sqrtMatrix = sqrtOfMatrix(covMatrix)
```

this book, these matrices, however, will be assumed to be positive definite. In view of this
assumption, these matrices will also admit their respective inverses."

wooden sail Nov 25, 2023, 10:57 AM

#

oblique quarry does anyone know why Im getting negative eigenvalues? ```py def sqrtOfMatrix(dat...

you're looking for the sqrt of covMatrix, not matrix

#

the original matrix you made has no special properties other than having only nonnegative entries

viscid socket Nov 25, 2023, 10:58 AM

#

hey can anyone give me some nlp project ideas?

oblique quarry Nov 25, 2023, 11:07 AM

#

wooden sail the original matrix you made has no special properties other than having only no...

Thank you, changed the code accordingly but it still produces the same error. I assume that it has to be numerical imprecision. I assume that this is because the majority of the variance is explained by the first principal component, so that eventually the magnitude represented by the remaining eigenvalues becomes so small that it is virtually zero. Maybe adding a regularization term will do?

#

yeah it seems to have worked, but if there's a better way than just outright increasing the determinant, let me know
```py
import numpy as np

def sqrtOfMatrix(data):
    # small diagonal loading; builds a new array instead of mutating the caller's
    data = data + np.eye(len(data)) * 1e-12
    eigenValues, eigenVectors = np.linalg.eigh(data)
    assert (eigenValues >= 0).all(), "Matrix should be positive semi-definite"
    # V @ diag(sqrt(w)) @ V.T; the right-hand factor needs the transpose
    return eigenVectors * np.sqrt(eigenValues) @ eigenVectors.T

matrix = np.random.randn(5,5)
matrix = matrix - matrix.mean(axis=0, keepdims=True)
covMatrix = matrix.T.dot(matrix)
sqrtMatrix = sqrtOfMatrix(covMatrix)
```

wooden sail Nov 25, 2023, 11:11 AM

#

oblique quarry Thank you, changed the code accordingly but it still produces the same error. I ...

can you show an example of the eigvals you get? you can always make them arbitrarily large by adding a scaled identity matrix

#

random matrices should have exponentially decaying eigenvalues, off the top of my head. there should be some papers discussing this... at least for the case of matrices with gaussian entries. then when squaring, you get a rather poor condition number unless you load the main diagonal

oblique quarry Nov 25, 2023, 11:14 AM

#

[-1.84880995e-15 1.78080773e-01 1.89576151e+00 4.99760224e+00
1.34696135e+01]

oblique quarry Nov 25, 2023, 11:15 AM

#

wooden sail random matrices should have exponentially decaying eigenvalues, off the top of m...

do you have any resources which i can use to read up on that?

wooden sail Nov 25, 2023, 11:20 AM

#

oblique quarry do you have any resources which i can use to read up on that?

i'll have to look around. as for your example, notice that the largest and smallest eigenvalues are roughly 1e16 apart, which puts the dynamic range on the order of the machine epsilon. the smallest eigval is 0 to within machine precision
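
For context: subtracting the column means from a 5x5 matrix leaves it with rank at most 4, so its covariance has one eigenvalue that is exactly zero in exact arithmetic, and round-off can push it slightly negative. One common alternative to loading the diagonal is to accept eigenvalues that are negative only within a tolerance tied to machine epsilon and clip them to zero. A minimal sketch follows; the function name `sqrt_psd` and the factor of 100 in the tolerance are assumptions of this example, not something proposed in the chat.

```python
import numpy as np

def sqrt_psd(cov):
    """Symmetric square root of a matrix that is PSD up to round-off."""
    vals, vecs = np.linalg.eigh(cov)
    # Tolerance relative to the largest eigenvalue (the factor 100 is arbitrary).
    tol = 100 * np.finfo(float).eps * max(vals.max(), 0.0)
    if vals.min() < -tol:
        raise ValueError("matrix is not positive semi-definite")
    vals = np.clip(vals, 0.0, None)  # treat round-off negatives as exact zeros
    return (vecs * np.sqrt(vals)) @ vecs.T

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 5))
x = x - x.mean(axis=0, keepdims=True)
cov = x.T @ x
root = sqrt_psd(cov)
print(np.allclose(root @ root, cov))
```

Unlike adding a scaled identity, this leaves the other eigenvalues untouched, and it still raises on a matrix that is genuinely indefinite.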

oblique quarry Nov 25, 2023, 11:27 AM

#

👍

wooden sail Nov 25, 2023, 11:28 AM

#

section 3 here https://link.springer.com/content/pdf/10.1155/2007/71953.pdf and this other paper https://arxiv.org/pdf/2101.02928.pdf may shed some light

#

the high level idea being that, even in the best case (depending on the distribution), you will only get a diagonal covariance for infinitely long vectors or with infinitely many realizations averaged out. in the finite case this means the vectors in the matrix are not orthogonal, which will have an impact on the eigenvalues
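
A quick numerical check of that idea, assuming i.i.d. standard normal entries (the sample sizes and the helper `max_offdiag` are made up for this sketch):

```python
import numpy as np

rng = np.random.default_rng(0)

def max_offdiag(n_samples, dim=5):
    # Sample covariance of n_samples Gaussian vectors of length dim.
    x = rng.standard_normal((n_samples, dim))
    cov = x.T @ x / n_samples
    # Zero out the diagonal and report the largest remaining magnitude.
    off = cov - np.diag(np.diag(cov))
    return np.abs(off).max()

few = max_offdiag(10)
many = max_offdiag(100_000)
print(few, many)
```

The off-diagonal entries shrink as more realizations are averaged, but with only a handful of samples they are far from zero, i.e. the columns are visibly non-orthogonal.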

oblique quarry Nov 25, 2023, 11:33 AM

#

Much appreciated

buoyant vine Nov 25, 2023, 1:32 PM

#

\o/ For the first time, I have actually connected a training dashboard to my AI runs so I can actually see what the model is doing, neptune is cool, but i'm wondering if MLFlow is a better cheaper alternative pithink

buoyant vine Nov 25, 2023, 2:35 PM

#

Has anyone used the XLMR bert model before btw on larger datasets (800k+ points)? my loss seems to be higher than I expected and it doesn't really seem to be decreasing, and I can't quite work out if I should stop it or not...

It has early termination set up, but you can see the loss is changing quite aggressively

#

I think part of the reason might be the dataset itself isn't shuffled (which maybe I should do 😅 )

#

so very similar pieces of text are already likely part of the same batch
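
A framework-agnostic sketch of shuffling before batching, so neighbouring (similar) rows don't all land in the same batch. The helper `iter_batches` and the sizes are hypothetical; most training frameworks expose an equivalent shuffle option on their data loader.

```python
import numpy as np

def iter_batches(n_samples, batch_size, rng):
    # Draw a fresh random order, then slice it into batches of indices.
    order = rng.permutation(n_samples)
    for start in range(0, n_samples, batch_size):
        yield order[start:start + batch_size]

rng = np.random.default_rng(0)
batches = list(iter_batches(n_samples=10, batch_size=4, rng=rng))
for batch in batches:
    print(batch)
```

Calling it again at the start of each epoch gives a new order every time.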

narrow tiger Nov 25, 2023, 3:01 PM

#

any free gpt-like repo i can clone and use for basic chatting, like replying with true or false
"sun sets in west"
"is message "xxxxxx" a spam"

buoyant vine Nov 25, 2023, 3:01 PM

#

narrow tiger any free gpt like repo i can clone and use for basic chating like reply with tru...

look at ollama with llama2

#

requires a reasonably decent machine to run the smallest models still tho

narrow tiger Nov 25, 2023, 3:03 PM

#

https://lmstudio.ai/

#

https://twitter.com/LMStudioAI

#

found this, they say u can use a model running on localhost as a replacement for the openai API too

buoyant vine Nov 25, 2023, 3:04 PM

#

yes there are lots of alternatives, mostly built on llama

#

that being said, you need a very big machine to run the bigger models

#

and realistically its the biggest models which are the ones more comparable to openai

narrow tiger Nov 25, 2023, 3:05 PM

#

thanks

#

this is insane literally free pithink

#

also saw a video/image generative model

narrow tiger Nov 25, 2023, 3:11 PM

#

buoyant vine look at ollama with llama2

wait very noob question but how do i try this out? ig i can use only the very basic with 8b

#

is llama2 different than ollama?

buoyant vine Nov 25, 2023, 3:12 PM

#

ollama is a tool for running llama and other llms

#

Install ollama, then you can do `ollama run llama2`

#

which will run the 7B param model

#

https://github.com/jmorganca/ollama#model-library


narrow tiger Nov 25, 2023, 3:12 PM

#

thanks

narrow tiger Nov 25, 2023, 4:04 PM

#

it's too frank 🤣

#

is it supposed to be this frank out of the box lol?

#

ok, is codellama almost as good as phind or chatgpt?

#

what are some cool things you guys are building/testing using these models?

echo mesa Nov 25, 2023, 4:30 PM

#

Guys, I'm about to read the statistical learning with applications in Python book. I'm just wondering whether I should read a book about just statistics first so that I can understand the concepts in it? Or would it make sense to just read about general statistics and implement the concepts in Python and then move on to the statistical learning book?