jaunty helm May 19, 2024, 1:43 PM

#

basically I know there's data imbalance, but after down sampling the models still can't predict Enrolled too well, so I'm asking what are some techniques I can apply here (multiclass classification, 1 class performs worse than all others)

cedar tusk May 19, 2024, 2:12 PM

#

from what i see, without feature engineering this data can yield at most .76 accuracy

jaunty helm May 19, 2024, 2:15 PM

#

I guess I'm more asking for techniques that I might not know which can help in this situation
obv good feature engineering probably helps, but are there other methods that I should know about? that kinda stuff

cedar tusk May 19, 2024, 2:16 PM

#

well, if i were you i would begin with feature selection

#

then do either factor analysis or pca

#

to see if you can get more knowledge out of the data

#

to be honest this 3/4 accuracy is related to there being too many categorical variables

#

if there was more numerics it would have been better

noble topaz May 19, 2024, 2:20 PM

#

Hello guys. I have a question. How can i know how many hidden layers has a CNN? I want to build one with 5 hidden layers but i am getting stuck

cedar tusk May 19, 2024, 2:20 PM

#

noble topaz Hello guys. I have a question. How can i know how many hidden layers has a CNN? ...

if u are building a neural net with keras it will look somewhat like this:

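from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

# X_train is assumed to be a 2-D array of training features
model = Sequential()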
model.add(Dense(64, input_dim=X_train.shape[1], activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(32, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(16, activation='relu'))
model.add(Dense(3, activation='softmax'))

#

every line that begins with model.add(Dense is a layer, the last one being the output.

noble topaz May 19, 2024, 2:22 PM

#

I need to do this in pytorch. Thanks for tou fast reply

#

Your*

cedar tusk May 19, 2024, 2:22 PM

#

pytorch has a similar syntax

#

import torch
import torch.nn as nn

# Define the neural network model
class BinaryClassificationNN(nn.Module):
    def __init__(self, input_size):
        super(BinaryClassificationNN, self).__init__()
        self.fc1 = nn.Linear(input_size, 16)
        self.fc2 = nn.Linear(16, 8)
        self.fc3 = nn.Linear(8, 1)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        x = self.fc3(x)
        x = self.sigmoid(x)
        return x

#

this is a binary classification neural net in pytorch

noble topaz May 19, 2024, 2:24 PM

#

3 hidden layers and one output

#

Am i mistaken?

cedar tusk May 19, 2024, 2:25 PM

#

1 input, 1 hidden, 1 out

#

the nn.Linear takes the arguments number of inputs, number of outputs

#

and u add a sigmoid to the end to take the value in the last neuron and make it into a probability
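On the layer-counting question, a rough sketch in plain Python (no PyTorch needed). The size list mirrors the `nn.Linear(input_size, 16)`, `nn.Linear(16, 8)`, `nn.Linear(8, 1)` chain above, with `input_size = 10` as a made-up value. Conventions differ; this one assumes "hidden" means every layer of units strictly between the input and the output, which is not the only way people count.

```python
def describe_layers(sizes):
    """Summarize a fully connected net given its unit counts per layer.

    sizes[0] is the input width and sizes[-1] the output width; everything
    in between is counted as hidden (one common convention, not the only one).
    """
    # Each adjacent (in, out) pair corresponds to one Linear/Dense layer.
    linear_layers = list(zip(sizes[:-1], sizes[1:]))
    hidden_widths = sizes[1:-1]
    return {
        "linear_layers": linear_layers,
        "num_linear_layers": len(linear_layers),
        "num_hidden_layers": len(hidden_widths),
        "hidden_widths": hidden_widths,
    }


# Hypothetical input width of 10; 16, 8 and 1 come from the snippet above.
summary = describe_layers([10, 16, 8, 1])
print(summary)
```

Under this convention a net with 5 hidden layers needs 6 Linear layers, for example sizes like `[10, 64, 64, 32, 32, 16, 3]` (widths made up).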

molten elk May 19, 2024, 2:42 PM

#

Here it is:

autumn gull May 19, 2024, 3:05 PM

#

actually im trying to build a personal virtual assistant and ive trained a tfidf vectorizer on dataset

#

which part of code should i share

past meteor May 19, 2024, 3:22 PM

#

Can you make a confusion matrix? This is the case where it shines

jaunty helm May 19, 2024, 3:42 PM

#

past meteor Can you make a confusion matrix? This is the case where it shines

you mean this right?

import pandas as pd
from sklearn.metrics import confusion_matrix

pd.DataFrame(
    confusion_matrix(y_true, y_pred, labels=['Dropout', 'Enrolled', 'Graduate']),
    index=['Dropout', 'Enrolled', 'Graduate'],
    columns=['Dropout', 'Enrolled', 'Graduate']
)

# original
-----SVC-----
          Dropout  Enrolled  Graduate
Dropout       206        29        49
Enrolled       31        52        76
Graduate        9        19       414

# class_weight='balanced'
-----SVC-----
          Dropout  Enrolled  Graduate
Dropout       193        57        34
Enrolled       28        92        39
Graduate       12        83       347

# down sample
-----SVC-----
          Dropout  Enrolled  Graduate
Dropout       134        50        16
Enrolled       30       131        38
Graduate        7        30       163
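For reading matrices like these, a small sketch that derives per-class precision and recall by hand. It assumes the scikit-learn layout (rows are true classes, columns are predicted classes) and uses the numbers from the "original" SVC matrix above; `per_class_metrics` is just an illustrative helper name.

```python
labels = ["Dropout", "Enrolled", "Graduate"]

# Rows are true classes, columns are predicted classes
# (numbers copied from the untuned "original" SVC matrix above).
matrix = [
    [206, 29, 49],
    [31, 52, 76],
    [9, 19, 414],
]


def per_class_metrics(matrix, labels):
    """Return {label: (precision, recall)} from a square confusion matrix."""
    metrics = {}
    for i, label in enumerate(labels):
        true_positive = matrix[i][i]
        row_total = sum(matrix[i])                 # samples truly in class i
        col_total = sum(row[i] for row in matrix)  # samples predicted as class i
        precision = true_positive / col_total if col_total else 0.0
        recall = true_positive / row_total if row_total else 0.0
        metrics[label] = (precision, recall)
    return metrics


for label, (precision, recall) in per_class_metrics(matrix, labels).items():
    print(f"{label:10} precision={precision:.2f} recall={recall:.2f}")
```

Reading along the Enrolled row also shows where the errors go: 76 of the 159 Enrolled samples are predicted as Graduate.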

past meteor May 19, 2024, 4:16 PM

#

jaunty helm you mean this right? ```py pd.DataFrame( confusion_matrix(y_true, y_pred, la...

Have you tuned your SVC? What Kernel are you using?

jaunty helm May 19, 2024, 4:18 PM

#

past meteor Have you tuned your SVC? What Kernel are you using?

I haven't done any tuning, am using rbf.
other untuned models (RandomForest, lightgbm, catboost) also showed a similar precision-recall-f1 trend, I haven't checked their confusion matrices tho

past meteor May 19, 2024, 4:19 PM

#

jaunty helm I haven't done any tuning, am using `rbf`. other untuned models (`RandomForest`...

Can you try tuning it? Just tune the C and the gamma parameters

jaunty helm May 19, 2024, 4:20 PM

#

mkay

#

should I use class_weight='balanced'?

past meteor May 19, 2024, 4:20 PM

#

Have you split off some data properly? If so, just looking at those records that are misclassified helps

past meteor May 19, 2024, 4:21 PM

#

jaunty helm should I use `class_weight='balanced'`?

Looking at your confusion matrix class weights made it worse
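For context on what `class_weight='balanced'` does: as far as I know, scikit-learn weights each class by `n_samples / (n_classes * class_count)`. A minimal sketch of that formula in plain Python, using the row totals of the confusion matrix above (284 / 159 / 442) purely for illustration; real weights would come from the training labels.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Illustrative labels built from the confusion matrix row totals above;
# in practice the weights are computed from the training labels instead.
y = ["Dropout"] * 284 + ["Enrolled"] * 159 + ["Graduate"] * 442
weights = balanced_class_weights(y)
for cls, weight in weights.items():
    print(f"{cls:10} {weight:.3f}")
```

The rarest class gets the largest weight, which fits the matrices above: Enrolled recall goes up, at the cost of more Graduate samples being pulled into Enrolled.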

jaunty helm May 19, 2024, 4:22 PM

#

past meteor Have you split off some data properly? If so, just looking at those records that...

split off some data properly what does that mean?

past meteor May 19, 2024, 4:22 PM

#

train test split followed by cross validation

#

And not infinitely checking the misclassified rows of your test set (that's cheating)

jaunty helm May 19, 2024, 4:26 PM

#

past meteor train test split followed by cross validation

you mean this right? (I didnt do hyperparams search yet tho)

from sklearn.model_selection import train_test_split

train_X, test_X, train_y, test_y = train_test_split(X, y)
# hyperparam search / tune on CV(train_X), check model on test_X

jaunty helm May 19, 2024, 4:34 PM

#

jaunty helm mkay

tuning it a lil got me

{'C': 15.705979599964326, 'gamma': 0.0019505526555292866}

# original
-----SVC-----
              precision    recall  f1-score   support

     Dropout       0.85      0.72      0.78       284
    Enrolled       0.54      0.38      0.45       159
    Graduate       0.77      0.93      0.84       442

    accuracy                           0.76       885
   macro avg       0.72      0.68      0.69       885
weighted avg       0.76      0.76      0.75       885

          Dropout  Enrolled  Graduate
Dropout       204        30        50
Enrolled       27        61        71
Graduate        9        22       411


# down sampled
-----SVC-----
              precision    recall  f1-score   support

     Dropout       0.78      0.67      0.72       200
    Enrolled       0.61      0.66      0.63       199
    Graduate       0.75      0.80      0.77       200

    accuracy                           0.71       599
   macro avg       0.71      0.71      0.71       599
weighted avg       0.71      0.71      0.71       599

          Dropout  Enrolled  Graduate
Dropout       133        51        16
Enrolled       30       132        37
Graduate        7        34       159

jaunty helm May 19, 2024, 4:52 PM

#

welp gotta leave
guess I'll just let optuna run for a while and see if it comes up with smthn better

lapis sequoia May 19, 2024, 4:54 PM

#

Does anyone have any experience with derivative free minimization

#

Are there any new methods that are better

past meteor May 19, 2024, 5:07 PM

#

jaunty helm you mean this right? (I didnt do hyperparams search yet tho) ```py train_X, test...

exactly

past meteor May 19, 2024, 5:07 PM

#

jaunty helm welp gotta leave guess I'll just let optuna run for a while and see if it comes ...

Yeah, you only have 2 parameters so this is something I'd grid search and not use fancy optuna algos 😄

#

GridSearchCv from scikit is enough here
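To make the grid search idea concrete, here is a sketch of what it does under the hood for two parameters: try every combination and keep the best. `cv_score` is a made-up toy function standing in for a real cross-validated score, and the grid values are assumptions; with scikit-learn, `GridSearchCV` does the looping and the cross-validation for you.

```python
import itertools
import math


def cv_score(C, gamma):
    """Hypothetical stand-in for a cross-validated score of SVC(C=C, gamma=gamma).

    A toy function that peaks at C=10, gamma=0.01 so the search has something
    to find; replace it with a real cross-validation run.
    """
    return -((math.log10(C) - 1) ** 2) - (math.log10(gamma) + 2) ** 2


# Log-spaced candidate values, a common choice for C and gamma.
param_grid = {
    "C": [10 ** k for k in range(-2, 4)],      # 0.01 ... 1000
    "gamma": [10 ** k for k in range(-4, 1)],  # 0.0001 ... 1
}

best_params, best_score = None, -math.inf
for C, gamma in itertools.product(param_grid["C"], param_grid["gamma"]):
    score = cv_score(C, gamma)
    if score > best_score:
        best_params, best_score = {"C": C, "gamma": gamma}, score

print(best_params, best_score)
```

With 6 x 5 = 30 combinations this stays cheap, which is the point being made above about having only two parameters.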

river cape May 19, 2024, 6:59 PM

#

Hello everyone

#

#

I have used a virtual environment and installed these packages

#

but I cannot get them to be imported

serene scaffold May 19, 2024, 7:03 PM

#

@river cape your editor must be using a different environment than the one where you installed stuff

river cape May 19, 2024, 7:05 PM

#

serene scaffold @river cape your editor must be using a different environment than the...

OH yea

#

But it isnt showing me the environment which I want

serene scaffold May 19, 2024, 7:13 PM

#

@river cape what editor are you using

river cape May 19, 2024, 7:13 PM

#

VS code

serene scaffold May 19, 2024, 7:14 PM

#

river cape VS code

You need to figure out where myenv is and tell Vs code to use that. You'll also need to restart Jupyter.

river cape May 19, 2024, 7:16 PM

#

serene scaffold You need to figure out where myenv is and tell Vs code to use that. You'll also ...

Thank you it worked

river cape May 19, 2024, 7:17 PM

#

serene scaffold You need to figure out where myenv is and tell Vs code to use that. You'll also ...

Actually it is showing to virtualenv of a folder called Notebooks

#

I deleted that

#

Yet it still points it to that

river cape May 19, 2024, 7:18 PM

#

serene scaffold You need to figure out where myenv is and tell Vs code to use that. You'll also ...

#

I have created another environment for this current folder , it doesnt show up in this

clever cipher May 19, 2024, 8:02 PM

#

I have a question... What would be the logistics of running a small local language model in my simple 2d game that returns strings which are commentaries of various game events? (could be stuff like dialogue)

I'm very new to this, so please forgive the broadness of the question, but how feasible would this be? Where would be a good place to start regarding training my own small scale models?

serene scaffold May 19, 2024, 8:25 PM

#

clever cipher I have a question... What would be the logistics of running a small local langua...

You won't be able to train anything that is anything like what most people think language models are, unfortunately.

#

Models like ChatGPT are specifically generative and interactive language models

#

But language models are actually a much broader class of model than that.

#

You can make a generative language model that's based on markov chains on your laptop. Though I'm not sure it would produce coherent responses to game events

#

You would also have to encode each game event as a natural language statement.

serene scaffold May 19, 2024, 8:29 PM

#

serene scaffold You can make a generative language model that's based on markov chains on your l...

In particular, it wouldn't be able to understand game state.

#

Or remember it.
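A minimal sketch of the Markov chain idea mentioned above, using only the standard library. The corpus lines are made up, and `build_chain` / `generate` are illustrative names. It only knows which word followed which, so it has no notion of game state, as noted above.

```python
import random
from collections import defaultdict


def build_chain(sentences):
    """Map each word to the list of words that followed it in the corpus."""
    chain = defaultdict(list)
    for sentence in sentences:
        words = sentence.split()
        for current_word, next_word in zip(words, words[1:]):
            chain[current_word].append(next_word)
    return chain


def generate(chain, start, max_words=8, rng=random):
    """Walk the chain from `start`, picking a random observed successor each step."""
    words = [start]
    while len(words) < max_words and chain.get(words[-1]):
        words.append(rng.choice(chain[words[-1]]))
    return " ".join(words)


# Made-up commentary lines standing in for a real corpus.
corpus = [
    "the player found a rusty key",
    "the player opened a hidden door",
    "the goblin found a hidden key",
]
chain = build_chain(corpus)
print(generate(chain, "the", rng=random.Random(0)))
```

Every generated word pair was seen in the corpus, but the sentence as a whole may never have been, which is where both the variety and the incoherence come from.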

spring field May 19, 2024, 9:38 PM

#

serene scaffold You won't be able to train anything that is anything like what most people think...

alright, but training aside, surely they could at least fine-tune an existing model
to some extent anyway

#

or if not fine-tune then throw prompts at it to make it behave the way they want

serene scaffold May 19, 2024, 9:42 PM

#

spring field alright, but training aside, surely they could at least fine-tune an existing mo...

If the model is small enough that they can run it on their laptop then yes

#

I should have said that, I was distracted by pycon

spring field May 19, 2024, 9:49 PM

#

speaking of language models, my last attempt at improving the multilayer GRU next token predictor was to add layer normalization between them, to... speed up training? at least that's what I understand all these normalization layers are for (I actually managed to ~~read~~ skim over a couple papers on the topics I wanted to find out more about (currently reading the paper on attention)), not sure how much of an impact those layers had, but nonetheless, without handling class imbalance it converged to predicting only . pretty much, handling class imbalance the test loss just went up and up, so uhhh, idk what could be the issue (maybe I should MLFlow through some hyperparameters), maybe the dataset is not large enough, maybe this or maybe that, I don't know, I'll now proceed over to transformers though and yeah, that's kind of my little update 😁

past meteor May 19, 2024, 11:32 PM

#

spring field speaking of language models, my last attempt at improving the multilayer GRU nex...

Yup, tune the parameters 🚀

late lichen May 20, 2024, 1:08 AM

#

I want to do a topological sorting

#

But I don't know how

Pls someone help me

#

My input data is Like

{
   "Node_ID":["bias : int",
            "Activator : callable",
            [["Descendant_ID : int",
              "weight : float"], ...]],...
}

keen crow May 20, 2024, 1:09 AM

#

Hi guys! I'm currently working on an ML project which consists of training a ResNet-18 model to predict tire tread depth. I have fully working code right now, but it's not achieving the desired accuracy. I have tried a lot of different stuff but still can't seem to reach the goal. Would someone mind helping me figure out a way to get better accuracy? I would really appreciate it!

late lichen May 20, 2024, 1:11 AM

#

late lichen My input data is Like ```py { "Node_ID":["bias : int", "Activator...

This project will be use to sort the nodes while training the network using NEAT

#

Yes I will use DAGs network

late lichen May 20, 2024, 1:38 AM

#

Dead chat?

agile cobalt May 20, 2024, 1:44 AM

#

late lichen Dead chat?

if you're working with graphs, odds are you should be using one of these, so here:

https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.dag.topological_sort.html (python library)
https://neo4j.com/docs/graph-data-science/current/algorithms/dag/topological-sort/ (graph database)
if you are not using either of them, then first load your data into either of them then use them.

#

alternatively, you could use the source code of NetworkX as a reference to how to implement it yourself if you truly must
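Another option that needs no third-party library: Python 3.9+ ships `graphlib.TopologicalSorter` in the standard library. A sketch assuming the input shape described above (`node_id -> [bias, activator, [[descendant_id, weight], ...]]`); the concrete network and the use of `abs` as an activator are placeholders.

```python
from graphlib import TopologicalSorter

# Hypothetical network in the shape described above:
# node_id -> [bias, activator, [[descendant_id, weight], ...]]
network = {
    0: [0, abs, [[2, 0.5], [3, -1.2]]],
    1: [0, abs, [[2, 0.8]]],
    2: [1, abs, [[3, 0.3]]],
    3: [0, abs, []],
}


def topological_order(network):
    """Return node IDs so that every node comes before its descendants."""
    sorter = TopologicalSorter()
    for node_id, (_bias, _activator, edges) in network.items():
        sorter.add(node_id)  # register nodes even if nothing points at them
        for descendant_id, _weight in edges:
            # add(node, *predecessors): the descendant depends on node_id
            sorter.add(descendant_id, node_id)
    # static_order() raises graphlib.CycleError if the graph is not a DAG
    return list(sorter.static_order())


order = topological_order(network)
print(order)
```

The `CycleError` is handy with NEAT-style mutation, since it flags a connection that would turn the network into something other than a DAG.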

late lichen May 20, 2024, 1:52 AM

#

Thanks

spring field May 20, 2024, 2:19 AM

#

I mean that I pretty much make the distribution of the words the same, i.e., factor / occurrences, that is apparently something one should do and it definitely makes sense for classification tasks or regression and such. It might be that I need to increase the threshold for when a token gets put into the group of rare tokens. I think I can see how in next token prediction class imbalance might be desired, but it just converges to the most common token it seems, could be an issue with my model obviously, is GRU used for next token prediction even? There are frankly tons of variables here at play obviously, but I think for now I'll certainly just move on and direct my efforts towards understanding transformers.

spring field May 20, 2024, 5:06 AM

#

yeah, I remember, I'll just assume something's off with the model then
and continue with transformers
cuz I also need to cover ViT afterwards

vestal imp May 20, 2024, 7:15 AM

#

"Layer 'normalization' expected 3 variables, but received 0 variables during loading. Expected: ['normalization/mean:0', 'normalization/variance:0', 'normalization/count:0']" guys does anyone knows abt this error?

#

so far online source doesnt rlly provide a good answer

vestal imp May 20, 2024, 7:58 AM

#

For my case i have trained the model on gg colab which used tf 2.15 and now I'm loading it out using tf 2.15, mayb the process of downloading the file from gg drive to my machine has problems?

#

So u mean like the version is matched so what if left is the .keras file download got probs?

#

Ah gotcha, i will look into it, ty for the help mate

jaunty helm May 20, 2024, 8:04 AM

#

past meteor Yeah, you only have 2 parameters so this is something I'd grid search and not us...

welp, I'll keep that in mind the next time 😅
anyways, this is what I got

{'C': 31973.23413892633, 'gamma': 9.61611073865163e-05}

# tuned on CV(X_train) and checked on X_test
-----SVC-----
              precision    recall  f1-score   support

     Dropout       0.82      0.70      0.76       284
    Enrolled       0.51      0.39      0.44       159
    Graduate       0.78      0.91      0.84       442

    accuracy                           0.75       885
   macro avg       0.70      0.67      0.68       885
weighted avg       0.74      0.75      0.74       885

          Dropout  Enrolled  Graduate
Dropout       200        34        50
Enrolled       30        62        67
Graduate       13        25       404

# no hyperparams
-----SVC-----
              precision    recall  f1-score   support

     Dropout       0.83      0.70      0.76       284
    Enrolled       0.51      0.36      0.43       159
    Graduate       0.76      0.92      0.83       442

    accuracy                           0.75       885
   macro avg       0.70      0.66      0.67       885
weighted avg       0.74      0.75      0.74       885

          Dropout  Enrolled  Graduate
Dropout       198        34        52
Enrolled       26        58        75
Graduate       14        21       407

jaunty helm May 20, 2024, 8:17 AM

#

jaunty helm welp, I'll keep that in mind the next time 😅 anyways, this is what I got ```py...

sooo I don't think the hyperparams did much

slim finch May 20, 2024, 9:01 AM

#

Hello everyone,

I have developed a Python application for Windows that transcribes speech using OpenAI's Whisper model. I've also created a small UI for this app. However, I'm running into issues when trying to create a .exe file to share the program.

The main problem is with the backend: when I try to transcribe speech using the .exe version, I encounter various errors. It appears that not all dependencies are being included in the installer, likely due to the extensive nature of the Whisper model.

Could anyone advise on the best way to package such a large project with a substantial backend? Any tips or solutions would be greatly appreciated!

Thank you!

past meteor May 20, 2024, 9:15 AM

#

jaunty helm sooo I don't think the hyperparams did much

The next step is just looking at the misclassified ones

boreal nest May 20, 2024, 9:45 AM

#

Hello everyone, sharing a notebook on customer segmentation using KMeans and Hierarchical Clustering. I'm yet to finish the conclusion. I hope you guys can check it out. Thank you.
https://www.kaggle.com/code/jaepin/customer-segmentation-using-unsupervised-ml-algo

Customer Segmentation using Unsupervised ML Algo.

Explore and run machine learning code with Kaggle Notebooks | Using data from Customer Segmentation : Clustering

jaunty helm May 20, 2024, 9:53 AM

#

past meteor The next step is just looking at the misclassified ones

do I like compare them against the correctly classified ones or something?
(there are 37 columns and I'm not sure how I'd see what's causing the misclassification manually)
not sure what I'm supposed to look at

past meteor May 20, 2024, 10:47 AM

#

jaunty helm do I like compare them against the correctly classified ones or something? (ther...

Yeah, manually. It's time consuming but you need to get intuition on what's going wrong. Another thing you can do is make plots that show the error versus each variable. Maybe you can see that if this variable occurs errors are more common and so on
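One way to start on "error versus each variable" before plotting anything: compute the misclassification rate per value of a column and look for values that stand out. A sketch in plain Python; the rows and the `scholarship` column name are made up, and the rows should come from a validation split rather than the test set.

```python
from collections import defaultdict


def error_rate_by_value(rows, feature):
    """Return {feature value: share of rows where prediction != truth}."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for row in rows:
        value = row[feature]
        totals[value] += 1
        if row["y_true"] != row["y_pred"]:
            errors[value] += 1
    return {value: errors[value] / totals[value] for value in totals}


# Made-up validation rows; `scholarship` is a hypothetical column name.
rows = [
    {"scholarship": "yes", "y_true": "Graduate", "y_pred": "Graduate"},
    {"scholarship": "yes", "y_true": "Enrolled", "y_pred": "Graduate"},
    {"scholarship": "no", "y_true": "Dropout", "y_pred": "Dropout"},
    {"scholarship": "no", "y_true": "Enrolled", "y_pred": "Enrolled"},
]
rates = error_rate_by_value(rows, "scholarship")
for value, rate in sorted(rates.items(), key=lambda item: -item[1]):
    print(f"{value:5} error rate = {rate:.2f}")
```

Running this once per column (37 here) gives a ranked list of places to look, which is less painful than eyeballing raw rows.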

jaunty helm May 20, 2024, 10:49 AM

#

past meteor Yeah, manually. It's time consuming but you need to get intuition on what's goin...

aight, ty for your help and insights

dreamy sorrel May 20, 2024, 12:00 PM

#

hi guys! I am working on a multilabel classification task, and i have 3 models trained with different datasets. I want to ensemble these models. whats the best approach?

vestal imp May 20, 2024, 1:02 PM

#

I mean just tf.keras.models.load_model ye?

#

Does being a h5 or a keras file has anything to do with it aye?

vestal imp May 20, 2024, 1:30 PM

#

weird both keras and tf are 2.15 in both places

orchid forge May 20, 2024, 4:41 PM

#

I just needed to know

warm trellis May 20, 2024, 4:41 PM

#

Anyone here using Lightning AI to train their model?

orchid forge May 20, 2024, 4:41 PM

#

If this https://www.datawars.io/
Is a good place to make projects

DataWars - Free Data Science Interactive Projects

DataWars is a Project-based playground with +1000 ready-to-solve, interactive, Data Science projects. Practice your skills solving real life challenges in an interactive, real-life Data Science simulator.

river cape May 20, 2024, 5:21 PM

#

print("{0:20}{1:20}".format(word, wordnet_lemma.lemmatize(word, pos='v')))

#

Any idea as to what does :20 mean?

agile cobalt May 20, 2024, 5:39 PM

#

!e adds some empty space (padding) to align
```py
for name in ('Foo', 'Title Bar', 'Long Title Baz'):
    print(f'{name:20} - test')
```

arctic wedgeBOT May 20, 2024, 5:39 PM

#

@agile cobalt :white_check_mark: Your 3.12 eval job has completed with return code 0.

001 | Foo                  - test
002 | Title Bar            - test
003 | Long Title Baz       - test

agile cobalt May 20, 2024, 5:40 PM

#

!d str.format

arctic wedgeBOT May 20, 2024, 5:40 PM

#

str.format

```py
str.format(*args, **kwargs)
```
Perform a string formatting operation. The string on which this method is called can contain literal text or replacement fields delimited by braces `{}`. Each replacement field contains either the numeric index of a positional argument, or the name of a keyword argument. Returns a copy of the string where each replacement field is replaced with the string value of the corresponding argument.

```py
>>> "The sum of 1 + 2 is {0}".format(1+2)
'The sum of 1 + 2 is 3'
```
See [Format String Syntax](https://docs.python.org/3/library/string.html#formatstrings) for a description of the various formatting options that can be specified in format strings.

agile cobalt May 20, 2024, 5:40 PM

#

See Format String Syntax for a description of the various formatting options that can be specified in format strings.
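A few more variations on the `:20` width spec, as a sketch. `word` and `lemma` are placeholder values standing in for the lemmatizer call in the question.

```python
word, lemma = "running", "run"

# Width 20: strings are left-aligned and padded with spaces by default.
line = "{0:20}{1:20}".format(word, lemma)
print(repr(line))

# Explicit alignment: < left, > right, ^ centered.
print(repr("{:<10}".format("run")))
print(repr("{:>10}".format("run")))
print(repr("{:^10}".format("run")))

# Numbers are right-aligned by default.
print(repr("{:10}".format(42)))
```

So `{0:20}{1:20}` prints two columns that are each 20 characters wide, which keeps the word and its lemma lined up across rows.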

normal python May 20, 2024, 6:12 PM

#

Confession time: Trying to make pytorch to work with ROCm is the most rage inducing, frustrating and completely awful experience that anyone ever had to experience

left tartan May 20, 2024, 7:30 PM

#

tbh, I think it's a good idea. Not a lot of practical project sites for data.

#

It seems like it's more educational/useful than leetcode

warm trellis May 20, 2024, 7:59 PM

#

Hey guys. I've a weather dataset where I train data to learn how to predict solar irradiance. Now after it, I wanna use this model as a base model for PV solar energy prediction in transfer learning, but it does not do any good job in the latter no sort of change. So how can I debug what went wrong?

spring field May 20, 2024, 8:38 PM

#

#ot2-never-nester’s-nightmare message

#

that's intriguing

#

but it's only 50 days

#

like a db, but with history

final swift May 20, 2024, 11:14 PM

#

I'm working on a project where I'm trying to get an AI to learn how to play blackjack, would it be simpler to get the AI to predict whether you will lose or not depending on each move you could make and then make decisions based on that, or would it be simpler to just go straight to making decisions based on the state of the game and then improving based on the results of that?

serene scaffold May 21, 2024, 1:26 AM

#

final swift I'm working on a project where I'm trying to get an AI to learn how to play blac...

I don't know the rules of blackjack. But for chess, the non-neural approach is to compute all the possibilities up to n turns out, and then use a heuristic to decide which path is best.

final swift May 21, 2024, 1:27 AM

#

Okay thank you.

serene scaffold May 21, 2024, 1:29 AM

#

If you have a model that "predicts whether the next move will eventually cause you to win or lose", you still have to decide how the model will learn to make that prediction. Which still involves learning some understanding of what makes one move better than another

#

Does blackjack involve random chance?
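Blackjack does involve chance, since cards are drawn from a shuffled deck, so a "will I lose if I make this move" predictor is really estimating a probability. A small sketch of one such quantity, the chance of busting on the next hit, under simplifying assumptions: an infinite deck, a hard total, and aces counted as 1. `bust_probability` is an illustrative name.

```python
from fractions import Fraction

# Card values for the 13 ranks, assuming an infinite deck (every rank equally
# likely on each draw) and counting the ace as 1. Both are simplifications.
RANK_VALUES = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 10, 10, 10]


def bust_probability(hard_total):
    """Probability that drawing one more card pushes a hard total above 21."""
    busting = sum(1 for value in RANK_VALUES if hard_total + value > 21)
    return Fraction(busting, len(RANK_VALUES))


for total in range(11, 21):
    print(total, bust_probability(total))
```

A learned model would have to approximate numbers like these (plus the dealer's side of the game), which is why the two approaches in the question end up learning much the same thing.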

frosty fulcrum May 21, 2024, 1:31 AM

#

does anyone know what's the best deep learning model for Regression task?

serene scaffold May 21, 2024, 1:39 AM

#

frosty fulcrum does anyone know what's the best deep learning model for Regression task?

Hello, welcome to our wonderful data science chat.

Try being more specific about your task

frosty fulcrum May 21, 2024, 1:41 AM

#

serene scaffold Hello, welcome to our wonderful data science chat. Try being more specific abou...

I don't know how to be more specific other than that the dataset doesn't contain a Time feature, so LSTM isn't helpful.

serene scaffold May 21, 2024, 1:41 AM

#

frosty fulcrum I don't know how to be more specific other than that the dataset doesn't contain...

Try telling the chat about the dataset

frosty fulcrum May 21, 2024, 1:46 AM

#

serene scaffold Try telling the chat about the dataset

Well, the dataset is in the CSV format and contains 21 features, whereas the Yield values range from 0-1. I’ve tried XGBoost and CatBoost, but the accuracy seems to be stuck at 86%. I believe the dataset is imbalanced, but I want to try other models as well to see if I can reach 90% without doing data augmentation.

serene scaffold May 21, 2024, 1:48 AM

#

frosty fulcrum Well, the dataset is in the CSV format and contains 21 features, whereas the Yie...

I've heard that xgboost tends to be the best for tabular data. Are there any hyper parameters you can change?

Take a look at what instances are getting predicted incorrectly. What do they have in common? Are there any patterns in the data that those instances don't uphold?

frosty fulcrum May 21, 2024, 1:50 AM

#

serene scaffold I've heard that xgboost tends to be the best for tabular data. Are there any hyp...

I’ve tried the Grid Search CV method to find the best hyper-parameters. It improved the accuracy but not by a lot.

frosty fulcrum May 21, 2024, 1:51 AM

#

serene scaffold I've heard that xgboost tends to be the best for tabular data. Are there any hyp...

I really don’t want to do data analysis, but it seems like I’ve no other option. oof

jaunty helm May 21, 2024, 1:59 AM

#

frosty fulcrum I really don’t want to do data analysis, but it seems like I’ve no other option....

if you just want something quick, have you looked into automl solutions like autogluon, tpot, etc?

jaunty helm May 21, 2024, 2:07 AM

#

frosty fulcrum Well, the dataset is in the CSV format and contains 21 features, whereas the Yie...

imbalanced
a lot of these models have parameters that try to remedy this, for xgb I believe it's sample_weight, catboost class_weights, lgbm is_unbalance, etc

#

and also if you know the data is imbalanced, accuracy may not be what you want to look at, something like f1 might be better

frosty fulcrum May 21, 2024, 2:09 AM

#

jaunty helm if you just want something quick, have you looked into automl solutions like aut...

thx, I'll look into it.

orchid forge May 21, 2024, 3:52 AM

#

idk, tell me if its good

spring field May 21, 2024, 3:55 AM

#

Yeah, idk, I implemented a transformer, but it's only predicting dots... It does seem to converge faster to just predicting this one symbol compared to the RNNs, but nonetheless that is not the expected result as one might imagine
I mean, at this point, I have tried like at least 3 separate models (two RNN types, now one Transformer type), they just don't want to work and I'm a tad lost here
maybe it is the dataset, maybe it is that it's not actually tokenizing, it's more just splitting words
another thing I noticed is that they had a lot more sentences and a lot more words in the vocabulary, the dataset I'm using has 10k tokens and 34k sentences and I did adjust the other hyperparameters according to the paper (Attention Is All You Need) (as closely as possible)

#

dots are the most common token, by quite a large margin too

#

I'll let it train

#

maybe by some miracle, a miracle will happen, idk, will see later ig

orchid forge May 21, 2024, 4:11 AM

#

what is itertools im not able to understand

spring field May 21, 2024, 4:11 AM

#

tools for various iteration needs, you're gonna have to be a bit specific if you want a more specific answer

orchid forge May 21, 2024, 4:14 AM

#

it is in a code which is combining two different dataset columns

orchid forge May 21, 2024, 4:14 AM

#

spring field tools for various iteration needs, you're gonna have to be a bit specific if you...

this

spring field May 21, 2024, 4:16 AM

#

may I suggest the documentation for itertools.product
it's really just a couple iteration utilities, certainly one of the built-in modules I would recommend being familiar with at least to some extent

Python documentation

itertools — Functions creating iterators for efficient looping

This module implements a number of iterator building blocks inspired by constructs from APL, Haskell, and SML. Each has been recast in a form suitable for Python. The module standardizes a core set...

orchid forge May 21, 2024, 4:16 AM

#

hmm

#

well is this part of data analysis or data science?

#

or anything else

#

and why is it called module

spring field May 21, 2024, 4:18 AM

#

if you make it a part of either, then it is a part of it

orchid forge May 21, 2024, 4:18 AM

#

ok

spring field May 21, 2024, 4:19 AM

#

orchid forge and why is it called module

while I can't give you the exact origins of the name, I assume it's to do with Python files being modular in that you can import them and reuse and stuff like that, so, yeah, you can create your own modules, you can use built-in modules, you can use 3rd party modules, they're also sometimes called packages when there are several modules grouped together or libraries, but yk, they're all modules when you import them anyway, it's just an object in Python pretty much

#

it's probably something I'd recommend getting familiar with before trying to learn stuff like pandas

orchid forge May 21, 2024, 4:21 AM

#

yeah sure, its funny question weird things just becuz i wanna to do english literature than this

#

i question the meaning of word instead questioning what is it use for lol

#

hahaha

spring field May 21, 2024, 4:30 AM

#

spring field maybe by some miracle, a miracle will happen, idk, will see later ig

yk what, it seems it's going away from them dots pg_party

#

I forgot to add rollouts so I'm stuck with test predictions and not some fun fresh stuff, but at least it's not just dots

#

orchid forge May 21, 2024, 4:44 AM

#

orchid forge it is in a code which which is combining two different dataset columns

im sorry, im not understanding why we used Itertools.Product() here

spring field May 21, 2024, 4:49 AM

#

well, did you read the documentation? what part of the docs did you not understand? maybe I can help you explain it.
on a higher level it just seems to have been used to generate those combinations pretty much

spring field May 21, 2024, 4:59 AM

#

orchid forge im sorry, im not understanding why we used Itertools.Product() here

hopefully this helps you better understand what itertools.product does
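In code form, a small sketch of `itertools.product` pairing up values from two columns. The `regions` and `years` lists are made-up stand-ins for the dataset columns in question.

```python
from itertools import product

# Hypothetical column values from two different datasets.
regions = ["north", "south"]
years = [2022, 2023, 2024]

# product() yields every (region, year) pair, like a nested for loop:
#   for region in regions:
#       for year in years: ...
pairs = list(product(regions, years))
for region, year in pairs:
    print(region, year)
```

The last input varies fastest, so all years for the first region come out before the second region starts.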

spring field May 21, 2024, 5:01 AM

#

spring field yk what, it seems it's going away from them dots pg_party

yooo, actually, it's doing a lot better than the dang GRU net (maybe it just needed a couple thousand more iterations and it would've been at the same level too...)

spring field May 21, 2024, 5:22 AM

#

(I still partially blame the dataset)

burnt pond May 21, 2024, 5:23 AM

#

How do I start learning ml and ai

Like I know the maths such as calculus and statistics, and I have also learned Python, so I would like to know how to start learning ML and AI. I would also like it if you could point me to some good teaching websites or YT tutorials, not the too-heavy ones, just basic to intermediate, so that I can at least create my own models and also fine-tune them

spring field May 21, 2024, 5:24 AM

#

burnt pond How do I start learning ml and ai Like I know the maths such as calculus and s...

check out the pinned messages in this channel

orchid forge May 21, 2024, 7:21 AM

#

spring field hopefully this helps you better understand what `itertools.product` does

omg thanks....this is exactly what i want to learn data analysis. this visual way of showing a code

#

btw im gonna have to give an interview for a "Web Data Scraping Company" - https://www.actowizsolutions.com/about-us.php
guys can anyone tell me what to prepare for this

Enterprise Web Data Scraping Company USA, UAE, and Australia

World’s most prominent companies trust Actowiz Solutions to convert millions of web page data into actionable insights. Get top Enterprise web data scraping company USA, UAE, and Australia.

orchid forge May 21, 2024, 7:39 AM

#

oh thanks, i guess you approve this website. im happy im learning on it. that i first came up with this, now everyone knows. haha i'm so amazing

#

what subjects would i need to be good at data analysis? tell me

#

i'll go there and learn it

#

i'll go with 'Probability and Statistics' for now and learn it before i go to sleep

#

instead of wasting time playing "age of mythology"

#

"One strategy is to use the website's API, if available. APIs often provide a more structured way of accessing data and are less likely to have blocks in place to prevent scraping."
what does website API mean?
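
Roughly, a website's API is an address that returns the data itself in a structured format (usually JSON) instead of a page meant for humans. The sketch below contrasts the two using made-up strings rather than a real site, so the payload, field names, and HTML are all assumptions:

```python
import json
from html.parser import HTMLParser

# Made-up API response: the fields are already labeled.
api_response = '{"product": "Laptop", "price": 999.0}'
data = json.loads(api_response)
api_price = data["price"]

# Made-up web page holding the same information, buried in markup.
page = '<html><body><h1>Laptop</h1><span class="price">999.0</span></body></html>'


class PriceParser(HTMLParser):
    """Collects the text inside <span class="price">."""

    def __init__(self):
        super().__init__()
        self.in_price = False
        self.price = None

    def handle_starttag(self, tag, attrs):
        if tag == "span" and ("class", "price") in attrs:
            self.in_price = True

    def handle_data(self, text):
        if self.in_price:
            self.price = float(text)

    def handle_endtag(self, tag):
        if tag == "span":
            self.in_price = False


parser = PriceParser()
parser.feed(page)
scraped_price = parser.price

print(api_price, scraped_price)
```

Both routes end up with the same number, but the scraping route depends on the page layout staying the same, which is the "more structured" point the quote is making.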

orchid forge May 21, 2024, 7:56 AM

#

orchid forge "One strategy is to use the website's API, if available. APIs often provide a mo...

also what does this mean in simple language

#

oh

#

so we read the given set of rules and understand the block they might give us for their authentication requirement and then do something with the block given

#

right?

#

i'm sorry for my English i sound so unprofessional

#

yes

fleet epoch May 21, 2024, 8:03 AM

#

hello, I'm new here. I'm doing a project with neural collaborative filtering and I tried to run it on the GPU. I have Win 11 and I downloaded the latest version of CUDA, but PyTorch can't find it. and yes, I have a GPU. what should I do?

orchid forge May 21, 2024, 8:03 AM

#

ohhh god how can u make so much sense

#

hmmm right

#

but lets assume if there's a block

#

oh

#

god wow

#

how could u know this much man

#

you're amazing

#

so you're a data analyst ?

#

idk what is that but sounds cool

#

oh

past meteor May 21, 2024, 8:12 AM

#

Maximum likelihood estimator

orchid forge May 21, 2024, 8:12 AM

#

i think it might be night there, if u r working in day, i think u should sleep and rest, u seem like a hard working person idk

past meteor May 21, 2024, 8:13 AM

#

fleet epoch hello Im new here im doing one project with neural collaborative filtering and i...

Are you using windows subsystem for Linux?

fleet epoch May 21, 2024, 8:13 AM

#

no pure win

orchid forge May 21, 2024, 8:13 AM

#

yeah well u r a solid person

past meteor May 21, 2024, 8:13 AM

#

Yeah, the latest torch versions don't work with GPU on pure windows

orchid forge May 21, 2024, 8:14 AM

#

good man your future is bright

fleet epoch May 21, 2024, 8:14 AM

#

past meteor Yeah, the latest torch versions don't work with GPU on pure windows

what should i do ?

orchid forge May 21, 2024, 8:14 AM

#

good luck

past meteor May 21, 2024, 8:14 AM

#

fleet epoch what should i do ?

You'll have to use WSL

#

It's explained somewhere in the docs

#

Actually I'm wrong sorry

#

This was the case with Tensorflow 🥴

lapis sequoia May 21, 2024, 8:15 AM

#

does anybody know of a good way to generate smooth noise

past meteor May 21, 2024, 8:16 AM

#

fleet epoch what should i do ?

Hmmmm you should be able to use it with pure windows. Have you tried running torch.cuda.is_available()

orchid forge May 21, 2024, 8:17 AM

#

i mean i love this server becuz people here help a dumb person like me to understand things in better human language, i really trust u guys blindly. i've only come across smart people on this server, who always help me learn. you're one of them. thanks for helping me. you rock!

fleet epoch May 21, 2024, 8:17 AM

#

past meteor Hmmmm you should be able to use it with pure windows. Have you tried running `to...

False
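
One common cause of a `False` here is that the installed PyTorch wheel is a CPU-only build, in which case `torch.version.cuda` is `None`. A small diagnostic along these lines (the helper name `cuda_report` is made up for illustration) can help narrow it down:

```python
def cuda_report():
    """Gather basic facts about the local PyTorch/CUDA setup."""
    try:
        import torch
    except ImportError:
        return {"torch_installed": False}

    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # None means this build of PyTorch was compiled without CUDA support.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }


report = cuda_report()
for key, value in report.items():
    print(f"{key}: {value}")
```

If `built_with_cuda` comes back as `None`, reinstalling PyTorch with the CUDA-enabled command from the official install selector is usually the next thing to try.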

cedar tusk May 21, 2024, 8:24 AM

#

lapis sequoia does anybody know of a good way to generate smooth noise

smooth noise as in random numbers around the mean?

lapis sequoia May 21, 2024, 8:31 AM

#

cedar tusk smooth noise as in random numbers around the mean?

yes and it should look smooth

lapis sequoia May 21, 2024, 8:32 AM

#

past meteor Yeah, the latest torch versions don't work with GPU on pure windows

wym

#

smooth noise

#

how in any dimensions

cedar tusk May 21, 2024, 9:27 AM

#

lapis sequoia yes and it should look smooth

u can just use normally distributed values from numpy with howmuchever mean and deviation and then just shuffle em

#

do it for each dimension and voila

#

u prob can even type a oneliner like this

df = pd.DataFrame({f'Column{i+1}': np.random.permutation(np.random.normal(0, 1, 100)) for i in range(3)})

#

change the 0 to be mean, 1 to be the deviation and 100 to be the amount of values

#

oh wait a sec, u mean smooth as in the values must be continuous?

#

ok scratch that do this instead

import numpy as np
import matplotlib.pyplot as plt
from perlin_noise import PerlinNoise

# Set the random seed for reproducibility
np.random.seed(42)

# Define the size of the grid
grid_size = 100

# Create a Perlin noise object
noise = PerlinNoise(octaves=4, seed=42)

# Generate the 2D noise grid
noise_grid = np.array([[noise([i/grid_size, j/grid_size]) for j in range(grid_size)] for i in range(grid_size)])

# Display the noise grid
plt.imshow(noise_grid, cmap='gray')
plt.colorbar()
plt.title('2D Smooth Noise Grid')
plt.show()

#

this yields this image

#

u can find a more detailed package in https://pynoise.readthedocs.io/en/latest/

#

which also has perlin
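
For a dependency-free feel of what makes noise "smooth", here is a minimal 1D value-noise sketch: random values are placed at integer lattice points and blended with a smoothstep curve in between. It is a simpler relative of Perlin noise, not the algorithm used by the packages above, and the function name and parameters are made up for illustration:

```python
import math
import random


def value_noise_1d(n_samples, n_lattice=8, seed=42):
    """Sample smooth 1D noise by interpolating random lattice values."""
    rng = random.Random(seed)
    # One extra lattice value so the last segment has a right endpoint.
    lattice = [rng.uniform(-1.0, 1.0) for _ in range(n_lattice + 1)]

    samples = []
    for i in range(n_samples):
        x = i / n_samples * n_lattice      # position in lattice units
        left = math.floor(x)
        t = x - left                       # fractional part in [0, 1)
        t = t * t * (3.0 - 2.0 * t)        # smoothstep easing
        samples.append((1.0 - t) * lattice[left] + t * lattice[left + 1])
    return samples


noise = value_noise_1d(200)
largest_step = max(abs(b - a) for a, b in zip(noise, noise[1:]))
print(f"largest jump between neighbours: {largest_step:.3f}")
```

The same idea extends to more dimensions by interpolating along each axis in turn.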

spring field May 21, 2024, 9:58 AM

#

in like 2 hours, but the good news is that it worked out in the end, yay, I was pessimistic at the start, but boy did a miracle (it's called math) happen, it went to like over 90% test accuracy, all the example inputs and predicted outputs that were printed matched perfectly, I'll try to do inference as well, as I didn't have any rollouts going on during training
oh and those attention matrices... man, those were beautiful to see
will share the fun stuff in them couple hours as well

spring field May 21, 2024, 11:29 AM

#

honestly, kinda crazy

#

right and also the implementation (well, a chunk of it anyway) https://paste.pythondiscord.com/BJAQ

#

also the actual metrics at epoch 22

#

so yeah, very cool, onto the other stuff now

spring field May 21, 2024, 11:54 AM

#

sure, I can try, if there will be any free GPUs available on paperspace, well, they'll become available eventually, so... yeah, but sure, can do

#

but using the dataset I have or sth else?

serene scaffold May 21, 2024, 4:10 PM

#

it appears that @vernal thunder is advertising a YouTube channel

long locust May 21, 2024, 4:10 PM

#

Hello, your message has been deleted due to violating rule 6 of the server, which does not allow advertising

craggy agate May 21, 2024, 4:18 PM

#

People, how do I use my M3 chip's GPU for tensorflow?

left tartan May 21, 2024, 4:20 PM

#

Hah, Excel just announced a cutting edge new feature today:

#

https://www.neowin.net/news/microsoft-365-insiders-can-try-out-new-regex-functions-in-excel-for-windows-and-mac-apps/

craggy agate May 21, 2024, 4:21 PM

#

I have tried to install tensorflow-macos but it didn't work, I think.

#

Failed to install tensorflow-metal from pip

#

and tensorflow GPU

spring field May 21, 2024, 4:42 PM

#

left tartan <https://www.neowin.net/news/microsoft-365-insiders-can-try-out-new-regex-functi...

why would they introduce more problems into their product and people's lives? smh

craggy agate May 21, 2024, 4:57 PM

#

Rough take, switching from Tensorflow to Pytorch is a pain.

#

I am learning Pytorch but it just seems hella complex compared to tensorflow

#

My take is in TF you gotta worry about the logic and building of the model more than the syntax, for Pytorch, you gotta do both.

agile cobalt May 21, 2024, 5:04 PM

#

can you show some examples of which syntax elements you're thinking about?

craggy agate May 21, 2024, 5:11 PM

#

agile cobalt can you show some examples of which syntax elements you're thinking about?

Mainly having to work with OOP and just extra code for something simple like an ANN.

#

I could do the same thing with half the code with tensorflow

#

That's a little extreme but you get the point

#

Yes but why the need for OOP?

#

Tensorflow doesn't use it, idk why Pytorch needs to.

#

Enlighten me if I got it wrong

spring field May 21, 2024, 5:14 PM

#

last I checked, TF definitely does use OOP
and yes need, you gotta keep a bunch of internal state somewhere
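
To make the "internal state" point concrete, here is a toy sketch in plain Python. It is not the PyTorch or TensorFlow API; the `TinyLinear` class and its methods are invented purely to show why a layer is convenient to write as an object: the weights live on the instance and persist between calls.

```python
import random


class TinyLinear:
    """Toy layer computing y = w * x + b, with w and b stored on the object."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        # Internal state: created once, reused on every call.
        self.w = rng.uniform(-1.0, 1.0)
        self.b = 0.0

    def forward(self, x):
        return self.w * x + self.b

    def update(self, grad_w, grad_b, lr=0.1):
        # Training changes the state in place; callers keep the same object.
        self.w -= lr * grad_w
        self.b -= lr * grad_b


layer = TinyLinear()
before = layer.forward(2.0)
layer.update(grad_w=1.0, grad_b=0.0)
after = layer.forward(2.0)
print(before, after)
```

Without a class, `w` and `b` would have to be passed into and returned from every function by hand, which gets unwieldy once a model has many layers.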

craggy agate May 21, 2024, 5:15 PM

#

spring field last I checked, TF definitely does use OOP and yes need, you gotta keep a bunch ...

Not as much as Pytorch though

#

Everything I see with Pytorch just makes my head hurt

spring field May 21, 2024, 5:15 PM

#

well, ig there's the whole torch.nn.functional which at least partially is getting deprecated for torch, so... hmm

craggy agate May 21, 2024, 5:16 PM

#

spring field well, ig there's the whole `torch.nn.functional` which at least partially is get...

class Model(nn.Module)

#

I assume this is initializing the nn?

#

I am 100% lost 😂

spring field May 21, 2024, 5:17 PM

#

wait, I swear I saw a warning about some nn.functional thing getting deprecated for just torch.

craggy agate May 21, 2024, 5:18 PM

#

Where would y'all recommend I learn Pytorch from?

#

Documentations?

past meteor May 21, 2024, 5:18 PM

#

tensorflow does OOP too as soon as you want to make any non-trivial model

craggy agate May 21, 2024, 5:19 PM

#

Ong learning data science is insanely difficult without university courses...

#

Deep learning specifically

past meteor May 21, 2024, 5:19 PM

#

They've both converged to very similar libs, especially with LazyLinear, LazyConv2d and so on being a thing now in Torch

past meteor May 21, 2024, 5:20 PM

#

craggy agate Documentations?

yes, reading the docs is what I'd recommend

#

the hard part is rarely the code though

past meteor May 21, 2024, 5:20 PM

#

craggy agate Ong learning data science is insanely difficult without university courses...

If I were you I'd read a book about this stuff

craggy agate May 21, 2024, 5:21 PM

#

Like I understand all the theory to NNs but this is just a tad bit confusing

past meteor May 21, 2024, 5:21 PM

#

Even if you're doing a uni course you still venture out and learn on your own

craggy agate May 21, 2024, 5:21 PM

#

past meteor If I were you I'd read a book about this stuff

Maybe that's a good idea

past meteor May 21, 2024, 5:21 PM

#

craggy agate Maybe that's a good idea

https://arxiv.org/abs/2106.11342

arXiv.org

Dive into Deep Learning

This open-source book represents our attempt to make deep learning approachable, teaching readers the concepts, the context, and the code. The entire book is drafted in Jupyter notebooks, seamlessly integrating exposition figures, math, and interactive examples with self-contained code. Our goal is to offer a resource that could (i) be freely av...

craggy agate May 21, 2024, 5:21 PM

#

past meteor Even if you're doing a uni course you still venture out and learn on your own

I am not, I am literally in grade 9 💀

#

The course I am following should have let me know that OOP would be required, never bothered to learn it before.

#

Should probably learn it now.

spring field May 21, 2024, 5:23 PM

#

a factory of closures? that sounds like peak Javaism

past meteor May 21, 2024, 5:23 PM

#

I'm not even sure Java has Closures

craggy agate May 21, 2024, 5:23 PM

#

Yeah.

past meteor May 21, 2024, 5:24 PM

#

You need free functions for closures. Maybe they have it now because they're adding every feature known to man in the latest Java versions, but I digress

spring field May 21, 2024, 5:24 PM

#

but ofc they have anonymous classes

Java initially didn't have syntactic support for closures (these were introduced in Java 8), although it was fairly common practice to simulate them using anonymous inner classes.
https://stackoverflow.com/a/3805576/14531062

past meteor May 21, 2024, 5:25 PM

#

spring field but ofc they have anonymous classes > Java initially didn't have syntactic suppo...

Oh god this looks terrible 😂

past meteor May 21, 2024, 5:27 PM

#

craggy agate Should probably learn it now.

Anyhow, just learn the syntax of OOP in Python but nothing more. Learn what class FeedForwardNetwork(L.LightningModule) means

#

You can do this:

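from dataclasses import dataclass

from torch import nn

# sketch only: the __init__ generated by @dataclass does not call nn.Module.__init__
@dataclass
class Network(nn.Module):
  w1: nn.Parameter

  def forward(self, x):
    pass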

#

dataclass is stdlib

#

I'm sure there's a way you could do it. You can pass default, mutable parameters with field_init or something similar

#

this is what I meant: https://docs.python.org/3/library/dataclasses.html#dataclasses.field

Python documentation

dataclasses — Data Classes

Source code: Lib/dataclasses.py This module provides a decorator and functions for automatically adding generated special methods such as__init__() and__repr__() to user-defined classes. It was ori...

#

60-70 % of the classes I make are with dataclass but I don't make torch stuff with it (or use dataclass + inheritance, in general)

#

Something in me thinks it'll go south (and I don't know why)

#

like some sort of edgecase

spring field May 21, 2024, 5:33 PM

#

default_factory FTW
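
For context, `default_factory` is the argument of `dataclasses.field` that builds a fresh default for each instance, which is how mutable defaults such as lists are handled. A minimal sketch (the `RunConfig` class and its fields are made up):

```python
from dataclasses import dataclass, field


@dataclass
class RunConfig:
    name: str
    # A plain `tags: list = []` is rejected by dataclass with a ValueError;
    # default_factory calls list() once per instance instead.
    tags: list = field(default_factory=list)


a = RunConfig("baseline")
b = RunConfig("tuned")
a.tags.append("v1")

print(a.tags, b.tags)
```

Each instance gets its own list, so appending to `a.tags` leaves `b.tags` untouched.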

past meteor May 21, 2024, 5:35 PM

#

well, I'm not gonna find out any time soon 😂

#

There's something I'm missing tbh. I love absolute reproducibility but not the effort it entails. Some CLI tool that registers model / experiment runs and you can "reactivate" them with the CLI tool, which does a checkout to a commit and runs the .py file

vapid pumice May 21, 2024, 5:38 PM

#

Hey guys I need a help here,

I'm trying to make a Website Summarizer using Python and AI.
-> Which model is best for this use case? And I'm very new to this so I want to know if any type of fine-tuning can be done on the model.
-> Is there any web-service where I can deploy the model for free so I can use it as a backend to test, or I can run it locally in my laptop.
-> Is there also any lightweight fast models for this use case.

Ik there are too many questions here, and any help is appreciated.

past meteor May 21, 2024, 5:40 PM

#

vapid pumice Hey guys I need a help here, I'm trying to make a Website Summarizer using Pyt...

Consider using an existing model/service. You could use OpenAI's models to do this, it makes the task significantly easier. There's also https://goose.ai/. Both of these assume you're at least willing to put a little bit of money in it though.

spring field May 21, 2024, 5:41 PM

#

there are several issues with this
first, how are you gonna retrieve the content of those websites? you have to make sure you're not violating their ToS
second, it will most likely cost something, there are free frameworks for building RAG pipelines such as langchain, but if you want to host it someplace, you'll need to pay for the computation time, even if it runs on the CPU
third, ehhh, fine-tuning? probably won't be necessary if you just go with RAG (which I'm not sure where that technique lies in the state of the art hierarchy)

past meteor May 21, 2024, 5:42 PM

#

I spend very little time making "data science" code pretty 🤷

spring field May 21, 2024, 5:43 PM

#

data science code has got to be the ugliest code I write casually
I try my best whenever I can though

past meteor May 21, 2024, 5:43 PM

#

My DS code is really bad. I code the happy path and constrain myself to only using that

#

But, I'd say my standard for "really bad" isn't really bad (humble brag)

#

I still have a couple of interfaces, use some level of OOP etc etc but it's definitely a lot worse than my other code for various reasons

spring field May 21, 2024, 5:46 PM

#

can multi-headed attention be parallelized by using a bigger weight matrix?

past meteor May 21, 2024, 5:46 PM

#

The interfaces are mostly because, for instance, for work I was making encoder decoder models. It didn't make sense, imo, to just flat out code a Seq2Seq but to make a TimeSeriesEncoder and a TimeSeriesDecoder, because you can have a CNN encoder with an RNN decoder, vice versa, or CNN + CNN, RNN + RNN and so on

spring field May 21, 2024, 5:46 PM

#

oh, thought so

past meteor May 21, 2024, 5:47 PM

#

I'm making something in this space soon 👀

#

Very early PoC

craggy agate May 21, 2024, 5:49 PM

#

past meteor Anyhow, just learn the syntax of OOP in Python but nothing more. Learn what `cla...

Definitely, thanks, I am learning about classes and objects rn, will learn a little bit more to get myself comfortable then get back to Pytorch.

spring field May 21, 2024, 5:50 PM

#

spring field oh, thought so

dam, each of my multi-headed attention blocks could've been 8x faster had I opted for this
ehh, it was expected
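
A small sketch of why this works, in plain Python so it runs without any libraries (real code would use batched tensor ops). Projecting the input with each head's weight matrix separately gives the same numbers as one matmul against the per-head matrices stacked side by side, followed by slicing the result back into heads. The shapes and values are arbitrary illustrations, and only the projection step is shown, not the full attention computation:

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


rng = random.Random(0)
seq_len, d_model, n_heads, d_head = 3, 4, 2, 2

x = [[rng.random() for _ in range(d_model)] for _ in range(seq_len)]
# One (d_model x d_head) projection matrix per head.
head_weights = [
    [[rng.random() for _ in range(d_head)] for _ in range(d_model)]
    for _ in range(n_heads)
]

# Option 1: one small matmul per head.
per_head = [matmul(x, w) for w in head_weights]

# Option 2: stack the head matrices column-wise into (d_model x n_heads*d_head),
# do a single matmul, then slice the columns back into heads.
big_w = [sum((w[i] for w in head_weights), []) for i in range(d_model)]
fused = matmul(x, big_w)
split = [
    [row[h * d_head:(h + 1) * d_head] for row in fused]
    for h in range(n_heads)
]

print(per_head == split)
```

On a GPU the single large matmul tends to be much faster than a Python loop over heads, since the hardware gets one big job instead of several small ones.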

past meteor May 21, 2024, 5:50 PM

#

General purpose inference server, built in Rust 🦀 that loads models from "storage" (S3, minio, azure blob, filesystem, ...) that are in ONNX format. Next to that also make a small Python lib that wraps libraries that export models to ONNX, add metadata necessary for my inference server and store it in the "storage" (that rust is reading from). Deploying a new model as an endpoint would just be export_to_storage and the endpoint is created automatically

craggy agate May 21, 2024, 5:50 PM

#

If classes are classes then what are classes?

#

You were talking about the "class" object, right?

#

I see, okay

spring field May 21, 2024, 5:51 PM

#

well, type is the default metaclass and other metaclasses have to inherit from it

past meteor May 21, 2024, 5:52 PM

#

(maybe not details that matter to beginners tho)

spring field May 21, 2024, 5:52 PM

#

everything (aside from type ig) inherits from object

#

but tbf, I would not worry about it even at an upper intermediate level

#

everything's a PyObject *

#

in CPython anyway
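
These relationships can be checked directly in the interpreter. Note that `type` turns out to be a subclass of `object` too, so the "aside from type" caveat does not seem to be needed:

```python
class Plain:
    pass


# type is the default metaclass: classes are instances of type.
print(type(Plain) is type)          # True
print(type(int) is type)            # True

# Classes inherit from object, and that includes type itself.
print(issubclass(Plain, object))    # True
print(issubclass(type, object))     # True

# The two are intertwined: object is an instance of type,
# and type is an instance of itself.
print(isinstance(object, type))     # True
print(type(type) is type)           # True


class Meta(type):
    """A custom metaclass has to derive from type."""


class WithMeta(metaclass=Meta):
    pass


print(type(WithMeta) is Meta)       # True
```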

past meteor May 21, 2024, 5:53 PM

#

Metaprogramming is dangerous because it's easy to convince yourself you need it

#

Yup, but particularly for things you train yourself. Especially traditional ML models.

#

Doubt it. it's not a hard language at all imo but you need patience + to care about programming

vapid pumice May 21, 2024, 5:56 PM

#

past meteor Consider using an existing model/service. You could use OpenAI's models to do th...

I'd love to use an API service, but I'm making this as a project so a model running locally or smt that I can modify or use would be a great addition.

past meteor May 21, 2024, 5:56 PM

#

And that's only a small % of people programming professionally (and that's totally ok! Your job doesn't need to be your hobby)

past meteor May 21, 2024, 5:56 PM

#

vapid pumice I'd love to use an API service, but I'm making this as a project so a model runn...

Then huggingface is your best bet https://huggingface.co/

#

ah, an unimportant detail is I want the project to be usable by people that only know Python and no Rust

#

Like through a CLI tool that spins up the inf server so as a user you just focus on making models and putting them in the right storage

#

Hmmm, I doubt it because cpp doesn't have web related stuff that is popular + it's much more of a pain to write

craggy agate May 21, 2024, 6:00 PM

#

Here is a question, would you guys say chatGPT is an "AI" or a glorified search engine?

past meteor May 21, 2024, 6:03 PM

#

craggy agate Here is a question, would you guys say chatGPT is an "AI" or a glorified search ...

With AI you probably mean "artificial general intelligence" (also known as AGI)?

The term AI (without the G) actually includes many "basic" things like search algorithms. When I say search here I don't mean search engines but I mean depth first search, A*, uniform cost search and so on. These are traditionally seen as AI as well.

Now if the question is is it "AGI or a search engine" then the answer is: "ask a philosopher" 😂
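
As an example of the classic "search" kind of AI mentioned above, here is a minimal uniform cost search sketch. The graph and its edge costs are made up for illustration:

```python
import heapq


def uniform_cost_search(graph, start, goal):
    """Return (cost, path) of the cheapest route, or None if unreachable."""
    frontier = [(0, start, [start])]   # priority queue ordered by path cost
    best_cost = {start: 0}

    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if cost > best_cost.get(node, float("inf")):
            continue                   # stale entry, a cheaper route was found
        for neighbour, step in graph.get(node, {}).items():
            new_cost = cost + step
            if new_cost < best_cost.get(neighbour, float("inf")):
                best_cost[neighbour] = new_cost
                heapq.heappush(frontier, (new_cost, neighbour, path + [neighbour]))
    return None


# Hypothetical weighted graph: node -> {neighbour: edge cost}.
graph = {
    "A": {"B": 1, "C": 5},
    "B": {"C": 1, "D": 4},
    "C": {"D": 1},
    "D": {},
}

result = uniform_cost_search(graph, "A", "D")
print(result)
```

There is no training and no data here, just a systematic exploration of options, yet this family of algorithms has long been filed under AI.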

craggy agate May 21, 2024, 6:03 PM

#

What I mean is that it's not actually doing anything, it's trained on a massive massive dataset so anything you might ask would be in its dataset, it doesn't do anything out of the box.

agile cobalt May 21, 2024, 6:03 PM

#

REST APIs are mostly a solved problem - you don't suddenly decide you need to change your entire API to use XML instead of JSON, or stop using HTTP Headers to use a different type of request metadata

deep learning is much more experimental on that aspect, so the libraries do have to provide more low-level access, so that researchers can experiment with using different existing or new architectures, including all sorts of layers and connections between them, different ways of training the models, creating new activation functions and so on

#

You can use huggingface transformers if you want something on a similar level of abstraction to fastapi

past meteor May 21, 2024, 6:04 PM

#

Did you use TF1?

#

Well then you can see the progression we've made from TF1 all the way to Pytorch

#

But, I think @agile cobalt 's answer covers it pretty well

craggy agate May 21, 2024, 6:05 PM

#

past meteor With AI you probably mean "artificial general intelligence" (also known as AGI)?...

I mean the process of creating the text and using different code pieces and making a finished code from those is from the ability to learn, that's what makes anything an AI or NI. And no, I don't use it to make code for me.

past meteor May 21, 2024, 6:07 PM

#

The fact we have automatic differentiation is BIG

agile cobalt May 21, 2024, 6:07 PM

#

I guess that another point is that ML libraries cannot sacrifice performance at all, while most wouldn't complain if you got like 50% slower on fastapi because of using middlewares instead of defining things in a more low level way

past meteor May 21, 2024, 6:07 PM

#

it's easy to overlook the strides we've made. For something that is still research (like etrotta says) and not just code => solution means we might be there

#

Like?

craggy agate May 21, 2024, 6:07 PM

#

It has learnt all of python or all of C# from its dataset, similar to how we learned those programming languages. It could be argued that that is what makes it AI.

wooden sail May 21, 2024, 6:08 PM

#

craggy agate It has learnt all of python or all of C# from its dataset, similar to how we lea...

the "learning" in machine learning is not something as lofty as you make it sound here. it just refers to optimizing parameters based on data examples
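
A toy sketch of that idea: "learning" here is just nudging two parameters until a line fits some example points. The data, learning rate, and step count are made up for illustration:

```python
# Example data generated from y = 2x + 1 (the "truth" the model should recover).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]

w, b = 0.0, 0.0          # parameters, starting from an arbitrary guess
lr = 0.05                # learning rate

for _ in range(2000):
    # Gradients of mean squared error with respect to w and b.
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / len(xs)
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / len(xs)
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 3), round(b, 3))
```

Larger models apply the same loop to far more parameters, with the gradients computed by automatic differentiation instead of by hand.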

past meteor May 21, 2024, 6:08 PM

#

Just look at Keras for instance

#

So what's your point then 😭

agile cobalt May 21, 2024, 6:09 PM

#

you realize that app = FastAPI() is already oop right?

craggy agate May 21, 2024, 6:09 PM

#

wooden sail the "learning" in machine learning is not something as lofty as you make it soun...

The weights you mean?

#

Or are there some other parameters you are talking about

wooden sail May 21, 2024, 6:09 PM

#

not necessarily, but sure

wooden sail May 21, 2024, 6:09 PM

#

craggy agate Or are there some other parameters you are talking about

AI and ML do not involve neural networks in general

craggy agate May 21, 2024, 6:10 PM

#

wooden sail AI and ML do not involve neural networks in general

I was not talking about ML specifically

craggy agate May 21, 2024, 6:10 PM

#

wooden sail AI and ML do not involve neural networks in general

It was more about LLMs as a whole

agile cobalt May 21, 2024, 6:10 PM

#

that you cannot reach the level of abstraction you are thinking about without making sacrifices, and these sacrifices are not (yet) viable at this point in time

past meteor May 21, 2024, 6:10 PM

#

Hmm the terms used are ill defined so you can't really have a discussion

wooden sail May 21, 2024, 6:10 PM

#

which falls under deep learning, and that also falls under machine learning

past meteor May 21, 2024, 6:10 PM

#

AI > ML > LLMs

wooden sail May 21, 2024, 6:11 PM

#

yep

past meteor May 21, 2024, 6:11 PM

#

So LLMs are per definition AI

craggy agate May 21, 2024, 6:11 PM

#

What about it though? I understand that but I was thinking about it logically

#

Not definition wise

craggy agate May 21, 2024, 6:11 PM

#

past meteor AI > ML > LLMs

Yes I know that

wooden sail May 21, 2024, 6:11 PM

#

what exactly is your question?

craggy agate May 21, 2024, 6:12 PM

#

wooden sail what exactly is your question?

Can GPT be considered an AI due to the fact that it isn't doing anything intelligent or out of the box.

#

By that I mean it's not inventing anything.

wooden sail May 21, 2024, 6:13 PM

#

craggy agate Can GPT be considered an AI due to the fact that it isn't doing anything intelli...

by definition, yes

craggy agate May 21, 2024, 6:13 PM

#

I can't exactly explain what I mean but I hope you get the general idea

wooden sail May 21, 2024, 6:13 PM

#

otherwise, AI does not exist if that's what you're getting at

#

zestar was right in pointing out you mean something at or above AGI level

craggy agate May 21, 2024, 6:13 PM

#

wooden sail zestar was right in pointing out you mean something at or above AGI level

We are not there yet though

past meteor May 21, 2024, 6:13 PM

#

Hence why this is a question for philosophers

craggy agate May 21, 2024, 6:14 PM

#

past meteor Hence why this is a question for philosophers

Lmao

wooden sail May 21, 2024, 6:14 PM

#

so as i said, from your view point, there is no AI at all

craggy agate May 21, 2024, 6:14 PM

#

wooden sail so as i said, from your view point, there is no AI at all

Yeah ig

wooden sail May 21, 2024, 6:14 PM

#

AI broadly refers to data-driven optimization, which is why i pushed in that direction

#

in that sense, GPT is AI

craggy agate May 21, 2024, 6:14 PM

#

Like there was one checkers bot that defeated the world champion by inventing a new out of the box move.

past meteor May 21, 2024, 6:14 PM

#

https://keras.io/getting_started/intro_to_keras_for_engineers/ Keras 3 actually looks cool

wooden sail May 21, 2024, 6:14 PM

#

in the sense you're making up, there is no AI at all

past meteor May 21, 2024, 6:15 PM

#

I think it's what I'd recommend for "engineers" that have to train a model or so

wooden sail May 21, 2024, 6:15 PM

#

where do you draw the "engineer" line

craggy agate May 21, 2024, 6:16 PM

#

I would consider that to be more of AGI. Cause it's doing some inventing and thinking

past meteor May 21, 2024, 6:16 PM

#

good question, I'd say specifically software engineers

#

ML engineers are also engineers

wooden sail May 21, 2024, 6:17 PM

#

(my phd cardboard, if i ever finish this shit, is gonna say dr. ing. too)

past meteor May 21, 2024, 6:17 PM

#

nice

wooden sail May 21, 2024, 6:17 PM

#

so technically engineer here too

past meteor May 21, 2024, 6:17 PM

#

The ing. flex is a thing in germany too

#

Well, you have ir. as a title for MSc engineering science (the hard one) and ing. for engineering technology (the easier one)

wooden sail May 21, 2024, 6:18 PM

#

ooh keras with jax, tf, and pytorch. nice

#

hmm interesting, i'm not sure if that distinction exists here as well

#

gonna have to ask

past meteor May 21, 2024, 6:18 PM

#

It's because of the bologna accord soup thing

#

It's pretty bad, I've been in situations where I saw an ir. tell an ing. "don't touch the machine, let me get an engineer!" on the shop floor 😭 (because 1 has more prestige than the other)

wooden sail May 21, 2024, 6:20 PM

#

yeah sorry

#

(all the terms are bullshit anyway)

#

AI is pretty tough to define broadly enough

past meteor May 21, 2024, 6:21 PM

#

We did things like search algorithms

#

Graphical models

wooden sail May 21, 2024, 6:21 PM

#

you hear people talk about game AI all the time and it's just like a for loop and 3 if statements

past meteor May 21, 2024, 6:21 PM

#

Prolog 😦

#

Honestly it just depends on who you ask as well

#

I learnt about a bunch of methods in operations research that were also called "AI" in later years

craggy agate May 21, 2024, 6:22 PM

#

I think we are at narrow AI

#

AGI is on par with the human brain

wooden sail May 21, 2024, 6:22 PM

#

craggy agate I think we are at narrow AI

i think this is the best answer to your original question

past meteor May 21, 2024, 6:22 PM

#

It's the same as linear regression being AI

craggy agate May 21, 2024, 6:22 PM

#

According to the definition I think

craggy agate May 21, 2024, 6:23 PM

#

wooden sail i think this is the best answer to your original question

What about it?

#

Stack overflow is getting GPT?

#

Didn't they already have an auto mod

past meteor May 21, 2024, 6:24 PM

#

Anyhow, to give my last message in something that is going off topic.

AI is basically whatever creates value for critical stakeholders by means of increasing the amount of venture capital they have to burn.

craggy agate May 21, 2024, 6:24 PM

#

past meteor It's the same as linear regression being AI

💀

wooden sail May 21, 2024, 6:25 PM

#

that immediately excludes me

#

no but i look dumb 😌

iron basalt May 21, 2024, 6:42 PM

#

craggy agate I would consider that to be more of AGI. Cause it's doing some inventing and thi...

This just raises the question of what counts as "inventing" and "thinking."

#

On the inventing part, if it's just making something new, that is trivial, if it's making something new and useful, we already do that with AI / optimization: https://en.wikipedia.org/wiki/Evolved_antenna

Evolved antenna

In radio communications, an evolved antenna is an antenna designed fully or substantially by an automatic computer design program that uses an evolutionary algorithm that mimics Darwinian evolution. This procedure has been used since the early 2000s to design antennas for mission-critical applications involving stringent, conflicting, or unusual...

#

And as for thinking, it's just not really defined (beyond the not very useful stuff like "it's the process by which someone comes up with a solution").

craggy agate May 21, 2024, 6:46 PM

#

iron basalt This just raises the question of what counts as "inventing" and "thinking."

For example a checkers AI, it defeated the checkers world champion by inventing a move from its training. That, I would consider an AI.

#

I know I am throwing the term AI very loosely but you get the point

iron basalt May 21, 2024, 6:47 PM

#

craggy agate For example a checkers AI, it defeated the checkers world champion by inventing ...

Yeah, in that case we have been doing that for a long time.

craggy agate May 21, 2024, 6:47 PM

#

iron basalt Yeah, in that case we have been doing that for a long time.

My point is that GPT doesn't exactly do that

iron basalt May 21, 2024, 6:47 PM

#

But that is really just because there are things that humans are not good at at all, and computers are, and for which you can make a well defined procedure.

#

Technically a human could find that move if they also did the search algorithm by hand on paper, it would just take really long.

#

So time is an important factor here.

iron basalt May 21, 2024, 6:49 PM

#

craggy agate My point is that GPT doesn't exactly do that

Yes, GPT can only kind of do it, but not really intentionally.

#

It's meant to just be a chat bot.

wooden sail May 21, 2024, 6:49 PM

#

that's a kinda "trivial" case though. there are finitely many (though really a LOT) of possible chess games. you don't need anything clever to make a "new" move. just loop over all of them and play a good move for your situation, no special thinking involved. people don't do that because memorizing a set of very good moves and having the skill to recognize them and know when to use them gets the job done. you can go out of your way and play weirdass moves if you like though
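
To show what "just loop over all of them" looks like in code, here is a brute-force sketch on a game small enough to search completely. Chess and checkers are far too large for this exact approach, so the game below is a made-up stand-in: players alternate taking 1 to 3 stones, and whoever takes the last stone wins.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def can_win(stones):
    """True if the player to move can force a win (taking the last stone wins)."""
    return any(not can_win(stones - take) for take in (1, 2, 3) if take <= stones)


def best_move(stones):
    """Return a winning number of stones to take, or None if every move loses."""
    for take in (1, 2, 3):
        if take <= stones and not can_win(stones - take):
            return take
    return None


print(best_move(10), best_move(8))
```

The search never "understands" the game; it simply tries every continuation and keeps the moves that leave the opponent with no winning reply.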

iron basalt May 21, 2024, 6:49 PM

#

Although it's advertised as much more.

craggy agate May 21, 2024, 6:50 PM

#

I kind of agree with this thinking but I was having a conversation with someone and wanted a second opinion

#

Ask it what script it's following

#

Or something like that

iron basalt May 21, 2024, 6:51 PM

#

wooden sail that's a kinda "trivial" case though. there are finitely many (though really a L...

This. What is often considered intelligent is not having to do the brute force. An intelligent math proof is one where you cleverly get around doing brute force (in the extreme, you effectively skip all of it, even when the search would be infinite).

craggy agate May 21, 2024, 6:51 PM

#

Not exactly a languages expert lol

sick eagle May 21, 2024, 6:51 PM

#

guys i will start learning numpy and pandas... (some libraries i should learn for machine learning) so is jupyter or pycharm good for me????

craggy agate May 21, 2024, 6:53 PM

#

So it basically just moved some letters around, not really reinvented or something.

wooden sail May 21, 2024, 6:53 PM

#

sick eagle guys i will start learning numpy and pandas... (some libraries i should learn for ...

whichever you like, doesn't really matter much. if you're just starting out with python, i would advise against jupyter because it can promote bad habits in out of order code execution

craggy agate May 21, 2024, 6:53 PM

#

sick eagle guys i will start learning numpy and pandas... (some libraries i should learn for ...

Between pycharm and Jupyter I would say Jupyter

iron basalt May 21, 2024, 6:53 PM

#

@craggy agate What you are probably looking for is deep reasoning as described by Wolfram. You can think of this as having an internal thought loop in which it solves problems algorithmically. We do have AI models that do this (e.g. NTMs), but they have been overshadowed by language models in popularity. They are still there though and being worked on, including integrating them with language models.

craggy agate May 21, 2024, 6:54 PM

#

No not really

sick eagle May 21, 2024, 6:54 PM

#

wooden sail whichever you like, doesn't really matter much. if you're just starting out with...

hmmmmm

sick eagle May 21, 2024, 6:54 PM

#

craggy agate Between pycharm and Jupyter I would say Jupyter

why? can you give me reasons

sick eagle May 21, 2024, 6:55 PM

#

wooden sail whichever you like, doesn't really matter much. if you're just starting out with...

i think jupyter is good for small projects like beginners right?

#

don't care for my mistakes in english

wooden sail May 21, 2024, 6:56 PM

#

i already gave you my 2 cents 😛 it promotes bad habits if you're new to python

whole zephyr May 21, 2024, 6:56 PM

#

yo, does anyone know of a neural net architecture that:

is trained using a target_processed_signal + raw_unprocessed_signal and outputs the parameters

is tested using only raw_unprocessed_signal and outputs some parameters that would transform the raw signal into a desired signal? any links to papers will be highly appreciated, especially if they contain neural net diagrams for the architecture

I'm thinking of a use case is to train the network to somehow "remember" the qualities of a desired signal

and when I have no target signal but give it new, raw signals it would give me some parameters that I could apply to process the new signals

wooden sail May 21, 2024, 6:57 PM

#

we've had a few cases here of people asking for help where the issue was executing cells out of order, or running a cell a second time resulting in inadvertent composition of functions. the first means the code only works if you run the cells out of order, and the latter means the code only works if you re run all the cells in order, at which point you're better off just using any ide you like
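A minimal sketch of the second failure mode, with plain functions standing in for notebook cells (the cell names and the doubling step are made up for illustration): running the same cell twice silently applies the transformation twice.

```python
# Each function stands in for one notebook cell; the shared dict plays the
# role of the kernel's global state.
state = {}

def cell_1_load():
    state["values"] = [1, 2, 3]

def cell_2_scale():
    # Overwrites its own input, so it is not safe to run twice.
    state["values"] = [v * 2 for v in state["values"]]

cell_1_load()
cell_2_scale()
once = list(state["values"])

cell_2_scale()  # the cell gets run a second time by accident
twice = list(state["values"])

print(once)
print(twice)
```

Writing the result to a new name instead of overwriting the input makes a cell like this safe to re-run.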

sick eagle May 21, 2024, 6:57 PM

#

wooden sail i already gave you my 2 cents 😛 it promotes bad habits if you're new to python

ok

craggy agate May 21, 2024, 6:57 PM

#

sick eagle why? can you give me reasons

Jupyter, because you can split up your code cell by cell, and VS Code for better customization.

#

I see

iron basalt May 21, 2024, 6:58 PM

#

sick eagle i think jupyter is good for small projects like beginners right?

IMO it's better to just get used to using text directly, not the notebooks. Notebooks have the issue that any other non-plain-text coding method has, and it's that everyone now needs that specific editor for it to access your code and they have to now learn that. In addition, Jupyter Notebook is not well designed even as a notebook IMO. It causes many issues with debugging.

sick eagle May 21, 2024, 6:58 PM

#

craggy agate Jupyter, because you can split up your code cell by cell, and VS Code for better cus...

you did learn ML or something like that right?

craggy agate May 21, 2024, 6:58 PM

#

sick eagle you did learn ML or something like that right?

Yes, why?

whole zephyr May 21, 2024, 6:58 PM

#

whole zephyr yo, does anyone know of a neural net architecture that: is trained using a targ...

I currently run an architecture that predicts parameters based on a signal difference

the problem is that if I want to apply the network for my use case, I will need BOTH the raw and target signals, but I have no target signal, just the raw version

sick eagle May 21, 2024, 6:59 PM

#

iron basalt IMO it's better to just get used to using text directly, not the notebooks. Note...

ok that big reason

iron basalt May 21, 2024, 6:59 PM

#

iron basalt IMO it's better to just get used to using text directly, not the notebooks. Note...

This comes from frustration of asking for code from others only to receive a notebook which I now need to manually go through and painfully copy paste (with my mouse, ew, I need vim please) into a regular text file while paying attention to what their cell execution order was.

wooden sail May 21, 2024, 7:00 PM

#

whole zephyr I currently run an architecture that predicts parameters based on a signal diffe...

you might be able to rewrite the problem as a "self supervised problem" where you take an autoencoder that starts with params, generates a signal, then estimates params from that signal. you then split up the autoencoder and keep the decoder. just bouncing ideas around, this may or may not be helpful for you. (you can also interpret this as just using synthetic data if you keep the "encoder" fixed)

sick eagle May 21, 2024, 7:00 PM

#

craggy agate Yes, why?

i think you know jupyter is better in small projects if i want practicing my new skills it will help me but pycharm won't help me because it just for big projects right?

wooden sail May 21, 2024, 7:01 PM

#

sick eagle i think you know jupyter is better in small projects if i want practicing my new...

you could do big and small projects in either, just use what you prefer

#

neither will "help you" in any special way

sick eagle May 21, 2024, 7:01 PM

#

ok

wooden sail May 21, 2024, 7:02 PM

#

pycharm has a special debugger and automatically creates venvs. jupyter has in-line plotting and can display latex cells between code cells. neither helps you with coding

sick eagle May 21, 2024, 7:02 PM

#

i think i will use vscode 🤣

iron basalt May 21, 2024, 7:02 PM

#

sick eagle i think you know jupyter is better in small projects if i want practicing my new...

Your tool of choice can only hinder you, choose the one that stays out of your way (so you can just code).

sick eagle May 21, 2024, 7:02 PM

#

i just kidding

#

i will use pycharm is good and simple

#

i did use it

wooden sail May 21, 2024, 7:02 PM

#

sick eagle i think i will use vscode 🤣

i do use vscode cuz you can use it either for vanilla coding or for notebooks (but most importantly for writing latex tbh)

sick eagle May 21, 2024, 7:03 PM

#

iron basalt Your tool of choice can only hinder you, choose the one that stays out of your w...

yeah you right

wooden sail May 21, 2024, 7:03 PM

#

pycharm is arguably the most complex out of pycharm, vscode, and jupyter.

#

if you already use that comfortably, the others won't give you any issue

#

but yeah just use what you like

sick eagle May 21, 2024, 7:04 PM

#

wooden sail pycharm is arguably the most complex out of pycharm, vscode, and jupyter.

but also it is good for programmers python

wooden sail May 21, 2024, 7:04 PM

#

if you like it, sure

iron basalt May 21, 2024, 7:04 PM

#

About vim, emacs, etc. These editors are for when you decide which editor to use for a lifetime, they have high upfront learning investment required, but you will never need anything else again. So if you think you are ready for that, then maybe give it a go.

wooden sail May 21, 2024, 7:04 PM

#

no ide or editor will do your job for you. they also don't teach you how to code

#

just use whatever has tools you like

sick eagle May 21, 2024, 7:04 PM

#

wooden sail but yeah just use what you like

the best one is vscode for me but because i hear pycharm is good in python

sick eagle May 21, 2024, 7:05 PM

#

wooden sail just use whatever has tools you like

i just asking i want some information

wooden sail May 21, 2024, 7:05 PM

#

there is no "best", it just depends on how much you like it and how well you use it

iron basalt May 21, 2024, 7:05 PM

#

iron basalt About vim, emacs, etc. These editors are for when you decide which editor to use...

Until then, yeah, probably vscode.

craggy agate May 21, 2024, 7:05 PM

#

Recommend starting out with Jupyter NB but if you like PyCharm then go for it

sick eagle May 21, 2024, 7:06 PM

#

ok

#

thanks too much guys

#

you are so usefull

#

thanks 🤝

agile cobalt May 21, 2024, 7:06 PM

#

notebooks are not too bad - in particular, inline plotting can be useful for exploring datasets or testing data transformations, just remember to Restart Kernel every so often to ensure your results are reproducible and you didn't end up with a messy state

whole zephyr May 21, 2024, 7:06 PM

#

the tool you use doesn't really matter

though, there are indeed 2 main approaches that can influence the way you organize your code:

notebooks, where the state of the program is sort of saved as long as your session (or whatever it's called) is still active

or the classic way where you would have to run your code every time if you want to access a specific state

iron basalt May 21, 2024, 7:06 PM

#

Yeah, just do whatever, you can use notepad, it's really whatever. What really matters is that you are making stuff, you will find out for yourself what is working better after trying them.

#

Don't get stuck in analysis paralysis.

sick eagle May 21, 2024, 7:07 PM

#

i think jupyter is so hard

#

It's not organized

whole zephyr May 21, 2024, 7:08 PM

#

I'd recommend the classic way if you're a beginner

don't look into notebooks just yet, because there you can run the cells out of order and get unexpected results because you didn't pay attention to which cells you've already run

iron basalt May 21, 2024, 7:08 PM

#

Path of least resistance since what right now matters is that you are coding. But keep in mind that some paths with more resistance have payoffs at the end.

sick eagle May 21, 2024, 7:09 PM

#

yeah

#

so guys do you think AI will replace front end developers ??

whole zephyr May 21, 2024, 7:09 PM

#

I worked like 1.5 years with notebooks and it was a bit hard for me to transition back to the classic way of coding

sick eagle May 21, 2024, 7:09 PM

#

i will never choose front end

whole zephyr May 21, 2024, 7:10 PM

#

I don't like front end either but anyway

sick eagle May 21, 2024, 7:10 PM

#

whole zephyr I worked like 1.5 years with notebooks and it was a bit hard for me to transitio...

i just 3 months

wooden sail May 21, 2024, 7:10 PM

#

iron basalt Don't get stuck in analysis paralysis.

looks at python code
for every epsilon, there is a delta...

iron basalt May 21, 2024, 7:10 PM

#

sick eagle so guys do you think AI will replace front end developers ??

If it does it says more about frontend work as a whole than about AI... (or any other "job")

sick eagle May 21, 2024, 7:10 PM

#

whole zephyr I don't like front end either but anyway

i like back end is just like, i see my self like mr robot 🫠 🤣

wooden sail May 21, 2024, 7:10 PM

#

sick eagle so guys do you think AI will replace front end developers ??

sick eagle May 21, 2024, 7:11 PM

#

but after years AI will replace front end

#

i search in google

iron basalt May 21, 2024, 7:12 PM

#

sick eagle but after years AI will replace front end

Well, eventually everything maybe, frontend will be the least of our concerns at that point...

sick eagle May 21, 2024, 7:12 PM

#

it will never replace all developers but just front end can ai replace it

sick eagle May 21, 2024, 7:12 PM

#

iron basalt Well, eventually everything maybe, frontend will be the least of our concerns at...

yeah

whole zephyr May 21, 2024, 7:13 PM

#

anyway, does anyone know how to train a network with 2 inputs X1, X2 (like a raw and a target) to match some output parameters Y

so that when I apply it to the real use case I only have the raw input X1 and I need to find the parameters that would transform X1 to X2 without having the X2?

sick eagle May 21, 2024, 7:13 PM

#

iron basalt Well, eventually everything maybe, frontend will be the least of our concerns at...

i think much people choose ai

#

right??

iron basalt May 21, 2024, 7:14 PM

#

sick eagle i think much people choose ai

Will people be choosing at all at that point?

sick eagle May 21, 2024, 7:14 PM

#

if that happens the market will..

sick eagle May 21, 2024, 7:14 PM

#

iron basalt Will people be choosing at all at that point?

not all people, because their eyes see AI is the future

whole zephyr May 21, 2024, 7:16 PM

#

wooden sail you might be able to rewrite the problem as a "self supervised problem" where yo...

or I could use an encoder to estimate the params from the raw to the desired AND use the output of the decoder which should be the desired signal

and then the decoder would receive the params+raw as input and would generate the desired signal

wooden sail May 21, 2024, 7:18 PM

#

that's already implicitly included in the model, since forward modelling the input parameters into synthetic data means your data y is a function of the params x. then you want to use y(x) to estimate x, which is a typical inverse problem
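A rough sketch of that inverse-problem setup with NumPy. The forward model (amplitude and offset of a sine) and the linear least-squares estimator are assumptions picked to keep the example small; a real parametric filter would usually be nonlinear in its parameters and need a neural network in place of the linear map.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)

def forward(params):
    """Hypothetical forward model: (amplitude, offset) -> signal."""
    amplitude, offset = params
    return amplitude * np.sin(2 * np.pi * t) + offset

# Synthetic training set: sample parameters x, generate data y(x).
train_params = rng.uniform(-1.0, 1.0, size=(200, 2))
train_signals = np.stack([forward(p) for p in train_params])

# "Training": least-squares fit of a linear map from signals back to params.
weights, *_ = np.linalg.lstsq(train_signals, train_params, rcond=1e-10)

# Inference: only the signal is available, the parameters are estimated.
true_params = np.array([0.3, -0.5])
estimated = forward(true_params) @ weights
print(np.round(estimated, 3))
```

Because this toy forward model is linear in its parameters, the linear estimator recovers them essentially exactly; that will not hold in general.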

whole zephyr May 21, 2024, 7:18 PM

#

thing is: I already have a deterministic "algo" for the "decoder" - i.e. the signal processor I want to set those params for

wooden sail May 21, 2024, 7:20 PM

#

which algorithm are you using? if it's deterministic and fixed, you certainly don't need both the signals and the parameters

#

there is no training

whole zephyr May 21, 2024, 7:21 PM

#

it's a parametric filter that's used to transform X1 to X2

#

I have a training set of a very specific set of desired X2s

#

and when I encounter real data, I wouldn't have the X2, but just an idea of what X2 I would want to find

#

it's basically the mixing process of a sound engineer

#

so the training would be needed to find the "good" parameters to make any X1 signal sound better even if I don't have a target X2

but I want to estimate the params instead of having the desired signal as a direct output, because I want to be able to make changes in the parameters for full customizability of the output

wooden sail May 21, 2024, 7:26 PM

#

that sounds like a typical ML problem though, what's the issue?

#

you want to compute X2 from X1, you have a parametric function that can do that but you don't know the parameters

#

you have examples of X2 as well. do you have the X1 that go with those X2's?

quiet bridge May 21, 2024, 10:29 PM

#

Hi guys, I have actually started learning SQL for data science.
I was wondering what I should do after learning the syntax and the basics like where, group by, join, etc.

spring field May 21, 2024, 10:30 PM

#

practice

quiet bridge May 21, 2024, 10:30 PM

#

Surely but how

spring field May 21, 2024, 10:31 PM

#

I mean, ig practicing SQL on its own is gonna be a bit tough, you kinda gotta include it in some bigger project

quiet bridge May 21, 2024, 10:32 PM

#

So you suggest that instead of doing HackerRank problems, I should jump straight into using it in some project that requires SQL?

spring field May 21, 2024, 10:32 PM

#

there are hackerrank problems that only need SQL?

#

but yeah, usually projects are a great way to practice

quiet bridge May 21, 2024, 10:33 PM

#

Definitely

agile cobalt May 21, 2024, 10:33 PM

#

check the pins in #databases

spring field May 21, 2024, 10:33 PM

#

I mean, ig you can try hackerrank as well

quiet bridge May 21, 2024, 10:33 PM

#

agile cobalt check the pins in #databases

Okay thanks

quiet bridge May 21, 2024, 10:34 PM

#

spring field I mean, ig you can try hackerrank as well

Yea

#

I get it now

#

Can I tell you ig what I am doing rn towards learning Data Science to make sure I am doing it right and in the right order?

#

I am just making sure that you are specialized in it

spring field May 21, 2024, 10:36 PM

#

I'm not particularly specialized, but you can check out the pinned messages in this channel

quiet bridge May 21, 2024, 10:37 PM

#

Yea I see

#

But do you know about Andrew Ng Specialization for ML on Coursera?

spring field May 21, 2024, 10:40 PM

#

nope

calm hatch May 22, 2024, 6:45 AM

#

i have categorical columns in my dataset that have missing values what could be a good way to impute them besides replacing empty values with the mode for that column? something that might impute these values depending on some other column to make it closer to what could have been the real value. I hope the question is understandable

velvet olive May 22, 2024, 7:28 AM

#

So, you want to infer some data depending on the values of other rows?

calm hatch May 22, 2024, 8:54 AM

#

velvet olive So, you want to infer some data depending on the values of other rows?

yes. taking an example: suppose there is a dataset with cols A B C. A is a categorical col with many null values. I want to handle these missing values, and to do so I want to use cols B and C to help. This way I can get a value which is as realistic as possible w.r.t. the given dataset.

velvet olive May 22, 2024, 9:04 AM

#

if you have enough examples, you could create a small supervised model to infer those values but it might involve a bit of work. But since you already have all the data structured it could be worth it

calm hatch May 22, 2024, 9:50 AM

#

velvet olive if you have enough examples, you could create a small supervised model to infer ...

thanks for the idea👍

jaunty helm May 22, 2024, 10:43 AM

#

calm hatch i have categorical columns in my dataset that have missing values what could be ...

one idea is you can groupby some other column and use the modes of each group to fill instead
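A small pandas sketch of that idea. The data and the helper `group_mode` are made up for illustration: "A" is the categorical column with gaps and "B" is the column used for grouping.

```python
import pandas as pd

# Hypothetical data: "A" has missing values, "B" defines the groups.
df = pd.DataFrame({
    "B": ["x", "x", "x", "y", "y", "y"],
    "A": ["red", "red", None, "blue", None, "blue"],
})

def group_mode(series):
    """Most frequent non-null value in the group, or None if there is none."""
    modes = series.mode()
    return modes.iloc[0] if not modes.empty else None

# Broadcast each group's mode back to its rows, then fill only the gaps.
fill_values = df.groupby("B")["A"].transform(group_mode)
df["A"] = df["A"].fillna(fill_values)
print(df)
```

Groups where every value is missing stay missing, so a fallback to the global mode may still be needed afterwards.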

calm hatch May 22, 2024, 12:30 PM

#

oh! that's actually a very nice approach. Thanks a lot🙌

fallow coyote May 22, 2024, 1:02 PM

#

afternoon everyone. can anyone recommend alternatives to the mml book? or website which lists all the topics to learn for ML (excluding the stats)?

barren mango May 22, 2024, 1:07 PM

#

hey guys

past meteor May 22, 2024, 3:15 PM

#

fallow coyote afternoon everyone. can anyone recommend alternatives to the mml book? or websit...

Take a look at the pinned posts, I wrote some good ones down there

orchid forge May 22, 2024, 4:32 PM

#

i'm studying Probability and Statistics from khan academy, since u asked. are you sure there isn't a Probability and Statistics course specifically for data analysis coders?

orchid forge May 22, 2024, 5:05 PM

#

Yeah it's good

#

Actually

river cape May 22, 2024, 6:51 PM

#

Any idea as to why we use the toarray() function to get the output of the Bag of Words?

tidal bough May 22, 2024, 6:56 PM

#

It's hard to tell what you mean without any context, but generally toarray in ML is used just to make something a normal numpy array, rather than a torch tensor (which is similar to a numpy array but has extra stuff like autograd attached to it).

river cape May 22, 2024, 7:05 PM

#

Here why do we use toarray()

river cape May 22, 2024, 7:25 PM

#

tidal bough It's hard to tell what you mean without any context, but generally `toarray` in ...

Here why do we use toarray()

iron ruin May 22, 2024, 7:28 PM

#

self - Reminder to not always set CV so high for parameter tuning, unless absolutely necessary

left tartan May 22, 2024, 7:41 PM

#

quiet bridge Hi Guys I have actually started learning SQL for data science. I was wondering w...

A couple of resources - sqlbolt

flat sigil May 22, 2024, 11:47 PM

#

well shit

#

i messed something up 💀

flat sigil May 23, 2024, 12:41 AM

#

i am scrapping and completely redesigning my reward function

#

cuz obviously something aint working

buoyant sapphire May 23, 2024, 3:23 AM

#

C:\Users\Adam\AppData\Local\Programs\Python\Python312\Lib\site-packages\google\protobuf\symbol_database.py:55: UserWarning: SymbolDatabase.GetPrototype() is deprecated. Please use message_factory.GetMessageClass() instead. SymbolDatabase.GetPrototype() will be removed soon.

buoyant sapphire May 23, 2024, 3:55 AM

#

https://discord.com/channels/267624335836053506/1243023968194134106

charred sandal May 23, 2024, 6:29 AM

#

hey
i have detection model for potholes, which is intended to work on the DVR set up on the driving car
currently I have only class potholes, but I need to diversify it and create new classes based on the size of the potholes : small_pothole, medium_pothole, large_pothole
should I retrain the model, or should I leverage the area of the potholes based on the pixels, such as
bbox_area = (x_max - x_min) * (y_max - y_min)

if bbox_area > area_threshold_large:
    return "large_pothole"
elif ...
else ...
?
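A runnable sketch of the thresholding option. The threshold values and the function name `classify_pothole` are hypothetical; real thresholds would have to be tuned on labelled footage from the same camera setup.

```python
# Hypothetical pixel-area thresholds, not taken from any real dataset.
AREA_THRESHOLD_MEDIUM = 2_000
AREA_THRESHOLD_LARGE = 10_000

def classify_pothole(x_min, y_min, x_max, y_max):
    """Map a bounding box (in pixels) to a size class by its area."""
    bbox_area = (x_max - x_min) * (y_max - y_min)
    if bbox_area > AREA_THRESHOLD_LARGE:
        return "large_pothole"
    if bbox_area > AREA_THRESHOLD_MEDIUM:
        return "medium_pothole"
    return "small_pothole"

print(classify_pothole(100, 100, 130, 140))  # area 1200
print(classify_pothole(100, 100, 300, 200))  # area 20000
```

One caveat with this approach: pixel area depends on how far the pothole is from the camera, so a distant large pothole can fall under the "small" threshold unless the area is corrected for perspective.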

orchid forge May 23, 2024, 7:35 AM

#

Hey data people

#

As a non-passionate data analysis student, what do you guys think I need to understand when I'm finding it hard to focus on a project, and sometimes lose motivation while making one?

orchid forge May 23, 2024, 8:43 AM

#

How to do that ?

#

Maybe my brain thinks everything is hard for it to understand until I do, you need to know that I'm not too sharp

pliant heron May 23, 2024, 10:29 AM

#

# now lets try to convert all the cells in the date column into dates via to_datetime()
import pandas as pd
df = pd.read_csv('dirtydata.csv')
df["Date"] = pd.to_datetime(df["Date"])
print(df.to_string())

I am getting an error, can someone tell me what's wrong here
i am trying to clean dates that are formatted the wrong way
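The error message is not shown, so this is only a guess at the usual cause: some cells in the Date column cannot be parsed (empty, or in a different format from the rest). A sketch on made-up values, using `errors="coerce"` so that unparseable cells become `NaT` and can be inspected or dropped:

```python
import pandas as pd

# Made-up stand-in for the "Date" column of dirtydata.csv.
df = pd.DataFrame({"Date": ["2020/12/01", "2020/12/02", "not a date", None]})

# Unparseable or missing cells become NaT instead of raising an exception.
df["Date"] = pd.to_datetime(df["Date"], errors="coerce")

print(df[df["Date"].isna()])  # rows that still need attention
cleaned = df.dropna(subset=["Date"])
print(cleaned)
```

If the valid rows use several different date formats, pandas 2.0 and later also accept `format="mixed"` in `pd.to_datetime`, which parses each cell individually.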

hushed pike May 23, 2024, 11:02 AM

#

I always get ModuleNotFoundError: No module named 'loss_functions' in the following lines:

  File "/home/user/backend-project/core/scripts/classification.py", line 13, in classify_single_label
    checkpoint = torch.load(model_path, map_location=device)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/backend-project/.venv/lib/python3.11/site-packages/torch/serialization.py", line 1025, in load
    return _load(opened_zipfile,
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/user/backend-project/.venv/lib/python3.11/site-packages/torch/serialization.py", line 1446, in _load
    result = unpickler.load()
             ^^^^^^^^^^^^^^^^
  File "/home/user/backend-project/.venv/lib/python3.11/site-packages/torch/serialization.py", line 1439, in find_class
    return super().find_class(mod_name, name)

Problem is, that this function (which I have used in another project for training my model) is never used in this new API project. Anyone ever experienced something similar?

left tartan May 23, 2024, 11:29 AM

#

hushed pike I always get `ModuleNotFoundError: No module named 'loss_functions'` in the foll...

!traceback Paste the full traceback plz

arctic wedgeBOT May 23, 2024, 11:29 AM

#

Traceback

Please provide the full traceback for your exception in order to help us identify your issue.
While the last line of the error message tells us what kind of error you got,
the full traceback will tell us which line, and other critical information to solve your problem.
Please avoid screenshots so we can copy and paste parts of the message.

A full traceback could look like:

Traceback (most recent call last):
  File "my_file.py", line 5, in <module>
    add_three("6")
  File "my_file.py", line 2, in add_three
    a = num + 3
        ~~~~^~~
TypeError: can only concatenate str (not "int") to str

If the traceback is long, use our pastebin.

left tartan May 23, 2024, 11:30 AM

#

hushed pike I always get `ModuleNotFoundError: No module named 'loss_functions'` in the foll...

But, the fact that you're unpickling an object is likely the problem. What kind of object is it?

#

You (or someone) trained some model, pickled it, then are trying to load it... but to load it, you need the modules that it uses.
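A sketch of the mechanism using the standard `pickle` module instead of `torch.load` (the traceback shows `torch.load` going through an unpickler). The in-memory module and the class name `FocalLoss` are hypothetical stand-ins for the real `loss_functions` file: pickle records classes and functions by import path, so loading fails when that path cannot be imported, even if the object is never called.

```python
import pickle
import sys
import types

# Build a throwaway module named "loss_functions" in memory, standing in for
# the training project's real file.
module = types.ModuleType("loss_functions")
exec("class FocalLoss:\n    gamma = 2.0\n", module.__dict__)
sys.modules["loss_functions"] = module

# Pickling an instance stores "loss_functions.FocalLoss" by name, not the code.
payload = pickle.dumps(module.FocalLoss())

# Simulate the new project, where that module cannot be imported.
del sys.modules["loss_functions"]
try:
    pickle.loads(payload)
    error = None
except ModuleNotFoundError as exc:
    error = exc

print(error)
```

The name in the error is the import path that was recorded, so `import loss_functions` has to succeed from wherever the loading code runs. A common way to avoid the dependency altogether is to save only `model.state_dict()` rather than objects that reference training-only code.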

hushed pike May 23, 2024, 11:48 AM

#

left tartan You (or someone) trained some model, pickled it, then are trying to load it... b...

I've trained the model(s). I also know which module is needed, but I was so confused, as it is never used anywhere else outside of training. But pasting the module back to its original location in the new project, where I changed all locations for each file (since I am building the backend now, with a proper api structure), didn't work. Also new locations weren't accepted as well.

Is there a way to find out where the module should be located at?

Traceback will be added via pastebin, as the python deleted my message...

#

*python bot

left tartan May 23, 2024, 11:49 AM

#

hushed pike I've trained the model(s). I also know which module is needed, but I was so conf...

Is the module your code or something you installed via pip?

hushed pike May 23, 2024, 11:51 AM

#

No, custom made (by me, my code) module, consisting of 2 functions. It is really small and only used for the loss calculation in training.

hushed pike May 23, 2024, 11:52 AM

#

left tartan !traceback Paste the full traceback plz

Traceback: https://paste.pythondiscord.com/5WZA

left tartan May 23, 2024, 12:12 PM

#

Sorry, missed the last msg, one sec

#

How is your code laid out? Just test to make sure your module can be imported by a simple program

#

This would be a good help thread, as I'm leaving in a few minutes #❓｜how-to-get-help

late lichen May 23, 2024, 1:15 PM

#

What activation function I can use on hidden nodes and output nodes?

agile cobalt May 23, 2024, 1:19 PM

#

hidden nodes: nearly any can work, there isn't one clear winner for all cases, but ReLU is a popular choice
output nodes: if you mean the final output, very problem dependent but much of the time you wouldn't include one - The output of the model has to match the properties of your target variable

Most tutorials should mention which activation functions you should use for a particular problem, and you can always check which ones popular architectures are using
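A small NumPy sketch of how the output activation follows the target. The logits are made-up numbers and the pairings are the usual textbook ones, not a rule for every model.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    shifted = x - np.max(x, axis=-1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

logits = np.array([2.0, -1.0, 0.5])  # made-up raw outputs of a last layer

hidden = relu(logits)             # typical hidden-layer activation
regression_out = logits           # unbounded real target: no activation
binary_out = sigmoid(logits)      # each value in (0, 1): probabilities
multiclass_out = softmax(logits)  # non-negative, sums to 1: class distribution
print(hidden, binary_out, multiclass_out)
```

In PyTorch, `nn.CrossEntropyLoss` expects raw logits and applies the softmax internally, which is one reason a classification model often ends without an explicit activation.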

wooden sail May 23, 2024, 1:53 PM

#

neurips does double blind and open review (the papers are put up somewhere where anyone can comment and leave feedback, plus reviewers are assigned to it)

#

as for the quality... it's a huge conference with tens of thousands of submissions, so there have been concerns with the quality of the reviews

native narwhal May 23, 2024, 1:58 PM

#

how do i find bends in this image?

agile cobalt May 23, 2024, 2:33 PM

#

you'll have to be more specific about what you are asking

past meteor May 23, 2024, 2:56 PM

#

Yes, this table represents the correlations between each variable, the correlation is 0.389583. What exactly do you not understand?

#

On the diagonal you're computing the correlation between a variable and itself, which is obviously 1 (look at the formula again if you're unsure)

Afterwards you compute the correlation between weight and height, this is exactly the same as computing the correlation between height and weight (look at the formula again if you're unsure about this one as well)
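Both properties can be checked with `np.corrcoef`. The height and weight values below are made up and only show the structure of the matrix.

```python
import numpy as np

# Made-up measurements, not the data from the table being discussed.
height = np.array([150.0, 160.0, 165.0, 172.0, 180.0])
weight = np.array([52.0, 58.0, 70.0, 65.0, 80.0])

corr = np.corrcoef(height, weight)  # 2x2 Pearson correlation matrix
print(corr)
```

The diagonal entries are 1 (each variable against itself) and the two off-diagonal entries are equal, because the formula is symmetric in its two arguments.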

spring field May 23, 2024, 4:17 PM

#

do I just replace my SingleHeadedAttention with this? which one? try with both?
what's x_bcd 👀
(what are all these attention types)

#

it's moments like that that make me wonder, well, surely I'm in the valley of despair, but maybe it's a logarithmic scale

crimson schooner May 23, 2024, 4:33 PM

#

Hey, I am new to pytorch and trying to train and evaluate a basic model. The training is not working very well. Is there someone here I could ask some questions in DMs about this? Maybe get some tips and hints.

past meteor May 23, 2024, 4:42 PM

#

crimson schooner Hey, I am new to pytorch and trying to train and evaluate a basic model. The trai...

Heads up, nobody will answer in DMs. It's best to just describe your issue here and if someone can help they will help

spring field May 23, 2024, 4:46 PM

#

mmm, I use multiheaded attention, the singleheaded attention class is so I can expand it more easily and such or is that what you meant?

#

what? I'm pretty sure I'm doing it pretty much how it's described

#

I don't remember anything about that

#

the shape of the output of the multiheaded sublayer is the same as the shape of a single head

#

is that not what you mean?

#

the HIDDEN_SIZE is the size of a single head

#

yeah, that's what I have

#

concat dotted with the fc layer

#

it's for a single head

#

yeah, so the input gets fed to each head, each of which has a hidden size of 64, each of them output a shape of (64, 64), that gets concatenated, so you get (64, 512) that then gets dotted with (512, 64) and the output of the whole multiheaded sublayer is (64, 64)
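The shape bookkeeping described here can be checked with random (untrained) weights. This is only a NumPy sketch under the sizes quoted in the message: sequence length 64, 8 heads, head size 64. `EMBEDDING_DIM` and `HIDDEN_SIZE` follow the names mentioned in the chat; the rest is assumed.

```python
import numpy as np

rng = np.random.default_rng(0)
SEQ_LEN, EMBEDDING_DIM, HIDDEN_SIZE, NUM_HEADS = 64, 64, 64, 8

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention_head(x):
    """One head with its own random (untrained) Q, K, V weights."""
    w_q, w_k, w_v = (rng.normal(scale=0.1, size=(EMBEDDING_DIM, HIDDEN_SIZE))
                     for _ in range(3))
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    return softmax(q @ k.T / np.sqrt(HIDDEN_SIZE)) @ v  # (SEQ_LEN, HIDDEN_SIZE)

x = rng.normal(size=(SEQ_LEN, EMBEDDING_DIM))
# Every head receives the same input but has its own weights.
heads = [attention_head(x) for _ in range(NUM_HEADS)]
concat = np.concatenate(heads, axis=-1)  # (64, 512)
w_o = rng.normal(scale=0.1, size=(NUM_HEADS * HIDDEN_SIZE, EMBEDDING_DIM))
out = concat @ w_o  # (64, 64)
print(heads[0].shape, concat.shape, out.shape)
```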

#

mmm, should it? it just duplicates the input for each head pretty much

#

I was also reading this alongside the paper: https://jalammar.github.io/illustrated-transformer/

lone hollow May 23, 2024, 5:16 PM

#

Are you guys aware of any hackathons where AI/ML skills are used?
I only hear web dev guys going and rocking there

wooden sail May 23, 2024, 5:19 PM

#

i just wanna highlight that you don't need it to be a rectangular matrix for it to be a projection

#

(in fact rectangular matrices don't define projections at all, but the resulting vector space is isomorphic to a low dimensional subspace of the domain)

#

that it does, yes, but still you don't need it to be rectangular for that to happen

#

you also don't

#

you noted yourself that there is an implementation that uses dropout instead of those rectangular matrices

spring field May 23, 2024, 5:25 PM

#

I'm deeply lost now

#

what is token embed?

wooden sail May 23, 2024, 5:26 PM

#

dropout is a projection onto a low dimensional subspace. a proper one, too. square matrix and idempotent, rank deficient

#

an identity with missing 1s on the diagonal
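A quick NumPy check of that claim, with a made-up mask: the masking step of dropout is multiplication by a diagonal 0/1 matrix, which is idempotent and rank deficient.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
keep = np.array([1, 0, 1, 1, 0, 1], dtype=float)  # made-up dropout mask
P = np.diag(keep)  # identity with missing 1s on the diagonal

x = rng.normal(size=n)
dropped = P @ x  # same result as keep * x

is_idempotent = np.allclose(P @ P, P)  # projection property
rank = np.linalg.matrix_rank(P)        # number of kept units
print(is_idempotent, rank)
```

This covers only the mask itself. Implementations that rescale the kept units by 1 / (1 - p) during training multiply the mask by a constant, and that scaled matrix is no longer idempotent.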

#

they really don't

#

good for them, but they don't need to be

#

here

#

by enforcing low rank with some other condition

#

e.g. requiring the matrix to be diagonal and only a specified number of nonzero entries

#

implementation in paper and in code are two different things as well

#

in the example i gave you above of punctured identity mats, you can use a sparse array representation that is pretty efficient

#

the matrix also has square root as many parameters. win-win

#

i'm not bringing it up to be pedantic, but rather to highlight that there are multiple ways of achieving the same effect, and as you noted, not all of them have been tried. they have different properties and entail different costs. the rectangular approach certainly achieves the effect you want, but i don't want either of you to be chained to that approach and/or wonder why all of a sudden people do something different in another paper

#

the main idea is low dimensional subspaces, and you can get those in more than one way

spring field May 23, 2024, 5:40 PM

#

ok, that visualization really messed with me

#

I just have this pretty much

#

alright

#

here's what I understand

#

in_features of each head correspond to embedding dimensions

#

which is what I have

#

it just happens that they are the same value

#

alright, in that case, I have the architecture implemented correctly I just need to ensure that embedding dim == number of heads * hidden size

#

phew

#

alright, but each head still receives the same input, right? like it's duplicated across each head, just each head has its own weights

#

alright, I see now, I misunderstood you at first

hollow escarp May 23, 2024, 7:28 PM

#

Has anyone ever installed onnxruntime for armv7 architecture

dull radish May 23, 2024, 8:21 PM

#

Hey so I was deploying a simple model with the following code in the predict function
```python
    item = Item(
        prompt=prompt,
    )

    messages = [
        {
            "role": "system",
            "content": "You are an expert programming assistant",
        },
        {"role": "user", "prompt": item.prompt},
    ]

    outputs = pipe(
        messages,
        max_new_tokens=512,
        do_sample=True,
        temperature=0.7,
        top_k=50,
        top_p=0.95,
        stop_sequence="<|im_end|>",
    )

    return outputs[0]["generated_text"][-1]["content"]
```

and I'm getting this error when calling the function:
```{
    "run_id": "5e5b8f00-2587-93e4-96f5-bf23009ee062",
    "result": {
        "error": "When passing chat dicts as input, each dict must have a 'role' and 'content' key."
    },
    "run_time_ms": 26864.662170410156
}```
My function call was just a simple:
```{
    "prompt": "Program to add 3 numbers"
}```
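
(A hedged note on the error above: the message says every chat dict needs a 'role' and a 'content' key, and the user message in the snippet uses "prompt" as its second key instead of "content". A minimal sketch of the corrected message list follows. `build_messages` and `check_messages` are hypothetical helpers written for this example; `check_messages` only mirrors the rule stated in the error text and is not the pipeline's own validation code.)

```python
REQUIRED_KEYS = {"role", "content"}


def build_messages(prompt):
    """Build the chat input with the user text under the 'content' key."""
    return [
        {"role": "system", "content": "You are an expert programming assistant"},
        {"role": "user", "content": prompt},  # was: "prompt": item.prompt
    ]


def check_messages(messages):
    """Hypothetical local check mirroring the rule in the error message."""
    for index, message in enumerate(messages):
        missing = REQUIRED_KEYS - message.keys()
        if missing:
            raise ValueError(f"message {index} is missing keys: {sorted(missing)}")


messages = build_messages("Program to add 3 numbers")
check_messages(messages)  # passes silently

# The original shape of the user message fails the same check.
broken = [{"role": "user", "prompt": "Program to add 3 numbers"}]
try:
    check_messages(broken)
    broken_rejected = False
except ValueError:
    broken_rejected = True
```

Running a check like this before calling the pipeline surfaces the problem locally instead of after a 26 second remote run.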

spring field May 23, 2024, 9:34 PM

#

what is x_bcc supposed to be?

#

I'll assume dotproducts_bcc

#

well, it is

#

in the code you sent

#

in softmax

#

lol, what

#

also how is this supposed to work if the matrices are not square?

metric_dd = (self.coefmatrix_dd.weight + self.coefmatrix_dd.weight.transpose(-1, -2)) / 2

#

no, but why are you in vr 😄

#

well, I changed it

#

but why wouldn't it generalize over non-square matrices?

#

alright, I named it EMBEDDING_DIM and HIDDEN_SIZE btw

#

I assume K is the head hidden size

#

well, not to me
at least for now...

#

wait, why did it change to D, K for kk?

#

alr

#

kd seems to be the right one

#

also bit of a technicality, but why not use .forward and set bias=False or use a Parameter(Tensor(...))

deft solar May 23, 2024, 9:58 PM

#

any recommended projects for data analyst or data scientist

spring field May 23, 2024, 9:58 PM

#

anyway, it's running, but I changed it to projection_kd instead of transposing

#

makes sense, but then why not just create a Tensor param

#

ah, I see

#

makes sense

honest reef May 23, 2024, 10:17 PM

#

can someone guide me where to start with data science? currently i know python and some bash, can bash be helpful? where should i look for algorithms and such...

#

I know python is not the only necessary tool to use, but i want to begin from somewhere

spring field May 23, 2024, 10:19 PM

#

bash can probably be useful for deployments

warm pebble May 23, 2024, 10:21 PM

#

hello i need help making an object detection ai

#

i have no idea how to start

honest reef May 23, 2024, 10:24 PM

#

spring field bash can probably be useful for deployments

it can indeed, um actually, are there any fundamental steps i should take before getting into data science? i can decide for bash later on, i dont think its necessary right now

spring field May 23, 2024, 10:25 PM

#

probably not necessary rn, no
you can take a look at the pinned messages here
same for @warm pebble

honest reef May 23, 2024, 10:26 PM

#

ok I'll take a look at them, thanks

spring field May 23, 2024, 10:26 PM

#

something seems off with that testing accuracy

#

does it mask everything it needs to mask though?

#

doesn't it need to mask below the sequence as well?

#

I mean like this

#

also as I understand masking is done only for the decoder column, right?

#

uhhh, is it because packing the sequence is gonna throw away the rest?

#

what's in those places then? the c has to have the same shape as the rest of everything regardless of the actual sequence size

#

now I'm confused

#

yeah

#

yes

#

but like what if there are other lengths in that batch?

#

actually, gimme a sec, I need to check something, lol

#

waiiit nooo

#

it gets padded

#

the dataset I have has varying lengths of text as far as I'm aware

#

and it doesn't get preprocessed to cut that down

#

it just gets padded with zeros

#

now, I'm not saying that's how it should be, but that's how it was handed to me

#

the dataset is a bunch of sentences of varying length, yes

spring field May 23, 2024, 10:55 PM

#

spring field I mean like this

I mean, ig then it makes sense to do this if I do have varying lengths of sequences

#

classification as in sentiment analysis?

#

mmm

#

I'm slightly veering into the RNN territory again, thinking of those architectures, lol

#

I think I saw those graphs where you thought there might be a leak

spring field May 23, 2024, 11:15 PM

#

spring field I mean like this

the only issue is that I haven't found a way besides using a for loop to mask that other area...

#

and that makes the training incredibly slow
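
(A minimal sketch of building that mask without a Python for loop, assuming the batch is padded to a common length c and that `lengths` holds the true length of each sequence, as in the `forward(self, x_bcd, lengths, ...)` signature. `build_attention_mask` is a name invented here. It masks padded keys (columns) by broadcasting and optionally combines that with the triangular mask. Padded query rows are left alone, on the assumption that they get ignored in the loss, which also avoids rows that are entirely -inf.)

```python
import torch


def build_attention_mask(lengths, seq_size, causal=True):
    """Boolean (b, c, c) mask; True marks key positions a query may attend to."""
    positions = torch.arange(seq_size, device=lengths.device)
    # (b, c): True where the key position holds a real token, False for padding.
    key_is_real_bc = positions[None, :] < lengths[:, None]
    # Broadcast over the query axis: every query row sees the same key mask.
    allowed_bcc = key_is_real_bc[:, None, :].expand(-1, seq_size, -1)
    if causal:
        causal_cc = torch.ones(seq_size, seq_size, device=lengths.device).tril().bool()
        allowed_bcc = allowed_bcc & causal_cc
    return allowed_bcc


lengths = torch.tensor([4, 2])          # second sequence has 2 padded positions
allowed_bcc = build_attention_mask(lengths, seq_size=4)

scores_bcc = torch.zeros(2, 4, 4)       # stand-in for the dot products
attention_bcc = torch.softmax(
    scores_bcc.masked_fill(~allowed_bcc, -torch.inf), dim=-1
)
print(attention_bcc[1])
```

Because key 0 is always a real token and always at or before the query position, no row ends up fully masked as long as every length is at least 1.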

spring field May 23, 2024, 11:46 PM

#

sth about that test accuracy ain't looking good still

spring field May 24, 2024, 12:42 AM

#

status update

spring field May 24, 2024, 1:23 AM

#

alright, I have a feeling I messed up somewhere...

spring field May 24, 2024, 1:56 AM

#

it appears the issue was too many dimensions for the embeddings?
yeah, that seems to be it, dam

spring field May 24, 2024, 1:58 AM

#

spring field I mean like this

I also just found the median sentence length and just sliced all the sentences to that length (that were long enough) so I have a fixed context size across the entire dataset so I don't have to use that dang slow for loop to do this whole thing, I can just use the triangle mask

#

I can't believe that doing what I did just now made it so much faster, like, it took a couple of minutes to reach an accuracy similar to what took like 6 hours before...

#

so anyway, this is what I got with my model (but like, improved as of today)


#

this is this, it took it longer to reach similar accuracy as you can see, but it was certainly more interesting to see how the attention matrix developed over time (or maybe that's because it just took longer to develop, lol)


spring field May 24, 2024, 3:16 AM

#

and this is for this, as you had said, it's rather similar to what I have
I modified the code a bit to do the projection thingy

import torch

# EMBEDDING_DIM and HIDDEN_SIZE are module-level constants defined elsewhere.

class SingleHeadedAttention(torch.nn.Module):
    def __init__(self, mask: bool):
        super().__init__()
        self.projection_kd = torch.nn.Linear(in_features=EMBEDDING_DIM, out_features=HIDDEN_SIZE)
        self.coefmatrix_kk = torch.nn.Linear(in_features=HIDDEN_SIZE, out_features=HIDDEN_SIZE)

        self.mask = mask

    def forward(self, x_bcd, lengths, y=None, soft_attention=False):
        # Project the embeddings (d) down to the head's hidden size (k).
        x_bkc = self.projection_kd.weight @ x_bcd.transpose(1, 2)
        x_bck = x_bkc.transpose(1, 2)
        dotproducts_bcc = x_bck @ self.coefmatrix_kk.weight @ x_bkc

        if self.mask:
            # Causal mask: position i may only attend to positions <= i.
            seq_size = x_bck.size(1)
            mask = torch.tril(torch.ones(seq_size, seq_size, device=x_bck.device))
            dotproducts_bcc = dotproducts_bcc.masked_fill(mask == 0, value=-torch.inf)

        dotproducts_bcc = torch.softmax(dotproducts_bcc, dim=-1)
        # The output stays in the projected space, so its shape is (b, c, k).
        y_bck = dotproducts_bcc @ x_bck

        if soft_attention:
            return y_bck, dotproducts_bcc
        else:
            return y_bck


gritty vessel May 24, 2024, 4:20 AM

#

```python
import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
        logical_gpus = tf.config.list_logical_devices('GPU')
        print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
    except RuntimeError as e:
        print(e)
```
```
2 Physical GPUs, 2 Logical GPUs
```
why am i getting this?

#

i have only one gpu on my system

#

also got this error when i started to train my model
https://pastebin.com/KHJDMfpe


frosty fulcrum May 24, 2024, 7:37 AM

#

Does anyone know the best Python library to under-sample the regression dataset to deal with an imbalanced dataset? I've tried resreg, but it's not really helpful since I can't control the under-sampling dataset, and the other one is smoter, which is extremely slow.

#

i'm not the one who created the dataset.

past meteor May 24, 2024, 7:46 AM

#

frosty fulcrum Does anyone know the best Python library to under-sample the regression dataset ...

Don't undersample imo

#

You're better off using some sort of cost function

#

They added this in the latest version of sklearn. Now you don't need to do it manually

#

https://scikit-learn.org/stable/auto_examples/release_highlights/plot_release_highlights_1_5_0.html#fixedthresholdclassifier-setting-the-decision-threshold-of-a-binary-classifier

scikit-learn

Release Highlights for scikit-learn 1.5

We are pleased to announce the release of scikit-learn 1.5! Many bug fixes and improvements were added, as well as some key new features. Below we detail the highlights of this release. For an exha...

#

Before I'd have written it out in full but it seems they even have a guide that explains my point well now 😄

https://scikit-learn.org/stable/auto_examples/model_selection/plot_cost_sensitive_learning.html

scikit-learn

Post-tuning the decision threshold for cost-sensitive learning

Once a classifier is trained, the output of the predict method outputs class label predictions corresponding to a thresholding of either the decision_function or the predict_proba output. For a bin...

frosty fulcrum May 24, 2024, 7:53 AM

#

past meteor Before I'd have written it out in full but it seems they even have a guide that ...

thanks i'll look into it

gritty vessel May 24, 2024, 8:38 AM

#

hey how to decide which scaling to apply on data?

past meteor May 24, 2024, 8:54 AM

#

gritty vessel hey how to decide which scaling to apply on data?

Models that use gradient descent or L2/L1 regularization need some sort of scaling, frequently standard scaling is applied. Imo it's a nice and easy exercise to figure out why that's the case (just look at the equations).

Models that don't fall into this category (famously, tree based models) don't necessarily need scaling but I frequently do it anyway.

I'd say the biggest downside of standard scaling is that it isn't built on robust statistics. If you have outliers they can skew your mean and standard deviation. That would be an argument for using a different scaler (e.g., a robust scaler based on the median and IQR; min-max is even more sensitive to outliers). Defaulting to standard scaling is a good idea though.

gritty vessel May 24, 2024, 8:55 AM

#

past meteor Models that use gradient descent or L2/L1 regularization need some sort of scali...

can i show you snippet of data ?

past meteor May 24, 2024, 8:55 AM

#

sure

gritty vessel May 24, 2024, 8:56 AM

#

https://pastebin.com/QAJhm5sx

Pastebin

array([[[[ 19, 34, 621, 632], [ 18, 36, 621, 635], ...


#

1st and 2nd channels are in exponential distribution and 3 and 4th channel are in normal distribution

past meteor May 24, 2024, 8:57 AM

#

I'd start by standard scaling all of them

gritty vessel May 24, 2024, 8:58 AM

#

#

ok so apply standard scaling to all data

past meteor May 24, 2024, 8:58 AM

#

It'd be great if you figure out why (it's not a hard exercise)

gritty vessel May 24, 2024, 8:58 AM

#

to bring it in same range?

past meteor May 24, 2024, 8:58 AM

#

So really just take the equation of MSE loss + L2 regularization

spring field May 24, 2024, 8:59 AM

#

just saying but I'm masking only in decoder's self-attention, the decoder-encoder attention doesn't do masking and neither does encoder's self-attention

past meteor May 24, 2024, 8:59 AM

#

And think about "what happens to my regularization term if my variables are on different scales"

#

this one

gritty vessel May 24, 2024, 9:00 AM

#

oh ok it would be biased towards features with larger scale

past meteor May 24, 2024, 9:01 AM

#

gritty vessel oh ok it would be biased towards features with larger scale

Close, but it's the opposite. Small variables will have a large weight and would contribute disproportionately to the cost
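
(A small worked example of the point above, assuming the simplest possible setup: one feature, no intercept, MSE loss plus an L2 penalty `lam * w**2`. For that case the minimiser has the closed form w = sum(x*y) / (sum(x*x) + lam). `ridge_weight` and the metres/kilometres data are made up for illustration.)

```python
def ridge_weight(xs, ys, lam):
    """Minimiser of sum((y - w*x)**2) + lam * w**2 for one feature, no intercept."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


# The same information expressed on two scales.
metres = [1000.0, 2000.0, 3000.0, 4000.0]
kilometres = [m / 1000.0 for m in metres]
ys = [2.0, 4.0, 6.0, 8.0]          # y = 0.002 * metres = 2 * kilometres

lam = 10.0

# Without regularization the small-scale feature needs the large weight.
w_m_plain = ridge_weight(metres, ys, 0.0)        # 0.002
w_km_plain = ridge_weight(kilometres, ys, 0.0)   # 2.0

# With the same lam, compare how much each weight is shrunk.
shrink_m = ridge_weight(metres, ys, lam) / w_m_plain
shrink_km = ridge_weight(kilometres, ys, lam) / w_km_plain

print(w_m_plain, w_km_plain)
print(shrink_m, shrink_km)
```

The kilometres weight is pulled down to 75% of its unregularized value while the metres weight barely moves, even though both features carry identical information. Putting features on a common scale first makes the penalty treat them evenly.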

gritty vessel May 24, 2024, 9:02 AM

#

past meteor Close, but it's the opposite. Small variables will have a *large* weight and wou...

got it!

past meteor May 24, 2024, 9:04 AM

#

gritty vessel got it!

And if you look at the update for (stochastic) gradient descent the same thing comes up.

gritty vessel May 24, 2024, 9:05 AM

#

ok so its directly proportional to magnitude of weights

past meteor May 24, 2024, 9:06 AM

#

And to come full circle, the way you normalize is typically standard scaling but if you have outliers you may also use robust scaling (median and IQR) or something that won't destroy your data.

peak ridge May 24, 2024, 9:07 AM

#

spring field there are several issues with this first, how are you gonna retrieve the content...

seems like u have a decent exp in RAG in general, i need some review over a few things,
would be grateful if u could enlighten me on that

past meteor May 24, 2024, 9:07 AM

#

past meteor And to come full circle, the way you normalize is typically standard scaling but...

You should be trying this out with numpy to get the intuitions. Make a very large number in an array of randomly generated numbers with a certain mean and check what it does to your mean and stdev
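
(A minimal sketch of that experiment. It uses only the standard library's `random` and `statistics` modules so it runs anywhere; the same thing works with numpy's `mean`, `std` and `median`. The distribution parameters and the outlier value are arbitrary choices for illustration.)

```python
import random
import statistics

random.seed(0)

# 1000 draws around a mean of 50, then the same data plus one huge value.
values = [random.gauss(50.0, 5.0) for _ in range(1000)]
with_outlier = values + [1_000_000.0]


def summarize(xs):
    """Mean, population standard deviation and median of a list."""
    return {
        "mean": statistics.fmean(xs),
        "stdev": statistics.pstdev(xs),
        "median": statistics.median(xs),
    }


before = summarize(values)
after = summarize(with_outlier)

# Standard-scaled value of the first sample under each set of statistics.
z_before = (values[0] - before["mean"]) / before["stdev"]
z_after = (values[0] - after["mean"]) / after["stdev"]

for name in before:
    print(name, before[name], after[name])
print("z-score of the same sample:", z_before, z_after)
```

One extreme value moves the mean by roughly a thousand and inflates the standard deviation by several orders of magnitude, so after standard scaling all ordinary samples get squeezed into nearly the same value. The median barely changes.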

gritty vessel May 24, 2024, 9:07 AM

#

past meteor And to come full circle, the way you normalize is typically standard scaling but...

outliers are important in my data

#

its real life data so some anomaly in it should be learned by model

past meteor May 24, 2024, 9:08 AM

#

Sure, but try some of the things out I mentioned. Either empirically (with numpy) or by looking at the equations

#

Then the whole scaling thing will be clear to you

gritty vessel May 24, 2024, 9:10 AM

#

past meteor Sure, but try some of the things out I mentioned. Either empirically (with numpy...

thank you that was really helpful

finite sierra May 24, 2024, 10:02 AM

#

how can I find the intersection 3 indexes in pandas? the Index.intersection method only accepts to check 1 other index.

spring field May 24, 2024, 10:10 AM

#

I, uhh, lost them, btw, how do I combine the attention from all heads in all layers in all blocks?

#

cuz, I'm gonna have to rerun it, cuz like, yeah... I'll also make the attention look nicer and across multiple sequences and add the sequence values as well

#

I mean, it converged pretty quickly

#

so it won't take long once I set it up

#

wait

#

but attention is bcc

#

yes

#

that's what I meant

#

I realize how it could've been misunderstood

#

no, I want to know how to combine the attention score matrices from all heads in the whole network

#

well, yes

#

but like, what attention scores do you want them?

#

I've been using one of the heads of the last encoder-decoder attention sub layer

spring field May 24, 2024, 10:25 AM

#

peak ridge seems like u have a decent exp in RAG in general, i need some review over a few ...

I have basically no experience with RAG

#

oh right... that's very interesting

peak ridge May 24, 2024, 10:27 AM

#

spring field I have basically no experience with RAG

but then how do u know that much

spring field May 24, 2024, 10:27 AM

#

but how do you do a decoder without the encoder-decoder sub-layer?

#

isn't that just an encoder with masking or sth?

#

dam

#

what is encoder + decoder used for then?

#

translation?

#

mmmmm, right

#

I thought you'd do the same with next token, though it did make me wonder why the same input is embedded in different embeddings 😁

#

(nope)

spring field May 24, 2024, 10:30 AM

#

peak ridge but then how do u know that much

I have very surface level understanding of RAG, so maybe it just seems like I know a ton about it

#

mmm, I see

peak ridge May 24, 2024, 10:31 AM

#

spring field I have very surface level understanding of RAG, so maybe it just seems like I kn...

ya possible

#

i have a very surface level question for u then

spring field May 24, 2024, 10:32 AM

#

is RAG just feeding the entire context to a transformer and then you just append your question to the end of that and it does next token prediction?

past meteor May 24, 2024, 10:33 AM

#

spring field is RAG just feeding the entire context to a transformer and then you just append...

You take one or many documents, slice them into pieces, embed and store them. Get a query, embed your query, do a similarity search and append the contents of the most similar document fragments to the context
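
(A toy sketch of those steps in plain Python. Everything here is a stand-in: `embed` is a bag-of-words counter instead of a real embedding model, `store` is a list instead of a vector database, and the two documents are invented. It only shows the flow of chunk, embed, store, retrieve, and prepend to the prompt; the final call to an LLM is left out.)

```python
import math
from collections import Counter


def chunk(text, size=8):
    """Slice a document into pieces of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Toy embedding: lowercase word counts (a real system uses a trained model)."""
    return Counter(word.strip(".,?!").lower() for word in text.split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


documents = [
    "Standard scaling subtracts the mean and divides by the standard deviation.",
    "Causal masking stops a token from attending to later tokens.",
]

# Embed every chunk once and keep (text, vector) pairs as the "store".
store = [(piece, embed(piece)) for doc in documents for piece in chunk(doc)]


def retrieve(query, k=1):
    """Return the k stored chunks most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(query, k=1):
    """Put the retrieved chunks in front of the question."""
    context = "\n".join(retrieve(query, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


prompt = build_prompt("What does causal masking do?")
print(prompt)
```

The resulting string is what would be handed to the model, which then does ordinary next token prediction over it.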

peak ridge May 24, 2024, 10:33 AM

#

nope

spring field May 24, 2024, 10:34 AM

#

past meteor You take one or many documents, slice them into pieces, embed and store them. Ge...

oh and then it just does next token based on that context?

past meteor May 24, 2024, 10:34 AM

#

After a long discussion with Edd and Etrota on this topic ...

#

it's basically what I learnt in search engines and information retrieval years ago

#

but with LLMs 🤷

peak ridge May 24, 2024, 10:34 AM

#

yes sir

#

do u have exp in rag @past meteor

past meteor May 24, 2024, 10:35 AM

#

peak ridge do u have exp in rag @past meteor

My experience with RAGs is to the extent that I've made one with AWS bedrock to see wazzup

spring field May 24, 2024, 10:35 AM

#

alright, I'll cover RAG at some point in the future, I'm here trying to even understand transformers to a reasonable extent, they're too magical... and I mean, they literally are since it's not like we actually understand how they work, do we?

past meteor May 24, 2024, 10:36 AM

#

^ that's what I always say

#

I think understanding the intuitions of self attention, multihead attention and so on isn't too hard but I'm not always convinced our intuition of the methods is aligned with how and especially why they work

spring field May 24, 2024, 10:36 AM

#

well, we might have an "intuition"

peak ridge May 24, 2024, 10:39 AM

#

so i've implemented these things (room for improvement)
I manually coded it, but in the meantime LangChain has built-in classes and functions for it

Should i use it or my manually coded one

#

this is the working thing.
just need review

spring field May 24, 2024, 10:40 AM

#

so, basically I was trying to translate English to English... that's hilarious

peak ridge May 24, 2024, 10:40 AM

#

@spring field @past meteor maybe have a look, please?
i want u to review it

past meteor May 24, 2024, 10:41 AM

#

I'd prefer it if Maud'dib would have a look in my stead

spring field May 24, 2024, 10:42 AM

#

I'm not sure what to review there... as I said, I have a surface level understanding of RAG

past meteor May 24, 2024, 10:42 AM

#

Make it blue

peak ridge May 24, 2024, 10:42 AM

#

hmm

spring field May 24, 2024, 10:43 AM

#

I mean the code looks alright at first glance

peak ridge May 24, 2024, 10:43 AM

#

ya and it's working but the fear is i coded it manually

#

and aint using langchain libraries/classes for ReOrdering Text

spring field May 24, 2024, 10:44 AM

#

also that format_docs function is a bit pointless

peak ridge May 24, 2024, 10:45 AM

#

okay

#

💀

peak ridge May 24, 2024, 10:45 AM

#

spring field also that `format_docs` function is a bit pointless

probably, it was on the docs

#

i have no as such exp in this field 😦
but our product is fully based on this

#

i will go bankrupt

spring field May 24, 2024, 10:46 AM

#

just to recap
next token prediction with transformers is done using only encoders, but you mask the self-attention sub-layer?
because I see structures with "decoder-only", but like, that implies throwing out the whole encoder-decoder sub-layer and I don't like it

peak ridge May 24, 2024, 10:47 AM

#

hmm

#

🧐

spring field May 24, 2024, 10:47 AM

#

decoder-only doesn't make sense, it has to be a hybrid between the two, according to the paper

#

or can you also not mask the attention?

peak ridge May 24, 2024, 10:47 AM

#

spring field just to recap next token prediction with transformers is done using only encoder...

hm

#

okay okay

#

can we make a conclusion out this

#

so i can work on it

#

💀 please?

spring field May 24, 2024, 10:48 AM

#

but if you mask it, your attention score is also masked? or did I forget the order of masking

peak ridge May 24, 2024, 10:48 AM

#

No errors,

Want to make it better

#

no errors,
all good and working

need to make it

#

Better output from RAG
More relevant, more precise, more better

#

so true

spring field May 24, 2024, 10:49 AM

#

oh, do you do the whole dot prod and softmax, then use that as the attention score and then mask the thing that's going to output?

peak ridge May 24, 2024, 10:51 AM

#

i dont understand this

#

@spring field talks are above my knowledge

past meteor May 24, 2024, 10:53 AM

#

GL

peak ridge May 24, 2024, 10:53 AM

#

hm

spring field May 24, 2024, 10:59 AM

#

right, makes sense

spring field May 24, 2024, 11:00 AM

#

peak ridge @spring field talks are above my knowledge

I just may have some knowledge in some other areas, like 10 minutes ago I found out that RAG does similarity search

peak ridge May 24, 2024, 11:00 AM

#

spring field I just may have some knowledge in some other areas, like 10 minutes ago I found ...

💀

#

more interesting.

spring field May 24, 2024, 11:01 AM

#

sth like that

flint gazelle May 24, 2024, 11:30 AM

#

Is it possible to create a Network in pytorch that only uses a int-datatype, so the weights, input and output are all int? I've tried to make this work but it always returns
RuntimeError: Only Tensors of floating point and complex dtype can require gradients

spring field May 24, 2024, 11:37 AM

#

flint gazelle Is it possible to create a Network in pytorch that only uses a int-datatype, so ...

I mean, ig you can disable requires_grad and compute the gradients yourself, but why would you use ints anyway?

flint gazelle May 24, 2024, 11:37 AM

#

i need really fast int8_t operations in c++ with simd avx2 instructions

spring field May 24, 2024, 11:43 AM

#

but why ints?

#

simd instructions can handle floats as well

#

ig not as many as int8_t at once, but yk

#

why do you even need simd, it won't help you that much, better if you can run stuff on a GPU

buoyant vine May 24, 2024, 11:44 AM

#

tbf a lot of times float32 operations on intel chips can be faster than their integer versions

#

with a lot of integer operations you have the caveat of the overflow handling which normally limits how many lanes you can actually use even if you can hold more

spring field May 24, 2024, 11:45 AM

#

also with int8_t you might easily run into overflow issues with neural nets

buoyant vine May 24, 2024, 11:46 AM

#

yeah, normally you end up going 16 x int8 ops instead of 32 x int8 ops moving up into int16 results

#

unless you don't care about the saturation or overflows, but then that can cause behaviour differences between archs

#

also some things like any integer division is incredibly expensive compared to fp

flint gazelle May 24, 2024, 11:50 AM

#

i thought of int8xint8 = int16 and then as activation function clamp [0,127] and so on

wooden sail May 24, 2024, 11:51 AM

#

classical derivatives are only defined over the reals, not over the ints

#

you wouldn't get correct results with autograd in general, since derivatives could generally map into the rationals or reals

#

rounding/casting after differentiation is also generally not correct

flint gazelle May 24, 2024, 11:52 AM

#

yeah but for clamp its just 0 or 1

#

the derivative

buoyant vine May 24, 2024, 11:55 AM

#

probably worth a note though

#

if you want the best speed

#

Your buffers need to be aligned to 64 bytes

#

and your operations need to cut the branching down so you're doing about 64 values per loop call

flint gazelle May 24, 2024, 11:57 AM

#

yes i have already tested with float and i have aligned it to the cache line size

buoyant vine May 24, 2024, 11:57 AM

#

Also probably want an AMD cpu rather than intel most of the time

flint gazelle May 24, 2024, 11:57 AM

#

template<uint64_t inputSize, uint64_t outputSize>
void layer_32(Layer<inputSize, outputSize>& layer_16, std::array<float, outputSize>& output, const std::array<float, inputSize> input)
{
    alignas(64) float arr[8];

    for (uint64_t j = 0; j < outputSize; ++j) {
        output[j] = layer_16.bias[j];
        for (uint64_t i = 0; i < inputSize; i += 32) {

            __m256 _weights0 = _mm256_load_ps(&layer_16.weights[j][i]);
            __m256 _weights1 = _mm256_load_ps(&layer_16.weights[j][i + 8]);
            __m256 _weights2 = _mm256_load_ps(&layer_16.weights[j][i + 16]);
            __m256 _weights3 = _mm256_load_ps(&layer_16.weights[j][i + 24]);

            __m256 _input0 = _mm256_load_ps(&input[i]);
            __m256 _input1 = _mm256_load_ps(&input[i + 8]);
            __m256 _input2 = _mm256_load_ps(&input[i + 16]);
            __m256 _input3 = _mm256_load_ps(&input[i + 24]);

            __m256 out0 = _mm256_mul_ps(_weights0, _input0);
            __m256 out1 = _mm256_fmadd_ps(_weights1, _input1,out0);
            __m256 out2 = _mm256_fmadd_ps(_weights2, _input2,out1);
            __m256 out3 = _mm256_fmadd_ps(_weights3, _input3,out2);


            __m256 temp = _mm256_hadd_ps(out3, out3);
            temp = _mm256_hadd_ps(temp, temp);

            _mm256_store_ps(arr, temp);

            output[j] += arr[0] + arr[4];
        }
    }
}

#

this is my matrix multiplication

#

for a input of 1d and output of 1d array

#

but with int8 i could make this way faster, the only thing i need is to some how train a model that is accurate enough

buoyant vine May 24, 2024, 11:59 AM

#

you are losing a ton of performance with how you have structured your ops btw

#

and the copying of memory per iteration

flint gazelle May 24, 2024, 12:00 PM

#

i dont quite follow

agile cobalt May 24, 2024, 12:00 PM

#

flint gazelle but with int8 i could make this way faster, the only thing i need is to some ho...

did you look into popular quantization and distillation methods first?

buoyant vine May 24, 2024, 12:00 PM

#

so SIMD instructions are basically 1 instruction, but they are not 1 instruction = 1 cycle

#

and sometimes multiple SIMD instructions can be executed within 1 cycle

#

which is where the whole "uops" stuff comes about

flint gazelle May 24, 2024, 12:01 PM

#

agile cobalt did you look into popular quantization and distillation methods first?

im currently experimenting with this but i havent made it work yet

buoyant vine May 24, 2024, 12:03 PM

#

but currently, you are bottlenecking yourself at major points from a quick glance:

__m256 out0 = _mm256_mul_ps(_weights0, _input0); becomes your dependency on the execution below so

            __m256 out1 = _mm256_fmadd_ps(_weights1, _input1,out0);
            __m256 out2 = _mm256_fmadd_ps(_weights2, _input2,out1);
            __m256 out3 = _mm256_fmadd_ps(_weights3, _input3,out2);

Each step here is now having to wait on the previous instruction to finish before executing the next

#

so instead of the CPU being able to do this step in 2 cycles, its execution time jumps to 4 cycles (normally) and you have the added latency for each instruction

#

which is normally ~7-10

flint gazelle May 24, 2024, 12:05 PM

#

i was testing this implementation, before i had:

template<uint64_t inputSize, uint64_t outputSize>
void layer_32(Layer<inputSize, outputSize>& layer_16, std::array<float, outputSize>& output, const std::array<float, inputSize> input)
{
    alignas(64) float arr[8];

    for (uint64_t j = 0; j < outputSize; ++j) {
        output[j] = layer_16.bias[j];
        for (uint64_t i = 0; i < inputSize; i += 32) {

            __m256 _weights0 = _mm256_load_ps(&layer_16.weights[j][i]);
            __m256 _weights1 = _mm256_load_ps(&layer_16.weights[j][i + 8]);
            __m256 _weights2 = _mm256_load_ps(&layer_16.weights[j][i + 16]);
            __m256 _weights3 = _mm256_load_ps(&layer_16.weights[j][i + 24]);

            __m256 _input0 = _mm256_load_ps(&input[i]);
            __m256 _input1 = _mm256_load_ps(&input[i + 8]);
            __m256 _input2 = _mm256_load_ps(&input[i + 16]);
            __m256 _input3 = _mm256_load_ps(&input[i + 24]);

            __m256 out0 = _mm256_mul_ps(_weights0, _input0);
            __m256 out1 = _mm256_mul_ps(_weights1, _input1);
            __m256 out2 = _mm256_mul_ps(_weights2, _input2);
            __m256 out3 = _mm256_mul_ps(_weights3, _input3);

            out0 = _mm256_add_ps(out0, out1);
            out2 = _mm256_add_ps(out2, out3);

            out0 = _mm256_add_ps(out0, out2);

            __m256 temp = _mm256_hadd_ps(out0, out0);
            temp = _mm256_hadd_ps(temp, temp);

            _mm256_store_ps(arr, temp);

            output[j] += arr[0] + arr[4];
        }
    }
}

buoyant vine May 24, 2024, 12:10 PM

#

What does the LLVM MCA breakdown give

#

those hadds and stores look a little bit sus

flint gazelle May 24, 2024, 12:12 PM

#

i would have to test the exact performance but running this on random data a million times both versions take around the same time

#

but these are small changes that could increase performance just a bit, my main focus is if i can make int8xint8 = int16 work with acceptable accuracy

#

i should also mention that my first layer consist of 0 and 1, so i dont even need to convert them

buoyant vine May 24, 2024, 12:18 PM

#

I mean you can do it, but like you said I'm not sure how well your accuracy is going to carry over

flint gazelle May 24, 2024, 12:20 PM

#

i will try and see if i get good results, ty

teal lance May 24, 2024, 12:28 PM

#

#

I moved on from tkinter to pyside6 ❤️ im a happy learner

past meteor May 24, 2024, 12:45 PM

#

What about just training on float16 and quantising to integers of whatever you want
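
(A minimal sketch of that suggestion in plain Python, under stated assumptions: the weights were trained in floating point, a single symmetric scale is used for the whole weight matrix, and the inputs are already 0/1 as mentioned for the first layer, so they need no conversion. `quantize` and the example weights are invented for illustration and this is not PyTorch's quantization API. The clamp to [0, 127] is the activation proposed earlier in the thread.)

```python
def quantize(values, scale):
    """Symmetric quantization: round(v / scale), clamped to the int8 range."""
    return [max(-128, min(127, round(v / scale))) for v in values]


# Float weights as they might come out of ordinary training (2 outputs, 3 inputs).
weights = [
    [0.50, -0.25, 0.125],
    [-1.00, 0.75, 0.30],
]
inputs = [1, 0, 1]  # binary first-layer inputs

# One scale for the whole matrix so the largest weight maps to 127.
w_scale = max(abs(w) for row in weights for w in row) / 127
weights_q = [quantize(row, w_scale) for row in weights]

# Integer-only accumulation (int8 x int8 products summed in a wider integer).
acc = [sum(w * x for w, x in zip(row, inputs)) for row in weights_q]

# Clamp activation in the integer domain.
activated = [max(0, min(127, a)) for a in acc]

# Compare against the float computation by undoing the scale.
approx = [a * w_scale for a in acc]
exact = [sum(w * x for w, x in zip(row, inputs)) for row in weights]

print(weights_q)
print(approx, exact)
print(activated)
```

Each quantized weight is off by at most half a scale step, so with two active inputs the accumulated error stays within one step here. Whether the accuracy holds up on a real network is something only an experiment can show.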

flint gazelle May 24, 2024, 12:53 PM

#

thats what im currently trying

main citrus May 24, 2024, 3:35 PM

#

Is bias like accuracy score

#

?

sinful surge May 24, 2024, 4:11 PM

#

anyone know YOLO ? and have any experience into that ? i just need help

lapis sequoia May 24, 2024, 4:21 PM

#

what is the best dataset(s) for people who are kind of new to NLPs?

#

Thank you

#

what about animequotes or hate speech?

#

what is the main difference between CountVectorizer and TfidfVectorizer?

spring field May 24, 2024, 6:52 PM

#

man, these plots hit hard

#

all that fancy schmancy spatial attention and stuff

robust zodiac May 24, 2024, 7:29 PM

#

why am i here

serene scaffold May 24, 2024, 7:42 PM

#

robust zodiac why am i here

religions attempt to answer this question. personally, I think life is more fulfilling if you don't try to ascribe purpose to your existence, and just spend it doing things that are fulfilling for you.

robust zodiac May 24, 2024, 7:49 PM

#

serene scaffold religions attempt to answer this question. personally, I think life is more fulf...

i have no words

serene scaffold May 24, 2024, 7:50 PM

#

robust zodiac i have no words

why do you think you're here?

robust zodiac May 24, 2024, 7:50 PM

#

serene scaffold why do you think you're here?

because i clicked a link

serene scaffold May 24, 2024, 7:50 PM

#

robust zodiac because i clicked a link

why did you do it?

robust zodiac May 24, 2024, 7:52 PM

#

serene scaffold why did you do it?

because my actions were predestined 50,000 years before the creation of the universe

serene scaffold May 24, 2024, 7:56 PM

#

robust zodiac because my actions were predestined 50,000 years before the creation of the univ...

Interesting. Anyway, this is the data science channel, so let's talk about that going forward.

lapis sequoia May 24, 2024, 8:03 PM

#

I just finished my first NLP, I think it is trash. How do I judge it objectively?

#

like, what are the tiers of skill in ML/data science whatever

serene scaffold May 24, 2024, 8:06 PM

#

lapis sequoia I just finished my first NLP, I think it is trash. How do I judge it objectively...

nothing is "an NLP". NLP is a concept.
What did you create?

spring field May 24, 2024, 8:33 PM

#

BCE can only be used when you have two classes at least for that particular output, right?
am I overthinking this or can BCE only be used for that one case where you have to predict between one of two classes? is that it? can it at all be used for multi-class classification?

#

ViT + TokenLearner 🤭

#

yeah

#

TokenLearner

#

which ig is really helpful if you have more transformer layers, greater hidden size, and a larger dataset than I do (apparently too many dimensions with small datasets worsens the performance in my current experience)

#

like yesterday or mby it was early morning today when I was running those gpts for next token prediction, using a hidden size of 512 basically made it so it didn't learn at all (now thinking back, maybe it could've been caused by using encoder + decoder for next token prediction...), anyway, I reduced the hidden size to 128 (16 per head) and it immediately started learning again

#

embedding dimensions

#

how many dimensions a token is embedded into

#

mhm, that might have been what's happening

#

I assume it's been tried, but what about using a quadratic instead of a linear function? instead of Linear(x, M, b) = Mx + b use sth like Quadratic(x, A, B, c) = Axx^T + Bx + c

#

the linear layers with quadratic layers? it probably has to be differentiated twice though of which the second time is gonna be a const anyway.... hmmm

past meteor May 24, 2024, 9:23 PM

#

If you're in doubt about good initial hyperparameters a good thing you can do is make them too large and fit on a single batch

#

Your loss should go to 0, if it doesn't you have a bug. It's kind of a unit test for your architecture 😄

#

Aside from time (how long it takes to train it all) starting big and going smaller is a good idea
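
A rough sketch of that single-batch check. To keep it dependency-free it assumes a tiny linear model trained with plain gradient descent as a stand-in for a real network; `overfit_single_batch` and the batch values are made up for illustration. The model has more parameters (4) than samples (2), so the loss on this one batch should go to roughly zero:

```python
def overfit_single_batch(xs, ys, lr=0.1, steps=500):
    """Fit y = w.x + b on one fixed batch and return the MSE loss at every step."""
    n_features = len(xs[0])
    w = [0.0] * n_features
    b = 0.0
    history = []
    for _ in range(steps):
        preds = [sum(wi * xi for wi, xi in zip(w, x)) + b for x in xs]
        errors = [p - y for p, y in zip(preds, ys)]
        history.append(sum(e * e for e in errors) / len(xs))
        # Gradient of the mean squared error w.r.t. each weight and the bias.
        grad_w = [2 * sum(e * x[j] for e, x in zip(errors, xs)) / len(xs)
                  for j in range(n_features)]
        grad_b = 2 * sum(errors) / len(xs)
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return history


# One fixed batch: 2 samples, 3 features (hypothetical values).
batch_x = [[1.0, 0.0, 0.5], [0.0, 1.0, -0.5]]
batch_y = [1.0, -1.0]

losses = overfit_single_batch(batch_x, batch_y)
print(f"first loss: {losses[0]:.4f}  last loss: {losses[-1]:.2e}")
```

If the last loss stalls well above zero on a batch this small, that points at a bug in the model or training loop rather than at the hyperparameters.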

craggy agate May 24, 2024, 10:56 PM

#

Any good guides for LSTMs for forecasting time series data?

warm pebble May 24, 2024, 11:54 PM

#

can somebody help me make an object detection ai

#

i have no idea where to start

violet gull May 25, 2024, 12:21 AM

#

warm pebble i have no idea where to start

watch videos on how object detection ais work

warm pebble May 25, 2024, 12:24 AM

#

Ok

flat sigil May 25, 2024, 12:28 AM

#

finally got pytorch running with cuda yall arent ready 😈 😈

lapis sequoia May 25, 2024, 1:18 AM

#

serene scaffold nothing is "an NLP". NLP is a concept. What did you create?

wanna see it?

#

I do not know. I have been at this for a while. I base my self-worth on this and never ever stop doing it. I never feel like I am good and with all of this pytorch TensorFlow stuff (not that bad) like, I do not know, I always feel like I am trash. Like, I do not care about money. This data thing is a massive massive obsession. Whatever, everyone is obsessed with something

#

Like, you go pretty hard. What do I lack?

#

@serene scaffold it was this piece of garbage https://github.com/nickkatsy/python_ml_ect_/blob/master/hotel_nlp.py

gritty vessel May 25, 2024, 1:40 AM

#

ValueError: The filepath provided must end in `.keras` (Keras model format). Received: filepath=model_output/test01/model-{epoch:02d}-{val_loss:03f}.h5 i ran same code on my system and it worked fine but on kaggle it throws this error

#

does this mean i have to save the file with .keras extension or can i bypass it?

serene scaffold May 25, 2024, 1:41 AM

#

gritty vessel ValueError: The filepath provided must end in `.keras` (Keras model format). R...

evidently, it expects a file with a .keras file extension. though if you change filepath=model_output/test01/model-{epoch:02d}-{val_loss:03f}.h5 to filepath=model_output/test01/model-{epoch:02d}-{val_loss:03f}.keras (and change the name of the file to match), you'll get a different error if the file isn't structured the way it expects.

serene scaffold May 25, 2024, 1:42 AM

#

lapis sequoia @serene scaffold it was this piece of garbage https://github.com/nickkatsy...

you don't need to be so hard on yourself.
a tip: when working with pandas, assume that there's always a solution that doesn't involve .apply. you've used .apply several times when you should have used native pandas methods.

#

# not this
df['content'] = df['content'].apply(lambda x: x.lower())
# do this
df['content'] = df['content'].str.lower()

gritty vessel May 25, 2024, 1:43 AM

#

mcp_save = ModelCheckpoint(os.path.join(outdir,'model-{epoch:02d}-{val_loss:03f}.h5'), save_best_only=True, monitor='val_loss', mode='min') its giving error in this line

#

so here should i add save_format = none

serene scaffold May 25, 2024, 1:43 AM

#

in fact, every time that you use apply before at least line 68 should have been a .str. method.

gritty vessel May 25, 2024, 1:44 AM

#

https://pastebin.com/WXXNeX8t here is the whole code

#

i searched on google and someone said in latest version they did this to increase the usage of .keras
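
A small sketch of the likely fix, assuming the Keras version on Kaggle only accepts the `.keras` suffix for full-model checkpoints, as the error message suggests. Since `ModelCheckpoint` writes the file itself, there should be no existing file to rename; only the path template changes. The helper `to_keras_path` is hypothetical:

```python
import os


def to_keras_path(filepath):
    """Swap a legacy .h5 / .hdf5 checkpoint suffix for .keras; leave other paths alone."""
    root, ext = os.path.splitext(filepath)
    return root + ".keras" if ext in (".h5", ".hdf5") else filepath


outdir = "model_output/test01"  # directory taken from the error message
template = to_keras_path(os.path.join(outdir, "model-{epoch:02d}-{val_loss:03f}.h5"))

# ModelCheckpoint fills the placeholders with str.format-style values at save time,
# so the resulting file name can be previewed without importing Keras:
print(template.format(epoch=3, val_loss=0.1234))

# In the training script (not run here), the template is passed as before:
# mcp_save = ModelCheckpoint(template, save_best_only=True, monitor='val_loss', mode='min')
```

Any later code that loads the checkpoint would need to look for the `.keras` file name as well.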

lapis sequoia May 25, 2024, 1:57 AM

#

serene scaffold you don't need to be so hard on yourself. a tip: when working with pandas, assum...

yeah, I was doing that. I do not know why I was doing the other method without lambda. I do not remember that one. I did like 4 today because I am trying to get them on lock

lapis sequoia May 25, 2024, 1:58 AM

#

serene scaffold you don't need to be so hard on yourself. a tip: when working with pandas, assum...

thank you for this:
# not this
df['content'] = df['content'].apply(lambda x: x.lower())
# do this
df['content'] = df['content'].str.lower()

Do you mean in general when it comes to cleaning text and stuff? I do not know, I kinda forgot what dataset that was in all honesty.

serene scaffold May 25, 2024, 2:02 AM

#

lapis sequoia thank you for this: # not this df['content'] = df['content'].apply(lambda x: x.l...

whenever you're trying to make a Series of strings lowercase, use .str.lower(), not a lambda or apply

lapis sequoia May 25, 2024, 2:18 AM

#

Thanks

buoyant shoal May 25, 2024, 2:23 AM

#

#

Hi, isn't this like none of the above?

#

my orange line is where there's the largest explained variance ratio right?

wooden sail May 25, 2024, 5:09 AM

#

buoyant shoal Hi, isn't this like none of the above?

why not d?

buoyant shoal May 25, 2024, 5:09 AM

#

wooden sail why not d?

yep i chose D haha thanks

#

D is basically what i said right?

#

it's perpendicular to the "/"

#

technically at least

#

this is like an ellipsoid in R^2 right?

wooden sail May 25, 2024, 5:10 AM

#

yeah

buoyant shoal May 25, 2024, 5:12 AM

#

okay thanks

spring field May 25, 2024, 6:12 AM

#

is variance and mse calculated the same? pithink
also why does sample variance divide by n - 1 instead of n?

past meteor May 25, 2024, 6:25 AM

#

spring field is variance and mse calculated the same? pithink also why...

To correct for bias in the estimator.

You don't have the variance, you have an estimate of it computed from a sample. Ideally the expected value of that estimate equals the actual variance, that's what makes it an unbiased estimator. (Converging to the actual variance when n goes to infinity is a separate property, consistency.)

The actual reason has to do with taking the expected value of the statistic (in this case the sample variance) and checking if it's equal to the population variance. If you divide by n it isn't, it comes out as (n - 1)/n times the population variance, so you need a correction to get there. That's where the n - 1 comes from. It's a typical thing you do in a statistics class, it's been a while for me
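
A quick simulation of that point, as a sketch: it assumes samples of size 5 drawn from a normal distribution with true variance 1.0 (the sample size, trial count, and seed are arbitrary choices). Averaged over many samples, the divide-by-n estimate should land near (n - 1)/n = 0.8, and the divide-by-(n - 1) estimate near 1.0:

```python
import random

random.seed(0)
n, trials = 5, 20000

biased_total = 0.0
corrected_total = 0.0
for _ in range(trials):
    sample = [random.gauss(0.0, 1.0) for _ in range(n)]  # true variance is 1.0
    mean = sum(sample) / n
    sum_sq = sum((x - mean) ** 2 for x in sample)
    biased_total += sum_sq / n           # divides by n
    corrected_total += sum_sq / (n - 1)  # divides by n - 1 (Bessel's correction)

biased_mean = biased_total / trials
corrected_mean = corrected_total / trials
print(f"mean of divide-by-n estimates:     {biased_mean:.3f}")
print(f"mean of divide-by-(n-1) estimates: {corrected_mean:.3f}")
```

The gap shrinks as n grows, which is why the correction matters most for small samples.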

spring field May 25, 2024, 6:28 AM

#

I (think I) see

#

but MSE and (population) variance share the same formula, right?

past meteor May 25, 2024, 6:33 AM

#

No they're different

#

Ah wait, I see what you mean

#

Yeah that's a correct observation

spring field May 25, 2024, 6:39 AM

#

very cool, thanks

latent girder May 25, 2024, 6:41 AM

#

hi, whats a good beginner data science python project?

wooden sail May 25, 2024, 6:45 AM

#

spring field but MSE and (population) variance share the same formula, right?

they don't

#

they only do if whatever generates your estimate is "unbiased", yielding the correct value in expectation

#

otherwise the MSE is bias squared + variance

spring field May 25, 2024, 6:47 AM

#

I meant like, it's a mean square over a bunch of differences

wooden sail May 25, 2024, 6:47 AM

#

yes but the meaning is different

#

you can get a huge MSE with 0 variance

#

e.g. if your function just outputs 0 always, regardless of input

spring field May 25, 2024, 6:48 AM

#

I get that (well, I understand that that's the case, I'm unsure of the deeper details)

#

I just found it surprising the formulas were conceptually the same

wooden sail May 25, 2024, 6:49 AM

#

on purpose, so that you can study the mean and covariance :p

#

they both describe second order statistics

spring field May 25, 2024, 6:51 AM

#

wooden sail on purpose, so that you can study the mean and covariance :p

oh, is that why it's a square?

wooden sail May 25, 2024, 6:51 AM

#

yes

wooden sail May 25, 2024, 6:51 AM

#

spring field oh, is that why it's a square?

https://en.wikipedia.org/wiki/Moment_(mathematics) some people call these "statistical moments"

spring field May 25, 2024, 6:51 AM

#

I'd call this an episode of enlightenment pg_rofl

wooden sail May 25, 2024, 6:53 AM

#

you could generally say that the two things (variance and MSE) are "second moments". the variance is the second central moment (subtract the TRUE mean of the random variable). the MSE is the "second moment" (NOT central) of the ERROR

#

if your estimator is unbiased, then the estimator's mean is the true mean and the error is now centered at zero, turning the MSE into a second central moment (a variance)

#

the example of the 0 estimator i mentioned before is pretty important because it reminds you that the MSE doesn't tell you the nature of the error. it's up to you to verify later if it's variance or bias
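
The zero-estimator example can be worked through numerically. This is only a sketch: `mse_decomposition` and the numbers are made up, and the bias and variance here are computed empirically from a handful of estimates rather than as true expectations. It splits the MSE into squared bias plus variance:

```python
def mse_decomposition(estimates, true_value):
    """Return (mse, bias_squared, variance) for repeated estimates of one true value."""
    n = len(estimates)
    mean_estimate = sum(estimates) / n
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    bias_squared = (mean_estimate - true_value) ** 2
    variance = sum((e - mean_estimate) ** 2 for e in estimates) / n
    return mse, bias_squared, variance


true_value = 10.0
always_zero = [0.0, 0.0, 0.0, 0.0]       # estimator that outputs 0 regardless of input
noisy_unbiased = [8.0, 12.0, 9.0, 11.0]  # scattered, but centered on the true value

for name, estimates in [("always zero", always_zero), ("noisy unbiased", noisy_unbiased)]:
    mse, bias_sq, var = mse_decomposition(estimates, true_value)
    print(f"{name}: MSE={mse}  bias^2={bias_sq}  variance={var}")
```

In the first case the whole MSE is bias, in the second it is all variance, so the MSE number alone does not say which kind of error you have.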

spring field May 25, 2024, 6:56 AM

#

oooh, the puzzle pieces are coming together

wooden sail May 25, 2024, 6:56 AM

#

at any rate, the point being that the MSE formula looks like the variance formula not because the MSE is a variance, but because both the MSE and the variance are the same kind of object, a statistical second moment

spring field May 25, 2024, 6:56 AM

#

unbiased relative to the population, right?

#

like if you do linear regression and find the true mean, your bias value might not be 0, but still unbiased, is that correct? pithink

wooden sail May 25, 2024, 6:59 AM

#

i see what you mean and you could technically define the bias either way

#

but we normally refer to the true parameters, so bias is defined w.r.t. the true mean and not the population one

#

this is what the correction factor past meteor mentioned addresses

#

but you're mixing two things up as well

#

because i can have data of a population

#

compute the variance of that data

#

but the data has a true variance

#

so i can also compute the MSE of the variance

#

and those are two separate things 😛

#data-science-and-ml
