#data-science-and-ml


fading wigeon
#

Oh, it's hundreds of variables. I just printed out the PCA's singular_values_ attribute

wooden sail
#

what are you calling variables?

#

the total number of singular values is equal to the number of variables if you haven't done any dimensionality reduction
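A minimal sketch of the point above, using NumPy on made-up data (the array sizes are assumptions for illustration). Strictly, a full SVD of the centered data yields min(n_samples, n_features) singular values, which equals the number of variables whenever there are at least as many rows as columns.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))       # 50 samples, 8 variables (made-up data)
X_centered = X - X.mean(axis=0)    # PCA centers each column before the SVD

# Singular values of the centered data matrix, largest first.
singular_values = np.linalg.svd(X_centered, compute_uv=False)

print(X.shape)                # (n_samples, n_features)
print(len(singular_values))   # min(n_samples, n_features)
```

So printing the shape of the data next to the number of singular values is a quick way to check whether "hundreds" really corresponds to the number of columns.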

#

what's the shape of your data?

merry fern
#

Would anyone know how to make this hack work for more than 1 level? I am trying to remove duplicates post-groupby on 3 levels, so something like this:

```
        bbb  1
     bb  aaa  1
  B  aa  aaa  1
              1
     bb  bbb
  c  aa  aaa
     bb  bbb  1
```
https://stackoverflow.com/questions/64797580/pandas-groupby-remove-duplicates/64797686#64797686
twin parrot
#

Someone recently told me about Polars: that it is more efficient than pandas, because pandas runs operations sequentially whereas Polars can run them in parallel. I wanted to ask for confirmation of this, and whether anyone can recommend a good website/video/resource for Polars.

EDIT: to add, there aren't any resources on here

steady basalt
#

has anyone used docker to build an app and ran into the issue of yml file not being found when installed

serene scaffold
steady basalt
#

actually, i fixed that. sorry, now it's 'Error: Invalid value: File does not exist: ./app/app.py'

serene scaffold
#

still not a data science question. try going to that other channel and show the dockerfile as well, I guess.

steady basalt
#

im assuming ive screwed my dockerfile but really i did it to the book

#

yeah sorry not rly meta here

serene scaffold
#

if people need "pandas but with parallel computation", they usually use dask.

worn stratus
#

polars really looks like it has a nicer API - the problem is at work we have a ton of internal tooling that expects pandas DFs, so that's not really a good enough argument for adding it into the mix.

odd meteor
#

You might find the pandas tutorial in this channel very helpful https://www.youtube.com/@dataschool/videos

iron basalt
#

Polars is a blazingly fast DataFrame library completely written in Rust, using the Apache Arrow memory model. It exposes bindings for the popular Python and soon JavaScript languages. Polars supports a full lazy execution API allowing query optimization.

odd meteor
#

Drop the id and the class column and use the rest as your input.

input_df = df.drop(['id', 'class'], axis = 1)

This should do the job now.

prime hearth
#

hello, is there any way in Beautiful Soup to get the class name string value, such as below:

```
<span class="ui_bubble_rating bubble_10"></span>
```
and get the string value "bubble_10"?
#

the class name is dynamic so i cant hardcode "bubble_10" it can change to "bubble_20" etc, so would like to get that string value

#

nvm just found i can do element['class'] and it gives the list of class names
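A short sketch of reading a dynamic class name with Beautiful Soup, assuming the `bs4` package is installed. In Beautiful Soup, `class` is a multi-valued attribute, so indexing the tag with `"class"` returns a list of names; the `bubble_` prefix filter is an assumption based on the example markup.

```python
from bs4 import BeautifulSoup

html = '<span class="ui_bubble_rating bubble_10"></span>'
soup = BeautifulSoup(html, "html.parser")

# Find the element by the class that stays constant.
tag = soup.find("span", class_="ui_bubble_rating")

# tag["class"] is a list of all class names on the element.
rating_class = next(name for name in tag["class"] if name.startswith("bubble_"))
print(rating_class)
```

Filtering by prefix keeps the code working when the rating changes to `bubble_20` and so on.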

cyan tiger
#

Has anyone tried to use chatgpt using an unofficial api here?

fading wigeon
twin parrot
thorn bobcat
#

hi everyone!

fading wigeon
lapis sequoia
#

hi I am looking for data sets on California wildfire frequency by year

#

anyone know where I could find this stuff?

serene scaffold
#

If you can't find such a thing by googling it or on Wikipedia, you might try .gov websites for California

boreal cape
#

hey, do you know how to do early fusion on two data sets of different sizes

#

so that when we pass them through a classifier one dataset is not more dominant than the other

boreal cape
#

help please anyone

cobalt spade
vague sable
#

Hi all,

I have a file with a large list of grocery product names from two separate stores.
they each name their products slightly different.

Is it possible to compare the product from the two files & match them in order to correctly compare the prices of the two similar products?

cobalt spade
vague sable
#

1000+ product Names

#

For example, these are two products from both files.

  1. Fresh Pink Lady Apples Each
  2. Pink Lady Apples | approx. 200g each

Same item obviously, just names slightly different.

#

my intention is to compare the two items for the best price

#

I dont think I can do that until I figure out how to 'match' the products to ensure I am comparing the same product

rough crag
#

Maybe you can compare whether each word is a subset of the other
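A rough sketch of the word-overlap idea, using only the standard library. The function names and the scoring rule (shared words divided by the size of the shorter name) are assumptions for illustration; real product data would likely need extra cleaning and a tuned threshold.

```python
import re

def tokens(name: str) -> set[str]:
    """Lowercase a product name and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", name.lower()))

def overlap_score(a: str, b: str) -> float:
    """Share of the shorter name's words that also appear in the other name."""
    words_a, words_b = tokens(a), tokens(b)
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / min(len(words_a), len(words_b))

store_a = "Fresh Pink Lady Apples Each"
store_b = "Pink Lady Apples | approx. 200g each"

score = overlap_score(store_a, store_b)              # 4 shared words out of 5
unrelated = overlap_score(store_a, "Full Cream Milk 2L")
print(score, unrelated)
```

Pairs scoring above a chosen threshold can then be treated as candidate matches and checked by hand before comparing prices.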

#

@vague sable

cobalt spade
celest vine
#

Hi

#

Do I need to know Django if I aspire to become a data scientist?

north wolf
#

hi, QQ: recommendations for text classification on complex abstractions with small datasets? i tried using SVM but the concept is too complex to do with simple machine learning(featuring word by word), and the dataset is too small for the text (2000 characters avg, only 250 samples) so applying models such as roBERTa or longformers doesnt work quite well.

odd meteor
# boreal cape so that when we pass them through a classifier one dataset is not more dominant ...

I haven't heard of early fusion before but if what you're referring to is related to solving the problem of class imbalance in your data, then there are several techniques to handle that.

  • SMOTE
  • Resampling strategy
  • Upsampling minority class
  • Tuning the class weight hyperparameter of your classifier
  • Doing Data Augmentation etc

Whichever approach you decide to take, if optimum performance is what you have in mind, please avoid using SMOTE.

Depending on the project, you can use class weight or data augmentation.
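To make the class weight option concrete, here is a small sketch of the "balanced" heuristic, n_samples / (n_classes * class_count), which is the formula scikit-learn documents for `class_weight="balanced"`. The helper name and the label counts are made up for illustration.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class inversely to its frequency: n_samples / (n_classes * count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

labels = ["neg"] * 90 + ["pos"] * 10    # made-up imbalanced labels
weights = balanced_class_weights(labels)
print(weights)
```

The rare class gets the larger weight, so each of its samples contributes more to the loss during training.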

odd meteor
odd meteor
north wolf
#

the number of labels is low (only 3) but the amount of data is not sufficient, i think

#

thats why using some noised text or text augmentation methods wont work, because it would break the whole text idea

molten hamlet
#

is this book good with theory? applying algos is easy, but I want to understand that 🙂

old crow
#

Hey so, I'm working on a college project where I'm using an LSTM model to try and predict stock prices. I have a few questions:

  1. How do you predict future stock prices? Most of the models I've seen don't actually predict future values - they just compare predicted versus actual prices, but only within the input dataset.

  2. In my LSTM model, the results for 10 years of data are kind of strange - they seem too good to be true. I'm not an expert so basically I'm not sure if I'm doing this right LOL. (I've attached the results)

  3. Every time I rerun the model, the results seem to get worse (significantly). Is this normal?

I'd really appreciate any sort of help. Thanks!

mild dirge
# old crow Hey so, I'm working on a college project where I'm using an LSTM model to try an...
  1. Your model predicts the next value(s) given all (or part of) the previous data. You could give all the data you have to it, and it should predict the next value(s).

  2. Are you testing it on a test set not used for training the model? If you test it on the data you used for training and it has very good results, it probably is overfitted. It will likely not work well on actual new data, but it has just memorized your training data.

  3. Rerunning should not change the results of the model I'd think, LSTM should just be deterministic given the same data and initial hidden state.
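A sketch of points 1 and 2, under the assumption that the prices are a plain time-ordered list (the data and helper names here are made up). The test part is taken strictly from the end of the series, so the model is never scored on data it trained on. On point 3, run-to-run differences in practice often come from random weight initialization and shuffling during training, so fixing the random seeds of whatever framework is used is worth trying as well.

```python
import random

def chronological_split(series, test_fraction=0.2):
    """Split a time-ordered sequence so the test part lies after the training part."""
    n_test = round(len(series) * test_fraction)
    cut = len(series) - n_test
    return series[:cut], series[cut:]

def make_windows(series, window):
    """Turn a sequence into (previous `window` values, next value) pairs."""
    return [(series[i:i + window], series[i + window])
            for i in range(len(series) - window)]

random.seed(0)  # fixed seed so the made-up data is the same on every run
prices = [100 + 0.5 * t + random.gauss(0, 1) for t in range(100)]

train, test = chronological_split(prices, test_fraction=0.2)
train_pairs = make_windows(train, window=10)   # fit the model on these
test_pairs = make_windows(test, window=10)     # evaluate only on these
print(len(train_pairs), len(test_pairs))
```

To forecast beyond the dataset, the last `window` known values are fed to the trained model, and its output can be appended and fed back in for further steps.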

odd meteor
# north wolf its hard, because what im trying to achieve is to classify the stance of the doc...

Well, you won't be reprimanded, penalised, or persecuted for attempting such, yeah? 😊

If I were in your shoes, I'd definitely try meta learning vs text augmentation.

I'm certain you'll either see some reasonable improvement or discover something interesting in the process.

For text augmentation, I've always used TextAttack and I enjoy using the library. You might wanna check it out later https://textattack.readthedocs.io/en/latest/0_get_started/basic-Intro.html

Then, if you're still worried about the artificial perturbation that comes with augmentation, then you can proceed to carry out Adversarial Text Attack (which you can also do with the TextAttack library).

Finally, if you aren't pressed for time, you can scrape data online and add it to the data you currently have (if you don't wanna do text augmentation).

If you're feeling generous enough, please do share with us what you were able to uncover after trying the two approaches.

old crow
# mild dirge 1) Your model predicts the next value(s) given all (or part of) the previous dat...

Are you testing it on a test set not used for training the model? If you test it on the data you used for training and it has very good results, it probably is overfitted. It will likely not work well on actual new data, but it has just memorized your training data.
yeah i feel overfitting might be the issue, i haven't tried running the model on a different dataset, any suggestions on how i should go about fixing it?

Rerunning should not change the results of the model I'd think, LSTM should just be deterministic given the same data and initial hidden state.
It's frustrating cuz i have to rewrite the entire code again to get results close to the first run.

pure wraith
#

Anyone here ever taken harvards cs50 intro to comp sci?

languid warren
errant juniper
#

i do think it's worth the cert though

pure wraith
#

Hmmm ok thanks

errant juniper
#

np

hasty mountain
#

I was making a diffusion prototype to see if I can get how it works and I got a bit confused over the noising thing.
I have to make my model noise my image, passing the image as input and using a noisy version as label, right? And this is done through some time steps.

Should I follow certain pattern to apply this noise to the image? Or can I just randomly apply noise to randomly selected pixels and let the forward/backward noising steps do the trick?

#

I'm using 5 steps at the moment, each image has 64x64x3 pixels(so 12,288 in total), the first step has the image as input and the label has 1000 random pixels added. This label is passed as input for the second step, which has as label the same input image, but with 2000 random pixels, and so on.
The 5th step has the 4th label(8000 random pixels) as input, while the label is simply a random noise with shape 64x64x3.

The results I'm getting is...nothing. lr=1e-3 leads to vanishing gradients and black and white stripes on every output.
lr=1e-6 produces only gray squares.
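For comparison, here is a sketch of the forward noising used in DDPM-style diffusion models, which differs from the pixel-replacement scheme described above: Gaussian noise is added to every pixel, the noise level follows a schedule, and the network is usually trained to predict the added noise rather than a noisier image. The helper names, the stand-in image, and the schedule values are assumptions for illustration.

```python
import math
import random

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances, increasing linearly (illustrative values)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def noise_image(pixels, t, alpha_bars, rng):
    """x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, with eps ~ N(0, 1)."""
    a_bar = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in pixels]
    noisy = [math.sqrt(a_bar) * x + math.sqrt(1.0 - a_bar) * e
             for x, e in zip(pixels, eps)]
    return noisy, eps

rng = random.Random(0)
image = [rng.uniform(-1.0, 1.0) for _ in range(64 * 64 * 3)]  # stand-in for a 64x64x3 image in [-1, 1]
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))

t = rng.randrange(1000)                  # a random timestep per training example
noisy, eps = noise_image(image, t, alpha_bars, rng)
# Training pair in this formulation: input (noisy, t), target eps.
```

Because any timestep can be sampled directly from the clean image, there is no need to chain the output of one step into the next during training.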

broken zenith
digital fog
#

For kNN when inspecting the features of the dataset, I have done a histogram and boxplot to check for normality/skewness and outliers. Is there anything else I should be doing visually?

queen cradle
#

If you want to visually check normality, I suggest a Q-Q plot.

#

But kNN is a non-parametric method, so you don't need normality except for the warm fuzzy feeling.

#

Keep in mind that real data is never normal. It may be approximately normal; it may be so close to normal that you can't tell the difference. But real-world data always comes with complications of one kind or another, so it's never going to be exactly normal.
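A small sketch of what a Q-Q plot computes, using only the standard library (the sample is made up and `qq_points` is a hypothetical helper name). Each sorted observation is paired with the quantile a normal distribution would place at that rank; plotting the pairs gives a roughly straight line when the data is close to normal.

```python
import random
from statistics import NormalDist, mean, stdev

def qq_points(sample):
    """Pair each sorted observation with the normal quantile expected at its rank."""
    ordered = sorted(sample)
    n = len(ordered)
    reference = NormalDist(mean(ordered), stdev(ordered))
    # (i + 0.5) / n keeps the probabilities strictly inside (0, 1).
    theoretical = [reference.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))

rng = random.Random(0)
sample = [rng.gauss(10.0, 2.0) for _ in range(200)]   # made-up, roughly normal data
points = qq_points(sample)
print(points[:3])
```

Curvature at the ends of the plotted line points to heavy or light tails, and a consistent bend points to skewness.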

gusty anvil
#

hi guys, does anybody know a way to schedule the execution of a google colab script? In a way that it runs once a week

digital fog
novel python
#

Hello everyone, I got some weird scenarios when testing some models here, but I couldn't really figure out where there's something wrong with the code (which I think there is because the values are way too off to make sense).

Currently, I'm sending different models to a run_model function so that I can have the results automated:

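import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
from sklearn.metrics import mean_squared_error

def run_model(model, X_train, y_train, X_test, y_test):
    # Fit the model on the training split
    model.fit(X_train, y_train)

    # Evaluate on the held-out split
    preds = model.predict(X_test)
    rmse = np.sqrt(mean_squared_error(y_test, preds))
    print(f'RMSE : {rmse}')

    # Predict over the whole range of day indices (train + test)
    signal_range = np.arange(0, len(X_test) + len(X_train))
    output = model.predict(signal_range.reshape(-1, 1))

    # Plot the actual usage (black) against the predictions;
    # df is the DataFrame defined elsewhere in the notebook
    usage = df[df['wireless_number'] == 1]['data_used_in_gb']
    plt.figure(figsize=(12, 6), dpi=100)
    sns.scatterplot(y=usage, x=np.arange(0, len(usage)), color='black')
    plt.scatter(x=signal_range, y=output)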

I'm testing every type of regressor, linear, polynomial, random forest, etc. But a simple linear regression is looking like this after a grid search was applied:

#

Which is absolute nonsense. Let me know if you guys spot anything weird there, I couldn't really figure it out.

mild dirge
#

So there is only 1 feature and 1 output? @novel python

mild dirge
#

The feature being the x-axis and the y-axis being the predictions?

novel python
#

yeah, but the feature is always the same because it's the data usage over the course of 30 days

#

so that's why I just used np.arange(0,30

#

to create them

mild dirge
#

And you are plotting the test data and the predictions?

novel python
#

I'm plotting the label data and the predictions for the whole data

mild dirge
#

Label data meaning the test data?

#

Is the feature discrete, like 1, 2, 3 etc. or can there be values inbetween?

novel python
#

nope, the whole 30 days. The test data is split between t

#

between the train_test_split function just to evaluate the RMSE

mild dirge
#

How do you split it then?

#

First few days for train and last few for test?

novel python
#
X_train, X_test, y_train, y_test =\
     train_test_split(df[df['wireless_number'] == 1]['data_used_in_gb'], np.arange(0, len(df[df['wireless_number'] == 1]['data_used_in_gb'])), test_size=0.1, shuffle=False)
#

yeah basically, 10% for the test

mild dirge
#

Do you have multiple values for the same day f.e.?

novel python
#

nope

mild dirge
#

Hmm okay

novel python
#

only 1 value per day

mild dirge
#

So if you are using simple linear regression, it should basically just pass close to the first 90% of points

novel python
#

yup, and it explodes right from the beginning

mild dirge
#

Well the linear regression will just give you a straight line

#

Did you in some way normalize any data?

novel python
#

nope, didn't use any scaling

mild dirge
#

Sorry for all the questions but really want to understand the full picture.
What was the rmse?

novel python
#

the linear regression doesn't even combine the first 2 points, it just goes straight to heaven right from the start. That's the only thing that's confusing me

novel python
#

really high

#

considering that the max usage for that month was like 6.5ish

mild dirge
#

So something is already going wrong at the fitting or predicting stage then

#

hmm

#

Is the model written by you?

novel python
#

nope, using sklearn ones

#

basically just doing model = LinearRegression(), for example.

#

and then passing it into the function

mild dirge
#

What shape is your training data

#

Looking at the fit description in the docs, it requires a 2d array of shape (n_samples, n_features)

#

Is yours 1d, might that be a problem?

novel python
#

I'm reshaping them with

X_train, X_test = np.array(X_train).reshape(-1, 1), np.array(X_test).reshape(-1, 1)
mild dirge
#

Yeah, that should be correct

novel python
#

oh, wait a minute... 1 sec

serene scaffold
mild dirge
serene scaffold
mild dirge
#

Wait

#

Do you not need to do model = model.fit(x, y) ?

novel python
#

it's being done inside the function

mild dirge
#

It says it returns a fitted model in the docs

novel python
#

I just realized what was being done wrong and I feel stupid af now to say that in chat

serene scaffold
novel python
#

but the X and y were reversed...

mild dirge
#

Yeah spit it out

#

Ah lol

novel python
#

ty a lot!

#

I literally spent the last 2 hours trying to fix that

#

but once I started throwing it here I realized that halfway

mild dirge
#

You unpacked them in wrong order in the train_test_split line then?

serene scaffold
novel python
#

yeah

mild dirge
#

Ah makes sense haha
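A stand-alone sketch of why the swap produces such a wild line, using made-up usage numbers and a hand-written least-squares fit (`fit_line` is a hypothetical helper, not the poster's code). With scikit-learn, `train_test_split(X, y)` returns `X_train, X_test, y_train, y_test` in the order the arrays were passed, so the feature (day index) has to come first and the target (usage) second.

```python
def fit_line(x, y):
    """Ordinary least squares for y ~ slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

days = list(range(30))                               # feature: day index
usage = [0.2 * d + 0.1 * (d % 3) for d in days]      # target: made-up cumulative GB

cut = 27                                             # first 90% train, last 10% test, no shuffling
X_train, X_test = days[:cut], days[cut:]
y_train, y_test = usage[:cut], usage[cut:]

slope, intercept = fit_line(X_train, y_train)        # correct: usage as a function of day
swapped_slope, _ = fit_line(y_train, X_train)        # the mix-up: day as a function of usage
print(slope, swapped_slope)
```

The swapped fit has a slope of roughly 5 instead of 0.2, so feeding day indices into it produces predictions far above the real maximum usage.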

novel python
#

lmao

pine wolf
languid sigil
serene scaffold
strange igloo
#

Chat GPT can solve coding problems/prompts pretty well. How does this work? I mean, roughly. Does this mean ChatGPT was trained on a similar question and answer, or is ChatGPT "thinking" up new solutions.

What's interesting is that when you ask it to write something, what comes out is very generic sounding. But the code it produces seems very elegant in comparison.

serene scaffold
austere swift
serene scaffold
#

oh cool

strange igloo
#

I’ve heard that every output is like a probability of what would come next.

Which makes sense with sentences. But with code, that seems way more difficult and error prone.

serene scaffold
languid sigil
#

You can trick it to fix itself

strange igloo
#

I haven’t used it enough to know.

languid sigil
#

If there's an error in the code or something wrong with the formatting where the results are off

#

You can ask it to "Do it again without using for loops"

#

as an example

#

And it'll find an alternative

austere swift
languid sigil
#

I've never had non-functioning code from chat gpt but depends on how advanced the exact code you're looking for is I guess

austere swift
#

I don't remember who it was here that sent a snippet from chatgpt where they asked it to expect the outcome of some code, it did the addition wrong, then they told it "can you check your math" or something like that and it was like "oh yeah I was wrong, its actually *another wrong answer*"

strange igloo
#

I’m currently upset because I was working on a double recursive solution for a while, and was conceptually correct, but was having trouble crafting the code structure.

ChatGPT basically gave me exactly what I would have ended up with had I kept going for a few days

austere swift
languid sigil
#

Its crazy how it just knows how to write the code at all

#

Does it just ... analyze source files and their instructions, and make their own interpretation using what you ask?

#

I've had trouble finding code examples and chat gpt just spits it out like nothing

austere swift
#

its trained on a bunch of code from all different sources, including github and s/o posts etc

#

that's why its responses can sometimes look like textbook s/o posts

hasty mountain
#

Which is interesting...until I ask it about Reinforcement Learning and all he can say is basically about PPO

languid sigil
#

Also, if anyone gets a chance to look at these coefficients for linear regression in this help channel, I would greatly appreciate it 😅
#1061405019796152380 message

#

I mean, I could just ask Chat GPT lol

hasty mountain
#

Yes, but bear in mind that it can produce wrong answers

#

He answered me some things about GANs that, when I tested them, it didn't work at all.

#

Or maybe I did something wrong...which I hope so, because if it worked it would be cool

verbal venture
#

this was written in tensorflow: tf.keras.layers.Conv2D(16, (3, 3), activation="relu", input_shape=(150, 150, 3)). does (3, 3) refer to the kernel filter? and what's the 16?

austere swift
verbal venture
#

so there's 48 filters total yeah?

austere swift
#

it's also called the kernel size or convolution window size sometimes
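To put numbers on that layer: the first argument (16) is the number of filters, and each filter is a 3x3 window spanning all 3 input channels, so there are 16 filters rather than 48. The sketch below works out the parameter count and output shape, assuming Keras' defaults of stride 1, "valid" padding, and one bias per filter; `conv2d_summary` is a hypothetical helper.

```python
def conv2d_summary(filters, kernel_size, input_shape):
    """Parameter count and output shape for a stride-1, unpadded 2D convolution."""
    kernel_h, kernel_w = kernel_size
    height, width, channels = input_shape
    weights_per_filter = kernel_h * kernel_w * channels   # each filter spans all input channels
    params = filters * (weights_per_filter + 1)           # +1 bias per filter
    output_shape = (height - kernel_h + 1, width - kernel_w + 1, filters)
    return params, output_shape

params, output_shape = conv2d_summary(16, (3, 3), (150, 150, 3))
print(params)        # 448
print(output_shape)  # (148, 148, 16)
```

Each of the 16 filters produces one output channel, which is why the last dimension of the output is 16.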

iron basalt
# strange igloo Chat GPT can solve coding problems/prompts pretty well. How does this work? I me...

@serene scaffold ChatGPT kind of has a "world model" due to its use of RL, but it's a model of just the text (notably, not physical text that has position and such (the kind that humans read / work with (very different from a string in memory in a computer))) and human preferences (they trained it with human feedback on responses). It's not a "world model" of the real physical world like humans have. In addition, when people bring up "think" my assumption is that they are imagining conscious human thought, which ChatGPT is not mimicking. Conscious human thought (made up fuzzy definition here) is much more like "the video game in your head" which requires a "world model" of the physical world, which comes from multiple senses such as vision, touch, and sound (fusion). Human language is the association (see associative memory in neuroscience) of these things (including sequences of these things) (audio for spoken language and vision (or touch) for written / carved / etc). The way stuff like most deep learning being used works is more like unconscious human thought, it's fast / fuzzy probabilistic processes / functions that don't run a whole intent-guided simulation (but dialed up to 11 via lots of compute (also parts of conscious-like thought have been making their way more and more into these systems (e.g. attention))). However, conscious human thought can poke at / modify / make use / invoke those processes. Spoken language is particularly interesting since most of it is done unconsciously, we don't think of every word we are about to say before we say it, that would be too slow for day-to-day use. It's intent guiding / invoking those faster processes (we can switch to conscious mode / focus on it though and do it the slower way). There isn't a set definition of what "conscious" is, but I find this distinction to be useful.

iron basalt
# iron basalt @serene scaffold ChatGPT kind of has a "world model" due to its use of RL, ...

*It turns out however, that one can get pretty far with just the text and preferences feedback. And humans are easily tricked into thinking that it came from something conscious, probably because it's an assumption. We assume other humans are conscious like us, and when something produces something we would only expect from a human our associative memory probably jumps to "is conscious". It's an important assumption that makes us attempt to learn from each other without being taught to do so (and empathy, etc).

iron basalt
# mild dirge This was typed by chatGPT*

That is unfortunately something that must be considered now for all text on the internet. I hope that it does not result in too much spam and make things worse than they are.

#

(At some point we all probably need detection tools to get anywhere)

hasty mountain
#

Also...uh...is there a type of Attention Layer developed specifically for images?
I've been testing one that I made here for some days now...but if a proper ML engineer developed one, it'll probably be way more effective than mine.

mild dirge
#

I'm sure there are attention layers for simple image classification too.

hasty mountain
mild dirge
#

I'm not sure about that, I used it for a project once myself which is why I know it exists

hasty mountain
verbal venture
#

most of the AI courses on the internet suck

#

they don't explain anything. they just tell you to write x and y but never why

#

I'm trying to make CNNs atm

serene scaffold
jolly briar
hasty mountain
#

Does PyTorch's .permute() command destroy RGB images or is it just my luck?
The transforms.ToTensor() transformed my CIFAR100 images into marvellous white squares with gray stripes

EDIT: I was using .view(), when using .permute(), things go on fine. Strange...
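A small illustration of the difference, using NumPy as a stand-in (`transpose` and `reshape` behave like `permute` and `view` for this purpose). Reordering axes moves each pixel's channel values to the right place, while reshaping only reinterprets the same memory order under a new shape, which scrambles channels and produces stripes.

```python
import numpy as np

# A tiny "image" in height x width x channel layout: 2 x 2 pixels, 3 channels.
hwc = np.arange(12).reshape(2, 2, 3)

moved = hwc.transpose(2, 0, 1)         # like tensor.permute(2, 0, 1): axes are reordered
reinterpreted = hwc.reshape(3, 2, 2)   # like tensor.view(3, 2, 2): same memory order, new shape

print(moved[0])           # first channel only: every third value
print(reinterpreted[0])   # first four values in memory, mixing all channels
```

Both results have shape (3, 2, 2), which is why the mistake does not raise an error.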

serene scaffold
#

yep!

boreal cape
#

@serene scaffold

#

can you help with early data fusion

serene scaffold
# boreal cape can you help with early data fusion

please don't ask to ask, or ping specific people. put a complete question in the chat that anyone can read, and start answering if they know. if people have to interview you to figure out the question, they won't.

tacit basin
#

There should be a ChatGPT-based bot here to repeat this over and over 🙂

azure mulch
tacit basin
azure mulch
tacit basin
azure mulch
tacit basin
azure mulch
# tacit basin Why is it limited?

Because of the amount of public attention it got and the criticism of how easy it was to get it to say it supported the Taliban or similar.

tacit basin
#

By public attention you mean the RL part?

azure mulch
#

Not exactly RL as a whole.

tacit basin
#

But what do you mean by 'limited',?

boreal cape
#

hey everyone, so I have these two different types of data

#

health data

#

and non health data

#

now I am trying to do late fusion

#

how do I do it

tacit basin
#

Could be the shape of an array. For example, an RGB image with 3 color channels and a size of 512 has shape 3 x 512 x 512. But there could be other meanings depending on the context

lapis sequoia
#

Train Score 0.6417617510601822
Test Score 0.6397783938389512
does this mean my model is good?

main fox
tranquil jasper
#

how can i change the name of a column?

main fox
tranquil jasper
#

thanks

lapis sequoia
#

from california housing list dataset

unique ridge
#

Hey there, is a pipeline part of the modeling process or the data preparation process?

I can do at the moment 2 things with my data:

  • remove outliers
  • append StandardScaler to all columns.

Now I am afraid that when I select my train, validate and test data the outliers removal goes a bit wrong resulting in uneven columns when appending my pipeline to the tr, val and tst data.

unique ridge
cedar sentinel
#

Hey guys. I just dive into SVMs and trying to solve some easy exercises but I struggle. Can someone give me some help ?

fallen crown
#

ok but how is the number of each innovation determined, is there an order for numbering each connection

lapis sequoia
#

Nice to know about kaggle tho
I'll try practicing from there tyvm

warm iron
#

Hello everyone I am a student hoping to get some idea for a AI project as that is something I have to do in my college for A-level. However, I am struggling to decide what AI project to make.

warm iron
# haughty heart a chess game ai

A chess AI would probably be made by some other students. I also kinda need something unique, as I am applying for uni and trying to include it in my personal statement

haughty heart
#

oo

#

alright let me think hmm

#

advanced but

haughty heart
#

what about an AI that you give foods and fruits and it comes up with recipes for a fruit/protein smoothie using those items

warm iron
warm iron
haughty heart
unique ridge
#

no problem 💪

hoary haven
#

How can I make a chat AI?

#

Which libraries would you recommend?

lapis sequoia
#

Can anyone suggest something?
I am new to machine learning, and to show my results I generally use matplotlib to make graphs, but I get an error every time, like "should be of same shape" or something
I don't fully understand how it works so I'm a little confused
Can anyone help me with an in-depth guide to seaborn or matplotlib?

#

I'm following two links on the internet but they aren't reliable and most of the code given there is either outdated or wrong
Also, I don't want to use its documentation

fallow frost
#

any avid flashtext users by chance?:

fallow frost
#

If you had this sentence:
sentence = "distributed super computer game"
and you wanted to extract these keywords:

{
    "Distributed Super Company": ["distributed super company"],
    "Super Computer": ["super computer"],
    "Computer Game": ["computer game"]
}

Would you want the output of kp.extract_keywords(sentence) to be:
["Super Computer"] or:
["Super Computer", "Computer Game"]

misty chasm
#

hello

verbal venture
#

what does fit mean in this context

#

model.fit(X_train, y_train)

queen cradle
#

It's an sklearn method. Exactly what it does depends on model. See the docs.

verbal venture
#

how close the values of x are to y ?

queen cradle
#

It depends on model.

verbal venture
#

it says x is input data and y is target data

#

my model is a CNN

queen cradle
#

How are you constructing model?

verbal venture
#
```
from keras.layers import Dense
from keras.models import Sequential

# num_pixels and num_classes are defined earlier in the notebook
def classification_model():
  model = Sequential()
  model.add(Dense(num_pixels, activation='relu', input_shape=(num_pixels, )))
  model.add(Dense(100, activation='relu'))
  model.add(Dense(num_classes, activation='softmax'))

  model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
  return model
```
#

using Keras

queen cradle
#

Oh, so that's not sklearn.

#

Well, then I think you should look in the Keras docs.

#

It should explain to you what model.fit does.

verbal venture
#

im coming from the docs bro

#

and it said x is input y is target. i started ml 2 days

queen cradle
#

It's constructing a function that approximately sends x to y. That's all.

#

The function is a neural network with some coefficients, and .fit chooses those coefficients so that the input data gets mapped to the target data, more or less.

tidal bough
# verbal venture and it said x is input y is target. i started ml 2 days

The docs say quite a lot more than that:
https://keras.io/api/models/model_training_apis/#fit-method

Trains the model for a fixed number of epochs (iterations on a dataset).
...
epochs: Integer. Number of epochs to train the model. An epoch is an iteration over the entire x and y data provided (unless the steps_per_epoch flag is set to something other than None). Note that in conjunction with initial_epoch, epochs is to be understood as "final epoch". The model is not trained for a number of iterations given by epochs, but merely until the epoch of index epochs is reached
and epochs defaults to 1 according to the signature.

verbal venture
tidal bough
#

Do you know what backpropagation is?

verbal venture
tidal bough
#

It splits the dataset into batches of 32 (by default) datapoints, runs backpropagation on each according to the model's optimizer (Adam in your case), until the entire dataset gets trained on this way (which is called a single epoch).

verbal venture
#

like fit as a method itself is always backprop?

tidal bough
#

Not sure what you mean by that. One needs to do backpropagation to determine how to alter the model's weights to decrease loss. How exactly it does it, though, depends on the optimizer - it can be just gradient descent (if using the SGD optimizer), or something fancy like Adam, like in your case.
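A toy sketch of what a `fit` loop does, written in plain Python for a one-feature linear model instead of a neural network (the function, data, and hyperparameters are assumptions for illustration). In a deep network backpropagation computes the gradients; here they can be written out by hand, and the update is plain gradient descent rather than Adam.

```python
import random

def fit(xs, ys, epochs=300, batch_size=8, lr=0.1, seed=0):
    """Fit y ~ w * x + b by mini-batch gradient descent on mean squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):                       # one epoch = one pass over all the data
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Gradients of the batch's mean squared error with respect to w and b.
            grad_w = sum(2 * (w * xs[i] + b - ys[i]) * xs[i] for i in batch) / len(batch)
            grad_b = sum(2 * (w * xs[i] + b - ys[i]) for i in batch) / len(batch)
            w -= lr * grad_w                      # plain gradient-descent update
            b -= lr * grad_b
    return w, b

xs = [i / 64 for i in range(64)]
ys = [2.0 * x + 1.0 for x in xs]                  # made-up data on the line y = 2x + 1
w, b = fit(xs, ys)
print(w, b)
```

Swapping the two update lines for a different rule is all that changes between optimizers; the epoch and batch structure stays the same.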

hasty mountain
#

Hey guys, about Reinforcement Learning...
I'm trying to understand Policy-based algorithms, according to ChatGPT:
In policy-based reinforcement learning, the goal is to learn a policy that directly maps states to actions, rather than learning a value function that predicts the expected return for each action. The agent's policy is typically represented by a probability distribution over the action space, which specifies the probability of selecting each possible action given a particular state.

So, if I want to make a policy-based algorithm, would I have to actually make 2 models: one to receive the current state and predict the possible actions through a softmax, and another one that will receive the current state and predict the actual action?

#

If this is correct...then it feels like my agent model will be working together with a vectorizer model... The vectorizer receives a context(current state) and, based on that, generates vectors(actions probabilities), which will be the cross-entropy label for the agent output. pithink

final jewel
#

Hey there! How are you doing?
I need some help: how can I use ChatterBot now? Its owners are not updating it. It's an AI chat bot, that's why I'm writing here. Do you know how to use it in 2023?

worldly dawn
hasty mountain
verbal venture
#

does epoch * batch size = total amount of data trained on?

#

if I have a dataset of 60k, 10 epochs and a batch size of 200, would that mean 2k images got trained on?

hasty mountain
#

A single epoch is when you've iterated through the entire dataset(which is divided in batches)

wanton merlin
#

Hey guys are there any models that are trained to estimate the size of insects and worms ?

wanton merlin
#

based on insects or worms image datasets

verbal venture
hasty mountain
#

If your batch is 64, then you'll be making your model deal with 64 samples at each pass

verbal venture
#

samples of how much data?

#

64/60k?

hasty mountain
#

No, it's just the samples from your entire dataset

#

If your dataset has 60k samples and your batch size is 64, then you'll be selecting 64 samples from your 60k dataset to make the forward/backward pass.
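The arithmetic for the 60k example, as a quick sketch (`training_counts` is a hypothetical helper). Every epoch covers the whole dataset once, so the batch size changes how many weight updates happen per epoch, not how much data is seen.

```python
import math

def training_counts(n_samples, batch_size, epochs):
    """How many batches per epoch, updates overall, and samples processed overall."""
    steps_per_epoch = math.ceil(n_samples / batch_size)
    return {
        "steps_per_epoch": steps_per_epoch,         # batches (weight updates) per epoch
        "total_updates": steps_per_epoch * epochs,  # weight updates over the whole run
        "samples_seen": n_samples * epochs,         # every sample is used once per epoch
    }

counts = training_counts(n_samples=60_000, batch_size=200, epochs=10)
print(counts)
# {'steps_per_epoch': 300, 'total_updates': 3000, 'samples_seen': 600000}
```

So with 60k images, 10 epochs, and a batch size of 200, all 60k images are trained on, ten times each.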

stone sluice
#

Hey guys, I have a question, say I have two dataframes in Pandas, where one dataframe holds the name of something (call it value) and two other columns A and B. Now, I want to create a new dataframe, where the value of a new column X is based on the A and B columns, grabbing the value that belongs to it. How would I go about that?

iron basalt
hasty mountain
#

I managed to make a model that tries to predict correctly the reward...but I'm still struggling to make it...like...try different actions, instead of blindly using the same one

iron basalt
hasty mountain
hasty mountain
hasty mountain
#

It's not exactly exploration. I mean... The gradients generated through the MSE(predicted_reward, actual_reward) should generate different actions, shouldn't it?

iron basalt
#

(Or maybe you decide to ignore them entirely)

hasty mountain
iron basalt
hasty mountain
iron basalt
#

Directly / indirectly. Things get a bit weird in terminology and such, it's best to just read that book.

hasty mountain
#

Ugh... I hate how people use different terminologies for this...

iron basalt
#

With RL you are dealing with feedback loops, so there is a loop between actions and values. So talking about it is a bit difficult.

#

*In general, which is why we talk about specific algorithms; otherwise things get difficult, and then people make up their own terms and start just using math to be more specific.

hasty mountain
#
```
action selection over action values there is always an ε probability of selecting a random
action. Of course, one could select according to a soft-max distribution based on action
values, but this alone would not allow the policy to approach a deterministic policy.
Instead, the action-value estimates would converge to their corresponding true values,
which would differ by a finite amount, translating to specific probabilities other than 0 and
1.
```
Oh...so this explains why the softmax(possible_actions, true_action) didn't work for me...
hasty mountain
# iron basalt Not necessarily. If you are estimating values, you can choose actions based on t...
# Sample an action from the distribution
        action = torch.multinomial(action_probs, 1).item()

(ChatGPT)
Oh...so my action is not necessarily the action with higher probability in the distribution?
Then, the predicted reward is not based on my action or in the state that gave origin to such action, but rather on the state that comes right after the model executed such action?

Or is it wrong?

#

My model is predicting both the action and, based on that, the reward. It's managing to make nice predictions, but the actions tend to be always the same.
Should I make it predict the action and, after the action has been executed, predict the reward?

iron basalt
hasty mountain
# iron basalt I don't you have mentioned which RL method you are using. Unless you are making ...

I don't really know the method. All I know is that my model receives an image and the previous actual reward as inputs.
Then, extracts features from the image and tries to predict the actions(this is then passed through a softmax).

It also passes the previous reward through a linear layer and concatenates the output to the actions(before softmax). With this concatenation, it tries to predict the reward.

#

Then the backpropagation is done exclusively through MSE(predicted_reward, actual_reward). Always using cumulative reward.

velvet abyss
#

Trying to install TF from the source, how long is the PKG build process supposed to take?

#

It's been like, a solid 15 minutes

tough sun
#

I'm new to ai and I'm trying to find some models that can classify if a sequence of heads or tails was computer generated or human written. I have a small dataset of labeled data and was wondering what kind of model would work. Would a LSTM rnn work for this?

serene scaffold
hasty mountain
serene scaffold
tough sun
#

like a sequence of thhhththh

hasty mountain
tough sun
#

Basically a human trying to fake a random coin toss

serene scaffold
#

You could try it with an lstm, I guess.

#

I don't think the solution would generalize very well.

hasty mountain
#

Aw...my GAN doesn't want to converge anymore just because I'm using a lower batch size.
I didn't know the batch size was also a parameter to make a GAN converge... I thought it was just for better generated images.

mild dirge
#

I think for humans you can probably find some kind of relation between a coin flip and all previous coin flips, but when a machine does it, there should be absolutely no dependence @tough sun

#

So you should make an algorithm that can find out if there is a dependence between a coin flip and a certain amount of previous coin flips
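
One simple way to check for that kind of dependence, before reaching for an LSTM, is to count how often consecutive flips differ. This is only a sketch of one possible test (the function name is made up), and it assumes a fair coin: under independent flips each adjacent pair differs with probability 0.5, and people faking randomness often alternate more than that.

```python
import math

def alternation_z_score(flips):
    """z-score of the number of switches between consecutive flips.

    Under independent fair flips, each of the n-1 adjacent pairs differs
    with probability 0.5, so switches ~ Binomial(n-1, 0.5).
    """
    pairs = len(flips) - 1
    if pairs < 1:
        raise ValueError("need at least two flips")
    switches = sum(a != b for a, b in zip(flips, flips[1:]))
    expected = pairs / 2
    std = math.sqrt(pairs / 4)
    return (switches - expected) / std

# Large positive values mean too many alternations, large negative values
# mean long streaks; values near 0 are consistent with independent flips.
print(alternation_z_score("thhhththh"))
```

A single short sequence will rarely be conclusive, so a score like this is probably more useful as one input feature for a classifier than as a verdict on its own.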

fierce harbor
#

I'm working on the interpretation of regression analysis but I get really confused about the characteristics of an approximately normal distribution

#

I have mean=42 mins, std dev = 3 mins

#

The shape of the probability distribution is roughly symmetric, the center would rest at 42 mins, what would be the variability?

#

Would the variability just be the standard dev so 3 mins? My statistical knowledge is minimal. I am also familiar with the normalCDF and invNorm functions if that is needed
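
For reference, the standard deviation (3 mins) is the usual measure of variability for a normal distribution, and the empirical 68-95-99.7 rule follows from it. A small sketch with the standard library's `statistics.NormalDist` (Python 3.8+), using the mean and standard deviation from the messages above:

```python
from statistics import NormalDist

dist = NormalDist(mu=42, sigma=3)

# Share of values within 1, 2 and 3 standard deviations of the mean
# (the normalCDF calculation).
for k in (1, 2, 3):
    low, high = 42 - 3 * k, 42 + 3 * k
    share = dist.cdf(high) - dist.cdf(low)
    print(f"within {k} sd ({low} to {high} mins): {share:.3f}")

# The invNorm calculation: the value below which 95% of observations fall.
print(round(dist.inv_cdf(0.95), 2))
```

So roughly 68% of values fall between 39 and 45 mins, and roughly 95% between 36 and 48 mins.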

lavish swift
#

Does anyone have any good resources for learning/exploring Polars (https://www.pola.rs/). Can be an article, book, video...just wondering if anyone has a resource they've personally found useful. I've watched some YT videos and also this article (https://kevinheavey.github.io/modern-polars/) just looking for other resources beyond just a YT or Google search. Thanks!

serene scaffold
lavish swift
#

@serene scaffold yeah, I mentioned that just to save people time doing a google/yt search which I've done. And let folks know the format wasn't super important to me if they found it useful. So really, a YT vid or Google result would work, but if it's in the top results, I've probably checked it out already 🙂

#

During 2022 I worked my way through Matt Harrison's "Effective Pandas" book and loved it. So a book (or e-book) recommendation would be good too.

modest wagon
#

Hi. I've been trying to use a CNN to identify sign-language in a jupyter kernel (3.9.13), but running this "from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.callbacks import TensorBoard" keeps on killing my kernel. Does anybody have any advice/ideas?

serene scaffold
#

@orchid mist we don't allow surveys here, sorry

modest wagon
#

This is the error:
ModuleNotFoundError: No module named 'tensorflow'

arctic wedgeBOT
serene scaffold
#

and then go back to the py file and add the same code, run it, and make sure it's the same.

modest wagon
#

sys.executable = '/Users/myusername/opt/anaconda3/bin/python'

#

the .py file still catches the error in the beginning and doesn't run this line

serene scaffold
modest wagon
#

okay

#

invalid syntax

serene scaffold
#

!traceback

arctic wedgeBOT
#

Please provide the full traceback for your exception in order to help us identify your issue.
While the last line of the error message tells us what kind of error you got,
the full traceback will tell us which line, and other critical information to solve your problem.
Please avoid screenshots so we can copy and paste parts of the message.

A full traceback could look like:

Traceback (most recent call last):
  File "my_file.py", line 5, in <module>
    add_three("6")
  File "my_file.py", line 2, in add_three
    a = num + 3
        ~~~~^~~
TypeError: can only concatenate str (not "int") to str

If the traceback is long, use our pastebin.

serene scaffold
#

^ this shows what is meant by "whole error"

modest wagon
#

/usr/local/bin/python3 /Users/username/Downloads/MeDetectionWHOA.py
(base) username@MyDevice ~ % /usr/local/bin/python3 /Users/username/Downloads/MeDetectionWHOA
.py
File "<fstring>", line 1
(sys.executable = )
^
SyntaxError: invalid syntax

#

I get this

serene scaffold
#

print(f'{sys.executable = }') is not (sys.executable = )

modest wagon
#

I copied this "import sys
print(f'{sys.executable = }')
", and I still keep getting this /usr/local/bin/python3 /Users/username/Downloads/MeDetectionWHOA.py
(base) username@DeviceName ~ % /usr/local/bin/python3 /Users/username/Downloads/MeDetectionWHOA
.py
File "<fstring>", line 1
(sys.executable = )
^
SyntaxError: invalid syntax
(base) username@Insanity ~ %

serene scaffold
#

but I suspect it's going to print /usr/local/bin/python3, which is a different executable than /Users/myusername/opt/anaconda3/bin/python. so the things you have installed for anaconda won't appear when you use /usr/local/bin/python3
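
The `SyntaxError` at `(sys.executable = )` is most likely a sign that `/usr/local/bin/python3` is older than Python 3.8, since the `=` specifier inside f-strings was added in 3.8. A version-independent way to print the same information, as a sketch (the helper name is made up):

```python
import sys

def interpreter_info():
    """Return the running interpreter's path and version tuple."""
    return {"executable": sys.executable, "version": tuple(sys.version_info[:3])}

info = interpreter_info()
# Plain print works on every Python 3 release; the f-string "=" form
# (f'{sys.executable = }') needs Python 3.8 or newer.
print("executable:", info["executable"])
print("version:", info["version"])
print("supports f'{x = }':", info["version"] >= (3, 8))
```

Running this both in the notebook and in the .py file shows whether the two are using the same interpreter.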

serene scaffold
#

anyway, how did you start the jupyter notebook?

modest wagon
#

I found a youtube tutorial on sign-language

#

🙂

#

I have some python experience, but not much with ML

serene scaffold
#

watching a video to reproduce all your steps is more than I'm willing to commit to. but if you can explain how you started the notebook, please let me know.

modest wagon
#

I just opened anaconda, clicked launch, and pressed new

serene scaffold
#

I see

modest wagon
#

that's in jupyter

#

the error I sent was in the .py file

#

In jupyter I get. sys.executable = '/Users/myusername/opt/anaconda3/bin/python'

serene scaffold
#

imo, anaconda shouldn't be used by beginners anymore. though I understand why it was in the past.

modest wagon
#

so, any advice on what I should do?

serene scaffold
#

how much Python experience do you have prior to this? like what's the most complicated thing you've done?

modest wagon
#

Some stuff with arrays, loops, if-statements and strings

#

And some raspberry pi stuff

serene scaffold
#

are you sure you're talking about arrays? because lists and arrays are different.

modest wagon
#

lists

#

not arrays

#

sorry

serene scaffold
#

if you're not doing DS/AI/ML, you can get away with using "list" and "array" interchangeably. but for us, the distinction is critical, and using the wrong term could result in miscommunicating without realizing it.

modest wagon
#

oh

serene scaffold
#

anyway, I'm not too sure how to get into DS as a Python beginner without learning bad practices. I learned Python first.

modest wagon
#

what would you suggest I focus on learning?

serene scaffold
modest wagon
#

okay

#

thanks

#

I'll read up on them and try again later

serene scaffold
modest wagon
#

thanks!

hasty mountain
serene scaffold
hasty mountain
serene scaffold
hasty mountain
lapis sequoia
#

nvm didnt read the full sentence

#

my bad

#

i mean it depends on the user and their wants generally it has its pros and cons

honest verge
#

anyone know if Python for Data Science and Machine Learning Bootcamp by Jose Portilla on Udemy is any good?

vague sable
#

Hi all, I have two CSV data files that contain a large number of grocery products (1000+ items) from two different grocery stores.
I am trying to compare the prices from each store to identify the best price.
To do this, I first need to match the products as best I can. I am struggling to find the best way to do this; I have tried multiple different methods to match the product names together.
For example
Store1 names a certain apple as follows
'Fresh Pink Lady Apples Each'
Store2 names the same apple as follows
' Pink Lady Apples 200g each'

I have tried to use levenshtein fuzzy string matching
difflib
natural language processing
greedy algorithm

cant seem to get it precise enough to have something useful so I can move on to comparing the prices.

Any suggestions?

queen cradle
vague sable
#

They are kinda all over the place.
Here is a sample of some of the products.

 Kitchen Superfood Slaw Mix  350g
 Granny Smith Apples Prepacked  1kg
 Asian Buk Choy  1 bunch
 Mini Sweet Pineapple  1 each
 Kitchen Chicken Caesar Salad Bowl  180g
 Yellow Nectarines  1 Kg
 Kitchen Carrot Sticks  150g
 Chives Punnet  10g
 Spinach And Kale  300g
 Snacking Carrots  200g
 Roasted & Salted Cashews  400g
 Kitchen law  200g
 Kitchen Broccoli & Cauliflower Florets  150g
 Brown Mushrooms Loose   200g
 Mix-A-Mato Grape Tomatoes  300g
 Flat Mushrooms loose   350g
 Mediterranean Style Salad Bowl  185g
Fresh Red Entertain Peri Tomato  350g
 Green Kiwifruit Prepack  6 pack
 Roma Tomatoes Loose   100g each
 Plums Prepacked 1kg  1 each
 Roasted & Salted Pistachios  400g
#

Im starting to incorporate the pricing into matching. So if the prices differ by more than say 15%, it shouldnt be a match as generally grocery prices dont differ too greatly between stores
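
A sketch of how name normalization and the price check could be combined, using only `difflib` from the standard library. The filler-word list, the size pattern and both thresholds are assumptions to be tuned on the real data, and prices are assumed to be positive.

```python
import re
from difflib import SequenceMatcher

# Hypothetical filler words and size pattern; tune these for the real data.
FILLER = {"fresh", "each", "loose", "prepacked", "prepack"}
SIZE = re.compile(r"\b\d+(\.\d+)?\s*(kg|g|ml|l|pack|bunch)\b")

def normalize(name):
    """Lowercase, drop sizes and filler words, sort the remaining tokens."""
    name = SIZE.sub(" ", name.lower())
    tokens = [t for t in re.findall(r"[a-z]+", name) if t not in FILLER]
    return " ".join(sorted(tokens))

def similarity(a, b):
    """Similarity in [0, 1] between two normalized product names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

def is_match(name_a, price_a, name_b, price_b, min_sim=0.85, max_price_gap=0.15):
    """Accept a pair only if the names are similar AND the prices are close."""
    gap = abs(price_a - price_b) / max(price_a, price_b)
    return similarity(name_a, name_b) >= min_sim and gap <= max_price_gap

print(normalize("Fresh Pink Lady Apples Each"))    # apples lady pink
print(normalize(" Pink Lady Apples 200g each"))    # apples lady pink
```

Sorting the tokens makes word order irrelevant, and stripping sizes and filler words removes most of the store-specific noise before the fuzzy comparison runs. The cost is that different pack sizes of the same product will match each other, so the size may need to be kept as a separate field.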

strong sedge
#

I have a csv file whose data is not one entry per row, i.e. a single entry may span multiple rows.
Is there a pandas function to read this kind of csv file, or do I have to read and parse it manually?
(for legal reasons I cant really share the dataset)

prime crystal
#

Friends what is that technique called for testing a model where you e.g. Split it into 5 subsets and iterate through each one, using it as test data and the rest as training, then get the mean of the accuracies?

#

Nm just found it - it's k-fold cross validation
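
For anyone else looking for it, a minimal pure-Python sketch of k-fold cross validation. `evaluate` is a hypothetical callback that trains on the train indices and returns the accuracy on the test indices; in practice scikit-learn's `KFold` and `cross_val_score` do this for you, and the data is usually shuffled first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k folds."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

def cross_validate(evaluate, n_samples, k=5):
    """Average the score returned by evaluate(train, test) over k folds."""
    scores = [evaluate(train, test) for train, test in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)

for train, test in k_fold_indices(10, 5):
    print("train:", train, "test:", test)
```

Each sample lands in the test set exactly once, and the final number is the mean of the k per-fold scores.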

cinder schooner
#

Greetings everyone, so I have a little question:
Its the first time I participate in a kaggle competition, I created a notebook where I worked on EDA then modeling then validation and I want to make my first submission. But since its a notebook only competition, it needs to run and give the submission file at the end. Thus it will run all the modeling code and take hours training and wouldn't use the model I just trained. So my question is how do people submit in those competition? Do I need to make two notebooks one for training and one for submissions (where I upload the model from the training notebook)?

prime hearth
#

Im following this tutorial online but when i try to replicate the results with their dataset , the math comes out wrong

#

when the author says "Reviews proportion" does he mean the quantity of reviews( an integer value)?

#

nvm

hasty mountain
#

Also, now I know that my true action is not necessarily the possible action with the highest probability given by the policy network, but just an action sampled from those probabilities.
I just wonder if there's any PyTorch built-in function that can help me make a weighted sample...
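
The `torch.multinomial(action_probs, 1)` call in the snippet quoted earlier in the chat is exactly such a weighted sample, and `torch.distributions.Categorical` is another PyTorch option. To show the idea without any dependencies, here is a sketch with the standard library's `random.choices`; the probabilities are made up.

```python
import random
from collections import Counter

def sample_action(action_probs, rng=random):
    """Draw one action index with probability proportional to its weight."""
    actions = range(len(action_probs))
    return rng.choices(actions, weights=action_probs, k=1)[0]

rng = random.Random(0)          # seeded so the run is repeatable
probs = [0.15, 0.55, 0.30]      # hypothetical policy output after softmax
counts = Counter(sample_action(probs, rng) for _ in range(10_000))
print(counts)  # frequencies should be roughly proportional to probs
```

The most likely action is picked most often, but the others still get chosen from time to time, which is what gives a policy-based agent its exploration.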

#

Wait...I could make my model sample that action by itself brainmon

wild sluice
#

I know how to use python and c++ to an intermediate level. Can someone help me with a book I can use to start learning about artificial intelligence

serene scaffold
serene scaffold
# tacit basin Not sure why you said that

what common problem does anaconda solve that isn't now solved by venv and pip? and whatever it is, is it worth creating a knowledge gap between DS/AI people and the rest of the Python community?

hasty mountain
# hasty mountain *Wait...I could make my model sample that action by itself* <:brainmon:439516188...

What if I calculate the expected reward for each action in the probability distribution, and then make a weighted sample, where the weights are the expected rewards for each action?
I suppose this could mitigate the softmax tendency of always outputting the same action, right?

If an action probability is 0.15 and its expected reward is 10, the result would be 1.5, while if the best action is 0.55 and its expected reward is 1, the result would be 0.55, so the first action tends to be chosen...
But then, if the policy thinks action B is better than A, maybe the tendency is for the reward for B to be way higher than for A...even if the layer for predicting the reward is independent...I guess.

tacit basin
#

How do you know what envs you have with venv. Do you use special folder for them ? Conda keeps them organized for you ...

mint palm
#

where can i learn about how training process takes place when we use multiple GPUs?

serene scaffold
serene scaffold
mint palm
tacit basin
#

So at least I know where they are lol

serene scaffold
tacit basin
#

Can you have different python versions installed in different venvs?

serene scaffold
#

I'm not sure if linux has the py launcher. but if I'm on linux, I'm usually on a project-specific VM, and only install the python version I plan to use.

tacit basin
serene scaffold
#

but it's probably basically the same as on linux

#

so, probably with brew. or you could download it from python.org

tacit basin
serene scaffold
young granite
#

good evening data boys and girls,
i want to generate a comparison like r^2 for ifft spectra based on the amplitude and frequency of the resulting curves.
Does one of u know if there is a lib for that?

young granite
#

i was thinking about writing my own r^2 function for both amplitude and freq

young granite
hasty mountain
#

Oh, for light waves?

young granite
#

cant i compare the abs(fft)

young granite
#

UV/VIS

#

all sorts of things

hasty mountain
#

I guess they might do...they have spectrograms in frequency x amplitude, so...

young granite
#

if i keep only 5 freq. i can compare the abs() of them and calculate a score on that i guessed

hasty mountain
#

There might be something that uses frequency x amplitude that could be adapted to electromagnetic waves, I guess...

#

I remember that Pytorch also has a built-in function for ifft, but it's not in torchaudio

young granite
#

@hasty mountain thanks ill give it a try tomorrow

hasty mountain
#

Guys, I'm trying to implement PPO and I'm running into a problem with autograd.
It won't compute the gradients for my policy. I guess the problem is in the surrogate loss, but I don't know why.

possible_actions, true_action, predicted_reward = model(frame, reward_input)

advantage = predicted_reward - reward_input

    # PPO updates the policy based on the previous policy parameters, which can also be seen as the previous policy outputs
    ratio = (possible_actions.retain_grad().argmax()/previous_possible_actions)

    ratio = torch.clamp(ratio, min=0.2, max=0.2)

    surrogate_loss = ratio * advantage

    value_loss = reward_loss(predicted_reward*uncertainty_factor, reward)

    total_loss = surrogate_loss + (value_loss * 0.5)

    total_loss.backward()

The gradients backpropagate through the layers that predict the reward and through the feature extraction layers, but the layer that generates the possible actions specifically has no grads computed.

Using print(possible_actions.grad) gives me this:
UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the .grad field to be populated for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead

But even with possible_actions.retain_grad().argmax() I'm having problems with this. Any tip on how to fix this?
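
Two things in the snippet above would stop gradients from reaching the policy layer. `argmax` returns an integer index, which has no gradient. And `torch.clamp(ratio, min=0.2, max=0.2)` forces every ratio to the constant 0.2, which matches the printed tensor. In the usual PPO formulation, as far as I understand it, the ratio is between the probabilities of the action actually taken under the new and old policy, and it is clipped to the range 1 - ε to 1 + ε. A dependency-free sketch of that objective for a single sample follows; the names are illustrative, and in PyTorch the same expression would be written with tensor ops on log-probabilities.

```python
import math

def clipped_surrogate(logp_new, logp_old, advantage, clip_eps=0.2):
    """PPO clipped objective for one sample (to be maximized).

    logp_new / logp_old: log-probability of the action that was actually
    taken, under the current and the previous policy.
    """
    ratio = math.exp(logp_new - logp_old)
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    # Taking the min makes the objective pessimistic: large policy changes
    # cannot increase it beyond the clipped value.
    return min(ratio * advantage, clipped_ratio * advantage)

# The new policy doubled the action's probability; positive advantage is capped.
print(clipped_surrogate(math.log(0.6), math.log(0.3), advantage=1.0))
```

The loss passed to `backward()` would then be the negative mean of this quantity over the batch, plus the value loss.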

#

Also, when I used print(surrogate_loss, ratio), it shows me that those variables have grads:

tensor([[0.0915, 0.0915, 0.0915, 0.0915, 0.0915, 0.0915, 0.0915, 0.0915, 0.0915,
         0.0915, 0.0915, 0.0915, 0.0915]], device='cuda:0',
       grad_fn=<MulBackward0>)
tensor([[0.2000, 0.2000, 0.2000, 0.2000, 0.2000, 0.2000, 0.2000, 0.2000, 0.2000,
         0.2000, 0.2000, 0.2000, 0.2000]], device='cuda:0',
       grad_fn=<ClampBackward1>)
steel oxide
#

any interesting fun data science project ideas for a relative beginner? (know the theory but haven't done too much in practice)

#

preferably things related to neural networks

hasty mountain
# steel oxide preferably things related to neural networks

Try a classification problem with digits MNIST dataset, then try with CIFAR100.
And try plotting the outputs of your layers into images to see how your network is dealing with the data(this might be easier to do with Pytorch or tensorflow, rather than keras)

#

Then go to DCGAN using the classic CelebA
||And get crazy because GAN things||

hasty mountain
prime hearth
#

Hello how would you guys find the best cut off mark for this type of data, so that anything below x is labeled as 0 and anything above is label as 1?

#

I thought of doing histoplot but i dont see that bell curve shape. So i dont think I can do IQR or normal distribution

#

One way i was thinking is just looking at my dataset and see a cut mark where it makes sense

steel oxide
steel oxide
hasty mountain
#

Thanks for the suggestion, but I managed to solve it. Now I'm just having vanishing gradients

EDIT: Now it's crazy gradients py_guido

hasty mountain
#

Or Transformer

molten hamlet
#

30 books for 17 euro, all ml

#

Wondering if they're worth it, that's a lot of reading

fringe anvil
#

hello. i have a question about RDD with statsmodels and seaborn. im kinda getting where i want to be. but im not sure what the end goal should be. im doing some covid dataset RDD. about the reopening of the schools back in august 2020 (canada)

#

i cant get the dates and the line to show properly.. like a line that breaks.. or follows the scatter more closely

austere swift
#

that plot doesn't look like it would fit well with a linear model

#

you'd be better off using a polynomial function instead, which would be able to better match the curves of the plot

wary citrus
#

This may be more relevant in the #databases channel (to my limited knowledge), but if I were trying to create an entire database of actions in the sport of fencing (i.e. attack, retreat, lunge, etc.) is there any method that is more efficient than having to manually download and edit videos in order to create a few 100-1000 clips? I need to know whether or not there are more efficient methods because it may mean that I may focus on finding a new project that won't require a stupid amount of time 😄

serene scaffold
#

for your reference:
a database is a data store that can be queried.
a dataset is a collection of data that can be used for ML.
the distinction will become more clear to you with time.

hasty mountain
#

I wonder if someone has already tried implementing a RL algorithm that lets the model itself decide its reward

EDIT: I just remembered that this is what ChatGPT does. A separate model generates the reward for each text generated by the text generator...interesting...

hasty mountain
#

Meh...maybe I'll try this someday. I just hope I don't have to waste that much time before realizing if my RL model is inefficient.

warm badge
#

hi , can i know how can i learn python and data science what are the topics i need to complete

latent tundra
#

I am doing binary classification on highly imbalanced data. I have a good baseline model that gets 0.9 predictive equality at 50% True Positive Rate (TPR). I now want to do hyperparameter tuning to get the highest TPR at a predictive equality above 0.8. How can that be done?

latent tundra
#

I thought about the following:
Create two keras_tuner.Objectives:

  1. abs(0.8-predictive_equality), min
  2. TPR, max

Would that lead to what I think it would lead to or is my understanding of objectives wrong

north barn
latent tundra
#

Thats why I use the second metric predictive equality that prohibits such behaviour

north barn
#

otherwise there's a free variable

#

namely the scaling factor between 1 and 2

sharp anchor
#

hey there! can anyone help me with dqn using the keras package using it for an openai gym custom environment

sharp anchor
#

Thanks man! so am getting dimension based errors. In my custom environment the state space is a 4d. state = (fes_pos, Uav_pos, uav_energy, lost_person)
here, fes_pos , uav_pos and lost_person are lists of x and y coordinates whereas uav_energy is a 1d list

#

this is my model
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.optimizers import Adam

states = env.observation_space.shape
actions = env.action_space.n
actions

def build_model(states, actions):
    model = Sequential()
    model.add(Dense(24, activation='relu', input_shape=(4,)))
    model.add(Dense(24, activation='relu'))
    #model.add(Flatten())
    model.add(Dense(actions, activation='linear'))
    return model

model = build_model(states, actions)

#

when i try to fit the model
dqn.fit(env, nb_steps=50000, visualize=False, verbose=1)
getting this error
ValueError: Error when checking input: expected dense_32_input to have 2 dimensions, but got array with shape (1, 1, 4)

north barn
#
states = env.observation_space.shape
#

build_model usually takes shapes as input?

#

ok

#

then you need to flatten or use reshape

#

flatten before you pass it to a dense layer

#

at the start

#

ie right after Sequential()

sharp anchor
#

I've tried adding Flatten, but somewhere in the middle. Okay, let me add it at the beginning

#

hey should i add it after sequential or after the first layer
" model.add(Dense(24, activation='relu', input_shape=(4,)))"
because im getting this error :/
ValueError: This model has not yet been built. Build the model first by calling build() or by calling the model on a batch of data.

unique ridge
#

Hey there, I have the following code:

p1 = Pipeline([('Lineair Regression', LinearRegression())])
p2 = Pipeline([("Scaler", StandardScaler()), ('Lineair Regression', LinearRegression())])

p1.fit(greenhouse_X_tr, greenhouse_y_tr)
test_preds = p1.predict(greenhouse_X_v) 
test_score = p1.score(greenhouse_X_v, greenhouse_y_v)
print('p1: ' + str(test_preds))
print('p1: ' + str(test_score))

p2.fit(greenhouse_X_tr, greenhouse_y_tr)
test_preds1 = p2.predict(greenhouse_X_v) 
test_score1 = p2.score(greenhouse_X_v, greenhouse_y_v)
print('p2: ' + str(test_preds1))
print('p2: ' + str(test_score1))

When printing the results, the StandardScaler doesn't seem to be applied: both pipelines give the same results.

p1: [19.06856434 18.27648482 20.56294229 ... 23.65166504 19.11782744
 21.71606506]
p1: 0.9883787684006002
p2: [19.06856434 18.27648482 20.56294229 ... 23.65166504 19.11782744
 21.71606506]
p2: 0.9883787684006001

https://michael-fuchs-python.netlify.app/2021/05/11/machine-learning-pipelines/
https://scikit-learn.org/stable/auto_examples/compose/plot_column_transformer_mixed_types.html
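
Identical predictions are actually what you would expect here: ordinary least squares is unaffected by standardizing the features, because the fitted coefficients and intercept simply rescale to compensate. The scaler is being applied, it just cannot change the result for a plain linear regression. A one-feature sketch with made-up data that shows this without scikit-learn:

```python
from statistics import mean, pstdev

def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return slope, y_bar - slope * x_bar

xs = [10.0, 20.0, 30.0, 40.0, 50.0]   # made-up feature values
ys = [19.1, 18.3, 20.6, 23.7, 21.7]   # made-up targets

# Fit on the raw feature.
slope, intercept = fit_line(xs, ys)
raw_preds = [slope * x + intercept for x in xs]

# Fit on the standardized feature (zero mean, unit population std),
# which is the per-column transform StandardScaler applies.
mu, sigma = mean(xs), pstdev(xs)
zs = [(x - mu) / sigma for x in xs]
z_slope, z_intercept = fit_line(zs, ys)
scaled_preds = [z_slope * z + z_intercept for z in zs]

print(slope, z_slope)  # the coefficients differ...
print(all(abs(a - b) < 1e-9 for a, b in zip(raw_preds, scaled_preds)))  # ...the predictions do not
```

Scaling starts to matter once the model is regularized (Ridge, Lasso) or distance-based (KNN, SVM), so the pipeline is still worth keeping if you plan to swap the estimator.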

north barn
#

you can pass input shape to flatten layer

#

i think

sharp anchor
#

okay added this after sequential()
model.add(Flatten(input_shape=(4,)))
the model could be built now. but still getting this error
ValueError: Error when checking input: expected flatten_9_input to have 2 dimensions, but got array with shape (1, 1, 4)

#

is my input_shape argument ok? :/

north barn
#

input_shape=(1,4)

ripe sapphire
#

No, the argument input_shape should have 2 dimensions, so it will be input_shape = (4, 1)

latent tundra
north barn
toxic coral
#

A = ((2,2),(3,10),(5,5))
print(max(A))
How would this piece of code compute?
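
Tuples are compared lexicographically, so `max` looks at the first elements and only uses the second to break ties. A short sketch, including two `key=` variants in case a different notion of "largest" was intended:

```python
A = ((2, 2), (3, 10), (5, 5))

# Tuples compare element by element: first items first, ties broken by the next.
print(max(A))                       # (5, 5)
print(max(A, key=lambda t: t[1]))   # (3, 10), largest second element
print(max(A, key=sum))              # (3, 10), largest sum
```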

north barn
#

like you could have obj2 - lambda * obj1

#

lambda is the free variable
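
A sketch of what the `obj2 - lambda * obj1` idea could look like as a single score to maximize. It is a variant rather than exactly what was proposed: it penalizes only the shortfall below the 0.8 target instead of `abs(0.8 - predictive_equality)`, since the stated goal was a predictive equality above 0.8. The function name and the default penalty are assumptions.

```python
def scalarized_score(tpr, predictive_equality, target=0.8, penalty=10.0):
    """Single objective to maximize: TPR minus a penalty for missing the target.

    `penalty` plays the role of lambda in obj2 - lambda * obj1.
    """
    # No penalty once predictive equality reaches the target.
    shortfall = max(0.0, target - predictive_equality)
    return tpr - penalty * shortfall

print(scalarized_score(tpr=0.5, predictive_equality=0.9))
print(scalarized_score(tpr=0.7, predictive_equality=0.75))
```

With a large enough penalty, configurations below the target are effectively ruled out and the tuner is left to maximize TPR among the rest. The scaling factor still has to be chosen by hand, which is the free variable mentioned above.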

latent tundra
#

mmh, seems like a good idea. I have to look into it

sharp anchor
#

after it tried to fit the model again

north barn
#

if its a standard gym problem

sharp anchor
#

okay sure!

unique ridge
arctic wedgeBOT
#

Hey @sharp anchor!

It looks like you tried to attach file type(s) that we do not allow (.ipynb). We currently allow the following file types: .gif, .jpg, .jpeg, .mov, .mp4, .mpg, .png, .mp3, .wav, .ogg, .webm, .webp, .flac, .m4a, .csv, .json.

Feel free to ask in #community-meta if you think this is a mistake.

toxic coral
#

A = ((2,2),(3,10),(5,5))
print(max(A))
How would this piece of code compute?

arctic wedgeBOT
north barn
#

@sharp anchor what has input shape (2, 1)

sharp anchor
#

i dont suppose ive given anything with a (2,1) dimension?! can you point out which line youre talking about :l

young granite
worn stratus
wooden sail
dusty valve
#

What's the math behind gradient descent in deep learning? I learnt it but don't really understand

fierce kiln
#

It's just some simple calculus.

#

According to the calculated loss, the algorithm simply updates the values of the weights and biases in the direction that reduces the loss, so that it converges towards a minimum.
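
In code, the update rule is just "step against the gradient". A toy sketch on a one-parameter loss; real networks do the same thing for every weight, with the gradient coming from backpropagation, and the learning rate and step count here are arbitrary.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient: w <- w - learning_rate * grad(w)."""
    w = start
    for _ in range(steps):
        w = w - learning_rate * grad(w)
    return w

# Toy loss L(w) = (w - 3)^2, whose gradient is 2 * (w - 3); the minimum is at w = 3.
w_min = gradient_descent(lambda w: 2 * (w - 3), start=0.0)
print(round(w_min, 4))
```

If the learning rate is too large the steps overshoot and the loss can diverge; too small and convergence is slow.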

#

Gradient Descent is the workhorse behind most of Machine Learning. When you fit a machine learning method to a training dataset, you're probably using Gradient Descent. It can optimize parameters in a wide variety of settings. Since it's so fundamental to Machine Learning, I decided to make a "step-by-step" video that shows you exactly how it wo...

#

This might help.

boreal cape
#

hey anyone here use bert

serene scaffold
boreal cape
#

How to fine tune a pre trained model of bert for text to vector conversion

serene scaffold
boreal cape
#

I have a dataset of text

#

Initially I just used transfer learning to generate the vectors

serene scaffold
boreal cape
#

yup

serene scaffold
#

what will the fine-tuned BERT do that is different from what BERT already does?

boreal cape
#

I get those vectors and then pass them through different machine learning algorithms like svm for classification

#

By fine-tuning BERT I am hoping to generate more accurate vectors

serene scaffold
boreal cape
#

aranesp ug
1 aranesp ug
2 aranesp ug
3 problems with arterial cannula had to discard ...
4 aranesp ug
... ...
14866 arenesp
14867 aranesp, stopped early due to bleeding
14868 iron and rnsp infussion
14869 very good
14870 good

#

I have this dataset

#

I wanna fine tune bert

serene scaffold
boreal cape
#

model = BertModel.from_pretrained('bert-base-uncased',output_hidden_states = True,)

#

oh the ouput is just yes and no

serene scaffold
#

what does the yes or no mean?

boreal cape
#

its just the class of the sentence

serene scaffold
#

so you're trying to use BERT as a classifier?

boreal cape
#

its basically a classification problem

#

no I am trying to use bert to generate vectors

#

then pass the vectors through different classifiers

#

@serene scaffold

serene scaffold
#

okay, so you're using vectors produced by BERT without fine-tuning BERT. or you're fine-tuning BERT to use it as a classifier.

#

it might be that if you train BERT as a classifier, but then don't use the classification layer, then vectors from the next-to-last layer would be closer together for inputs that belong to the same class.

young granite
#

but yeh im not yet convinced what is best for my approach of comparison

wooden sail
#

that's basically all you need to know about deterministic gradient descent. some stats is needed for the stochastic flavor

serene scaffold
wooden sail
serene scaffold
#

this is notebook usage I can get behind

young granite
#

so difference of predicted/observed

wooden sail
#

MSE is a commonly used performance indicator

young granite
wooden sail
#

that they measure different things

young granite
#

or numpy sqrt

wooden sail
#

MSE measures distance, coherence measures similarity as an angle

#

MSE measures orthogonal distance, at that

young granite
#

i struggle a bit to find a good approach in classifying my predicted curves vs observed ones

#

i think they are all suited for comparison but i dunno what is "best in slot"- if one wants to call it like that

#

coherence is good to show full overview

wooden sail
#

unfortunately there is no "best approach" here, it depends on what you wanna emphasize

young granite
#

MSE gives a good overall value for curve vs curve

keen notch
#

hey does anyone see the error here

young granite
wooden sail
arctic wedgeBOT
#

Hey @keen notch!

You either uploaded a .txt file or entered a message that was too long. Please use our paste bin instead.

keen notch
wooden sail
keen notch
#

can't see where this would fix it

young granite
wooden sail
#

well, this would certainly be one 😛

young granite
#

😛

wooden sail
#

1 good, 0 (and negative) bad

young granite
#

would u say its a sufficient approach to describe the relationship of 2 curves

wooden sail
#

it's a very common one

#

it ignores small errors. if that's ok, then yeah
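
To make the difference between these measures concrete, here is a sketch that computes MSE, R^2 and an angle-based similarity for two curves. Cosine similarity is used as a stand-in for the angle-based measure discussed above, which is an assumption and not necessarily the same quantity as spectral coherence. The data is made up.

```python
import math

def mse(observed, predicted):
    """Mean squared error: average squared pointwise distance."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)

def r_squared(observed, predicted):
    """1 - SS_res / SS_tot; 1 is a perfect fit, 0 or below is no better than the mean."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot

def cosine_similarity(a, b):
    """Cosine of the angle between two curves treated as vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

observed = [1.0, 2.0, 3.0, 4.0]
scaled = [2.0, 4.0, 6.0, 8.0]   # same shape, twice the amplitude
print(mse(observed, scaled), r_squared(observed, scaled), cosine_similarity(observed, scaled))
```

A curve with the right shape but the wrong amplitude scores badly on MSE and R^2 yet perfectly on the angle-based measure, which is why the choice depends on what you want to emphasize.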

keen notch
young granite
#

if i split the complex value into .real and .imag part and calc. each score individually, the resulting r^2's differ quite a bit. does that only imply symmetry mismatching?

sturdy glade
#

Hi, i'm trying scrape with selenium but i receive this message
"NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"//a[@class='ui_button nav next primary ']"}"
, Any suggestion ? Thanks

hard sluice
clear blaze
#

I tried a couple courses online for datascience, hyperskill, dataquest, pluralsight, 365 data science (trial )

#

I really like the interactivity, or "by doing" learning and text of dataquest.

#

I like hyperskill's integration with intellij, however the courseware is lacking, questions are a bit... what on earth?

#

I'd like to figure out how to autosync dataquest, and use intellij as my editor while learning

hasty mountain
#

@wooden sail since you're a mathmagic, can you help me with a crazy trick?

I want to make a Reinforcement Learning model that will predict every possible action given a certain state, and will also predict the reward for every possible action it might take in that state, sum those rewards and get the mean.
In order to preserve my hardware, I've divided my commands between 3 input types: command_type, action1 and action2.

Each input type will have its own feedforward layer to output the probability distribution.
However, I don't know exactly how to calculate the reward for each input type without having to create a monster variable var = command_type * action1 * action2 and having to iterate through each one.

I'm currently testing, by hand, the possibility of total_reward = reward_command_type + reward_action1 + reward_action2 and I noticed that, if the length of every input type is 2, I'll have 2*2*2 = 8 possibilities, and, if I mark each possibility by hand, I noticed that each input type repeats itself 4 times (Possibilities divided by input length, perhaps?)

Problem is...I don't know how to continue after this

wooden sail
#

i don't know anything about reinforcement learning tbh :x but wouldn't the possible actions given a state be something deterministic? why would it need to be predicted?

hasty mountain
#

More or less. Depends on how the policy is trained. I'm using PPO.

But knowing RL is kinda irrelevant for the question. My problem is more with the math

#

I was thinking about

avg_command_type = reward_command_typeA + reward_command_typeB
avg_action1 = reward_action1A + reward_action1B
avg_action2 = reward_action2A + reward_action2B

avg_rewards = avg_command_type + avg_action1 + avg_action2
#

Makes sense to me, at least.

Taking the mean reward for all command_types, the mean reward for all actions1 and the mean for all actions2.

I just don't know if, after summing those means, would make sense to take the mean of those sums or not

wooden sail
#

off the top of my head, if i understand correctly, it's really an expected payoff. so it would be the sum of payoff for action i * probability of taking action i

hasty mountain
#

So, instead of having to deal with a monster, I'm thinking about simply taking the mean reward for each input type separately, then summing those mean rewards up to get the final, expected reward.

wooden sail
#

how do the inputs interact with each other? or what are they

hasty mountain
#

the command_type determines whether the controller is the keyboard or the mouse.
The action1 determines whether to press or release a keyboard key(determined by command_type) or it's an X coordinate if the controller is the mouse.
The action2 determines the keyboard key specifically or an Y coordinate.

wooden sail
#

and you get several of these, are they supposed to be sequential or?

hasty mountain
#

Nah, they're a probability distribution

wooden sail
#

then i'm not sure what the problem is :x

#

based on the current state, a probability is assigned to each input independently, with the only restriction that the probabilities add up to 1 (which you can do as a normalization step afterwards)

#

or what did i miss

#

it's sleepy time for me so i might not be understanding your problem

hasty mountain
#

Let me see...

if action1 = [x for x in range(1000)]
There's 1000 possible X coordinates, then the probability distribution is got through:

probs = feedforward_layer(inputs, len(action1))
output = softmax(probs)

wooden sail
#

mhm

hasty mountain
#

output is a probability distribution over all possible action1, right?

wooden sail
#

no

#

softmax picks 1 output

hasty mountain
wooden sail
#

softmax is a smooth approximation to argmax

#

it'll pick the event with highest probability

#

so output is either 1 action, or an array that tells you the index of the action you should pick

#

probs is the probability dist

hasty mountain
#
import torch

test = torch.randn((1, 100))
softmax = torch.nn.Softmax(-1)

output = softmax(test)

print(output.size())

.eval()

#

Uh...what's the command?

wooden sail
#

hmm?

arctic wedgeBOT
#
Missing required argument

code

wooden sail
#

idk if the bot has pytorch

hasty mountain
#

!eval

import torch

test = torch.randn((1, 100))
softmax = torch.nn.Softmax(-1)

output = softmax(test)

print(output.size())
#

Uh...

wooden sail
#

anyway, softmax does not yield a probability

#

it yields a 1hot vector

#

(based on a probability or something similar)

hasty mountain
wooden sail
#

as i said, a smooth approximation though

wooden sail
#

right

hasty mountain
#

It's enough, I guess

wooden sail
#

an approximation to argmax 🙂

#

not a pdf

hasty mountain
#

Anyway, this is how it's implemented, so it's kinda prob distribution

wooden sail
#

it isn't

hasty mountain
#

And I want to get the avg reward for every single possibility

wooden sail
#

you don't even need the softmax for that

#

all you need is the dot product of the payoff vector with the probabilities vector

#

the softmax is not a pdf

hasty mountain
#

Ok, then let me correct myself:
the output is supposed to be the probabilities of the actions that can be taken, from the worst one to the best one.
Is this better?

wooden sail
#

nope

#

you had probabilities before taking softmax

#

once you took softmax, this has nothing to do with probabilities anymore

hasty mountain
#

How I hate Sutton and Barto for creating different terms for RL...

wooden sail
#

copy paste here what they wrote

hasty mountain
#

I'm currently using a code example in tensorflow


class Policy_net:

    def __init__(self, name: str, sess, ob_space, act_space, activation=tf.nn.relu, units=64):
        """
        :param name: string
        """
        self.sess = sess
        with tf.variable_scope(name):
            self.obs = tf.placeholder(dtype=tf.float32, shape=[None, ob_space], name='obs')
            with tf.variable_scope('policy_net'):
                layer_1 = layer.dense_layer(self.obs, units, "DenseLayer1", func=activation)
                layer_2 = layer.dense_layer(layer_1, units, "DenseLayer2", func=activation)
                self.act_probs = layer.dense_layer(layer_2, act_space, "DenseLayer4", func=tf.nn.softmax)

                if P.use_dual_policy_value:
                    self.v_preds = layer.dense_layer(layer_2, 1, "DenseLayer5", func=None)
                else:
                    with tf.variable_scope('value_net'):
                        layer_1 = layer.dense_layer(self.obs, units, "DenseLayer1", func=activation)
                        layer_2 = layer.dense_layer(layer_1, units, "DenseLayer2", func=activation)
                        self.v_preds = layer.dense_layer(layer_2, 1, "DenseLayer5", func=None)                    

            self.act_stochastic = tf.multinomial(tf.log(self.act_probs), num_samples=1)
            self.act_stochastic = tf.reshape(self.act_stochastic, shape=[-1])

            self.act_deterministic = tf.argmax(self.act_probs, axis=1)

            self.scope = tf.get_variable_scope().name
#

Heads up for the self.act_probs = layer.dense_layer(layer_2, act_space, "DenseLayer4", func=tf.nn.softmax)

wooden sail
#

well, it's a pdf in the sense that it adds up to 1, sure

#

but this is enforcing a prior

#

it can only spit out limited flavors of pdfs if you choose that as an activation funct for the last layer of something predicting a pdf

#

seems pretty restrictive

hasty mountain
#

Anyway, how can I get the average reward for every possible action?

#

Any tip?

wooden sail
#

do you know the payoffs for the actions?

hasty mountain
#

No, they'll be predicted by the model

#

(The "value" network)

wooden sail
#

well, if you predict them, you'll have a vector of payoffs

#

if these are in the same order as the vector of probabilities, the average payoff is their dot product
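
The dot product mentioned above can be sketched as follows. The probabilities and payoffs are hypothetical numbers chosen for illustration, not values from the model being discussed.

```python
def expected_payoff(probs, payoffs):
    """Expected payoff: sum over actions of P(action) * payoff(action)."""
    if len(probs) != len(payoffs):
        raise ValueError("probs and payoffs must have the same length and order")
    return sum(p * r for p, r in zip(probs, payoffs))

# Hypothetical example with three actions.
probs = [0.5, 0.25, 0.25]    # must sum to 1
payoffs = [1.0, 2.0, 4.0]    # predicted reward of each action, same order

print(expected_payoff(probs, payoffs))  # 2.0
```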

hasty mountain
#

Yes, but the thing is that I have too many actions, divided in 3 (command_type, action1 and action2).

#

If I use all of them at once, I'll have more than 4 million possibilities, and the reward must be predicted based on each possibility

wooden sail
#

ah, i see where we had the misunderstanding. according to wikipedia, reinforcement learning usually uses extra parameters in the denominator of the exponent in the softmax, so this works like regulating the variance of pdfs in the exponential family. bleh
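
A minimal sketch of a softmax with a temperature parameter, assuming the usual definition exp(z_i / T) / sum_j exp(z_j / T). The logits are made up. With T = 1 the output is non-negative and sums to 1, so it can be read as a distribution over discrete actions; as T shrinks it approaches a one-hot vector, which is the sense in which softmax is a smooth approximation to argmax.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with temperature; subtracts the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    largest = max(scaled)
    exps = [math.exp(z - largest) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]                    # hypothetical network outputs
soft = softmax(logits)                      # spread-out probabilities
sharp = softmax(logits, temperature=0.01)   # nearly one-hot on the largest logit

print(soft, sum(soft))
print(sharp)
```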

hasty mountain
#

This is why I want to try to predict the reward for each command_type, for each action1, for each action2 and manipulate them so I can get the average reward for every possible action, without having to deal with 4 million possibilities directly

wooden sail
#

sadly the probabilities here do depend on each other. if you compute the pdfs by splitting the actions into disjoint groups, you have no guarantee you'll get the same pdf you would have gotten if you feed all of them at once

#

i don't have a good answer for you

hasty mountain
wooden sail
#

this is a joint pdf, after all

#

or hmm

hasty mountain
#

Well, if the predicted command type is a keyboard command, and the action1 refers to a mouse, the model will simply do nothing, if this helps with anything.

#

The idea is to optimize the model so this won't be happening over time

#

I'll even use some Supervised Learning before the actual Reinforcement Learning to prevent this.

#

Hm... I tried considering 2 possible command_types, one with reward 0.5 and the other with reward 1. Also 2 action1 and 2 action2, with rewards 0.25, 0.5 and 0.25, 0.75

I've taken the average reward for command_types, for action1 and for action2 and then summed the 3 average rewards to get the total payoff.

But, unfortunately, the result was different than when I summed every possible combination and took the mean

wooden sail
#

yep

#

i can't think of any feasible way off the top of my head

hasty mountain
#
C1, A1, A2 = 1
C1, A1, B2 = 1.5
C1, B1, A2 = 1.25
C1, B1, B2 = 1.75
C2, A1, A2 = 1.5
C2, A1, B2 = 2
C2, B1, A2 = 1.75
C2, B1, B2 = 2.25
sum = 13
avg = 1.625

C_avg = 0.75
1_avg = 0.5
2_avg = 0.625
sum_avg = 1.875
#

Do you know if I can mitigate this deviation in the averages somehow?
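
A quick check with the reward values quoted earlier (0.5 and 1 for command_type, 0.25 and 0.5 for action1, 0.25 and 0.75 for action2) suggests the two approaches agree when the total reward is a plain sum and every combination is equally likely. Recomputed, the per-group means are 0.75, 0.375 and 0.5, which also add up to 1.625. This is only a sketch under those two assumptions; it says nothing about rewards that depend on the combination as a whole, or about a joint distribution that does not factor.

```python
from itertools import product

command_rewards = [0.5, 1.0]
action1_rewards = [0.25, 0.5]
action2_rewards = [0.25, 0.75]

def mean(values):
    return sum(values) / len(values)

# Brute force: average the summed reward over all 2 * 2 * 2 combinations.
combos = list(product(command_rewards, action1_rewards, action2_rewards))
mean_of_sums = mean([c + a1 + a2 for c, a1, a2 in combos])

# Factored: average each group on its own, then add the three means.
sum_of_means = (
    mean(command_rewards) + mean(action1_rewards) + mean(action2_rewards)
)

print(mean_of_sums)   # 1.625
print(sum_of_means)   # 1.625
```

The factored version needs 2 + 2 + 2 reward evaluations instead of 8, and the gap grows quickly with larger groups.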

wooden sail
#

expectation can be represented as an integral. if you come up with an interesting parametrization, you could replace expectation with a handful of montecarlo trials (c.f. montecarlo integration)

#

but yeah just arbitrarily picking chunks and adding them up isn't gonna work

#

best of luck with that, i need to sleep

hasty mountain
#

Ugh... Maybe if I can mitigate the error, somehow? Make it the lowest possible?

#

Okay, sweet dreams

wooden sail
#

then you're basically training a network to predict the expected payoff. you can try that too, idk what you'd use to train it though

hasty mountain
#

The idea is to sample one action, get the reward for that action and check how advantageous taking that action is, compared to the average reward of the other actions

hazy saddle
#

Hi everyone, I'm following a tutorial to train a neural network for nlp. I'm using google colab. I'm running out of memory in this part:

max_len_sequence = max([len(x) for x in input_sequences])
input_sequences = np.array(pad_sequences(input_sequences, maxlen=max_len_sequence, padding='pre'))

total_words = len(tokenizer.word_index)+1
predictors, label = input_sequences[:,:-1], input_sequences[:,-1]
label = ku.to_categorical(label, num_classes= total_words)

I believe the problem is when I use keras.utils.to_categorical.

Any ideas to solve this problem?

mild dirge
#

You could test this out by commenting parts out until you do not get the error, but the problem is probably that total_words is pretty big

#

Maybe a few thousand or even more, and the amount of labels you have might also be a lot

#

And the size of the resulting matrix is the product of these two integers, which might be really really big

#

So that is why you would not have enough memory

#

to_categorical probably one-hot encodes it, you might want to look into other methods of encoding the words

#

You could also convert the labels to one hot encodings in batches instead of all at once
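
A back-of-the-envelope sketch of the size argument above. The label count and vocabulary size are hypothetical, and 4 bytes per entry assumes the one-hot matrix is stored as float32.

```python
def one_hot_bytes(n_labels, n_classes, bytes_per_value=4):
    """Memory for a dense one-hot matrix of shape (n_labels, n_classes)."""
    return n_labels * n_classes * bytes_per_value

n_labels = 500_000     # hypothetical number of training sequences
total_words = 20_000   # hypothetical vocabulary size

full = one_hot_bytes(n_labels, total_words)   # everything encoded at once
batch = one_hot_bytes(64, total_words)        # one batch of 64 labels

print(f"{full / 1e9:.1f} GB")   # 40.0 GB
print(f"{batch / 1e6:.1f} MB")  # 5.1 MB
```

Another option, if the framework supports it, is a loss that takes integer class indices directly (Keras has `sparse_categorical_crossentropy`), which avoids building the one-hot matrix at all.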

hazy saddle
mild dirge
hazy saddle
mild dirge
#
max_len_sequence = max([len(x) for x in input_sequences])
input_sequences = np.array(pad_sequences(input_sequences, maxlen=max_len_sequence, padding='pre'))

total_words = len(tokenizer.word_index)+1
predictors, label = input_sequences[:,:-1], input_sequences[:,-1]

# Edited part
batch_size = 64
for i in range(0, len(label), batch_size):
    predictors_batch = predictors[i:i + batch_size]
    label_batch = label[i:i + batch_size]
    label_batch = ku.to_categorical(label_batch, num_classes=total_words)
    # Do stuff with the batch of features and labels
#

Something like this

#

Typically you do not train the network with all of the data you have at once, since sometimes you have many gigabytes of data

stray nymph
#

does anyone know pulp optimization?

serene scaffold
stray nymph
#

im trying to do a pulp optimization model for reorder quantities

#

but im not sure how to work with it

#

Stock < Predictions:
Reorder Amount = Prediction - Stock

When Prediction > Stock: Use from Safety Stock
Reorder Amount = Stock + (Safety Stock - Prediction)

Prediction > Stock + Safety Stock:
Reorder Amount = Prediction
Increase Safety Stock level = Prediction - Stock

#

im trying to generate a reorder amount

#

but i cant really differentiate the objective function and the constraints

#

most examples that i find online are very straightforward while mine feels very conditional

#

import pulp as pl

''' SOLVER SETUP '''
date = data.index
sales = data['Sum']
stock = data['Stock (Yearly)']
safety = data['Safety Stock (Monthly)']
reorder = data['Reorder Level (Yearly)']
predictions = datapred['Predictions']


# Create a variable for the reorder quantities
x = pl.LpVariable.dicts('Date', date, 0, None, pl.LpInteger)

''' SET THE OBJECTIVE FUNCTION '''
problem = pl.LpProblem('Reorder Level', pl.LpMinimize)


'''CONSTRAINTS'''

if stock<predictions:
    problem+= predictions - stock

elif predictions>stock:
    problem+= stock+(safety-predictions)

elif predictions > stock + safety:
    problem += predictions
#

this is what i have right now

#

but im pretty sure its completely off

#

data is my dataset
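
One way to separate the objective from the constraints, as a sketch only: treat the reorder amount x as the decision variable, make "order as little as possible" the objective (minimize x), and make "cover the forecast" the constraint (stock + x >= prediction, x >= 0). For a single product and period that small linear program has the closed-form answer max(0, prediction - stock), so it can be checked without a solver. The numbers are hypothetical, and the safety-stock rules are deliberately left out because the first two conditions as listed overlap.

```python
def reorder_quantity(stock, prediction):
    """Smallest non-negative order x such that stock + x >= prediction.

    Equivalent LP: minimize x  subject to  stock + x >= prediction, x >= 0.
    In PuLP terms the objective would be added as `problem += x` and the
    constraint as `problem += stock + x >= prediction`.
    """
    return max(0, prediction - stock)

# Hypothetical per-month values for one product.
stock = [120, 80, 50]
predictions = [100, 95, 140]

reorder = [reorder_quantity(s, p) for s, p in zip(stock, predictions)]
print(reorder)  # [0, 15, 90]
```

Python `if`/`elif` on whole columns cannot express this inside a solver model; each condition has to become either a constraint on a variable or a value computed before the model is built.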

serene scaffold
#

please edit your code sample to show the import statements.

stray nymph
#

can i just show a portion of my dataset

#

w a screenshot

serene scaffold
#

the import statements are the ones that start with import or from.

stray nymph
#

like that?

serene scaffold
#

the only one that was needed was import pulp as pl, since that's the only one that you use in the code.

#

think of it from the perspective of someone looking at your code sample: pl isn't defined, which means it could be literally anything, unless you show in the code where it comes from.

stray nymph
#

oh

#

sorry

serene scaffold
#

that's okay. now you know.

stray nymph
#

thank you for telling me

serene scaffold
#

you are welcome

stray nymph
#

datapred

#

is there a way to "merge" these 2 dataframes but only move one column from datapred (Predictions) to data

serene scaffold
#

"merge" is an official thing in pandas.

stray nymph
#

sku and date

serene scaffold
#

so you'd do data.merge(datapred, on=['SKU', 'Date'])

stray nymph
#

You are trying to merge on datetime64[ns] and object columns. If you wish to proceed you should use pd.concat

serene scaffold
#

sounds like your Date column has different types in each df.

#

also it looks like the date is the index of data

drifting wagon
#

anyone know how to avoid Can't convert non-rectangular Python sequence to Tensor.
without losing data? If i understand it correctly tensor is trying to create dataset from sentences with different length which it cant use, so i need to either lobotomise sentences to correct length or filter them out.
and i wouldn't like either option
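
A third option that keeps every token is to pad the shorter sentences up to the longest one and carry a mask that marks which positions are real. The sketch below is plain Python with made-up token ids; TensorFlow also offers ragged tensors (`tf.ragged.constant`) for variable-length data, which may fit better depending on the layers used.

```python
def pad_with_mask(sequences, pad_value=0):
    """Right-pad every sequence to the longest length; nothing is truncated."""
    max_len = max(len(seq) for seq in sequences)
    padded, mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        padded.append(list(seq) + [pad_value] * n_pad)
        mask.append([1] * len(seq) + [0] * n_pad)  # 1 = real token, 0 = padding
    return padded, mask

sentences = [[4, 8, 15], [16, 23], [42]]  # hypothetical token ids
padded, mask = pad_with_mask(sentences)

print(padded)  # [[4, 8, 15], [16, 23, 0], [42, 0, 0]]
print(mask)    # [[1, 1, 1], [1, 1, 0], [1, 0, 0]]
```

The padded list is rectangular, so it converts to a tensor, and the mask lets later layers or the loss ignore the padding.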

serene scaffold
midnight kayak
#

I want to build a basic ml app using TF just to get some exposure. My data are objects containing strings and ints. I just want the user to type some characteristic contained within the objects and the AI to spit out the correct object. Is there a good tutorial for this?

serene scaffold
#

knowing the types of the data you have is important. but what they represent matters. that's why it's AI.

stray nymph
#

is there a way to convert datetime back to an object

serene scaffold
stray nymph
#

im not sure how to merge

serene scaffold
#

!docs pandas.to_datetime

arctic wedgeBOT
#

pandas.to_datetime(arg, errors='raise', dayfirst=False, yearfirst=False, utc=None, format=None, exact=True, unit=None, infer_datetime_format=False, origin='unix', cache=True)
Convert argument to datetime.

This function converts a scalar, array-like, [`Series`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.html#pandas.Series "pandas.Series") or [`DataFrame`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html#pandas.DataFrame "pandas.DataFrame")/dict-like to a pandas datetime object.
serene scaffold
#

use this to make sure that all your strings (which are objects) that look like timestamps are actual datetimes.
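
Putting the pieces together, a minimal sketch with two tiny made-up frames: `data` has the date as its index, `datapred` stores dates as strings, and only the `Predictions` column should be carried over. The column names follow the conversation; the values and the extra `Other` column are hypothetical.

```python
import pandas as pd

# Hypothetical miniature versions of the two frames.
data = pd.DataFrame(
    {"SKU": ["A", "A"], "Stock": [10, 12]},
    index=pd.DatetimeIndex(["2023-01-01", "2023-02-01"], name="Date"),
)
datapred = pd.DataFrame(
    {
        "SKU": ["A", "A"],
        "Date": ["2023-01-01", "2023-02-01"],  # strings, i.e. object dtype
        "Predictions": [11.0, 9.5],
        "Other": [0, 0],
    }
)

# Make both Date keys real datetimes, and turn the index into a column.
datapred["Date"] = pd.to_datetime(datapred["Date"])
merged = data.reset_index().merge(
    datapred[["SKU", "Date", "Predictions"]],  # keys plus the one wanted column
    on=["SKU", "Date"],
    how="left",
)
print(merged)
```

Selecting the key columns plus `Predictions` before merging is what keeps the other columns of `datapred` out of the result.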

midnight kayak
#

sorry not ints but floats rather

serene scaffold
arctic wedgeBOT
#

Here's how to format Python code on Discord:

```py
print('Hello world!')
```

These are backticks, not quotes. Check this out if you can't find the backtick key.

midnight kayak
#
Airport = TypedDict(
    'Airport',
    {
        'icao': str,
        'iata': str,
        'name': str,
        'city': str,
        'subd': str,
        'country': str,
        'elevation': float,
        'lat': float,
        'lon': float,
        'tz': str,
        'lid': str,
    },
)
verbal venture
#

hey, so does the kernel of a CNN just basically extract the features of an image? and then later down the architecture, you add all the features (neurons) together, to recreate a meaningful portion of the image?

serene scaffold
#

though I wouldn't say that subsequent layers "recreate meaningful portions"

hasty mountain
#

The neurons kinda serve for the network to analyze the features it has available and then take a conclusion about them

verbal venture
#

so basically you're guessing the important parts of the image (features), reconstructing it down the line, and then using your final image to compare to the original image or whatever input data you're getting?

serene scaffold
#

the kernel of a convolutional layer decides what that layer is looking for. and then subsequent layers figure out what to do with that information.

#

nothing is getting "reconstructed" or "recreated".
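
To make "the kernel decides what the layer is looking for" concrete, here is a sketch with a hand-written vertical-edge kernel on a tiny made-up image. In a real CNN the kernel values are learned rather than chosen by hand, and the operation shown is the cross-correlation that deep learning libraries call convolution.

```python
def convolve2d_valid(image, kernel):
    """Slide the kernel over the image (no padding) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

# Hypothetical 4x6 image: dark on the left, bright on the right.
image = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d_valid(image, vertical_edge)
print(feature_map)  # [[0, 3, 3, 0], [0, 3, 3, 0]]
```

The output is a map of where the pattern occurs (large values near the edge, zero elsewhere), not a copy of the image, and later layers work on maps like this one.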

hasty mountain
#

If you have a dataset composed of dog images, the conv layers will extract the most relevant features(what is a remarkable trait dogs use to show?), then the neuron layer will analyze those features and conclude if it's a dog or not

verbal venture
serene scaffold
#

comparison between what things?

verbal venture
#

sorry, does the final layer compare itself to the original image or the input image

hasty mountain
#

Nah, that's a GAN, not a simple CNN

serene scaffold
#

it doesn't go without saying that the final layer of a neural network involves comparisons.

verbal venture
#

if you have a dataset of 10k dog images and 1 image you're using to see if it's a dog, after all the convolutions, does it use that 1 image and compare it to all the convoluted images?

serene scaffold
#

and if you have an image classifier, there is no output image. the output is a class.

verbal venture
#

to determine if it's a dog?

serene scaffold
#

@verbal venture does this make any sense?

stray nymph
#

like that?

serene scaffold
stray nymph
#

my optimization model doesnt work at all as expected

#

im completely new to optimization and i cant find any solutions online similar to my situation online

#

is it okay if i pm u for guidance

serene scaffold
#

no

stray nymph
#

do you have any suggestions then

#

i can describe my situation here

#

so in my data i have different products/items with its sales, stock, safety stock and predictions

#

so what i want to do is to create optimization model(s) to generate reorder quantities for the next few months

#

to my knowledge there are linear and non linear optimization models

#

and that depends on whether my items are linear or non linear

#

Stock < Predictions:
Reorder Amount = Prediction - Stock

When Prediction > Stock: Use from Safety Stock
Reorder Amount = Stock + (Safety Stock - Prediction)

Prediction > Stock + Safety Stock:
Reorder Amount = Prediction
Increase Safety Stock level = Prediction - Stock

#

and this is what im supposed to use to calculate my reorder amount

hasty mountain
# stray nymph data

Try scaling or normalizing your data. The difference between the numbers is too big.

tacit basin
hasty mountain
#

Does anyone know how the weights of Stable Diffusion are initialized? I can't find anything but hype, hype and more useless hype.

#

I tried using median 0 and std=0.2 like GANs, but this leads to vanishing gradients.

#

Also...I'm testing a simple Diffusion model(not Stable, just a sketch), and I'm wondering if someone has an idea on how many complete diffusion steps it takes before generating some images.

wooden sail
# hasty mountain The idea is to sample one action, get the reward for that action and check how a...

i'm back to life. here's what i had mentioned before https://en.wikipedia.org/wiki/Monte_Carlo_integration you can use naive monte carlo integration to estimate the expected payoff

In mathematics, Monte Carlo integration is a technique for numerical integration using random numbers. It is a particular Monte Carlo method that numerically computes a definite integral. While other algorithms usually evaluate the integrand at a regular grid, Monte Carlo randomly chooses points at which the integrand is evaluated. This method i...
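
A naive Monte Carlo estimate of the expected reward, as a sketch: instead of enumerating every (command_type, action1, action2) combination, draw a few thousand combinations at random and average their rewards. The uniform samplers and the toy reward function are placeholders; in practice the samples would come from the policy and the rewards from the value network.

```python
import random

def estimate_expected_reward(reward_fn, samplers, n_samples=10_000, seed=0):
    """Average reward_fn over randomly sampled action tuples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        action = tuple(sample(rng) for sample in samplers)
        total += reward_fn(action)
    return total / n_samples

# Hypothetical action space: 2 * 1000 * 1000 = 2 million combinations.
samplers = [
    lambda rng: rng.randrange(2),      # command_type
    lambda rng: rng.randrange(1000),   # action1
    lambda rng: rng.randrange(1000),   # action2
]

def toy_reward(action):
    """Made-up reward, used only so the estimate can be checked."""
    command_type, action1, action2 = action
    return command_type + action1 / 1000 + action2 / 1000

estimate = estimate_expected_reward(toy_reward, samplers)
print(estimate)  # close to the exact value 0.5 + 0.4995 + 0.4995 = 1.499
```

The error shrinks roughly as one over the square root of the number of samples, independent of how many combinations exist.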

#

but also after mulling it over in bed, i don't think i've ever seen an example of reinforcement training that considers every pixel on the screen as a possible input, which is where your complexity comes from. i think that task lends itself better to computer vision instead

#

those are my final 2 cents

rare flicker
#

I'm using pre-trained PyTorch data models to classify pictures, and I get very bad results
Anyone willing to help me understand why?

https://pastecord.com/revacahefo

Training model resnet18
Epoch 0, Train loss = 0.6610
Epoch 0, Test loss = 0.4550
Accuracy: 86.6359%
---------------------
Epoch 1, Train loss = 0.6618
Epoch 1, Test loss = 0.3809
Accuracy: 86.8664%
---------------------
Epoch 2, Train loss = 0.6235
Epoch 2, Test loss = 0.3852
Accuracy: 88.1336%
---------------------
Epoch 3, Train loss = 0.6400
Epoch 3, Test loss = 0.4629
Accuracy: 87.7880%
---------------------
Epoch 4, Train loss = 0.5870
Epoch 4, Test loss = 0.4349
Accuracy: 87.9032%
---------------------
Epoch 5, Train loss = 0.5756
Epoch 5, Test loss = 0.3587
Accuracy: 89.2857%
---------------------
Training model alexnet
Epoch 0, Train loss = 4.2033
Epoch 0, Test loss = 4.2741
Accuracy: 9.1014%
---------------------
Epoch 1, Train loss = 4.2003
Epoch 1, Test loss = 4.2859
Accuracy: 7.9493%
---------------------
Epoch 2, Train loss = 4.1948
Epoch 2, Test loss = 4.2790
Accuracy: 7.9493%
---------------------
Epoch 3, Train loss = 4.1934
Epoch 3, Test loss = 4.2813
Accuracy: 7.9493%
---------------------
Epoch 4, Train loss = 4.1921
Epoch 4, Test loss = 4.2732
Accuracy: 9.1014%
---------------------
#

😦

celest agate
rare flicker
#

And there are also random transformations for the training set

#

Yet the models seem to be overfitting

#

And I have no idea why

boreal gale
#

is it overfitting though? wouldn't overfitting manifest itself as train loss going down and test loss going up?

#

i haven't used these models before, so i can't really comment on why this is happening. but i would make sure your input is exactly how the model(s) expects it, specifically if there is any special treatment to the colour channels, you have to replicate that treatment.

rare flicker
#

There are only random flips and resizes to the training pictures

#

Plus the training pictures and the test pictures do need to be different

boreal gale
#

Plus the training pictures and the test pictures do need to be different
yes of course they need to be different. i didn't imply the contrary just in case you thought i did.

rare flicker
#

Oop, sorry, I misunderstood

#

I am just losing my mind over this issue

#

Anyone seeing an issue in the code?

boreal gale
#

any particular reason why you use transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])?

rare flicker
#

"Normalization helps get data within a range and reduces the skewness which helps learn faster and better. Normalization can also tackle the diminishing and exploding gradients problems." 🤷‍♂️

#

I have been instructed to do so

boreal gale
#

okay. i meant where did you get the values from?

rare flicker
#

From the dataset I got

boreal gale
#

wait actually i read your output wrong.
resnet18 looks pretty good, no?

rare flicker
rare flicker
#

And other models are giving me awful results

#
Training model vgg16
Epoch 0, Train loss = nan
Epoch 0, Test loss = nan
Accuracy: 5.4147%
---------------------
Epoch 1, Train loss = nan
Epoch 1, Test loss = nan
Accuracy: 5.4147%
---------------------
Epoch 2, Train loss = nan
Epoch 2, Test loss = nan
Accuracy: 5.4147%
---------------------
Epoch 3, Train loss = nan
Epoch 3, Test loss = nan
Accuracy: 5.4147%
---------------------
Epoch 4, Train loss = nan
Epoch 4, Test loss = nan
Accuracy: 5.4147%
---------------------
boreal gale
#

88% to 91% is not bad at all.

rare flicker
#

Maybe it is, but why are other models so bad?

#
Training model squeezenet
Epoch 0, Train loss = 3.4460
Epoch 0, Test loss = 3.4808
Accuracy: 24.7696%
---------------------
Epoch 1, Train loss = 3.6549
Epoch 1, Test loss = 3.6978
Accuracy: 22.4654%
---------------------
Epoch 2, Train loss = 3.8187
Epoch 2, Test loss = 3.8768
Accuracy: 21.0829%
---------------------
Epoch 3, Train loss = 3.9259
Epoch 3, Test loss = 4.0139
Accuracy: 17.3963%
---------------------
Epoch 4, Train loss = 4.0195
Epoch 4, Test loss = 4.1126
Accuracy: 16.0138%
---------------------
Epoch 5, Train loss = 4.0796
Epoch 5, Test loss = 4.1778
Accuracy: 16.4747%
---------------------
Epoch 6, Train loss = 4.1304
Epoch 6, Test loss = 4.2190
Accuracy: 16.2442%
---------------------
Epoch 7, Train loss = 4.1559
Epoch 7, Test loss = 4.2428
Accuracy: 15.8986%
---------------------
#

Accuracy going downhill just like my life

boreal gale
#

the model itself is unlikely to be wrong.
the only moving part here is your training/test data
as people say... "shit in, shit out"

rare flicker
#

I see

#

But I've tried many things

#

Doesn't seem to help

boreal gale
#

wrong link, edited

#

yeah in particular let's look at vgg16.

The images are resized to resize_size=[256] using interpolation=InterpolationMode.BILINEAR, followed by a central crop of crop_size=[224]. Finally the values are first rescaled to [0.0, 1.0] and then normalized using mean=[0.48235, 0.45882, 0.40784] and std=[0.00392156862745098, 0.00392156862745098, 0.00392156862745098].
that's not what you did. you use drastically different normalisation transformation, which explains the poor performance (probably).
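
To see how far apart the two normalizations are, the sketch below applies both to the same mid-grey pixel. The first set of values is the one used in the training script, the second is the one quoted above; the pixel is made up. Which preset is correct depends on the pretrained weights actually loaded, so this only illustrates the scale of the mismatch.

```python
def normalize(pixel, mean, std):
    """Per-channel normalization: (value - mean) / std."""
    return [(p - m) / s for p, m, s in zip(pixel, mean, std)]

pixel = [0.5, 0.5, 0.5]  # hypothetical RGB pixel, already rescaled to [0, 1]

used = normalize(pixel, [0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
quoted = normalize(
    pixel,
    [0.48235, 0.45882, 0.40784],
    [0.00392156862745098] * 3,  # 1 / 255
)

print(used)    # red channel is roughly 0.07
print(quoted)  # red channel is roughly 4.5
```

The same pixel lands at values dozens of times larger under the second preset, so a network trained with one and fed the other sees inputs well outside the range it was trained on.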

paper crest
#

Has anyone got 5min to help me fit a regression model to my scatter plot?

midnight kayak
ruby depot
#

Do you know any library that could tell me by analyzing a graph which regression i should use? in python compatible with matplotlib

wooden sail
#

i don't think such a thing exists

#

one usually uses either prior knowledge on the phenomenon, some method of model order estimation, and/or some method for measuring "goodness of fit" to pick the best model

lapis sequoia
#

Hello everyone! Can someone explain to me why this tensor gets an error? I've imported TensorFlow and NumPy. I don't see any mistake in this syntax

boreal gale
#

import tensorflow as tf or import tensorflow? only the former is correct given you decided to use tf.constant

lapis sequoia
#

import tensorflow as tf is in my syntax yes @boreal gale

#

Is there an alternative for tf.constant??

boreal gale
#

make sure you actually have ran the code block that imports tensorflow, i assume you are using some sort of jupyter lab / code lab

lapis sequoia
#

google labs yes

#

I will try and solve this again, its frustrating

boreal gale
#

put import tensorflow as tf at the top of the cell you have shown

#

just to debug

lapis sequoia
#

@boreal gale you were right

#

So actually I have to import tensorflow as tf in every separate cell?

#

Thank you very much 😄

wooden sail
#

you don't, but you need to execute the cells in order

#

think of notebooks as an ipython terminal. every time you run a cell, it's the same as copying all the code from the cell and running it in ipython

#

if you haven't run the previous cell, tensorflow has not been imported

lapis sequoia
boreal gale
#

the potential of running cells out of order is one of the biggest gripes i have with jupyterlab/codelab. which is a variant of the problem you had.

imagine you have x += 1 in a cell, and you randomly reran it by mistake and altered the state of x, sometimes you don't even realise it until it's too late.

jupyterlab is still super useful though 🤷‍♂️ just something you need to be aware of
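
The behaviour described above can be imitated outside a notebook, as a rough sketch: a dictionary stands in for the kernel's shared state and each string stands in for a cell. The cell contents are made up, and `math` replaces tensorflow so the example has no extra dependencies.

```python
namespace = {}  # stands in for the notebook kernel's global state

cell_import = "import math"
cell_use = "root = math.sqrt(16)"
cell_init = "x = 0"
cell_increment = "x += 1"

# Running a cell before the cell that imports its dependency fails.
try:
    exec(cell_use, namespace)
    ran_without_import = True
except NameError:
    ran_without_import = False

# In order, it works, and the import stays available to every later cell.
exec(cell_import, namespace)
exec(cell_use, namespace)

# Re-running a cell that mutates state silently changes the result.
exec(cell_init, namespace)
exec(cell_increment, namespace)
exec(cell_increment, namespace)  # accidental second run

print(ran_without_import)  # False
print(namespace["root"])   # 4.0
print(namespace["x"])      # 2
```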

lapis sequoia
#

@boreal gale @wooden sail Thank you guys!! I will screenshot all these tips if you dont mind, got to pick up my daughter from nursery now

#

Good luck and keep coding!! 😄

ripe sapphire
#

One thing I want to know: how can you remember all the arguments, modules in Ml Do I need to learn them all

#

for ex: in keras do I need to learn all the layers their arguments

boreal gale
#

imo don't forcefully remember things that are just a google search away unless there is a reason.

hasty mountain
wooden sail
#

well but how does the CNN policy work?

hasty mountain
#

I'm trying to make something more complex, to play Steam games rather than some Atari games.
Something like AlphaStar, but in a really, really small scale

wooden sail
#

that's already using image processing methods, as soon as it says cnn

hasty mountain
# wooden sail well but how does the CNN policy work?

I guess it was you who taught me about the importance of vectorization and why I couldn't assign an input map to an arbitrary number.
Well, the CNN Policy is basically a vectorizer model which uses, as a way to extract the context, an image

wooden sail
#

sure

#

but consider this

hasty mountain
#

Based on that context, it tries to predict the better actions for that situation

wooden sail
#

for all your other inputs, including the directional keys, all you can do is change status from pressed to unpressed, for instance

#

but for the mouse you want to move the cursor all the way to a specific pixel, instead of telling the mouse to move some delta x or delta z

#

this is the reason your model has so many inputs

#

it's the wrong parametrization of the inputs for such large images

#

you're getting massively sparse inputs

hasty mountain
#

Hm... I couldn't think of a better idea to get mouse commands

#

But...seems like an interesting idea... Instead of "move to X", using "move a bit more to the left"

wooden sail
#

this does have the limitation of how much you can move the mouse, depending on how fast your network is

#

i think there must be a clever way, something like slicing the input image iteratively into quadrants, for example

#

then you should reach a pixel close enough to the destination in log(n) steps
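
A rough sketch of the quadrant idea, assuming the agent picks one of four quadrants per step and the box is halved each time. The function name `quadrant_path` and the quadrant numbering are invented here; in the setup discussed above the choices would come from the network, whereas this sketch reads them off a known target just to count the steps.

```python
import math

def quadrant_path(width, height, target):
    """Home in on `target` = (x, y) by repeatedly choosing one of four quadrants."""
    x0, y0, x1, y1 = 0, 0, width, height  # current box, half-open: [x0, x1) x [y0, y1)
    tx, ty = target
    choices = []
    while x1 - x0 > 1 or y1 - y0 > 1:
        mx = (x0 + x1 + 1) // 2  # split column; equals x1 once the box is 1 pixel wide
        my = (y0 + y1 + 1) // 2  # split row; equals y1 once the box is 1 pixel tall
        right, bottom = tx >= mx, ty >= my
        choices.append(int(right) + 2 * int(bottom))  # 0=TL, 1=TR, 2=BL, 3=BR
        x0, x1 = (mx, x1) if right else (x0, mx)
        y0, y1 = (my, y1) if bottom else (y0, my)
    return choices, (x0, y0)

choices, pixel = quadrant_path(1920, 1080, target=(1234, 567))
max_steps = math.ceil(math.log2(max(1920, 1080)))  # upper bound on the number of choices
print(len(choices), pixel, max_steps)
```

The output per step is a 4-way choice instead of one unit per pixel, and the number of steps grows with the logarithm of the larger image side.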

lapis sequoia
#

hi guys, so im learning data scraping and just wanted to know if its profitable in 2023 as a freelancer?
because there are so many websites and organizations available now that are willing to scrape data so i was thinking why would anyone want to buy my services from fiverr or upwork etc

wooden sail
#

that, or compute an acceleration and a velocity vector and apply that to the cursor so that you follow some sort of trajectory to the target

#

that makes the mouse input a 2d vector, which is much more tractable

#

but requires thinking the physics out a little

#

idk, i'm just tossing ideas. maybe they won't work at all 👀
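
A small sketch of the velocity/acceleration idea, under stated assumptions: the 2D acceleration would be the model's output, but here a hand-written spring-damper (`spring_accel`, with made-up `stiffness` and `damping` values) stands in for it so the example runs on its own.

```python
def step_cursor(pos, vel, accel, dt):
    """Advance the cursor one frame: acceleration updates velocity, velocity updates position."""
    vel = (vel[0] + accel[0] * dt, vel[1] + accel[1] * dt)
    pos = (pos[0] + vel[0] * dt, pos[1] + vel[1] * dt)
    return pos, vel

def spring_accel(pos, vel, target, stiffness=60.0, damping=15.0):
    # Hand-written stand-in for the policy's 2D output: pull toward the target, brake on speed.
    return (
        stiffness * (target[0] - pos[0]) - damping * vel[0],
        stiffness * (target[1] - pos[1]) - damping * vel[1],
    )

pos, vel, target = (0.0, 0.0), (0.0, 0.0), (800.0, 450.0)
for _ in range(300):  # 300 frames at roughly 60 fps
    pos, vel = step_cursor(pos, vel, spring_accel(pos, vel, target), dt=0.016)

print(pos)
```

The action is two numbers per frame regardless of screen resolution; the trade-off mentioned above still applies, since reaching a far-away pixel takes several frames.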

hasty mountain
#

This solves the problem of having a computationally expensive net, but I suppose it may affect how good the model can get in a game

#

In the way my model is structured right now, the real computation power goes to the feature extraction part, which is similar to VGG19

#

Though I suppose I should consider something like ResNet

clear blaze
#

i remember when stock buying/selling was done on nvidia cuda cards, and cuda was just released. it had a gateway that prevented losses on bad software. of the 270+ days of the year, that company only lost one day of trading and it was 11 million

#

then others started getting into HFT, then they started wiring fiber to the nasdaq exchange to datacenters. nanoseconds started counting soon after that

#

we started with a 130 ms lag, and i think its now 50 nanoseconds trades

#

these trading bots on github should never be used.

drifting wagon
#

guys so i have this simple script to
create numpy array
```
import numpy as np
import os
import vocab

# Read in vocab set from vocab.txt file
with open("vocab.txt", "r") as f:
    vocab = set(f.read().splitlines())

def text_to_array(texts, vocab):
    # Initialize an empty array with the same dtype as vocab
    data = np.zeros((len(texts), len(vocab)), dtype=int)
    for i, text in enumerate(texts):
        for word in text:
            if word in vocab:
                data[i, vocab.index(word)] += 1
    return data

# Read in text files from /txt_source directory
texts = []
for filename in os.listdir("txt_source"):
    with open(os.path.join("txt_source", filename), "r", encoding="utf-8") as f:
        texts.append(f.read().split())

data = text_to_array(texts, vocab)

# Save the resulting array as a .npy file
np.save("data.npy", data)
```

but i keep getting 'charmap' codec can't decode byte 0x81 in position 1067: character maps to <undefined> error, the thing is every single file is encoded as utf-8

mint palm
#

why is cloning a repo and getting it to work a bigger pain in the ass than writing my own code, lmao

drifting wagon
#

nothing changed

#
  File "C:\Users\Reny\PycharmProjects\crossoverwriter\raw_to_array.py", line 7, in <module>
    vocab = set(f.read().splitlines())
  File "C:\Users\Reny\AppData\Local\Programs\Python\Python310\lib\encodings\cp1250.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 1067: character maps to <undefined>

this is how the whole error looks

north barn
#

add it here with open("vocab.txt", "r") as f:

drifting wagon
#

ok
it worked
thanks
now i have a new set of errors but i'll fight them on my own for a while

#
```
def text_to_array(texts, vocab):
    # Initialize an empty array with the same dtype as vocab
    data = np.zeros((len(texts), len(vocab)), dtype=int)
    for i, text in enumerate(texts):
        for word in text:
            if word in vocab:
                data[i, vocab.index(word)] += 1
    return data
```
this part of the script is going to be repeated with every word in vocabulary right?
#

yeah it does

#

i have vocab of over 81k words [*]
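
One possible way to tidy up the script from this thread, offered as a sketch rather than the fix the poster ended up with. It opens `vocab.txt` with an explicit `encoding="utf-8"` (the traceback shows the default `cp1250` codec being used for that file), and it replaces the set with a word-to-column dict, since a `set` has no `.index()` method and a dict lookup stays fast with an 81k-word vocabulary. The helper names `load_vocab` and `word_to_index` are introduced here for illustration.

```python
import numpy as np

def load_vocab(path):
    """Read one word per line and map each distinct word to a column index."""
    # Explicit encoding avoids falling back to the platform default codec.
    with open(path, "r", encoding="utf-8") as f:
        words = f.read().splitlines()
    # dict.fromkeys drops duplicates while keeping first-seen order.
    return {word: i for i, word in enumerate(dict.fromkeys(words))}

def text_to_array(texts, word_to_index):
    """Count vocabulary words per text: one row per text, one column per word."""
    data = np.zeros((len(texts), len(word_to_index)), dtype=int)
    for i, text in enumerate(texts):
        for word in text:
            j = word_to_index.get(word)  # None for out-of-vocabulary words
            if j is not None:
                data[i, j] += 1
    return data

# Tiny in-memory example instead of the real vocab.txt / txt_source files.
word_to_index = {w: i for i, w in enumerate(["cat", "dog", "fish"])}
counts = text_to_array([["cat", "cat", "bird"], ["dog"]], word_to_index)
print(counts)
```

With the real files, `word_to_index = load_vocab("vocab.txt")` would take the place of the hand-written dict.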

drifting wagon
#

okey i got this

fringe anvil
#

hello, im using matplotlib and im trying to figure out how to show more dates on my xaxis. i have 94 dates for x, but only 7 show up on the plot.. is there a bin= argument or xticknumber that i overlooked?
i mean as labels for the x axis, sorry im not too clear

lapis sequoia
#

if u wanted to display all 94 :

plt.xticks(range(0, 94), dates, rotation=90)
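
If all 94 labels turn out to be too crowded, a middle ground is to label every n-th position. The sketch below only computes the positions and labels (the helper `thin_ticks` and the sample dates are made up for illustration); it assumes the x values are the integers 0..93, as in the `plt.xticks` call above.

```python
from datetime import date, timedelta

def thin_ticks(labels, max_labels):
    """Return (positions, labels) keeping at most max_labels evenly spaced entries."""
    step = max(1, -(-len(labels) // max_labels))  # ceiling division
    positions = list(range(0, len(labels), step))
    return positions, [labels[p] for p in positions]

# 94 consecutive days as stand-in data.
dates = [(date(2023, 1, 1) + timedelta(days=i)).isoformat() for i in range(94)]
positions, shown = thin_ticks(dates, max_labels=20)
print(len(positions), shown[:3])
```

The result can be passed on as `plt.xticks(positions, shown, rotation=90)`.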

vague sable
#

Hi all, What would be the best way to pair wines with similar other wines. For example, I enter a wine I like & I am returned a small list of wines with similar Adjectives used in the flavour description, from the same country, the same variety of wine & the same year?

odd meteor
# vague sable Hi all, What would be the best way to pair wines with similar other wines. For e...

If you have a pandas data frame that contains all this information you can use different approaches to get that done.

Perhaps you could start with semantic similarity score, then filter the dataframe based on highest score. You could then subset the filtered dataframe by year etc

If you wanna build a recommendation engine, the best approach would be, to get a dataset that contains information about each wine's:

  • flavours
  • manufactured year
  • company name
  • country the wine was brewed
  • perhaps the price etc
  • a rating for each variety of wine.
odd meteor
#

Moreover, you could even extend this by

  1. building a function that takes in the recommendation from your recommendation engine.

  2. Create a corpus and get the semantic similarity score of each recommended wine in #1 on the description of the wine you entered in your recommendation engine (presuming you have 'wine_description' column in your original dataframe for each wine)

  3. Finally your function should then return names of recommended wine with their similarity score respectively (in descending order.)

PS: There could be another way to approach this, so feel free to explore other options.

Meanwhile, have fun while at it 😂
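
A toy version of the filter-then-rank approach outlined above, with two loud assumptions: the wine records and field names (`name`, `country`, `variety`, `year`, `description`) are invented, and plain word overlap (Jaccard similarity) is swapped in for a real semantic similarity score so the sketch has no dependencies.

```python
def jaccard(a, b):
    """Word-overlap similarity between two token collections, from 0.0 to 1.0."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def similar_wines(liked, wines, top_n=3):
    """Keep wines matching country, variety and year, then rank by description overlap."""
    candidates = [
        w for w in wines
        if w is not liked
        and (w["country"], w["variety"], w["year"])
        == (liked["country"], liked["variety"], liked["year"])
    ]
    liked_words = liked["description"].lower().split()
    scored = [(jaccard(liked_words, w["description"].lower().split()), w["name"])
              for w in candidates]
    return sorted(scored, reverse=True)[:top_n]  # highest score first

# Hypothetical records, for illustration only.
wines = [
    {"name": "A", "country": "France", "variety": "Merlot", "year": 2015,
     "description": "ripe cherry soft oak"},
    {"name": "B", "country": "France", "variety": "Merlot", "year": 2015,
     "description": "ripe cherry firm oak"},
    {"name": "C", "country": "France", "variety": "Merlot", "year": 2015,
     "description": "green apple crisp"},
    {"name": "D", "country": "Italy", "variety": "Merlot", "year": 2015,
     "description": "ripe cherry soft oak"},
]
recommendations = similar_wines(wines[0], wines)
print(recommendations)
```

With a pandas data frame the same two steps would be a boolean filter followed by a sort on the score column.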

mint palm
#

i see that in a research paper, for "same dataset" and "fine tuning" dataset, they report accuracies for "Zero-shot text-to-video retrieval" task and "text-to-video retrieval" task.
How could they have done that? I mean zero shot is when not all classes are given, right?
Did they purposefully remove some classes to make it Zero-shot text-to-video retrieval. And for normal text-to-video retrieval task, did they use full dataset. Could it be that?

north barn
mint palm
north barn
#

a database vector is retrieved that hasn't been trained on

mint palm
#

i can think of only that corresponding labels might be imperfect

mint palm
north barn
#

ie a database vector is retrieved that was never retrieved during training

#

much like a class that was never used during classification

#

except there are no special architecture modifications required

mint palm
north barn
#

you don't use the output of the network for classification

#

you use the vectorization based on the hidden

mint palm
north barn
#

yes, it can be, it doesn't matter what loss is used, all that matters for it to be zeroshot is that the image was never retrieved during training
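
To make the retrieval step concrete, here is a minimal sketch assuming the text query and the videos have already been turned into vectors by some encoder. The vectors and video names are made up; nothing here is taken from the paper being discussed.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def retrieve(query_vec, database):
    """Return the key of the database vector closest to the query."""
    return max(database, key=lambda name: cosine(query_vec, database[name]))

# Made-up embeddings standing in for encoder outputs.
video_embeddings = {
    "making_pizza": [0.9, 0.1, 0.0],
    "playing_guitar": [0.0, 0.2, 0.9],
    "walking_dog": [0.1, 0.9, 0.1],
}
text_query = [0.8, 0.2, 0.1]  # pretend embedding of a sentence about making pizza
best = retrieve(text_query, video_embeddings)
print(best)
```

The lookup is the same whether or not the database items were seen during training, which fits the point above that zero-shot retrieval needs no special architecture.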

outer tapir
#

hey can you check why this isnt running, like no error, the program just feels stuck there
before displot function, it was running alright
heres the code

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df=pd.read_csv('D:\\Python for Data Science and Machine Learning Bootcamp\\05-Seaborn\\dm_office_sales.csv')
# # plt.figure(figsize=(5,8),dpi=200)
sns.rugplot(x='salary',data=df,height=0.5)#height is total % of y axis
# # can be considered as 1 dimensional scatterplots
# # do not use distplot, its deprecated, use displot()
sns.set(style='darkgrid')#theme can be set darkgrid or whitegrid
sns.displot(x='salary',data=df,bins=100)#we can manipulate the number of bins as per need
plt.show()
print('hello')
lapis sequoia
mint palm
#

while reproducing results, if batch_size doesnt fit,
we should decrease LR with batch_size, right?
and should we also increase the epoch ? because LR has been decreased
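
One commonly used heuristic for the learning-rate part of this question is the linear scaling rule: scale the learning rate by the same factor as the batch size. It is a rule of thumb, not a guarantee of matching the reported results, and the function name and numbers below are purely illustrative.

```python
def scaled_lr(reference_lr, reference_batch_size, batch_size):
    """Linear scaling heuristic: learning rate proportional to batch size."""
    return reference_lr * batch_size / reference_batch_size

# Example: a paper setting of lr=0.1 at batch size 256, run at batch size 64 instead.
print(scaled_lr(0.1, 256, 64))
```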

final storm
#

i have no prior experience with ai and want to wrangle stable diffusion into making pixel art
what would be a good place to research this shit?

clever owl
#

I get a physical piece of paper like this and have to transcribe the rows and columns into an excel file. What's the best way to read text from an image of a piece of paper like this?

final storm
clever owl
#

Is there like a defacto best scanner? I'm looking at Google Vision API rn

#

Btw I have no experience with any AI/ML

final storm
#

oh i meant like a physical one lol
there probably is some image to pdf shit but eh

clever owl
#

oh lmaoooo a physical scanner i didnt even think of that

drifting wagon
#

hey guys, so yesterday i created a dataset in the form of a numpy array to train a text generator to write stories, now i wanted to write said generator but the more i look into it the more I see that i should use tensorflow and fine-tune an already pre-trained model.
So my question is: is there any pre-trained model that can be fed my numpy array without any conversions?

#

i used numpy cuz tensorflow refused to work on my pc

austere swift
#

You can use tensorflow with keras, which will accept numpy arrays directly (just make sure they're in the same format as the data the model was originally trained on)

you can get pretrained models from the huggingface transformers library, which can be loaded into keras to fine-tune

drifting wagon
#

the thing is im unable to get tensorflow with keras to work, like 2 days ago i spent the whole night reinstalling and rebuilding it, cuz pycharm installed a "light version" or something and in the end i gave up

mint palm
#

hi,
i'm noticing a lot of architectures which mention the use of a pre-trained model, but the model they mention is usually just one component, e.g. a transformer followed by maybe some projection, where the pre-trained part could only be the transformer, right? So what happens to the initialisation of gradient of the other parts of the model?

#

And i am also wondering that assuming some part is still randomly initialised, are gradients of those part able to adjust in few epochs that model is often trained?

hasty mountain
hasty mountain
odd meteor
# clever owl I get a physical piece of paper like this and have to transcribe the rows and co...
  1. OCR
  2. You can convert this paper to pdf and use any of your preferred python library to extract text from a pdf
  3. If this is all the data needed to copy, you can manually type it
  4. Feeling too lazy to use #3, then use Google Lens app on your android phone / any other equivalent app. This is pretty straightforward.
    a) Ensure your pc is online and you're signed in on your pc to same google account on your phone.
    b) Use Google Lens mobile app to grab all text, click on copy to computer on the app, then on your pc press control + V anywhere you wanna paste the copied text.
mint palm
#

in this paper the aim was to represent the modalities with vectors that are close irrespective of modality
for example: pizza making sentence ~= pizza making audio ~= pizza making video IN ONE EMBEDDING SPACE

#

and they mention model was pretrained, but didnt mention what and how "projections" were initialised

flint falcon
#

Hello, I am a bachelor student in software engineering (second year). I already know the basics of python and I want to develop my skills outside of university. I would like to learn artificial intelligence (I don't know what subjects exactly). I thought about trying to create cool projects from scratch and gain knowledge as I go. My first idea was a trading bot in python. Do you have any books or resources you recommend? My plan is to build a basic bot and improve it over and over again, at least until it is "decent" (mentionable on a resume, an "accomplishment" not in profit but in skill gain).

hasty mountain
#

To make sure all inputs have the same shape?

mint palm
# hasty mountain To make sure all inputs have the same shape?

yeah probably, "projection" seemed a fancy name, they only say that the modalities, even after going through the transformer (acting like an "encoder" here), are still a bit different in representation (NOT close in 1 embedding space), so they have this projection here
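
A bare-bones sketch of what such a projection could look like, with everything assumed: a single linear map per modality, random initialisation, and invented feature sizes. The chat does not say how the paper builds or initialises its projections, so this only illustrates the shape-matching role discussed above.

```python
import random

def make_projection(in_dim, out_dim, rng):
    """Randomly initialised linear map, stored as out_dim rows of in_dim weights."""
    scale = 1.0 / in_dim ** 0.5
    return [[rng.gauss(0.0, scale) for _ in range(in_dim)] for _ in range(out_dim)]

def project(matrix, vec):
    """Apply the linear map: one dot product per output dimension."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

rng = random.Random(0)
text_features = [0.2] * 8     # pretend output of a pretrained text encoder (8 dims)
video_features = [0.5] * 12   # pretend output of a pretrained video encoder (12 dims)

text_proj = make_projection(8, 4, rng)    # new, untrained part
video_proj = make_projection(12, 4, rng)  # new, untrained part

text_emb = project(text_proj, text_features)
video_emb = project(video_proj, video_features)
print(len(text_emb), len(video_emb))
```

Both modalities end up as vectors of the same length, so they can be compared directly; training would then be what pulls matching pairs close together.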

hasty mountain
#

Does anyone know if Reinforcement Learning has been tested with generative networks?
(Now that I think I managed to make a RL algorithm, I want to play a bit)
Perhaps a GAN where the Discriminator is actually a reward model?

arctic crown
#

In ML why do we have multidimensional arrays? why not just use 1d array to store all the data?

hasty mountain
# arctic crown In ML why do we have multidimensional arrays? why not just use 1d array to store...

I don't know the technical, mathematical details, but... which one is easier: decomposing an X-ray image of a hand into a single, flattened 1D array and then having to figure out which number corresponds to each coordinate (considering many numbers will be 0 and many will be 1), or simply storing that image in a 2D array without moving any pixel, thus, without having the chance of messing up your data?
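
The bookkeeping mentioned above can be shown with a tiny made-up "image". A 1D layout holds the same numbers, but every access needs the width to translate between a flat index and a (row, col) coordinate; the helper names below are illustrative.

```python
height, width = 3, 4

# 2D layout: image[row][col], the coordinate is the index.
image = [[row * width + col for col in range(width)] for row in range(height)]

# 1D layout: same values, row after row.
flat = [value for row in image for value in row]

def flat_index(row, col, width):
    """Position of pixel (row, col) inside the flattened list."""
    return row * width + col

def unflatten(index, width):
    """Recover (row, col) from a position in the flattened list."""
    return divmod(index, width)

print(image[2][1], flat[flat_index(2, 1, width)], unflatten(9, width))
```

Get the width wrong by one and every lookup in the flat version silently lands on the wrong pixel, which is the kind of mix-up the 2D layout avoids.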

frank shard
#

Hey im wondering what should i lookup to make my idea possible
I want to make an ai that will show you what you should wear according to the current weather, i know i should make my own dataset but how can my ai learn based on images and text?

iron basalt
hasty mountain
iron basalt
hasty mountain
#

But yeah, I was thinking about the Discriminator assigning a reward to the generator's output based on how fake or how real such output is

hasty mountain
iron basalt
#

IRL is not being given an explicit reward function, but instead making one up to mimic some observed behavior.

#

The expert can be observed, which is what you want to mimic.

#

E.g. watching a human pick up a ball, and then mimicking that.

hasty mountain
#

Hm... Seems like a tricky version of Supervised Learning...