iron basalt Dec 6, 2022, 5:58 AM

#

So now with multiple outputs the idea is that we do that, but just multiple times.

#

With 1 output it looked kind of like this in feel:

#

#

Scalar-vector multiplication.

#

This is where some of the magic of matrices comes into play.

#

#

This looks like the dot product, but the other one is transposed.

#

Note how instead of squishing down to one thing, it expands out.

#

Now imagine a are the outputs and b are the inputs, you can see that it's doing what we did before for each output.

#

1 * inputs, 2 * inputs, 3 * inputs

#

Are the rows.

plush jungle Dec 6, 2022, 6:07 AM

#

iron basalt Now imagine `a` are the outputs and `b` are the inputs, you can see that it's do...

but these are different shapes, whereas in the picture they're both same length vectors

rugged comet Dec 6, 2022, 6:07 AM

#

This doesn't seem right. Have you ever seen a validation dataset so low?

iron basalt Dec 6, 2022, 6:08 AM

#

plush jungle but these are different shapes, whereas in the picture they're both same length ...

Try length 2 for a and 3 for b.

#

2 outputs, 3 inputs.

#

You can also try 1 output to see that it's the same as the backwards pass you had before.

plush jungle Dec 6, 2022, 6:09 AM

#

iron basalt Try length 2 for `a` and 3 for `b`.

you mean with np.matmul?

iron basalt Dec 6, 2022, 6:09 AM

#

plush jungle you mean with np.matmul?

By hand or matmul.

#

Also if you have 2 outputs and 3 inputs, what is the shape of W here?

plush jungle Dec 6, 2022, 6:11 AM

#

!e

import numpy as np
a = np.array([1,2])
b = np.array([3,4,5])
print(np.matmul(a,b))

arctic wedgeBOT Dec 6, 2022, 6:11 AM

#

@plush jungle :x: Your 3.11 eval job has completed with return code 1.

001 | Traceback (most recent call last):
002 |   File "<string>", line 4, in <module>
003 | ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 3 is different from 2)

plush jungle Dec 6, 2022, 6:11 AM

#

iron basalt Also if you have 2 outputs and 3 inputs, what is the shape of W here?

2x3?

iron basalt Dec 6, 2022, 6:12 AM

#

plush jungle 2x3?

Yes, because the input is 3x1 and the output is 2x1 (2x3 * 3x1 -> 2x1).
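
A quick way to check that shape bookkeeping, as a sketch only: the numbers in `W` and `x` are made up for illustration and are not from the conversation.

```python
import numpy as np

# Hypothetical values: 3 inputs, 2 outputs.
W = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])          # shape (2, 3): one row of weights per output
x = np.array([[1.0], [0.0], [-1.0]])     # shape (3, 1): column vector of inputs

out = np.matmul(W, x)                    # (2, 3) times (3, 1) gives (2, 1)
print(W.shape, x.shape, out.shape)
print(out)
```

The inner dimensions (3 and 3) have to agree, and the outer ones (2 and 1) give the shape of the result.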

plush jungle Dec 6, 2022, 6:13 AM

#

but what's with the valueerror

#

it doesn't work when I transpose a or b either

iron basalt Dec 6, 2022, 6:16 AM

#

plush jungle but what's with the valueerror

Not exactly sure what you mean.

#

Do you mean in your code?

plush jungle Dec 6, 2022, 6:16 AM

#

iron basalt Not exactly sure what you mean.

why is the code throwing an error in the code snippet I posted

iron basalt Dec 6, 2022, 6:17 AM

#

plush jungle why is the code throwing an error in the code snippet I posted

This is a numpy detail. Your vectors have shape (2,) and (3,), not (2, 1), and (3, 1).

#

Matmul works on matrices.

#

You can reshape them first.

plush jungle Dec 6, 2022, 6:19 AM

#

!e```py
import numpy as np
a = np.array([1,2])
b = np.array([3,4,5])

a = np.reshape(a, (2,))
b = np.reshape(b, (3,))

print(np.matmul(a,b))
```

arctic wedgeBOT Dec 6, 2022, 6:19 AM

#

@plush jungle :x: Your 3.11 eval job has completed with return code 1.

001 | Traceback (most recent call last):
002 |   File "<string>", line 8, in <module>
003 | ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 3 is different from 2)

iron basalt Dec 6, 2022, 6:25 AM

#

!e```py
import numpy as np
a = np.array([1,2])
b = np.array([3,4,5])

a = np.reshape(a, (2,1))
b = np.reshape(b, (3,1))

print(np.matmul(a,b.T))
```

arctic wedgeBOT Dec 6, 2022, 6:25 AM

#

@iron basalt :white_check_mark: Your 3.11 eval job has completed with return code 0.

001 | [[ 3  4  5]
002 |  [ 6  8 10]]

iron basalt Dec 6, 2022, 6:25 AM

#

@plush jungle

plush jungle Dec 6, 2022, 6:26 AM

#

oh ok nice

iron basalt Dec 6, 2022, 6:26 AM

#

Note the transpose on b as described in the image.
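
For comparison, NumPy also has `np.outer`, which builds the same matrix straight from the 1-D arrays without any reshaping. A small sketch using the same `a` and `b` as the snippet above:

```python
import numpy as np

a = np.array([1, 2])       # shape (2,)
b = np.array([3, 4, 5])    # shape (3,)

# Column vector times row vector, as in the reshape + matmul version.
by_matmul = np.matmul(a.reshape(2, 1), b.reshape(3, 1).T)

# np.outer computes a[i] * b[j] for every pair (i, j).
by_outer = np.outer(a, b)

print(by_outer)
print(np.array_equal(by_matmul, by_outer))
```

Either form gives a (2, 3) result: one row per element of `a`, one column per element of `b`.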

plush jungle Dec 6, 2022, 6:26 AM

#

right

iron basalt Dec 6, 2022, 6:27 AM

#

For reference: https://en.wikipedia.org/wiki/Outer_product

Outer product

In linear algebra, the outer product of two coordinate vectors is a matrix. If the two vectors have dimensions n and m, then their outer product is an n × m matrix. More generally, given two tensors (multidimensional arrays of numbers), their outer product is a tensor. The outer product of tensors is also referred to as their tensor product, and...

#

#

Now note that the result of this operation gives a matrix with dimensions that match the weight matrix.

plush jungle Dec 6, 2022, 6:30 AM

#

ok i'm following so far

#

can you tie it into this?

```
Neuron 1  -> neuron 2  -> neuron 3

                Neuron 3 gradient =         dE/dw3
        Neuron 2 gradient = dE/dw2 =        d3w/d2w * dE/dw3
Neuron 1 gradient = dE/dw1 =                d2w/d1w * d3w/d2w * dE/dw3
```

iron basalt Dec 6, 2022, 6:35 AM

#

So sticking to the matrix notation, we want dE/dW for the weight update.

#

And because of the way the weight update rule works, we need that to be a matrix with the same shape as the weights.

#

Note here you had w_i, but since we are working with matrices, there is no subscript, it's just W.

#

Let's let a be Wx+b (using the 2x3 W from before, so Wx is 2x1):

#

.latex $$\bm{o}=\sigma(\bm{a})$$

strange elbowBOT Dec 6, 2022, 6:42 AM

#

$latex.png$

plush jungle Dec 6, 2022, 6:45 AM

#

so I've got this

    def backpropagate(self, output, y):

        # output neuron update
        
        # this outputs a 3-vector
        output_gradient = -(y - output[2]) * output[2]* (1-output[2]) * output[1]
        # update the weights with this 3-vector
        self.output_layer.weights -= output_gradient * self.lr


        # hidden layer update

        # this outputs a 3-vector
        hidden_delta = -(y - output[1]) * output[1] * (1-output[1])
        # reshape it from (3,) to (3,1)
        hidden_delta = np.reshape(hidden_delta, (3,1))
        # reshape the input from (10000,) to (1,10000)
        output[0] = np.reshape(output[0], (1,10000))

        # matrix multiply the delta and the input
        # this returns a (3,10000) matrix
        hidden_gradient = np.matmul(hidden_delta, output[0])

#

but the hidden_gradient needs to be multiplied by the output_gradient for the chain rule part before it can update the hidden layer weights

plush jungle Dec 6, 2022, 6:56 AM

#

iron basalt And because of the way the weight update rule works, we need that to be a matrix...

but I don't understand how I can multiply a (3,1) gradient and a (3,10000) gradient and get a (3,10000) gradient

solemn topaz Dec 6, 2022, 7:16 AM

#

Does anyone know a way to visualize the changes made to pandas data frames when transforming them? I do not mean visualize the data in a graph but visualize how the data frame itself is changed. I know of pandas tutor but it seems to be only usable as a website. Not as a way of documenting a data transformation pipeline. Thanks!

iron basalt Dec 6, 2022, 7:21 AM

#

plush jungle so I've got this ```py def backpropagate(self, output, y): # output...

A y - o does not happen in the hidden layers. Only the output layer.

#

If you have 1 output neuron, then y will also be length 1, but if your hidden layer has 3 neurons, then that won't work. Can't do 1 dimensional thing minus 3 dimensional thing.

#

Before trying to get backprop to work, try just 1 layer with multiple outputs.
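
A minimal sketch of that suggestion: one sigmoid layer with several outputs, and the outer-product gradient checked against a finite difference. The squared-error loss, the sizes, and every name here (`sigmoid`, `loss`, `grad_W`, and so on) are assumptions for illustration, not the code from the conversation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(W, b, x, y):
    # Assumed loss: half the squared error summed over the outputs.
    o = sigmoid(W @ x + b)
    return 0.5 * np.sum((o - y) ** 2)

rng = np.random.default_rng(0)
n_in, n_out = 3, 2
W = rng.normal(size=(n_out, n_in))   # one row of weights per output
b = rng.normal(size=n_out)
x = rng.normal(size=n_in)
y = np.array([0.0, 1.0])

o = sigmoid(W @ x + b)
delta = (o - y) * o * (1 - o)        # one delta per output, shape (2,)
grad_W = np.outer(delta, x)          # shape (2, 3), same as W
grad_b = delta                       # shape (2,), same as b

# Finite-difference check on a single weight.
eps = 1e-6
W_plus, W_minus = W.copy(), W.copy()
W_plus[0, 1] += eps
W_minus[0, 1] -= eps
numeric = (loss(W_plus, b, x, y) - loss(W_minus, b, x, y)) / (2 * eps)

print(grad_W.shape == W.shape)
print(abs(numeric - grad_W[0, 1]) < 1e-6)
```

If the analytic and numeric values disagree for a single layer, there is no point moving on to the hidden layer yet.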

gaunt hamlet Dec 6, 2022, 7:26 AM

#

hey, anyone knows any open source projects to practise what I learnt in python course?

young granite Dec 6, 2022, 7:41 AM

#

gaunt hamlet hey, anyone knows any open source projects to practise what I learnt in python c...

kaggle

young granite Dec 6, 2022, 7:42 AM

#

solemn topaz Does anyone know a way to visualize the changes made to pandas data frames when ...

isnt it enough to check input/output of ur frame?

vivid hemlock Dec 6, 2022, 7:53 AM

#

Hello. I need help choosing a dataset for speed calculations based on GPS tracking. I have been searching but haven't been able to find one. Our teacher told us to do proper research for the project, but I just can't figure out where to start. My team and I would be really thankful if somebody could help.

solemn topaz Dec 6, 2022, 7:54 AM

#

young granite isnt it enough to check input/output of ur frame?

No, there's a lot of transformations going on. Many different batches of data from many different sources each requiring many transformations.

gaunt hamlet Dec 6, 2022, 8:43 AM

#

young granite kaggle

thanks

fresh tiger Dec 6, 2022, 9:06 AM

#

Hi! I have a question regarding feature engineering, in particular feature selection.

My data set consists of text and has features such as:
```
Sentence - {"sentence in here", "another sentence here", ...}

Topic - {"Sports", "elections", "food"}

Label - {"Bias", "Non-Biased"}
```


For numeric data, I know there are things like: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.corr.html. Is there something similar but for textual data?

midnight vessel Dec 6, 2022, 12:03 PM

#

Hello everyone

#

I want to ask a question related to AI and ML

#

can I please ask here

mild harness Dec 6, 2022, 12:08 PM

#

hi everyone, I'm just getting started on Python, is this the right place to ask questions ?

austere swift Dec 6, 2022, 1:04 PM

#

midnight vessel I want to ask a question related to AI and ML

just ask

midnight vessel Dec 6, 2022, 1:04 PM

#

Hello everyone
I've seen many new AI tools, so I want to ask: I'm planning to create something that helps people generate illustrations and UI designs according to their needs. Is this possible?
By UI I mean finished results like the UIs presented on Dribbble; people often don't know what or how to design.
Is it doable to build a tool like this and solve this problem with AI?
If it's possible, how can I do it?

#

Making beautiful illustrations or web UIs according to individual needs

#

#

Like this: I found this attractive design on Dribbble, but I want to automate it with the help of AI. I think AI can also be used for things that help people. What are your thoughts on this? Would it be a useful concept for helping people generate designs?

grand quarry Dec 6, 2022, 1:23 PM

#

If you know their job position and want to predict what they will be earning why not simply use the mean or median of their position?
If you have more information and want very accurate predictions, you could make multiple separate models for each job position.

molten latch Dec 6, 2022, 3:13 PM

#

hi i am having trouble with merging two data frames

#

import pandas as pd
df=pd.DataFrame({'name':['hamid','meow','billu'],'job':['cs','cat','cat']})
df2=pd.DataFrame({'hobby':['playing','eating','sleeping'],'friends':['mahab','carry','kutta']})
df3=pd.merge(df,df2, left_on='name', right_on='hobby')
print(df3)

kindly help the above code displays empty data frame on output

lavish kraken Dec 6, 2022, 3:16 PM

#

explain why machine learning model will perform better than the others

molten latch Dec 6, 2022, 3:16 PM

#

lavish kraken explain why machine learning model will perform better than the others

would u like to help

lavish kraken Dec 6, 2022, 3:17 PM

#

molten latch hi i am having trouble with merging two data frames

If you know how to use vlookup in Excel or joins in SQL, you should be able to merge data in Python using merge() as well

#

Just pay attention to the columns that match in both dataframe

molten latch Dec 6, 2022, 3:19 PM

#

i am quite new to this function however

df3=df.join(df2)
print(df3)

#

works perfect
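
A sketch of why the two calls behave differently, reusing the frames from the question. The variable names `by_value`, `by_index`, and `same_rows` are made up here.

```python
import pandas as pd

df = pd.DataFrame({'name': ['hamid', 'meow', 'billu'], 'job': ['cs', 'cat', 'cat']})
df2 = pd.DataFrame({'hobby': ['playing', 'eating', 'sleeping'],
                    'friends': ['mahab', 'carry', 'kutta']})

# merge with left_on/right_on matches rows by VALUE. No 'name' is equal to
# any 'hobby', so the default inner merge keeps nothing.
by_value = pd.merge(df, df2, left_on='name', right_on='hobby')
print(len(by_value))

# join matches rows by INDEX (0, 1, 2 in both frames), so every row lines up.
by_index = df.join(df2)
print(by_index)

# merge can do the same thing when told to use the index on both sides.
same_rows = pd.merge(df, df2, left_index=True, right_index=True)
print(same_rows.shape)
```

So `join` works here only because both frames happen to share the default 0, 1, 2 index; there is still no common key column between them.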

fresh tiger Dec 6, 2022, 3:20 PM

#

Hi! I have a question regarding this approach of finding the correlation between a feature that consists of strings: https://stackoverflow.com/questions/51241575/calculate-correlation-between-columns-of-strings. Does this actually state the accuracy of the feature? I know this may be a silly question, I am just a bit curious (I have a dataset that has some articles with a bias/non-bias label, and a topic feature. When trying this method, the topic feature had quite a low (around 0.03) correlation to the bias label)

Stack Overflow

Calculate correlation between columns of strings

I've got a df that contains the columns profession and media. I would like to calculate the correlation between those two columns.

Is there a short hack of calculating the correlation of columns of

wooden sail Dec 6, 2022, 3:23 PM

#

there's no unique way of doing this. you can encode the strings in different ways

#

this categorical approach has the caveat that the order in which you encode strings affects the correlation because the code is not equidistant

#

you could one-hot instead, which would make strings equidistant, but also pairwise orthogonal and yield high dimensional vectors

#

pick your poison, no method is perfect
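
A toy illustration of that caveat, as a sketch: the six-row data set and the helper `code_corr` are invented for this example and have nothing to do with the real articles data.

```python
import pandas as pd

# Made-up data: topic 'b' is always labelled 1, the others always 0.
topics = ['a', 'b', 'c', 'a', 'b', 'c']
labels = pd.Series([0, 1, 0, 0, 1, 0])

def code_corr(order):
    # Integer-encode the strings using the given category order,
    # then take the Pearson correlation with the label.
    codes = pd.Series(pd.Categorical(topics, categories=order).codes)
    return codes.corr(labels)

corr_abc = code_corr(['a', 'b', 'c'])   # a=0, b=1, c=2
corr_bac = code_corr(['b', 'a', 'c'])   # b=0, a=1, c=2
print(corr_abc, corr_bac)

# One-hot: one 0/1 column per topic, each correlated with the label separately.
one_hot = pd.get_dummies(pd.Series(topics)).astype(float)
per_value = one_hot.corrwith(labels)
print(per_value)
```

With the first ordering the correlation comes out at about zero, with the second it is strongly negative, even though the data is identical. The one-hot version does not depend on any ordering but returns one number per topic instead of one for the whole feature.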

proud yarrow Dec 6, 2022, 3:29 PM

#

i generally use onehot

#

```py
pd.get_dummies(df)
```

#

or you could encode them as giant binary numbers (how they are stored internally) and feed them in. That has the problem of managing such large numbers; it's less hard in Python, but for long strings it could be difficult.

lavish kraken Dec 6, 2022, 3:36 PM

#

molten latch ``` import pandas as pd df=pd.DataFrame({'name':['hamid','meow','billu'],'job':[...

This should help

arctic wedgeBOT Dec 6, 2022, 3:36 PM

#

Hey @lavish kraken!

It looks like you tried to attach file type(s) that we do not allow (.pdf). We currently allow the following file types: .gif, .jpg, .jpeg, .mov, .mp4, .mpg, .png, .mp3, .wav, .ogg, .webm, .webp, .flac, .m4a, .csv, .json.

Feel free to ask in #community-meta if you think this is a mistake.

serene scaffold Dec 6, 2022, 3:42 PM

#

@lavish kraken we only allow links to PDFs, sorry

lavish kraken Dec 6, 2022, 3:43 PM

#

ok, just trying to help the guy..let me copy the image and paste right?

wooden sail Dec 6, 2022, 3:49 PM

#

the image is fine. if the link to the pdf does not involve piracy, that's also fine

hushed kraken Dec 6, 2022, 4:09 PM

#

#

I get an error when importing a sequential model but I don't see why

#

Can someone help me please?

fresh tiger Dec 6, 2022, 4:11 PM

#

wooden sail this categorical approach has the caveat that the order in which you encode stri...

Does this mainly impact strings with more than 1 word? Ie would a single worded string not be impacted by this?

wooden sail Dec 6, 2022, 4:12 PM

#

fresh tiger Does this mainly impact strings with more than 1 word? Ie would a single worded ...

it would also affect the 1 word case

fresh tiger Dec 6, 2022, 4:16 PM

#

Just to confirm that I'm not doing it wrong or misunderstanding: with one-hot, would that essentially check the correlation between each value under the feature (in this case topic) and the outcome (when I give one-hot encoding a try, this is what happens)? I.e. with the StackOverflow version I sent in the previous message, I can get a correlation between the feature as a whole and the outcome value, while with one-hot it would return the correlation between each value that the feature can have and the outcome value?

wooden sail Dec 6, 2022, 5:15 PM

#

fresh tiger Just to confirm that im not doing it wrong/missunderstanding, with one-hot, woul...

no, they would both do the same thing

#

take one feature and compare it to another

#

that ofc includes taking a single value if you like

thin remnant Dec 6, 2022, 5:25 PM

#

I'm trying to convert this data to be a dataclass that holds a list of dataclasses

data:
DATA_POINTS = {
'data_points': [[1.0, 1.2132985766400843], [2.0, 1.164865727865016], [3.0, 1.1534609099056354],
[4.0, 1.148530443569608], [5.0, 1.1488081940756838], [6.0, 1.156518190001923]]
}

my code:

async def test_parse_datapointsTO():
    datapoint = DataTO.from_dict(DATA_POINTS)
    print("worked")

@dataclass
class DataTO(JSONWizard):
    some_field: str
    raw_data_points: list[DataPointTO]

@dataclass
class DataPointTO(JSONWizard):
    class _(JSONWizard.Meta):
        debug_enabled = True
        raise_on_unknown_json_key = True


    x: float
    y: float

    @property
    def data_point(self) -> list[float, float]:
        return [self.__x, self.__y]

    @data_point.setter
    def data_point(self, data_point: list[float, float]):
        self.__x = data_point[0]
        self.__y = data_point[1]

Can someone explain to me why this isn't working, or help me fix it 😮

Thanks in advance!

onyx geyser Dec 6, 2022, 5:34 PM

#

Hi dear fellows.

#

Does anyone here own this book:

Intro to Python for Computer Science and Data Science: Learning to Program with AI, Big Data and The Cloud.

If so, could you share it here please ...

serene scaffold Dec 6, 2022, 5:41 PM

#

onyx geyser Does anyone here own this book: **Intro to Python for Computer Science and Dat...

This server is not a place to seek out pirated books

lofty charm Dec 6, 2022, 5:42 PM

#

hi, is there a way to get confidence intervals when using scipy's fisher_exact?

plush jungle Dec 6, 2022, 6:28 PM

#

ok clearly I've done something wrong here

```py
    def backpropagate(self, output, y):

        # output neuron update
        
        # this outputs a scalar
        output_delta = -(y - output[2]) * output[2]* (1-output[2])
        # update the bias with this scalar
        self.output_layer.biases -= output_delta * self.lr

        # multiply the delta times the input to produce a 3-vector
        output_gradient = output_delta * output[1]
        # update the weights with this 3-vector
        self.output_layer.weights -= output_gradient * self.lr


        # hidden layer update

        # this outputs a 3-vector
        hidden_delta = output_gradient * self.output_layer.weights * (1 - output[1])
        # update the biases with this 3 vector
        self.hidden_layer.biases -= hidden_delta * self.lr
        
        # reshape it from (3,) to (3,1)
        hidden_delta = np.reshape(hidden_delta, (3,1))
        # reshape the input from (10000,) to (1,10000)
        output[0] = np.reshape(output[0], (1,10000))
        
        # this returns a (3,10000) matrix
        hidden_gradient = np.matmul(hidden_delta, output[0])
        # update the hidden layer weights with this matrix
        self.hidden_layer.weights -= hidden_gradient * self.lr```

#

because the loss for one class keeps going up, and the loss for the other class keeps going down

agile cobalt Dec 6, 2022, 8:23 PM

#

some that I checked out myself:
just want to get stuff done? fast.ai on their website
want to understand it? andrew ng's on Coursera
don't like andrew's for whichever reason? sklearn inria mooc

#

there are some others on our website as well

#

!resources

arctic wedgeBOT Dec 6, 2022, 8:24 PM

#

Resources

The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.

agile cobalt Dec 6, 2022, 8:26 PM

#

some basic notion of algorithms and data structures might help, but you really don't need a lot to get started, and imo it's better to look things up as you need them than to try to get ready for the unknown

plush jungle Dec 6, 2022, 11:48 PM

#

thanks for the help! but I'm not sure I understand. output_gradient and self.output_layer.weights does have a * symbol

#

and hidden_gradient is using np.matmul

honest sentinel Dec 6, 2022, 11:49 PM

#

yeah my bad im only 16 im probably dumb lol

plush jungle Dec 6, 2022, 11:50 PM

#

honest sentinel yeah my bad im only 16 im probably dumb lol

don't discount yourself, imposter syndrome is bad enough as it is among programmers

#

but I don't see where you changed the code

sweet wagon Dec 6, 2022, 11:53 PM

#

I have zero experience whatsoever programming but I have had an AI write something for me, and I have been attempting to work out the kinks in visual studio code, not sure if it’s worth it but it’s pretty fun messing around with it nonetheless

#

Anybody have any advice for an absolute beginner in this situation lol

honest sentinel Dec 6, 2022, 11:56 PM

#

@plush jungle you should use the output layer's input (which is the hidden layer's output) to calculate the gradient, instead of the hidden layer's output. You can do this by changing the line output_gradient = output_delta * output[1] to output_gradient = output_delta * output[0]. This should fix the issue where the loss for one class keeps going up and the loss for the other class keeps going down.

#

maybe

iron basalt Dec 7, 2022, 12:00 AM

#

plush jungle ok clearly I've done something wrong here ```py def backpropagate(self, outp...

https://en.wikipedia.org/wiki/Backpropagation#Derivation

Backpropagation

In machine learning, backpropagation (backprop, BP) is a widely used algorithm for training feedforward artificial neural networks. Generalizations of backpropagation exist for other artificial neural networks (ANNs), and for functions generally. These classes of algorithms are all referred to generically as "backpropagation". In fitting a neura...

#

#

plush jungle Dec 7, 2022, 12:04 AM

#

honest sentinel @plush jungle you should use the output layer's input (which is the hidde...

but if I do that, then the output_gradient becomes shape (1,10000)

#

since output[0] is the pixel vector of the image

plush jungle Dec 7, 2022, 12:07 AM

#

iron basalt

this doesn't make sense. shouldn't there also be a multiplication with the previous layer's input? I don't see that on here

iron basalt Dec 7, 2022, 12:07 AM

#

plush jungle this doesn't make sense. shouldn't there also be a multiplication with the prev...

o_i

plush jungle Dec 7, 2022, 12:08 AM

#

the top equation for an output neuron looks exactly like my line here

```py
output_delta = (output[2] - y) * output[2]* (1-output[2])```
but then I do this
```py
output_gradient = output_delta * output[0]```

iron basalt Dec 7, 2022, 12:09 AM

#

Extra bit of information that might help.

plush jungle Dec 7, 2022, 12:10 AM

#

oh wait I see it squiggle. the second picture you posted has it

fresh tiger Dec 7, 2022, 12:11 AM

#

wooden sail that ofc includes taking a single value if you like

For example, when using one hot encoding:
```python
s_corr = df.topic.str.get_dummies().corrwith(df['Label_bias'].astype('category').cat.codes)
print(s_corr)

------output-------
abortion -0.051860
coronavirus 0.002015
elections-2020 -0.024377
environment 0.017935
gender 0.055068
gun-control 0.005897
immigration 0.029517
international-politics-and-world-news 0.026522
middle-class -0.013546
sport 0.145206
student-debt 0.005479
trump-presidency -0.085066
vaccines -0.015487
white-nationalism -0.095884
```

```python
dfCopy = df.copy()

dfCopy['topic'] = dfCopy['topic'].astype('category').cat.codes
dfCopy['Label_bias'] = dfCopy['Label_bias'].astype('category').cat.codes
dfCopy.corr()['Label_bias']['topic']

------output-------
-0.03312874060844754
```

I see these as 2 different perspectives, i.e. the one-hot version shows how each specific value of topic has an impact on the label. Not sure if this makes sense, but to convert it to a single value (i.e. to judge the correlation of the feature "topic" as a whole with the output label), would it make sense to just calculate the average of all the output values of the one-hot encoding?

plush jungle Dec 7, 2022, 12:11 AM

#

so how is my code any different? it looks exactly the same

iron basalt Dec 7, 2022, 12:13 AM

#

https://en.wikipedia.org/wiki/Matrix_multiplication#Definition

Matrix multiplication

In mathematics, particularly in linear algebra, matrix multiplication is a binary operation that produces a matrix from two matrices. For matrix multiplication, the number of columns in the first matrix must be equal to the number of rows in the second matrix. The resulting matrix, known as the matrix product, has the number of rows of the first...

#

Remember that * in numpy is element-wise multiplication.

#

Not matrix multiplication or the dot product.
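
A small sketch of the difference, with arbitrary numbers chosen for illustration:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[10, 20],
              [30, 40]])

elementwise = A * B        # multiplies matching entries: A[i, j] * B[i, j]
matrix_product = A @ B     # same as np.matmul(A, B): rows of A times columns of B

print(elementwise)
print(matrix_product)

v = np.array([1, 2])
w = np.array([3, 4])
print(v * w)               # element-wise, still a vector
print(np.dot(v, w))        # dot product, a single number
```

For arrays of matching shape `*` never changes the shape, while `@` and `np.dot` sum over the shared dimension.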

iron basalt Dec 7, 2022, 12:16 AM

#

iron basalt

What operation is being done here?

plush jungle Dec 7, 2022, 12:17 AM

#

that's matrix multiplication right?

iron basalt Dec 7, 2022, 12:18 AM

#

Yes, but also note that the delta only has one subscript.

#

So delta is a vector.

#

(Which can be seen as a matrix with only 1 column, so the indices are like j, 1 optionally)

plush jungle Dec 7, 2022, 12:19 AM

#

right

iron basalt Dec 7, 2022, 12:20 AM

#

Note that in the forward pass we did matrix-vector multiplication for multiple outputs.

#

Sort of distributing the inputs to all of the outputs.

#

And when we go backwards, we want to distribute the deltas to the inputs.

plush jungle Dec 7, 2022, 12:25 AM

#

so this line is wrong then right?

```py
        hidden_delta = output_gradient * self.output_layer.weights *(1 - output[1])```
it doesn't have np.matmul(w_j, dl)

iron basalt Dec 7, 2022, 12:27 AM

#

Yes, also you can implement it with plain old loops which follows the subscript notation exactly.

#

Then convert to numpy / the linear algebra notation way after.
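
As a sketch of that loops-first approach for the hidden deltas: the loop below follows the subscript form delta_j = (sum over l of w_jl * delta_l) * o_j * (1 - o_j), and the last lines show the NumPy equivalent. The sizes, the random values, and the storage layout `W_out[l, j]` (one row per output neuron) are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_hidden, n_out = 3, 2

# Assumed layout: W_out[l, j] is the weight from hidden neuron j to output neuron l.
W_out = rng.normal(size=(n_out, n_hidden))
hidden_o = rng.uniform(0.1, 0.9, size=n_hidden)   # hidden layer outputs o_j
output_delta = rng.normal(size=n_out)             # deltas of the output layer

# Loop version, following the subscripts exactly.
hidden_delta_loop = np.zeros(n_hidden)
for j in range(n_hidden):
    total = 0.0
    for l in range(n_out):
        total += W_out[l, j] * output_delta[l]
    hidden_delta_loop[j] = total * hidden_o[j] * (1 - hidden_o[j])

# Same thing with the sum absorbed into a matrix-vector product.
hidden_delta_vec = (W_out.T @ output_delta) * hidden_o * (1 - hidden_o)

print(np.allclose(hidden_delta_loop, hidden_delta_vec))
```

Only the sum over `l` becomes a matrix product; the `o_j * (1 - o_j)` factor stays element-wise.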

plush jungle Dec 7, 2022, 12:27 AM

#

but what are w_j and dl here

#

w_j I'm guessing is

#

self.output_layer.weights```

#

but is dl the output_gradient?

iron basalt Dec 7, 2022, 12:30 AM

#

Whatever you named it.

plush jungle Dec 7, 2022, 12:30 AM

#

iron basalt

de/do is definitely output_gradient

#

but I actually don't know what do/dnet is

iron basalt Dec 7, 2022, 12:31 AM

#

plush jungle but is dl the output_gradient?

"net" here is explained in the wikipedia post, it's the input to the sigmoid (sigmoid(net)).

plush jungle Dec 7, 2022, 12:31 AM

#

so if we're calculating the gradient to adjust weights j, it would be the input to neuron j

iron basalt Dec 7, 2022, 12:32 AM

#

weights_i,j

#

Weights are a matrix.

plush jungle Dec 7, 2022, 12:32 AM

#

right

iron basalt Dec 7, 2022, 12:32 AM

#

iron basalt

.

plush jungle Dec 7, 2022, 12:33 AM

#

and delta_j is an elementwise multiplication of those two derivatives?

iron basalt Dec 7, 2022, 12:33 AM

#

Exactly as the equations are written, you just need the loops for the indices.

#

Which can be absorbed into notation as before with dot and matrix product.

#

delta here is a vector, as can be seen by the single subscript, so delta_j is a single component of that vector.

iron basalt Dec 7, 2022, 12:34 AM

#

iron basalt

Try reading through the wikipedia section on matrix multiplication and see if you can understand how the subscript definition works.

#

Then see if you can write matrix-vector multiplication in subscript form.
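
One possible answer to that exercise, as a sketch: in subscript form matrix-vector multiplication is y_i = sum over j of W_ij * x_j, which maps directly onto two nested loops. The function name `matvec` and the example numbers are made up.

```python
import numpy as np

def matvec(W, x):
    # y[i] = sum over j of W[i, j] * x[j]
    n_rows, n_cols = W.shape
    y = np.zeros(n_rows)
    for i in range(n_rows):
        for j in range(n_cols):
            y[i] += W[i, j] * x[j]
    return y

W = np.array([[1.0, 0.0, 2.0],
              [0.0, 3.0, 1.0]])
x = np.array([1.0, 2.0, 3.0])

print(matvec(W, x))
print(np.allclose(matvec(W, x), W @ x))
```

The index that appears twice (`j`) is the one that gets summed away; the one left over (`i`) is the index of the result.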

plush jungle Dec 7, 2022, 12:35 AM

#

iron basalt

I'm still struggling with this

#

what does dl represent

iron basalt Dec 7, 2022, 12:36 AM

#

The deltas or """errors""".

plush jungle Dec 7, 2022, 12:36 AM

#

so not the previous layer's gradient, but the previous layer's delta?

iron basalt Dec 7, 2022, 12:37 AM

#

You can get away with the term gradient, but it's not exactly right.

#

So just "deltas".

plush jungle Dec 7, 2022, 12:38 AM

#

delta for the output is calculated like this

```py
output_delta = (output[2] - y) * output[2]* (1-output[2])```
but once you do this
```py
output_gradient = output_delta * output[1]```
what is it called if not the gradient?

iron basalt Dec 7, 2022, 12:38 AM

#

Gradient is the end result you get from all this stuff before you use it to update the weights.

plush jungle Dec 7, 2022, 12:39 AM

#

so technically it's only a gradient once you have it for every layer?

iron basalt Dec 7, 2022, 12:39 AM

#

The deltas are not dE/dW, though they are used to form it.

#

It's part of it.

iron basalt Dec 7, 2022, 12:40 AM

#

plush jungle delta for the output is calculated like this ```py output_delta = (output[2] - y...

As for this, i'm not exactly sure if it's correct (i'm not following all of your code), but if it's what you end up applying to the weights when you subtract by alpha times it, then yeah.

plush jungle Dec 7, 2022, 12:40 AM

#

I see. is there a shorter way to write that than

derivative_of_error_with_respect_to_weights```

sweet crypt Dec 7, 2022, 12:41 AM

#

Sorry to jump into the conversation, but has anyone used JAX jit functions with Haiku?

iron basalt Dec 7, 2022, 12:43 AM

#

plush jungle I see. is there a shorter way to write that than ``` derivative_of_error_with_r...

nabla?

plush jungle Dec 7, 2022, 12:43 AM

#

iron basalt

for the inner neurons I'm thinking something like this

        hidden_delta =  np.matmul(self.hidden_layer.weights, output_delta) * output[1] * (1-output[1])

but I think it's wrong since it's the wrong shape.

#

everything looks right except output_delta

iron basalt Dec 7, 2022, 12:44 AM

#

What is output_delta?

plush jungle Dec 7, 2022, 12:44 AM

#

which was calculated like this

```py
output_delta = (output[2] - y) * output[2]* (1-output[2])```

#

but if dl isn't output_delta, what is it?

iron basalt Dec 7, 2022, 12:45 AM

#

So from the equation you can see that there is no (o - y) for the inner neurons.

#

If this is the output layer then it's correct.

#

Oh wait, ok.

#

So what is the shape mismatch?

plush jungle Dec 7, 2022, 12:47 AM

#

ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 3 is different from 1)```

#

hidden layer weights should be a 3x10000 matrix

iron basalt Dec 7, 2022, 12:47 AM

#

What is the shape of output_delta?

plush jungle Dec 7, 2022, 12:48 AM

#

(1,1), it's a scalar

iron basalt Dec 7, 2022, 12:49 AM

#

The weights should be the ones from the hidden to the output.

plush jungle Dec 7, 2022, 12:49 AM

#

iron basalt The weights should be the ones from the hidden to the output.

oh

iron basalt Dec 7, 2022, 12:50 AM

#

w_j,l

#

Weights from previous layer j, to l.

plush jungle Dec 7, 2022, 12:50 AM

#

ok it now looks like this and throws no errors

```py
    def backpropagate(self, output, y):

        # output neuron update
        
        # this outputs a scalar
        output_delta = (output[2] - y) * output[2]* (1-output[2])
        # update the bias with this scalar
        self.output_layer.biases -= output_delta * self.lr

        # multiply the delta times the input to produce a 3-vector
        output_gradient = output_delta * output[1]
        
        # update the weights with this 3-vector
        self.output_layer.weights -= output_gradient * self.lr


        # hidden layer update

        # this outputs a 3-vector
        hidden_delta =  np.matmul(output_delta, self.output_layer.weights) * output[1] * (1-output[1])
        # update the biases with this 3 vector
        self.hidden_layer.biases -= hidden_delta * self.lr
        
        # reshape it from (3,) to (3,1)
        hidden_delta = np.reshape(hidden_delta, (3,1))
        # reshape the input from (10000,) to (1,10000)
        output[0] = np.reshape(output[0], (1,10000))
        
        # this returns a (3,10000) matrix
        hidden_gradient = np.matmul(hidden_delta, output[0])
        # update the hidden layer weights with this matrix
        #self.hidden_layer.weights -= hidden_gradient * self.lr
        self.hidden_layer.weights -= hidden_delta * self.lr```

#

but something is still wrong

#

when given class 0 and class 1, it only goes in one direction, adjusting the weights either positively or negatively. it should alternate, adjusting it towards the class label

iron basalt Dec 7, 2022, 12:54 AM

#

The reshaping seems error prone.

plush jungle Dec 7, 2022, 12:54 AM

#

why

iron basalt Dec 7, 2022, 12:56 AM

#

Your hidden delta should be the correct shape from having done the matmul correctly.

plush jungle Dec 7, 2022, 12:58 AM

#

iron basalt Your hidden delta should be the correct shape from having done the matmul correc...

if I print its shape out before I reshape, it's 1,3

#

then reshape makes it 3,1

iron basalt Dec 7, 2022, 12:58 AM

#

Other than that, it's just setting off my warnings, idk, subconscious.

#

So, one thing that could be wrong is signs. Try += instead of -= for the weight updates.

#

Oh.

#

Don't update the weights until the end.

#

You are updating the output layer weights before continuing back.

plush jungle Dec 7, 2022, 1:00 AM

#

OH

#

wow yeah

iron basalt Dec 7, 2022, 1:01 AM

#

Then multiplying with the updated, not the old.

plush jungle Dec 7, 2022, 1:01 AM

#

I'm changing the weights and then using the changed version

iron basalt Dec 7, 2022, 1:01 AM

#

This is why everyone uses an autodiff tool, manually doing it is painful.

#

Do phases of compute the gradients, then update with them.

#

Two separate paragraphs / sections.
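
#

A minimal sketch of why the ordering matters, with made-up numbers (the weights, activations, delta and learning rate below are hypothetical, not taken from the network above). It computes the hidden delta once from the original output weights and once from the already-updated ones:

```python
import numpy as np

# Hypothetical values for a 3-neuron hidden layer feeding 1 output neuron.
W_out = np.array([[0.1, -0.4, 0.7]])   # hidden -> output weights, shape (1, 3)
h = np.array([0.2, 0.5, 0.9])          # hidden layer activations, shape (3,)
output_delta = np.array([0.2])         # delta at the output neuron, shape (1,)
lr = 0.5

# Phase 1: compute every gradient from the current (old) weights.
output_gradient = np.outer(output_delta, h)            # shape (1, 3)
hidden_delta = (output_delta @ W_out) * h * (1 - h)    # shape (3,)

# Phase 2: only now touch the parameters.
W_out_new = W_out - lr * output_gradient

# What the "update first, then keep going back" ordering computes instead.
hidden_delta_stale = (output_delta @ W_out_new) * h * (1 - h)

print(np.allclose(hidden_delta, hidden_delta_stale))   # False
```

The two deltas differ, so updating the output layer early quietly changes the gradient that reaches the hidden layer.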

plush jungle Dec 7, 2022, 1:16 AM

#

ok the only reshapes I have now are turning the biases from (,3) to (1,3) and the x from (,10000) to (1,10000)

#

    def backpropagate(self, output, y):
        # this outputs a scalar
        output_delta = (output[2] - y) * output[2]* (1-output[2])
        # multiply the delta times the input to produce a 3-vector
        output_gradient = output_delta * output[1]


        # this outputs a 3-vector
        hidden_delta =  np.matmul(output_delta, self.output_layer.weights) * output[1] * (1-output[1])
        # reshape the input from (,10000) to (1,10000)
        output[0] = np.reshape(output[0], (1,10000))
        # this returns a (3,10000) matrix
        hidden_gradient = np.matmul(hidden_delta.T, output[0])

        
        
        # update the output bias
        self.output_layer.biases -= output_delta * self.lr
        # update the output weights
        self.output_layer.weights -= output_gradient * self.lr

        # update the hidden layer biases
        self.hidden_layer.biases -= hidden_delta * self.lr
        # update the hidden layer weights
        self.hidden_layer.weights -= hidden_gradient * self.lr

#

instead of flipping it with reshape I used transpose for this line

hidden_gradient = np.matmul(hidden_delta.T, output[0])

#

but the predictions are still moving in only one direction

iron basalt Dec 7, 2022, 1:17 AM

#

You have a += instead of -= for one of them.

plush jungle Dec 7, 2022, 1:17 AM

#

fixed that, but still no dice

#

error goes up for one class and down for the other

iron basalt Dec 7, 2022, 1:18 AM

#

plush jungle error goes up for one class and down for the other

Did you shuffle the inputs?

#

What is the lr?

plush jungle Dec 7, 2022, 1:19 AM

#

iron basalt Did you shuffle the inputs?

I do not shuffle the inputs. I've tried learning rates between .5 and .00005 but it's the same thing. high learning rates eventually converge and stop updating

#

['0[[0.25051035]]', '1[[0.00313093]]']
['0[[0.24686542]]', '1[[0.24683338]]']
['0[[0.2468654]]', '1[[0.2468654]]']
['0[[0.2468654]]', '1[[0.2468654]]']
['0[[0.2468654]]', '1[[0.2468654]]']

#

loss for class 0 and class 1 examples with a high learning rate

iron basalt Dec 7, 2022, 1:20 AM

#

Instead of setting output[0] to the shape, just pass the reshape directly to the matmul.

plush jungle Dec 7, 2022, 1:21 AM

#

same thing

iron basalt Dec 7, 2022, 1:21 AM

#

What is the forward pass?

plush jungle Dec 7, 2022, 1:23 AM

#

        output_vectors = nn.forward(input_vector)
        output_vectors.insert(0,input_vector)                  
        prediction = output_vectors[-1]
        error = L2_loss(image_class, prediction)
        nn.backpropagate(output_vectors, image_class)

iron basalt Dec 7, 2022, 1:24 AM

#

What is forward doing and what is backpropagate doing?

plush jungle Dec 7, 2022, 1:25 AM

#

backpropagate is the above function. forward is this

    def forward(self,x):
        output_vectors = []
        
        for layer in self.layers:
            # if the first layer
            if not output_vectors:
                # pass it the input vector
                x = np.tile(x,(layer.num_neurons, 1))
                output = layer.forward(x)
            else:
                # otherwise pass it the previous layer's output
                output = layer.forward(output_vectors[-1])

            output_vectors.append(output)
        
        return output_vectors

iron basalt Dec 7, 2022, 1:25 AM

#

backpropagate takes a single output vector, not multiple.

#

Oh wait, ok, it's for the layers.

#

Why is x tiled?

plush jungle Dec 7, 2022, 1:27 AM

#

because x is the input vector, 10000 length, but it gets passed to 3 hidden layer neurons

#

so I need to pass it to each neuron

#

so I make it (3,10000)

iron basalt Dec 7, 2022, 1:28 AM

#

The 3 neurons all share the same 10000 inputs.

#

If it's fully connected.

#

You don't duplicate them.

plush jungle Dec 7, 2022, 1:28 AM

#

yes, but np.tile duplicates the input vectors right

iron basalt Dec 7, 2022, 1:28 AM

#

You need to review what matrix multiplication does.

#

What does layer.forward do?

plush jungle Dec 7, 2022, 1:29 AM

#

    def forward(self,x):        
        return sigmoid(np.sum(self.weights * x) + self.biases)

#

ok i think I see. this only works if the weights and the input are the same shape

#

but if I didn't do tile and instead did matmul(x, self.weights)

iron basalt Dec 7, 2022, 1:30 AM

#

`*` is element-wise multiplication.

#

We want matrix multiplication, Wx+b.

plush jungle Dec 7, 2022, 1:31 AM

#

they're the same thing if the matrices are the exact same shape, right?

iron basalt Dec 7, 2022, 1:31 AM

#

Matrix multiplication basically resamples the inputs again for each output.

#

So no duplication needed.

#

You want each output neuron to do the dot between its weight vector and the input vector.

#

So there is 1 input vector, and N weight vectors.

#

Wx, matrix multiplication, does multiple dot products.

#

The weight vectors are stored in W.

#

Side by side.

#

When you do W^Tx it's basically doing the dot product between each weight vector and x.

#

And so you get out a vector where each component is the dot product result.
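
#

A small sketch of that, assuming the weight vectors are stored one per row (the (3, 10000) layout used in this network; if they were stored as columns you would use `W.T`). The sizes and values are made up to keep it readable:

```python
import numpy as np

# Hypothetical sizes: 3 neurons, 4 inputs (the chat uses 3 and 10000).
W = np.arange(12, dtype=float).reshape(3, 4)   # one weight vector per row
x = np.array([1.0, 0.0, 2.0, 1.0])             # the single shared input vector

# One dot product per neuron, done by hand...
by_hand = np.array([np.dot(w_row, x) for w_row in W])

# ...is exactly what the matrix-vector product computes in one call.
all_at_once = np.dot(W, x)

print(all_at_once)                          # [ 7. 23. 39.]
print(all_at_once.shape)                    # (3,)
print(np.allclose(by_hand, all_at_once))    # True
```

No `np.tile` is needed: the same `x` is reused against every row of `W`.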

plush jungle Dec 7, 2022, 1:34 AM

#

like this

 def forward(self,x):        
   return sigmoid(np.dot(self.weights, x) + self.biases)

 output = layer.forward(x)

iron basalt Dec 7, 2022, 1:34 AM

#

We already covered this and you can find it in our previous messages.

plush jungle Dec 7, 2022, 1:34 AM

#

yeah

#

except wait, since weights is a matrix

#

it should be

 def forward(self,x):        
   return sigmoid(np.matmul(self.weights, x) + self.biases)

 output = layer.forward(x)

iron basalt Dec 7, 2022, 1:35 AM

#

matmul also accepts a 1-D x here, so either works; numpy lets us be lazy with dot.

plush jungle Dec 7, 2022, 1:35 AM

#

ok

iron basalt Dec 7, 2022, 1:36 AM

#

Lets us do stuff like (n, m) (m,).

plush jungle Dec 7, 2022, 1:36 AM

#

but changing it to np.dot and getting rid of tile threw this

  File "<__array_function__ internals>", line 180, in dot
ValueError: shapes (1,3) and (1,3) not aligned: 3 (dim 1) != 1 (dim 0)

iron basalt Dec 7, 2022, 1:36 AM

#

But note it will output (n,) not (n, 1).

#

Is your biases a vector?

plush jungle Dec 7, 2022, 1:37 AM

#

yes

#

        self.biases = np.random.rand(num_neurons)

        self.biases = np.reshape(self.biases, (1,num_neurons))

iron basalt Dec 7, 2022, 1:37 AM

#

What is the shape of x and weights?

#

Oh, why is biases reshaped?

#

Let it just be a vector.

plush jungle Dec 7, 2022, 1:38 AM

#

iron basalt Oh, why is biases reshaped?

because np.random.rand outputs (3,)

iron basalt Dec 7, 2022, 1:38 AM

#

We can temporarily reshape in backprop.

plush jungle Dec 7, 2022, 1:38 AM

#

ok

iron basalt Dec 7, 2022, 1:38 AM

#

(3,) is fine.

#

That is a vector.

plush jungle Dec 7, 2022, 1:39 AM

#

x and weights look like this

x: (10000,)
weights: (3, 10000)
x: (3,)
weights: (1, 3)

iron basalt Dec 7, 2022, 1:41 AM

#

Which dot is causing the error?

craggy patio Dec 7, 2022, 1:41 AM

#

wdym?

iron basalt Dec 7, 2022, 1:41 AM

#

For which weights / x.

plush jungle Dec 7, 2022, 1:42 AM

#

it looks like it's just the last two

#

wait actually

#

the forward error goes away

#

when I get rid of the bias reshape

#

but then there's an error on this line

        # this returns a (3,10000) matrix
        hidden_gradient = np.matmul(hidden_delta.T, output[0])

iron basalt Dec 7, 2022, 1:43 AM

#

What is the shape of hidden_delta and output[0] and what is the error?

#

I have an idea already.

plush jungle Dec 7, 2022, 1:44 AM

#

(3,) (1, 10000)

#

ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1 is different from 3)

iron basalt Dec 7, 2022, 1:45 AM

#

So before you had (3, 1) i'm guessing.

plush jungle Dec 7, 2022, 1:45 AM

#

yeah so I guess I'll just reshape hidden_delta

iron basalt Dec 7, 2022, 1:45 AM

#

But now that outputs are (n,) and not (n, 1) your deltas are also like that.

plush jungle Dec 7, 2022, 1:45 AM

#

but inside

iron basalt Dec 7, 2022, 1:45 AM

#

Yeah then transpose, transpose on (n,) does nothing.

#

(Annoying detail of numpy is this shape stuff)

#

(But you get used to it)

#

Just keep in mind what is a vector and what is a matrix.

#

(n,) vs (n, 1)

#

Two different things in numpy.
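
#

A quick sketch of the difference, using a 5-wide stand-in for the (1, 10000) input (the array names and sizes are hypothetical):

```python
import numpy as np

v = np.arange(3.0)        # shape (3,): a plain vector
col = v.reshape(3, 1)     # shape (3, 1): a column matrix
x = np.ones((1, 5))       # shape (1, 5): a row matrix

print(v.T.shape)          # (3,)   transposing a 1-D array changes nothing
print(col.T.shape)        # (1, 3)

# (3, 1) times (1, 5) gives the (3, 5) outer product needed for a weight gradient.
outer = np.matmul(col, x)
print(outer.shape)        # (3, 5)

# The 1-D version fails, the same way hidden_delta.T did above.
failed = False
try:
    np.matmul(v, x)       # (3,) against (1, 5): inner sizes 3 and 1 do not match
except ValueError:
    failed = True
print(failed)             # True
```

`np.outer(v, x)` is another way to get the same (3, 5) result without any reshaping.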

plush jungle Dec 7, 2022, 1:47 AM

#

does tensorflow do it differently?

iron basalt Dec 7, 2022, 1:47 AM

#

IDR, probably not.

plush jungle Dec 7, 2022, 1:47 AM

#

ok, fixed all those things, but the predictions are still going in one direction

#

i'ma go eat dinner, thanks so much for your help so far

iron basalt Dec 7, 2022, 1:48 AM

#

(n,) basically does not assume you want your vectors to be column or row vectors by default.

#

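And will try to auto match with stuff like dot. matmul also accepts 1-D arrays, but it only treats a (n,) as a row on the left or a column on the right, so something like (3,) times (1, 10000) still fails and needs an explicit (3, 1).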

silent spade Dec 7, 2022, 2:21 AM

#

Quick question on linear regression models. If our target data is skewed, do we want to transform the data or leave it alone?

charred light Dec 7, 2022, 4:57 AM

#

silent spade Quick question on linear regression models. If our target data is skewed, do we ...

Short answer: Yes, transform the data.*
Long answer: https://anshikaaxena.medium.com/how-skewed-data-can-skrew-your-linear-regression-model-accuracy-and-transfromation-can-help-62c6d3fe4c53

wooden sail Dec 7, 2022, 5:00 AM

#

short and accurate answer: linear regression often uses the least squares solution, but this is only optimal for normally distributed residuals

#

the estimator is fine, but a different solution approach would be needed

sleek tapir Dec 7, 2022, 6:52 AM

#

#

is this funnel shape

charred light Dec 7, 2022, 7:16 AM

#

sleek tapir is this funnel shape

Yes, heteroskedasticity.

sleek tapir Dec 7, 2022, 7:42 AM

#

charred light Yes, heteroskedasticity.

this is np.log as well

charred light Dec 7, 2022, 7:46 AM

#

sleek tapir this is np.log as well

Yes, but taking the log isn't a magical cure-all. It depends on the data you're working with and the relationships between X and Y.

boreal gale Dec 7, 2022, 9:56 AM

#

https://github.com/microsoft/gather
this looks pretty slick, anyone tried it before?

GitHub

GitHub - microsoft/gather: Spit shine for Jupyter notebooks 🧽✨

Spit shine for Jupyter notebooks 🧽✨. Contribute to microsoft/gather development by creating an account on GitHub.

patent lynx Dec 7, 2022, 10:16 AM

#

I wanted to ask regarding the optimization on linear regression

#

#

how do you write the formula for m in python?

#

i tried:

#

for i in range(len(xdat)):
  m1 = (np.sum(xdat[i]-xbar)*ydat[i])
  m2 = (np.sum((xdat[i]-xbar)**2))
m = m1/m2

#

but this formula does not give correct value

#

however this works:

#

x = []
xi = []
x.append((xdat[i]-xbar)*ydat[i])
xi.append((xdat[i]-xbar)**2)
m = np.sum(x)/(np.sum(xi))

#

xdat and ydat are the arrays of x and y values of the points:

#

correct ans

lapis sequoia Dec 7, 2022, 10:22 AM

#

Hello, I have a question. I have a pandas dataframe.

Index,Number,datetime,counter

2488 196 2022-12-06 08:02:00 14496
2489 186 2022-12-06 09:05:00 15551
2490 138 2022-12-06 10:29:00 5448
2491 140 2022-12-06 10:30:00 4749
2492 140 2022-12-06 10:31:00 4749

I need to create a dataframe with the newest counter for each Number.
I know how to do that with sql tools, I would like to use pandas. Can you help me?

patent lynx Dec 7, 2022, 10:22 AM

#

wrong method ans:

wooden sail Dec 7, 2022, 12:16 PM

#

patent lynx ```py x = [] xi = [] x.append((xdat[i]-xbar)*ydat[i]) xi.append((xdat[i]-xbar)**...

the only difference i see is that you didn't write the loop in this second one, idk if you did that just to save time when posting here. otherwise the two expressions are the same.

#

ah nvm, i got it. you mixed some stuff up

patent lynx Dec 7, 2022, 12:16 PM

#

Oops yeah both should have the loop

wooden sail Dec 7, 2022, 12:17 PM

#

you do np.sum, but it's not needed because what's inside the sum is just a scalar. you're not summing up at all in the first one

#

you can remove the sum and instead use +=

#

you're just overwriting m1 and m2 at every iteration, ignoring the older results

#

that'd be the difference
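
#

A sketch of the loop with the accumulation fixed, keeping the `xdat`, `ydat` and `xbar` names from the question. The data values are made up for illustration:

```python
import numpy as np

# Hypothetical data points.
xdat = np.array([1.0, 2.0, 3.0, 4.0])
ydat = np.array([2.0, 4.1, 5.9, 8.0])
xbar = xdat.mean()

m1 = 0.0   # running numerator
m2 = 0.0   # running denominator
for i in range(len(xdat)):
    # += keeps the earlier terms instead of overwriting them
    m1 += (xdat[i] - xbar) * ydat[i]
    m2 += (xdat[i] - xbar) ** 2

m = m1 / m2
print(m)
```

For this data the slope comes out at about 1.98.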

#

also note that there is no need to loop if you use numpy

patent lynx Dec 7, 2022, 12:18 PM

#

Thank you

wooden sail Dec 7, 2022, 12:20 PM

#

so i'd do

import numpy as np
#define vectors x, y and scalars x_bar, y_bar
m = np.sum((x - x_bar)*y)/np.sum((x - x_bar)**2)
c = y_bar - m*x_bar

#

you can speed that up a little if you use dot products instead, too
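
#

Roughly what the dot product version could look like, assuming `x` and `y` are 1-D arrays of the same length (the sample values are hypothetical):

```python
import numpy as np

# Hypothetical data points.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.1, 5.9, 8.0])
x_bar, y_bar = x.mean(), y.mean()

dx = x - x_bar                        # centred x, computed once and reused
m = np.dot(dx, y) / np.dot(dx, dx)    # same sums as before, written as dot products
c = y_bar - m * x_bar

print(m, c)
```

`np.dot(dx, y)` is the sum of the element-wise products, so it gives the same numerator as `np.sum((x - x_bar)*y)` without building the intermediate array.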

hasty mountain Dec 7, 2022, 12:25 PM

#

Guys, in GANs, if my Generator has around 1.000.000 trainable parameters, and my Discriminator has 50.000 trainable parameters, does this mean that my Generator tends to get more optimized than my Discriminator, thus leading to a lack of convergence? Or does it not matter at all, since the discriminator coordinates the optimization?

unreal vine Dec 7, 2022, 12:31 PM

#

hi there, I am martin and looking for someone to work with me on a python app which is used for data visualization. I have created the basic app and it works fine on android but i am kinda weak with matplotlib, if someone is good with matplotlib, they should join me

shell sequoia Dec 7, 2022, 1:22 PM

#

unreal vine hi there, I am martin and looking for someone to work with me in python app whic...

Hi martin i am data scientist and working at tableau right now

#

Are you sure you have created an app which works on android without an issue? Even we at salesforce (tableau) can't do much on android. Our basic graphs don't fit the screen size properly

frozen geyser Dec 7, 2022, 1:27 PM

#

unreal vine hi there, I am martin and looking for someone to work with me in python app whic...

What do you use for visualization and what type of visualization do you plan to do? Matplotlib is nice, but it is not an ultimate package. There are many visually beautiful and easy-to-use frameworks, like seaborn or plotly-dash, that allow dynamic charts.

unreal vine Dec 7, 2022, 1:31 PM

#

frozen geyser What do you use for visualization and what type of visualization do you plan to ...

plotly will not work on android and i have added all the plots of seaborn

unreal vine Dec 7, 2022, 1:32 PM

#

shell sequoia Are you sure you have created a app which works on android without an issue even...

yes i know the problem the thing is i have created my own backend for that with most concepts taken from matplotlib

shell sequoia Dec 7, 2022, 1:47 PM

#

@frozen geyser are you going to join him?

#

I am kinda thinking

frozen marten Dec 7, 2022, 2:40 PM

#

guys I have a task where I need to predict the year of something happening given the scenario. What kind of ml models can be put to use in this kind of scenario?

#

ping me on reply

frozen geyser Dec 7, 2022, 2:55 PM

#

frozen marten guys I have a task where I need to predict the year of something happening given...

Could you be more specific on the dataset? Do you have a time-series task or tabular data?

frozen marten Dec 7, 2022, 3:00 PM

#

frozen geyser Could you be more specific on the dataset? Do you have a time-series task or tab...

ok, so basically i'm working on dataset which has court cases

#

the attributes are yearoffiling, judge position, region where this case was taking place, year of decision, etc... so i need to predict the year when the decision will be taken given above attributes

#

well i dont think this is a time series

#

Linear regression can definitely not be used...

#

knn not possible as need to predict future dates

#

time series model such as xgboost is also not possible.... cos i dont find the problem fits into this category

#

how do i go about with this?🤔

frozen marten Dec 7, 2022, 3:26 PM

#

@frozen geyser anything you would suggest?

floral hollow Dec 7, 2022, 3:41 PM

#

HOW CAN I FIX TENSORFLOW ERROR

#

Call arguments received by layer "conv2d_3" (type Conv2D):
  • inputs=tf.Tensor(shape=(None, 2, 2, 64), dtype=float32)

#

code:
```py
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))

model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10))
```

#

error is on the last model.add(layers.Conv2D(64, (3, 3), activation='relu'))

#

python 3.10.7 using tensorflow-cpu==2.10.0

wooden sail Dec 7, 2022, 3:44 PM

#

show the error so we can see

floral hollow Dec 7, 2022, 3:45 PM

#

floral hollow ``` Call arguments received by layer "conv2d_3" (type Conv2D): • inputs=tf.Ten...

this

wooden sail Dec 7, 2022, 3:45 PM

#

show the whole error

floral hollow Dec 7, 2022, 3:46 PM

#

Traceback (most recent call last):
  File "d:/real_Python/projects/test/test/main.py", line 127, in <module>
    model.add(layers.Conv2D(64, (3, 3), activation='relu'))
  File "D:\real_Python\projects\test\lib\site-packages\tensorflow\python\trackable\base.py", line 205, in _method_wrapper
    result = method(self, *args, **kwargs)      
  File "D:\real_Python\projects\test\lib\site-packages\keras\utils\traceback_utils.py", line 70, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "D:\real_Python\projects\test\lib\site-packages\tensorflow\python\framework\ops.py", line 1969, in _create_c_op
    raise ValueError(e.message)
ValueError: Exception encountered when calling layer "conv2d_3" (type Conv2D).

Negative dimension size caused by subtracting 3 from 2 for '{{node conv2d_3/Conv2D}} = Conv2D[T=, 1], explicit_paddings=[], padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true](Placeholder, conv2d_3/Conv2D/ReadVariableOp)' with input shapes: [?,2,2,64], [3,3,64,64].

Call arguments received by layer "conv2d_3" (type Conv2D):
  • inputs=tf.Tensor(shape=(None, 2, 2, 64), dtype=float32)

wooden sail Dec 7, 2022, 3:46 PM

#

aha

#

you have too many layers of convolution and maxpooling

#

the size of the intermediate values is too small to go through a conv layer

floral hollow Dec 7, 2022, 3:48 PM

#

would this work?
```py
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
```

#

instead

#

just take away the two last model.adds

wooden sail Dec 7, 2022, 3:48 PM

#

should be the case.

#

you should read up on how discrete convolutions work

floral hollow Dec 7, 2022, 3:48 PM

#

new error though

wooden sail Dec 7, 2022, 3:49 PM

#

tensorflow's default behavior is to do no padding, meaning each convolution reduces each spatial dimension by kernel_size - 1 (so by 2 for a 3x3 kernel)
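
#

The shrinking can be traced by hand. A rough sketch, assuming the settings of the failing model (3x3 kernels, stride 1, no padding, 2x2 max pooling, 32x32 input); the helper names are made up for illustration:

```python
def conv_valid(size, kernel=3, stride=1):
    """Output side length of a convolution with no padding ('valid')."""
    return (size - kernel) // stride + 1

def max_pool(size, pool=2):
    """Output side length of non-overlapping max pooling."""
    return size // pool

size = 32
trace = [size]
# Layer order of the failing model: conv/pool three times, then a fourth conv.
for layer in ["conv", "pool", "conv", "pool", "conv", "pool", "conv"]:
    size = conv_valid(size) if layer == "conv" else max_pool(size)
    trace.append(size)

print(trace)   # [32, 30, 15, 13, 6, 4, 2, 0]
```

The fourth convolution gets a 2x2 input, which is smaller than its 3x3 kernel, so there is no valid output position left. That matches the "subtracting 3 from 2" in the error.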

floral hollow Dec 7, 2022, 3:51 PM

#

@wooden sail the code is:
```py
model.compile(optimizer='adam',
              loss=losses.sparse_categorical_crossentropy(from_logits=True),
              metrics=['accuracy'])
```

#

the new error is:

Traceback (most recent call last):
  File "d:/real_Python/projects/test/test/main.py", line 132, in <module>
    loss=losses.sparse_categorical_crossentropy(from_logits=True),
  File "D:\real_Python\projects\test\lib\site-packages\tensorflow\python\util\traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "D:\real_Python\projects\test\lib\site-packages\tensorflow\python\util\dispatch.py", line 1170, in op_dispatch_handler
    result = api_dispatcher.Dispatch(args, kwargs)
TypeError: Missing required positional argument

wooden sail Dec 7, 2022, 3:52 PM

#

TypeError: Missing required positional argument

#

missed an argument 😛

floral hollow Dec 7, 2022, 3:53 PM

#

what could i have missed??

wooden sail Dec 7, 2022, 3:55 PM

#

not sure

#

are you sure that's the correct line where the error happens?

floral hollow Dec 7, 2022, 3:57 PM

#

floral hollow the new error is: ``` Traceback (most recent call last): File "d:/real_Python/...

thats what it says

#

unless its lying

frozen marten Dec 7, 2022, 3:57 PM

#

frozen marten guys I have a task where I need to predict the year of something happening given...

anyone have any idea on this?

#

@wooden sail or @floral hollow any thoughts on this?

frozen geyser Dec 7, 2022, 4:12 PM

#

frozen marten ok, so basically i'm working on dataset which has court cases

Ok. So that is tabular data, and the task calls for a regression model.

Prepare the data:

* Do some EDA (exploratory data analysis) with statistics, boxplots, categorization, etc. to understand what is important and what is not.
* Perform data cleaning (fill in the missing data, eliminate outliers, clean/unite the categories, etc.).
* Prepare the data for the model (labeling/encoding/embedding/binning/PCA).

The choice of the model depends on the size of your dataset:

* Pick candidate models based on the data size. Try several, starting with linear/tree-based (LGB)/XGB/CatBoost and then shallow NNs. If nothing works, try more exotic ones (FM/FFM/DNNs) and/or ensembles of these models.

My humble guess: one of the Random Forest models will give you a good result, if your dataset is not very large.

frozen marten Dec 7, 2022, 4:17 PM

#

frozen geyser Ok. So that is a tabular data - regressive model. Prepare the data: * Do some E...

But how can I make the model predict years (integers)? When I use regression models they give horrible results which are nowhere close to expected and also include float numbers

frozen marten Dec 7, 2022, 4:17 PM

#

frozen geyser Ok. So that is a tabular data - regressive model. Prepare the data: * Do some E...

And thanks for such an elaborate answer, the way you've put your thoughts into words gives a good headstart

#

Also this is not a classification task, so I felt tree based models can't be used(I'm a beginner, so I'm open for any disagreements and looking for it)

frozen geyser Dec 7, 2022, 4:22 PM

#

frozen marten But how can I make the model predict years(integers) cos when I use regressive m...

I see no problem with float numbers being rounded to integers. 🙂
Most of the bad results are usually coming from the bad datasets. garbage-in -> garbage-out. This is why we need to make lots of efforts to clean the data.
tree-based models are good for regression tasks as well.
If you have some time, download Orange-Canvas software (you have it in Anaconda as well) and play with your dataset. It has very visual approach for modeling and visualization of results. Very useful as a first step. 🙂

frozen marten Dec 7, 2022, 4:24 PM

#

frozen geyser I see no problem with float numbers, been rounded to integers. 🙂 Most of the ba...

yes i thought initially about the rounding part.. but the problem is that when it had to predict 2018 or something of this sort it gives 3319.222 which on rounding is also not beneficial

#

garbage-in -> garbage-out

#

wow loved this

#

I'm done with eda and preprocessing

#

the model building is something i was struggling

#

I'll try looking at orange software (i do have anaconda)

#

Thanks a lot.

#

Loved the way you are conveying things

frozen geyser Dec 7, 2022, 4:25 PM

#

🙂

frozen marten Dec 7, 2022, 4:26 PM

#

When you are free let me know how did you get into the field of ML(looking for your first steps and resources that helped you in this journey)

#

(No hurries)

lapis sequoia Dec 7, 2022, 5:24 PM

#

i am dealing with a problem statement that requires me to convert speech (with lot of disturbances) to text
what algorithms can i start with?

low sky Dec 7, 2022, 5:34 PM

#

How can I get only key values from a dictionary?

onyx rock Dec 7, 2022, 5:36 PM

#

Hello

#

I am trying to use open cv to open an image

#

import cv2
import pytesseract


pytesseract.pytesseract.tesseract_cmd = "C:\\Program Files\\Tesseract-OCR\\tesseract.exe"

img = cv2.imread # from the tutorial I am watching , the guy is using imread , but in my case that function does not exist

pure sable Dec 7, 2022, 5:46 PM

#

low sky How can I get only key values from a dictionary?

dictionary.keys()
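
#

For example, with a made-up dictionary (the name `scores` and its contents are hypothetical):

```python
# Hypothetical example dictionary.
scores = {"alice": 3, "bob": 5}

keys_view = scores.keys()   # a view object that tracks the dictionary
keys_list = list(scores)    # a plain list of the keys, in insertion order

print(keys_list)            # ['alice', 'bob']
```

Use the list form if you need indexing or want a snapshot that does not change when the dictionary does.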

floral hollow Dec 7, 2022, 6:07 PM

#

frozen marten guys I have a task where I need to predict the year of something happening given...

im newb sry

unreal charm Dec 7, 2022, 7:00 PM

#

Hey, what should a dataset for a Q&A chatbot look like? I'm using BERT and so far I have these columns in a csv file: context, questions, answers, starting_point. Have I missed something?

nocturne eagle Dec 7, 2022, 7:20 PM

#

well, the OpenAI GPT-3 chatbot was trained on the davinci corpus, both text and code

#

for the base model, they used code-davinci-002 and then extended it with text-davinci-002 and text-davinci-002

#

woops, text-davinci-003

#

woops, I was wrong there, those are the model names

#

they used the commoncrawl corpus. it can be found here => https://commoncrawl.org/


plush jungle Dec 7, 2022, 7:57 PM

#

I'm pretty sure something is wrong with my backpropagation code

def backpropagate(self, output, y):
        # this outputs a scalar
        output_delta = (output[2] - y) * output[2]* (1-output[2])

        # multiply the delta times the input to produce a 3-vector
        output_gradient = np.matmul(
            np.reshape(output_delta, (1,1)),
            np.reshape(output[1], (1,3)))


        # this outputs a 3-vector
        hidden_delta =  np.matmul(output_delta, self.output_layer.weights) * output[1] * (1-output[1])
        # reshape the input from (,10000) to (1,10000)
        output[0] = np.reshape(output[0], (1,10000))


        # this returns a (3,10000) matrix
        hidden_gradient = np.matmul(np.reshape(hidden_delta, (1,3)).T, output[0])

        
        
        # update the output bias
        self.output_layer.biases -= output_delta * self.lr
        # update the output weights
        self.output_layer.weights -= output_gradient * self.lr

        # update the hidden layer biases
        self.hidden_layer.biases -= hidden_delta * self.lr
        # update the hidden layer weights
        self.hidden_layer.weights -= hidden_gradient * self.lr

#

when I train on two image classes, the predictions move towards one class on every single iteration instead of moving towards the class it's getting trained on

#

so the loss goes up for one class and down for the other

iron basalt Dec 7, 2022, 8:09 PM

#

plush jungle I'm pretty sure something is wrong with my backpropagation code ```py def backpr...

What is image_class for the first 20? Can you print them?

plush jungle Dec 7, 2022, 8:09 PM

#

iron basalt What is image_class for the first 20? Can you print them?

I set it so it alternates 0,1,0,1

#

with learning rate at .05 it goes like this

['0 : 0.49880739489615755', '1 : 0.9286612738113579']
['0 : 0.49685552291208046', '1 : 0.5031616959044104']
['0 : 0.49685551278683343', '1 : 0.5031444873024917']
['0 : 0.49685551278678103', '1 : 0.5031444872132199']

#

where the number on the left is the class and the number on the right is the prediction

#

as you can see they're both going lower, so even when it's training on image class 1 it adjusts towards 0

iron basalt Dec 7, 2022, 8:13 PM

#

What is the forward for a layer?

plush jungle Dec 7, 2022, 8:16 PM

#

iron basalt What is the forward for a layer?

what do you mean? the output of the forward pass?

iron basalt Dec 7, 2022, 8:16 PM

#

plush jungle what do you mean? the output of the forward pass?

The code for a forward pass on a layer.

plush jungle Dec 7, 2022, 8:16 PM

#

    def forward(self,x):
        return sigmoid(np.dot(self.weights, x) + self.biases)

#

this is nn.forward()

    def forward(self,x):
        output_vectors = []
        
        for layer in self.layers:
            # if the first layer
            if not output_vectors:
                # pass it the input vector
                output = layer.forward(x)
            else:
                # otherwise pass it the previous layer's output
                output = layer.forward(output_vectors[-1])

            output_vectors.append(output)
        
        return output_vectors

iron basalt Dec 7, 2022, 8:19 PM

#

output_gradient = output_delta * output[1]

plush jungle Dec 7, 2022, 8:19 PM

#

yeah, that multiplies the output of the hidden layer times the output_delta

#

should that be matmul?

iron basalt Dec 7, 2022, 8:19 PM

#

Final output is 1 neuron, layer before that is 3.

#

Weight matrix (1, 3).

plush jungle Dec 7, 2022, 8:20 PM

#

yeah

iron basalt Dec 7, 2022, 8:20 PM

#

Yeah.

#

matmul

plush jungle Dec 7, 2022, 8:24 PM

#

iron basalt matmul

ok, changed it to this

        output_gradient = np.matmul(
            np.reshape(output_delta, (1,1)),
            np.reshape(output[1], (1,3)))
still the same issue though

iron basalt Dec 7, 2022, 8:25 PM

#

Can you simplify this?
```py
# reshape the input from (,10000) to (1,10000)
output[0] = np.reshape(output[0], (1,10000))
# this returns a (3,10000) matrix
hidden_gradient = np.matmul(np.reshape(hidden_delta, (1,3)).T, output[0])
```

plush jungle Dec 7, 2022, 8:25 PM

#

iron basalt Can you simplify this? ```py # reshape the input from (,10000) to (1,10000) ou...

I could make it a one liner

#

if that's what you mean

#

    def backpropagate(self, output, y):
        # this outputs a scalar
        output_delta = (output[2] - y) * output[2]* (1-output[2])
        # multiply the delta times the input to produce a 3-vector

        output_gradient = np.matmul(
            np.reshape(output_delta, (1,1)),
            np.reshape(output[1], (1,3)))


        # this outputs a 3-vector
        hidden_delta =  np.matmul(output_delta, self.output_layer.weights) * output[1] * (1-output[1])


        # this returns a (3,10000) matrix
        hidden_gradient = np.matmul(
            np.reshape(hidden_delta, (3,1)),
            np.reshape(output[0], (1,10000)))

        
        
        # update the output bias
        self.output_layer.biases -= output_delta * self.lr
        # update the output weights
        self.output_layer.weights -= output_gradient * self.lr

        # update the hidden layer biases
        self.hidden_layer.biases -= hidden_delta * self.lr
        # update the hidden layer weights
        self.hidden_layer.weights -= hidden_gradient * self.lr

iron basalt Dec 7, 2022, 8:31 PM

#

plush jungle ```py def backpropagate(self, output, y): # this outputs a scalar ...

Try a more simple task first, not images.

plush jungle Dec 7, 2022, 8:31 PM

#

like what

iron basalt Dec 7, 2022, 8:31 PM

#

The xor problem.

#

[([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]

plush jungle Dec 7, 2022, 8:35 PM

#

2 inputs, 1 output

iron basalt Dec 7, 2022, 8:35 PM

#

Yes.

plush jungle Dec 7, 2022, 8:35 PM

#

so then the hidden layer neurons would have 2 weights

iron basalt Dec 7, 2022, 8:35 PM

#

Yes, but you should not have to hard code those values.

plush jungle Dec 7, 2022, 8:35 PM

#

yeah i didn't

iron basalt Dec 7, 2022, 8:35 PM

#

Just how many neurons per layer.
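
A minimal sketch of the XOR setup discussed above, assuming a 2-3-1 sigmoid network trained with plain per-sample gradient descent on squared error. The class name `TinyNet`, the layer sizes, the learning rate, the epoch count, and the seed are illustrative assumptions, not plush jungle's actual code. Only the number of neurons per layer is passed in; the weight shapes follow from it.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class TinyNet:
    """Hypothetical two-layer sigmoid network sized from `layer_sizes`."""

    def __init__(self, layer_sizes, lr=0.5, seed=0):
        n_in, n_hidden, n_out = layer_sizes
        rng = np.random.default_rng(seed)
        # One row of weights per neuron, one column per input to that layer.
        self.w1 = rng.normal(size=(n_hidden, n_in))
        self.b1 = np.zeros(n_hidden)
        self.w2 = rng.normal(size=(n_out, n_hidden))
        self.b2 = np.zeros(n_out)
        self.lr = lr

    def forward(self, x):
        h = sigmoid(self.w1 @ x + self.b1)
        o = sigmoid(self.w2 @ h + self.b2)
        return h, o

    def train_step(self, x, y):
        h, o = self.forward(x)
        # Deltas have the same shape as the biases of their layer.
        output_delta = (o - y) * o * (1 - o)                      # (n_out,)
        hidden_delta = (self.w2.T @ output_delta) * h * (1 - h)   # (n_hidden,)
        # Outer product: one row per neuron, one column per input.
        self.w2 -= self.lr * np.outer(output_delta, h)
        self.b2 -= self.lr * output_delta
        self.w1 -= self.lr * np.outer(hidden_delta, x)
        self.b1 -= self.lr * hidden_delta

    def loss(self, data):
        total = 0.0
        for x, y in data:
            _, o = self.forward(np.asarray(x, dtype=float))
            total += 0.5 * float(np.sum((o - y) ** 2))
        return total


data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
net = TinyNet([2, 3, 1])
loss_before = net.loss(data)
for _ in range(5000):
    for x, y in data:
        net.train_step(np.asarray(x, dtype=float), y)
loss_after = net.loss(data)
print(loss_before, loss_after)
```

Whether a run like this fully separates XOR depends on the initial weights and the learning rate, so the sketch only tracks the loss rather than promising a solved problem.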

iron basalt Dec 7, 2022, 9:22 PM

#

plush jungle ```py def backpropagate(self, output, y): # this outputs a scalar ...

What is hidden_delta's shape? What is the biases shape?

glass shale Dec 7, 2022, 9:59 PM

#

If I were to make a bot that is designed to answer questions from a quiz on chrome, what packages should I use? I'm new to ai development

plush jungle Dec 7, 2022, 10:00 PM

#

iron basalt What is hidden_delta's shape? What is the biases shape?

I changed it to the xor problem and it does the same thing:
```
input:ground_truth:prediction

before training
[0, 0] : 0 : 0.884287006863036
[0, 1] : 1 : 0.8970981298788248
[1, 0] : 1 : 0.9055810664805809
[1, 1] : 0 : 0.9148745573457218

after training
[0, 0] : 0 : 0.882638663530517
[0, 1] : 1 : 0.8955110093882327
[1, 0] : 1 : 0.9040762320307318
[1, 1] : 0 : 0.913433964744165
```

#

all the predictions just got lower

#

this is making me wonder if I've got like a sign error

iron basalt Dec 7, 2022, 10:17 PM

#

plush jungle I changed to to the xor problem and it does the same thing: ``` input:ground_tr...

Try y - output and += instead of -=.
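
A small sketch of what that swap does, assuming a single sigmoid neuron with made-up weights and inputs (all the numbers here are illustrative). Flipping the error to `y - output` and the update to `+=` at the same time produces the same step as `output - y` with `-=`; a real sign error would come from mixing one convention's error with the other's update.

```python
import numpy as np

lr = 0.1
w = np.array([0.2, -0.4, 0.7])   # made-up weights
x = np.array([1.0, 0.5, -1.5])   # made-up input
y = 1.0
output = 1.0 / (1.0 + np.exp(-(w @ x)))

# Convention A: error is (output - y), so the step is subtracted.
delta_a = (output - y) * output * (1 - output)
w_a = w - lr * delta_a * x

# Convention B: error is (y - output), so the step is added.
delta_b = (y - output) * output * (1 - output)
w_b = w + lr * delta_b * x

# Mixed: convention A's error with convention B's update moves uphill.
w_mixed = w + lr * delta_a * x

print(np.allclose(w_a, w_b), np.allclose(w_a, w_mixed))
```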

hasty mountain Dec 7, 2022, 10:23 PM

#

Hey guys, when extracting features from an image, I can use a neural network with an architecture optimized for feature extraction(VGG19, UNet) or I can use PCA, right? There's no "right or wrong". The difference is just that one option requires training and optimization through time, while the other doesn't?

#

Also, is there a parameter I can use to determine how much I should reduce the dimensions of my image in a feature extraction network? I know that VGG19 and UNet encoder tend to reduce dimensionality until it gets feature maps with shapes like 8x8x512, but why not stop at 16x16x256? Or at 4x4x1024? Or even 1x1x4048?

mild dirge Dec 7, 2022, 10:32 PM

#

Well pca and a feature extraction NN both reduce dimensions, but the way they do it is really different

#

I think for images, you would really want to use features extracted by such a NN, because they are made for the purpose of finding "useful" features in images

#

And those feature extraction NNs are normally made by having an encoder-decoder-like architecture, or a NN that has some task, and then chopping the last few layers off

#

The output size that you need is hard to say, and will most likely have to be found with some trial-and-error

iron basalt Dec 7, 2022, 10:38 PM

#

hasty mountain Also, is there a parameter I can use to determine how much I should reduce the d...

Depending on your ANN, it may be able to grow more neurons as needed dynamically. If not, it's a parameter that can be chosen via trial and error or some more complicated way. The correct choice is "big enough, but not so big that no significant reduction in the number of dimensions is happening."

hasty mountain Dec 7, 2022, 10:39 PM

#

I see... And a feature extraction might not be simply dimensionality reduction? Using a decoder might also help it?

iron basalt Dec 7, 2022, 10:39 PM

#

Depends on data / task.

mild dirge Dec 7, 2022, 10:41 PM

#

hasty mountain I see... And a feature extraction might not be simply dimensionality reduction? ...

Not really sure what you mean here

hasty mountain Dec 7, 2022, 10:42 PM

#

mild dirge And those feature extraction NNs are normally made by having a encoder-decoder l...

those feature extraction NNs are normally made by having an encoder-decoder-like architecture, or a NN that has some task, and then chopping the last few layers off

UNet has an encoder and a decoder, the last few layers are linear layers, if I remember correctly.

#

But VGG19 is just "encoder" + linear

mild dirge Dec 7, 2022, 10:42 PM

#

Unet only has convolutional layers iirc

hasty mountain Dec 7, 2022, 10:43 PM

#

In the encoder part. The decoder part upsamples

mild dirge Dec 7, 2022, 10:43 PM

#

The upsampling is also convolutional though no?

hasty mountain Dec 7, 2022, 10:43 PM

#

I think it has some conv layers which reduce the channel size and keep height and width, but they're followed by upsampling

hasty mountain Dec 7, 2022, 10:44 PM

#

mild dirge The upsampling is also convolutional though no?

I think the original one was conv + upsampling. Nowadays people tend to use transpose conv I guess

mild dirge Dec 7, 2022, 10:44 PM

#

Well anyways, vgg19 has convolutional and then fully connected, you can choose where you want to cut off

hasty mountain Dec 7, 2022, 10:46 PM

#

Ok, thanks!
I think I'll stick with the smallest dimension possible and then try using higher dimensions for the features extracted...but not big enough to kill my GPU

mild dirge Dec 7, 2022, 10:47 PM

#

It also depends on what you plan to do with the features

hasty mountain Dec 7, 2022, 10:49 PM

#

For now I'm planning to use for multi-label classification.
But now I'm curious... what if I want to use them for simple classification? How would things work in both cases?

mild dirge Dec 7, 2022, 10:50 PM

#

Doesn't change too much, in both cases you would basically make a network that takes the features and it ends up with some values for the output nodes

#

The difference is the activation function for the final layer

#

multi-label would be sigmoid, and multi-class (so it's 1 of the classes always) would be softmax
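
A short sketch of that difference, assuming three output nodes with made-up logits. Sigmoid scores each label independently, so several labels can be "on" at once; softmax makes the scores compete and sum to 1, so it suits the one-class-only case.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def softmax(z):
    shifted = z - np.max(z)  # subtract the max for numerical stability
    e = np.exp(shifted)
    return e / e.sum()


logits = np.array([2.0, -1.0, 0.5])  # made-up final-layer outputs

multi_label = sigmoid(logits)   # independent per-label probabilities
multi_class = softmax(logits)   # one distribution over the classes

print(multi_label, multi_label.sum())
print(multi_class, multi_class.sum())
```

With the sigmoid outputs you would typically threshold each label separately, while with softmax you would take the argmax.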

hasty mountain Dec 7, 2022, 10:51 PM

#

(Oh yes, I get confused with that difference)

mild dirge Dec 7, 2022, 10:51 PM

#

But what you are trying to do is basically transfer learning, using a trained model for a similar problem

#

Often you can use the original model (like vgg19) and then take off the final layer and stick a final layer on with the correct number of nodes and activation function

#

And probably freeze the weights for the convolutional layers and maybe some earlier fully connected

#

And then train on your data

#

So you only really train some fully connected layers, and it uses feature extraction from the pre-trained weights

hasty mountain Dec 7, 2022, 10:53 PM

#

Nah, I actually want to train my extractor from scratch. I'll just use a similar architecture because...well...because probably the folks tried to extract the crème de la crème using...like...1000 Tesla GPUs...

#

Besides, I won't be classifying real-world images like the ImageNet photos VGG was trained on; I'll classify game screenshots.

mild dirge Dec 7, 2022, 10:55 PM

#

Well vgg probably recognizes very basic shapes and patterns too like roundness, and cornerness and w/e so you could still use some pre-trained layers

#

but vgg19 is pretty big, so maybe just doing it from scratch would be easier 😛

plush jungle Dec 7, 2022, 11:26 PM

#

iron basalt Try y - output and += instead of -=.

yeah same thing with that

#

wait

#

it did this after 2000 epochs

#

```
[0, 0] : 0 : 0.07133328991870373
[0, 1] : 1 : 0.9867524208847188
[1, 0] : 1 : 0.04641128083308918
[1, 1] : 0 : 0.05997767236743279
```

#

but that's still not solving the xor problem

#

the two zero predictions are close to zero, and one of the 1 predictions is close to 1

#

but the other 1 prediction is closer to zero than the zero prediction

lapis sequoia Dec 8, 2022, 12:16 AM

#

I have this list of dictionaries I'm using for a bot with a neural network and I've been trying to do two things: make it so I can exclude certain dictionaries, and take those excluded dictionaries and get a different result with them, as shown above. But I'm really stumped and could use some help

rn_image_picker_lib_temp_f32acdc0-50f3-46ec-b566-20763a5a893e.jpg

rn_image_picker_lib_temp_e3b1379f-c776-4ca8-95fb-e55111b39b11.jpg

bold timber Dec 8, 2022, 2:22 AM

#

Hello guys I have a question: when we have 32 neurons like this, does it mean we have 32 hidden states in the RNN model?

brave sand Dec 8, 2022, 2:39 AM

#

has anyone worked with a custom image classifier? for a live video?

timber spoke Dec 8, 2022, 7:02 AM

#

onyx rock ```python import cv2 import pytesseract pytesseract.pytesseract.tesseract_cmd ...

cv2.imread(filepath)

true scaffold Dec 8, 2022, 7:05 AM

#

Hi, I'm trying to give a color to the whole row if a column value exists in a list, but it is not working, below is the code I'm using:

d = {
  'ids': ['id1', 'id2', 'id3', 'id4'],
  'values': ['Vanilla', 'Chocolate', 'Butterscotch', 'Strawberry']
}
dd = pd.DataFrame(d)

def highlight(row):
  if row.ids in ['id1', 'id3']:
    return ['background-color: red']
  else:
    return ''

dd.style.apply(highlight, axis=1)
dd

Can someone help?

torn saddle Dec 8, 2022, 7:09 AM

#

Hey everyone, I am new to Huggingface and I'm trying to Fine tune roberta-base on go_emotions dataset for Multi Label classification.
My final train dataset is a dictionary where the labels are one hot encoded.

I am using the Trainer class of Huggingface to train my model and when I run the trainer class an error pops up saying

"Value error: Classification metrics can't handle a mix of multilabel-indicator and binary targets"

I have tried decoding it by changing the shapes in the metrics function but nothing works.
Can someone please help me out with this?

old meteor Dec 8, 2022, 11:18 AM

#

Hey Python community, I hope this is ok to post here!

I’m currently building an academy for AI products. People can join and learn to build and launch a product based on AI.

You’ve probably seen the rush of AI products building on AI models like GPT3 and Stable Diffusion.

I'm working with some friends, all with backgrounds in machine learning / software engineering. We’re really excited about generative AI. We built a toy proof of concept here (https://use-persona.com/) that emulates reddit accounts.
Our experience here made us think it’s going to fundamentally change how we interact with computers.

We want to make sure as many people benefit from AI as possible. To do that, we’re making it as easy as possible for people to learn how to build and deploy products using these models. This will let you use AI to make a side income or solve a work problem.

The problem we are solving is: You have an idea for an AI-based solution to a problem, but you don’t know how to build it. We solve it by teaching you and dealing with the boring stuff (website, payments etc)

Here’s how we see it working:

The academy will work as a live online 4-week course. This will be part-time (~10 hours per week)
You’ll meet others on the program and share ideas
Most of the learning is done by tackling tasks specific to your idea. We’ll have weekly deadlines to keep you on track.
There will be 1-to-1s with us and live sessions each week
By the end, you’ll have built a product that solves a problem using AI. We can handle hosting and infrastructure for you - but it’s your product and you can move it elsewhere if you choose.

The way we’ll make money is primarily through a 20% revenue share of the products that come out. We’d ask for an upfront deposit to make sure we don’t get our time wasted.

We’re accepting applications: https://pages.viral-loops.com/ai-product-academy

Otherwise, we’d love your feedback!

Reddit Impersonator - generate comments from reddit accounts | Persona

Generate Reddit comments which impersonate a specific user. Use this to generate fake comments for a famous Reddit account, your own account, or for a friend's account.

AI Product Academy

fresh tiger Dec 8, 2022, 1:43 PM

#

Hey! Just had a question regarding feature selection and data cleaning. Is selection generally done after cleaning or before?

Mainly want to confirm if my choice here is correct:
I have two feature columns that I do not want to select for training, and these are the only 2 columns that contain null values. I handle null values by removing the data row as a whole. I am thinking, wouldn't it just be better to remove the columns before checking for null values so that I can have more training examples/rows?

serene scaffold Dec 8, 2022, 1:48 PM

#

fresh tiger Hey! Just had a question regarding feature selection and data cleaning. Is selec...

It depends. If you know you don't want to use a feature at all, then you don't need to clean it. But even if you don't use it directly, you might use it to drive other features, or something.

fresh tiger Dec 8, 2022, 1:52 PM

#

Alright thanks! In that case I'll remove them before cleaning (can retain 500 training examples now)!

coral perch Dec 8, 2022, 2:39 PM

#

Hey i have a very messy CSV file which i imported. I need help with formatting it so the data matches the right headers. Im a beginner at python and ive been trying to solve this for a few hours. Can someone please help me?

lapis sequoia Dec 8, 2022, 3:28 PM

#

can anyone help me with machine learning? i need to classify images according to their sentiments, i have the sentiments ready for each image and i also have the text separated from each image and stored in a csv, now i have to train on the images using ML models (from sklearn) but i don't know where to begin

unreal vine Dec 8, 2022, 3:59 PM

#

https://stackoverflow.com/questions/74730896/how-to-plot-max-and-min-value-in-seaborn-scatterplot?noredirect=1#comment131894859_74730896

Stack Overflow

how to plot max and min value in seaborn scatterplot?

import seaborn as sns
df = sns.load_dataset("penguins")
sns.scatterplot(data=df, x="island", y="body_mass_g",hue="sex",s=300)

I am using seaborn to ...

gilded bobcat Dec 8, 2022, 4:09 PM

#

Hey there I have a question on scaling some values, here's the issue:

I have a sparse matrix of tags for games on steam. I wanted to scale the values as frequencies for each game, so for instance CSGO might have 90 tags for Shooter and only 10 tags for Cooperative, and I would want .9 for Shooter and .1 for Cooperative (something like that). However I also want to ensure that these weights consider the total amount of votes each game has, so that Shooter X with 9 tags for Shooter and 1 tag for Cooperative isn't considered just as popular as CSGO....

this might be as easy as not doing anything to my matrix but I was curious on yall's thoughts

runic raft Dec 8, 2022, 5:38 PM

#

@sturdy parrot let's chat here where it's less busy

sturdy parrot Dec 8, 2022, 5:38 PM

#

runic raft @sturdy parrot let's chat here where it's less busy

Thank you for the time, yes.

#

As for experience, C, python and decent bit of js. I am quite comfortable and confident with python

runic raft Dec 8, 2022, 5:39 PM

#

If you've been writing C + Python for 2-ish years, I'd highly recommend working through the first few chapters of Aurélien Géron's Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow

#

It is a great book, and the second chapter walks you through a project, from start to finish, where you work on housing data from California to try and find houses that are of good value

sturdy parrot Dec 8, 2022, 5:41 PM

#

runic raft It is a great book, and the first chapter walks you through a project where work...

Thank you for the recommendations!

runic raft Dec 8, 2022, 5:41 PM

#

If you prefer courses, I hear the fast.ai courses are very good, but they put a large emphasis on deep learning, which while it is very popular, only gets used in a handful of places

#

IMO the best way to show that you are competent (aside from job experience) in AI/ML is building your own projects that solve problems you've found in everyday life

sturdy parrot Dec 8, 2022, 5:44 PM

#

runic raft IMO the best way to show that you are competent (aside from job experience) in A...

I am not actively looking for jobs, I am more of a hobbyist.

runic raft Dec 8, 2022, 5:45 PM

#

I see. If that's the case, then you should definitely feel more free to explore what you like 😄

sturdy parrot Dec 8, 2022, 5:45 PM

#

I have a decent chunk of projects but unfortunately they are incomplete due to irl work / studies 😆

runic raft Dec 8, 2022, 5:46 PM

#

Are you more interested in just learning how things work, or building stuff from libraries that already exist?

#

Of course most people have some interest in both, but I can suggest content based on whether you prefer building vs. learning

sturdy parrot Dec 8, 2022, 5:47 PM

#

runic raft Are you more interested in just learning how things work, or building stuff from...

I would say both :P

sturdy parrot Dec 8, 2022, 5:47 PM

#

runic raft Of course most people have some interest in both, but I can suggest content base...

Currently I am more into chat bots, although that seems a bit mainstream now.

runic raft Dec 8, 2022, 5:48 PM

#

It is quite mainstream now, haha. But there's nothing wrong with that!

sturdy parrot Dec 8, 2022, 5:52 PM

#

runic raft IMO the best way to show that you are competent (aside from job experience) in A...

AI/ML is building your own projects that solve problems you've found in everyday life
I have built projects (again, not complete) that are concerned with parsing. Recently I had projects where python code is taken and standard operations like +, -, * are converted to add, sub, mul from the operator library, and others where overload signatures from typing are automatically inferred for kwargs from existing ones.

(actually I am in my alt account disguise 🕵️ , you can see my gh here https://github.com/Achxy if you are interested in my projects)

runic raft Dec 8, 2022, 5:54 PM

#

If you're specifically interested in chat bots, I'd highly recommend you look into:

spaCy, it's a Python library you can use to build software tools that process lots of language. I've used it professionally and it's wonderful. Here are some tutorials so you can get an idea of the sort of tasks that are commonly done with spaCy: https://github.com/explosion/projects/tree/v3/tutorials
The hot talk of the town powering all of these hyped language models are Transformers. Personally I think PyTorch is the best deep learning library for people who want to practice and learn about deep learning at the moment. Their tutorial collection is also very, very good: https://pytorch.org/tutorials/

#

These libraries are just tools. The point of me suggesting them is that you can learn about the basic concepts while simultaneously learning how to use these tools. Maybe you'll like other tools more, but regardless of whether you're using these tools or other tools, the same concepts will apply

sturdy parrot Dec 8, 2022, 5:58 PM

#

runic raft These libraries are just tools. The point of me suggesting them is that you can ...

Thank you very much for the time and willingness to help!!
I will be looking into the pytorch tutorials that you sent, as I could look more into general ai things before getting into chatbots specifically.

grave swallow Dec 8, 2022, 6:00 PM

#

is there any site yall would recommend to train an image recognition model? i have the code but need a model would be highly appreciated. thanks.

runic raft Dec 8, 2022, 6:01 PM

#

For sure! Always happy to help!

For what it's worth, I actually had a really good time learning how to work with text by writing search engines for text documents. I learned most of my NLP knowledge by trying to write a search engine for academic papers. You start off with basic concepts, and you can keep improving the search as you learn more about information retrieval.

runic raft Dec 8, 2022, 6:02 PM

#

grave swallow is there any site yall would recommend to train an image recognition model? i ha...

If it needs to run on CPU, OpenCV + OpenVINO isn't a bad choice, otherwise I'd go with https://github.com/ultralytics/yolov5

grave swallow Dec 8, 2022, 6:03 PM

#

runic raft If it needs to run on CPU, OpenCV + OpenVINO isn't a bad choice, otherwise I'd g...

the thing is i have no experience of working with image recognition can u get me started a lil on how to create a model?

runic raft Dec 8, 2022, 6:04 PM

#

grave swallow the thing is i have no experience of working with image recognition can u get me...

What hardware do you have available?

grave swallow Dec 8, 2022, 6:04 PM

#

cpu i have gpu i dont sadly

runic raft Dec 8, 2022, 6:04 PM

#

Is this just for learning? Are you trying to train an image classifier on custom data?

grave swallow Dec 8, 2022, 6:06 PM

#

I'll give an overview, basically i have 5 pokemons and 8-10 pics of them i wish to create a classifier that names them. later on I'd like to name all the 800 or so pokemon so that's why i asked what's a good way to train a model

hasty mountain Dec 8, 2022, 6:08 PM

#

grave swallow cpu i have gpu i dont sadly

Use google colaboratory then

grave swallow Dec 8, 2022, 6:08 PM

#

yea but how do i get started?

hasty mountain Dec 8, 2022, 6:09 PM

#

Hm... I don't remember a good site with ML courses for free... But maybe coursera might help you if you're a student

runic raft Dec 8, 2022, 6:09 PM

#

You are probably going to need a lot more than 8-10 pictures per class to get a deep learning model to perform well on pokemon.

Here is a dataset you can use: https://www.kaggle.com/datasets/kvpratama/pokemon-images-dataset

YOLOv5 comes with lots of documentation and tutorials that you can use to get started, they even have lots of content showing how you can use YOLOv5 in Colab: https://docs.ultralytics.com/

Pokemon Images Dataset

Dataset of 819 Pokemon images

hasty mountain Dec 8, 2022, 6:09 PM

#

Oh yes, I think Kaggle has some tutorials on Machine Learning for free

grave swallow Dec 8, 2022, 6:11 PM

#

nono i just need help as to how to speed up the process of training the ai, i was currently using teachable but it took a lot of time as i had to upload every image by hand

hasty mountain Dec 8, 2022, 6:15 PM

#

grave swallow nono i just need help as to how to speed up the process of training the ai, i wa...

You can use a high learning rate and decrease it over time with a scheduler

runic raft Dec 8, 2022, 6:16 PM

#

grave swallow nono i just need help as to how to speed up the process of training the ai, i wa...

I think the Colab free tier has GPUs available, but you have to be careful with time limits. You can definitely bulk upload the dataset to either GDrive or TFHub or any other service and just connect the Colab notebook to the dataset

hasty mountain Dec 8, 2022, 6:16 PM

#

A high learning rate will move your weights towards an optimal point faster. When the weights start oscillating, you can decrease the learning rate so you can get better weights until they oscillate again

#

Just be careful to not overfit your model. Try repeating this until you get a learning rate of 1e-5
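
A rough sketch of that idea in plain Python, assuming a "reduce when the loss stops improving" rule. The function name `reduce_on_plateau`, the factor of 0.1, the patience of 2 epochs, and the loss values are all made up for illustration; deep learning libraries ship their own schedulers, which this does not reproduce.

```python
def reduce_on_plateau(losses, lr=1e-1, factor=0.1, patience=2, min_lr=1e-5):
    """Return the learning rate used at each epoch for a list of epoch losses."""
    best = float("inf")
    stale = 0          # epochs since the loss last improved
    schedule = []
    for loss in losses:
        schedule.append(lr)
        if loss < best:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                # Loss is bouncing around: shrink the step, but not below min_lr.
                lr = max(lr * factor, min_lr)
                stale = 0
    return schedule


# Made-up loss curve that stalls twice.
losses = [1.0, 0.6, 0.5, 0.55, 0.52, 0.4, 0.41, 0.42, 0.3]
schedule = reduce_on_plateau(losses)
for epoch, (loss, lr) in enumerate(zip(losses, schedule)):
    print(epoch, loss, lr)
```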

grave swallow Dec 8, 2022, 6:40 PM

#

tysm yall

grave swallow Dec 8, 2022, 6:40 PM

#

runic raft I think the Colab free tier has GPUs available, but you have to be careful with ...

also can u tell why do many ml coders like to program in jupyter or colab?

runic raft Dec 8, 2022, 6:41 PM

#

grave swallow also can u tell when do many ml coders like to programm in jupyter or colab?

Convenience

grave swallow Dec 8, 2022, 6:41 PM

#

and what are the different block like things in there?

runic raft Dec 8, 2022, 6:41 PM

#

I'm actually personally not a huge fan of these interactive environments, but I have my own workstation with a GPU that I use for my hobbyist ML work

odd meteor Dec 8, 2022, 6:42 PM

#

grave swallow also can u tell when do many ml coders like to programm in jupyter or colab?

Usually, it's because that's what people who got into Data Science and ML used to learn. And it's also very beginner-friendly.

grave swallow Dec 8, 2022, 6:43 PM

#

odd meteor Usually, it's because that's what people who got into Data Science and ML used t...

and what's the block things in those?

runic raft Dec 8, 2022, 6:45 PM

#

grave swallow and what's the block things in those?

the blocks are "cells" which allow you to execute Python snippets one cell at a time. A typical python script is passed to the interpreter, translated into bytecode, and then executed in program order.

With these interactive notebooks, you can control the program order by executing different cells at your own discretion

#

@grave swallow you can learn more about them by reading through the tutorial here: https://colab.research.google.com/notebooks/intro.ipynb

Google Colaboratory

odd meteor Dec 8, 2022, 6:48 PM

#

grave swallow and what's the block things in those?

It's called a cell. So you can turn a cell into a code cell / markdown / Raw NBConvert

grave swallow Dec 8, 2022, 6:49 PM

#

runic raft @grave swallow you can learn more about them by reading through the tutor...

tysm

grave swallow Dec 8, 2022, 6:49 PM

#

odd meteor It's called cell. So you can turn a cell into a code cell / markdown / Raw NBCon...

tysm

mint palm Dec 8, 2022, 7:24 PM

#

i am training YOLO and noticed that augmenting data lowered the accuracy

shell sequoia Dec 8, 2022, 8:10 PM

#

https://stackoverflow.com/questions/74735629/how-to-plot-boundary-lines-for-max-and-min-values-on-each-plot-in-python-seaborn

Stack Overflow

How to plot boundary lines for max and min values on each plot in p...

suppose I am plotting scatter plot in seaborn python
import seaborn as sns
df = sns.load_dataset("penguins")
sns.scatterplot(data=df, x="flipper_length_mm", y="

#

I need help here

#

Want boundaries of x and y value

young granite Dec 8, 2022, 9:17 PM

#

anyone used ML on FFT-data so far and can give some recommendation regarding a good model?
Or maybe a tip how i would be able to create a regression plot for n-freq.

charred light Dec 8, 2022, 9:34 PM

#

mint palm i am trainning YOLO and noticed that augmenting data lessed the accuracy

Accuracy should not be your only measurement. F1 score is a better test.
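
A toy illustration of why accuracy alone can mislead, assuming a binary problem with very few positives. The helpers `accuracy` and `f1_score` are hand-rolled here for the example (they are not a library's API), and the labels are made up.

```python
def accuracy(y_true, y_pred):
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# 95 negatives, 5 positives, and a "model" that always predicts negative.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100

print(accuracy(y_true, y_pred))
print(f1_score(y_true, y_pred))
```

The always-negative predictor looks strong on accuracy while never finding a single positive, which is the case F1 is meant to expose.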

honest ridge Dec 8, 2022, 10:28 PM

#

Has anyone used Python/ML to do a multivariate correlation (many-to-one) analysis? I need to find where the most significant correlations are to then identify the variables I will use for a customer scoring model ("Expansion Score" something like that). Just curious if anyone has used similar methods and would be open to discussing the pseudo/exo build needed

trim blade Dec 8, 2022, 11:49 PM

#

Hello! I'm trying to get info from the U.S. Census using their API, and Python is throwing an error that I don't quite understand. The relevant portion of the code is:

r = requests.get(base_url, params=predicates) # Making a response object; "predicates" is the format of the request. "base_url" is self-explanatory

df = pd.DataFrame(columns=r.json()[0], data=r.json()[1:])  # Putting it into a dataframe, and using the json file to label the data

df["year"] = year

dfs.append(df) # Just so I know what year this is from, since that's not in the data itself

counties = pd.concat(dfs)

print(counties.head()) # Just to make sure I'm getting the data that I expect

However, I get the following error:

```
Traceback (most recent call last):
  File "/usr/lib/python3.11/site-packages/requests/models.py", line 971, in json
    return complexjson.loads(self.text, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.11/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.11/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.11/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "Main.py", line 36, in <module>
    df = pd.DataFrame(columns=r.json()[0], data=r.json()[1:])
                              ^^^^^^^^
  File "/usr/lib/python3.11/site-packages/requests/models.py", line 975, in json
    raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
```
What exactly is it trying to tell me about the 'r.json' bit?

vague sable Dec 9, 2022, 1:22 AM

#

HI all, I am trying to follow this tutorial:
https://thedatafrog.com/en/articles/show-data-google-map-python/

I'm getting stuck on export GOOGLE_API_KEYS=<blah blah>
It's stating that export is not recognized as a command

Can anybody advise what I'm doing wrong?

The Data Frog

Show your Data in a Google Map with Python

Create an interactive display for geographical data with python: real-estate prices near Geneva.

#

trim blade Dec 9, 2022, 2:13 AM

#

trim blade Hello! I'm trying to get info from the U.S. Census using their API, and Python i...

(update: the API request was structured wrong. That will do it)
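
For context on that error: "Expecting value: line 1 column 1 (char 0)" means the body was not JSON at all, which is what tends to happen when a server answers a malformed request with a plain-text or HTML error message. A hedged sketch of a guard follows; the helper name `parse_census_rows` and the sample bodies are made up, and it uses only the standard `json` module so it runs without a network call.

```python
import json


def parse_census_rows(status_code, text):
    """Hypothetical helper: return (header, rows) or raise a readable error."""
    if status_code != 200:
        raise ValueError(f"request failed with HTTP {status_code}: {text[:200]!r}")
    try:
        payload = json.loads(text)
    except json.JSONDecodeError as exc:
        # Show the start of the body so the server's own message is visible.
        raise ValueError(f"response is not JSON: {text[:200]!r}") from exc
    return payload[0], payload[1:]


# Made-up body in the "first row is the header" layout the snippet above expects.
header, rows = parse_census_rows(200, '[["NAME", "POP"], ["Example County", "123"]]')
print(header, rows)
```

With `requests`, the same check could be fed from `r.status_code` and `r.text` before building the DataFrame.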

plush jungle Dec 9, 2022, 2:35 AM

#

does the backpropagation derivation change if you calculate loss differently?

wooden sail Dec 9, 2022, 2:37 AM

#

yes

#

after all it's the derivative of the loss that you're computing 😛 naturally if you change the loss, its derivative changes

plush jungle Dec 9, 2022, 2:39 AM

#

ok second question. What is the difference between a delta and a gradient

lapis sequoia Dec 9, 2022, 3:20 AM

#

is this code wrong?

#

what happens when you give a smaller subset of series a bigger number of values

#

it doesn't give a traceback though

serene scaffold Dec 9, 2022, 3:26 AM

#

lapis sequoia is this code wrong?

I will answer this for you, but this is the last time I will answer any question you ask that involves a screenshot that could have been text.

wooden sail Dec 9, 2022, 3:26 AM

#

plush jungle ok second question. What is the difference between a delta and a gradient

a delta usually means a finite difference. gradients involve infinitesimals

serene scaffold Dec 9, 2022, 3:26 AM

#

lapis sequoia is this code wrong?

the problem with using .loc in this context is that vt.loc[vt ....'value'] potentially has fewer rows than everything to the right of the =

lapis sequoia Dec 9, 2022, 3:27 AM

#

serene scaffold the problem with using `.loc` in this context is that `vt.loc[vt ....'value']` p...

yep

serene scaffold Dec 9, 2022, 3:27 AM

#

(I'm not typing it all out, so if you don't know what I'm referring to, put the code in the chat as text.)

plush jungle Dec 9, 2022, 3:27 AM

#

wooden sail a delta usually means a finite difference. gradients involve infinitesimals

but otherwise the terms are synonymous?

wooden sail Dec 9, 2022, 3:27 AM

#

plush jungle but otherwise the terms are synonymous?

that already means they are not synonymous 👀

lapis sequoia Dec 9, 2022, 3:28 AM

#

Generally I saw that they threw an error

#

if such a scenario arises

#

But does it just take the first n rows of rhs?

#

because that would be really bad
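
A small sketch of what pandas does in that situation, assuming the right-hand side is a Series (the original code was only posted as a screenshot, so the frame below is made up; only the names `vt`, `value`, and `per_population` come from the chat). With a Series on the right, `.loc` assignment lines values up by index label rather than taking the first n rows.

```python
import numpy as np
import pandas as pd

# Made-up data: rows 1 and 3 are missing.
vt = pd.DataFrame({"value": [1.0, np.nan, 3.0, np.nan]})
per_population = pd.Series([10.0, 20.0, 30.0, 40.0])

mask = vt["value"].isna()

# The left side selects 2 rows, the right side has 4 values.
# pandas aligns on the index, so row 1 gets 20.0 and row 3 gets 40.0.
vt.loc[mask, "value"] = per_population

filled = vt["value"].tolist()
print(filled)
```

That alignment is why no traceback appears. A plain list or NumPy array of the wrong length has no index to align on, so that case would be expected to raise instead.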

plush jungle Dec 9, 2022, 3:28 AM

#

wooden sail that already means they are not synonymous 👀

so why do we call it gradient descent if it only moves the weights with finite quantities? shouldn't we call it delta descent?

serene scaffold Dec 9, 2022, 3:28 AM

#

lapis sequoia Generally I saw that they threw an error

it might be that per_population doesn't have any nans.

wooden sail Dec 9, 2022, 3:28 AM

#

plush jungle so why do we call it gradient descent if it only moves the weights with finite q...

you're mixing up many things at the same time here

#

you use the direction of the gradient to compute a delta for the parameters

#

how large that delta is is a completely separate question

#

and whether the algorithm works at all depends on whether the gradient is bounded and the curvature satisfies nice conditions

#

one has to prove that taking discrete deltas lets you solve the problem of setting the gradient to 0, it's not something one takes for granted

#

because the two things are not the same
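
A minimal sketch of the point above, assuming a toy quadratic loss and hand-picked learning rates (neither comes from the discussion): the gradient is computed exactly, the delta applied to the parameters is a separate finite quantity derived from it, and whether the iteration converges depends on how large that delta is.

```python
def loss(w):
    # Toy loss chosen for illustration: f(w) = w0^2 + 3 * w1^2.
    return w[0] ** 2 + 3 * w[1] ** 2

def gradient(w):
    # Exact gradient: the vector of partial derivatives of `loss`.
    return [2 * w[0], 6 * w[1]]

def step(w, learning_rate):
    # The delta is a finite change that points along the negative gradient;
    # its size is set by the learning rate.
    delta = [-learning_rate * g for g in gradient(w)]
    return [wi + di for wi, di in zip(w, delta)]

w = [1.0, 1.0]
for _ in range(50):
    w = step(w, learning_rate=0.1)
print(loss(w))      # very close to 0 for this step size

w_bad = [1.0, 1.0]
for _ in range(50):
    w_bad = step(w_bad, learning_rate=0.5)
print(loss(w_bad))  # same gradient, larger delta: the loss blows up
```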

plush jungle Dec 9, 2022, 3:31 AM

#

so it is never correct to say that an algorithm "calculates the gradient"

wooden sail Dec 9, 2022, 3:31 AM

#

no, it is correct

#

you're still missing the point

lapis sequoia Dec 9, 2022, 3:31 AM

#

serene scaffold it might be that `per_population` doesn't have any nans.

nope. It always has

wooden sail Dec 9, 2022, 3:31 AM

#

it computes the gradient, and then uses the gradient to compute a delta

#

you're gonna have to pick up a math book if you're struggling with the concepts

plush jungle Dec 9, 2022, 3:32 AM

#

wooden sail you're gonna have to pick up a math book if you're struggling with the concepts

I'm looking at this from my lecture slides

#

this calculates the derivative of the error with respect to the weights

wooden sail Dec 9, 2022, 3:32 AM

#

yes

plush jungle Dec 9, 2022, 3:33 AM

#

is this the gradient or just a delta?

wooden sail Dec 9, 2022, 3:33 AM

#

neither, it's a partial derivative

#

you use partial derivatives to compute the components of the gradient vector

#

a single partial derivative is a scalar though

plush jungle Dec 9, 2022, 3:35 AM

#

wooden sail you use partial derivatives to compute the components of the gradient vector

ohhhhh the gradient is a vector

#

so that's the difference

#

all of the deltas together as a vector make up the gradient

wooden sail Dec 9, 2022, 3:35 AM

#

no

#

it's "a" difference, not "the" difference

#

a delta can be a vector too

#

the difference is exactly what i told you

#

a delta is a finite difference

#

derivatives, including gradients, are infinitesimal and involve limits
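
To make the finite-versus-limit distinction concrete, here is a short sketch using an arbitrary example function, f(x) = x^3. A delta ratio uses a finite step h; the derivative is what that ratio approaches as h shrinks toward 0.

```python
def f(x):
    return x ** 3

def finite_difference(x, h):
    # Finite change in f divided by a finite change in x.
    return (f(x + h) - f(x)) / h

# The derivative of x^3 is 3 * x^2, so at x = 2 it equals 12.
exact = 3 * 2.0 ** 2

steps = (1.0, 0.1, 0.01, 0.001)
errors = [abs(finite_difference(2.0, h) - exact) for h in steps]

for h, err in zip(steps, errors):
    print(h, err)  # the gap shrinks as h gets smaller, but never reaches 0
```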

iron basalt Dec 9, 2022, 3:37 AM

#

plush jungle ohhhhh the gradient is a vector

https://en.wikipedia.org/wiki/Gradient

Gradient

In vector calculus, the gradient of a scalar-valued differentiable function f of several variables is the vector field (or vector-valued function) ∇f whose value at a point p is the "direction and rate of fast...

#

It's a specific thing.

wooden sail Dec 9, 2022, 3:38 AM

#

at this point i'd STRONGLY recommend you pick up a book, because you seem to be either struggling with the concepts or trying to interpret/make them up on your own

iron basalt Dec 9, 2022, 3:39 AM

#

How things are named depends on the specifics, which are still locked away in books or at least something like wikipedia, but wikipedia kind of assumes that you can read and understand it, which is hard to do without enough math knowledge in general (e.g. from books).

#

(Also more of a reference)

plush jungle Dec 9, 2022, 3:41 AM

#

yeah that makes sense

#

thanks for answering my questions

iron basalt Dec 9, 2022, 3:42 AM

#

*The reason why math has names for every specific thing is because they need the names to be specific in proofs. Which results in A LOT of different terminology.

serene scaffold Dec 9, 2022, 3:44 AM

#

iron basalt *The reason why math has names for every specific thing is because they need the...

that is, different terminologies for the same set of ideas?

iron basalt Dec 9, 2022, 3:45 AM

#

serene scaffold that is, different terminologies for the same set of ideas?

Yes and no.

wooden sail Dec 9, 2022, 3:45 AM

#

that does happen sometimes, depending on how you're interpreting the object

#

for example you might think of a row vector as just that, a row vector. or maybe as the transpose of a column vector. or maybe as a covector. or perhaps as a linear functional. or an element of the dual space

#

depending on what you're trying to do with it

iron basalt Dec 9, 2022, 3:46 AM

#

(Also there are different spoken languages used, which also use different standard terms like identity vs neutral element)

#

(And also many things in math map to each other, so same thing, different POV (which is one of things that is really looked for in math))

serene scaffold Dec 9, 2022, 3:47 AM

#

I'm learning lemon_hyperpleased

iron basalt Dec 9, 2022, 3:48 AM

#

*Also, Ducky, I kind of skipped over the derivation for the gradient on purpose because it involves a lot of details that would have just been noise without more linear algebra and multivariate calculus knowledge.

#

(The actual full proper way has a lot of steps skipped by most posts you will find online / they are all hand-wavy / intuitive explanations)

#

(Many courses on DL don't do the full thing either)

wooden sail Dec 9, 2022, 3:52 AM

#

multivar calc in most engineering degrees doesn't, either

#

ideally there'd be some discussion about differential forms and jacobians involved

iron basalt Dec 9, 2022, 3:54 AM

#

Yeah, I suppose the reason why is because one can often get away with the informal way but it does leave one open to potential mistakes, which is not something I would want to teach for engineers making stuff like the bridges I drive on.

#

Though I think the idea of a two pass method would be pretty good, the informal way first, to get an overview, then repeat the whole thing again in detail. But for now school systems have a "do course then done forever with the topic" kind of thing going on.

#

(Also repetition helps with memory)

#

(So I guess my answer to formalism vs intuitionism is both (as good of an understanding as possible (be aware when one is leaving holes open by being informal and when that matters)))

lapis sequoia Dec 9, 2022, 3:59 AM

#

@serene scaffold loc works differently

#

That code was logically correct. I verified

vague sable Dec 9, 2022, 4:22 AM

#

Where should I put my csv data file when using pandas?

#

trying to follow this tutorial but it doesn't indicate where I should place the datafile
https://thedatafrog.com/en/articles/show-data-google-map-python/

The Data Frog

Show your Data in a Google Map with Python

Create an interactive display for geographical data with python: real-estate prices near Geneva.

#

Also, I just get a FileNotFoundError that doesn't point to where it is looking for the file

serene scaffold Dec 9, 2022, 4:31 AM

#

@vague sable any file input/output you do will be relative to the "current working directory". which you can figure out by doing import os and print(os.getcwd()). but if you're running python from a command line, it's going to be wherever the command line is in the file tree.
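
A short sketch of that idea using only the standard library. The filename is the one from the tutorial; the helper name `resolve` is made up for illustration.

```python
import os
from pathlib import Path

def resolve(filename):
    # A relative filename is interpreted against the current working directory.
    return Path(os.getcwd()) / filename

csv_path = resolve("dvf_gex.csv")
print(csv_path)           # the location a relative read of 'dvf_gex.csv' would use
print(csv_path.exists())  # False here means a FileNotFoundError is expected
```

Passing the full path to the reader, or moving the file to the printed location, are two ways to make the lookup succeed.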

vague sable Dec 9, 2022, 4:32 AM

#

This is my working directory
C:\Users\daneb\anaconda3\envs\geovis

#

I put the file there and no luck

serene scaffold Dec 9, 2022, 4:32 AM

#

also, geneva is so weird. the way the borders of switzerland scoop down and around to include the center--but not all--of its built-up area.

serene scaffold Dec 9, 2022, 4:32 AM

#

vague sable I put the file there and no luck

show code

vague sable Dec 9, 2022, 4:33 AM

#

I've not written any code - just following the tutorial i posted

#

Im up to this part

#

#

but getting the file not found error when doing df = pd.read_csv('dvf_gex.csv')

serene scaffold Dec 9, 2022, 4:35 AM

#

please make a new cell with this code, and show the result as text. no screenshots

import os
import pathlib

print(os.getcwd())
print(list(pathlib.Path('.').glob('*.csv')))

vague sable Dec 9, 2022, 4:36 AM

#

>>> import os
>>> import pathlib
>>> print(os.getcwd())
C:\Users\daneb
>>> print(list(pathlib.Path('.').glob('*.csv')))
[]
>>>

serene scaffold Dec 9, 2022, 4:37 AM

#

vague sable ```>>> import os >>> import pathlib >>> print(os.getcwd()) C:\Users\daneb >>> pr...

so you're using a REPL (read-evaluate-print loop)? the tutorial assumes you're using a notebook.

#

anyway, your current working directory is C:\Users\daneb and is not C:\Users\daneb\anaconda3\envs\geovis

misty flint Dec 9, 2022, 4:38 AM

#

@serene scaffold cattohug

#

also

#

i saw something neat

#

about notebooks

vague sable Dec 9, 2022, 4:39 AM

#

serene scaffold anyway, your current working directory is `C:\Users\daneb` and is not `C:\Users\...

Thanks, I whacked it in my user directory & it worked lol

vague sable Dec 9, 2022, 4:39 AM

#

serene scaffold so you're using a REPL (read-evaluate-print loop)? the tutorial assumes you're u...

You are correct - I'll go use a notebook now.

misty flint Dec 9, 2022, 4:39 AM

#

misty flint about notebooks

image_d71f4585-bf90-47f9-a914-75ce99a4dd8f20221208_223933.jpg

#

anyway thats just one person's take

#

but one can hope right

#

havent heard of github .devcontainers until now but it looks interesting

vague sable Dec 9, 2022, 5:12 AM

#

Hey all,
When I do
export GOOGLE_API_KEY=<your_key>

It doesnt work, so I use set instead of export which seemed to work but now I try to use the key with api_key = os.environ['GOOGLE_API_KEY'] and It fails

#

What could I be doing wrong

misty flint Dec 9, 2022, 5:18 AM

#

whats your error

#

did you actually go and get an API key

vague sable Dec 9, 2022, 5:19 AM

#

I did yep

misty flint Dec 9, 2022, 5:19 AM

#

and save it in an environment variable

vague sable Dec 9, 2022, 5:19 AM

#

Thats where I think im going wrong actually saving/storing it

misty flint Dec 9, 2022, 5:19 AM

#

you need to save it in the right directory

#

if you are in a notebook environment, you can check the current directory with the magic command %cd

#

then you can 'cd' to the correct directory where you saved it

#

https://ipython.readthedocs.io/en/stable/interactive/magics.html#magic-cd

vague sable Dec 9, 2022, 5:21 AM

#

yep

#

so I've gone into the correct directory as far as I am aware

#

C:\Users\daneb\anaconda3\envs\geovis

misty flint Dec 9, 2022, 5:22 AM

#

what is your actual error

#

what is it saying

vague sable Dec 9, 2022, 5:22 AM

#

once in that directory, I have executed the following
export GOOGLE_API_KEY=<AIzaSyB***********************************>

#

Stars are just so nobody gets my key

misty flint Dec 9, 2022, 5:23 AM

#

good practice

vague sable Dec 9, 2022, 5:23 AM

#

I get this error
File "C:\Users\daneb\AppData\Local\Temp\ipykernel_15540\2829495065.py", line 1
    export GOOGLE_API_KEY=<AIza********>
           ^
SyntaxError: invalid syntax

misty flint Dec 9, 2022, 5:25 AM

#

typically API keys use the TOML format https://toml.io/en/

vague sable Dec 9, 2022, 5:25 AM

#

executing all of this from a notebook btw

misty flint Dec 9, 2022, 5:25 AM

#

should be GOOGLE_API_KEY="AIza********"

#

try it with quotes

vague sable Dec 9, 2022, 5:26 AM

#

Same invalid syntax error

#

Pointing to the capital G of Google
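
A likely explanation, offered as a sketch rather than a diagnosis: `export NAME=value` is shell syntax, so a Python notebook cell rejects it with a SyntaxError regardless of quoting. One way to set and read the variable from inside Python, using a placeholder value instead of a real key:

```python
import os

# Setting the variable from Python affects this process (and any child
# processes it starts), not other terminals or notebooks.
os.environ["GOOGLE_API_KEY"] = "placeholder-not-a-real-key"

# .get() returns None instead of raising KeyError when the variable is missing.
api_key = os.environ.get("GOOGLE_API_KEY")
if api_key is None:
    raise RuntimeError("GOOGLE_API_KEY is not set in this process")

print(len(api_key))  # avoid printing the key itself
```

A variable set with `set` or `export` in a separate terminal is only visible to programs launched from that terminal afterwards, which may be why the earlier `os.environ['GOOGLE_API_KEY']` lookup failed.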

misty flint Dec 9, 2022, 5:27 AM

#

then idk dude sorry

vague sable Dec 9, 2022, 5:27 AM

#

ok

#

thanks though!

misty flint Dec 9, 2022, 5:27 AM

#

idk why you need to export it though if you already have the key saved in a separate file

#

you can just load it in later on when you need it

vague sable Dec 9, 2022, 5:30 AM

#

I've not saved the key anywhere

#

Im just following this tutorial

#

which tells me to run the export code

misty flint Dec 9, 2022, 5:31 AM

#

vague sable I've not saved the key anywhere

take a look at this https://stackoverflow.com/questions/70372120/how-to-use-api-key-without-directly-using-it-in-the-python-code

Stack Overflow

how to use Api Key without directly using it in the python code?

I want get data using HubSpot Api in Python, however, I don't want to show my ApiKey in the code. What would be the best practice to get data using Api Without showing your Api key? I am currently

#

vague sable Dec 9, 2022, 5:35 AM

#

Yep awesome this is helping

#

reading through it all now

#

thank you

#

@misty flint got it working, thank you.

fresh tiger Dec 9, 2022, 8:54 AM

#

Hi! I just want to confirm if I am doing something wrong in my calculations.

I am currently trying to find out if two categorical columns are correlated/related. I have tried to run a correlation test and a chi2 test, and I feel as though the two results contradict each other:

correlation:
```python
dfCopy = df.copy()

# Replace each category with an integer code so .corr() can be computed.
dfCopy['topic'] = dfCopy['topic'].astype('category').cat.codes
dfCopy['Label_bias'] = dfCopy['Label_bias'].astype('category').cat.codes
dfCopy.corr()['Label_bias']['topic']
```
output: -0.03312874060844754

chi2 test:
```python
import pandas as pd
from scipy.stats import chi2_contingency

# Method to calculate the p-value via chi-squared test.
# Prints the p-value.
def chi_squared(column_to_compare):
    # Create a cross table between topic and label bias.
    # This will be used for the chi-squared tests.
    cross_tabulation = pd.crosstab(index=df[column_to_compare], columns=df['Label_bias'])
    print(cross_tabulation)

    result = chi2_contingency(cross_tabulation)
    print('p-value is: ', result[1])
```
output: p-value is: 5.581641709475194e-09

So from how I am understanding the data, the correlation test says the two columns are not correlated, but the chi2 test says that the feature column is dependent on the output column.

I am not sure if I am understanding the results incorrectly (ie maybe they are aligning?)
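
One possible reading, sketched with made-up counts: category codes are arbitrary integers, so a Pearson correlation on them only detects a straight-line trend across the code order, while the chi-squared statistic detects any dependence between the categories. The two results can therefore disagree without either calculation being wrong. The table below is hypothetical and the computation is done by hand to keep the sketch self-contained.

```python
from math import sqrt

# Hypothetical counts: rows are topic codes 0..2, columns are label codes 0..1.
table = [[40, 10],
         [10, 40],
         [40, 10]]

# Expand the table into one (topic_code, label_code) pair per observation.
pairs = [(t, lab)
         for t, row in enumerate(table)
         for lab, count in enumerate(row)
         for _ in range(count)]
n = len(pairs)

# Pearson correlation between the two integer codes.
mean_t = sum(t for t, _ in pairs) / n
mean_lab = sum(lab for _, lab in pairs) / n
cov = sum((t - mean_t) * (lab - mean_lab) for t, lab in pairs)
var_t = sum((t - mean_t) ** 2 for t, _ in pairs)
var_lab = sum((lab - mean_lab) ** 2 for _, lab in pairs)
pearson = cov / sqrt(var_t * var_lab)

# Chi-squared statistic: sum of (observed - expected)^2 / expected.
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
chi2 = 0.0
for i, row in enumerate(table):
    for j, observed in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (observed - expected) ** 2 / expected

print(pearson)  # approximately 0: no linear trend across the codes
print(chi2)     # far above 5.991, the 5% critical value for 2 degrees of freedom
```

Here topic 1 behaves very differently from topics 0 and 2, which is a strong dependence but not a linear one, so the correlation stays near zero.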

carmine scarab Dec 9, 2022, 9:22 AM

#

if i want to use python for data science, should i take a generic python bootcamp or just go for a python data science bootcamp?

candid escarp Dec 9, 2022, 9:31 AM

#

Does anyone have "Dog Muzzle" image Dataset ?

devout zodiac Dec 9, 2022, 11:54 AM

#

I posted in #1035199133436354600 too (https://discord.com/channels/267624335836053506/1050740175698919434), but generally I don't understand how PyTorch keeps track of the gradients and which operations are allowed and which aren't. I want to rearrange individual values from one array to another, while keeping the gradient intact. PyTorch throws an error calling it an "in place" operation, which I don't see how it is, if I assign from one array/tensor/matrix to another. "In place" to me is e.g. a "+=" operation, which I'm not using anywhere.

spice mortar Dec 9, 2022, 2:12 PM

#

code:
model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.01),
              loss=keras.losses.sparse_categorical_crossentropy(from_logits=True),
              metrics=['accuracy'])
error on loss=keras.losses.sparse_categorical_crossentropy(from_logits=True),

#

error: ```
File "D:\real_Python\projects\draw_guess_game\lib\site-packages\tensorflow\python\util\traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
File "D:\real_Python\projects\draw_guess_game\lib\site-packages\tensorflow\python\util\dispatch.py", line 1170, in op_dispatch_handler
    result = api_dispatcher.Dispatch(args, kwargs)
TypeError: Missing required positional argument
```
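
A probable cause, stated as a guess since only part of the traceback is shown: `keras.losses.sparse_categorical_crossentropy` is a function that expects `y_true` and `y_pred`, so calling it with only `from_logits=True` fails, whereas `keras.losses.SparseCategoricalCrossentropy(from_logits=True)` is the class that builds a configured loss object to pass to `compile`. The sketch below reproduces the same pattern in plain Python with hypothetical stand-ins (`toy_loss`, `ToyLoss`); it is not the Keras API.

```python
def toy_loss(y_true, y_pred, from_logits=False):
    # Hypothetical function-style loss: it needs the data at call time.
    return sum(abs(t - p) for t, p in zip(y_true, y_pred))

class ToyLoss:
    # Hypothetical class-style loss: configure now, call with data later.
    def __init__(self, from_logits=False):
        self.from_logits = from_logits

    def __call__(self, y_true, y_pred):
        return toy_loss(y_true, y_pred, from_logits=self.from_logits)

try:
    toy_loss(from_logits=True)  # same mistake: options given, data missing
    failed = False
except TypeError as exc:
    failed = True
    print(exc)

loss = ToyLoss(from_logits=True)  # a configured object a trainer can call later
print(loss([1, 0], [1, 1]))
```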

karmic dirge Dec 9, 2022, 3:27 PM

#

Hi guys, I'm new to Data Science and I'm interested in Process Mining and more specifically in the Alpha Miner algo. I have however a slight problem to understand what the highlighted numbers correspond to. Could someone enlighten me ?

craggy rune Dec 9, 2022, 4:55 PM

#

Hello, sorry for the stupid question, but I can’t understand how I can get the number of unique devices in each group and name the Users column + the second column with the number of events, let’s say majong_first_tap and also their number in the group (in the Impressions column). the second screenshot is what i want to get

serene scaffold Dec 9, 2022, 5:00 PM

#

craggy rune Hello, sorry for the stupid question, but I can’t understand how I can get the n...

grouped = all_data_df.groupby('level')
print(grouped['device_id'].nunique())

craggy rune Dec 9, 2022, 5:03 PM

#

serene scaffold ```py grouped = all_data_df.groupby('level') print(grouped['device_id'].nunique(...

this is the number of unique users in levels 0, 1, 2... but how else to display the number of some event in the same table, let's say the number in the zero level of the rubber_challenge_item_loaded event

#

like this

|level|Users|Event count|
| 0 | 400 | 200 |
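
One way to get both columns in a single table, sketched on a made-up event log. The column name `event` is an assumption (the real column name is not shown); `level`, `device_id` and the event name come from the messages above.

```python
import pandas as pd

# Hypothetical event log standing in for all_data_df.
all_data_df = pd.DataFrame({
    "level": [0, 0, 0, 1, 1],
    "device_id": ["a", "a", "b", "a", "c"],
    "event": ["rubber_challenge_item_loaded", "other",
              "rubber_challenge_item_loaded", "other", "other"],
})

# Boolean column marking the event of interest; summing it counts occurrences.
is_event = all_data_df["event"].eq("rubber_challenge_item_loaded")

summary = (
    all_data_df.assign(is_event=is_event)
    .groupby("level")
    .agg(Users=("device_id", "nunique"), Event_count=("is_event", "sum"))
    .reset_index()
)
print(summary)
```

Named aggregation keeps each output column tied to its source column and function, so more event counts can be added as extra keyword arguments.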

eager cloud Dec 9, 2022, 5:33 PM

#

Hi guys. My apologies if this isn’t the correct channel. But I’ve got an issue here I can’t seem to figure out. I could probably achieve this via Anaconda but how would I go about displaying the tables within this pdf

mint palm Dec 9, 2022, 7:17 PM

#

best mAP doesn't correspond to my best f1 score

#

is that fine?

#

i dont know about mAP, but read that best epoch is decided based on max mAP

spare briar Dec 9, 2022, 8:49 PM

#

they measure different things, understand exactly what each measures and choose based on which you prefer for your application

sweet crypt Dec 10, 2022, 3:41 AM

#

while building MCTS tree, do we add all the possible actions for all pieces in the board?

modern belfry Dec 10, 2022, 5:51 AM

#

hi
I recently started getting into ml stuff and I am pretty much a beginner. My end goal is to build a recommender system for my app but I am not sure if I should learn pytorch or go for scikit-learn / tensorflow
I have some experience with scikit-learn but none with tensorflow.. I think investing my time in pytorch will be better ?

misty flint Dec 10, 2022, 5:56 AM

#

modern belfry hi I recently started getting into ml stuff and I am pretty much a beginner. My ...

is your app web or mobile

modern belfry Dec 10, 2022, 5:56 AM

#

misty flint is your app web or mobile

mobile

misty flint Dec 10, 2022, 5:57 AM

#

ok one is def better than the other for mobile apps

#

there was a chart somewhere i posted a long time ago

#

i cant remember off the top of my head

#

lol

modern belfry Dec 10, 2022, 5:57 AM

#

misty flint ok one is def better than the other for mobile apps

one?

modern belfry Dec 10, 2022, 5:57 AM

#

misty flint i cant remember off the top of my head

oh XD

misty flint Dec 10, 2022, 5:57 AM

#

pytorch vs tensorflow

modern belfry Dec 10, 2022, 5:58 AM

#

ah

#

so def not scikit learn for production use right?

misty flint Dec 10, 2022, 5:58 AM

#

ah i found it

#

#

https://www.assemblyai.com/blog/pytorch-vs-tensorflow-in-2022/

News, Tutorials, AI Research

PyTorch vs TensorFlow in 2022

Should you use PyTorch vs TensorFlow in 2022? This guide walks through the major pros and cons of PyTorch vs TensorFlow, and how you can pick the right framework.

#

you dont need SOTA models for this use case

#

so -> N

#

then Y for mobile

#

only you can answer the next part

modern belfry Dec 10, 2022, 6:00 AM

#

nope dont need audio/video input

#

so pytorch and tensorflow both are fine?

modern belfry Dec 10, 2022, 6:01 AM

#

misty flint https://www.assemblyai.com/blog/pytorch-vs-tensorflow-in-2022/

thanks

misty flint Dec 10, 2022, 6:02 AM

#

modern belfry so pytorch and tensorflow both are fine?

yep. and your specific options are between Tensorflow Lite and Pytorch Live

#

gl and let me know how it goes if you end up trying one

modern belfry Dec 10, 2022, 6:02 AM

#

misty flint yep. and your specific options are between Tensorflow Lite and Pytorch Live

ohh

#

I also read about torchrec somewhere

#

anyways I will search pytorch lite 👍

#

also I have decided to watch this lecture from FCC while implementing stuff by myself
https://youtu.be/V_xro1bcAuA

YouTube

freeCodeCamp.org

PyTorch for Deep Learning & Machine Learning – Full Course

Learn PyTorch for deep learning in this comprehensive course for beginners. PyTorch is a machine learning framework written in Python.

✏️ Daniel Bourke developed this course. Check out his channel: https://www.youtube.com/channel/UCr8O8l5cCX85Oem1d18EezQ

🔗 Code: https://github.com/mrdbourke/pytorch-deep-learning
🔗 Ask a question: https://githu...

#

I think should be enough before getting into deeper stuff like recommendation system etc

ancient pendant Dec 10, 2022, 6:17 AM

#

Which database should I learn for data science
MySQL or MongoDB?

rugged comet Dec 10, 2022, 7:38 AM

#

ancient pendant Which database should I learn for data science MySQL or MongoDB?

You should learn both.

glossy moth Dec 10, 2022, 8:47 AM

#

Hi everyone! I'm trying to see if I can differentiate between random subsamples of populations with drastically different magnitudes. Is there any approach or technique to identify a difference in distribution in such a case?

wooden sail Dec 10, 2022, 8:48 AM

#

you can try a kolmogorov smirnov test
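
For intuition, a rough sketch of the statistic behind the two-sample Kolmogorov-Smirnov test: the largest gap between the two empirical CDFs. The samples are made up, and this computes only the statistic; `scipy.stats.ks_2samp` provides the full test with a p-value.

```python
def ecdf(sample, x):
    # Fraction of the sample that is less than or equal to x.
    return sum(v <= x for v in sample) / len(sample)

def ks_statistic(a, b):
    # Largest absolute difference between the two empirical CDFs,
    # checked at every observed value.
    points = sorted(set(a) | set(b))
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in points)

same = ks_statistic([1, 2, 3, 4], [1, 2, 3, 4])
shifted = ks_statistic([1, 2, 3, 4], [11, 12, 13, 14])
print(same, shifted)
```

If the populations differ only in scale, rescaling each sample before comparing (for example by its median) may be needed so that the test compares shapes rather than magnitudes; that choice depends on what kind of difference matters.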

lapis sequoia Dec 10, 2022, 9:09 AM

#

i have a doubt

#

is there any ai developer here ??

arctic wedgeBOT Dec 10, 2022, 9:13 AM

#

Hey @lapis sequoia!

You either uploaded a .txt file or entered a message that was too long. Please use our paste bin instead.

round parrot Dec 10, 2022, 9:14 AM

#

Hi! is there anyone here that uses matplotlib?

lapis sequoia Dec 10, 2022, 9:14 AM

#

i want a deep reinforcement ai which:

is standing on a block (orange)
can only move up, right or diagonally. if it moves diagonally it gets -1 point.
can move only once
if it steps in (red) -1 point.
if it stays in one place, it gets -1.
if it reached green it gets +1 point.

what will the ai do ???

#

???? help me...

grand quarry Dec 10, 2022, 10:44 AM

#

lapis sequoia i want a deep reinforcement ai which: * is standing on a block (orange) * can ...

Thinking about it, it will learn to move diagonally and receive 0 as a total reward

tribal harness Dec 10, 2022, 10:44 AM

#

how to scrape user requests from a website using python(i am new to data science)?

steady basalt Dec 10, 2022, 12:44 PM

#

So there’s an exhibition today for a jewelry company that used GAN art and it’s the oddest thing ever seeing members of the public watching an explanation of ML

river sapphire Dec 10, 2022, 3:04 PM

#

so I recently just implemented prioritized experience replay for my DDQN and I now have this strange looking reward graph

#

the reward seems to be increasing then decreasing?

#

and also this loss graph as well

#

the snake agent seems to never learn to get the apple but instead crashes into the wall

#

when I turned off prioritization it seemed a lot more successful

#

why is this?

river sapphire Dec 10, 2022, 3:11 PM

#

lapis sequoia i want a deep reinforcement ai which: * is standing on a block (orange) * can ...

you could solve this without neural networks, q learning could solve this

umbral raptor Dec 10, 2022, 3:28 PM

#

I am working on a personal project based on the COCO car damage challenge. There is a COCOEvaluator that returns the AP metrics based on model prediction. I am trying to create a custom metric that explains how many damages or percentage of damages (based on IOU) the model finds across the test dataset. Note that I have replaced every class of the problem with one class "Damage". Is there a way to get the TP/FP and TN/FN of the prediction to begin with, using detectron2 model outputs?

Coco Car Damage Detection Dataset

Contains the images of cars with damages and their annotations in COCO format

dull trench Dec 10, 2022, 4:04 PM

#

Anyone with knowledge in stats and analysis using python dm me

patent lynx Dec 10, 2022, 4:28 PM

#

Is it best practice to look at code submissions in kaggle? Or develop the solution on my own? Still struggling to understand the data sets

young granite Dec 10, 2022, 4:29 PM

#

patent lynx Is it best practice to look at code submissions in kaggle? Or develop the soluti...

which dataset?

patent lynx Dec 10, 2022, 4:29 PM

#

Competition ones

young granite Dec 10, 2022, 4:30 PM

#

i mean for inspo its always good to be able to compare urself with someone else

#

that said u should first try it on ur own

normal bay Dec 10, 2022, 4:40 PM

#

Do we need to learn OOP in python in order to be a successful Data Engineer?

worn stratus Dec 10, 2022, 4:42 PM

#

yes

normal bay Dec 10, 2022, 4:42 PM

#

Your answer would be yes in case of becoming a data analyst as well?

worn stratus Dec 10, 2022, 4:43 PM

#

less necessary for data analysts

trim blade Dec 10, 2022, 4:53 PM

#

Sorry, new to pandas: How do I divide a range of items in a row by another item in the same row (and then use that for all rows in the same dataframe) ?

(e.g. Divide items in row index 1 with index 83 through 102 by the value in row index 1, column index 14, then keep doing that for every row. So, for example, the values in row index 2, column index 83:102 are divided by the value in row index 2, column 14, same with row 3, row 4, etc)

#

(Some kind of 'for' loop is required. What kind, I don't know)

zenith hawk Dec 10, 2022, 4:56 PM

#

Hey, sorry to interrupt ~. I have small question on how to better do with project I am currently working on. Basically I have volatility index and company revenue, I know they correlate at about 0.25 but the whole dataset is pretty crappy and it’s more of a cloud rather than clear fit. So how should I go with it to create high scoring prediction from values which don’t correlate well ?

zenith hawk Dec 10, 2022, 5:00 PM

#

trim blade Sorry, new to pandas: How do I divide a range of items in a row by another item ...

Honestly it’s much faster to do in excel rather than think of pandas logics (it is not Efficient for that task), by the time I was writing that you could have already completed dataset

trim blade Dec 10, 2022, 5:01 PM

#

zenith hawk Honestly it’s much faster to do in excel rather than think of pandas logics (it ...

The data set I'm working with is too big to analyze in Excel. If you try to open the csv, the computer will crash.

#

(I know, I tried)

native umbra Dec 10, 2022, 5:01 PM

#

Hello, I started HCIA-AI V3 certificate but I can not memorize all methods of machine learning because I did not practice in code, Where can I practice doing simple machine learning project?

zenith hawk Dec 10, 2022, 5:02 PM

#

trim blade The data set I'm working with is too big to analyze in Excel. If you try to ope...

If your computer crashes on opening the csv then any kind of “for” looping in python on a pandas dataframe will crash too, since it’s extremely slow and inefficient. Have you tried cloud based excel/google sheets ?

trim blade Dec 10, 2022, 5:03 PM

#

zenith hawk If your computer crashes on opening csv than any kind of “for” looping in python...

Hm, no I haven't. That might be an option.

#

I figured a for loop would take a very long time, but I'd be able to work with the data at the end of it. Excel just taps out when you try to open the file.

#

(Sorry I should have been more precise: the computer doesn't crash. Excel crashes)

zenith hawk Dec 10, 2022, 5:05 PM

#

trim blade I figured a for loop would take a very long time, but I'd be able to work with t...

Well you can just for loop over the indexes, not pandas df. And access values of each element through index loop like df.iloc[] and do what you wanted with them

#

But this will be slow too

trim blade Dec 10, 2022, 5:06 PM

#

zenith hawk But this will be slow too

Slow is fine. I just want to be able to work with it.
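
For what it's worth, this kind of row-wise division usually does not need an explicit loop in pandas. A sketch on a tiny made-up frame, where column position 0 plays the role of column 14 and positions 1 to 2 play the role of columns 83 through 102:

```python
import pandas as pd

# Hypothetical small frame standing in for the real data.
df = pd.DataFrame({
    "total": [2.0, 4.0, 10.0],
    "a": [2.0, 8.0, 10.0],
    "b": [4.0, 4.0, 50.0],
})

block = df.iloc[:, 1:3]   # columns to scale (the end of the slice is exclusive)
divisor = df.iloc[:, 0]   # one divisor per row

# axis=0 matches the divisor to rows, so every row is divided by its own value.
scaled = block.div(divisor, axis=0)
print(scaled)
```

On the real data the equivalent would be along the lines of `df.iloc[:, 83:103].div(df.iloc[:, 14], axis=0)`, assuming column 102 is meant to be included.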

grizzled verge Dec 10, 2022, 6:20 PM

#

Could someone explain what a target variable is? I'm trying to learn decision trees in python and I was confused by the use of "target" in the code terminology
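
In short, the target is the column the model is asked to predict, and the features are the columns it predicts from. A toy sketch with made-up data and a one-split "tree" written by hand (libraries such as scikit-learn take the same two pieces, e.g. `DecisionTreeClassifier().fit(X, y)`):

```python
# Hypothetical data: each row is (hours_studied, passed_exam).
rows = [(1, 0), (2, 0), (3, 0), (6, 1), (7, 1), (8, 1)]

X = [hours for hours, _ in rows]    # feature: what the model sees
y = [passed for _, passed in rows]  # target: what the model must predict

def fit_stump(X, y):
    # One-split decision tree: choose the threshold with the fewest mistakes
    # when predicting "passed" for every x above the threshold.
    best_threshold, best_errors = None, None
    for threshold in sorted(set(X)):
        errors = sum((x > threshold) != bool(label) for x, label in zip(X, y))
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold

threshold = fit_stump(X, y)

def predict(x):
    return int(x > threshold)

print(threshold, predict(2), predict(7))
```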

native umbra Dec 10, 2022, 6:21 PM

#

native umbra Hello, I started HCIA-AI V3 certificate but I can not memorize all methods of m...

help plz

wide pumice Dec 10, 2022, 6:55 PM

#

https://medium.com/p/9bdfaee7ff7e

Medium

ChatGPT: This Incredible AI Chatbot May Replace Junior Software Eng...

OpenAI is changing the world, but how good is ChatGPT at programming?

serene scaffold Dec 10, 2022, 7:12 PM

#

wide pumice https://medium.com/p/9bdfaee7ff7e

not likely.

austere swift Dec 10, 2022, 7:18 PM

#

yeah the thing is about making an ai code is that you'll also have to make sure the code it wrote not only works without erroring but also that it does the thing you want it to

#

and in order to make sure it does the thing you want it to do, you'll have to have people (the aforementioned software engineers that it's "replacing") check it

#

I've been using codex quite a bit lately just seeing how it does things and most of the time the code it writes is pretty wrong

#

It's useful in accelerating my workflow (if i have to repeat a function on several variables it can infer that, and hitting tab is faster than ctrl c ctrl v) but not in making its own code, and I think that'll be similar with chatgpt

cursive island Dec 10, 2022, 8:26 PM

#

https://www.kaggle.com/code/mustafagerme/linear-polynomial-knn-lazy-regressor

Linear, Polynomial, KNN, Lazy Regressor

Explore and run machine learning code with Kaggle Notebooks | Using data from fuel_consumption_CO2

halcyon aurora Dec 10, 2022, 9:04 PM

#

how hard would it be to train an ai to recognize when a basket ball goes into a hoop

#

in a fast pace and not get mixed up with other balls

austere swift Dec 10, 2022, 9:06 PM

#

@grim patrol I didn't get a chance to see your ping till now, the function is called for each individual sample so random augmentations would be different each time you run it

lapis sequoia Dec 10, 2022, 9:46 PM

#

I posted this in general but got no response, i think it may have been too advanced to get a quick answer on. If anyone happens to have experience with this I'd really appreciate your help:

I've got a sort of desperate question for anyone with RL experience, is it a bad idea to use Proximal Policy Optimization to learn a backflip (humanoid) in simulation (pybullet in this case)?

I wrote my own PPO with pybullet/pytorch since no implementations seemed to be able to run on my machine. I know PPO is super sensitive to hyperparameters so I've been tuning them with bayesian optimization for the past couple hours but I'm getting nervous since i'm not seeing anything converge. If anyone can tell me whether PPO is simply not a possible solution (to learning a backflip from an expert trajectory) or if it is and I should be patient and wait for tuning to finish, I would be eternally grateful...

soft badge Dec 10, 2022, 10:04 PM

#

Guys, how do I slice a value in a column?

#

example:

#

str = "joao (test)"

#

valor = str.find("(")

#

and after i want string = str[:valor]

#

but in my columns of pandas how i do this?
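
One possible approach with the pandas string accessor, sketched on made-up data (the column names `name` and `clean` are assumptions):

```python
import pandas as pd

df = pd.DataFrame({"name": ["joao (test)", "maria", "ana (x) (y)"]})

# Split once at the first "(" and keep the part before it,
# then strip the trailing space left behind.
df["clean"] = df["name"].str.split("(", n=1).str[0].str.strip()
print(df["clean"].tolist())
```

Unlike slicing with the result of `find`, this leaves values without a "(" unchanged; `find` returns -1 in that case, and slicing up to -1 would cut off the last character.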

brave sand Dec 11, 2022, 1:55 AM

#

could someone train an ai to use chat gpt3?

serene scaffold Dec 11, 2022, 1:56 AM

#

brave sand could someone train an ai to use chat gpt3?

that's what chatgpt is

misty flint Dec 11, 2022, 3:18 AM

#

serene scaffold that's what chatgpt is

kekHands

#

peeps are gonna feel the real crunch when they close this "research period"

#

hehehe

#

also the CEO himself says the compute costs are "eye-watering" so they will have to monetize it somehow

#

take_my_money

#

which makes sense. i cant even imagine their aws bill

serene scaffold Dec 11, 2022, 3:21 AM

#

misty flint peeps are gonna feel the real crunch when they close this "research period"

more like hype phase

serene scaffold Dec 11, 2022, 3:21 AM

#

misty flint which makes sense. i cant even imagine their aws bill

bye bye money

austere swift Dec 11, 2022, 3:23 AM

#

misty flint which makes sense. i cant even imagine their aws bill

I thought microsoft gave them compute?

#

or was that just for training

misty flint Dec 11, 2022, 3:23 AM

#

the verge says theyre using aws for inference for chatgpt

#

and the aws bill is ungodly

verbal venture Dec 11, 2022, 3:41 AM

#

if someone says a model is learned what is meant by that

gilded bobcat Dec 11, 2022, 4:13 AM

#

Hey there I had a Q on file management. So far I am in a project where I have gathered the data, cleaned the raw data and this is it.

My question is how should I organize my files? Should I make one for data gathering, data cleaning, EDA, feature engineering, model building? What do you guys typically do?

strange igloo Dec 11, 2022, 5:28 AM

#

Stats question:
Talent is a good predictor of a football team winning a game.
Home field advantage is a weaker predictor but still relevant.

Question, how could probability for these attributes be combined? As in, how can I add to the probability of a win if the talented team also has home field advantage.

agile cobalt Dec 11, 2022, 5:38 AM

#

strange igloo Stats question: Talent is a good predictor of a football team winning a game. H...

depends on which kind of model you are building?

#

also on how you measure "talent"

strange igloo Dec 11, 2022, 5:39 AM

#

talent would be a ranking of high quality recruits

#

so it's a dataset that is already compiled from publications

agile cobalt Dec 11, 2022, 5:41 AM

#

strange igloo talent would be a ranking of high quality recruits

that does not explain anything about how you are actually measuring it, to be more specific, as a number (or a series of numbers)

#

"is home field" is either True or False (boolean), not much to think about there

#

I guess that turning "talent" into numbers could be ranking them in decreasing order (1, 2, 3, 4, ... with 1 being the "best") or giving them a rating (2.3, 7.8, 10.0 etc with higher being better)

strange igloo Dec 11, 2022, 5:43 AM

#

I see what you mean, the talent ranking would be a number

#

Yeah, just like that. 1 is the best.

agile cobalt Dec 11, 2022, 5:44 AM

#

now as for which kind of model you are building?

#

(after you think about what your inputs and outputs are gonna be, you have to pick a model that can generate that kind of output)

strange igloo Dec 11, 2022, 5:49 AM

#

This is out of my depth. I was thinking more along the lines of standard probability calculations, like how you multiply the probability of independent events in order to get the probability that a specific outcome will occur in sequence
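Staying with plain probability rules, one common way to combine two pieces of evidence is to multiply odds rather than probabilities. A rough sketch, assuming the two factors are conditionally independent given the outcome (a naive-Bayes assumption that may not hold for real football data), that "talent" is reduced to "is the higher-ranked team", and using made-up numbers:

```python
def combine_win_probability(p_talent: float, p_home: float, base_rate: float = 0.5) -> float:
    """Combine two single-factor win probabilities under a naive-Bayes assumption.

    p_talent:  P(win | team is the more talented one)   (hypothetical input)
    p_home:    P(win | team plays at home)              (hypothetical input)
    base_rate: P(win) with no information; 0.5 for a two-team game.
    """
    def odds(p: float) -> float:
        return p / (1.0 - p)

    # Each factor's odds are divided by the base odds once so the prior is not double counted.
    combined_odds = odds(p_talent) * odds(p_home) / odds(base_rate)
    return combined_odds / (1.0 + combined_odds)


# Hypothetical numbers, not taken from any real dataset.
p = combine_win_probability(p_talent=0.70, p_home=0.60)
print(round(p, 3))
```

With a 0.5 base rate, a factor that carries no information (probability 0.5) leaves the other factor's probability unchanged, which is a quick sanity check on the formula.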

plush jungle Dec 11, 2022, 5:53 AM

#

someone trained an image transformer on millions of tumblr screenshots and it generated this image

#

They used this model to do it:
https://github.com/Aleph-Alpha/magma

#

what I don't understand is if you look at the top of the image, it's actually really accurate generated text, and as you get lower the quality deteriorates more and more. What would cause this? Is the transformer generating the image from top to bottom and somehow losing quality with every line of pixels?

agile cobalt Dec 11, 2022, 5:57 AM

#

plush jungle what I don't understand is if you look at the top of the image, it's actually re...

if I had to guess, it almost always sees text in the top part of the image, but it sees different things towards the bottom more often

plush jungle Dec 11, 2022, 5:58 AM

#

agile cobalt if I had to guess, it almost always sees text in the top part of the image, but ...

yeah that sounds plausible

#

I'm pretty dumbstruck that image transformers can solve nlp problems now, we've come a long way

agile cobalt Dec 11, 2022, 6:02 AM

#

tbh I forgot what are our rules/policies on self-promotion, but if you are going to, at the very least make it explicit that it is your own post

worldly dawn Dec 11, 2022, 6:02 AM

#

<@&831776746206265384> ads crossposted on multiple channels

devout sail Dec 11, 2022, 6:06 AM

#

@lapis sequoia we don't allow self promotion that is unrelated to the conversation

hollow fossil Dec 11, 2022, 8:36 AM

#

does anyone know any good courses for data science?

fickle venture Dec 11, 2022, 9:50 AM

#

soft badge but in my columns of pandas how i do this?

yo

#

is this solved yet or do you still need help?

#

@soft badge if u still need help with this, just @ me

strange moon Dec 11, 2022, 10:58 AM

#

Hello! I tried installing the stable-diffusion library for python using: "pip install stable-diffusion-ai", but got this error (in the picture): Fatal error in launcher: Unable to create process using '"c:\python27\python.exe" "C:\Python27\Scripts\pip.exe" install stable-diffusion-ai': The system cannot find the file specified.

am not sure why its doing this

young oyster Dec 11, 2022, 11:01 AM

#

Anyone have a good guide to pandas in general, looking to up my skills in data management

#

Specifically using Dataframes*

obsidian jacinth Dec 11, 2022, 11:13 AM

#

in albumentations, does anyone know how to get the bounding boxes value after you've applied an augmentation pipeline to an image you want to augment?

#

this doc shows visualization of the bounding boxes, but how do you actually fetch those bounding boxes in numerical values after augmentation? https://albumentations.ai/docs/examples/example_bboxes/#define-functions-to-visualize-bounding-boxes-and-class-labels-on-an-image


opal stag Dec 11, 2022, 11:52 AM

#

Question: What is the most important disk access pattern of typical big data applications? Support your answers with arguments.

Professor reply:

The most important access pattern is sequential reads. Adding "over large collections" would be even more accurate. This is because the typical big data applications aim to model the entirety of the data somehow, and to do that the whole data must be read. Furthermore, many approaches require repeatedly reading the collection. However, the specific order of reading the collection is rarely important, allowing for sequential reads in whatever order the collections happen to be in.

OpenAI reply:

The most important disk access pattern of typical big data applications is random access. This is because big data applications often involve large and complex datasets that are stored across multiple disks, and that require fast and flexible access for data processing and analytics tasks. Random access allows the application to access any part of the data quickly and efficiently, without the need to sequentially scan through the data.

Random access is particularly important for big data applications that involve iterative and interactive data processing, such as machine learning and analytics. These applications often require fast and flexible access to the data in order to train and evaluate models, run queries, or generate reports. Random access can enable the application to access and update the data quickly and efficiently, which can improve the performance and responsiveness of the application.

In contrast, other disk access patterns, such as sequential access, are less suitable for big data applications. Sequential access involves reading or writing the data in a sequential order, which can be slow and inefficient for applications that require random access to the data. Additionally, sequential access can be limited by the speed of the disk, which can be a major constraint for applications that require high performance.

Overall, random access is the most important disk access pattern for typical big data applications, as it enables fast and flexible access to the data, which is essential for data processing and analytics tasks.

Which one do you guys agree with most? Professor said it is sequential reads, while the openAI said it is random access XD

mint palm Dec 11, 2022, 11:55 AM

#

i did see how positional encoding is done and how its fed into transformer, but i am unable to understand how they are used so that sequence on data is understood
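One way to see what the encoding is for: attention on its own treats the input as an unordered set, so shuffling the tokens just shuffles the outputs. Adding a position-dependent vector to each token breaks that symmetry. A minimal NumPy sketch, assuming sinusoidal encodings and a stripped-down self-attention with no learned weights (both simplifications for illustration, not a full transformer layer):

```python
import numpy as np


def sinusoidal_positions(seq_len: int, d_model: int) -> np.ndarray:
    """Sinusoidal positional encodings; d_model is assumed to be even."""
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model // 2)[None, :]
    angles = pos / (10000 ** (2 * i / d_model))
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe


def self_attention(x: np.ndarray) -> np.ndarray:
    """Simplified self-attention: queries, keys and values are all x itself."""
    scores = x @ x.T / np.sqrt(x.shape[1])
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ x


rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))  # 4 tokens, embedding size 8 (arbitrary)
perm = np.array([2, 0, 3, 1])     # a reordering of the sequence
pe = sinusoidal_positions(4, 8)

# Without positions: reordering the input only reorders the output.
order_blind = np.allclose(self_attention(tokens[perm]), self_attention(tokens)[perm])

# With positions added: the same reordering now changes the result.
order_aware = not np.allclose(
    self_attention(tokens[perm] + pe), self_attention(tokens + pe)[perm]
)

print(order_blind, order_aware)
```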

neat crescent Dec 11, 2022, 12:20 PM

#

opal stag Question: `What is the the most important disk access pattern of typical big dat...

Never trust chatGPT to be correct.

#

It will tell you the oceans are made of Sprite and the sky is green, all while sounding super confident.

young granite Dec 11, 2022, 12:21 PM

#

neat crescent It will tell you the oceans are made of Sprite and the sky is green, all while s...

a world we all like to live in

mint palm Dec 11, 2022, 12:31 PM

#

young granite a world we all like to live in

sticky beach, nop

hasty mountain Dec 11, 2022, 1:55 PM

#

misty flint also the CEO himself says the compute costs are "eye-watering" so they will have...

This feels like every recent AI model so far... grumpchib

#

Stares at OpenAI with hatred hyperlemon

fair lion Dec 11, 2022, 2:48 PM

#

does anyone know what go libraries to use for xgboosting an nlp model

silent flare Dec 11, 2022, 3:46 PM

#

Hey guys, has anyone tried GPT-JT?

#

running it locally i mean

molten yoke Dec 11, 2022, 3:50 PM

#

Hi - I am new to pandas and I have this:

states_medals_summer = athletes[athletes["Season"] == "Summer"].groupby(['NOC'])["Medal"].count()

silent flare Dec 11, 2022, 3:51 PM

#

silent flare Hey guys, has anyone tried GPT-JT?

it seems to be promising, i have some use cases in mind and i'd like to try it out but currently don't have the hardware needed

grizzled verge Dec 11, 2022, 3:51 PM

#

Hey guys, I'm new to ML, trying to learn decision trees for an internship. I get the concept of the trees, how they evaluate which splits are the best, and which attributes of the vector are most important. However, the code implementation in Python still really confuses me

molten yoke Dec 11, 2022, 3:52 PM

#

But when I do this:

states_medals_summer[states_medals_summer["NOC"] == "ALG"].head()

I get: KeyError: 'NOC'
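The likely cause: `groupby(...)["Medal"].count()` returns a Series whose index is `NOC`, so `NOC` is no longer a column and `["NOC"]` is looked up as an index label. A small sketch with made-up rows standing in for the real `athletes` data:

```python
import pandas as pd

# Made-up rows; the real athletes DataFrame is not shown in the chat.
athletes = pd.DataFrame({
    "NOC": ["ALG", "ALG", "USA", "USA", "USA"],
    "Season": ["Summer", "Winter", "Summer", "Summer", "Summer"],
    "Medal": ["Gold", None, "Silver", None, "Bronze"],
})

states_medals_summer = (
    athletes[athletes["Season"] == "Summer"].groupby(["NOC"])["Medal"].count()
)

# Option 1: the result is a Series indexed by NOC, so look the label up directly.
alg_count = states_medals_summer.loc["ALG"]

# Option 2: turn the index back into a column, then filter as before.
as_frame = states_medals_summer.reset_index()
alg_rows = as_frame[as_frame["NOC"] == "ALG"]

print(alg_count)
print(alg_rows)
```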

grizzled verge Dec 11, 2022, 3:52 PM

#

I first just needed help making sense of this pseudocode

#

#

So for this scenario, attributes are all these 10 parameters listed here

#

And examples are the data we pass in? how do these differ from parent examples

soft badge Dec 11, 2022, 3:54 PM

#

guys what is happen with sklearn?

#

because i install but is version 0.0 and i can not import on file, anyone know how fix?

pliant ferry Dec 11, 2022, 3:58 PM

#

Hi I'm very new to yolov5 and computer vision in general(Just started today) and I was following a tutorial on YouTube about the topic. I was kinda understanding everything so far then I went on to real time detections then encountered this error. tbh I have no idea what I'm doing

#

This is the code btw

tidal bough Dec 11, 2022, 4:14 PM

#

sounds like frame is None.

pliant ferry Dec 11, 2022, 4:26 PM

#

So how exactly do I fix it

mild dirge Dec 11, 2022, 4:45 PM

#

try loading the image

#

make sure it loads

pliant ferry Dec 11, 2022, 4:46 PM

#

Yh

#

it loads

#

soft badge Dec 11, 2022, 4:49 PM

#

is computer vision difficult to understand?

mild dirge Dec 11, 2022, 4:51 PM

#

pliant ferry

So you fixed the error? because previously you got an error on that line

pliant ferry Dec 11, 2022, 4:56 PM

#

The error isn't fixed, the line with the image variable is completely fine but when I run the video capture code I get this "AttributeError: 'NoneType' object has no attribute 'shape' "

#

It's now showing me this

austere swift Dec 11, 2022, 5:18 PM

#

try it in a normal python interpreter rather than a notebook

#

from past experience I've always had issues using opencv's imshow in jupyter

#

or try plt.imshow instead of cv2.imshow

past sinew Dec 11, 2022, 5:47 PM

#

Hello, I have a problem regarding Keras BiLSTM training

#

In order to convert my data frame into sequence matrices, I tried pad_sequences but it didn't work

#

Then I tried keras.utils.timeseries_dataset_from_array to no avail

#

Then I tried keras.layers.TextVectorization

#

Again didn't work

#

This time I converted my data in the df, into arrays

#

And then again attempted the TextVectorization

#

And when I run the from_tensor_slices, it gives me an error saying it failed to convert a Numpy array to a tensor due to unsupported object type : BatchDataset

#

I'm still a student, so sorry if it's an obvious question. Would you please offer some clarification?

tropic matrix Dec 11, 2022, 5:55 PM

#

I'm trying to train a machine learning model to predict the price of an item based off of its attributes, but I'm unsure as to the architecture of DNN i should be using.

After processing, the data has 7823 inputs with 1 output and anywhere between 200k-20m samples are available.

I used sklearn StandardScaler on both the input and on the output data before passing it to the model.

silent epoch Dec 11, 2022, 6:10 PM

#

tropic matrix I'm trying to train a machine learning model to predict the price of an item bas...

are you looking at the kaggle competition too?

tropic matrix Dec 11, 2022, 6:10 PM

#

silent epoch are you looking at the kaggle competition too?

no this is for something else, what competition? i'm interested lmao

silent epoch Dec 11, 2022, 6:10 PM

#

just predicting sales in a list of test shops based on previous data

#

no cash prize for this one tho

#

all the variables are in russian

tropic matrix Dec 11, 2022, 6:11 PM

#

bruh

silent epoch Dec 11, 2022, 6:12 PM

#

i also have no idea how to choose the architecture for DNNs

#

so far, i just follow the general recommendations: spam some filters at the beginning, condense, and slap some regularization tools at the end

strange igloo Dec 11, 2022, 6:52 PM

#

Screen_Shot_2022-12-11_at_12.52.05_PM.png

#

Is this enough data to say that ranking is a better predictor of a win than home field advantage? I'm trying to understand if selecting the better predictor when having simple probabilities like this can be any more complex.

boreal gale Dec 11, 2022, 7:53 PM

#

maybe a chi square test is applicable here?

serene scaffold Dec 11, 2022, 7:54 PM

#

strange igloo Is this enough data to say that ranking is a better predictor of a win than home...

it would appear so, but not by much

strange igloo Dec 11, 2022, 7:54 PM

#

Thank you!

hasty mountain Dec 11, 2022, 8:42 PM

#

neat crescent It will tell you the oceans are made of Sprite and the sky is green, all while s...

Lol, I've just got into Stack Overflow and saw a notice about their policy regarding that

#

Looks like OpenAI just created one of the best scammers so far...and it's not even human

strange igloo Dec 11, 2022, 9:22 PM

#

chatGPT is pretty funny

#

I asked it to come up with myths about cooking steak

#

and it said that you shouldn't let your steak rest too long because it would overcook it

#

I suppose that could only be true if you ate it while it was so hot it burnt your mouth

mint palm Dec 11, 2022, 9:35 PM

#

"ViViT: A Video Vision Transformer"
this was the first transformer with "temporal" attention module, right?

#

for "video" I mean

lapis sequoia Dec 11, 2022, 9:43 PM

#

strange igloo chatGPT is pretty funny


```
6 + 7 = 13

This shows that the sum of 6 and 7 is 14. I apologize again if my previous response was incorrect, and I hope this helps clarify the issue. Let me know if you have any other questions.```

#

very funny lmao its tweaking if u tell it that its wrong

haughty anvil Dec 11, 2022, 10:35 PM

#

Hi, are there any use cases for using speech-to-text with recommendation systems?

serene scaffold Dec 11, 2022, 11:06 PM

#

haughty anvil Hi, are there any use cases for using speech-to-text with recommendation systems...

if you want your recommendation system to accept voice input. what recommendation system?

haughty anvil Dec 11, 2022, 11:23 PM

#

Hi @serene scaffold thank you. I’d like to build one from scratch that accepts voice.

serene scaffold Dec 11, 2022, 11:23 PM

#

haughty anvil Hi @serene scaffold thank you. I’d like to build one from scratch that acce...

how "scratch" is from scratch? can you use libraries?

haughty anvil Dec 11, 2022, 11:32 PM

#

I’d like to build it from scratch for practice but will use a speech recognition library and a Python web framework

serene scaffold Dec 11, 2022, 11:35 PM

#

haughty anvil I’d like to build it from scratch for practice but will use a speech recognition...

do you want to train the speech recognition model yourself, or use an off-the-shelf one?

haughty anvil Dec 11, 2022, 11:38 PM

#

@serene scaffold I’m going to use an off the shelf one.

serene scaffold Dec 11, 2022, 11:38 PM

#

okay. what about the recommendation system? what's that supposed to do?

haughty anvil Dec 11, 2022, 11:40 PM

#

serene scaffold okay. what about the recommendation system? what's that supposed to do?

I’d like to build a recommendation systems for books. Where people give reviews on books and based on those they’d get other recommendations for books they should read.

serene scaffold Dec 11, 2022, 11:40 PM

#

haughty anvil I’d like to build a recommendation systems for books. Where people give reviews ...

can you think of what components that would need to have?

haughty anvil Dec 11, 2022, 11:41 PM

#

In terms of the features I’d like to have?

serene scaffold Dec 11, 2022, 11:42 PM

#

haughty anvil In terms of the features I’d like to have?

what do you know about NLP and recommendation systems in general?

haughty anvil Dec 11, 2022, 11:43 PM

#

Oh, I see. Currently I’m going through a course on building recommendation systems and NLP. I’d like to build the book system with speech-to-text as my final project. I was just trying to make sure it wasn’t overkill with the speech recognition part and there are use cases for it.

serene scaffold Dec 11, 2022, 11:45 PM

#

haughty anvil Oh, I see. Currently I’m going through a course on building recommendation syste...

I would focus on just the recommendation system for now. it wouldn't be difficult to add the StT part onto it.

haughty anvil Dec 11, 2022, 11:45 PM

#

Ok gotcha! Thank you, I’ll do that. It’s been great learning about these systems.

novel python Dec 12, 2022, 1:16 AM

#

what's the main reason my model loss would suddenly drop like that? I can't quite figure out

tranquil cobalt Dec 12, 2022, 5:13 AM

#

Hi, in my new task I am supposed to convert R -> Python. Is there any cheatsheet/documentation to start with? Any help is appreciated. Thanks in advance.

eager shuttle Dec 12, 2022, 5:15 AM

#

does anyone in here know how to wait for an openai completion request to finish processing

tribal bloom Dec 12, 2022, 5:17 AM

#

eager shuttle does anyone in here know how to wait for an openai completion request to finish ...

that might just be the stupidest question ive ever seen

eager shuttle Dec 12, 2022, 5:20 AM

#

oh

worldly dawn Dec 12, 2022, 5:20 AM

#

tribal bloom that might just be the stupidest question ive ever seen

That type of attitude is not welcome
Let's stay welcoming and respectful.

eager shuttle Dec 12, 2022, 5:20 AM

#

Why is it a stupid question?

worldly dawn Dec 12, 2022, 5:21 AM

#

eager shuttle Why is it a stupid question?

It's not a stupid question. Ignore them.
Can you expand on your question?

eager shuttle Dec 12, 2022, 5:22 AM

#

worldly dawn It's not a stupid question. Ignore them. Can you expand on your question?

Well I'm trying to implement the new openai chat thing into my discord bot, and I'm running into an issue where the bot tries to respond before the AI has finished processing its response. I'm wondering if theres a way to delay the response until the AI finishes processing its response

worldly dawn Dec 12, 2022, 5:22 AM

#

eager shuttle Well I'm trying to implement the new openai chat thing into my discord bot, and ...

What does the code look like? I haven't used either libraries

eager shuttle Dec 12, 2022, 5:23 AM

#

async def ai(self, interaction: discord.Interaction, prompt: str):
        with open('../config.json', 'r') as f:
            data = json.load(f)
            openai.api_key = data['OPENAI_KEY']
            
        response = openai.Completion.create(
            model="text-davinci-003",
            prompt=prompt,
            temperature=0.80,
            max_tokens=1500,
            top_p=1,
            frequency_penalty=0.80,
            presence_penalty=0
        )
        response_embed = discord.Embed(
            title=prompt,
            description=response['choices'][0]['text'],
            color=discord.Color.green())
        await asyncio.sleep(len(prompt*2))
        await interaction.response.defer(thinking=True)
        await interaction.edit_original_response(embed=response_embed)

I have a simple command here that just creates the request from the API and then places the text into an embed and sends it. but the request takes time to finish processing. Right now I have a hard sleep coded in to try to account for that time
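A note on the ordering in that snippet: the completion call is synchronous, so by the time it returns the text is already complete and the sleep adds nothing. The more likely problem is that the interaction is only deferred after the slow call, while Discord expects an acknowledgement within a few seconds. A minimal sketch of the reordered flow, using stand-ins instead of the real libraries (`blocking_completion` is a hypothetical placeholder for the API call, and the `events` list only records the order of steps):

```python
import asyncio
import time


def blocking_completion(prompt: str) -> str:
    """Hypothetical stand-in for a synchronous call like openai.Completion.create."""
    time.sleep(0.2)  # simulate the slow network request
    return f"echo: {prompt}"


async def handle_command(prompt: str, events: list) -> str:
    # 1. Acknowledge first. In the bot this would be
    #    `await interaction.response.defer(thinking=True)`.
    events.append("deferred")

    # 2. Run the blocking call in a worker thread so the event loop
    #    (and the rest of the bot) stays responsive while waiting.
    text = await asyncio.to_thread(blocking_completion, prompt)

    # 3. Only now send the result. In the bot this would be
    #    `await interaction.edit_original_response(embed=response_embed)`.
    events.append("responded")
    return text


events = []
result = asyncio.run(handle_command("hello", events))
print(events, result)
```

`asyncio.to_thread` needs Python 3.9 or newer; on older versions `loop.run_in_executor(None, ...)` plays the same role.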

burnt tusk Dec 12, 2022, 6:54 AM

#

hey guys this is the main channel for ai right so I'm trying to make a ai that gets progressively better at aiming/shooting with a bow and my question is where do start when it comes to ai?

lapis sequoia Dec 12, 2022, 8:50 AM

#

What does it mean to evaluate two models then compare their performances with each other?

timber spoke Dec 12, 2022, 9:01 AM

#

lapis sequoia What does it mean to evaluate two models then compare their performances with ea...

basically comparison

lapis sequoia Dec 12, 2022, 9:18 AM

#

timber spoke basically comparison

but what is the difference between 'evaluate' and 'compare'? Because those words mean the same thing but they have them twice in the above statement, so it makes me think it is asking for two separate things

timber spoke Dec 12, 2022, 9:31 AM

#

lapis sequoia but what is the difference between 'evaluate' and 'compare'? Because those words...

let me explain using an example. let's say you have two designs for a digital multiplier. you are not sure which is better, so naturally you need to compare the two designs. so how do you compare them in this case? you have to look at things like overall delay or speed, implementation size, and maybe whether they can perform both signed and unsigned multiplication or just one of the two. these are the things you evaluate in order to be able to compare the two designs. hopefully that makes sense.

patent lynx Dec 12, 2022, 10:12 AM

#

What does df.resample() do?

#

example of code here:

#

```
a = transactions_df.set_index("date").resample("M").transactions.mean().reset_index()
a["year"] = a.date.dt.year
px.line(a, x='date', y='transactions', color='year',title = "Monthly Average Transactions" )```

#

is it just a group by date then sum the transaction amount or there is a special function here?
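Roughly yes: `resample` is a time-aware group-by on a datetime index, and in the snippet the aggregation is `mean()`, not a sum. A small sketch with made-up data showing that it matches a `groupby` with `pd.Grouper`. It uses the month-start alias `"MS"` rather than `"M"`, since as far as I know newer pandas versions rename the month-end alias to `"ME"`; the only difference here is whether each bucket is labelled with the first or last day of the month:

```python
import pandas as pd

# Made-up transactions; the real transactions_df is not shown in the chat.
transactions_df = pd.DataFrame({
    "date": pd.to_datetime(["2022-01-05", "2022-01-20", "2022-02-10", "2022-02-11"]),
    "transactions": [10, 30, 5, 15],
})

indexed = transactions_df.set_index("date")

# Bucket rows by calendar month and average each bucket.
monthly_mean = indexed.resample("MS")["transactions"].mean()

# The same thing written as an explicit group-by on a monthly grouper.
grouped_mean = indexed.groupby(pd.Grouper(freq="MS"))["transactions"].mean()

print(monthly_mean)
```

One difference from grouping on a plain month column: months with no rows still appear in the resampled result (as NaN for `mean`), which keeps the time axis evenly spaced for plotting.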

celest geyser Dec 12, 2022, 11:51 AM

#

guys should i continue ML/AI path because i like ai i have done some basic just on the basis of tutorial but i m not too comfortable with advanced math like i can understand basic like linear regression and things like that but is math must need for it ? or should i switch my path?

wooden sail Dec 12, 2022, 12:01 PM

#

you can use AI libraries without knowing the math, but your understanding and problem-solving will be hindered

#

ML is, after all, math

lapis sequoia Dec 12, 2022, 12:43 PM

#

celest geyser guys should i continue ML/AI path because i like ai i have done some basic just ...

yeah you can continue

#

you can learn the maths on the go

#data-science-and-ml
