#data-science-and-ml
With 1 output it looked kind of like this in feel:
Scalar-vector multiplication.
This is where some of the magic of matrices comes into play.
This looks like the dot product, but the other one is transposed.
Note how instead of squishing down to one thing, it expands out.
Now imagine a are the outputs and b are the inputs, you can see that it's doing what we did before for each output.
1 * inputs, 2 * inputs, 3 * inputs
Are the rows.
but these are different shapes, whereas in the picture they're both same length vectors
This doesn't seem right. Have you ever seen a validation dataset so low?
Try length 2 for a and 3 for b.
2 outputs, 3 inputs.
You can also try 1 output to see that it's the same as the backwards pass you had before.
you mean with np.matmul?
By hand or matmul.
Also if you have 2 outputs and 3 inputs, what is the shape of W here?
!e
import numpy as np
a = np.array([1,2])
b = np.array([3,4,5])
print(np.matmul(a,b))
@plush jungle :x: Your 3.11 eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "<string>", line 4, in <module>
003 | ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 3 is different from 2)
2x3?
Yes, because the input is 3x1 and the output is 2x1 (2x3 * 3x1 -> 2x1).
Not exactly sure what you mean.
Do you mean in your code?
why is the code throwing an error in the code snippet I posted
This is a numpy detail. Your vectors have shape (2,) and (3,), not (2, 1) and (3, 1).
Matmul works on matrices.
You can reshape them first.
!e```py
import numpy as np
a = np.array([1,2])
b = np.array([3,4,5])
a = np.reshape(a, (2,))
b = np.reshape(b, (3,))
print(np.matmul(a,b))
```
@plush jungle :x: Your 3.11 eval job has completed with return code 1.
001 | Traceback (most recent call last):
002 | File "<string>", line 8, in <module>
003 | ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 3 is different from 2)
!e```py
import numpy as np
a = np.array([1,2])
b = np.array([3,4,5])
a = np.reshape(a, (2,1))
b = np.reshape(b, (3,1))
print(np.matmul(a,b.T))
```
@iron basalt :white_check_mark: Your 3.11 eval job has completed with return code 0.
001 | [[ 3 4 5]
002 | [ 6 8 10]]
@plush jungle
oh ok nice
Note the transpose on b as described in the image.
right
For reference: https://en.wikipedia.org/wiki/Outer_product
In linear algebra, the outer product of two coordinate vectors is a matrix. If the two vectors have dimensions n and m, then their outer product is an n × m matrix. More generally, given two tensors (multidimensional arrays of numbers), their outer product is a tensor. The outer product of tensors is also referred to as their tensor product, and...
Now note that the result of this operation gives a matrix with dimensions that match the weight matrix.
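To make the shape bookkeeping concrete, here is a small sketch with the same `a` and `b` as in the snippets above. `np.outer` is NumPy's built-in for this operation; the names `via_matmul` and `via_outer` are only for illustration.

```python
import numpy as np

a = np.array([1, 2])      # think: 2 outputs
b = np.array([3, 4, 5])   # think: 3 inputs

# Explicit column vector times row vector: (2, 1) @ (1, 3) -> (2, 3)
via_matmul = np.matmul(a.reshape(2, 1), b.reshape(1, 3))

# np.outer flattens its inputs and computes the same table of products
via_outer = np.outer(a, b)

print(via_matmul)
print(via_outer.shape)
```

Either way the result has one row per output and one column per input, which is the same layout as a weight matrix for 2 outputs and 3 inputs.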
ok i'm following so far
can you tie it into this?
```
Neuron 1 -> neuron 2 -> neuron 3
Neuron 3 gradient = dE/dw3
Neuron 2 gradient = dE/dw2 = dw3/dw2 * dE/dw3
Neuron 1 gradient = dE/dw1 = dw2/dw1 * dw3/dw2 * dE/dw3
```
So sticking to the matrix notation, we want dE/dW for updating reasons.
And because of the way the weight update rule works, we need that to be a matrix with the same shape as the weights.
Note here you had w_i, but since we are working with matrices, there is no subscript, it's just W.
Let's let a be the W^Tx+b:
.latex $$\bm{o}=\sigma(\bm{a})$$
so I've got this
def backpropagate(self, output, y):
# output neuron update
# this outputs a 3-vector
output_gradient = -(y - output[2]) * output[2]* (1-output[2]) * output[1]
# update the weights with this 3-vector
self.output_layer.weights -= output_gradient * self.lr
# hidden layer update
# this outputs a 3-vector
hidden_delta = -(y - output[1]) * output[1] * (1-output[1])
# reshape it from (3,) to (3,1)
hidden_delta = np.reshape(hidden_delta, (3,1))
# reshape the input from (10000,) to (1,10000)
output[0] = np.reshape(output[0], (1,10000))
# matrix multiply the delta and the input
# this returns a (3,10000) matrix
hidden_gradient = np.matmul(hidden_delta, output[0])
but the hidden_gradient needs to be multiplied by the output_gradient for the chain rule part before it can update the hidden layer weights
but I don't understand how I can multiply a (3,1) gradient and a (3,10000) gradient and get a (3,10000) gradient
Does anyone know a way to visualize the changes made to pandas data frames when transforming them? I do not mean visualize the data in a graph but visualize how the data frame itself is changed. I know of pandas tutor but it seems to be only usable as a website. Not as a way of documenting a data transformation pipeline. Thanks!
A y - o does not happen in the hidden layers. Only the output layer.
If you have 1 output neuron, then y will also be length 1, but if your hidden layer has 3 neurons, then that won't work. Can't do 1 dimensional thing minus 3 dimensional thing.
Before trying to get backprop to work, try just 1 layer with multiple outputs.
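A minimal sketch of that intermediate step, assuming a single sigmoid layer trained with squared error on one example. The sizes, the seed, and the names `W`, `b`, `x`, `y`, `lr` are made up for illustration and are not taken from the code in this conversation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_in, n_out = 3, 2
W = rng.normal(size=(n_out, n_in))   # one row of weights per output neuron
b = np.zeros(n_out)
x = np.array([0.5, -1.0, 2.0])
y = np.array([1.0, 0.0])             # one target per output neuron
lr = 0.5

def loss(W, b):
    o = sigmoid(W @ x + b)
    return 0.5 * np.sum((o - y) ** 2)

loss_before = loss(W, b)
for _ in range(200):
    o = sigmoid(W @ x + b)           # forward pass, shape (2,)
    delta = (o - y) * o * (1 - o)    # one delta per output, shape (2,)
    grad_W = np.outer(delta, x)      # same shape as W: (2, 3)
    W -= lr * grad_W
    b -= lr * delta
loss_after = loss(W, b)
print(loss_before, loss_after)
```

Here `y` and the output have the same length, so `o - y` is well defined, and the outer product gives a gradient shaped like `W` without any manual reshaping.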
hey, anyone knows any open source projects to practise what I learnt in python course?
kaggle
isnt it enough to check input/output of ur frame?
Hello. I require help on choosing a dataset for speed calculations based on gps tracking. I have been searching for it however haven't been able to get a dataset. Our teacher has told to do proper research for the project but I just can't get hold on where to start. I and my team would be really thankful if somebody could help.
No, there's a lot of transformations going on. Many different batches of data from many different sources each requiring many transformations.
thanks
Hi! I have a question regarding feature engineering, in particular feature selection.
My data set consists of text and has features such as:
```
Sentence - {"sentence in here", "another sentence here", ...}
Topic - {"Sports", "elections", "food"}
Label - {"Bias", "Non-Biased"}
```
For numeric data, I know there are things like: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.corr.html. Is there something similar but for textual data?
Hello everyone
I want to ask a question related to AI and ML
can I please ask here
hi everyone, I'm just getting started on Python, is this the right place to ask questions ?
just ask
Hello everyone
I've seen many new AI tools, so I want to ask: I'm planning to create something that helps people generate illustrations and UI designs according to their needs. Is this possible?
By UI I mean the final results, like the UIs presented on Dribbble. People often don't know what or how to design.
Is it doable to build a tool like this and solve this problem with AI?
If it's possible, how can I do it?
Making beautiful illustrations or web UIs according to individual needs
Like this: I found this attractive design on Dribbble and want to automate it with the help of AI. I think AI can also be used for things that help people. What are your thoughts on this? Would it be a useful concept for helping people generate designs?
If you know their job position and want to predict what they will be earning why not simply use the mean or median of their position?
If you have more information and want very accurate predictions, you could make multiple separate models for each job position.
hi i am having trouble with merging two data frames
import pandas as pd
df=pd.DataFrame({'name':['hamid','meow','billu'],'job':['cs','cat','cat']})
df2=pd.DataFrame({'hobby':['playing','eating','sleeping'],'friends':['mahab','carry','kutta']})
df3=pd.merge(df,df2, left_on='name', right_on='hobby')
print(df3)
kindly help the above code displays empty data frame on output
explain why machine learning model will perform better than the others
would u like to help
If you know how to use vlookup in excel or joins in SQL, you should be able to merge data in python using merge() as well
Just pay attention to the columns that match in both dataframes
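A sketch of why the merge above comes back empty, plus one possible fix. The fix rests on an assumption that is not stated in the question: that row i of `df2` describes the same individual as row i of `df`.

```python
import pandas as pd

df = pd.DataFrame({'name': ['hamid', 'meow', 'billu'], 'job': ['cs', 'cat', 'cat']})
df2 = pd.DataFrame({'hobby': ['playing', 'eating', 'sleeping'],
                    'friends': ['mahab', 'carry', 'kutta']})

# No value in df['name'] ever equals a value in df2['hobby'],
# so the default inner join has nothing to match
empty = pd.merge(df, df2, left_on='name', right_on='hobby')
print(len(empty))

# Assumption: rows correspond by position, so join on the row index instead
by_position = pd.merge(df, df2, left_index=True, right_index=True)
print(by_position)
```

If the rows do not line up by position, `df2` would need its own `name` column to serve as the key.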
Hi! I have a question regarding this approach of finding the correlation between a feature that consists of strings: https://stackoverflow.com/questions/51241575/calculate-correlation-between-columns-of-strings. Does this actually state the accuracy of the feature? I know this may be a silly question, I am just a bit curious (I have a dataset that has some articles with a bias/non-bias label, and a topic feature. When trying this method, the topic feature had quite a low (around 0.03) correlation to the bias label)
there's no unique way of doing this. you can encode the strings in different ways
this categorical approach has the caveat that the order in which you encode strings affects the correlation because the code is not equidistant
you could one-hot instead, which would make strings equidistant, but also pairwise orthogonal and yield high dimensional vectors
pick your poison, no method is perfect
i generally use onehot
```py
pd.get_dummies(df)
```
or you could encode them as giant binary numbers (how they are internally stored) and feed them in, that has the problem of managing such large numbers, it's less hard in python, but for large strings it could be difficult.
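The ordering caveat for integer codes can be seen on a toy example. The data below is invented for illustration (the label is 1 only for the "sport" rows), and the two code mappings are equally arbitrary.

```python
import pandas as pd

# Made-up toy data: label is 1 only for the "sport" rows
df = pd.DataFrame({
    'topic': ['sport', 'sport', 'food', 'food', 'elections', 'elections'],
    'label': [1, 1, 0, 0, 0, 0],
})

# Integer codes under two different orderings of the same categories
codes_a = df['topic'].map({'elections': 0, 'food': 1, 'sport': 2})
codes_b = df['topic'].map({'elections': 0, 'sport': 1, 'food': 2})
corr_a = codes_a.corr(df['label'])
corr_b = codes_b.corr(df['label'])
print(corr_a, corr_b)

# One-hot: one 0/1 column per topic, no ordering involved
one_hot = pd.get_dummies(df['topic']).astype(int)
one_hot_corr = one_hot.corrwith(df['label'])
print(one_hot_corr)
```

The same column yields a strong correlation under one ordering and none under the other, while the one-hot columns report a separate number per topic that does not depend on any ordering.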
This should help
Hey @lavish kraken!
It looks like you tried to attach file type(s) that we do not allow (.pdf). We currently allow the following file types: .gif, .jpg, .jpeg, .mov, .mp4, .mpg, .png, .mp3, .wav, .ogg, .webm, .webp, .flac, .m4a, .csv, .json.
Feel free to ask in #community-meta if you think this is a mistake.
@lavish kraken we only allow links to PDFs, sorry
ok, just trying to help the guy..let me copy the image and paste right?
the image is fine. if the link to the pdf does not involve piracy, that's also fine
I get an error when importing a sequential model but I don't see why
Can someone help me please?
Does this mainly impact strings with more than 1 word? Ie would a single worded string not be impacted by this?
it would also affect the 1 word case
Just to confirm that I'm not doing it wrong or misunderstanding: with one-hot, would that essentially check the correlation between each value under the feature (in this case topic) and the outcome (when I give one-hot encoding a try, this is what happens)? I.e. with the StackOverflow version I sent in the previous message, I get a correlation between the feature as a whole and the outcome value, while with one-hot it would return the correlation between each value that the feature can take and the outcome value?
no, they would both do the same thing
take one feature and compare it to another
that ofc includes taking a single value if you like
I'm trying to convert this data to be a dataclass that holds a list of dataclasses
data:
DATA_POINTS = {
'data_points': [[1.0, 1.2132985766400843], [2.0, 1.164865727865016], [3.0, 1.1534609099056354],
[4.0, 1.148530443569608], [5.0, 1.1488081940756838], [6.0, 1.156518190001923]]
}
my code:
async def test_parse_datapointsTO():
datapoint = DataTO.from_dict(DATA_POINTS)
print("worked")
@dataclass
class DataTO(JSONWizard):
some_field: str
raw_data_points: list[DataPointTO]
@dataclass
class DataPointTO(JSONWizard):
class _(JSONWizard.Meta):
debug_enabled = True
raise_on_unknown_json_key = True
x: float
y: float
@property
def data_point(self) -> list[float, float]:
return [self.__x, self.__y]
@data_point.setter
def data_point(self, data_point: list[float, float]):
self.__x = data_point[0]
self.__y = data_point[1]
Can someone explain to me why this isn't working or help me fix it 😮
Thanks in advance!
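Some likely mismatches in the snippet above: the input key is `data_points` while the field is named `raw_data_points`, `some_field` has no value in the input, and each point is a two-element list rather than a dict with `x` and `y` keys. A sketch using only the standard library `dataclasses` module (so no claims about how `JSONWizard` handles this); the class names `Data` and `DataPoint` and the hand-written `from_dict` are illustrative assumptions.

```python
from dataclasses import dataclass

DATA_POINTS = {
    'data_points': [[1.0, 1.2132985766400843], [2.0, 1.164865727865016]],
}

@dataclass
class DataPoint:
    x: float
    y: float

@dataclass
class Data:
    data_points: list[DataPoint]

    @classmethod
    def from_dict(cls, raw: dict) -> "Data":
        # Each inner item is a two-element list, so unpack it positionally
        return cls(data_points=[DataPoint(x, y) for x, y in raw['data_points']])

data = Data.from_dict(DATA_POINTS)
print(data.data_points[0])
```

The same idea should carry over to a library-based loader: either reshape the input into dicts with `x` and `y` keys first, or convert the lists by hand as done here.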
Hi dear fellows.
Does anyone here own this book:
Intro to Python for Computer Science and Data Science: Learning to Program with AI, Big Data and The Cloud.
If so, could you share it here please ...
This server is not a place to seek out pirated books
hi, is there a way to get confidence intervals when using scipy's fisher_exact?
ok clearly I've done something wrong here
```py
def backpropagate(self, output, y):
    # output neuron update
    # this outputs a scalar
    output_delta = -(y - output[2]) * output[2]* (1-output[2])
    # update the bias with this scalar
    self.output_layer.biases -= output_delta * self.lr
    # multiply the delta times the input to produce a 3-vector
    output_gradient = output_delta * output[1]
    # update the weights with this 3-vector
    self.output_layer.weights -= output_gradient * self.lr
    # hidden layer update
    # this outputs a 3-vector
    hidden_delta = output_gradient * self.output_layer.weights * (1 - output[1])
    # update the biases with this 3 vector
    self.hidden_layer.biases -= hidden_delta * self.lr
    # reshape it from (3,) to (3,1)
    hidden_delta = np.reshape(hidden_delta, (3,1))
    # reshape the input from (,10000) to (1,10000)
    output[0] = np.reshape(output[0], (1,10000))
    # this returns a (3,10000) matrix
    hidden_gradient = np.matmul(hidden_delta, output[0])
    # update the hidden layer weights with this matrix
    self.hidden_layer.weights -= hidden_gradient * self.lr
```
because the loss for one class keeps going up, and the loss for the other class keeps going down
some that I checked out myself:
just want to get stuff done? fast.ai on their website
want to understand it? andrew ng's on Coursera
don't like andrew's for whichever reason? sklearn inria mooc
there are some others on our website as well
!resources
The Resources page on our website contains a list of hand-selected learning resources that we regularly recommend to both beginners and experts.
some basic notion of algorithms and data structures might help, but you really don't need a lot to get started, and imo it's better to look things up as you need them than to try to get ready for the unknown
thanks for the help! but I'm not sure I understand. output_gradient and self.output_layer.weights does have a * symbol
and hidden_gradient is using np.matmul
yeah my bad im only 16 im probably dumb lol
don't discount yourself, imposter syndrome is bad enough as it is among programmers
but I don't see where you changed the code
I have zero experience whatsoever programming but I have had an AI write something for me, and I have been attempting to work out the kinks in visual studio code, not sure if it’s worth it but it’s pretty fun messing around with it nonetheless
Anybody have any advice for an absolute beginner in this situation lol
@plush jungle you should use the output layer's input (which is the hidden layer's output) to calculate the gradient, instead of the hidden layer's output. You can do this by changing the line output_gradient = output_delta * output[1] to output_gradient = output_delta * output[0]. This should fix the issue where the loss for one class keeps going up and the loss for the other class keeps going down.
maybe
In machine learning, backpropagation (backprop, BP) is a widely used algorithm for training feedforward artificial neural networks. Generalizations of backpropagation exist for other artificial neural networks (ANNs), and for functions generally. These classes of algorithms are all referred to generically as "backpropagation". In fitting a neura...
but if I do that, then the output_gradient becomes shape (1,10000)
since output[0] is the pixel vector of the image
this doesn't make sense. shouldn't there also be a multiplication with the previous layer's input? I don't see that on here
o_i
the top equation for an output neuron looks exactly like my line here
```py
output_delta = (output[2] - y) * output[2]* (1-output[2])
```
but then I do this
```py
output_gradient = output_delta * output[0]
```
Extra bit of information that might help.
oh wait I see it squiggle. the second picture you posted has it
For example, when using one hot encoding:
```python
s_corr = df.topic.str.get_dummies().corrwith(df['Label_bias'].astype('category').cat.codes)
print(s_corr)
------output-------
abortion -0.051860
coronavirus 0.002015
elections-2020 -0.024377
environment 0.017935
gender 0.055068
gun-control 0.005897
immigration 0.029517
international-politics-and-world-news 0.026522
middle-class -0.013546
sport 0.145206
student-debt 0.005479
trump-presidency -0.085066
vaccines -0.015487
white-nationalism -0.095884
```
```python
dfCopy = df.copy()
dfCopy['topic'] = dfCopy['topic'].astype('category').cat.codes
dfCopy['Label_bias'] = dfCopy['Label_bias'].astype('category').cat.codes
dfCopy.corr()['Label_bias']['topic']
------output-------
-0.03312874060844754
```
I see these as 2 different perspectives, i.e. the one-hot shows how each specific value of topic has an impact on the label. Not sure if this makes sense, but to convert it to a single value (i.e. to judge the correlation of the feature "topic" as a whole with the output label) would it make sense to just calculate the average of all the output values of the one-hot encoding?
so how is my code any different? it looks exactly the same
In mathematics, particularly in linear algebra, matrix multiplication is a binary operation that produces a matrix from two matrices. For matrix multiplication, the number of columns in the first matrix must be equal to the number of rows in the second matrix. The resulting matrix, known as the matrix product, has the number of rows of the first...
Remember that * in numpy is element-wise multiplication.
Not matrix multiplication or the dot product.
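A small side-by-side of the two operations, on made-up numbers, to show how different the results are in shape and in value.

```python
import numpy as np

W = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])    # shape (2, 3)
x = np.array([1.0, 0.0, -1.0])     # shape (3,)

elementwise = W * x    # broadcasts x across each row, result shape (2, 3)
matvec = W @ x         # one dot product per row, result shape (2,)

print(elementwise)
print(matvec)
```

The matrix-vector product is the element-wise product followed by a sum along each row, which is why `np.sum(W * x, axis=1)` matches `W @ x` here.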
What operation is being done here?
that's matrix multiplication right?
Yes, but also note that the delta only has one subscript.
So delta is a vector.
(Which can be seen as a matrix with only 1 column, so the indices are like j, 1 optionally)
right
Note that in the forward pass we did matrix-vector multiplication for multiple outputs.
Sort of distributing the inputs to all of the outputs.
And when we go backwards, we want to distribute the deltas to the inputs.
so this line is wrong then right?
hidden_delta = output_gradient * self.output_layer.weights *(1 - output[1])```
it doesn't have np.matmul(w_j, dl)
Yes, also you can implement it with plain old loops which follows the subscript notation exactly.
Then convert to numpy / the linear algebra notation way after.
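A sketch of that workflow for the hidden-layer deltas, assuming sigmoid units and the formula delta_j = o_j (1 - o_j) * sum over l of w_jl * delta_l from the equations being discussed. The array names and the layout `W_out[l, j]` (one row per output neuron) are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
n_hidden, n_out = 3, 2
W_out = rng.normal(size=(n_out, n_hidden))          # W_out[l, j]: weight from hidden j to output l
hidden_out = rng.uniform(0.1, 0.9, size=n_hidden)   # o_j, sigmoid outputs of the hidden layer
output_delta = rng.normal(size=n_out)               # delta_l, already computed for the output layer

# Loop version, following the subscripts literally
hidden_delta_loop = np.zeros(n_hidden)
for j in range(n_hidden):
    total = 0.0
    for l in range(n_out):
        total += W_out[l, j] * output_delta[l]
    hidden_delta_loop[j] = total * hidden_out[j] * (1 - hidden_out[j])

# Same computation in matrix form: the inner sum is a matrix-vector product
hidden_delta_vec = (W_out.T @ output_delta) * hidden_out * (1 - hidden_out)
print(np.allclose(hidden_delta_loop, hidden_delta_vec))
```

Only the sum over `l` becomes a matrix product; the factor `o_j (1 - o_j)` stays element-wise.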
but what are w_j and dl here
w_j I'm guessing is
self.output_layer.weights```
but is dl the output_gradient?
de/do is definitely output_gradient
but I actually don't know what do/dnet is
"net" here is explained in the wikipedia post, it's the input to the sigmoid (sigmoid(net)).
so if we're calculating the gradient to adjust weights j, it would be the input to neuron j
right
.
and delta_j is an elementwise multiplication of those two derivatives?
Exactly as the equations are written, you just need the loops for the indices.
Which can be absorbed into notation as before with dot and matrix product.
delta here is a vector, as can be seen by the single subscript, so delta_j is a single component of that vector.
Try reading through the wikipedia section on matrix multiplication and see if you can understand how the subscript definition works.
Then see if you can write matrix-vector multiplication in subscript form.
The deltas or """errors""".
so not the previous layer's gradient, but the previous layer's delta?
You can get away with the term gradient, but it's not exactly right.
So just "deltas".
delta for the output is calculated like this
```py
output_delta = (output[2] - y) * output[2]* (1-output[2])
```
but once you do this
```py
output_gradient = output_delta * output[1]
```
what is it called if not the gradient?
Gradient is the end result you get from all this stuff before you use it to update the weights.
so technically it's only a gradient once you have it for every layer?
As for this, i'm not exactly sure if it's correct (i'm not following all of your code), but if it's what you end up applying to the weights when you subtract by alpha times it, then yeah.
I see. is there a shorter way to write that than
derivative_of_error_with_respect_to_weights```
Sorry to jump in in the conversation but has anyone used Jax Jit functions with Haiku?
nabla?
for the inner neurons I'm thinking something like this
hidden_delta = np.matmul(self.hidden_layer.weights, output_delta) * output[1] * (1-output[1])
but I think it's wrong since it's the wrong shape.
everything looks right except output_delta
What is output_delta?
which was calculated like this
output_delta = (output[2] - y) * output[2]* (1-output[2])```
but if dl isn't output_delta, what is it?
So from the equation you can see that there is no (o - y) for the inner neurons.
If this is the output layer then it's correct.
Oh wait, ok.
So what is the shape mismatch?
ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 3 is different from 1)```
hidden layer weights should be a 3x10000 matrix
What is the shape of output_delta?
(1,1), it's a scalar
The weights should be the ones from the hidden to the output.
ok it now looks like this and throws no errors
```py
def backpropagate(self, output, y):
    # output neuron update
    # this outputs a scalar
    output_delta = (output[2] - y) * output[2]* (1-output[2])
    # update the bias with this scalar
    self.output_layer.biases -= output_delta * self.lr
    # multiply the delta times the input to produce a 3-vector
    output_gradient = output_delta * output[1]
    # update the weights with this 3-vector
    self.output_layer.weights -= output_gradient * self.lr
    # hidden layer update
    # this outputs a 3-vector
    hidden_delta = np.matmul(output_delta, self.output_layer.weights) * output[1] * (1-output[1])
    # update the biases with this 3 vector
    self.hidden_layer.biases -= hidden_delta * self.lr
    # reshape it from (3,) to (3,1)
    hidden_delta = np.reshape(hidden_delta, (3,1))
    # reshape the input from (,10000) to (1,10000)
    output[0] = np.reshape(output[0], (1,10000))
    # this returns a (3,10000) matrix
    hidden_gradient = np.matmul(hidden_delta, output[0])
    # update the hidden layer weights with this matrix
    #self.hidden_layer.weights -= hidden_gradient * self.lr
    self.hidden_layer.weights -= hidden_delta * self.lr
```
but something is still wrong
when given class 0 and class 1, it only goes in one direction, adjusting the weights either positively or negatively. it should alternate, adjusting it towards the class label
The reshaping seems error prone.
why
Your hidden delta should be the correct shape from having done the matmul correctly.
if I print its shape out before I reshape, it's 1,3
then reshape makes it 3,1
Other than that, it's just setting off my warnings, idk, subconscious.
So, one thing that could be wrong is signs. Try += instead of -= for the weight updates.
Oh.
Don't update the weights until the end.
You are updating the output layer weights before continuing back.
Then multiplying with the updated, not the old.
I'm changing the weights and then using the changed version
This is why everyone uses an autodiff tool, manually doing it is painful.
Do phases of compute the gradients, then update with them.
Two separate paragraphs / sections.
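A self-contained sketch of that two-phase structure, assuming one hidden sigmoid layer, one sigmoid output, and squared error. It is not the code from this conversation: the sizes are shrunk to 4 inputs, and the parameter names, seed, and two training examples are invented for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n_in, n_hidden = 4, 3                        # the chat uses 10000 inputs; 4 keeps this small
W1 = rng.normal(scale=0.5, size=(n_hidden, n_in))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.5, size=(1, n_hidden))
b2 = np.zeros(1)
lr = 0.5

def forward(x):
    h = sigmoid(W1 @ x + b1)                 # hidden outputs, shape (3,)
    o = sigmoid(W2 @ h + b2)                 # network output, shape (1,)
    return h, o

def backprop_step(x, y):
    global W1, b1, W2, b2
    h, o = forward(x)
    # Phase 1: compute every delta and gradient using the OLD weights
    output_delta = (o - y) * o * (1 - o)                 # shape (1,)
    grad_W2 = np.outer(output_delta, h)                  # shape (1, 3)
    hidden_delta = (W2.T @ output_delta) * h * (1 - h)   # shape (3,)
    grad_W1 = np.outer(hidden_delta, x)                  # shape (3, 4)
    # Phase 2: only now touch the parameters
    W2 -= lr * grad_W2
    b2 -= lr * output_delta
    W1 -= lr * grad_W1
    b1 -= lr * hidden_delta

# Two made-up examples with opposite labels
data = [(np.array([1.0, 0.0, 1.0, 0.0]), 0.0),
        (np.array([0.0, 1.0, 0.0, 1.0]), 1.0)]

def total_loss():
    return sum(0.5 * ((forward(x)[1] - y) ** 2).item() for x, y in data)

loss_before = total_loss()
for _ in range(500):
    for x, y in data:
        backprop_step(x, y)
loss_after = total_loss()
print(loss_before, loss_after)
```

Keeping all the `grad_*` computations above the first `-=` makes it impossible to multiply by already-updated weights, which is the bug described above.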
ok the only reshapes I have now are turning the biases from (,3) to (1,3) and the x from (,10000) to (1,10000)
```py
def backpropagate(self, output, y):
    # this outputs a scalar
    output_delta = (output[2] - y) * output[2]* (1-output[2])
    # multiply the delta times the input to produce a 3-vector
    output_gradient = output_delta * output[1]
    # this outputs a 3-vector
    hidden_delta = np.matmul(output_delta, self.output_layer.weights) * output[1] * (1-output[1])
    # reshape the input from (,10000) to (1,10000)
    output[0] = np.reshape(output[0], (1,10000))
    # this returns a (3,10000) matrix
    hidden_gradient = np.matmul(hidden_delta.T, output[0])
    # update the output bias
    self.output_layer.biases -= output_delta * self.lr
    # update the output weights
    self.output_layer.weights -= output_gradient * self.lr
    # update the hidden layer biases
    self.hidden_layer.biases -= hidden_delta * self.lr
    # update the hidden layer weights
    self.hidden_layer.weights -= hidden_gradient * self.lr
```
instead of flipping it with reshape I used transpose for this line
hidden_gradient = np.matmul(hidden_delta.T, output[0])```
but the predictions are still moving in only one direction
You have a += instead of -= for one of them.
Did you shuffle the inputs?
What is the lr?
I do not shuffle the inputs. I've tried learning rates between .5 and .00005 but it's the same thing. high learning rates eventually converge and stop updating
['0[[0.25051035]]', '1[[0.00313093]]']
['0[[0.24686542]]', '1[[0.24683338]]']
['0[[0.2468654]]', '1[[0.2468654]]']
['0[[0.2468654]]', '1[[0.2468654]]']
['0[[0.2468654]]', '1[[0.2468654]]']```
loss for class 0 and class 1 examples with a high learning rate
Instead of setting output[0] to the shape, just pass the reshape directly to the matmul.
same thing
What is the forward pass?
```py
output_vectors = nn.forward(input_vector)
output_vectors.insert(0,input_vector)
prediction = output_vectors[-1]
error = L2_loss(image_class, prediction)
nn.backpropagate(output_vectors, image_class)
```
What is forward doing and what is backpropagate doing?
backpropagate is the above function. forward is this
```py
def forward(self,x):
    output_vectors = []
    for layer in self.layers:
        # if the first layer
        if not output_vectors:
            # pass it the input vector
            x = np.tile(x,(layer.num_neurons, 1))
            output = layer.forward(x)
        else:
            # otherwise pass it the previous layer's output
            output = layer.forward(output_vectors[-1])
        output_vectors.append(output)
    return output_vectors
```
backpropagate takes a single output vector, not multiple.
Oh wait, ok, it's for the layers.
Why is x tiled?
because x is the input vector, 10000 length, but it gets passed to 3 hidden layer neurons
so I need to pass it to each neuron
so I make it (3,10000)
The 3 neurons all share the same 10000 inputs.
If it's fully connected.
You don't duplicate them.
yes, but np.tile duplicates the input vectors right
```py
def forward(self,x):
    return sigmoid(np.sum(self.weights * x) + self.biases)
```
ok i think I see. this only works if the weights and the input are the same shape
but if I didn't do tile and instead did matmul(x, self.weights)
they're the same thing if the matrices are the exact same shape, right?
Matrix multiplication basically resamples the inputs again for each output.
So no duplication needed.
You want each output neuron to do the dot between its weight vector and the input vector.
So there is 1 input vector, and N weight vectors.
Wx, matrix multiplication, does multiple dot products.
The weight vectors are stored in W.
Side by side.
When you do W^Tx it's basically doing the dot product between each weight vector and x.
And so you get out a vector where each component is the dot product result.
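A sketch comparing the tiled approach with a plain matrix-vector product, on small made-up sizes (5 inputs, 3 neurons). The weights are stored one weight vector per row here, which is an assumption of this example. One thing to note: `np.sum` without an `axis` argument adds up every element, so a layer written as `np.sum(self.weights * x)` would collapse all neurons into a single number; the tiled version below needs `axis=1` to keep them apart.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x = rng.normal(size=5)               # one input vector, shape (5,)
weights = rng.normal(size=(3, 5))    # three weight vectors, one per row
biases = np.zeros(3)                 # kept as a plain (3,) vector

# Tiled version: copy x three times, multiply element-wise, sum each row
tiled = np.tile(x, (3, 1))
per_neuron_sums = np.sum(weights * tiled, axis=1)

# Matrix-vector product: the same three dot products, no copies of x
out = sigmoid(np.dot(weights, x) + biases)
print(out.shape)
```

Both routes give the same pre-activation values; the matrix-vector form just skips the duplication.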
like this
```py
def forward(self,x):
    return sigmoid(np.dot(self.weights, x) + self.biases)

output = layer.forward(x)
```
We already covered this and you find it in our previous messages.
yeah
except wait, since weights is a matrix
it should be
```py
def forward(self,x):
    return sigmoid(np.matmul(self.weights, x) + self.biases)

output = layer.forward(x)
```
For 2-D weights and a 1-D x, matmul and dot give the same result, so numpy lets us be lazy with either.
ok
Lets us do stuff like (n, m) (m,).
but changing it to np.dot and getting rid of tile threw this
File "<__array_function__ internals>", line 180, in dot
ValueError: shapes (1,3) and (1,3) not aligned: 3 (dim 1) != 1 (dim 0)```
yes
self.biases = np.random.rand(num_neurons)
self.biases = np.reshape(self.biases, (1,num_neurons))```
What is the shape of x and weights?
Oh, why is biases reshaped?
Let it just be a vector.
because np.random.rand outputs (3,)
We can temporarily reshape in backprop.
ok
x, weights, x, weights
looks like this
```
(10000,)
(3, 10000)
(3,)
(1, 3)
```
Which dot is causing the error?
wdym?
For which weights / x.
it looks like it's just the last two
wait actually
the forward error goes away
when I get rid of the bias reshape
but then there's an error on this line
# this returns a (3,10000) matrix
hidden_gradient = np.matmul(hidden_delta.T, output[0])```
What is the shape of hidden_delta and output[0] and what is the error?
I have an idea already.
(3,) (1, 10000)```
ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 1 is different from 3)```
So before you had (3, 1) i'm guessing.
yeah so I guess I'll just reshape hidden_delta
But now that outputs are (n,) and not (n, 1) your deltas are also like that.
but inside
Yeah then transpose, transpose on (n,) does nothing.
(Annoying detail of numpy is this shape stuff)
(But you get used to it)
Just keep in mind what is a vector and what is a matrix.
(n,) vs (n, 1)
Two different things in numpy.
does tensorflow do it differently?
IDR, probably not.
ok, fixed all those things, but the predictions are still going in one direction
i'ma go eat dinner, thanks so much for your help so far
(n,) basically does not assume you want your vectors to be column or row vectors by default.
dot and matmul will both try to auto match a (n,) array as a row or a column, but once an operand is a real matrix like (1, n) or (n, 1), its dimensions have to line up exactly.
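A few quick checks of the (n,) versus (n, 1) behaviour, on small made-up arrays.

```python
import numpy as np

v = np.arange(3.0)            # shape (3,): neither a row nor a column
col = v.reshape(3, 1)         # shape (3, 1): a real column matrix
row = np.ones((1, 4))         # shape (1, 4): a real row matrix

print(v.T.shape)              # transposing a 1-D array changes nothing
print((col @ row).shape)      # (3, 1) @ (1, 4) -> (3, 4)

try:
    v @ row                   # v is treated as (1, 3), which does not line up with (1, 4)
except ValueError as err:
    print("ValueError:", err)

M = np.ones((2, 3))
print((M @ v).shape)          # (2, 3) with (3,) works and gives (2,)
```

So a (n,) array is fine as the vector in a matrix-vector product, but an outer product needs an explicit reshape to (n, 1) or a call to `np.outer`.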
Quick question on linear regression models. If our target data is skewed, do we want to transform the data or leave it alone?
Short answer: Yes, transform the data.*
Long answer: https://anshikaaxena.medium.com/how-skewed-data-can-skrew-your-linear-regression-model-accuracy-and-transfromation-can-help-62c6d3fe4c53
short and accurate answer: linear regression often uses the least squares solution, but this is only optimal for normally distributed residuals
the estimator is fine, but a different solution approach would be needed
Yes, heteroskedasticity.
this is np.log as well
Yes, but taking the log isn't a magical cure-all. It depends on the data you're working with and the relationships between X and Y.
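A sketch of one case where the log does help, on synthetic data built so that log(y) is linear in x with Gaussian noise. That data-generating process is an assumption of this example; real data may not behave this way, which is the caveat above. The `skewness` helper is a hand-rolled moment estimate, not a library function.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 5, size=500)
# Assumed data-generating process: log(y) is linear in x with Gaussian noise
y = np.exp(0.5 + 0.8 * x + rng.normal(scale=0.3, size=x.size))

def skewness(r):
    r = r - r.mean()
    return np.mean(r ** 3) / np.mean(r ** 2) ** 1.5

# Least squares on the raw target
slope_raw, intercept_raw = np.polyfit(x, y, 1)
resid_raw = y - (slope_raw * x + intercept_raw)

# Least squares on the log of the target
slope_log, intercept_log = np.polyfit(x, np.log(y), 1)
resid_log = np.log(y) - (slope_log * x + intercept_log)

print(skewness(resid_raw), skewness(resid_log))
print(slope_log, intercept_log)
```

Comparing the residual skewness before and after the transform is a cheap way to check whether the log actually moved the residuals closer to symmetric for a given dataset.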
https://github.com/microsoft/gather
this looks pretty slick, anyone tried it before?
I wanted to ask regarding the optimization on linear regression
how do you write m formula within python?
i tried:
for i in range(len(xdat)):
m1 = (np.sum(xdat[i]-xbar)*ydat[i])
m2 = (np.sum((xdat[i]-xbar)**2))
m = m1/m2 ```
but this formula does not give correct value
however this works:
x = []
xi = []
x.append((xdat[i]-xbar)*ydat[i])
xi.append((xdat[i]-xbar)**2)
m = np.sum(x)/(np.sum(xi))
the x dat, y dat represents the array of x and y in a point:
correct ans
Hello, I have a question. There is a pandas dataframe:
Index,Number,datetime,counter
2488 196 2022-12-06 08:02:00 14496
2489 186 2022-12-06 09:05:00 15551
2490 138 2022-12-06 10:29:00 5448
2491 140 2022-12-06 10:30:00 4749
2492 140 2022-12-06 10:31:00 4749
I need to create a dataframe with the newest counter for each Number
I know how to do that with sql tools, I would like to use pandas. Can you help me?
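One way this could look in pandas, sketched on the five rows shown above and assuming "newest" means the latest `datetime` per `Number`.

```python
import pandas as pd

df = pd.DataFrame({
    'Number': [196, 186, 138, 140, 140],
    'datetime': pd.to_datetime([
        '2022-12-06 08:02:00', '2022-12-06 09:05:00', '2022-12-06 10:29:00',
        '2022-12-06 10:30:00', '2022-12-06 10:31:00',
    ]),
    'counter': [14496, 15551, 5448, 4749, 4749],
}, index=[2488, 2489, 2490, 2491, 2492])

# Sort by time, then keep the last (newest) row of each Number
newest = df.sort_values('datetime').groupby('Number').tail(1)
print(newest)
```

This is roughly the pandas counterpart of a "latest row per group" query in SQL; `drop_duplicates('Number', keep='last')` after the sort would be another option.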
wrong method ans:
the only difference i see is that you didn't write the loop in this second one, idk if you did that just to save time when posting here. otherwise the two expressions are the same.
ah nvm, i got it. you mixed some stuff up
Oops yeah both should have the loop
you do np.sum, but it's not needed because what's inside the sum is just a scalar. you're not summing up at all in the first one
you can remove the sum and instead use +=
you're just overwriting m1 and m2 at every iteration, ignoring the older results
that'd be the difference
also note that there is no need to loop if you use numpy
Thank you
so i'd do
import numpy as np
#define vectors x, y and scalars x_bar, y_bar
m = np.sum((x - x_bar)*y)/np.sum((x - x_bar)**2)
c = y_bar - m*x_bar
you can speed that up a little if you use dot products instead, too
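A sketch of the dot-product version on made-up points, cross-checked against `np.polyfit`. The data values are invented for illustration.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])   # made-up points, roughly y = 2x

x_bar, y_bar = x.mean(), y.mean()
dx = x - x_bar

# sum(dx * y) and sum(dx ** 2) written as dot products
m = np.dot(dx, y) / np.dot(dx, dx)
c = y_bar - m * x_bar

# Cross-check against NumPy's own least-squares line fit
m_ref, c_ref = np.polyfit(x, y, 1)
print(m, c)
```

Any speed gain from the dot products is only likely to matter for large arrays; for small inputs the two forms are interchangeable.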
Guys, in GANs, if my Generator has around 1.000.000 trainable parameters, and my Discriminator has 50.000 trainable parameters, then does this means that my Generator tends to get more optimized than my Discriminator, thus leading to lack of convergence? Or it doesn't matter at all, since the discriminator coordinates the optimization?
hi there, I am martin and looking for someone to work with me on a python app which is used for data visualization. I have created the basic app and it works fine on android but I am kinda weak with matplotlib. If someone is good with matplotlib, they should join me
Hi martin i am data scientist and working at tableau right now
Are you sure you have created an app which works on android without an issue? Even we at salesforce (tableau) can't do much on android. Our basic graph lacks proper screen size
What do you use for visualization and what type of visualization do you plan to do? Matplotlib is nice, but it is not an ultimate package. There are many visually beautiful and easy-to-use frameworks, like seaborn or plotly-dash, that allow dynamic charts.
plotly will not work on android and i have added all the plots of seaborn
yes i know the problem the thing is i have created my own backend for that with most concepts taken from matplotlib
guys I have a task where I need to predict the year of something happening given the scenario. What kind of ml models can be put to use in this kind of scenario?
ping me on reply
Could you be more specific on the dataset? Do you have a time-series task or tabular data?
ok, so basically i'm working on dataset which has court cases
the attributes are yearoffiling, judge position, region where this case was taking place, year of decision, etc... so i need to predict the year when the decision will be taken given above attributes
well i dont think this is a time series
Linear regression can definitely not be used...
knn not possible as need to predict future dates
time series model such as xgboost is also not possible.... cos i dont find the problem fits into this category
how do i go about with this?🤔
@frozen geyser anything you would suggest?
HOW CAN I FIX TENSORFLOW ERROR
Call arguments received by layer "conv2d_3" (type Conv2D):
• inputs=tf.Tensor(shape=(None, 2, 2, 64), dtype=float32)
code: ```py
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10))
```
error is on the last model.add(layers.Conv2D(64, (3, 3), activation='relu'))
python 3.10.7 using tensorflow-cpu==2.10.0
show the error so we can see
this
show the whole error
Traceback (most recent call last):
File "d:/real_Python/projects/test/test/main.py", line 127, in <module>
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
File "D:\real_Python\projects\test\lib\site-packages\tensorflow\python\trackable\base.py", line 205, in _method_wrapper
result = method(self, *args, **kwargs)
File "D:\real_Python\projects\test\lib\site-packages\keras\utils\traceback_utils.py", line 70, in error_handler
raise e.with_traceback(filtered_tb) from None
File "D:\real_Python\projects\test\lib\site-packages\tensorflow\python\framework\ops.py", line 1969, in _create_c_op
raise ValueError(e.message)
ValueError: Exception encountered when calling layer "conv2d_3" (type Conv2D).
Negative dimension size caused by subtracting 3 from 2 for '{{node conv2d_3/Conv2D}} = Conv2D[T=, 1], explicit_paddings=[], padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true](Placeholder, conv2d_3/Conv2D/ReadVariableOp)' with input shapes: [?,2,2,64], [3,3,64,64].
Call arguments received by layer "conv2d_3" (type Conv2D):
• inputs=tf.Tensor(shape=(None, 2, 2, 64), dtype=float32)
aha
you have too many layers of convolution and maxpooling
the size of the intermediate values is too small to go through a conv layer
would this work? ```py
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
```
instead
just take away the two last model.adds
new error though
tensorflow's default behavior is to do no padding (padding="valid"), meaning each convolution shrinks the height and width by kernel_size - 1, i.e. by 2 for a 3x3 kernel
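To see where the (None, 2, 2, 64) in the error comes from, here is a rough shape trace in plain Python. It assumes stride-1 convolutions without padding and 2x2 pooling with stride 2, which matches the defaults used in the model above; the helper names `conv_valid` and `max_pool` are made up for the sketch.

```py
def conv_valid(size, kernel):
    """Output size of a stride-1 convolution with no padding ("valid")."""
    return size - kernel + 1

def max_pool(size, pool):
    """Output size of non-overlapping max pooling (stride equal to pool size)."""
    return size // pool

size = 32  # input height/width from input_shape=(32, 32, 3)
trace = [size]
for layer in ["conv", "pool", "conv", "pool", "conv", "pool"]:
    size = conv_valid(size, 3) if layer == "conv" else max_pool(size, 2)
    trace.append(size)

print(trace)  # [32, 30, 15, 13, 6, 4, 2]
# A fourth 3x3 convolution would have 2 - 3 + 1 = 0 output positions,
# which is why adding it fails.
print(conv_valid(size, 3))  # 0
```

Dropping the last pooling/convolution pair, or using padding="same" on the convolutions, are the usual ways to keep the feature map large enough.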
@wooden sail the code is: ```py
model.compile(optimizer='adam',
              loss=losses.sparse_categorical_crossentropy(from_logits=True),
              metrics=['accuracy'])
```
the new error is:
Traceback (most recent call last):
File "d:/real_Python/projects/test/test/main.py", line 132, in <module>
loss=losses.sparse_categorical_crossentropy(from_logits=True),
File "D:\real_Python\projects\test\lib\site-packages\tensorflow\python\util\traceback_utils.py", line 153, in error_handler
raise e.with_traceback(filtered_tb) from None
File "D:\real_Python\projects\test\lib\site-packages\tensorflow\python\util\dispatch.py", line 1170, in op_dispatch_handler
result = api_dispatcher.Dispatch(args, kwargs)
TypeError: Missing required positional argument
what could i have missed??
thats what it says
unless its lying
anyone have any idea on this?
@wooden sail or @floral hollow any thoughts on this?
Ok. So that is a tabular data - regressive model.
Prepare the data:
- Do some EDA (exploratory data analysis) with statistics, boxplots, categorization, etc.. to understand what is important and what is not.
- Perform Data Cleaning (Fill the missing data, eliminate outliers, clean\unite the categories, etc.)
- Make a data preparation for the model (labeling/encoding/embedding/binning/PCA analysis)
The choice of the model depends on the size of your dataset.
Based on the data size estimate the model. Try several models, starting with Linear/Tree-based (LGB)/XGB/Catboost and then shallow NNs. If nothing works, try more exotic (FM/FFM/DNNs) and/or Ensembles of these models.
My humble guess - One of the Random Forest models will give you a good result, if your dataset is not very large.
But how can I make the model predict years (integers)? When I use regression models the results are horrible, nowhere close to the expected values, and they also include float numbers
And thanks for such an elusive answer, the way you've put your thoughts into words gives a good headstart
Also this is not a classification task, so I felt tree based models can't be used(I'm a beginner, so I'm open for any disagreements and looking for it)
I see no problem with float numbers being rounded to integers. 🙂
Most of the bad results are usually coming from the bad datasets. garbage-in -> garbage-out. This is why we need to make lots of efforts to clean the data.
tree-based models are good for regression tasks as well.
If you have some time, download Orange-Canvas software (you have it in Anaconda as well) and play with your dataset. It has very visual approach for modeling and visualization of results. Very useful as a first step. 🙂
yes i thought initially about the rounding part.. but the problem is that when it has to predict 2018 or something of this sort it gives 3319.222, so rounding doesn't help either
garbage-in -> garbage-out
wow loved this
I'm done with eda and preprocessing
the model building is something i was struggling
I'll try looking at orange software (i do have anaconda)
Thanks a lot.
Loved the way you are conveying things
🙂
When you are free let me know how did you get into the field of ML(looking for your first steps and resources that helped you in this journey)
(No hurries)
i am dealing with a problem statement that requires me to convert speech (with lot of disturbances) to text
what algorithms can i start with?
How can I get only key values from a dictionary?
Hello
I am trying to use open cv to open an image
import cv2
import pytesseract
pytesseract.pytesseract.tesseract_cmd = "C:\\Program Files\\Tesseract-OCR\\tesseract.exe"
img = cv2.imread # from the tutorial I am watching , the guy is using imread , but in my case that function does not exist
dictionary.keys()
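A tiny example of that, with a made-up dictionary (`scores` is just an illustration):

```py
# Hypothetical example dictionary, just for illustration.
scores = {"alice": 90, "bob": 72, "carol": 85}

keys_view = scores.keys()   # a live view object, not a copy
keys_list = list(scores)    # a plain list of the keys
print(keys_list)  # ['alice', 'bob', 'carol']
```

The view returned by `.keys()` reflects later changes to the dictionary, while the list is a snapshot.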
im newb sry
Hey, what should a dataset for a Q&A chatbot look like? I'm using BERT and I have these columns so far in a csv file: context, questions, answers, starting_point. Have I missed something?
well, the OpenAI GPT-3 chatbot was trained on the davinci corpus, both text and code
for the base model, they used code-davinci-002 and then extended it with text-davinci-002 and text-davinci-002
woops, text-davinci-003
woops, I was wrong there, those are the model names
they used the commoncrawl corpus. it can be found here => https://commoncrawl.org/
I'm pretty sure something is wrong with my backpropagation code
```py
def backpropagate(self, output, y):
    # this outputs a scalar
    output_delta = (output[2] - y) * output[2] * (1-output[2])
    # multiply the delta times the input to produce a 3-vector
    output_gradient = np.matmul(
        np.reshape(output_delta, (1,1)),
        np.reshape(output[1], (1,3)))
    # this outputs a 3-vector
    hidden_delta = np.matmul(output_delta, self.output_layer.weights) * output[1] * (1-output[1])
    # reshape the input from (10000,) to (1,10000)
    output[0] = np.reshape(output[0], (1,10000))
    # this returns a (3,10000) matrix
    hidden_gradient = np.matmul(np.reshape(hidden_delta, (1,3)).T, output[0])
    # update the output bias
    self.output_layer.biases -= output_delta * self.lr
    # update the output weights
    self.output_layer.weights -= output_gradient * self.lr
    # update the hidden layer biases
    self.hidden_layer.biases -= hidden_delta * self.lr
    # update the hidden layer weights
    self.hidden_layer.weights -= hidden_gradient * self.lr
```
when I train on two image classes, the predictions move towards one class on every single iteration instead of moving towards the class it's getting trained on
so the loss goes up for one class and down for the other
What is image_class for the first 20? Can you print them?
I set it so it alternates 0,1,0,1
with learning rate at .05 it goes like this
```
['0 : 0.49880739489615755', '1 : 0.9286612738113579']
['0 : 0.49685552291208046', '1 : 0.5031616959044104']
['0 : 0.49685551278683343', '1 : 0.5031444873024917']
['0 : 0.49685551278678103', '1 : 0.5031444872132199']
```
where the number on the left is the class and the number on the right is the prediction
as you can see they're both going lower, so even when it's training on image class 1 it adjusts towards 0
What is the forward for a layer?
what do you mean? the output of the forward pass?
The code for a forward pass on a layer.
```py
def forward(self,x):
    return sigmoid(np.dot(self.weights, x) + self.biases)
```
this is nn.forward()
```py
def forward(self,x):
    output_vectors = []
    for layer in self.layers:
        # if the first layer
        if not output_vectors:
            # pass it the input vector
            output = layer.forward(x)
        else:
            # otherwise pass it the previous layer's output
            output = layer.forward(output_vectors[-1])
        output_vectors.append(output)
    return output_vectors
```
output_gradient = output_delta * output[1]
yeah, that multiplies the output of the hidden layer times the output_delta
should that be matmul?
yeah
ok, changed it to this
```py
output_gradient = np.matmul(
    np.reshape(output_delta, (1,1)),
    np.reshape(output[1], (1,3)))
```
still the same issue though
Can you simplify this? ```py
# reshape the input from (10000,) to (1,10000)
output[0] = np.reshape(output[0], (1,10000))
# this returns a (3,10000) matrix
hidden_gradient = np.matmul(np.reshape(hidden_delta, (1,3)).T, output[0])
```
I could make it a one liner
if that's what you mean
```py
def backpropagate(self, output, y):
    # this outputs a scalar
    output_delta = (output[2] - y) * output[2] * (1-output[2])
    # multiply the delta times the input to produce a 3-vector
    output_gradient = np.matmul(
        np.reshape(output_delta, (1,1)),
        np.reshape(output[1], (1,3)))
    # this outputs a 3-vector
    hidden_delta = np.matmul(output_delta, self.output_layer.weights) * output[1] * (1-output[1])
    # this returns a (3,10000) matrix
    hidden_gradient = np.matmul(
        np.reshape(hidden_delta, (3,1)),
        np.reshape(output[0], (1,10000)))
    # update the output bias
    self.output_layer.biases -= output_delta * self.lr
    # update the output weights
    self.output_layer.weights -= output_gradient * self.lr
    # update the hidden layer biases
    self.hidden_layer.biases -= hidden_delta * self.lr
    # update the hidden layer weights
    self.hidden_layer.weights -= hidden_gradient * self.lr
```
Try a more simple task first, not images.
like what
2 inputs, 1 output
Yes.
so then the hidden layer neurons would have 2 weights
Yes, but you should not have to hard code those values.
yeah i didn't
Just how many neurons per layer.
What is hidden_delta's shape? What is the biases shape?
If I were to make a bot that is designed to answer questions from a quiz on chrome, what packages should I use?, new to ai development
I changed it to the xor problem and it does the same thing:
```
input:ground_truth:prediction
before training
[0, 0] : 0 : 0.884287006863036
[0, 1] : 1 : 0.8970981298788248
[1, 0] : 1 : 0.9055810664805809
[1, 1] : 0 : 0.9148745573457218
after training
[0, 0] : 0 : 0.882638663530517
[0, 1] : 1 : 0.8955110093882327
[1, 0] : 1 : 0.9040762320307318
[1, 1] : 0 : 0.913433964744165
```
all the predictions just got lower
this is making me wonder if I've got like a sign error
Try y - output and += instead of -=.
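One way to test for a sign or shape error without guessing is a finite-difference gradient check. The sketch below is not the network from the chat: it is a stand-alone 2-3-1 sigmoid network with an assumed loss of 0.5 * (output - y)^2, and all function names are made up. It compares the backprop formulas (the same delta expressions as above) against numerically estimated derivatives.

```py
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(params, x):
    """Return hidden and output activations of a 2-layer sigmoid network."""
    W1, b1, W2, b2 = params
    h = sigmoid(W1 @ x + b1)      # hidden activations, shape (3,)
    out = sigmoid(W2 @ h + b2)    # output activation, shape (1,)
    return h, out

def loss(params, x, y):
    _, out = forward(params, x)
    return 0.5 * float(np.sum((out - y) ** 2))

def analytic_grads(params, x, y):
    """Backpropagation for the squared-error loss above."""
    W1, b1, W2, b2 = params
    h, out = forward(params, x)
    output_delta = (out - y) * out * (1 - out)          # shape (1,)
    hidden_delta = (W2.T @ output_delta) * h * (1 - h)  # shape (3,)
    return [
        np.outer(hidden_delta, x),  # dL/dW1, shape (3, 2)
        hidden_delta,               # dL/db1
        np.outer(output_delta, h),  # dL/dW2, shape (1, 3)
        output_delta,               # dL/db2
    ]

def numeric_grads(params, x, y, eps=1e-6):
    """Central finite differences, one parameter entry at a time."""
    grads = []
    for p in params:
        g = np.zeros_like(p)
        for idx in np.ndindex(p.shape):
            old = p[idx]
            p[idx] = old + eps
            up = loss(params, x, y)
            p[idx] = old - eps
            down = loss(params, x, y)
            p[idx] = old
            g[idx] = (up - down) / (2 * eps)
        grads.append(g)
    return grads

rng = np.random.default_rng(0)
params = [rng.normal(size=(3, 2)), rng.normal(size=3),
          rng.normal(size=(1, 3)), rng.normal(size=1)]
x, y = np.array([0.0, 1.0]), np.array([1.0])

max_err = max(
    np.max(np.abs(a - n))
    for a, n in zip(analytic_grads(params, x, y), numeric_grads(params, x, y))
)
print(max_err)  # should be tiny if the backprop formulas are right
```

With the delta defined as (output - y) * ..., the matching update is `weights -= lr * gradient`; flipping to y - output flips the update to `+=`, so the two conventions are equivalent as long as they are not mixed.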
Hey guys, when extracting features from an image, I can use a neural network with an architecture optimized for feature extraction(VGG19, UNet) or I can use PCA, right? There's no "right or wrong". The difference is just that one option requires training and optimization through time, while the other doesn't?
Also, is there a parameter I can use to determine how much I should reduce the dimensions of my image in a feature extracture network? I know that VGG19 and UNet encoder tend to reduce dimensionality until it gets feature maps with shapes like 8x8x512, but why not stop at 16x16x256? Or at 4x4x1024? Or even 1x1x4048?
Well pca and a feature extraction NN both reduce dimensions, but the way they do it is really different
I think for images, you would really want to use features extracted by such a NN, because they are made for the purpose of finding "useful" features in images
And those feature extraction NNs are normally made by having a encoder-decoder like architecture, or a NN that has some task, and then chopping the last few layers off
The output size that you need is hard to say, and will most likely have to be found with some trial-and-error
Depending on your ANN, it may be able to grow more neurons as needed dynamically. If not, it's a parameter that can be chosen via trial and error or some more complicated way. The correct choice is "big enough but not too big that no significant reduction in number of dimensions is happening."
I see... And a feature extraction might not be simply dimensionality reduction? Using a decoder might also help it?
Depends on data / task.
Not really sure what you mean here
those feature extraction NNs are normally made by having a encoder-decoder like architecture, or a NN that has some task, and then chopping the last few layers off
UNet has a encoder and a decoder, the last few layers are linear layers, if I remember correctly.
But VGG19 is just "encoder" + linear
Unet only has convolutional layers iirc
In the encoder part. The decoder part upsamples
The upsampling is also convolutional though no?
I think it has some conv layers which reduces the channels size and keep height and width, but they're followed by upsampling
I think the original one was conv + upsampling. Nowadays people tend to use transposed conv I guess
Well anyways, vgg19 has convolutional and then fully connected, you can choose where you want to cut off
Ok, thanks!
I think I'll stick with the smallest dimension as possible and then try using higher dimensions for the features extracted...but not big enough to kill my GPU
It also depends on what you plan to do with the features
For now I'm planning to use for multi-label classification.
But now I'm curious... what if I want to use them for simple classification? How would things work in both cases?
Doesn't change too much, in both cases you would basically make a network that takes the features and it ends up with some values for the output nodes
The difference is the activation function for the final layer
multi-label would be sigmoid, and multi-class (so it's 1 of the classes always) would be softmax
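A quick numeric illustration of that difference, with made-up logits for three output nodes:

```py
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return e / e.sum()

logits = np.array([2.0, 1.0, -1.0])  # made-up final-layer outputs

multi_label = sigmoid(logits)   # each entry is an independent probability
multi_class = softmax(logits)   # entries compete and sum to 1

print(multi_label, multi_label.sum())
print(multi_class, multi_class.sum())
```

With sigmoid, several labels can be above 0.5 at once; with softmax, raising one class necessarily lowers the others.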
(Oh yes, I get confused with that difference)
But what you are trying to do is basically transfer learning, using a trained model for a similar problem
Often you can use the original model (like vgg19) and then take off the final layer and stick a final layer on with the correct amount of nodes and activation function
And probably freeze the weights for the convolutional layers and maybe some earlier fully connected
And then train on your data
So you only really train some fully connected layers, and it uses feature extraction from the pre-trained weights
Nah, I actually want to train my extractor from scratch. I'll just use a similar architecture because...well...because probably the folks tried to extract the crème de la crème using...like...1000 Tesla GPUs...
Besides, I won't be classifying real world images like the ImageNet photos VGG was trained on; I'll classify game screenshots.
Well vgg probably recognizes very basic shapes and patterns too like roundness, and cornerness and w/e so you could still use some pre-trained layers
but vgg19 is p-retty big, so maybe just doing it from scratch would be easier 😛
yeah same thing with that
wait
it did this after 2000 epochs
```
[0, 0] : 0 : 0.07133328991870373
[0, 1] : 1 : 0.9867524208847188
[1, 0] : 1 : 0.04641128083308918
[1, 1] : 0 : 0.05997767236743279
```
but that's still not solving the xor problem
the two zero predictions are close to zero, and one of the 1 predictions is close to 1
but the other 1 prediction is closer to zero than the zero prediction
I have this list of dictionaries I'm using for a bot with a neural network, and I've been trying to do two things: make it so I can exclude certain dictionaries, and take those excluded dictionaries and get a different result with them, as shown above. But I'm really stumped and could use some help
Hello guys I have a question: when we have 32 neurons like this, whether it means we have 32 hidden states in the RNN model?
has anyone worked with a custom image classifier? for a live video?
cv2.imread(filepath)
Hi, I'm trying to give a color to whole row if column value exist in a list, but it is not working, below is the code I'm using:
d = {
    'ids': ['id1', 'id2', 'id3', 'id4'],
    'values': ['Vanilla', 'Chocolate', 'Butterscotch', 'Strawberry']
}
dd = pd.DataFrame(d)
def highlight(row):
    if row.ids in ['id1', 'id3']:
        return ['background-color: red']
    else:
        return ''
dd.style.apply(highlight, axis=1)
dd
Can someone help?
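A possible fix, as a sketch: with `axis=1`, the function passed to `Styler.apply` receives one row and has to return one CSS string per column, so returning a single-element list or a bare `''` does not match the row's shape. The version below returns a list of the right length for every row.

```py
import pandas as pd

dd = pd.DataFrame({
    "ids": ["id1", "id2", "id3", "id4"],
    "values": ["Vanilla", "Chocolate", "Butterscotch", "Strawberry"],
})

def highlight(row):
    # With axis=1 the function gets one row and must return one CSS
    # string per column, so repeat the style len(row) times.
    css = "background-color: red" if row["ids"] in ["id1", "id3"] else ""
    return [css] * len(row)

print(highlight(dd.iloc[0]))  # ['background-color: red', 'background-color: red']
print(highlight(dd.iloc[1]))  # ['', '']

# In a notebook, display the Styler itself rather than `dd`, e.g. make
# `dd.style.apply(highlight, axis=1)` the last expression of the cell.
```

The other likely reason nothing showed up is the final `dd` line: `dd.style.apply(...)` returns a new Styler object and leaves `dd` unstyled, so the Styler is what needs to be displayed.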
Hey everyone, I am new to Huggingface and I'm trying to Fine tune roberta-base on go_emotions dataset for Multi Label classification.
My final train dataset is a dictionary where the labels are one hot encoded.
I am using the Trainer class of Huggingface to train my model and when I run the trainer class an error pops up saying
"Value error: Classification metrics can't handle a mix of multilabel-indicator and binary targets"
I have tried decoding it by changing the shapes in the metrics function but nothing works.
Can someone please help me out with this?
Hey Python community, I hope this is ok to post here!
I’m currently building an academy for AI products. People can join and learn to build and launch a product based on AI.
You’ve probably seen the rush of AI products building on AI models like GPT3 and Stable Diffusion.
Am working with some friends all with backgrounds in machine learning / software engineering. We’re really excited about generative AI. We built a toy proof of concept here (https://use-persona.com/) that emulates reddit accounts.
Our experience here made us think it’s going to fundamentally change how we interact with computers.
We want to make sure as many people benefit from AI as possible. To do that, we’re making it as easy as possible for people to learn how to build and deploy products using these models. This will let you use AI to make a side income or solve a work problem.
The problem we are solving is: You have an idea for an AI-based solution to a problem, but you don’t know how to build it. We solve it by teaching you and dealing with the boring stuff (website, payments etc)
Here’s how we see it working:
- The academy will work as a live online 4-week course. This will be part-time (~10 hours per week)
- You’ll meet others on the program and share ideas
- Most of the learning is done by tackling tasks specific to your idea. We’ll have weekly deadlines to keep you on track.
- There will be 1-to-1s with us and live sessions each week
- By the end, you’ll have built a product that solves a problem using AI. We can handle hosting and infrastructure for you - but it’s your product and you can move it elsewhere if you choose.
The way we’ll make money is primarily through a 20% revenue share of the products that come out. We’d ask for an upfront deposit to make sure we don’t get our time wasted.
We’re accepting applications: https://pages.viral-loops.com/ai-product-academy
Otherwise, we’d love your feedback!
Hey! Just had a question regarding feature selection and data cleaning. Is selection generally done after cleaning or before?
Mainly want to confirm if my choice here is correct:
I have two feature columns that I do not want to select for training, and these are the only 2 columns that contain null values. I handle null values via removing the data row as a whole. I am thinking, wouldnt it just be better to remove the columns before checking for null values so that I can have more training examples/rows?
It depends. If you know you don't want to use a feature at all, then you don't need to clean it. But even if you don't use it directly, you might use it to drive other features, or something.
Alright thanks! In that case I'll remove them before cleaning (can retain 500 training examples now)!
Hey i have a very messy CSV file which i imported. I need help with formatting it so the data matches the right headers. Im a beginner at python and ive been trying to solve this for a few hours. Can someone please help me?
can anyone help me with machine learning? i need to classify images according to their sentiments. i have the sentiments ready for each image and i also have the text separated from each image and stored in a csv. now i have to train on the images using ML models (from sklearn) but i don't know where to begin
Hey there I have a question on scaling some values, here's the issue:
I have a sparse matrix of tags for games on steam. I wanted to scale the values as frequencies for each game, so for instance CSGO might have 90 tags for Shooter and only 10 tags for Cooperative, and I would want .9 for Shooter and .1 for Cooperative (something like that). However I also want to ensure that these weights consider the total amount of votes each game has, so that Shooter X with 9 tags for Shooter and 1 tag for Cooperative isn't considered just as popular as CSGO....
this might be as easy as not doing anything to my matrix but I was curious on yall's thoughts
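One possible compromise, sketched on a tiny dense toy matrix (the real data is sparse, and the log damping is an assumption for illustration, not something settled in this chat): normalise each row to frequencies, then multiply by a slowly growing function of the row's total votes. It also assumes every game has at least one vote, otherwise the division needs a guard.

```py
import numpy as np

# Rows are games, columns are tags (Shooter, Cooperative); counts are made up.
counts = np.array([[90.0, 10.0],   # a heavily tagged game
                   [9.0, 1.0]])    # same proportions, ten times fewer votes

totals = counts.sum(axis=1, keepdims=True)
freqs = counts / totals                  # per-game tag frequencies, rows sum to 1
weighted = freqs * np.log1p(totals)      # damped popularity weight per game

print(freqs)
print(weighted)
```

Both games get identical frequencies, but the weighted version still ranks the heavily tagged one higher without letting raw counts dominate. Leaving the raw counts untouched is the other extreme of the same trade-off.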
@sturdy parrot let's chat here where it's less busy
Thank you for the time, yes.
As for experience, C, python and decent bit of js. I am quite comfortable and confident with python
If you've been writing C + Python for 2-ish years, I'd highly recommend working through the first few chapters of Aurélien Géron's Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow
It is a great book, and the first chapter walks you through a project, from start to finish, where you work on housing data from California to try and find houses that are of good value
Thank you for the recommendations!
If you prefer courses, I hear the fast.ai courses are very good, but they put a large emphasis on deep learning, which while it is very popular, only gets used in a handful of places
IMO the best way to show that you are competent (aside from job experience) in AI/ML is building your own projects that solve problems you've found in everyday life
I am not actively looking for jobs, I am more of a hobbyist.
I see. If that's the case, then you should definitely feel more free to explore what you like 😄
I have decent chunk of projects but unfortunately they are incomplete due to irl work / studies 😆
Are you more interested in just learning how things work, or building stuff from libraries that already exist?
Of course most people have some interest in both, but I can suggest content based on whether you prefer building vs. learning
I would say both :P
Currently I am more into chat bots, although that seems a bit main stream now.
It is quite mainstream now, haha. But there's nothing wrong with that!
AI/ML is building your own projects that solve problems you've found in everyday life
I have built projects (again, not complete) that are concerned with parsing. Recently I had projects where python code is taken and standard operations like `+`, `-`, `*` are converted to `add`, `sub`, `mul` from the `operator` library, and others where `overload` signatures from typing are automatically inferred for kwargs from existing ones.
(actually I am in my alt account disguise 🕵️ , you can see my gh here https://github.com/Achxy if you are interested in my projects)
If you're specifically interested in chat bots, I'd highly recommend you look into:
- spaCy, it's a Python library you can use to build software tools that process lots of language. I've used it professionally and it's wonderful. Here are some tutorials so you can get an idea of the sort of tasks that are commonly done with spaCy: https://github.com/explosion/projects/tree/v3/tutorials
- The hot talk of the town powering all of these hyped language models are Transformers. Personally I think PyTorch is the best deep learning library for people who want to practice and learn about deep learning at the moment. Their tutorial collection is also very, very good: https://pytorch.org/tutorials/
These libraries are just tools. The point of me suggesting them is that you can learn about the basic concepts while simultaneously learning how to use these tools. Maybe you'll like other tools more, but regardless of whether you're using these tools or other tools, the same concepts will apply
Thank you very much for the time and willingness to help!!
I will be looking to the pytorch tutorials that you sent as I could look into more into general ai things before getting into chatbots in specific.
is there any site yall would recommend to train an image recognition model? i have the code but need a model would be highly appreciated. thanks.
For sure! Always happy to help!
For what it's worth, I had actually really good time learning how to work with text by writing search engines for text documents. I learned most of my NLP knowledge by trying to write a search engine for academic papers. You start off with basic concepts, and you can keep improving the search as you learn more about information retrieval.
If it needs to run on CPU, OpenCV + OpenVINO isn't a bad choice, otherwise I'd go with https://github.com/ultralytics/yolov5
the thing is i have no experience of working with image recognition can u get me started a lil on how to create a model?
What hardware do you have available?
cpu i have, gpu i don't sadly
Is this just for learning? Are you trying to train an image classifier on custom data?
I'll give a overview, basically i have 5 pokemons and 8-10 pics of them i wish to create a classifier that names them. later on I'll like to name all the 800 or so pokemon so thats why i asked if whats a good way to train a model
Use google colaboratory then
yea but how do i get started?
Hm... I don't remember a good site with ML courses for free... But maybe coursera might help you if you're a student
You are probably going to need a lot more than 8-10 pictures per class to get a deep learning model to perform well on pokemon.
Here is a dataset you can use: https://www.kaggle.com/datasets/kvpratama/pokemon-images-dataset
YOLOv5 comes with lots of documentation and tutorials that you can use to get started, they even have lots of content showing how you can use YOLOv5 in Colab: https://docs.ultralytics.com/
Oh yes, I think Kaggle has some tutorials on Machine Learning for free
nono i just need help as to how to speed up the process of training the ai, i was currently using teachable but it took a lot of time as i had to upload every image by hand
You can use a high learning rate and decrease it over time with a scheduler
I think the Colab free tier has GPUs available, but you have to be careful with time limits. You can definitely bulk upload the dataset to either GDrive or TFHub or any other service and just connect the Colab notebook to the dataset
A high learning rate will move your weights towards an optimal point faster. When the weights start oscillating, you can decrease the learning rate so you can get better weights until they oscillate again
Just be careful to not overfit your model. Try repeating this until you get a learning rate of 1e-5
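A rough sketch of that idea in plain Python: start high and cut the learning rate whenever the loss stops improving, down to a floor of 1e-5. The function name, the factor of 0.1, and the patience of 2 epochs are all assumptions for illustration; frameworks ship their own schedulers that do something similar.

```py
def reduce_on_plateau(losses, lr=1e-1, factor=0.1, patience=2, min_lr=1e-5):
    """Return the learning rate used at each epoch for a recorded loss curve.

    The rate is multiplied by `factor` whenever the loss has not improved
    for `patience` epochs in a row, and never drops below `min_lr`.
    """
    best = float("inf")
    stale = 0
    schedule = []
    for loss in losses:
        schedule.append(lr)
        if loss < best:
            best, stale = loss, 0
        else:
            stale += 1
            if stale >= patience:
                lr = max(lr * factor, min_lr)
                stale = 0
    return schedule

# Made-up loss curve: improves, then oscillates.
losses = [1.0, 0.6, 0.4, 0.45, 0.42, 0.3, 0.31, 0.33, 0.32]
print(reduce_on_plateau(losses))
```

In a real training loop the loss would be the validation loss measured after each epoch, which also helps with the overfitting concern.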
tysm yall
also can u tell why so many ml coders like to program in jupyter or colab?
Convenience
and what are the different block like things in there?
I'm actually personally not a huge fan of these interactive environments, but I have my own workstation with a GPU that I use for my hobbyist ML work
Usually, it's because that's what people who got into Data Science and ML used to learn. And it's also very beginner-friendly.
and what's the block things in those?
the blocks are "cells" which allow you to execute Python snippets one cell at a time. A typical python script is passed to the interpreter, translated into bytecode, and then executed in program order.
With these interactive notebooks, you can control the program order by executing different cells at your own discretion
@grave swallow you can learn more about them by reading through the tutorial here: https://colab.research.google.com/notebooks/intro.ipynb
It's called cell. So you can turn a cell into a code cell / markdown / Raw NBConvert
tysm
tysm
i am training YOLO and noticed that augmenting data lessened the accuracy
I need help here
Want boundaries of x and y value
anyone used ML on FFT-data so far and can give some recommendation regarding a good model?
Or maybe a tip how i would be able to create a regression plot for n-freq.
Accuracy should not be your only measurement. F1 score is a better test.
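A toy illustration of why accuracy alone can mislead, using made-up binary labels rather than detection results (for object detectors, per-class precision/recall or mAP are the more usual reports). The helper functions are written out by hand so the arithmetic is visible.

```py
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred, positive=1):
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Made-up imbalanced labels: 9 negatives, 1 positive; the model predicts all 0.
y_true = [0] * 9 + [1]
y_pred = [0] * 10

print(accuracy(y_true, y_pred))  # 0.9
print(f1_score(y_true, y_pred))  # 0.0
```

A model that never finds the positive class still scores 90% accuracy here, while F1 drops to zero.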
Has anyone used Python/ML to do a multivariate correlation (many-to-one) analysis? I need to find where the most significant correlations are to then identify the variables I will use for a customer scoring model ("Expansion Score" something like that). Just curious if anyone has used similar methods and would be open to discussing the pseudo/exo build needed
Hello! I'm trying to get info from the U.S. Census using their API, and Python is throwing an error that I don't quite understand. The relevant portion of the code is:
r = requests.get(base_url, params=predicates) # Making a response object; "predicates" is the format of the request. "base_url" is self-explanatory
df = pd.DataFrame(columns=r.json()[0], data=r.json()[1:]) # Putting it into a dataframe, and using the json file to label the data
df["year"] = year # Just so I know what year this is from, since that's not in the data itself
dfs.append(df)
counties = pd.concat(dfs)
print(counties.head()) # Just to make sure I'm getting the data that I expect
However, I get the following error:
```
Traceback (most recent call last):
File "/usr/lib/python3.11/site-packages/requests/models.py", line 971, in json
return complexjson.loads(self.text, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib64/python3.11/json/__init__.py", line 346, in loads
return _default_decoder.decode(s)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib64/python3.11/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib64/python3.11/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "Main.py", line 36, in <module>
df = pd.DataFrame(columns=r.json()[0], data=r.json()[1:])
^^^^^^^^
File "/usr/lib/python3.11/site-packages/requests/models.py", line 975, in json
raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)
requests.exceptions.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
```
What exactly is it trying to tell me about the 'r.json' bit?
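"Expecting value: line 1 column 1 (char 0)" means the response body was not JSON at all: parsing failed on the very first character. That typically happens when the server returns an error page or an empty body. With requests, looking at `r.status_code` and `r.text` before calling `r.json()` would show what actually came back. The sketch below reproduces the same failure with only the standard library; both response bodies are made up for illustration.

```python
import json


def try_parse_json(body_text):
    """Return (parsed, None) on success, or (None, message) if the body is not JSON."""
    try:
        return json.loads(body_text), None
    except json.JSONDecodeError as err:
        preview = body_text[:60]
        return None, f"{err.msg} at char {err.pos}; body starts with {preview!r}"


# Hypothetical bodies: a Census-style table and an HTML error page.
good_body = '[["NAME", "POP"], ["Some County", "12345"]]'
bad_body = "<html><body>error: unknown variable</body></html>"

good_rows, good_problem = try_parse_json(good_body)
print(good_rows[0], good_rows[1:])

bad_rows, bad_problem = try_parse_json(bad_body)
print(bad_problem)
```

Printing the start of the body usually makes the cause obvious, for example a message from the API saying the request was malformed.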
Hi all, I am trying to follow this tutorial:
https://thedatafrog.com/en/articles/show-data-google-map-python/
I'm getting stuck on export GOOGLE_API_KEYS=<blah blah>
It's stating that export is not recognized as a command
Can anybody advise what I'm doing wrong?
(update: the API request was structured wrong. That will do it)
does the backpropagation derivation change if you calculate loss differently?
yes
after all it's the derivative of the loss that you're computing 😛 naturally if you change the loss, its derivative changes
ok second question. What is the difference between a delta and a gradient
is this code wrong?
what happens when you give a smaller subset of series a bigger number of values
it doesn't give a traceback though
I will answer this for you, but this is the last time I will answer any question you ask that involves a screenshot that could have been text.
a delta usually means a finite difference. gradients involve infinitesimals
the problem with using .loc in this context is that vt.loc[vt ....'value'] potentially has fewer rows than everything to the right of the =
yep
(I'm not typing it all out, so if you don't know what I'm referring to, put the code in the chat as text.)
but otherwise the terms are synonymous?
that already means they are not synonymous 👀
Generally I saw that they threw an error
if such a scenario arises
But does it just take the first n rows of rhs?
because that would be really bad
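On the question of whether pandas silently takes the first n rows of the right-hand side: when the right-hand side is a Series, `.loc` assignment aligns on index labels rather than on position. A minimal sketch, using a made-up frame (the names `vt` and `value` follow the ones mentioned above, the data is hypothetical):

```python
import pandas as pd

# Hypothetical frame and right-hand side sharing the same index 0..3.
vt = pd.DataFrame({"value": [1.0, 2.0, 3.0, 4.0]})
rhs = pd.Series([10.0, 20.0, 30.0, 40.0])

mask = vt["value"] > 2       # selects only rows 2 and 3
vt.loc[mask, "value"] = rhs  # rhs is matched by index label, not by position

print(vt["value"].tolist())
```

Rows 2 and 3 receive the values that `rhs` holds at labels 2 and 3. A plain list or NumPy array has no index to align on, so a length mismatch there is typically reported as an error instead.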
so why do we call it gradient descent if it only moves the weights with finite quantities? shouldn't we call it delta descent?
it might be that per_population doesn't have any nans.
you're mixing up many things at the same time here
you use the direction of the gradient to compute a delta for the parameters
how large that delta is is a completely separate question
and whether the algorithm works at all depends on whether the gradient is bounded and the curvature satisfies nice conditions
one has to prove that taking discrete deltas lets you solve the problem of setting the gradient to 0, it's not something one takes for granted
because the two things are not the same
so it is never correct to say that an algorithm "calculates the gradient"
nope. It always has
it computes the gradient, and then uses the gradient to compute a delta
you're gonna have to pick up a math book if you're struggling with the concepts
I'm looking at this from my lecture slides
this calculates the derivative of the error with respect to the weights
yes
is this the gradient or just a delta?
neither, it's a partial derivative
you use partial derivatives to compute the components of the gradient vector
a single partial derivative is a scalar though
ohhhhh the gradient is a vector
so that's the difference
all of the deltas together as a vector make up the gradient
no
it's "a" difference, not "the" difference
a delta can be a vector too
the difference is exactly what i told you
a delta is a finite difference
derivatives, including gradients, are infinitesimal and involve limits
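The distinction can be sketched in a few lines. The toy loss, the starting point and the learning rate below are all arbitrary choices for illustration. The gradient is the exact vector of partial derivatives, the finite-difference quotient only approximates it, and the update that gradient descent applies is a finite delta built from the gradient.

```python
def loss(w):
    """Toy loss with its minimum at w = (3, -1)."""
    return (w[0] - 3.0) ** 2 + (w[1] + 1.0) ** 2


def gradient(w):
    """Exact gradient: one partial derivative per component of w."""
    return [2.0 * (w[0] - 3.0), 2.0 * (w[1] + 1.0)]


def finite_difference(f, w, h=1e-6):
    """Approximate each partial derivative with a small but finite step h."""
    approx = []
    for i in range(len(w)):
        shifted = list(w)
        shifted[i] += h
        approx.append((f(shifted) - f(w)) / h)
    return approx


exact = gradient([0.0, 0.0])
approx = finite_difference(loss, [0.0, 0.0])
print(exact, approx)

w = [0.0, 0.0]
learning_rate = 0.1  # arbitrary value for the illustration
for _ in range(50):
    grad = gradient(w)                          # derivative information at w
    delta = [-learning_rate * g for g in grad]  # finite step derived from it
    w = [wi + di for wi, di in zip(w, delta)]
print(w)
```

The size of `delta` is set by the learning rate, which is a separate choice from the gradient itself.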
It's a specific thing.
at this point i'd STRONGLY recommend you pick up a book, because you seem to be either struggling with the concepts or trying to interpret/make them up on your own
How things are named requires the specifics which is still locked away in books or at least something like wikipedia, but wikipedia kind of assumes that you can read and understand it, which is hard to do without enough math knowledge in general (e.g. from books).
(Also more of a reference)
*The reason why math has names for every specific thing is because they need the names to be specific in proofs. Which results in A LOT of different terminology.
that is, different terminologies for the same set of ideas?
Yes and no.
that does happen sometimes, depending on how you're interpreting the object
for example you might think of a row vector as just that, a row vector. or maybe as the transpose of a column vector. or maybe as a covector. or perhaps as a linear functional. or an element of the dual space
depending on what you're trying to do with it
(Also there are different spoken languages used, which also use different standard terms like identity vs neutral element)
(And also many things in math map to each other, so same thing, different POV (which is one of things that is really looked for in math))
I'm learning 
*Also, Ducky, I kind of skipped over the derivation for the gradient on purpose because it involves a lot of details that would have just been noise without more linear algebra and multivariate calculus knowledge.
(The actual full proper way has a lot of steps skipped by most posts you will find online / they are all hand-wavy / intuitive explanations)
(Many courses on DL don't do the full thing either)
multivar calc in most engineering degrees doesn't, either
ideally there'd be some discussion about differential forms and jacobians involved
Yeah, I suppose the reason why is because one can often get away with the informal way but it does leave one open to potential mistakes, which is not something I would want to teach for engineers making stuff like the bridges I drive on.
Though I think the idea of a two pass method would be pretty good, the informal way first, to get an overview, then repeat the whole thing again in detail. But for now schools systems have a "do course then done forever with the topic" kind of thing going on.
(Also repetition helps with memory)
(So I guess my answer to formalism vs intuitionism is both (as good of an understanding as possible (be aware when one is leaving holes open by being informal and when that matters)))
@serene scaffold loc works differently
That code was logically correct. I verified
Where should I put my csv data file when using pandas?
trying to follow this tutorial but it doesn't indicate where I should place the data file
https://thedatafrog.com/en/articles/show-data-google-map-python/
Also, I just get a FileNotFoundError and not pointing to where it is looking for the file
@vague sable any file input/output you do will be relative to the "current working directory". which you can figure out by doing import os and print(os.getcwd()). but if you're running python from a command line, it's going to be wherever the command line is in the file tree.
This is my working directory
C:\Users\daneb\anaconda3\envs\geovis
I put the file there and no luck
also, geneva is so weird. the way the borders of switzerland scoop down and around to include the center--but not all--of its built-up area.
show code
I've not written any code - just following the tutorial I posted
Im up to this part
but getting the file not found error when doing df = pd.read_csv('dvf_gex.csv')
please make a new cell with this code, and show the result as text. no screenshots
import os
import pathlib
print(os.getcwd())
print(list(pathlib.Path('.').glob('*.csv')))
```
>>> import pathlib
>>> print(os.getcwd())
C:\Users\daneb
>>> print(list(pathlib.Path('.').glob('*.csv')))
[]
>>>
```
so you're using a REPL (read-evaluate-print loop)? the tutorial assumes you're using a notebook.
anyway, your current working directory is C:\Users\daneb and is not C:\Users\daneb\anaconda3\envs\geovis
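A relative name such as 'dvf_gex.csv' is resolved against the current working directory of the running Python process, not against the conda environment folder. A small sketch of how to see exactly which path would be used; the helper name `resolve_data_file` is made up for this example:

```python
import os
from pathlib import Path


def resolve_data_file(filename, base_dir=None):
    """Return the absolute path a relative filename resolves to, and whether it exists."""
    base = Path(base_dir) if base_dir is not None else Path(os.getcwd())
    candidate = (base / filename).resolve()
    return candidate, candidate.exists()


# Where would a bare 'dvf_gex.csv' be looked up right now?
path, found = resolve_data_file("dvf_gex.csv")
print(path, "exists" if found else "missing")
```

Passing the resulting absolute path to `pd.read_csv` removes the dependence on wherever the interpreter or notebook happened to be started.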
Thanks, I whacked it in my user directory & it worked lol
You are correct - I'll go use a notebook now.
anyway thats just one person's take
but one can hope right
havent heard of github .devcontainers until now but it looks interesting
Hey all,
When I do
export GOOGLE_API_KEY=<your_key>
It doesn't work, so I use set instead of export, which seemed to work, but now when I try to use the key with api_key = os.environ['GOOGLE_API_KEY'] it fails
What could I be doing wrong
I did yep
and save it in an environment variable
Thats where I think im going wrong actually saving/storing it
you need to save it in the right directory
if you are in a notebook environment, you can check the current directory with the magic command %pwd
then you can use %cd to move to the correct directory where you saved it
yep
so I've gone into the correct directory as far as I am aware
C:\Users\daneb\anaconda3\envs\geovis
once in that directory, I executed the following
export GOOGLE_API_KEY=<AIzaSyB***********************************>
Stars are just so nobody gets my key
good practice
I get this error
File "C:\Users\daneb\AppData\Local\Temp\ipykernel_15540\2829495065.py", line 1
export GOOGLE_API_KEY=<AIza********>
^
SyntaxError: invalid syntax
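The SyntaxError comes from running a shell command as Python. `export` belongs to Unix shells, the Windows command prompt uses `set`, and a notebook cell is parsed as Python. A variable set in a shell is also only visible to processes started from that shell afterwards, so a notebook that was already running will not see it. One way around this, sketched below, is to set the variable inside the Python process itself; the value shown is a placeholder and `get_api_key` is a made-up helper name.

```python
import os

# Placeholder for illustration only; a real key should not be committed to code.
os.environ["GOOGLE_API_KEY"] = "<your_key>"


def get_api_key(name="GOOGLE_API_KEY"):
    """Read the key from the environment, with a clearer error than a bare KeyError."""
    key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"{name} is not set in this process's environment")
    return key


api_key = get_api_key()
print("key loaded:", bool(api_key))
```

Anything assigned through `os.environ` lasts only for the current process, so it has to be set again after the kernel restarts.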
typically API keys use the TOML format https://toml.io/en/
executing all of this from a notebook btw
then idk dude sorry
idk why you need to export it though if you already have the key saved in a separate file
you can just load it in later on when you need it
I've not saved the key anywhere
I'm just following this tutorial
which tells me to run the export code
take a look at this https://stackoverflow.com/questions/70372120/how-to-use-api-key-without-directly-using-it-in-the-python-code
Yep awesome this is helping
reading through it all now
thank you
@misty flint got it working, thank you.
Hi! I just want to confirm if I am doing something wrong in my calculations.
I am currently trying to find if two categorical columns are correlated/related. I have tried to run a correlation test and a chi2 test, and I feel as though the two results contradict each other:
correlation:
```python
dfCopy = df.copy()
dfCopy['topic'] = dfCopy['topic'].astype('category').cat.codes
dfCopy['Label_bias'] = dfCopy['Label_bias'].astype('category').cat.codes
dfCopy.corr()['Label_bias']['topic']
```
output: -0.03312874060844754
chi2 test:
```python
import pandas as pd
from scipy.stats import chi2_contingency

# Method to calculate the p-value via a chi-squared test.
# Prints the p-value.
def chi_squared(column_to_compare):
    # Create a cross table between the given column and label bias.
    # This will be used for the chi-squared test.
    cross_tabulation = pd.crosstab(index=df[column_to_compare], columns=df['Label_bias'])
    print(cross_tabulation)
    result = chi2_contingency(cross_tabulation)
    print('p-value is: ', result[1])
```
output: p-value is: 5.581641709475194e-09
So from how I am understanding the data, the correlation test says the two columns are not correlated, but the chi2 test says that the feature column is dependent on the output column
I am not sure if I am understanding the results incorrectly (ie maybe they are aligning?)
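The two results need not contradict each other. Pearson correlation on category codes measures a linear trend in an arbitrary numbering of the categories, while the chi-squared test looks for any kind of dependence. The sketch below uses made-up labels (not the data from the question) where one column fully determines the other, yet the correlation of the codes is zero. Cramér's V is included as one common effect-size measure, since a small p-value alone says the dependence is detectable, not that it is strong.

```python
from collections import Counter
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def chi2_statistic(a, b):
    """Pearson chi-squared statistic of the contingency table of two label lists."""
    n = len(a)
    joint = Counter(zip(a, b))
    row_totals, col_totals = Counter(a), Counter(b)
    stat = 0.0
    for r, r_count in row_totals.items():
        for c, c_count in col_totals.items():
            expected = r_count * c_count / n
            stat += (joint.get((r, c), 0) - expected) ** 2 / expected
    return stat


# Hypothetical data: the label is fully determined by the topic.
topics = ["arts"] * 10 + ["business"] * 10 + ["sports"] * 10
labels = ["neutral"] * 10 + ["biased"] * 10 + ["neutral"] * 10

# Alphabetical integer codes, similar in spirit to .cat.codes for string categories.
topic_codes = [sorted(set(topics)).index(t) for t in topics]
label_codes = [sorted(set(labels)).index(v) for v in labels]

r = pearson(topic_codes, label_codes)
chi2 = chi2_statistic(topics, labels)
n_rows, n_cols = len(set(topics)), len(set(labels))
cramers_v = sqrt(chi2 / (len(topics) * (min(n_rows, n_cols) - 1)))
print(f"pearson r={r:.3f} chi2={chi2:.1f} cramers_v={cramers_v:.3f}")
```

For nominal columns the chi-squared result is the more meaningful of the two; the correlation of the codes would change if the categories were numbered differently.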
if i want to use python for data science, should i take a generic python bootcamp or just go for a python data science bootcamp?
Does anyone have "Dog Muzzle" image Dataset ?
I posted in #1035199133436354600 too (https://discord.com/channels/267624335836053506/1050740175698919434), but generally I don't understand how PyTorch keeps track of the gradients and which operations are allowed and which aren't. I want to rearrange individual values from one array to another, while keeping the gradient intact. PyTorch throws an error calling it an "in place" operation, which I don't see how it is, if I assign from one array/tensor/matrix to another. "In place" to me is e.g. a "+=" operation, which I'm not using anywhere.
code: py model.compile(optimizer=keras.optimizers.Adam(learning_rate=0.01), loss=keras.losses.sparse_categorical_crossentropy(from_logits=True), metrics=['accuracy']) error on loss=keras.losses.sparse_categorical_crossentropy(from_logits=True),
error: ```
File "D:\real_Python\projects\draw_guess_game\lib\site-packages\tensorflow\python\util\traceback_utils.py", line 153, in error_handler
raise e.with_traceback(filtered_tb) from None
File "D:\real_Python\projects\draw_guess_game\lib\site-packages\tensorflow\python\util\dispatch.py", line 1170,
in op_dispatch_handler
result = api_dispatcher.Dispatch(args, kwargs)
TypeError: Missing required positional argument
```
Hi guys, I'm new to Data Science and I'm interested in Process Mining and more specifically in the Alpha Miner algo. I have however a slight problem to understand what the highlighted numbers correspond to. Could someone enlighten me ?
Hello, sorry for the stupid question, but I can’t understand how I can get the number of unique devices in each group and name the Users column + the second column with the number of events, let’s say majong_first_tap and also their number in the group (in the Impressions column). the second screenshot is what i want to get
grouped = all_data_df.groupby('level')
print(grouped['device_id'].nunique())
this is the number of unique users in levels 0, 1, 2... but how else to display the number of some event in the same table, let's say the number in the zero level of the rubber_challenge_item_loaded event
like this
|level|Users|Event count|
| 0 | 400 | 200 |
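One way to get both columns in a single table is named aggregation on the grouped frame. This is a sketch on invented rows; `event_name` is an assumed column name for the event type, since the real column is not shown.

```python
import pandas as pd

# Hypothetical sample; 'event_name' is an assumed column name.
all_data_df = pd.DataFrame({
    "level": [0, 0, 0, 0, 1, 1],
    "device_id": ["a", "a", "b", "c", "a", "d"],
    "event_name": [
        "rubber_challenge_item_loaded", "majong_first_tap",
        "rubber_challenge_item_loaded", "majong_first_tap",
        "rubber_challenge_item_loaded", "majong_first_tap",
    ],
})

event = "rubber_challenge_item_loaded"
summary = (
    all_data_df
    .assign(is_event=all_data_df["event_name"].eq(event))  # True where the row is that event
    .groupby("level")
    .agg(Users=("device_id", "nunique"), Event_count=("is_event", "sum"))
    .reset_index()
)
print(summary)
```

Summing the boolean column counts the matching rows per level, and `nunique` counts distinct devices, so both numbers end up side by side.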
Hi guys. My apologies if this isn’t the correct channel. But I’ve got an issue here I can’t seem to figure out. I could probably achieve this via Anaconda but how would I go about displaying the tables within this pdf
best mAP doesn't correspond to my best f1 score
is that fine?
i dont know about mAP, but read that best epoch is decided based on max mAP
they measure different things, understand exactly what each measures and choose based on which you prefer for your application
while building MCTS tree, do we add all the possible actions for all pieces in the board?
hi
I recently started getting into ml stuff and I am pretty much a beginner. My end goal is to build a recommender system for my app but I am not sure if I should learn pytorch or go for scikit-learn / tensorflow
I have some experience with scikit-learn but none with tensorflow.. I think investing my time in pytorch will be better ?
is your app web or mobile
mobile
ok one is def better than the other for mobile apps
there was a chart somewhere i posted a long time ago
i cant remember off the top of my head
lol
one?
oh XD
pytorch vs tensorflow
ah i found it
you dont need SOTA models for this use case
so -> N
then Y for mobile
only you can answer the next part
thanks
yep. and your specific options are between Tensorflow Lite and Pytorch Live
gl and let me know how it goes if you end up trying one
ohh
I also read about torchrec somewhere
anyways I will search pytorch lite 👍
also I have decided to watch this lecture from FCC while implementing stuff by myself
https://youtu.be/V_xro1bcAuA
I think should be enough before getting into deeper stuff like recommendation system etc
Which database should I learn for data science
MySQL or MongoDB?
You should learn both.
Hi everyone! I'm trying to see if I can differentiate between random subsamples of populations with drastically different magnitudes. Is there any approach or technique to identify a difference in distribution in such a case?
you can try a kolmogorov smirnov test
Hey @lapis sequoia!
You either uploaded a .txt file or entered a message that was too long. Please use our paste bin instead.
Hi! is there anyone here that uses matplotlib?
i want a deep reinforcement ai which:
- is standing on a block (orange)
- can only move up, right or diagonally. if it moves diagonally it gets -1 point.
- can move only once
- if it steps in (red) -1 point.
- if it stays in one place, it gets -1.
- if it reached green it gets +1 point.
what will the ai do ???
???? help me...
Thinking about it, it will learn to move diagonally and receive 0 as a total reward
how to scrape user requests from a website using python(i am new to data science)?
So there’s an exhibition today for a jewelry company that used GAN art and it’s the oddest thing ever seeing members of the public watching an explanation of ML
so I recently just implemented prioritized experience replay for my DDQN and I now have this strange looking reward graph
the reward seems to be increasing then decreasing?
and also this loss graph as well
the snake agent seems to never learn to get the apple but instead crashes into the wall
when I turned off prioritization it seemed a lot more successful
why is this?
you could solve this without neural networks, q learning could solve this
I am working on a personal project based on the COCO car damage challenge. There is a COCOEvaluator that returns the AP metrics based on model predictions. I am trying to create a custom metric that explains how many damages or what percentage of damages (based on IOU) the model finds across the test dataset. Note that I have replaced every class of the problem with one class, "Damage". Is there a way to get the TP/FP and TN/FN of the predictions to begin with, using detectron2 model outputs?
Anyone with knowledge in stats and analysis using python dm me
Is it best practice to look at code submissions in kaggle? Or develop the solution on my own? Still struggling to understand the data sets
which dataset?
Competition ones
i mean for inspo its always good to be able to compare urself with someone else
that said u should first try it on ur own
Do we need to learn OOP in python in order to be a successful Data Engineer?
yes
Your answer would be yes in case of becoming a data analyst as well?
less necessary for data analysts
Sorry, new to pandas: How do I divide a range of items in a row by another item in the same row (and then use that for all rows in the same dataframe) ?
(e.g. Divide items in row index 1 with index 83 through 102 by the value in row index 1, column index 14, then keep doing that for every row. So, for example, the values in row index 2, column index 83:102 are divided by the value in row index 2, column 14, same with row 3, row 4, etc)
(Some kind of 'for' loop is required. What kind, I don't know)
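A loop is not strictly needed for this; pandas can divide a block of columns by another column row by row in one vectorized call. The sketch below uses a tiny stand-in frame, and `divide_block_by_column` is a made-up helper name.

```python
import pandas as pd


def divide_block_by_column(df, first_col, last_col, divisor_col):
    """Divide columns first_col..last_col (inclusive, by position) by divisor_col, row by row."""
    block = df.iloc[:, first_col:last_col + 1]
    divisor = df.iloc[:, divisor_col]
    result = df.copy()
    # axis=0 matches the divisor to each row, so every row uses its own value.
    result.iloc[:, first_col:last_col + 1] = block.div(divisor, axis=0).to_numpy()
    return result


# Small stand-in frame: divide columns 2..3 by column 0.
df = pd.DataFrame([[2.0, 9.0, 4.0, 6.0],
                   [5.0, 9.0, 10.0, 20.0]])
out = divide_block_by_column(df, first_col=2, last_col=3, divisor_col=0)
print(out)
```

For the layout described above the call would be `divide_block_by_column(df, first_col=83, last_col=102, divisor_col=14)`, assuming those positions hold numeric (float) columns.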
Hey, sorry to interrupt ~. I have small question on how to better do with project I am currently working on. Basically I have volatility index and company revenue, I know they correlate at about 0.25 but the whole dataset is pretty crappy and it’s more of a cloud rather than clear fit. So how should I go with it to create high scoring prediction from values which don’t correlate well ?
Honestly it’s much faster to do in excel rather than think of pandas logics (it is not Efficient for that task), by the time I was writing that you could have already completed dataset
The data set I'm working with is too big to analyze in Excel. If you try to open the csv, the computer will crash.
(I know, I tried)
Hello, I started HCIA-AI V3 certificate but I can not memorize all methods of machine learning because I did not practice in code, Where can I practice doing simple machine learning project?
If your computer crashes on opening csv then any kind of “for” looping in python on pandas dataframe will crash too, since it’s extremely slow and inefficient. Have you tried cloud based excel/google sheets ?
Hm, no I haven't. That might be an option.
I figured a for loop would take a very long time, but I'd be able to work with the data at the end of it. Excel just taps out when you try to open the file.
(Sorry I should have been more precise: the computer doesn't crash. Excel crashes)
Well you can just for loop over the indexes, not pandas df. And access values of each element through index loop like df.iloc[] and do what you wanted with them
But this will be slow too
Slow is fine. I just want to be able to work with it.
Could someone explain what a target variable is? I'm trying to learn decision trees in python and I was confused by the use of "target" in the code terminology
help plz
not likely.
yeah the thing is about making an ai code is that you'll also have to make sure the code it wrote not only works without erroring but also that it does the thing you want it to
and in order to make sure it does the thing you want it to do, you'll have to have people (the aforementioned software engineers that it's "replacing") check it
I've been using codex quite a bit lately just seeing how it does things and most of the time the code it writes is pretty wrong
It's useful in accelerating my workflow (if i have to repeat a function on several variables it can infer that, and hitting tab is faster than ctrl c ctrl v) but not in making its own code, and I think that'll be similar with chatgpt
how hard would it be to train an ai to recognize when a basket ball goes into a hoop
in a fast pace and not get mixed up with other balls
@grim patrol I didn't get a chance to see your ping till now, the function is called for each individual sample so random augmentations would be different each time you run it
I posted this in general but got no response, i think it may have been too advanced to get a quick answer on. If anyone happens to have experience with this I'd really appreciate your help:
I've got a sort of desperate question for anyone with RL experience, is it a bad idea to use Proximal Policy Optimization to learn a backflip (humanoid) in simulation (pybullet in this case)?
I wrote my own PPO with pybullet/pytorch since no implementations seemed to be able to run on my machine. I know PPO is super sensitive to hyperparameters so I've been tuning them with bayesian optimization for the past couple hours but I'm getting nervous since i'm not seeing anything converge. If anyone can tell me whether PPO is simply not a possible solution (to learning a backflip from an expert trajectory) or if it is and I should be patient and wait for tuning to finish, I would be eternally grateful...
Guys, how do I slice a value in a column?
example:
s = "joao (test)"
valor = s.find("(")
and after that I want string = s[:valor]
but how do I do this on my pandas columns?
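The same slice can be applied to a whole column through the `.str` accessor. A minimal sketch, where the column name `name` and the sample values are assumptions:

```python
import pandas as pd

# Hypothetical column; 'name' is an assumed column label.
df = pd.DataFrame({"name": ["joao (test)", "maria (prod)", "ana"]})

# Keep everything before the first "(" and trim the trailing space.
df["name_clean"] = df["name"].str.split("(", n=1).str[0].str.strip()
print(df["name_clean"].tolist())
```

Values without a parenthesis are left unchanged, because splitting them returns the whole string as the first piece.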
could someone train an ai to use chat gpt3?
that's what chatgpt is

peeps are gonna feel the real crunch when they close this "research period"
hehehe
also the CEO himself says the compute costs are "eye-watering" so they will have to monetize it somehow

which makes sense. i cant even imagine their aws bill
more like hype phase
bye bye money
I thought microsoft gave them compute?
or was that just for training
the verge says theyre using aws for inference for chatgpt
and the aws bill is ungodly
if someone says a model is learned, what is meant by that
Hey there I had a Q on file management. So far I am in a project where I have gathered the data, cleaned the raw data and this is it.
My question is how should I organize my files? Should I make one for data gathering, data cleaning, EDA, feature engineering, model building? What do you guys typically do?
Stats question:
Talent is a good predictor of a football team winning a game.
Home field advantage is a weaker predictor but still relevant.
Question, how could probability for these attributes be combined? As in, how can I add to the probability of a win if the talented team also has home field advantage.
depends on which kind of model you are building?
also on how you measure "talent"
talent would be a ranking of high quality recruits
so it's a dataset that is already compiled from publications
that does not explain anything about how you are actually measuring it, to be more specific, as a number (or a series of numbers)
"is home field" is a either True or False (boolean), not much to think about there
I guess that turning "talent" into numbers could be ranking them in decreasing order (1, 2, 3, 4, ... with 1 being the "best") or giving them a rating (2.3, 7.8, 10.0 etc with higher being better)
I see what you mean, the talent ranking would be a number
Yeah, just like that. 1 is the best.
now as for which kind of model you are building?
(after you think about what your inputs and outputs are gonna be, you have to pick a model that can generate that kind of output)
This is out of my depth. I was thinking more along the lines of standard probability calculations, like how you multiply the probability of independent events in order to get the probability that a specific outcome will occur in sequence
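Probabilities from two factors do not simply add. One standard-probability way to combine them, assuming the two factors are independent of each other given the outcome (a strong assumption), is to multiply the prior odds by a likelihood ratio per factor. All numbers below are invented for illustration, and `combine_evidence` is a made-up helper name.

```python
def combine_evidence(prior_prob, likelihood_ratios):
    """Multiply prior odds by each likelihood ratio and convert back to a probability."""
    odds = prior_prob / (1.0 - prior_prob)
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds / (1.0 + odds)


# Made-up likelihood ratios, each P(factor | win) / P(factor | loss).
talent_ratio = 2.0  # assumed value for "has the higher talent ranking"
home_ratio = 1.5    # assumed value for "plays at home"

p_talent_only = combine_evidence(0.5, [talent_ratio])
p_both = combine_evidence(0.5, [talent_ratio, home_ratio])
print(round(p_talent_only, 3), round(p_both, 3))
```

A logistic regression fitted on past games does something similar, but estimates the weight of each factor from data instead of assuming it.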
someone trained an image transformer on millions of tumblr screenshots and it generated this image
They used this model to do it:
https://github.com/Aleph-Alpha/magma
what I don't understand is if you look at the top of the image, it's actually really accurate generated text, and as you get lower the quality deteriorates more and more. What would cause this? Is the transformer generating the image from top to bottom and somehow losing quality with every line of pixels?
if I had to guess, it almost always sees text in the top part of the image, but it sees different things towards the bottom more often
yeah that sounds plausible
I'm pretty dumbstruck that image transformers can solve nlp problems now, we've come a long way
tbh I forgot what are our rules/policies on self-promotion, but if you are going to, at the very least make it explicit that it is your own post
<@&831776746206265384> ads crossposted on multiple channels
@lapis sequoia we don't allow self promotion that is unrelated to the conversation
do anyone of you know any good courses for data science ?
yo
is this solved yet or do you still need help?
@soft badge if u still need help with this, just @ me
Hello! I tried installing the stable-diffusion library for python using: "pip install stable-diffusion-ai", but got this error (in the picture): Fatal error in launcher: Unable to create process using '"c:\python27\python.exe" "C:\Python27\Scripts\pip.exe" install stable-diffusion-ai': The system cannot find the file specified.
am not sure why its doing this
Anyone have a good guide to pandas in general, looking to up my skills in data management
Specifically using Dataframes*
in albumentations, does anyone know how to get the bounding boxes value after you've applied an augmentation pipeline to an image you want to augment?
this doc shows visualization of the bounding boxes, but how do you actually fetch those bounding boxes in numerical values after augmentation? https://albumentations.ai/docs/examples/example_bboxes/#define-functions-to-visualize-bounding-boxes-and-class-labels-on-an-image
Question: What is the most important disk access pattern of typical big data applications? Support your answers with arguments.
Professor reply:
The most important access pattern is sequential reads. Adding "over large collections" would be even more accurate. This is because the typical big data applications aim to model the entirety of the data somehow, and to do that the whole data must be read. Furthermore, many approaches require repeatedly reading the collection. However, the specific order of reading the collection is rarely important, allowing for sequential reads in whatever order the collections happen to be in.
OpenAI reply:
The most important disk access pattern of typical big data applications is random access. This is because big data applications often involve large and complex datasets that are stored across multiple disks, and that require fast and flexible access for data processing and analytics tasks. Random access allows the application to access any part of the data quickly and efficiently, without the need to sequentially scan through the data.
Random access is particularly important for big data applications that involve iterative and interactive data processing, such as machine learning and analytics. These applications often require fast and flexible access to the data in order to train and evaluate models, run queries, or generate reports. Random access can enable the application to access and update the data quickly and efficiently, which can improve the performance and responsiveness of the application.
In contrast, other disk access patterns, such as sequential access, are less suitable for big data applications. Sequential access involves reading or writing the data in a sequential order, which can be slow and inefficient for applications that require random access to the data. Additionally, sequential access can be limited by the speed of the disk, which can be a major constraint for applications that require high performance.
Overall, random access is the most important disk access pattern for typical big data applications, as it enables fast and flexible access to the data, which is essential for data processing and analytics tasks.
Which one do you guys agree with most? Professor said it is sequential reads, while the openAI said it is random access XD
i did see how positional encoding is done and how its fed into transformer, but i am unable to understand how they are used so that sequence on data is understood
Never trust chatGPT to be correct.
It will tell you the oceans are made of Sprite and the sky is green, all while sounding super confident.
a world we all like to live in
sticky beach, nop
This feels like every recent AI model so far... 
Stares at OpenAI with hatred 
does any one know what go Libraries to use for xgboosting a nlp modle
Hi - I am new to pandas and I have this:
states_medals_summer = athletes[athletes["Season"] == "Summer"].groupby(['NOC'])["Medal"].count()
it seems to be promising, i have some use cases in mind and i'd like to try it out but currently don't have the hardware needed
Hey guys I'm new to ML, trying to learn decision trees for an internship. I get the concept of the trees, how they evaluate which splits are the best, and which qualifying attributes of the vector are most important. However, the code implementation in python still really confuses me
But when I do this:
states_medals_summer[states_medals_summer["NOC"] == "ALG"].head()
I get: KeyError: 'NOC'
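The KeyError arises because the groupby result is a Series whose index is NOC; there is no column called 'NOC' any more. A minimal sketch on a made-up miniature of the athletes table:

```python
import pandas as pd

# Hypothetical miniature of the athletes table.
athletes = pd.DataFrame({
    "NOC": ["ALG", "ALG", "USA", "USA", "USA"],
    "Season": ["Summer", "Summer", "Summer", "Winter", "Summer"],
    "Medal": ["Gold", None, "Silver", "Gold", "Bronze"],
})

states_medals_summer = athletes[athletes["Season"] == "Summer"].groupby(["NOC"])["Medal"].count()

# The result is indexed by NOC, so select by index label...
alg_count = states_medals_summer.loc["ALG"]

# ...or turn the index back into a column and filter as with a DataFrame.
as_frame = states_medals_summer.reset_index()
alg_rows = as_frame[as_frame["NOC"] == "ALG"]
print(alg_count)
print(alg_rows)
```

Which form is more convenient depends on whether a single number or a filtered table is wanted afterwards.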
I first just needed help making sense of this pseudocode
So for this scenario, attributes are all these 10 parameters listed here
And examples are the data we pass in? how do these differ from parent examples
guys what is happening with sklearn?
I installed it but it shows version 0.0 and I cannot import it in my file, does anyone know how to fix this?
Hi I'm very new to yolov5 and computer vision in general(Just started today) and I was following a tutorial on YouTube about the topic. I was kinda understanding everything so far then I went on to real time detections then encountered this error. tbh I have no idea what I'm doing
This is the code btw
sounds like frame is None.
So how exactly do I fix it
is computer vision difficult to understand?
So you fixed the error? because previously you got an error on that line
The error isn't fixed, the line with the image variable is completely fine but when I run the video capture code I get this "AttributeError: 'NoneType' object has no attribute 'shape' "
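The AttributeError means the frame passed on was None. `cv2.VideoCapture.read()` returns a success flag together with the frame, and the frame is empty when nothing could be grabbed (wrong camera index, device in use, or end of the video). Checking the flag before using the frame avoids the crash. The sketch below uses a stand-in class instead of a real camera so it runs anywhere; `read_frames` and `FakeCapture` are names made up for this example.

```python
def read_frames(cap, max_frames=None):
    """Yield frames from a capture-like object, stopping when a read fails."""
    count = 0
    while max_frames is None or count < max_frames:
        ok, frame = cap.read()
        if not ok or frame is None:
            # Nothing was grabbed, so there is no frame to call .shape on.
            break
        yield frame
        count += 1


class FakeCapture:
    """Stand-in for cv2.VideoCapture so the sketch runs without a camera."""

    def __init__(self, frames):
        self._frames = list(frames)

    def read(self):
        if self._frames:
            return True, self._frames.pop(0)
        return False, None


frames = list(read_frames(FakeCapture(["frame0", "frame1"])))
print(len(frames))
```

With a real capture, `cap.isOpened()` is also worth checking right after creating it, since a camera that failed to open produces failed reads from the first call.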
It's now showing me this
try it in a normal python interpreter rather than a notebook
from past experience I've always had issues using opencv's imshow in jupyter
or try plt.imshow instead of cv2.imshow
Hello, I have a problem regarding Keras BiLSTM training
In order to convert my data frame into sequence matrices, I tried pad_sequences but it didn't work
Then I tried keras.utils.timeseries_dataset_from_array to no avail
Then I tried keras.layers.TextVectorization
Again didn't work
This time I converted my data in the df, into arrays
And then again attempted the TextVectorization
And when I run the from_tensor_slices, it gives me an error saying it failed to convert a Numpy array to a tensor due to unsupported object type : BatchDataset
I'm still a student, so sorry if it's an obvious question. Would you please offer some clarification?
I'm trying to train a machine learning model to predict the price of an item based off of its attributes, but I'm unsure as to the architecture of DNN i should be using.
After processing, the data has 7823 inputs with 1 output and anywhere between 200k-20m samples are available.
I used sklearn StandardScaler on both the input and on the output data before passing it to the model.
are you looking at the kaggle competition too?
no this is for something else, what competition? i'm interested lmao
just predicting sales in a list of test shops based on previous data
no cash prize for this one tho
all the variables are in russian
bruh
i also have no idea how to choose the architecture for DNNs
so far, i just follow the general recommendations: spam some filters at the beginning, condense, and slap some regularization tools at the end
Is this enough data to say that ranking is a better predictor of a win than home field advantage? I'm trying to understand if selecting the better predictor when having simple probabilities like this can be any more complex.
maybe a chi square test is applicable here?
it would appear so, but not by much
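For what it's worth, a chi-square test on a 2x2 table is short enough to do by hand. The counts below are hypothetical placeholders (not the numbers from the question), and the setup assumes each predictor's picks can be tallied as correct/incorrect. One caveat: if both predictors were scored on the same games, the two rows aren't independent samples, and a paired test such as McNemar's would probably be the better fit.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value for a 2x2 table.

    No continuity correction is applied; degrees of freedom = 1.
    """
    (a, b), (c, d) = table
    total = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical counts: rows are predictors, columns are (correct, incorrect).
table = [
    [60, 40],  # "pick the higher-ranked team"
    [55, 45],  # "pick the home team"
]
statistic, p_value = chi_square_2x2(table)
print(f"chi2 = {statistic:.3f}, p = {p_value:.3f}")
```

A large p-value here would mean the gap between the two hit rates is well within what chance alone would produce at that sample size.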
Thank you!
Lol, I've just got into Stack Overflow and saw a notice about their policy regarding that
Looks like OpenAI just created one of the best scammers so far...and it's not even human
chatGPT is pretty funny
I asked it to come up with myths about cooking steak
and it said that you shouldn't let your steak rest too long because it would overcook it
I suppose that could only be true if you ate it while it was so hot it burnt your mouth
"ViViT: A Video Vision Transformer"
this was the first transformer with "temporal" attention module, right?
for "video" I mean
6 + 7 = 13
This shows that the sum of 6 and 7 is 14. I apologize again if my previous response was incorrect, and I hope this helps clarify the issue. Let me know if you have any other questions.
very funny lmao its tweaking if u tell it that its wrong
Hi, are there any use cases for using speech-to-text with recommendation systems?
if you want your recommendation system to accept voice input. what recommendation system?
Hi @serene scaffold thank you. I’d like to build one from scratch that accepts voice.
how "scratch" is from scratch? can you use libraries?
I’d like to build it from scratch for practice but will use a speech recognition library and a Python web framework
do you want to train the speech recognition model yourself, or use an off-the-shelf one?
@serene scaffold I’m going to use an off the shelf one.
okay. what about the recommendation system? what's that supposed to do?
I’d like to build a recommendation system for books, where people give reviews on books and, based on those, get recommendations for other books they should read.
can you think of what components that would need to have?
In terms of the features I’d like to have?
what do you know about NLP and recommendation systems in general?
Oh, I see. Currently I’m going through a course on building recommendation systems and NLP. I’d like to build the book system with speech-to-text as my final project. I was just trying to make sure it wasn’t overkill with the speech recognition part and there are use cases for it.
I would focus on just the recommendation system for now. it wouldn't be difficult to add the StT part onto it.
Ok gotcha! Thank you, I’ll do that. It’s been great learning about these systems.
what's the main reason my model loss would suddenly drop like that? I can't quite figure it out
Hi, in my new task I am supposed to convert R -> Python. Is there any cheatsheet/documentation to start with? Any help is appreciated. Thanks in advance.
does anyone in here know how to wait for an openai completion request to finish processing?
that might just be the stupidest question ive ever seen
oh
That type of attitude is not welcome
Let's stay welcoming and respectful.
Why is it a stupid question?
It's not a stupid question. Ignore them.
Can you expand on your question?
Well, I'm trying to implement the new openai chat thing into my discord bot, and I'm running into an issue where the bot tries to respond before the AI has finished processing its response. I'm wondering if there's a way to delay the response until the AI finishes processing it
What does the code look like? I haven't used either library
async def ai(self, interaction: discord.Interaction, prompt: str):
    with open('../config.json', 'r') as f:
        data = json.load(f)
    openai.api_key = data['OPENAI_KEY']
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=prompt,
        temperature=0.80,
        max_tokens=1500,
        top_p=1,
        frequency_penalty=0.80,
        presence_penalty=0
    )
    response_embed = discord.Embed(
        title=prompt,
        description=response['choices'][0]['text'],
        color=discord.Color.green())
    await asyncio.sleep(len(prompt*2))
    await interaction.response.defer(thinking=True)
    await interaction.edit_original_response(embed=response_embed)
I have a simple command here that just creates the request from the API and then places the text into an embed and sends it. but the request takes time to finish processing. Right now I have a hard sleep coded in to try to account for that time
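A hedged sketch of the usual fix, as far as I can tell from the snippet: `openai.Completion.create` is a blocking call, so by the time the next line runs the response is already complete and the sleep isn't buying anything. The likelier problems are that the blocking call stalls the bot's event loop, and that `defer` only happens after the slow part, when Discord may already have given up on the interaction. Deferring first and pushing the blocking call onto a worker thread with `asyncio.to_thread` (Python 3.9+) covers both. `slow_completion` below is a stand-in for the API call so the example runs on its own, and `handle_command` is a hypothetical name.

```python
import asyncio
import time

def slow_completion(prompt):
    """Stand-in for a blocking API call such as openai.Completion.create(...)."""
    time.sleep(0.2)  # simulate network latency
    return f"echo: {prompt}"

async def handle_command(prompt):
    # 1. In the bot, acknowledge the interaction first:
    #    await interaction.response.defer(thinking=True)
    # 2. Run the blocking call in a worker thread so the event loop stays free.
    text = await asyncio.to_thread(slow_completion, prompt)
    # 3. In the bot, send the result once it exists:
    #    await interaction.edit_original_response(embed=...)
    return text

async def main():
    # A heartbeat task keeps ticking while the completion is in flight,
    # which shows the event loop is not blocked.
    ticks = 0

    async def heartbeat():
        nonlocal ticks
        while True:
            await asyncio.sleep(0.02)
            ticks += 1

    beat = asyncio.create_task(heartbeat())
    text = await handle_command("hello")
    beat.cancel()
    return text, ticks

result_text, heartbeat_ticks = asyncio.run(main())
print(result_text, heartbeat_ticks)
```

No sleep is needed anywhere: the `await` on the thread call already waits exactly as long as the request takes.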
hey guys, this is the main channel for AI, right? I'm trying to make an AI that gets progressively better at aiming/shooting with a bow, and my question is: where do I start when it comes to AI?
What does it mean to evaluate two models then compare their performances with each other?
basically comparison
but what is the difference between 'evaluate' and 'compare'? Those words seem to mean the same thing, but both appear in the above statement, so it makes me think it is asking for two separate things
Let me explain using an example. Let's say you have two designs for a digital multiplier and you're not sure which is better, so naturally you need to compare them. How do you compare them? You look at things like overall delay or speed, implementation size, and maybe whether they can perform both signed and unsigned multiplication or just one of the two. Those are the things you evaluate in order to compare the two designs. Hopefully that makes sense.
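In ML terms the same split usually looks like this: evaluating is scoring each model on the same held-out data, and comparing is looking at those scores side by side. A toy sketch, with made-up data and two deliberately trivial "models" (all names here are illustrative):

```python
def accuracy(model, examples):
    """Evaluate: fraction of held-out examples the model labels correctly."""
    correct = sum(1 for x, label in examples if model(x) == label)
    return correct / len(examples)

# Made-up held-out set: (feature, true label) pairs.
test_set = [(1, "low"), (2, "low"), (4, "low"), (6, "high"), (8, "high"), (9, "high")]

def threshold_model(x):
    return "high" if x > 5 else "low"

def always_high_model(x):
    return "high"

# Step 1: evaluate each model separately on the same data.
scores = {
    "threshold": accuracy(threshold_model, test_set),
    "always_high": accuracy(always_high_model, test_set),
}

# Step 2: compare the resulting numbers.
best = max(scores, key=scores.get)
print(scores, "->", best)
```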
What does df.resample() do?
example of code here:
a = transactions_df.set_index("date").resample("M").transactions.mean().reset_index()
a["year"] = a.date.dt.year
px.line(a, x='date', y='transactions', color='year', title="Monthly Average Transactions")
is it just a group by date that then sums the transaction amount, or is there a special function here?
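Pretty much, as far as I understand it: `resample("M")` is a group-by on calendar-month bins of the datetime index, and the `.mean()` after it means each month gets the average of its rows, not the sum. Here's a plain-Python sketch of the same computation on made-up rows so the grouping is visible (pandas labels each bin with the month-end date; here the key is just (year, month)):

```python
from collections import defaultdict
from datetime import date

# Made-up daily rows: (date, transactions).
rows = [
    (date(2016, 1, 1), 100),
    (date(2016, 1, 15), 200),
    (date(2016, 2, 3), 50),
    (date(2016, 2, 20), 150),
    (date(2016, 2, 28), 100),
]

# Bucket by calendar month, which is what resample("M") does with the index.
buckets = defaultdict(list)
for day, transactions in rows:
    buckets[(day.year, day.month)].append(transactions)

# .transactions.mean() then averages within each bucket.
monthly_mean = {
    month: sum(values) / len(values) for month, values in sorted(buckets.items())
}
print(monthly_mean)
```

One difference from a plain groupby: resample also emits the months that have no rows at all (as NaN), so the output has no gaps in the date axis.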
guys, should I continue on the ML/AI path? I like AI and I have done some basics from tutorials, but I'm not too comfortable with advanced math. I can understand basics like linear regression and things like that, but is math a must for it? Or should I switch my path?
you can use AI libraries without knowing the math, but your understanding and problem-solving will be hindered
ML is, after all, math
yeah you can continue
you can learn the maths on the go
